Case Study: Extract All Substack Article Titles and Links. Part A: Extract Individual Article Data

Using Selenium WebDriver to extract the title, published date, and link from the first Substack article.

Courtney Zhan

Non-Medium Members: Read this article free on Substack.

This article series:

  • Part A: Extract Individual Article Data
  • Part B: Extract 25 articles on one page
  • Part C: Extract All
  • Part D: Publish
  • Part E: Annotation by Zhimin *
    (offering valuable tips for test automation engineers to level up their skills, exclusively available on Substack)

The Task

My father has been transferring his articles, and mine, from Medium to Substack. It’s been a significant effort, especially updating the links across hundreds of articles. He asked me to extract the title and link of every published Substack article and compile them into a single page.

The task is easiest to illustrate with images.

  • Substack lists only 25 articles per page (and we have 500+)
