Case Study: Extract All Substack Article Titles and Links. Part A: Extract Individual Article Data
Using Selenium WebDriver to extract the title, published date, and link from the first Substack article.
5 min read · Nov 30, 2024
Non-Medium Members: Read this article free on Substack.
This article series:
- Part A: Extract Individual Article Data
- Part B: Extract 25 articles on one page
- Part C: Extract All
- Part D: Publish
- Part E: Annotation by Zhimin *
(offering valuable tips for test automation engineers to level up their skills, exclusively available on Substack)
The Task
My father has been transferring his articles — and mine — from Medium to Substack. It’s been a significant effort, especially with updating links across hundreds of articles. He assigned me the task of extracting all published Substack articles (titles and links) to compile them into a single page.
The task is easiest to illustrate with images.
- Substack only lists 25 articles per page (and we have 500+)