
Case Study: Extract All Substack Article Titles and Links. Part C: Extract All

Handling Pagination.

Courtney Zhan

3 min read · Dec 14, 2024

Non-Medium Members: Read this article free on Substack.

This article series:

Part A: Extract Article Data
Part B: Extract 25 articles on one page
Part C: Extract All
Part D: Publish
Part E: Annotation by Zhimin Zhan*
(offering valuable tips for test automation engineers to level up their skills, exclusively available on Substack)

After Part B, I had the data for all 25 articles on the first page saved in a proper CSV file.

Extract All 500+ Articles Out

Let’s focus on extracting the 2nd page’s articles first.

First, click the “Next Page” button.

driver.action.scroll_by(0, 2500).perform # scroll to the bottom of the page
next_button_xpath = ".../button[2]" # xpath hidden intentionally
next_page_btn = driver.find_element(:xpath, next_button_xpath)
next_page_btn.click
sleep 2 # give the next page time to load
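
The same click can be repeated to walk through every page. Below is a minimal sketch of one way to do that, not necessarily the approach used later in this series. The helper name `each_page` and its `max_pages` and `pause` parameters are hypothetical, and it assumes the “Next Page” button is either missing or disabled on the last page, which may not match how the site actually behaves.

```ruby
# Hypothetical helper: visits every page by clicking "Next Page" until the
# button is missing or disabled (an assumption about the site's behaviour).
# `driver` is a Selenium WebDriver session; `next_button_xpath` is the same
# locator used above (still elided on purpose).
def each_page(driver, next_button_xpath, max_pages: 50, pause: 2)
  page = 1
  loop do
    yield page                    # extract the current page's articles here
    break if page >= max_pages    # safety cap against an endless loop

    # find_elements returns an empty array (no exception) when nothing matches
    next_btn = driver.find_elements(:xpath, next_button_xpath).first
    break if next_btn.nil? || !next_btn.enabled?

    driver.action.scroll_by(0, 2500).perform # bring the button into view
    next_btn.click
    sleep pause                   # crude wait for the next page to load
    page += 1
  end
  page                            # number of pages visited
end
```

A call could look like `each_page(driver, next_button_xpath) { |page| puts "on page #{page}" }`, with the per-page extraction from Part B placed inside the block. The `max_pages` cap is only a guard in case the stop condition assumed here never becomes true.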
