Case Study: Automation Script to Extract the Top 10 Authors Featured in Software Testing Newsletters

This script sets out to verify my father’s claim that he was the “most featured author in the leading software testing newsletters”.

Analyse

From Software Testing Weekly issue #153 (2023-01-29)
driver.find_elements(:xpath, "//div[@class='issue__body']//p/a")
# "cc-toools" is kept as written; the full script lists it together with "cc-tools"
stw_categories = %w(cc-news cc-automation cc-toools cc-books cc-videos)
stw_categories.each do |category|
  section_links = driver.find_elements(:xpath,
    "//section[@class='category #{category}']//div/p/a")
  # ...
end
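The XPath above matches the `class` attribute as an exact string, so it stops matching if a section ever carries an extra class. A minimal sketch of a more tolerant locator, using the common XPath 1.0 idiom for matching a single class token; the helper name `category_xpath` is hypothetical, and whether the site needs this is an assumption:

```ruby
# Hypothetical helper: builds an XPath that matches a section by class token
# instead of by the exact attribute string.
def category_xpath(category)
  token_test = "contains(concat(' ', normalize-space(@class), ' '), ' #{category} ')"
  "//section[#{token_test}]//div/p/a"
end

puts category_xpath("cc-news")
```

The returned string can be passed to `driver.find_elements(:xpath, ...)` in place of the literal expression.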
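next if the_link_text.split.size > 3 # skip link texts longer than three words (assumed not to be names)
exclude_words = ["this Reddit thread", "Test Model", "Architect of Quality"]
next if exclude_words.any? { |x| the_link_text.include?(x) } # skip known non-author link texts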
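The two `next if` guards above amount to a small predicate on the link text. A minimal sketch, runnable without a browser, that applies the same rules to a few hand-written strings; the helper name `likely_author?` and the sample texts are made up for illustration:

```ruby
# Hypothetical helper: mirrors the two `next if` guards used in the loop.
EXCLUDE_WORDS = ["this Reddit thread", "Test Model", "Architect of Quality"].freeze

def likely_author?(link_text, exclude_words = EXCLUDE_WORDS)
  return false if link_text.split.size > 3                       # too long to be a name
  return false if exclude_words.any? { |w| link_text.include?(w) } # known non-author text
  true
end

# Illustrative link texts, not scraped data.
samples = ["Zhimin Zhan", "John Ferguson Smart", "this Reddit thread",
           "How to write better end-to-end tests"]
kept = samples.select { |text| likely_author?(text) }
puts kept.inspect
```

This is a heuristic: a short link text that is not a name still gets through unless it is added to the exclusion list, which is why the full script extends that list.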

Execution

it "Extract authors in Software Testing Weekly #153" do
  driver.get("https://softwaretestingweekly.com/issues/153")
  stw_categories = %w(cc-news cc-automation cc-toools cc-tools cc-books cc-videos)
  exclude_words = ["this Reddit thread", "Test Model", "Architect of Quality"]
  category_links = []
  stw_categories.each do |category|
    section_links = driver.find_elements(:xpath, "//section[@class='category #{category}']//div/p/a")
    section_links.each do |one_link|
      the_link_text = one_link.text
      next if the_link_text.split.size > 3
      next if exclude_words.any? { |x| the_link_text.include?(x) }
      category_links << one_link
    end
  end
  author_names = category_links.collect { |elem| elem.text }
  puts "\n" + author_names.size.to_s + " in total"
end
Running one test step in TestWise debugging mode.
Antoine Craske 
Ricardo Bedin
Alan Richardson
Daniel Lehner
Maciej Rojek
Martin Ivison
Ioan Solderea
John Ferguson Smart
Elizabeth Zagroba
Jeff Cechinel
Paul de Witt
Criss Chan
Zhimin Zhan
Lutfi Fitroh Hadi
Dan Neciu
Debojyoti Chatterjee
Zhimin Zhan
Malith Senadheera
Nikola Dimic
Mike Harris
Jennifer Columbe
John Miller
puts author_names.tally
{"Antoine Craske"=>1, "Ricardo Bedin"=>1, "Alan Richardson"=>1, 
"Daniel Lehner"=>1, "Maciej Rojek"=>1, "Martin Ivison"=>1,
"Ioan Solderea"=>1, "John Ferguson Smart"=>1, "Elizabeth Zagroba"=>1,
"Jeff Cechinel"=>1, "Paul de Witt"=>1, "Criss Chan"=>1,
"Zhimin Zhan"=>2, "Lutfi Fitroh Hadi"=>1, "Dan Neciu"=>1,
"Debojyoti Chatterjee"=>1, "Malith Senadheera"=>1, "Nikola Dimic"=>1,
"Mike Harris"=>1, "Jennifer Columbe"=>1, "John Miller"=>1}
sorted = author_names.tally.sort_by(&:last)
[["John Miller", 1],
["Ricardo Bedin", 1],
["Alan Richardson", 1],
["Daniel Lehner", 1],
["Maciej Rojek", 1],
["Martin Ivison", 1],
["Ioan Solderea", 1],
["John Ferguson Smart", 1],
["Elizabeth Zagroba", 1],
["Jeff Cechinel", 1],
["Paul de Witt", 1],
["Criss Chan", 1],
["Antoine Craske", 1],
["Lutfi Fitroh Hadi", 1],
["Dan Neciu", 1],
["Debojyoti Chatterjee", 1],
["Malith Senadheera", 1],
["Nikola Dimic", 1],
["Mike Harris", 1],
["Jennifer Columbe", 1],
["Zhimin Zhan", 2]]
sorted.reverse! # => the most popular first
[["Zhimin Zhan", 2],
["Jennifer Columbe", 1],
...
]
top_10 = sorted[..9]
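The counting and ranking steps need no browser, so they can be tried on their own. A minimal sketch with made-up names (not the scraped data): `tally` counts occurrences, and sorting on `[-count, name]` gives a most-featured-first order with alphabetical tie-breaking, since Ruby's `sort_by` does not guarantee a stable order for ties. The tie-break rule is an addition for illustration and is not part of the script above.

```ruby
# Illustrative names only; not the scraped results.
author_names = ["Ann Lee", "Bob Ray", "Ann Lee", "Cy Orr", "Bob Ray", "Ann Lee"]

counts = author_names.tally # name => number of occurrences (Ruby 2.7+)
# Sort by descending count, breaking ties alphabetically for a repeatable order.
ranked = counts.sort_by { |name, count| [-count, name] }
top_2  = ranked.first(2)
puts top_2.inspect
```

Sorting on the negated count avoids the separate `reverse!` step; `first(2)` behaves like `[..1]` here.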
author_names = []
(56..153).each do |issue_no|
  puts "Issue: #{issue_no}"
  driver.get("https://softwaretestingweekly.com/issues/#{issue_no}")

  # ... see above to extract one
  # ...
  author_names << the_link_text

  sleep 1 # don't hit the server too hard
end

Note: I added a one-second sleep between issue loads to avoid hammering the server.
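The pause can also be factored out of the scraping code. A minimal sketch, assuming a hypothetical helper `each_issue` (not part of the script) that yields each issue number and sleeps between iterations; in real use the block would call `driver.get` and collect the link texts:

```ruby
# Hypothetical helper: visits each issue number, pausing between requests.
# `delay` is in seconds; the block does the page load and scraping.
def each_issue(range, delay: 1)
  range.each_with_index do |issue_no, index|
    sleep delay if index.positive? # pause before every request except the first
    yield issue_no
  end
end

# Dry run with no delay: record the paths that would be visited.
visited = []
each_issue(56..58, delay: 0) { |issue_no| visited << "/issues/#{issue_no}" }
puts visited.inspect
```

Keeping the delay as a keyword argument makes it easy to run a quick dry run with `delay: 0` and a polite crawl with the default.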

[
["Dennis Martinez", 40],
["Zhimin Zhan", 37],
["Antoine Craske", 37],
["Maaret Pyhäjärvi", 30],
["Gleb Bahmutov", 28],
["Pramod Dutta", 24],
["Michael Bolton", 18],
["Mike Harris", 18],
["Callum Akehurst-Ryan", 16],
["Gil Zilberfeld", 16]
]
[
["Zhimin Zhan", 32],
["Gleb Bahmutov", 28],
["Dennis Martinez", 27],
["Pramod Dutta", 24],
["Gil Zilberfeld", 14],
["Oleksandr Romanov", 12],
["Filip Hric", 12],
["Paul Grizzaffi", 11],
["Marie Drake", 11],
["NaveenKumar Namachivayam", 11]
]

My father’s featured count in Coding Jag

A sample article in Coding Jag.
links = driver.find_elements(:tag_name, "a")
link_texts = links.collect { |x| x["href"] }
zhimin_links = link_texts.compact.select { |y| y.include?("zhiminzhan") }.uniq
zhimin_total_count += zhimin_links.count
Total number of articles by Zhimin Zhan on Coding Jag: 60
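The per-page count depends on `compact` and `uniq` doing their jobs. A minimal sketch on hand-written hrefs (the URLs are placeholders, and the reason a link might repeat on a page is an assumption): `compact` drops anchors without an `href`, and `uniq` keeps a link that appears twice on the same page from being counted twice.

```ruby
# Illustrative hrefs for one page; nil stands in for an <a> tag without an href.
hrefs = [
  "https://example.com/zhiminzhan/article-1",
  "https://example.com/zhiminzhan/article-1", # same link repeated on the page
  nil,
  "https://example.com/someone-else/post",
  "https://example.com/zhiminzhan/article-2"
]

matching = hrefs.compact.select { |href| href.include?("zhiminzhan") }.uniq
puts matching.count
```

Note that `uniq` only de-duplicates within one page; the same article linked from two different issues would still be counted once per issue.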

Summary

Full Test Script

require 'rspec'
require 'selenium-webdriver'

describe "Analyse Popular Authors In Software Testing Newsletters" do
  before(:all) do
    @driver = Selenium::WebDriver.for(:chrome)
    driver.manage.window.resize_to(1280, 720)
  end

  after(:all) do
    driver.quit
  end

  def driver
    @driver
  end

  it "Extract authors in Software Testing Weekly #56 to #153" do
    stw_categories = %w(cc-news cc-automation cc-toools cc-tools cc-books cc-videos)
    exclude_words = ["this Reddit thread", "Test Model", "Architect of Quality", "k6", "Cypress", "Playwright", "Postman"]
    author_names = []

    (56..153).each do |issue_no|
      puts "Issue: #{issue_no}"
      driver.get("https://softwaretestingweekly.com/issues/#{issue_no}")
      sleep 0.5
      stw_categories.each do |category|
        section_links = driver.find_elements(:xpath, "//section[@class='category #{category}']//div/p/a")
        section_links.each do |one_link|
          the_link_text = one_link.text
          next if the_link_text.split.size > 3
          next if exclude_words.any? { |x| the_link_text.include?(x) }
          author_names << the_link_text
        end
      end
      sleep 1 # don't hit the server too hard
    end
    puts "\n" + author_names.size.to_s + " in total"
    metrics = author_names.tally
    sorted = metrics.sort_by { |_key, value| value }
    sorted.reverse! # => the most popular first
    top_10 = sorted[..9]

    # save the full ranking to a file (macOS only); File.write closes the file handle
    File.write("/tmp/stw_authors.txt", sorted.inspect + "\n") if RUBY_PLATFORM =~ /darwin/
    puts top_10.inspect
  end
end
> rspec analyse_stw_top_authors_spec.rb
it "Coding Jag" do
  first_issue = 22
  latest_issue = 125

  zhimin_total_count = 0
  (first_issue..latest_issue).each do |issue_no|
    driver.get("https://www.lambdatest.com/newsletter/editions/issue#{issue_no}")
    sleep 0.5
    links = driver.find_elements(:tag_name, "a")
    link_texts = links.collect { |x| x["href"] }
    zhimin_links = link_texts.compact.select { |y| y.include?("zhiminzhan") }.uniq
    zhimin_total_count += zhimin_links.count
    sleep 1
  end
  puts zhimin_total_count
end
