Shallow Thoughts : tags : selenium

Akkana's Musings on Open Source Computing and Technology, Science, and Nature.

Thu, 11 Nov 2021

Selenium: Handling Timeouts and Errors

This is part 3 of my selenium exploration trying to fetch stories from the NY Times (as a subscriber).

Part I: Selenium Basics
Part II: Running Headless on a Server
Part III: Handling Errors and Timeouts (this article)

At the end of Part II, selenium was running on a server with a minimal set of X and GTK libraries installed.

But now that it can run unattended, there's another problem: there are all kinds of ways this can fail, and your script needs to handle those errors somehow.

Before diving in, I should mention that for my original goal, fetching stories from the NY Times as a subscriber, it turned out I didn't need selenium after all. Since handling selenium errors turned out to be so brittle (as I'll describe in this article), I'm now using requests combined with a Python CookieJar. I'll write about that in a future article. Meanwhile ...

Handling Errors and Timeouts

Timeouts are a particular problem with selenium, because there doesn't seem to be any reliable way to change them so the selenium script doesn't hang for ridiculously long periods.
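One generic way to guard against that kind of hang, independent of whatever timeout settings selenium itself offers, is to run the fetching script as a child process and kill it if it passes a hard deadline. The following is only a minimal sketch under stated assumptions, not the approach this series ends up using: it assumes a POSIX system, and the helper name run_with_deadline is hypothetical.

```python
import os
import signal
import subprocess
import sys


def run_with_deadline(cmd, deadline_secs):
    """Run cmd in its own process group and kill the group on timeout.

    Hypothetical helper, POSIX only. Returns (finished, output), where
    finished is False if the deadline expired before cmd exited.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        output, _ = proc.communicate(timeout=deadline_secs)
        return True, output
    except subprocess.TimeoutExpired:
        # Kill the child and anything it started in the same group
        # (for a selenium script, that would include the browser).
        os.killpg(proc.pid, signal.SIGKILL)
        output, _ = proc.communicate()
        return False, output
```

A cron job could then call something like run_with_deadline([sys.executable, "fetch_stories.py"], 120), where fetch_stories.py is a made-up name standing in for the selenium script, and log a failure whenever the first returned value is False. The process group is used because killing only the Python child could leave an orphaned browser running.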

Sun, 07 Nov 2021

Configuring Selenium to Run Headless, Without a Desktop

This is part 2 of my selenium exploration trying to fetch stories from the NY Times (as a subscriber).

Part I: Selenium Basics
Part II: Running Headless on a Server (this article)
Part III: Handling Errors and Timeouts

When we left off, I was learning the basics of selenium in order to fetch stories (as a subscriber) from the New York Times. Fetching stories was working properly, and all that remained was to put it in an automated script, then move it to a server where it could run automatically without my desktop machine needing to be on.

Unfortunately, that turned out to be the hardest part of the problem.

Tue, 02 Nov 2021

Web Scraping with Selenium in Python

This is part 1 of my selenium exploration.

At the New Mexico GNU & Linux User Group, currently meeting virtually on Jitsi, someone expressed interest in scraping websites. Since I do quite a bit of scraping, I offered to give a tutorial on scraping with the Python module BeautifulSoup.

"What about selenium?" he asked. Sorry, I said, I've never needed selenium enough to figure it out.

But then a week later, I found I did have a need.
