Fetching Browser Cookies Programmatically
In my eternal quest for a decent RSS feed for top World/National news, I decided to try subscribing to the New York Times online. But when I tried to add it to my RSS reader, I discovered it wasn't that easy: their login page sometimes presents a captcha, so you can't just set a username and password in the RSS reader.
A common technique for sites like this is to log in with a browser, then copy the browser's cookies into your news reading program. At least, I thought it was a common technique -- but when I tried a web search, examples were surprisingly hard to find.
None of the techniques to examine or save browser cookies were all that simple, so I ended up writing a browser_cookies.py Python script to extract cookies from Chromium and Firefox browsers.
Run with no arguments, it runs locate commands to find Firefox and Chrome cookie files on the system. (There's no error checking there. I suppose there should be, but really, how does anyone live without locate?)
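As a rough illustration of that search step (a sketch, not the code from browser_cookies.py), the following assumes that Firefox stores cookies in a file named cookies.sqlite and that Chromium-family browsers use a file named Cookies. The function names here are hypothetical. Unlike the script described above, it returns an empty list if locate isn't installed.

```python
import os
import shutil
import subprocess

# Assumed file names: cookies.sqlite for Firefox, Cookies for Chromium/Chrome.
COOKIE_FILE_NAMES = ("cookies.sqlite", "Cookies")


def filter_by_basename(paths, names=COOKIE_FILE_NAMES):
    """Keep only paths whose final component is exactly one of names."""
    return [p for p in paths if os.path.basename(p) in names]


def locate_cookie_files(names=COOKIE_FILE_NAMES):
    """Ask locate for candidate paths, then keep exact basename matches."""
    if shutil.which("locate") is None:
        return []  # no locate on this system
    candidates = []
    for name in names:
        result = subprocess.run(
            ["locate", name], capture_output=True, text=True, check=False
        )
        candidates.extend(line for line in result.stdout.splitlines() if line)
    return filter_by_basename(candidates, names)


if __name__ == "__main__":
    for path in locate_cookie_files():
        print(path)
```

Filtering on the exact basename matters because locate matches substrings, so a bare search for Cookies would otherwise return many unrelated paths.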
Chrome cookies are encrypted in an apparently tricky way. The code to extract them comes from a Stack Overflow thread on How to use Chrome cookies in requests. It even includes some Win32-specific code, which I have not tested, but the Linux version seems to work.
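To give a sense of what "tricky" means here, this sketch shows only the key-derivation step, using the standard library. It rests on assumptions about Chromium's Linux scheme that come from outside this post: encrypted values carry a three-byte version prefix such as v10, and the v10 key is derived with PBKDF2-HMAC-SHA1 from the fixed password "peanuts", the salt "saltysalt", one iteration, and a 16-byte key length. The actual AES-CBC decryption needs a third-party crypto library and is left out, and both function names are hypothetical.

```python
import hashlib


def derive_chrome_key(password=b"peanuts", salt=b"saltysalt",
                      iterations=1, length=16):
    """Derive an AES key with PBKDF2-HMAC-SHA1.

    The defaults are assumptions about Chromium's "v10" scheme on Linux;
    other platforms and keyring-backed setups use different inputs.
    """
    return hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=length)


def split_version_prefix(encrypted_value):
    """Split an encrypted cookie value into (version prefix, ciphertext)."""
    return encrypted_value[:3], encrypted_value[3:]


if __name__ == "__main__":
    key = derive_chrome_key()
    print(len(key), "byte key:", key.hex())
```

When the browser stores its password in a desktop keyring, that password would have to be fetched and passed in place of the default.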
On Firefox, there was no decrypting hassle, but there was another problem: Firefox locks the cookies.sqlite file in a way that doesn't allow even read-only access; any attempt to read it raises sqlite3.OperationalError: database is locked. So for Firefox cookies, I had to copy the cookies.sqlite file to a temp file, then open the tempfile in sqlite3.
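The copy-then-open workaround can be sketched roughly like this. It is a minimal illustration, not the script's actual code: it assumes the cookies live in a table named moz_cookies with host, name and value columns, and read_firefox_cookies is a hypothetical name.

```python
import os
import shutil
import sqlite3
import tempfile


def read_firefox_cookies(cookie_db_path):
    """Copy the cookie database to a temp file and query the copy.

    Reading the copy avoids "database is locked" while Firefox is running.
    Returns a list of (host, name, value) tuples.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        tmp_path = os.path.join(tmpdir, "cookies-copy.sqlite")
        shutil.copy2(cookie_db_path, tmp_path)
        conn = sqlite3.connect(tmp_path)
        try:
            # Assumed schema: table moz_cookies with host, name, value columns.
            rows = conn.execute(
                "SELECT host, name, value FROM moz_cookies"
            ).fetchall()
        finally:
            conn.close()
    return rows
```

One caveat: if Firefox is running, its most recent cookie changes may not have been written to the main file yet, so the copy can lag slightly behind what the browser sees.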
Anyway, just a silly little script, fairly easy to write; but it's a lot easier than opening the browser developer tools and then laboriously resizing all the columns to make it possible to see the actual cookie names. I expect it will also come in handy when I just want a quick count of how many cookies a given domain is setting, as part of our LWVLA Privacy Working Group.
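For the per-domain count, something along these lines would probably do. It is a sketch that assumes the cookies are already available as (host, name, value) tuples, with count_cookies_by_host as a hypothetical helper name. It strips a leading dot so that .example.com and example.com are counted together, but makes no attempt to fold subdomains into their parent domain.

```python
from collections import Counter


def count_cookies_by_host(rows):
    """Count cookies per host, given (host, name, value) tuples."""
    counts = Counter()
    for host, _name, _value in rows:
        counts[host.lstrip(".")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample rows, purely for illustration.
    sample = [
        (".example.com", "sid", "abc"),
        ("example.com", "theme", "dark"),
        ("other.org", "pref", "x"),
    ]
    for host, n in count_cookies_by_host(sample).most_common():
        print(n, host)
```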
[ 11:19 Mar 30, 2021 ]