In our discussion of the project, one of our senior developers had mentioned headless Chrome, and that the maintainer of PhantomJS had stepped down because of it. Even after some research, we put that aside for the moment to get a proof of concept working.
Ghost.py & PySide
Ghost.py was recommended by someone who wrote another similar project. It looked like this could have been the way to go: no browser dependency, and it didn't need Selenium (if you haven't used Selenium, you'll understand my pain in a bit); the Qt libraries were more than enough to get it working.
PhantomJS is essentially a Qt-based project, and Qt is extremely useful if you know how to use it, but learning to use it is a pain.
What we learned is that Qt earned its reputation for a reason. After a lot of issues getting Qt to compile, we managed to find a Docker image for Ghost.py that worked for our purposes.
Back to Selenium
Selenium in Python, a familiar friend, ended up being used here. After some work testing the different drivers, we found that Firefox and Chrome worked best with the project. The problem now was that they ran a GUI app unnecessarily, which was why people preferred PhantomJS: it was essentially a completely headless browser. We needed the power of Firefox or Chrome without the overhead of a GUI. In the end, we decided that headless Chrome was the best option.
Since version 59 of Chrome (for Mac and Linux; 60 for Windows), the browser has been able to run from the command line using the --headless switch. It's pretty amazing. Our first thought was to dump Selenium completely and drive Chromium through code. Unfortunately, there was no easy way to do that in Python (at least, not that could be found at the time), and what was required (selecting portions of the rendered page) would not work well. Selenium gave the project the complete control it needed, even with some compromises. We set up our project in Python using an Alpine Linux Docker container:
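from selenium import webdriver
options = webdriver.ChromeOptions()
options.binary_location = '/usr/bin/chromium-browser'  # Chromium path inside the Alpine container
options.add_argument('--headless')  # the headless switch described above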
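To round out the fragment above, here is a minimal sketch of how a headless session and a partial screenshot might fit together. The helper names (`headless_flags`, `crop_box`, `make_driver`, `screenshot_element`), the window size, and the use of Pillow for cropping are assumptions for illustration, not necessarily what the project used; only the binary path comes from the snippet above.

```python
import io


def headless_flags(width=1280, height=1024):
    """Command-line switches for a headless Chromium (the sizes are assumed)."""
    return ["--headless", f"--window-size={width},{height}"]


def crop_box(location, size):
    """Turn Selenium's element location/size dicts into (left, top, right, bottom)."""
    left, top = int(location["x"]), int(location["y"])
    return (left, top, left + int(size["width"]), top + int(size["height"]))


def make_driver(binary="/usr/bin/chromium-browser"):
    # Imported lazily so the pure helpers above work without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.binary_location = binary
    for flag in headless_flags():
        options.add_argument(flag)
    return webdriver.Chrome(options=options)


def screenshot_element(driver, element, path):
    """Save only the region of the screenshot covered by `element`.

    Assumes the element lies inside the captured viewport and that one CSS
    pixel maps to one screenshot pixel.
    """
    from PIL import Image  # Pillow, assumed to be available

    png = driver.get_screenshot_as_png()
    box = crop_box(element.location, element.size)
    Image.open(io.BytesIO(png)).crop(box).save(path)
```

Keeping the crop arithmetic in its own function means it can be checked without starting a browser at all.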
If you've used Selenium before, you're likely aware that it is actually a Java-based project that uses 'drivers' to control the browsers. You are safe if you use the local driver and your language's bindings. Typically Selenium works like a charm, but this project was meant to run continuously in production, so Chromium and chromedriver probably should not run in the same container: they are two distinct programs, and errors from one should not affect the other. You can find more info about Docker container isolation in their documentation.
We settled on using the RemoteWebDriver interface, with Chromium in one container and chromedriver in a separate container. The remote end runs from a JAR file, and the developer experience was rough. When you can connect to RemoteWebDriver and it works, it's like magic; the problem is when it crashes. Not only does the Java stack trace not help at all, but you also need to get the browser up and running again, which is easier said than done.
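Since the painful part is getting a session back after a crash, one option is to wrap session creation in a small retry loop. This is only a sketch: `connect_with_retries`, `remote_chrome`, the attempt count, the delay, and the `http://chromedriver:4444/wd/hub` address are all hypothetical, it assumes a Selenium version whose `webdriver.Remote` accepts `options`, and it covers failures while connecting rather than crashes in the middle of a session.

```python
import time


def connect_with_retries(factory, attempts=3, delay=2.0, sleep=time.sleep):
    """Call `factory()` until it returns a driver or `attempts` is exhausted."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory()
        except Exception as error:  # typically a WebDriverException in practice
            last_error = error
            if attempt < attempts:
                sleep(delay)  # give the remote container time to come back
    raise RuntimeError(f"no session after {attempts} attempts") from last_error


def remote_chrome(url="http://chromedriver:4444/wd/hub"):
    # Hypothetical address of the container running the remote end.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Remote(command_executor=url, options=options)
```

With these in place, `driver = connect_with_retries(remote_chrome)` would try three times before giving up, and the original error stays attached to the final exception for debugging.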
After the project was up and running, Java and Chromium managed to eat away at memory and CPU in the AWS ECS cluster, bringing things essentially to a halt.
It was not fun.
Even with settings like so for Chromium to make sure that certain mounted directories didn’t blow up the container…
…either Chromium would not stop crashing or the ECS instance would grind to a halt.
The next attempt was to run a Grid; don't do this if you don't have to. We figured using Selenium Grid would make sense to offload the work and let containers recover, but not only is Grid essentially a beta feature, it would also need to be maintained for production in the long run.
Our team lead had mentioned Grid providers. We were kind of aware that they existed, but had not considered offloading that problem to them instead. The RemoteWebDriver functionality wasn't completely useless! All that was needed was some credentials and their Grid was accessible, and all we had to deal with was costs. Unfortunately, that didn't fix the other issues: browsers still crash, access can be revoked, and the program still has to work.
BrowserStack has been working, but has been temperamental at best; it’s hard to keep sessions intact, and there tend to be a lot of hanging sessions. There’s reporting, and it works, but it doesn’t feel very clean.
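Sessions can be left hanging when a script exits without calling `quit()`, so a context manager is one way to make cleanup hard to forget. The sketch below assumes nothing about BrowserStack beyond the standard WebDriver `quit()` call, and `managed_session` is a hypothetical helper, not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def managed_session(factory):
    """Yield a driver from `factory()` and always try to quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        try:
            driver.quit()
        except Exception:
            # The remote end may already be gone; nothing more to clean up.
            pass
```

Any callable that returns a driver can be passed in, so the same wrapper works for a local browser or a remote one. It will not rescue sessions that the provider itself leaves open, but it rules out the client-side cause.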
One way or another, it looks like headless Chrome is the way forward; it just remains to be seen how that happens. R.I.P.
- Running Selenium With Headless Chrome
- StackOverflow – How to take partial screenshot with Selenium WebDriver in python?
- Driving Headless Chrome with Python
- phantomjs › [Announcement] Stepping down as maintainer