Over the past week I’ve had a lot of fun working on some small scripts to aid my general day-to-day living. My wife is currently waiting to take her driving test, but our local area only has one test instructor, with a waiting list of over six months! Occasionally slots open up due to cancellations, so she has been logging into the Government website multiple times a day to check for a nearer date. Time for some automation!
To stop her having to constantly go to a website, I decided to knock up a quick PHP script that would basically pretend to be a browser, log in, go to the correct page, and rip out the next available appointment. It’s been a while since I wrote a script like this, so I had a look around for something better than SimpleXML and found Goutte, a simple PHP Web Scraper. Within 10 minutes, my script was working. It looks something like this:
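What follows is a minimal sketch of that flow rather than the exact script, assuming Goutte is installed via Composer and using its `Client` together with the Symfony DomCrawler objects it returns. The URL, button and link text, form field names, CSS selector, environment variable names, booked date, and email addresses are all hypothetical placeholders.

```php
<?php
// Sketch only: every URL, label, field name, selector, and address below
// is a hypothetical placeholder, not taken from the real booking site.
require __DIR__ . '/vendor/autoload.php'; // Composer autoloader (composer require fabpot/goutte)

use Goutte\Client;

/**
 * Log in, navigate to the appointment page, and return the next available date.
 */
function fetchNextAvailableDate(Client $client): DateTimeImmutable
{
    // Page 1: load the login page and submit the form.
    // Goutte keeps cookies between requests, so the session carries over.
    $crawler = $client->request('GET', 'https://booking.example.test/login');
    $form = $crawler->selectButton('Sign in')->form();
    $crawler = $client->submit($form, [
        'username' => (string) getenv('BOOKING_USER'),
        'password' => (string) getenv('BOOKING_PASS'),
    ]);

    // Page 2: follow the link to the page that lists available slots.
    $link = $crawler->selectLink('Change test date')->link();
    $crawler = $client->click($link);

    // Page 3: pull the date text out of the page and parse it.
    $text = trim($crawler->filter('.next-available-date')->text());

    return new DateTimeImmutable($text); // throws if the text is not a parseable date
}

/**
 * True when the date found on the site is earlier than the one already booked.
 */
function isCloser(DateTimeImmutable $found, DateTimeImmutable $booked): bool
{
    return $found < $booked;
}

$booked = new DateTimeImmutable('2030-01-31'); // placeholder for the currently booked test date
$found  = fetchNextAvailableDate(new Client());

if (isCloser($found, $booked)) {
    // Email both of us when a nearer slot shows up.
    mail(
        'me@example.test, wife@example.test',
        'Closer driving test date available',
        'Next available appointment: ' . $found->format('l j F Y')
    );
}
```

Keeping the credentials in environment variables rather than in the script is a choice made for this sketch; any other way of supplying them would work just as well.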
As you can see, the code is pretty simple, readable, and powerful. Within minutes, I was able to jump through three pages without worrying about sessions, cookies, or any of the other nonsense that often crops up when dealing with this kind of scraping. I run this every 5 minutes on a FortRabbit server and we both get emailed if a closer test date appears. For just a few minutes’ work, this means better peace of mind, as we don’t have to check manually and the script checks far more frequently than we ever could. Now to wait and hope for a nearer test date…