Duplicate Messages Bug Fixed on TubeUpdates.com
As you may already know, I run a website called TubeUpdates.com, which provides a RESTful API for the TFL Underground network so you can build apps that use that data. Unlike some other services or XML files that try to provide this data, my API scrapes all of the pages on the TFL site in order to give out the specific messages for each line. For example, rather than just "Minor Delays", you'll be given an array of each message from their site about the delays, such as "Sunday 17 May, suspended between Liverpool Street and Leytonstone. Two rail replacement bus services operate." which is obviously a lot more useful.
However, as my script relies on screen scraping, problems do occur when TFL decide to change their HTML or site structure, which has happened recently. Previously, every tube line had its own page with its messages listed on it, but now there is one single page with all of them, shown and hidden with JavaScript. This meant that for a short period, my API was returning the messages for every single line whichever line you asked for (so if you wanted to look at Circle line messages, you would in fact get the messages for every line).
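To illustrate the kind of fix involved, here is a minimal Python sketch of scoping messages to the line they belong to when everything lives on one page. It is not the actual TubeUpdates code, and the markup is an assumption: the `line` and `message` class names, the `id` values, and the `messages_by_line` helper are all hypothetical, since the real TFL page structure isn't shown here.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real TFL page structure is not shown in this post.
SAMPLE_PAGE = """
<div class="line" id="circle">
  <ul><li class="message">Minor delays due to an earlier signal failure.</li></ul>
</div>
<div class="line" id="central">
  <ul><li class="message">Suspended between Liverpool Street and Leytonstone.</li></ul>
</div>
"""


class LineMessageParser(HTMLParser):
    """Collects messages keyed by the id of the enclosing line container."""

    def __init__(self):
        super().__init__()
        self.messages = {}        # line id -> list of message strings
        self.current_line = None  # id of the line container we are inside
        self.div_depth = 0        # div nesting depth inside that container
        self.in_message = False
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div":
            if self.current_line is None and "line" in classes and attrs.get("id"):
                # Entering a new line container: start a fresh message list.
                self.current_line = attrs["id"]
                self.div_depth = 1
                self.messages.setdefault(self.current_line, [])
            elif self.current_line is not None:
                self.div_depth += 1
        elif tag == "li" and "message" in classes and self.current_line is not None:
            self.in_message = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "div" and self.current_line is not None:
            self.div_depth -= 1
            if self.div_depth == 0:
                # Left the container, so later messages belong to another line.
                self.current_line = None
        elif tag == "li" and self.in_message:
            text = " ".join("".join(self.buffer).split())
            if text and self.current_line is not None:
                self.messages[self.current_line].append(text)
            self.in_message = False

    def handle_data(self, data):
        if self.in_message:
            self.buffer.append(data)


def messages_by_line(html):
    """Return a dict mapping each line id to its own list of messages."""
    parser = LineMessageParser()
    parser.feed(html)
    parser.close()
    return parser.messages
```

Calling `messages_by_line(SAMPLE_PAGE)["circle"]` returns only the Circle line message. A scraper that simply collected every `message` item on the page would reproduce the bug described above, because nothing ties a message back to its line.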
This has now been fixed, and the change actually makes the service slightly better for me as I can now crawl just 3 pages rather than the 14 I was crawling before (which is better for both bandwidth and CPU cycles). I have a number of functions set up to report when TFL change their site structure, as the usual problem is that a site redesign changes class names or markup in such a way that the API just breaks. However, in this case none of these warnings kicked in, because the script was still getting data back without any errors and all seemed to be OK.
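A structure check can't catch this kind of bug, because the markup still parses fine. One extra check that might have helped is a sanity test on the data itself. The following Python sketch is an assumption on my part, not something the API currently does, and `looks_duplicated` is a hypothetical name. It flags the case where every line that has messages reports exactly the same list.

```python
def looks_duplicated(status):
    """Return True when every line that has messages reports the same list.

    `status` maps a line name to its list of message strings.
    """
    # Ignore lines with no messages; a quiet network is not suspicious.
    non_empty = [tuple(msgs) for msgs in status.values() if msgs]
    if len(non_empty) < 2:
        return False
    # One distinct message list shared by several lines suggests a scoping bug.
    return len(set(non_empty)) == 1
```

This is best treated as a warning rather than a hard failure, since a genuine network-wide message could make every line look identical for a while.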
So, a big thank you to those of you who emailed in to report the bug! If you have any questions about the Tube Updates API, please check out the site or drop me an email.