I love soccer. I especially love the World Cup. Despite the well-founded concerns over corruption in FIFA and the social costs of spending so much on a sporting event, it’s truly the largest event in the world. During the 2010 World Cup, 3.2 billion people watched at least one of the games from home. In some countries, over 50% of the viewing population watched that country’s matches. The World Cup is one of the few times we come together as an entire planet.
Every time the World Cup rolls around, some friends and I get together and draft teams to make some of the smaller matches more interesting and to make some friendly wagers. This year, we drafted the day before the Cup started. Once the draft was over, my friends asked me if I could make a website to track who was in the lead at any given time.
As is usual for these things, I first said, well, that’s impossible. There’s no way I can build something that’s up-to-date automatically. Then I said, well, at least I can do something interesting. So I started with a simple MVP where we could all enter match results via the backend and our scores were calculated. But then I got to thinking, why couldn’t we do this automatically? After all, there have to be at least 100 sites that have up-to-date scores. It should be a simple matter of scraping the results and putting them somewhere else.
A simple matter…
First, I went looking for some easy-to-parse JSON APIs. Unfortunately, they all cost at least $100. One thing led to another, and I ended up building my own over the course of a few days. At the beginning, I built a scraper that could get the current results of the group stages and the scores of the current match to pipe them into our company Hubot. Then, I told Casey about it here at Software for Good, and he encouraged me to build a full-fledged JSON API to be released to the world. I decided to make it more full-featured and able to pull in the results of all the games in real time.
I spent the weekend playing with scraping the HTML of a few different sites, and I was finally able to scrape what I wanted. I set up a few simple associations between teams and matches in Rails, ran into some time zone issues, learned the FIFA country codes, and taught myself how to use RABL to structure the JSON output. Voilà – the first full JSON API for the World Cup that is publicly available for free (as far as I know).
Using the API
Okay, this is what you all came for. The endpoints are as follows:
http://worldcup.sfg.io/matches
(All match data in JSON, updated every minute.)
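If you just want to pull the data into a script, here is a minimal Python sketch using only the standard library. The `fetch_json` helper and its `opener` parameter are names made up for this example, not part of the API, and the sketch assumes nothing about the fields inside each match beyond the response being valid JSON.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://worldcup.sfg.io"


def fetch_json(path, opener=urlopen, timeout=10):
    """GET BASE_URL + path and return the decoded JSON body.

    `opener` is a hypothetical hook so the function can be exercised
    without network access; by default it is urllib's urlopen.
    """
    with opener(BASE_URL + path, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json("/matches")` should hand back whatever the endpoint returns, already parsed. Since the data only updates once a minute, there is no point in polling more often than that.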
More specialized match data can be pulled in as well. For example, the following endpoints all work:
http://worldcup.sfg.io/matches/today
http://worldcup.sfg.io/matches/tomorrow
http://worldcup.sfg.io/matches/current
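If you are scripting against these, a small helper keeps typos out of the path. This is only a sketch: `matches_url` and `MATCH_SCOPES` are hypothetical names, and the set of scopes is simply the three endpoints listed above.

```python
BASE_URL = "http://worldcup.sfg.io"

# The three specialized scopes listed above.
MATCH_SCOPES = ("today", "tomorrow", "current")


def matches_url(scope=None):
    """Return the URL for all matches, or for one specialized scope."""
    if scope is None:
        return f"{BASE_URL}/matches"
    if scope not in MATCH_SCOPES:
        raise ValueError(f"unknown scope {scope!r}; expected one of {MATCH_SCOPES}")
    return f"{BASE_URL}/matches/{scope}"
```

For example, `matches_url("current")` gives the URL for the match in progress, and an unknown scope raises a `ValueError` instead of quietly requesting a path that may not exist.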
You can also view the matches of any team, if you know the country’s FIFA code:
http://worldcup.sfg.io/matches/country?fifa_code=BRA
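Here is a rough Python sketch for building that URL from a code. The function name `country_matches_url` is made up for this example, and uppercasing the input is an assumption based on the `BRA` example above; the API may or may not care about case.

```python
from urllib.parse import urlencode

BASE_URL = "http://worldcup.sfg.io"


def country_matches_url(fifa_code):
    """Build the per-team matches URL from a three-letter FIFA code."""
    # Assumption: the API expects uppercase codes, as in the BRA example.
    code = fifa_code.strip().upper()
    if len(code) != 3 or not code.isalpha():
        raise ValueError(f"FIFA codes are three letters, got {fifa_code!r}")
    return f"{BASE_URL}/matches/country?" + urlencode({"fifa_code": code})
```

The length check only catches obvious slips such as passing "Brazil" instead of "BRA"; it does not know which codes actually belong to teams in the tournament.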
During the group stages, you can get the current results in every group (wins, losses, draws, goals_for, goals_against, and knockout status) by going to:
http://worldcup.sfg.io/group_results
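As an idea of what you could do with those fields, the Python sketch below ranks teams by points (3 for a win, 1 for a draw), then goal difference, then goals scored. It assumes each team arrives as a flat dictionary keyed by the field names above; the `country` key and the sample records are invented for illustration, so check them against the real response before relying on this.

```python
def standings(teams):
    """Sort team records by points, then goal difference, then goals scored."""
    def key(team):
        points = 3 * team["wins"] + team["draws"]
        goal_diff = team["goals_for"] - team["goals_against"]
        return (points, goal_diff, team["goals_for"])

    return sorted(teams, key=key, reverse=True)


# Made-up records for illustration only; these are not real results.
sample = [
    {"country": "AAA", "wins": 1, "draws": 1, "losses": 1, "goals_for": 3, "goals_against": 3},
    {"country": "BBB", "wins": 2, "draws": 0, "losses": 1, "goals_for": 4, "goals_against": 2},
    {"country": "CCC", "wins": 1, "draws": 1, "losses": 1, "goals_for": 4, "goals_against": 3},
]

for team in standings(sample):
    print(team["country"])
```

This only covers the first few tiebreakers. Teams that are still level after goals scored need head-to-head results, which this sketch does not attempt.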
Code
The code is open source and available at https://github.com/estiens/world_cup_json — feel free to host your own server or submit a pull request.
Moral
It wouldn’t be a blog post without a moral. Time and time again in my programming journey, I’ve come up against a problem that I was convinced I couldn’t solve. Time and time again, I was able to prove myself wrong by persevering.
The general progression goes like this. No way I can do this >> Maybe I can do this >> Okay, I got it part of the way there but there’s no WAY I can get it fully working >> Okay! I can get it fully working! (Then start all over again with, there’s no way I can refactor this mess into something that looks better…)
How I felt after the scraper started working correctly.
At this stage in my programming career, I like to think that I code a bit like the US team plays. It’s not always the most beautiful code in the world and it doesn’t come easy, but I’m able to stick to it and come out with a win at the last minute.
Now, on to US vs Portugal!
P.S. We can’t guarantee that this will always be up to date as the scraper may break, but we’ll do our best to keep it working through the final whistle. If you make anything cool with the API, whether a website or a Hubot add-on or an SMS goal notifier, please let us know!