As mentioned a few times above - I agree, calculating the ratings/scores and making them readable enough to interpret relative team strength is not terribly complicated once someone has the data. Sure - there are weighting options for how much to value newer games vs. older games, the minimum number of games a team needs to be included, and a few more, but that's all tweakable. Getting the data from gotsport/gotsoccer/all the random tournament sites, within a few days of it being posted, is the secret sauce to anything like this at scale. But even that isn't necessarily as technically complicated as one might think at first glance. There is a ton of software available to handle the data acquisition by screen scraping websites and getting the results into a usable data format. Check out
lists like this. Once configured, a scraper likely only needs tweaking for a site once someone screams that it's not working. The other issue, as has also been described above, is how to represent that this team is actually this team across platforms where they are described differently. youthsoccerrankings had put together a pretty good system for that, where anyone could both programmatically merge teams that were the same and report where a team's data was being shown incorrectly by removing a data source. We only saw the front end, and not necessarily how it was handled on the backend, but it appeared that much of it was automated - and not just sending a message to a person via email for them to fix manually. That allows it to scale to thousands and thousands of teams, and eventually millions of games. Yes - that does allow for data pollution, as people might unintentionally (or even intentionally) miscategorize teams or add/remove incorrect data sources - but you can deal with it only when someone screams, rather than being in the middle of every transaction.
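
For what it's worth, here is a rough Python sketch of how a self-service merge/remove setup like that could work. To be clear, this is only a guess at the general idea, not how youthsoccerrankings actually built it. The `TeamRegistry` class, the team names, and the "tournament-site" source are all made up for illustration. Every listing from every site starts out as its own team, users fold listings together or split one back off, and every change goes into a log so bad edits can be found later.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TeamRegistry:
    """Hypothetical sketch: maps (site, name-as-listed) to one canonical team id."""
    canonical: dict = field(default_factory=dict)  # (site, name) -> team id
    log: list = field(default_factory=list)        # audit trail of user edits
    _ids: count = field(default_factory=lambda: count(1))

    def add_source(self, site, name):
        """Register a listing; a new listing starts out as its own team."""
        key = (site, name)
        if key not in self.canonical:
            self.canonical[key] = next(self._ids)
        return self.canonical[key]

    def merge(self, key_a, key_b, user):
        """A user says two listings are the same team: fold b's team into a's."""
        keep, drop = self.canonical[key_a], self.canonical[key_b]
        if keep != drop:
            for key, team in self.canonical.items():
                if team == drop:
                    self.canonical[key] = keep
            self.log.append(("merge", key_a, key_b, user))
        return keep

    def detach(self, key, user):
        """A user reports a listing on the wrong team: split it back off."""
        self.canonical[key] = next(self._ids)
        self.log.append(("detach", key, user))
        return self.canonical[key]

    def sources_for(self, team_id):
        """All listings currently attached to a canonical team."""
        return sorted(k for k, t in self.canonical.items() if t == team_id)


# Made-up listings of the same (and a different) team on two sites.
reg = TeamRegistry()
a = ("gotsport", "FC Example 2012 Boys")
b = ("tournament-site", "Example FC B12")
c = ("gotsport", "FC Example 2011 Boys")  # actually a different age group
for site, name in (a, b, c):
    reg.add_source(site, name)

team = reg.merge(a, b, user="parent1")  # correct merge
reg.merge(a, c, user="parent2")         # mistaken merge (data pollution)
reg.detach(c, user="parent3")           # someone screams, it gets fixed
print(reg.sources_for(team))
```

Nobody has to approve each change, which is the part that scales. The log is what you go back to when someone complains.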