Youth Soccer Rankings?

I heard that one of the coaches of a 2nd-bracket G2010 ECNL team told parents they were close to being placed in the 1st bracket but just missed out because of their league record.

If this is true, it implies that league records somehow factor into the bracket Surf Cup places teams in. It also implies that if they "just missed" the 1st bracket, some kind of numerical ranking algorithm is being used.

Or maybe by "just missed" Surf Cup was being literal, and that club's dart just missed the board.

There are 8 ECNL teams in the Super Black (aka second) bracket for G2010. While I haven’t gone through each team to evaluate their rankings, I don’t see any there that should be in Best of the Best. Who am I overlooking? You might argue that LA Breakers FC ECNL should be in over DMCV Sharks ECNL.
 
Sporting, Liverpool, or Hawaii would be my choice over Breakers and Sharks.

I think Breakers and Sharks would be more evenly matched against Slammers RL (last year's Surf Cup winner, Vegas Cup finalist, and Mojave division champ), Blues RL (Southwest and Sonoran division champ and last year's finalist), and Beach RL (played in Virginia and competes with Slammers and Blues).

I think some of the Super White bracket and Super Black bracket teams could be interchanged for better balance (bring up the high-level RL teams to play those mid-tier NL teams).
 
IF that’s Hawaii Rush then I agree that it should be Best of Best. Crossfire would be the one to drop, though. Otherwise, I’m still not really seeing anything drastically off in Best of Best.

By rankings, BoB Group 4 is the toughest, with all four teams in the national top 40. BoB Group 3 is the weakest (among BoB). If it were me, I would have put Beach in Group 3.

There are a couple of teams that probably don’t belong in the top 3 brackets (Rage, Bay Area Surf) and a couple that could have been placed higher (Beach RL, LFCIA).

I went ahead and pulled the #s into a spreadsheet this morning.
ETA: The #s support your assertion that Super Black and Super White could be mixed better.
 
Very. A higher-rated team will beat another rated team 82% of the time. A higher-rated team in the top 100 nationally will beat another team in the top 100 nationally 75% of the time. Does that mean every ranking from 1 to 2000 in each age group is exactly correct and will predict who wins the next game with 100% certainty? Of course not. Something that could do that would be science fiction rather than an actual rating system. But anyone claiming they are terribly inaccurate is either intentionally obtuse or doesn't understand how probability works.

Koge's win from last month hadn't been linked here yet, but once it is, you can see how well they did in the finals. They overperformed in all 4 games (all 4 marked green), and in doing so raised their rating/ranking significantly.


That said, the 05/04 rankings (and soon, the 06/05 rankings) are probably the wonkiest in terms of making sure each team has all of its games (and none of anyone else's) correctly assigned, since that's when the age groups shift from one per birth year to two. Some clubs keep individual-year teams, some go to two years, and some go to two years but keep the name of the single year. None of that complication exists for the other groups from U9 to U17.

Appreciate the response. Not sure I appreciate the insult… Never said they were “terribly inaccurate”. Simply questioned the accuracy.
 
My apologies. Someone else had said something silly about the ratings on a different thread due to where Koge was showing at the time, and I conflated two separate users. For what it's worth, Koge's recent performance (and the relative performance of everyone else since that time), has pushed them up to #3 in the country for U19G in SR, out of 1468 ranked teams.


Sadly, all of these ratings/rankings for U19 are going away on 8/1 as the years roll over to the next season, so anyone interested/invested in a particular U19 team should capture screenshots soon if relevant.
 
That's not the Hawaii Rush team that just won National Cup.

Will be good to see all the teams with their new rosters.
 
I wanted to come back and bump this thread with some new information about the Soccer Rankings (SR) app. I weighed the options of just starting a new thread, but figured it might make more sense to have the information consolidated here where there has already been so much discussion about the ratings/rankings/algorithm/etc.

So today Mark made a pretty incredible discovery, and I'm giddy because it was at least partially based on a suggestion I gave him. But before I get there, a little background might help ground the discussion. First off, the way this system works is pretty well known and well described at this point, at least to folks who frequent this board. Game data is pulled in from various electronic sources and assigned to a team entity. If a correct team entity for the data can't be identified, a new team entity is created. Rinse and repeat, continuing to add game results to each entity. If a game result has a rated team on the other side of it, the rating for each team is adjusted based on the new result. The ratings of the two teams are compared, and if the actual goal difference is more than the existing ratings expected, the team that overperformed has its rating bumped up a smidge. If the goal difference is less than expected, the team that underperformed has its rating bumped down a smidge. And if the goal difference is pretty much spot on with what was expected, neither team's rating moves much at all. (More details on this are in the FAQ for the app.)
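The goal-difference update described above can be sketched in a few lines. To be clear, this is a guess at the shape of the rule, not SR's actual code: the real constants and expected-margin function aren't public, so `K` and `EXPECTED_MARGIN_SCALE` are invented illustration values.

```python
# Hypothetical sketch of the goal-difference rating update described above.
# The real SR constants and expected-margin function are not public;
# K and EXPECTED_MARGIN_SCALE are invented illustration values.

K = 0.1                      # how strongly one result moves a rating
EXPECTED_MARGIN_SCALE = 1.0  # rating gap that predicts a 1-goal margin

def update_ratings(rating_a, rating_b, goals_a, goals_b):
    """Adjust two teams' ratings based on one game result."""
    expected_margin = (rating_a - rating_b) / EXPECTED_MARGIN_SCALE
    actual_margin = goals_a - goals_b
    # Overperforming expectations bumps the rating up a smidge,
    # underperforming bumps it down; a spot-on result barely moves it.
    delta = K * (actual_margin - expected_margin)
    return rating_a + delta, rating_b - delta

# Evenly rated teams, but A wins 3-0: A's rating rises, B's falls.
a, b = update_ratings(5.0, 5.0, 3, 0)
```

Note the update is zero-sum here; whether SR conserves rating points this way is another assumption.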

There are a couple of outcomes of these ratings, but essentially they are useful for predicting what will happen when two rated teams compete. Those predictions can be used to flight tournaments, choose proper league brackets, or, just for fun, to guess how an upcoming weekend may play out. Now, these predictions are never going to be 100% accurate (right every time) or 0% accurate (wrong every time); but the better the data and the better the algorithm, the better the quality of the predictions. For definitions, Mark uses "predictive power" for this concept: 0% predictive power means a coin flip (getting no better than 50% correct), and 100% predictive power = god. You can convert predictive power to the % of results correctly predicted by dividing by 2 and adding 50%, so 70% predictive power translates to getting 85% of predictions correct. In all of these trials, "correct" is defined as picking the correct winner, for games that result in a winner. If the wrong winner is chosen, it's a failure. Tie games are excluded from these predictivity results.
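That conversion is simple enough to write down directly; the function names here are mine, not the app's:

```python
# "Predictive power" <-> accuracy conversion, as described in the thread.

def power_to_accuracy(power_pct):
    """Predictive power (0 = coin flip, 100 = perfect) -> % of winners picked."""
    return power_pct / 2 + 50

def accuracy_to_power(accuracy_pct):
    """Inverse: % of winners correctly picked -> predictive power."""
    return (accuracy_pct - 50) * 2

assert power_to_accuracy(70) == 85  # the example from the text
assert power_to_accuracy(0) == 50   # 0% power is a coin flip
```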

With this setup, the predictivity of the app isn't an estimate or a guess; it's a specific number that can be calculated as often as desired. Run through all the stored games in the database, compare the predicted results (from the comparative ratings) against the actual game results, divide the correct predictions by the total games predicted, and one number gets spit out. As of today, this number is 66.7% predictive over all games, which translates into picking the correct winner 83.35% of the time. So, as expected, it's way better than a coin flip and will pick the right winner about 5 out of 6 times. This predictive number validates that the ratings derived from the algorithm have a certain level of accuracy. If the ratings were wildly inaccurate, the predictive number would trend toward 0%; if the ratings were supernatural, it would trend toward 100%. But by any measure, the real, provable, actual predictivity number is pretty darned good (and better than another well-known ranking system by more than 50 points, which is insane). For any skeptics who doubt that youth soccer can be ranked/rated, or even skeptics of this particular algorithm/ranking system, the predictivity number is what mathematically shows the expected probability, and it's an admirable number.
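The calculation described there reduces to a loop over stored games. This sketch follows the thread's definition (ties excluded, a pick is correct if the higher-rated team wins); the team names, ratings, and game records are invented sample data, not SR's:

```python
# A sketch of the predictivity calculation described above. The rating
# lookup and game records below are invented sample data, not SR's.

def predictivity(games, ratings):
    """games: (team_a, team_b, goals_a, goals_b) tuples; ties are excluded,
    matching the definition in the thread."""
    correct = total = 0
    for team_a, team_b, goals_a, goals_b in games:
        if goals_a == goals_b:
            continue  # tie games count neither for nor against
        predicted_a_wins = ratings[team_a] > ratings[team_b]
        actual_a_wins = goals_a > goals_b
        correct += (predicted_a_wins == actual_a_wins)
        total += 1
    accuracy = correct / total          # fraction of winners picked right
    return (accuracy - 0.5) * 2 * 100   # back to "predictive power" in %

ratings = {"MVLA": 9.1, "Solar": 8.7, "Rage": 6.2}
games = [("MVLA", "Rage", 4, 0),   # favorite wins: correct pick
         ("Solar", "Rage", 2, 3),  # upset: incorrect pick
         ("MVLA", "Solar", 1, 1)]  # tie: excluded
```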

But that still isn't the interesting discovery. Here it comes. There is an intuition, even among proponents of this type of comparative rating based on goal differences, that the quality of the data (and the predictions) depends on how close the compared teams are to each other and how many shared opponents they have. The more interplay, the better; the less interplay, the more drift. I believed that to be the case, as it seems reasonable. For example, if teams are in the same league, conference, or even state, they play each other enough that their comparative ratings will be honed and sharpened against each other, and would have higher predictive value. Conversely, if you're comparing teams that are not in the same league or location, may have never seen each other before, and have few if any common opponents, it makes intuitive sense that their comparative ratings would drift a bit more and would be somewhat less accurate. Remember, this actual predictivity, the quality of each prediction, can be calculated by looking at the existing data for games that fit this category.

So what I suggested to Mark (and to be fair, he had also thought of it himself within the past few days) was that he should exclude all in-state games and measure the predictivity of interstate games exclusively: CA teams playing AZ, TX playing OK, or any other permutation where the opposing teams are in different states. This measures how good the predictions are when there is very little shared information going into the upcoming game, i.e., when interplay is low. It represents what happens when you go to a big tournament elsewhere, as opposed to predicting a local league game. He coded the query, ran the data, and a few hours later the number was spat out: for these interstate games, the algorithm is 67.0% predictive, which translates into picking the correct winner 83.5% of the time. So all of the intuitive worry about drift, or local data being more refined than remote data, turned out to be a false intuition. The comparative ratings, even when used across different states, provide just as good (in fact a teensy bit better) predictions as when they are applied to local/in-league contests. If a team has sufficient data to be rated, that rating can be trusted whether or not there has been extensive interplay. It's an incredible finding, and it validates all of the work Mark and his team have done over the years to polish and refine the algorithm, tying game data to a useful rating.
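The split itself is the easy part; it only needs each team mapped to a home state. This sketch shows the partition under that assumption; the `state` lookup and sample data are hypothetical, and the resulting buckets would each be fed to the predictivity calculation:

```python
# A sketch of the in-state vs. interstate split. The `state` lookup is a
# hypothetical mapping; SR's actual team metadata isn't shown here.

def split_by_state(games, state):
    """Partition (team_a, team_b, ...) game tuples into in-state and
    interstate buckets so predictivity can be measured on each."""
    in_state, interstate = [], []
    for game in games:
        bucket = in_state if state[game[0]] == state[game[1]] else interstate
        bucket.append(game)
    return in_state, interstate

state = {"MVLA": "CA", "Blues": "CA", "Solar": "TX"}
games = [("MVLA", "Blues", 2, 1), ("MVLA", "Solar", 0, 1)]
local, remote = split_by_state(games, state)
```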

And now to a real-world use, it looks like we're predicted to lose both games this Saturday with my youngest's team, so what's the leading recommendation to fill my thermos?
I had recently wondered about the interstate calculation. I saw that it came out fairly predictive in real world results and was properly impressed.
 
Are you doing this through an API?

If Mark is holding out the knowledge of an API for us - there will be pitchforks! :D I've spent way too much time transposing from the app into spreadsheets this past year!

League play kicked off yesterday for the fall season for us. Predictions for both games were as accurate as expected. Unfortunately, the first one was predicted to be a 2-1 loss; having the final result turn out to be a 2-1 loss is a small consolation. The 2nd game was predicted to be a 2-goal win and ended up being a 3-goal win.
 
If API means “Arduously Pulled by I” then, yes!!!

No, I copied/pasted the brackets from the website into Excel then manually looked up each team’s ranking in SR.
 
I imagine it's pretty straightforward - if the game scores are available electronically to the public, they will almost certainly be pulled into SR. If they are limited from public view and require specific access - it's certainly possible they will never get to SR, so the teams would eventually drop out of the ratings entirely.
 
...If they are limited from public view and require specific access - it's certainly possible they will never get to SR, so the teams would eventually drop out of the ratings entirely.
I think that would hurt the SR, but also MLS Next's reputation. I don't know whether they care about rankings at all, but I do think that as the teams move up, it helps tilt people towards choosing MLS Next over ECNL.
 
I think that's true - but it's even more practical than that IMO. What is so secretive about MLS Next game scores that they choose not to publish them publicly? It would be a sign of weakness rather than strength. We'll see soon enough.
 
Except that the not-published scores are only MLS v MLS. Any scores vs. other teams are still published (because they happen in tournaments).

But I suspect this has to do more with dealmaking and exclusivity than public vs. private.
 
Mark's update today adds some features. Each team page now has a schedule-strength rating, showing the aggregate ranking/rating of all opponents the team has faced in the past year. It's another data point for assessing whether a team is generally playing stronger, weaker, or similarly competitive opposition. There is an updated Club page, which now has a bar graph showing the total number of teams in the chosen club, separated by age group and gender. Both the Team and Club pages now have tabs at the top, eliminating much of the scrolling previously needed to reach the different genders and the game data sources. And the 2016s have been officially released, so those rankings/ratings are fully visible after a soft launch a short while back.
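For anyone curious what "aggregate rating of all opponents" could look like, here is a minimal guess. The actual aggregation SR uses isn't documented in this thread, so a plain average is assumed purely for illustration:

```python
# A minimal guess at how a "schedule strength" number could be derived:
# aggregate the ratings of every opponent faced over the past year.
# SR's actual aggregation isn't documented here; a plain average is assumed.

def schedule_strength(opponent_ratings):
    """Average rating of all opponents faced over the window."""
    return sum(opponent_ratings) / len(opponent_ratings)

# A team that beat up on weak opposition vs. one with a tough slate:
assert schedule_strength([4.0, 5.0, 4.5]) < schedule_strength([8.5, 9.0, 8.0])
```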
 
I don’t know about the app. In girls 2010, seeing those GA teams ranked so high while they are losing to nobodies, while ECNL Southwest teams continue to beat each other up and are ranked lower, makes no sense to me. Just looking at what the 2010 Southwest teams did to all these top-ranked teams in their states during ECNL playoffs last year proves how misleading these rankings really are, at least in that age bracket. How is Tophat still ranked so high? It’s hilarious to me.
 
If you're actually interested in understanding how accurate (or inaccurate) rating systems can be, it's not a hunch or a feeling - it can be tied directly to the results and measured/verified mathematically. The post in this same thread a few pages back explains exactly how; it's probably worth the quick read.

There are ~3300 2010G teams in the US with enough recent matches against other rated teams to be rated. A higher rated team will beat another rated team ~82% of the time. Does that mean that every ranking from 1 to 3300 is exactly correct and will predict who wins the next game with 100% certainty? Nope. But believing that anything can predict the future with 100% accuracy is silly - and debunking a predictive engine for not being 100% accurate is equally silly.

Looking at 2010G - out of those 3300 teams, if you focus on just the top 50, it's down to just the top 1.5% of teams across the country. Of those 50, 35 of them are ECNL, 11 of them are GA, and 4 are other. Cutting it down to the top 20 - 17 of them are ECNL, 3 of them are GA. All of these teams are quite good, by any reasonable definition.

The one that you brought up (TH 2010G) is quite an outlier, and was showing #2 in the country for a while before starting to move down as their more recent results aren't at the same level as their previous season. It's been discussed quite a bit here, as the results in the app (and on the field) stand out. Last season they put up an incredible record in GA, were the class of the league nationally, went undefeated in the GA Champions League Finals, and went undefeated in the GA Summer Showcase Playoffs as recently as June. Their rating reflected those game results. Looking at their game history in the app for all of 2023, their rating predicted their game results quite closely: there were only 4 games where they significantly overperformed their rating and zero where they significantly underperformed it (shown by game results highlighted in green or red). However, since they restarted play in August, they have significantly underperformed in 4 of 6 games, gone 2-3-1, with their only convincing blowout win coming against a team ranked outside the top 500 in the country. The recently added schedule-strength data gives even more insight: TH's schedule over the past year was only the 54th strongest in the country, meaning their rating is built on convincingly and thoroughly beating up on teams that, for the most part, they outclassed. There are other teams in the top 10 with a similar profile: Real Colorado's schedule was 31st in the country, Solar's was 25th, and even #1-ranked MVLA's schedule was 24th. The remaining top-10 teams, though, all had one of the ten hardest schedules in the country while earning similar ratings.

Given all that, it's understandable that people would want TH 2010G to play head-to-head with some of the other current top 10 teams to see if they really are (were?) that good, or if their rating that they have earned is misleading and would not predict their performance against the teams this board is familiar with. Some people here believe that it would be a validation of the strength of this particular GA team and would surprise the masses here. Others are pretty confident that wouldn't be the case.
 
It looks like SR is implementing a stop-gap procedure to deal with MLS Next dragging their feet about publishing any results so far this season. When I opened the app yesterday, there was a pop-up message explaining that some MLS Next teams were going to start dropping out of the ratings over time as their existing results age out, because MLS Next's website continues to say "coming soon" for results. If you are connected in some way to an MLS team, Mark can open up a feature in the app for people to self-report scores for their own team or even their entire bracket if they have access. My hunch is that the average # of SR users per MLS Next team is > 1, so I believe it's unlikely that scores won't be reported at some point - given that unwelcome outcome for their ratings if they aren't. Hopefully the website gets sorted at some point and this kludge becomes unnecessary.
 
Don’t think people care enough to consistently report.
Even if some do, it’ll be spotty and unreliable.
 
You might be right - I guess we'll see over time. But even for those who profess to not care about it a whit - if the scores for their kid's team show up incorrectly and disadvantage its rating/ranking, I'd bet it's unlikely that 100% of the parents, let alone all of the club's leadership and support staff, would be happy to ignore and leave that bad data. It doesn't take everyone to care about it, it doesn't take most, it doesn't even take a few - it just takes one.
 