With the help of DoughBoy, I’ve compiled all the season 1 and 2 ST results and plugged them into the TrueSkill algorithm used to determine Xbox Live rankings! Here are the results for the top 25 players who have participated in season 1 & 2 ST brackets:
I will update the results after the Portland tourney, and continue to update them as season 3 goes on!
I’m very curious to hear what you guys think. Do you think these match up to your expectations of real rankings? Two players with similar skill numbers means that the program thinks those two players are evenly matched: In other words, they should have a 50-50 chance of beating each other.
Also, it’s important to note that the skill number means more than the rank. The rank is just a ladder; the skill number should be a much truer reflection of your performance.
Please discuss!
(edited to change rankings scale and round to nearest integer)
These rankings are noticeably different from the old SRK Apex rankings. In the Apex rankings, you got points for just showing up. So someone could enter 4000 tournaments where he beat his little sister and her club of My Little Pony collectors, and rack up Apex points. In this system, people who show up infrequently can still place highly (e.g. RayBladeX). The more you show up, the more accurate your skill ranking is, but you don’t get more or fewer points for attendance.
Just for those who haven’t read up on old posts in the various NW threads, this system is basically a rip-off of the published TrueSkill algorithm that’s used in ranking XBL players. It’s public knowledge, so Beasley is not breaking any laws that I’m aware of. This will most likely also be the formula used to rank STHD players on XBL once that releases.
Cross pollination from different environments will help determine your ranking against people from other areas even if you’ve never played each other. If zass whoops on us, goes to EVO and gets crushed by Choiboy, and Choiboy consistently gets beaten by some dude in Cali, then the random dude in Cali will have a higher skill ranking than all of us. Your skill ranking will be retroactively altered with every result recorded by people you’ve played against in the past. For that reason, the more data we collect for the database, the better.
A few notes:
zass has been very passionate about this project, pouring hours of his time into coding the web app to store and display results. I think it is his intent to find reliable hosting and make it so that enthusiasts will be able to view the competitive history of players from any web browser. If all goes well, you’ll be able to see the complete tournament history of yourself and other players online one day.
Not only will you be able to view results, tournament organizers will be able to submit results from their local events. So long as there exists some amount of data regarding how players in one region matched up to at least one other person elsewhere, you can get a sense of how good you are compared to someone halfway across the world.
Cross pollination is crucial. So I’m gonna try to remember to ask Zach for the chance to borrow the Evo brackets and input them.
All of the results of ST tournaments from season 1 have been inputted.
Sadly, all brackets from season 2 (except for the one last Friday) have been lost. Results from the top 8 were lifted from Deezo’s postings in the Dojo, and whatever matches were videotaped have been scored. However, the majority of the data from the early rounds of season 2 brackets has been lost to the Gods. So if you lost one of your first 2 matches in the winners’ bracket and didn’t make top 8, odds are it won’t count against your record. This is a shame since every result is important in such a system.
Due to the previous point, I would imagine that if you’ve been doing significantly better in season 2 than season 1 (or if you only started playing in season 2), your skill ranking probably didn’t get the boost that it deserves. The rankings are far more skewed in favor of strong season 1 performances. My apologies for this. Just keep showing up and we’ll get more data to feed the beast.
This is America. We love freedom, so we will eventually include MvC2 results. It would be preposterous otherwise.
Yeah, this makes no sense to me. It looks like all I have to do is show up and I can get top 10, which would make the rankings meaningless to me. This is almost like the BCS: you get ranked, you just don’t know how.
You can think of skill as a “level”… the more tournaments you enter, the more sure we are about your level. But your level won’t change just because you enter more tournaments.
So Itazan, Umbrella, and JTM are all level 23. This means they should all be of similar skill.
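To put a number on "similar skill", here is a rough sketch of how a TrueSkill-style model turns two ratings into a win chance. The (mu, sigma) values and the beta constant below are assumptions for illustration only: the numbers posted in this thread were rescaled and rounded, so they won't plug in directly, and this is not the code running on the laptop.

```python
import math

def win_probability(mu_a, sigma_a, mu_b, sigma_b, beta=25 / 6):
    """Chance that player A beats player B under a TrueSkill-style Gaussian model.

    beta is the assumed per-game performance noise (TrueSkill's published
    default on its 0-50 scale); the thread's rescaled numbers may differ.
    """
    # Combine both players' uncertainty with the per-game noise.
    spread = math.sqrt(2 * beta**2 + sigma_a**2 + sigma_b**2)
    # Standard normal CDF of the scaled skill gap.
    return 0.5 * (1 + math.erf((mu_a - mu_b) / (spread * math.sqrt(2))))

# Hypothetical players: two at the same level, then a 27 against a 21.
even = win_probability(23, 2, 23, 2)
favored = win_probability(27, 2, 21, 2)
print(f"23 vs 23: {even:.2f}")
print(f"27 vs 21: {favored:.2f}")
```

Equal levels come out at exactly 50-50 no matter how uncertain either rating is; the bigger the gap, the further the odds tilt toward the higher-rated player.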
I dunno if you already explained this to me, but how does losing to a particular person affect your rankings? Like I’ve lost to everyone above me with the exception of Ray (never played him). I’ve also lost to nine of the people below me, but since most of those losses took place in Season One I dunno how badly that affected my ranking.
Actually, attendance basically nets you zero gain in the rankings. I think the calculations are based on the Elo system, where the strength of the people you play affects your score. The more surprising a result is given the ranking gap between you and your opponent, the more your score goes up or down after the match. So beating someone ranked hella low won’t affect your score much, and beating the same person repeatedly will move both of your scores less and less. Note that RayBladeX and Itazan have only been to a single tournament, and they’re basically above 90% of the 50 or so players that have thus far been entered into the system.
I think there might also be an optional time-decay modifier in place, so that the older an event is, the less it affects your score. For those who are interested, here’s an intro to how it works:
Something I should point out though. This system COMPLETELY neglects your tournament placement. It looks strictly at your ranking vs. the ranking of the person you played on a match-by-match basis. So if there were only 4 really low ranked people who show up for the next tournament, they’d each earn points towards their season rankings, but they would hardly advance at all in this system. A grand finals isn’t judged to be any different from a round 1 match from the losers’ bracket. All that matters is who you play and who won.
That’s the heart of it. The way it works is almost identical to Elo (used in chess), except that it factors in the uncertainty of your ranking. So essentially:
if you lose to someone at a much higher level than you, it won’t count much
if you beat someone with a much higher level than you, it will count more
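Those two rules fall straight out of the plain Elo update, so here is a minimal sketch of that (Elo, not TrueSkill itself, and not the program used for these rankings). The ratings, the K-factor of 32, and the 400-point scale are the usual chess conventions, used here only as assumed example values.

```python
def expected_score(rating, opponent, scale=400):
    """Elo's predicted chance that `rating` beats `opponent`."""
    return 1 / (1 + 10 ** ((opponent - rating) / scale))

def elo_update(winner, loser, k=32):
    """Return (new_winner, new_loser) after one match; k is an assumed K-factor."""
    # The winner gains more the less the win was expected.
    gain = k * (1 - expected_score(winner, loser))
    return winner + gain, loser - gain

# Hypothetical ratings: a 1400 player and an 1800 player.
upset_winner, upset_loser = elo_update(1400, 1800)      # underdog wins
routine_winner, routine_loser = elo_update(1800, 1400)  # favorite wins

upset_gain = upset_winner - 1400
routine_loss = 1400 - routine_loser
print(f"underdog gains {upset_gain:.1f} for the upset")
print(f"underdog loses {routine_loss:.1f} for the expected loss")
```

The underdog picks up roughly ten times as many points for the upset as he drops for the expected loss. TrueSkill behaves the same way, but also scales each move by how uncertain it still is about each player.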
While Beasley is very passionate about coding the web input program, updating the algorithm to incorporate fighting-game-specific features, and working with people on the possibilities in the future, he’s a touch Axel-esque when it comes to setting up work on the server side of things. He’s found hosting and all, but actually doing the grunt work is a touch underwhelming to him. For the time being, we can flip a switch at any time locally (on his laptop) to calculate results from all of the data collected. But he has yet to set up the web page where anyone can just view players’ rankings, history, and past events. I sense a passing of the torch for the nickname of ‘Daikatana’. But I’ll hold off on that until Beasley actually says he’s gonna do something, and draws it out for 4+ months.
As for Airthrow personally, I don’t know whether your score would really change dramatically. The people whose scores are most apt to shift would be RaybladeX and umbrellastyle, since there was a significant difference in their scores, and Alex won twice.
I’m a 21 and beat a 12, 22 and 27, wouldn’t that raise up my skill points a bit and also increase my mu (if I remember correctly that’s the computer’s certainty?)
It would certainly raise your score, and lower your σ (a lower σ means that the program is more certain that your μ, the skill estimate itself, accurately represents how you are as a player). But you don’t get much of a jump for beating people below and around your level; the 27 will have some effect though. As for the σ, keep in mind that you already have quite a few games logged in your record, whereas Ray only has 7 or so matches total in the system, so any data has a chance of altering his score dramatically. And Alex took out some strong comp yesterday, so his score will likely go up by 1.5 or more.
You also lost to 2 people who were ranked below you prior to the event. This is not to rain on your parade or anything, just to point to my assumptions on how the program will work.
I’m sure Nate and myself will dip somewhat. Nate lost to someone whose only previous documented match in the system was a loss to PaulLee (with a 7), so that’ll probably have some effect. But then, looking at the math yesterday on Beasley’s laptop, zass, Nate, Mandel, and I had the smallest σ, simply because we had more matches recorded than anyone else. For that reason, perhaps we won’t be hit quite as hard.
Remember though, the rankings are for entertainment purposes only. We’re ripping off a published algorithm created by well-paid mathematicians. However, Vegas oddsmakers earn infinitely more for their sports picks, and those require a ton of number crunching and complex calculations. And they are still wrong on occasion.
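For anyone who wants to see why a small σ (the uncertainty) cushions a loss, here is a bare-bones sketch of the published two-player TrueSkill update with no draws. It leaves out the extras in the full algorithm (draw margin, the small uncertainty bump added before each match, any fighting-game tweaks Beasley has made), and every number in it is an assumed example value on TrueSkill's default scale, not anyone's real rating.

```python
import math

BETA = 25 / 6  # assumed per-game performance noise (TrueSkill's default scale)

def pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def update(winner, loser, beta=BETA):
    """winner and loser are (mu, sigma) pairs; returns the updated pairs."""
    (mu_w, sig_w), (mu_l, sig_l) = winner, loser
    c = math.sqrt(2 * beta**2 + sig_w**2 + sig_l**2)
    t = (mu_w - mu_l) / c
    v = pdf(t) / cdf(t)  # large when the result was a surprise
    w = v * (v + t)      # fraction by which the variances shrink
    new_winner = (mu_w + sig_w**2 / c * v,
                  sig_w * math.sqrt(1 - sig_w**2 / c**2 * w))
    new_loser = (mu_l - sig_l**2 / c * v,
                 sig_l * math.sqrt(1 - sig_l**2 / c**2 * w))
    return new_winner, new_loser

# Hypothetical: a newcomer (few matches, big sigma) and a veteran (many
# matches, small sigma) each lose once to the same equally rated opponent.
opponent = (25.0, 3.0)
_, newcomer_after = update(opponent, (25.0, 8.0))
_, veteran_after = update(opponent, (25.0, 2.0))

newcomer_drop = 25.0 - newcomer_after[0]
veteran_drop = 25.0 - veteran_after[0]
print(f"newcomer drops {newcomer_drop:.2f}, veteran drops {veteran_drop:.2f}")
```

Each player's μ moves in proportion to his own σ squared, so the same loss costs the newcomer several times what it costs the veteran, and both σ values shrink a little afterwards because the program has learned something either way.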
Oh yeah, I’ll probably stay the same or sink lower then. Damn.
I feel confident after the tourney, though, that actual practicing does help me. I don’t know if I’ll ever catch up to the top 3-4 people, but I’d really like to keep working at ST to get better and hopefully win some games.
Oh, and I forgot to mention, I’ve already input the data from yesterday’s ST brackets. So anytime zass wants to flip the switch and do the calculations, he can.