Thursday, May 31, 2007

Great Post on Shadow Gaming on Competitive Arena PvP

I think this is really a must read: http://www.shadow-gaming.com/?p=111

I agree with a lot of what he said; it's been something I've wanted to rant about for a long time.

People use the argument that WoW has too much "luck" in it for it to be a great competitive game. Resists, dodges, crits, or windfury procs can have a huge impact on a match, perhaps more than the little things that very strong players do and average players do not.

The counterexample is often poker: poker involves a lot of luck. Even when you're stronger than your opponents in poker, it is inevitable that not only will you lose a lot of hands, you will go through huge streaks of losing. What you play for is edge. Higher skilled players play at an advantage to the rest of the table, and it is apparent pretty quickly whether a table will be generally profitable, even if you can't predict well how the next few hands, or perhaps the next few hours, will go.

As a poker player, I like the comparison, but feel it creates too much emphasis on the idea that because poker is a "game with a lot of luck" and is a reasonably serious competitive game, it is okay for WoW to involve a lot of randomness or luck. Consider two extremely talented long distance runners; they run a 40 mile race, and the winner wins the race by a few seconds. How confident are you that the winner will win the next race? Certainly, there is some randomness in the results even though the nature of the race (they run identical courses) implies there should be little to none. Randomness is intrinsic to anything competitive -- the strongest will not always conquer, he will just conquer more often than the weak.

The goal of an arena rating system therefore should be to recognize players who, despite their occasional falters, win with consistency. Rating systems should be tailored to match their games, and games with more randomness need systems that do not punish or reward "flukes" too heavily. In this way, ELO is perhaps not that well suited to WoW. Because even the best teams reach a plateau with respect to their chance of beating a lower ranked team (disconnects, atrocious luck, etc), how can it make sense for there to be games that yield no rating for a win yet cost 32 for a loss? It creates a strange situation:

Consider a 2300 rated team playing a 2200 team:

The 2300 should have roughly a 2/3 chance of winning based on how ELO is generally balanced. (I'm sure someone who is a real math geek on ELO will flame me for this generalization.) The expected value of the game, assuming how WoW implements ELO (a 32 point K-factor system, so basically you can win or lose 32 at most), is simply: (% chance of winning) * (rating after a win) + (% chance of losing) * (rating after a loss)

(2/3)(2310.7)+(1/3)(2278.7) ~= 2300

That is the fundamental idea of ELO: on average, playing a game should have no effect on your rating. If it does, your rating is adjusted, and the adjusting stops when the expected value of playing a game leaves your rating unchanged.
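Here is a minimal sketch of that break-even property in Python. It assumes the textbook ELO curve (logistic, 400 point scale) and K = 32; whether WoW's arena system uses exactly this curve is my assumption, and the function names are made up for illustration.

```python
def expected_score(rating, opponent):
    """Textbook ELO win probability (assumed, not confirmed for WoW)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def rating_after(rating, opponent, won, k=32.0):
    """Rating after one game: move by K times (actual result - expected result)."""
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected_score(rating, opponent))


def expected_rating(rating, opponent, p_win, k=32.0):
    """(% chance of winning) * (rating after a win) + (% chance of losing) * (rating after a loss)."""
    win = rating_after(rating, opponent, True, k)
    loss = rating_after(rating, opponent, False, k)
    return p_win * win + (1.0 - p_win) * loss


if __name__ == "__main__":
    p = expected_score(2300, 2200)  # a bit under 2/3 on the textbook curve
    print(round(p, 3), round(expected_rating(2300, 2200, p), 3))
```

When the real win chance matches what the curve predicts, the expected rating after the game is just the rating you started with.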

But these % chance of winning assumptions don't include logical "caps" -- say you can't win more than 95% no matter how great the gap is:

(.95)(2300)+(.05)(2268) ~= 2298.4

So each game you play against a team where you stand to gain 0 and lose 32, with a winning soft cap of 95%, is expected to cost you about 1.6 points. Is that really logical? When you lower your soft cap, this problem grows pretty quickly, and it also shows up against teams that yield a few points but that you can only beat 75-80% of the time. Fundamentally, ELO requires flawless domination of teams below you to be able to "profitably" play against every team, even if you're the strongest player.
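To put a number on that drift, here is a rough sketch. It again assumes the textbook ELO curve and K = 32, and it models the "soft cap" as a hard ceiling on your true win chance, which is a simplification of mine; the names are hypothetical.

```python
def expected_score(rating, opponent):
    """Textbook ELO win probability (assumed, not confirmed for WoW)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def expected_drift(rating, opponent, win_cap=0.95, k=32.0):
    """Expected rating change per game when the true win chance cannot exceed win_cap.

    ELO pays out as if you win with probability expected_score(...). If luck,
    disconnects and so on hold your real win chance at win_cap, the average
    change per game is K * (true win chance - predicted win chance).
    """
    predicted = expected_score(rating, opponent)
    true_p = min(predicted, win_cap)
    return k * (true_p - predicted)


if __name__ == "__main__":
    for opponent in (2200, 1900, 1500):
        print(opponent, round(expected_drift(2300, opponent), 2))
```

Against the 2200 team the cap never kicks in, so the drift is zero. The further down the opponent is, the closer the loss per game gets to the 0-32 worst case of 32 * 0.05 = 1.6 points.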

How do you fix this for WoW? "K-factor" is just math geekiness to describe how much you can potentially gain or lose from any game. Why not scale the K-Factor down when teams are far apart in rating? Or simply do a better job of scaling how you gain or lose per game as your rating changes?
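One way to picture the first idea: shrink K as the rating gap grows. The scaling rule below is purely hypothetical (I picked it because it is simple, not because any real system uses it), and it keeps the same textbook curve and 95% cap assumptions as before.

```python
def expected_score(rating, opponent):
    """Textbook ELO win probability (assumed, not confirmed for WoW)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def scaled_k(rating, opponent, base_k=32.0, gap_scale=400.0):
    """Hypothetical rule: full K for equal teams, half K at a 400 point gap."""
    gap = abs(rating - opponent)
    return base_k / (1.0 + gap / gap_scale)


def capped_drift(rating, opponent, k, win_cap=0.95):
    """Expected rating change per game when the true win chance is capped."""
    predicted = expected_score(rating, opponent)
    return k * (min(predicted, win_cap) - predicted)


if __name__ == "__main__":
    flat = capped_drift(2300, 1500, k=32.0)
    scaled = capped_drift(2300, 1500, k=scaled_k(2300, 1500))
    print(round(flat, 2), round(scaled, 2))
```

This does not remove the drift, it only makes the lopsided games cost less on average. Getting rid of it entirely would mean changing the win chance curve itself so that it flattens out at the cap.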

Still thinking this over...

9 comments:

Animastryfe said...

I skipped the math part with its unholy symbols of boredom.

>.>

Jenkins said...

loving all the content lately raddy, keep it up!

megan said...

Cool blog, keep up the good work! I've linked yours on mine, and you've got a new regular reader. :)

Klassick said...

You have no idea how irritating this is in 2v2. Running into a group makeup we have about a 5% chance of beating that is sitting on 2k rating and losing 20+ points 3 times in a row is pretty lame.

David said...

That post was simply The Law Of Large Numbers (LOLN) applied to WOW. This is a fundamental of all statistics and polling. Simply put, LOLN states that if you have enough contests then the better team will win the majority.

So, for example, if this season's NBA finalists play 10,000 games for the championship you can be pretty god damn sure the winner is in fact the superior team. Lucky bounces don't make a difference over ten thousand games.

They do, however, make a huge difference in one game. And over seven games, there is still room for luck, but seven games is a reasonable and more importantly *practical* number of games to determine who really is the best team - not just on that day - but overall.

Arena games are short. All the players are there. There is no reason not to have every showdown be a best of 7, or best of 9, or best of 11. This eliminates the luck factor, and also gives teams a chance to develop new 'game plans' - new counters to strategies that defeated them in the previous game.

What I hate about that article is that it assumes that because you made it to the top of your BG - a good indication of skill based on the number of games you need to play to get there - that luck is not important. Thus - and this is the stretch - it's okay whatever happens in the tournament since luck has been factored out somehow.

Not true. If you played each opponent 101 times then luck would be a nonfactor statistically, as it would break your way as often as it breaks against you. But the post does not carry its argument to its logical conclusion - that it takes many games between two teams to eliminate the luck factor.

David said...

Let me sum: An unlucky crit string can and will end your tournament hopes in WOW and that crit string has nothing to do with player skill on the opposing team.

This is a major barrier to WOW as an e-sport.

Bliz needs custom servers where teams can choose any class composition and also have their choice of spec and any epic gear - between matches with the same opponent even.

Then, the truly best 5v5 team will show themselves as able to adjust to any scenario. They will - for example - demonstrate that all of their players can play multiple classes expertly.

Atashi said...

Yep, the question is whether 10 games per week and enough teams queuing at all times to make luck a small enough factor for competitive ranking.

IMO, no.

Atashi said...

er, sentence fragment, but I think you know what I mean.

Azael said...

The best way to avoid the randomness for tournaments is to simply have a larger amount of games on which the results are based.

It's just as they do with playoffs in sports. If there was only one game in each round of the NBA playoffs, the best team might not win, however when it's best of 7 games, the better team will generally come out on top, regardless of what bad luck they had in one or two of the games.
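The best-of-N point made in several of the comments can be put in numbers. A minimal sketch, assuming every game in a series is independent and the better team wins each single game with the same probability (a simplification, since teams adapt between games); the function name is made up for illustration.

```python
from math import comb


def series_win_probability(p_game, best_of):
    """Chance of winning a best-of-N series given a fixed per-game win chance.

    Assumes independent games. Sums over how many games the eventual
    series winner loses before taking the deciding win.
    """
    wins_needed = best_of // 2 + 1
    q = 1.0 - p_game
    return sum(
        comb(wins_needed - 1 + losses, losses) * p_game ** wins_needed * q ** losses
        for losses in range(wins_needed)
    )


if __name__ == "__main__":
    for n in (1, 3, 7, 11, 101):
        print(n, round(series_win_probability(0.6, n), 3))
```

With a 60% edge per game, a best of 7 puts the better team through roughly 71% of the time. Longer series push that higher, but luck never drops to exactly zero.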