In defence of Metacritic scores

Back in March it was reported that Obsidian Entertainment missed out on a bonus payment from their publisher Bethesda when their game Fallout: New Vegas narrowly failed to score 85 points on the Metacritic scoring system, which aggregates game ratings from a variety of sources to form an overall score. There was some discussion over whether that sort of criterion was a fair one to base developer bonuses on, especially given that Obsidian were having to make lay-offs when the news became public. The implication was that if they’d got the bonus, those job losses might not have happened.

The debate was renewed this week when it emerged that Irrational Games, makers of classics like Bioshock and System Shock 2, had included a requirement on a recent job advert that applicants should have a credit on a game with a Metacritic score of 85 or above. After the initial flurry of criticism online, the requirement has been removed from the ad. Yet it would be hard to imagine that this consideration will have been forgotten entirely, since it was considered important enough to add in the first place.

Gamasutra asked industry writers for their thoughts, and they tended to converge on a common position that is critical of the practice:

“Some really smart and talented folks have contributed to games that weren’t outright critical darlings.”
“Holding their individual work to a group standard, and a nebulous one at that, is beyond the pale.”
“[…] it’s even worse when you’re pinning that badge to an individual whose contribution to a bad game could have been amazing, or to a great game could have been insignificant.”
“your Metacritic score is really just an arbitrary number derived from the press, and it doesn’t take much to ruin your chances of receiving a “good” score.”
“who would want to work for a company that believes this to be an acceptable requirement for hiring?”

The complaints seem to revolve around a few key issues – that organisations (whether publishers in the case of bonuses, or developers in the case of hiring) shouldn’t be judging whole people and entire products based on these numerical scales, that the Metacritic scores themselves are arbitrary and don’t measure anything useful, and that good people worked well on games that weren’t critically acclaimed (and vice versa).

The first issue is odd, because the job world is already heavily numbers-based. Even the amended job spec with the Metacritic requirement taken out asks for 6 or more years as a designer in the industry, 4 or more years of management experience, and 3 or more games worked on for the project duration. Requirements like these are common for top-tier jobs – for entry level positions it’s common to see requirements like “1-2 years in proficiency in C++”, “6+ months console development experience”, “Bachelors Degree”, etc. Of course there will be people who fail one or more of these criteria yet are better than some of the candidates who meet them all, but we know that the criteria are still a useful guide. The number of false negatives you will suffer by ruling people out wrongly is almost certainly compensated for by the time saved in filtering out inappropriate applicants.

As for the relationship between a publisher and a developer, a publisher will often tie bonus payments to sales figures, and payments during the development period may be dependent on the quality of milestone builds or on the dev team meeting fixed deadlines. Tying bonuses to sales is an important part of managing the risk a publisher is exposed to when funding a project, and is essentially equivalent to paying less up front but adding royalties, except with a steeper threshold. This lets them minimise the fixed cost while also being able to reward successful developers with money that would generally only ever come from profits. Without the ability to do this, publishers would have to take fewer risks – if that is even possible these days! – and fewer games would get funded. So in the big bad world of high budget game development, these metrics are a necessary evil. If you want a publisher to throw millions at you to make a game, you’re crossing over from art to commerce, and the people who fund you deserve to get some assurances back that you are trying to make the best product with their money. And as a developer you’d usually prefer that the quality of that product was based on things you can more directly influence, such as how much the reviewers enjoy it, than on things you get little control over, such as how many units it sells. Metacritic scores are a step up from sales figures here.

The second issue is about the Metacritic score itself. What does it measure – fun, quality, predicted sales? Does it even make sense to assign a score to a game, which is surely going to be experienced subjectively? And does the aggregate value make any more sense than any given individual one? One answer to all these questions, which is actually quite simple but will not satisfy purists, is to abandon the idea that the score measures anything other than critical opinion. And whereas critical opinion itself does not equal fun, or quality, or predicted sales, it does actually correlate highly with all of those variables when the population is viewed as a whole. Metacritic scores correlate positively with the user scores (with a Pearson coefficient of 0.47 in one test I did of 50 randomly selected games) which suggests the critics are at least in touch with public opinion, and that what they like is probably what the market will like too; at least one laboratory study supports this, as does empirical data from EEDAR presented at this year’s Game Developers Conference. Of course, everybody can find discrepancies, whether between critical opinion and public opinion on one game (e.g. Mass Effect 3 getting 89 on Metacritic while the user score averages 4.2 out of 10), or in the relative rankings, or in finding games that scored highly but sold poorly, but on the whole the ordering is far more right than wrong and the scores are meaningful. The value itself may be imprecise, but that doesn’t mean you discard the entire measurement; you just adjust your expectations.
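
For anyone who wants to repeat that kind of check, here is a minimal sketch of how a Pearson coefficient between critic and user scores can be computed. It is not the script used for the 50-game test mentioned above. The function name `pearson` is my own, and apart from the Mass Effect 3 pair quoted in this post, the sample numbers are invented placeholders rather than real Metacritic data.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Illustrative data only: the first pair is the Mass Effect 3 example from
# this post, the rest are made-up placeholders, NOT real scores.
critic_scores = [89, 75, 92, 60, 81]
user_scores = [4.2, 7.1, 8.8, 5.5, 7.9]

print(round(pearson(critic_scores, user_scores), 2))
```

A value near +1 means the two sets of scores rise and fall together, a value near 0 means there is little linear relationship, and a single outlier like Mass Effect 3 can pull a small sample down a long way, which is one reason to use a larger random sample.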

Also in the Gamasutra article, Kris Graft suggests, “maybe their HR departments should just cut out the middleman and recruit a couple dozen video game reviewers who will play job applicants’ games, score them independently, then average out the results. Isn’t that essentially what’s going on here?” Not exactly. But is it really absurd to check a designer’s abilities by getting experienced players to actually play the candidate’s games and rate them? Surely not – in fact, surely that is going to be one of the better tests if we care about a game’s actual experience. And most Metacritic scores of reasonably well-known games are formed by aggregating more reviews than two dozen (although not all, admittedly), which makes the scores more reliable than an in-house test of that size would be, as the more samples you get, the closer you approximate the ‘actual’ value. And using publicly available scores (rather than it being privately done for HR departments) means more transparency and a level playing field. The Metacritic score will almost always be a better judge of critical acclaim than any in-house test. Many developers, in games and elsewhere, will have experienced the lottery of in-house testing which often rejects someone who then goes on to pass a test at somewhere which, on paper, would be an equal or better quality employer. A standard basis for comparison built on public data would seem to be an improvement on this situation.
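
The point about sample size can be put in rough numbers. The sketch below uses the textbook standard error of a mean, which assumes reviewers score independently and with the same spread. That is an idealisation, since real reviews influence each other. The function name and the spread of 10 points are assumptions picked purely for illustration.

```python
from math import sqrt


def standard_error(score_sd, n_reviews):
    """Standard error of an average score, assuming independent reviews."""
    if n_reviews < 1:
        raise ValueError("need at least one review")
    return score_sd / sqrt(n_reviews)


# Assumed spread of 10 points between individual reviews (illustrative only).
for n in (5, 24, 80):
    print(n, "reviews:", round(standard_error(10.0, n), 2), "points")
```

Under those assumptions the uncertainty in the average shrinks with the square root of the number of reviews, so going from two dozen reviews to eighty helps, but less dramatically than going from five to two dozen.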

The last major objection was that a game’s score isn’t necessarily a good match for an individual’s score. This is the hardest one to argue with because there is a lot of truth in the fact that great developers sometimes end up on poor games, and perhaps vice versa. But this is where industry experience tends to even things out – the best developers will, over their careers, generally gravitate towards the higher quality companies and make higher quality games, giving themselves a good chance of getting such a credit. (Note that the controversial advert only asked for “one game with a Metacritic score of 85 or higher” – not an average of 85 over your career.) So rather than being read as “are you good enough that you inspire whoever you work with to create 85+ scoring games”, it should be read as “are you good enough that you were previously hired by a company that has made 85+ scoring games”, or perhaps even “do you have experience of working in the kind of environment and with the kind of people that make 85+ scoring games”. These are more useful ways of viewing the requirement, and while there is still a lot of scope for false negatives and rejecting some good developers – as with any measurement made prior to employment – you can be sure that someone who meets this level will be likely to boast the kind of experience a top developer would need.

So, whereas it’s understandable that people don’t like their art being distilled down to subjective ratings and strict thresholds, nor the idea of jobs being lost or companies closing due to a metric that is potentially at the mercy of a few rogue journalists, it’s hard to argue that judging developers by Metacritic scores is inherently bad. Gamers, developers, and publishers all want better games, and that means finding ways of deciding what ‘better’ means. Metacritic scores may be far from ideal in that regard, but right now they’re probably the best we have.

9 Responses to In defence of Metacritic scores

Brian 'Psychochild' Green says:

July 29, 2012 at 9:16 am

Well, the main problem is that review scores are more about what’s expected than how “good” the game is. Look at Mass Effect 3, for example, where it scored pretty high because it’s expected to do well; the previous games were super-popular with ME2 provoking early religious devotion in some people. Yet the ending didn’t meet user expectations. And, honestly, I’ll bet 50%+ of the reviewers probably didn’t even get to the ending before passing judgment.

There’s also the unspoken rule that a game reviewer has to be very cautious about how they rate a large publisher’s games for fear of having advertising dollars taken away. So, you’ll often find reviews of smaller, independent games giving lower ratings than those of games from larger companies.

So, really, when it comes to the requirement to have an 85%+ Metacritic score, it’s basically saying you better have worked on a top-tier project with a huge marketing budget. Is that really a useful metric for how to draw the best talent? I doubt it.

The last thing is that job requirements are often very flexible in the game industry. A lot of times some requirements will be loosened if someone with the right skill set and other qualifications applies. But, this isn’t always the case. In the case of Irrational, was HR going to simply dump any CVs without the 85%+ score prominently shown?

Anyway, I think the Metacritic score requirement is not a good measurement. What is a good measurement? Not sure. But, you might be right, it could be the best we have if people want to judge quality.

Ben says:

July 30, 2012 at 10:05 pm

The Mass Effect 3 thing is interesting. Ignore the ending, and I expect most people would rate their play experience as nearer an “89” than a “42”. Do we rate a game by how we feel at the end? Or how we feel during play? Does the final disappointment undo the accumulated enjoyment up to that point? Maybe I’d be more critical of these scores if I played more story-based games.

A disparity between review scores based on the size and prominence of the development team (or rather their publisher’s marketing department) is certainly a factor that would bias the results. I think this just means you have to apply the score relatively rather than absolutely; compare it to games of a similar budget and a similar genre.

I expect that a company like Irrational is going to be looking for people who have worked at a certain kind of level anyway. And even big budget top-tier projects often score lower than 85: Borderlands, LA Noire, Age of Empires 3, even Call of Duty: Modern Warfare 3 – was there ever a bigger marketing budget, in fact? So if you were already restricting your search to people who managed at AAA level, the Metascore could still be considered a useful discriminator.

Andrew Copland says:

August 9, 2012 at 3:26 pm

I’ll just leave this here then shall I? 🙂

Couldn’t find either yourself or Psychochild on there at all, nor could I find many of the people I’ve worked with, even on the same title I’m credited on. Plus it’s only found 1 in 12 published games that I am credited on.

It also seems wrong to have a hard bonus cut-off like that in the contract, perhaps you only get 50% of the bonus for a score in the range 80-85% but a full bonus for 85-95% and a super-bonus-with-strippers for anything +95%.

Then of course you have to ask, why use the review score as a bonus guide? Surely it’s sales that matter? The reviewers might hate the living shit out of “My Horse” for iOS, but it still makes Natural Motion huge sacks of cash. Surely the monetary reward you get should be based on what people are buying instead of what the reviewers thought, since you sent them that copy for free anyway.

As for developer scores, that’s just flat out retarded. Their system, as demonstrated at the top of this comment, just doesn’t work… which is a bit of a flawed basis on which to then review someone’s ability.

- Ben says:
  
  August 10, 2012 at 3:36 pm
  
  Andy:
  Re: hard cutoff points – yeah, I agree that a bonus gradient makes more sense, although it requires a bit more calculus when it comes to calculating the risk involved.
  
  As for review scores vs sales, I’d say that it’s not so much about what matters to the bottom line but about what you’re trying to achieve and what you have control over. Assume money isn’t an issue – a developer would probably prefer to make a ‘good’ game than a ‘popular’ game. So when a publisher says, “we can reward you for popularity, or we can reward you for quality”, which one is more likely to align with your own goals? And then there’s the control issue – a developer’s control over sales is more indirect than its control over quality. If a publisher doesn’t fancy paying out a bonus then it can punt the title into a February release instead of a November one, or can cull the marketing budget, etc.
  
  - Andrew Copland says:
    
    August 10, 2012 at 4:58 pm
    
    Ah but there you have the same problem, the publisher. Every title I’ve worked on has suffered in quality at the hands of the publisher asking us to do things we think are crap that are then borne out to be… well, crap.
    
    It is after all their game, and their money so they get whatever last minute shovelware feature they want. They get to drop two platforms and launch it only on the port-to-it-last platform. Or add “shampoo buffs” etc 😉
    
    The developer has a lot less control than we’d like.
    
    On the bonus front the issue for me is less about what the developer can control and more about taking a share of the revenue back to the developer fairly.
    
    I don’t think that development houses should be asking for bonuses that they can’t actually survive without. Instead I think they should be getting profit sharing deals and then paying development staff bonuses out of that.
    
    I say this because the current scheme means that the studio might get a bonus, most of which is passed onto staff… yay for staff, but then it’s straight back to the publisher cap in hand for more work. If that work falls through it’s layoffs again. Meanwhile the publisher is still earning vast amounts of profit from a successful title.
    
    If however a game does well and the studio shares in the profits from it then it can still pay its staff bonuses. However, it is also earning from the continued success of its game for as long as it’s successful.
    
    So I don’t think it’s the criteria for a studio bonus that’s wrong, it’s the idea OF a studio bonus that’s wrong. Much better to have no bonus scheme, but be guaranteed enough money up front for the studio to complete the game and survive post-launch, or to have a profit sharing scheme and still be paid enough up front for development, than to take a bonus scheme that puts the staff’s future in doubt dependent on elements that you only have illusory control over.
    
    - Ben says:
      
      August 10, 2012 at 5:30 pm
      
      Ideally development houses wouldn’t do a deal where they can’t live without the bonus, but then the money is going to run out one day, bonus or not. And if it’s the choice between a deal where you absolutely need the bonus to survive 2 or 3 years from now, or no deal at all… you know what they’re going to choose. 🙂 The publisher is going to haggle down and the developer will try and haggle up but without an infinite supply of alternative publishers the developer generally has a choice of taking a poor deal or closing down.
      
      I’m not convinced the publisher is reaping massive profits while the developer starves, in most cases – as your other comment noted, some games are requiring millions of units to break even, simply because the publisher has to stump up far too much money in the first place just to pay the developer. This is a really good article – http://www.notenoughshaders.com/2012/07/02/the-rise-of-costs-the-fall-of-gaming/ – showing that games are just costing too much to make. Most of the time there’s not going to be any significant profit to share.
      
      My personal opinion is that the problem needs to be solved in a few ways – money needs investing in making games cheaper and quicker to develop, games companies need to be more flexible with the workforce (because expecting a constant stream of paying projects is probably unrealistic), and alternatives to the traditional publisher model need to be pursued with more intensity where possible.
      
      - Andrew Copland says:
        
        August 10, 2012 at 6:54 pm
        
        I think it’s more that they might, and they spread their bets a little bit across several different studios. If each studio is relying on the bonus from a successful game but only 1 typically does then a lot of them lose out. The publisher isn’t raking it in but they’re more likely to survive since they’re the sole recipient when they do get that one big hit.
        
        Also, how the hell are these games costing so much to make? I still can’t fathom how these games are costing 15 to 60 times what MotoGP 10/11 cost!
- Brian 'Psychochild' Green says:
  
  August 12, 2012 at 1:25 am
  
  I’m kinda on Metacritic: http://www.metacritic.com/person/brian-green
  
  The problem is that my games are blended in there with others I didn’t work on because of my common name. But, hey, one is over 85 with “Brian Green” credited as a programmer, so I qualify for that original Irrational job posting, right? 🙂
  
Andrew Copland says:

August 10, 2012 at 1:39 pm

Also after we met up for lunch (and a beer or 4) I found this article quite interesting. Can you imagine working on a game that sold 5 million copies?!? Never mind needing to sell 5 million copies just to break even.

Total madness.
