Feedback Loop – Backward audio progress in Max Payne 3

A company with a reputation for sparing no expense spent 8 years developing Max Payne 3, a game with sound that belongs on the PS2. Instead of providing the cinematic sound spectacle I expected, Rockstar cut corners in ways I haven’t seen any major studio do in a long time.

In a recent Crude Pixel article, Iain Hetherington brought up the pursuit of making a “cinematic” experience with sound. Talking about Max Payne 3, he concluded that the game was “all-in-all an amazingly innovative, interactive, cinematic audio presentation. At times I felt like I was watching a film when I was in fact meant to be playing a game.” He pointed out that games shouldn’t necessarily aim squarely at emulating movie sound since games are their own thing, and I agree with that. What I don’t agree with is that Max Payne 3 is a good representation of what game audio can or should be.

Max Payne 3, by definition, does not have “cinematic” sound. It has a kind of audio mix that you will only hear in a game, in a negative sense. I have not heard any movie sound like Max Payne 3, nor will I ever, as the developers made sound design choices which are frankly baffling.

In a movie, 75% of all dialogue and sound effects play only out of the center speaker. The front stereo speakers play music and the surround speakers play ambient sounds and effects. That’s the basic idea of what “cinematic” sound is like, and most movies follow that formula. There are some exceptions, such as Tron: Legacy, which dominates the audio experience with its 5.1 music, but for sound effects and dialogue it sticks to that formula as well.

5.1 speakers. Each speaker is optimized for its role in a home theater system.

What Max Payne 3 did was almost ignore the presence of the center speaker and abuse the surrounds. When you shoot guns, the sound plays out of all speakers simultaneously, and when Max speaks, his voice booms out of all the speakers at the same time. That makes the weapons and Max’s voice louder, but it also stresses the front and rear speakers unnecessarily. This reduces the clarity of the sound, since most speakers can only play a few sounds at the same time before they start smearing over the more subtle details that might exist in the sound mix. The center and front speakers are perfectly capable of producing weapons and voices with clarity and impact without having to use the rear speakers for that as well. Another consequence of playing back those sounds from the rear speakers is that the audio isn’t anchored to the screen anymore and is instead “in your head”, as if you’re listening on headphones. This was a common problem in the last generation of console games, as titles with no surround mix would play back their stereo cutscenes in quad to trick players into thinking they were getting surround. Surround sound wasn’t as common back then, so this lazy approach was somewhat excusable.
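
As a rough illustration of why duplicating a sound to every speaker makes it louder, here is a minimal Python sketch. It is not taken from the game or from any audio engine: the function name `level_gain_db` and the two summing models are assumptions chosen for illustration. It assumes every speaker receives the same signal at the same level. If the copies add as uncorrelated power, the gain is 10·log10(N) dB; if they add perfectly in phase, it is 20·log10(N) dB. A real room would likely land somewhere between the two.

```python
import math


def level_gain_db(num_speakers: int, coherent: bool = False) -> float:
    """Level increase in dB when one signal is duplicated to `num_speakers`
    equal-level speakers, relative to playing it from a single speaker.

    Hypothetical helper for illustration only.
    coherent=False: copies add as power      -> 10 * log10(N)
    coherent=True:  copies add in phase      -> 20 * log10(N)
    """
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    factor = 20.0 if coherent else 10.0
    return factor * math.log10(num_speakers)


if __name__ == "__main__":
    # Compare a center-only sound with one sent to 2, 4 (quad) or 5 speakers.
    for n in (1, 2, 4, 5):
        print(
            f"{n} speaker(s): +{level_gain_db(n):.1f} dB (power sum), "
            f"+{level_gain_db(n, coherent=True):.1f} dB (in-phase sum)"
        )
```

Under these assumptions, sending a gunshot to all five main speakers instead of one raises its level by roughly 7 to 14 dB, which is consistent with the extra loudness described above.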

Max Payne 2 did not use any cheap tricks like that to make the sound louder. It instead stuck to the definitions of what makes for a cinematic sound experience and at the time it was by far the best sounding game to have been released. It took a while for other developers to catch up, but these days you can generally expect a game to have a mix which lives up to the definition of cinematic sound, such as every game released by EA in the past 6 years.

There’s one more critical error Rockstar made: not using any acoustic effects for large portions of the game. Proper use of reverb matters when you’re creating a virtual world. It’s of critical importance that what you see and what you hear match. You might not think about it, but your brain notices when something is off. It’s one of the puzzle pieces that, when done right, can elevate a game from “just a game” into a spellbinding experience. There are a couple of sections of Max Payne 3 where Rockstar seems to understand this, and those sections are undoubtedly the highlights of the game, not only sonically but for the game as a whole. But for many hours of the game you might be exploring entire hotels and police stations without hearing a single echo at all. This turns these locations into game arenas instead of the real-world locations they were clearly supposed to be. Max Payne 2 had unique acoustic effects for every room in the entire game, just as every PC game around 2003 did. Here we are in 2012 and we’re moving backwards.

Rockstar had cinematic aspirations when making Max Payne 3. The audio presentation of the game, however, never lets you forget you’re playing a game. Other games such as Metal Gear Solid 4 and Mass Effect 3 stand out as examples of what Max Payne 3 should have sounded like. They and many other games this generation truly do have “cinematic” sound. Max Payne 3, however, does not, no matter what Iain Hetherington or Rockstar themselves might say. It’s a throwback to the earliest attempts at surround sound in games from the time of the original Xbox. It might have a couple of flashes of brilliance, but as a whole it’s not a game whose sound I want to celebrate.

13 Comments

  1. Manner

    Most games don’t use the centre speaker for VO. For cutscenes this can make sense, but in the game it doesn’t, because games are trying to simulate a space. In a game where you can look anywhere it doesn’t make sense: if someone is talking to you in game and you turn around and walk to the other side of the room, it wouldn’t make sense that the sound source was still front and centre. This doesn’t work all the time though; the designer has to make choices when mixing the game in surround. I played the game in stereo so I don’t know how Max did it.

    I think when you talk about acoustic effects, those games from 2003 did reverb over the top; it was crude. I think you are just not noticing the effects today because they are more realistic. When you directly compare reverb from a lot of those 2003 PC games with today’s, you will see a difference because today’s effects are not so dramatic.

    • Peter Hasselström

       @Manner
       
      I’d prefer it if games used the center speaker more, but yes, unfortunately many tend to play VO from the front speakers instead of the center. Dead Space 1 and 2 did use the center all the time since they never left the game engine, even during cutscenes. So all the noises your character made and dialogue with NPCs came out of the center, if you were looking at them. This is an approach I’d like more games to use, because with no pre-rendered cutscenes the game could use the 3D sound engine all the time.
       
      Reverb today is done in a less over the top and more realistic manner, that’s for sure. In Max Payne 3 there just wasn’t any at all in many places. This bugged me especially in the hotel level which had hard concrete surfaces everywhere and some quite large arenas, but no reverb. Even if the effect is more subtle in games today it’s still noticeable if you listen out for it. Even Crysis used different types of reverb depending on if you were standing among trees or if you were on the beach. Subtle effects, but they helped make those places sound real.

      • Manner

         @Peter Hasselström Dead Space had it easy: most of the time characters would speak to you over radio, and you rarely spoke to anyone face to face. Take Half-Life 2, for example: you’re in the ‘Red Letter Day’ room where 3 people are having a conversation and you the player are free to walk anywhere in the room. Do you think that all the VO should come out of the front / center speakers even if the player is standing at the far end of the room facing away from everyone?

        • Peter Hasselström

           @Manner
           No of course not, in a situation where the player controls the camera dialogue should be localized to whoever is speaking. If you’re standing in the corner on the other side of the room facing the wall the dialogue should come from the rear channels. It’s only in cinematic style cutscenes where the player has no control over the camera where that would make sense.

        • Manner

           @Peter Hasselström Yeah exactly, I thought you meant differently haha.

  2. MauriceCourtelin

    I think you’ve missed the point of the original article. Cinematic sound is not defined by how many speakers you have. Would the sonically innovative intro sequence to 1932’s “Love Me Tonight” be regarded as un-cinematic due to a lack of reverbs that convey the exact space in which the sounds take place? Would “Apocalypse Now” sound un-cinematic if it were watched on a mono television?
     
    Being cinematic in an aural sense is regarded by most practitioners and academics (Randy Thom, Walter Murch, Michel Chion) as being part of the overall function of cinema: to tell a story. Therefore the function of sound in a cinematic sense is to support and enhance the narrative experience through sound. Although there is an argument that playback systems may be able to support this notion, it certainly wouldn’t serve as a “definition”.
     
    The original article was referring to the way that the term “cinematic” is being used as a marketing tool, and that this is nonsensical, as they are two different mediums that formulate their narratives in differing ways (linear scripted events versus interactivity through gameplay). Despite this, Max Payne 3 achieves a convincing cinematic experience through its use of ambiences and focused mixing, even though gameplay suffers, feeling bolted on to a linear narrative.
     
    As for the lack of acoustic effects, I think you should listen more closely. Also, there’s the possibility this was done intentionally to create a greater dynamic variance between the sound of bullet time and standard time. Examples of good environmental reverbs cited elsewhere in your writings are over-the-top, draw attention to themselves for no apparent reason, and lack the subtlety of those you would experience through film.

    • Peter Hasselström

       @MauriceCourtelin
       
      Technology and expectations change over time, so I’m sure someone completely unfamiliar with movies mixed in mono might consider “Love Me Tonight” to not be “cinematic”. I’ve seen people on Home Theater forums who refuse to buy Blu-rays with mono sound because they’ve invested tons into their 7.1 setup and are actively seeking out spectacle suitable for it. Personally I think that’s a bit mad, since I love what older films do with the movie medium, as it’s often things that cannot be replicated today, but those people do exist. How the movie itself was mixed and what the viewer or player has to play it back on are different things. Yes, Apocalypse Now would be less cinematic on a mono television compared to being viewed in a theater. The basic gist of the sound mix would still be transferred over the mono TV, but not in a way that fully conveys the intent of the sound.
       
      When “cinematic” is used to describe the audio of a game, I’d argue it should stand for something different than when describing a movie. The way I see it, “cinematic” refers to the common practices today in mixing movies, such as wide dynamic range and how the channels in a surround system are used. If a game has “cinematic” sound it means it sounds like a movie, and those are the things that define what a movie sounds like today. A game like Serious Sam 3 is not cinematic, as it has minimal dynamic range and uses cheap tricks such as playing music and effects in quad to make them louder. Max Payne 3 does similar “gamey” things with its sound mix, although not quite to the same extreme as Serious Sam 3. The difference between the two is that Max Payne 3 is clearly aspiring to be movie-like in its style, while Serious Sam 3 has no such aspirations.
       
      I did listen closely for acoustic effects in Max Payne 3 and only heard them in a couple of places in the game. Most of the time the sound was flat and dead, with a couple of exceptions such as the times you’re exploring without a gun. Maybe they are easier to hear in stereo, when the weapon sounds aren’t being played in quad in the surrounds and deafening the player, but I played it all in 5.1 with the speaker system pictured in the article. Most titles from around 2003 did have over-the-top acoustic effects and some still do today, but in games like Metro 2033 I feel the bar has been set for how it should be done. Uncharted 2 and 3 would be a closer comparison to Max Payne 3, and they do have clearly audible echoes if you listen out for them. I could never make out any in most of Max Payne 3. I cited Metal Gear Solid 4 in the article because it uses the channels exactly as a movie would and even has a soundtrack mixed in 5.1, and Mass Effect 3 for its masterful use of dynamic range. Those two traits are exactly what I’d expect from something with cinematic sound. Max Payne 3 just sounded like a game to me, in a bad way.

      • MauriceCourtelin

         @Peter Hasselström If you believe “cinematic sound” relates mostly to the number of speakers you have then I really would consider reading some material from the guys mentioned in my previous post.
         
        In contrast to what you have written with regard to speaker usage in a consumer surround system, all front 3 channels are used for dialogue and all 6 channels for sound effects. In fact, when cutting ambient tracks for a movie, a typical minimum would be 2 monos (for the center channel, to mask inconsistencies in dialogue, particularly between production and ADR) and 4 stereo tracks, which will be spread between LfRf, LsRs at the mix stage at the discretion of the mixer.
         
        What you are actually saying here is that you use the term cinematic in the same way marketing people do without really considering the true function of sound within the medium in which it is being used, thinking instead of it being related to technical features while ignoring the artistic and aesthetic goals.
         
        Appreciation of good sound is not about boasting to the world about how many B&W speakers fill your living room. It’s about understanding its function within the bigger picture and how well it achieves this. Simply trying to boil all of this down to misconceptions and speculation about technical details of film production, from someone who clearly isn’t a practitioner and has no experience, shows immaturity and ignorance.

        • Peter Hasselström

           @MauriceCourtelin
           
          The number of channels matters and the technical details of how they’re used matter. That being said, I agree that in the end what’s most important is the end result and how it supports the artistic vision of the creators. I know all front channels are used for dialogue and that all 6 are used for sound effects, but in an attempt to clarify the difference between “game” and “movie” sound I made it sound like I didn’t.
           
          How the channels are used technically is important for how the artistic vision is supported. In the game medium the player explores virtual worlds that can feel like real places. When the intent is to create a believable world where the characters and environments feel like real people in real places, an audio mix like the one in Max Payne 3 undermines it. It sounds like an arcade shooter where the guns and explosions take precedence over all the smaller details that make the world come alive between the gun shots.
           
          The choices made in the mix are wrong for the type of game Rockstar were trying to make. Not because I want a game with high fidelity to show off my speakers, but because the mix actively took me out of the experience since I couldn’t connect with the world they were creating. I don’t believe the lack of acoustic effects was there as an aesthetic choice, I believe it was done to save time on an area where it was assumed nobody would notice the difference.

        • MauriceCourtelin

           @Peter Hasselström The guns and explosions should take precedence in the mix: it’s a shooter. I can’t think of any film or game with guns where they don’t take precedence, apart from when decisions have been made artistically to omit them in order to highlight something else by taking you out of that real-world expectation.
           
          The world in Max Payne is sonically rich, deep and reactive to the player’s actions. As gun shots and explosions tail off, voices, dripping, winds, animals etc. fill the void. Sonically the ambience of the game is teeming with life.
           
          I don’t know whether the reverbs were toned down intentionally; that’s speculation. I was merely trying to make a point of what you are missing. Could you clarify what your issues with the mix were? All you are really conveying is that the reverbs weren’t obvious enough.

        • Peter Hasselström

           @MauriceCourtelin
           
          During the first couple of hours of playing Max Payne 3 I did think the audio was great; it was when I got further in that I began to notice issues. Many areas of the game have ambient sounds of life or weather to fill in the void after the guns have been shot. As you get to the later parts, however, much of the time is spent indoors and often in evacuated areas.
           
          My two major issues were the reverb and how the “center” of the sound was placed not at the camera, but at the Max Payne character model. When I’m exploring I expect some kind of acoustic effect on the footsteps or things I knock over, appropriate for the area I’m in. It should of course be subtle, but when there’s nothing else going on the absence of effects becomes obvious. The areas that felt “rushed” to me because of this were the boat, the hotel and the police station chapters.
           
          How the sound was centered on the character model instead of at the camera might be the root cause for all my issues, including the reverb. In most third person shooters like Max Payne 2, Uncharted 2/3 etc the camera is the “center” of the soundstage. All the sound of the character walking, shooting etc comes from the front speakers as it’s in front of the camera. If the sound source is centered at the character model the reverb of the room would become harder to hear from the footsteps since you’re right up close to the source of the sound. The reverb should still be noticeable as I have no problems hearing it in first person games.
           
          My assumption that the audio was centered at the character model is based on the fact that if you strafe from left to right while firing, the sound of the guns shifts slightly to either side. This wouldn’t happen if the sound was simply being hard-coded to play in quad. Instead it sounds as if the sounds are anchored to something in the world.

  3. AndrewWiguna

    Very good article! It’s about time there was someone out there with as great an appreciation for audio as there is for graphics in gaming today. I come from an audiophile background, and whilst I am no longer a heavy gamer myself, I do have a fair understanding of how good audio should fundamentally work, in terms of musicality (enjoyment) and functionality (technical delivery).
     
    It is most interesting that you seem to have heard the sound effects louder than everything else in the game’s mix. On my end, however (after switching between headphones, amps, etc.), I am experiencing the opposite, and I do not know why.
     
    I’ve just recently started MP3. Personally, I find that two things always bug me during the first 15 minutes of the game: lack of balance, and complete ABSENCE of acoustic ambience, which you’ve described spot on. I am strictly a headphone user and use an external DAC with a head amp whenever I game.
     
    What I mean by balance is the overall level of all the perceived sounds in the game: the ratio of SFX vs. music vs. environments. I have always felt that the music for some reason overpowers everything else, including the sound effects. Personally, I found all the weapon SFX to be lacklustre in terms of timbre, volume and authority. Turning the music level all the way down, however, only fixes a quarter of the problem, as the dialogue easily overpowers everything else.
     
    Perhaps it is by design that the weapons sound more “nimble” and understated, but not having the right balance in the first place makes the overall game somewhat ‘childish’ and lacking in what I call “authentic authority”. Or perhaps it is, after all, a story-oriented approach they’ve imposed so that the dialogue takes priority over everything else in terms of volume. This was in fact at its worst when MP1 was out, so much so that I had to look for mods or mod it myself so that it sounded right.
     
    The only two games I feel are sonically best executed are the Mass Effect series (for third-person gameplay) and Far Cry 2 (for FPS). The latter is, I believe, the only game that showcases the best authority, depth, ambience and, most of all, balance.

    • Peter Hasselström

      @AndrewWiguna
       I agree that Mass Effect 3 especially got the balance right between all elements of the sound. Having both a headphone and a surround setup available has allowed me to compare the mixes in games, and often I find that the stereo mix is neglected and can almost sound broken in comparison to how the surround mix was balanced. When the balance of ambiance, music, voices and incidental sound effects like guns is right, it’s like the game casts a spell on you and the illusion of it being a living, breathing virtual world truly works. Getting the sound right certainly helped Far Cry 2, as it had plenty of issues otherwise that became more bearable when the sound was good.
       
      Having a good stereo mix should be a higher priority for game developers. The fastest growing segment of the audio market by far is headphones, and especially on PC I suspect a large number of players use them exclusively.