Action Arcade Sounds and Reality
NO video/demo to see here — sorry! — this is just a personal recollection of how I made sound work correctly in a frenzied, action-arcade-style game, on a personal computer, once upon a time in the 1990s. I hope you enjoy reading it.
Space Invaders
The original, coin-op Space Invaders (which looked much better than any emulation I’ve ever seen; the video image was reflected by glass over a psychedelic, pop-art moonscape, and looked great in a dark arcade) played exactly seven different sounds. Those were:
- your base exploding
- your base firing a shot
- your shot killing an invader
- the flying saucer cruising across the top of the screen
- your shot killing the flying saucer
- winning an extra life
- the background thump-thump-thump-thump of the marching invaders
These sounds were achieved via a custom sound board that had seven separate chips on it, one for each of those sounds. When the main processor, running the Space Invaders game program, asked the sound board to play one of the seven sounds, the appropriate chip would be invoked to play it. Each chip made only one sound, which sounded exactly the same every time it played. (Except for the “thump” chip: it rotated its thumping sound among four different pitches, and could be asked to play that cycle at many different rates.)
Each sound chip could be asked to start or stop its sound at any time, but couldn’t be asked to play the same sound more than once over itself. So the same sound couldn’t play simultaneously with itself — but other than that, all seven sounds could play simultaneously, because they were being generated by separate chips running in parallel, and the analog output from those seven chips was mixed together into a single audio signal for the mono speaker at the base of the unit.
Williams
The popular Williams arcade games that came several years later — Defender, Robotron, Joust, and a few others that used similar hardware (not Moon Patrol) — were much more sophisticated than Space Invaders in many ways. For their sound, they used a DAC (Digital-to-Analog Converter), and all the sounds were digital recordings in the ROM of the machine. This allowed for a much richer variety of different sounds, including the human speech in Sinistar, for example.
But it did come with a big limitation: the DAC could play only one sound at a time. It could instantly stop a sound and start a new one at any moment, but it could never overlap two. So those Williams games really only ever played one sound at a time! If you actually played those games when they were current, you might have a hard time believing this, but go back and find them at some retro arcade, and you’ll soon be convinced. Many of the sounds were short, and if they got cut off a little early by another sound, you might not notice. Plus, so many things were going on in the game, and so many sounds were being played, that it was easy to be unaware of the one-sound-at-a-time limitation.
As a programmer myself, keenly interested in games and how they worked, I did notice the one-sound limitation when I was playing those games. But I also realized that the game had some clever way of deciding when to start a new sound (replacing the currently playing one), and when not to do that, in order to minimize the noticeability of the one-sound limitation. I never tried to figure out exactly what rule it was following, but I knew it was doing something to make smart decisions about when to play a sound, and when to skip a sound to let the currently playing sound continue.
Simple Solution?
So when I decided to write a little, fast-action, Williams-esque, arcade-style game for my 1990s personal computer — mainly for my own entertainment, but also in the vain hope that it would make some money — I thought to myself: this computer can play six sounds at a time. So if I just rotate among those six sound channels, I should be fine. The Williams guys had to do something smarter than that, because they had just one channel. But I have six. So channel rotation will probably be just great.
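For reference, the rotation idea is only a few lines of code. This is a minimal sketch in Python, used here purely for illustration; the class and method names are hypothetical and are not taken from the original game.

```python
class RotatingPlayer:
    """Naive channel rotation: each request simply takes the next channel in turn."""

    def __init__(self, num_channels=6):
        # Last sound started on each channel (None means nothing started yet).
        self.channels = [None] * num_channels
        self._next = 0

    def play_sound(self, sound):
        channel = self._next
        # Starting a sound replaces whatever that channel was playing.
        self.channels[channel] = sound
        self._next = (self._next + 1) % len(self.channels)
        return channel
```

Note that the rotation never looks at what a channel is currently playing, so once every channel has been used, each new request cuts off whichever sound happens to be next in line.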
Sound was the last thing I dealt with, so the game was essentially completed and functional by the time I started working on the sound part of it. Then reality hit.
Reality, Dose 1
The first time I opened six sound channels in my game (not even trying to play any sounds on them yet), I discovered to my horror that the performance of the game suffered dramatically. Experimentation with different numbers of sound channels revealed that three was the sweet spot: it didn’t significantly degrade game performance, yet it offered far more than just one or two channels. Four channels seemed like overkill compared to three (can’t I probably get by with three sounds at a time, when the Williams guys made it work with just one?), and four channels did noticeably start to hurt game performance.
So I had three channels to work with, not six. Oh well, I can just rotate among three channels, can’t I? (Not as confident now.)
Reality, Dose 2
When I tried playing my game with sounds, the sounds seemed a little sporadic and inconsistent at first, then quickly became very sparse, and then stopped playing altogether. Restarting the game app cured this problem temporarily, but in much less than a minute of game play, it went completely silent again.
Experimenting and guessing for quite a while, I finally discovered the cause. The OS worked on “ticks,” each of which represented 1/60 of a second of time. If the same sound channel was asked to play more than one sound before the next tick passed, then that sound channel would lock up and become silent until the app was shut down and re-launched. Some sort of OS oversight, I suppose. A minor system bug.
This problem was easily fixed by changing my code so that playSound(), my sound-playing function, would never ask a channel to play more than one sound within the same tick. I don’t remember for sure, but I think I made it keep a list of what it wanted to play on each channel, then submit to the OS’s actual sound channel only the last (most recent) request that occurred in one round (tick’s worth) of game logic execution. That was a perfect simulation of the sound channel as it would function if the lock-up-silence bug did not exist.
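Since I don’t remember the exact code, the following is only a sketch of that idea, written in Python for illustration. The class name, the `submit` callback (standing in for the real OS call), and the `end_of_tick()` hook are all assumptions.

```python
class TickCoalescer:
    """Buffers play requests and submits at most one per channel per tick."""

    def __init__(self, num_channels, submit):
        # submit(channel, sound) is a hypothetical stand-in for the OS sound call.
        self._pending = [None] * num_channels
        self._submit = submit

    def request(self, channel, sound):
        # A later request in the same tick overwrites the earlier one.
        self._pending[channel] = sound

    def end_of_tick(self):
        # Called once per round of game logic: send only the surviving requests.
        for channel, sound in enumerate(self._pending):
            if sound is not None:
                self._submit(channel, sound)
                self._pending[channel] = None
```

Keeping only the most recent request matches what the channel would have done anyway, since a newer sound always replaces an older one on the same channel.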
Reality, Dose 3
The problem of sound channels going permanently silent was now completely cured. But the results were still gravely disappointing. Sounds cut off abruptly at random, sometimes when they had barely started to play, and some noticeably didn’t seem to play at all. The overall effect was of a game with a badly broken, dysfunctional sound system. It was totally unacceptable. Sigh. So now I knew that even with three sound channels to work with, some sort of rudimentary AI was going to be needed to make it sound decent.
What to do? I thought about it for a while, and came up with this plan:
Tables
My plan hinged on three critical tables of information. The first table was just an enumeration list (numbers: 1, 2, 3, etc.) for each of the names of the twenty-or-so sounds in my game. This list already existed, but was in an arbitrary order. So I made its order non-arbitrary, representing the priority of the sounds. Sound #1 was the lowest priority (least important to play), and sound #20 was the highest priority (most important to play).
That order was based partially on the game-related importance of playing a particular sound. E.g. the extra-life-earned sound was the highest priority sound in the entire game, sound #20. The zooming-to-the-next-wave sound was the next-highest, sound #19. And so on. Sounds that were deemed more important for the user to actually hear were given the higher priorities.
But also, much of the ordering had to do with the sound itself: A loud, punchy sound got a higher priority than a softer, more subtle sound, on the theory that if a punchy sound abruptly replaced a softer sound, the user wouldn’t even notice that the softer sound had stopped playing. But if the reverse happened, it would sound jarring and wrong.
Surprisingly, figuring out this order proved to be easy: Several quick choices, and it all made good sense.
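One way to picture this first table is as an enumeration whose numeric value doubles as the priority. In the Python sketch below, only the two top entries (extra life at 20, next wave at 19) come from my description above; the other names and numbers are made up for illustration.

```python
from enum import IntEnum


class Sound(IntEnum):
    # The value is the priority: higher means more important to play.
    SOFT_BLIP = 1      # hypothetical entry
    ENEMY_SHOT = 2     # hypothetical entry
    PLAYER_SHOT = 3    # hypothetical entry
    EXPLOSION = 4      # hypothetical entry
    NEXT_WAVE = 19     # zooming-to-the-next-wave sound
    EXTRA_LIFE = 20    # highest priority in the game
```

Because the members are integers, comparing two sounds’ priorities is just an ordinary `<` or `<=` comparison.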
The second table was a list of sound durations (in ticks). The audio system of the OS didn’t allow you to ask a sound channel if it was currently playing a sound or not. You could tell it to start playing a sound, and if it was already playing one, then you would be replacing that one with the new one. But you couldn’t see if a channel was done playing and was sitting idle. So this sound duration table I created would allow the program to know whether a sound channel was idle, simply by comparing how long it had been since that channel was asked to play a sound, to the duration of the sound it had been asked to play.
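The idle check described above can be sketched as follows. This is a Python illustration with hypothetical sound names and durations, not the original code; `now` is assumed to be the current tick count.

```python
# Hypothetical duration table: sound name -> length in ticks (1 tick = 1/60 s).
DURATION_TICKS = {"shot": 12, "explosion": 45}


class Channel:
    def __init__(self):
        self.sound = None     # last sound started on this channel, if any
        self.started_at = 0   # tick at which that sound was started

    def start(self, sound, now):
        self.sound = sound
        self.started_at = now

    def is_idle(self, now):
        # The OS can't be asked, so infer idleness from elapsed time.
        if self.sound is None:
            return True
        return now - self.started_at >= DURATION_TICKS[self.sound]
```

For example, a 12-tick sound started at tick 100 is treated as still playing at tick 111 and as finished from tick 112 onward.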
The third table was another list of sound durations, but this one represented the important part of the sound. So if a sound was, say, 60 ticks (1 second) long, then the important part might be 20 ticks (the first 1/3 of a second). This was based on my subjective judgment: I examined the sound in my sound app, played limited subsets of it, and decided how much of it was the important part. The important part might be all or most of the sound, or it might be a small fraction of it, depending on what that sound sounded like. The basic idea was that the important part was the part that needed to play to give the user the impression that they had heard the sound, and that it hadn’t been cut off.
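In code, the third table supports a check like the one below. This is a minimal Python sketch using the 60-tick sound with a 20-tick important part from the example above; the sound name and the function name are hypothetical.

```python
# Hypothetical table: sound name -> length of its important part, in ticks.
IMPORTANT_TICKS = {"explosion": 20}  # the full sound is 60 ticks long


def in_important_part(sound, started_at, now):
    """True while the sound is still within its important opening portion."""
    return now - started_at < IMPORTANT_TICKS[sound]
```

A sound started at tick 0 is protected through tick 19; from tick 20 on, it becomes a candidate for being replaced.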
playSound() Logic
When my game wanted to play a sound, it would call my playSound() function, passing it the enum constant of that sound (a value from the first table). The playSound() function would decide how to pass that request on to the three sound channels, then it would return to the calling code (with no return value).
Previously, this playSound() function had simply used the three sound channels in a rotating order, but that profoundly unsatisfactory technique was now replaced with logic that went like this:
Preference 1 — Of the three sound channels, find any one of them that is currently idle, and start the requested sound on that channel. Then exit to the calling code.
If no idle channel exists, then proceed to Preference 2:
Preference 2 — Find which of the three sound channels is playing the lowest-priority sound. (Settle a tie by choosing the one that’s been playing longest.) If that channel’s sound is of lower-or-equal priority to the newly requested sound, then start the new sound on that channel, replacing what it was doing. Then exit.
If all three channels are playing sounds of higher priority than the requested sound, then proceed to Preference 3:
Preference 3 — Find which sound channels are playing the unimportant part of their sound, and choose the lowest-priority of these. (Settle a tie by choosing the one that’s been playing longest.) Start the newly requested sound on that channel, replacing what it was doing. Then exit.
If all three channels are playing the important part of their sounds, then proceed to Preference 4:
Preference 4 — Do not play the newly requested sound. Simply return to the calling code without doing anything.
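The four preferences can be put together roughly like this. It is a Python sketch under stated assumptions, not the original implementation: the sound names, the table values, the `Channel` class, and the explicit `now` tick argument are all hypothetical, and the order of Preferences 2 and 3 follows the text above (see the update note at the end of this post).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tables: name -> (priority, duration ticks, important-part ticks).
SOUNDS = {
    "blip": (1, 10, 4),
    "shot": (2, 15, 5),
    "explosion": (3, 60, 20),
    "extra_life": (4, 90, 90),
}


@dataclass
class Channel:
    sound: Optional[str] = None
    started_at: int = 0

    def elapsed(self, now):
        return now - self.started_at

    def priority(self):
        return SOUNDS[self.sound][0]

    def is_idle(self, now):
        return self.sound is None or self.elapsed(now) >= SOUNDS[self.sound][1]

    def past_important_part(self, now):
        return self.elapsed(now) >= SOUNDS[self.sound][2]


def play_sound(channels, sound, now):
    """Return the index of the channel used, or None if the sound is skipped."""

    def start(i):
        channels[i].sound, channels[i].started_at = sound, now
        return i

    # Preference 1: use any idle channel.
    for i, ch in enumerate(channels):
        if ch.is_idle(now):
            return start(i)

    # Lowest priority first; ties go to the sound that has been playing longest.
    def rank(i):
        return (channels[i].priority(), channels[i].started_at)

    # Preference 2: replace the lowest-priority sound if it doesn't outrank the new one.
    victim = min(range(len(channels)), key=rank)
    if channels[victim].priority() <= SOUNDS[sound][0]:
        return start(victim)

    # Preference 3: replace the lowest-priority sound that is past its important part.
    candidates = [i for i, ch in enumerate(channels) if ch.past_important_part(now)]
    if candidates:
        return start(min(candidates, key=rank))

    # Preference 4: skip the new sound.
    return None
```

The return value exists only to make the sketch easy to test; my real playSound() returned nothing.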
And there it was. My new sound system.
Careful
Even in the ’90s, I had enough experience under my belt to realize that if there was any kind of accidental mistake (bug) in the coding of the above-described logic, I might never know it. I would just listen to the results, and think, “I guess this is how good this technique is,” not realizing that it wasn’t actually working completely correctly. So before even trying to run it, I scrutinized this new code line-by-line, making absolutely sure it was correctly coded to do what I had planned.
The Miracle
All looked good, so I gave it a whirl. Wow!!! What an incredible difference. It seemed that all the issues with sound had just vanished into thin air, and my computer suddenly had unlimited ability to play any sound, at any time! It didn’t seem like it was performing any sort of complex triage; every action in the game just made its sound, like magic. I could scarcely believe it myself. Even knowing exactly what it was really doing, I could hardly tell. It really felt like the game had suddenly been ported to a machine with infinite audio channels that all ran at once.
No further work on the game’s sound system was required. It was done. From that moment on, all my testing/playing of the game never exposed any issue with the playing of sounds.
Lessons Learned
I gleaned a few bits of valuable knowledge from this experience:
- Things that you assume to be dead simple (play a sound and it just plays, right?) can turn out to be hiding a lot of complexity and practical detail that you didn’t even know you had to deal with.
- Simple solutions (just rotate among the sound channels, right?) sometimes don’t work even at a moderately acceptable level of utility.
- Solutions that seem overly complex, even kludgy, can turn out to be amazingly good in practice.
- And of course, I now know how I’ll be coding sounds for any arcade style game, if I ever again create one. Fingers crossed; that might actually happen before I die. 😉
Update 2021.06.09 — Preference 2 and Preference 3 might need to be swapped. I honestly can’t remember now which of those two came first!