Beginning Game Development: Part VIII - DirectSound
Welcome to the eighth article on beginning game development. We have spent a lot of time working with the graphics capabilities of DirectX. We also covered how the DirectX API allows us to control input devices. Now we are going to look at another facet of DirectX, the ability to control sound devices. This capability is found in the DirectSound and AudioVideoPlayback namespaces.
Sound in Games
Sound creates an ambiance in a game that provides for a more immersive game experience. Imagine how dull a game would be without sound effects; nothing would indicate when you fire your cannon or an explosion occurs. Sound can also heighten the drama of a scene, for example by increasing the tempo of the music as the action intensifies.
Sound effects also provide the same audible cues we expect in real life, such as the direction and speed of a person approaching us based on the volume, direction, and frequency of the footsteps. These sound effects add realism to the game just as the proper physical behavior of objects does (something I will cover in an upcoming article about physics). Games also provide background music to make playing the game more fun. In newer games, artists whose music is included in the soundtrack of a popular game see an upswing in sales, and it is not unusual for well-known artists to provide music for a game soundtrack. In effect, games have reached the same level as movies with regard to the importance of soundtracks.
In BattleTank2005 I want to integrate sound in the following way. First, I want standard sound effects for shooting, explosions, and engine noise, and I want that sound to be directionally accurate. What I mean is that when I am getting shot at by an enemy to my left, I want the sound to come from the left and the volume to be an indicator of the proximity of the unit.
Secondly, I want to be able to play background music during game play and I want to control what music plays when in the game. This means I want to have one song play during the splash screen and game setup, another while playing the game, and yet another during the HiScore capturing (we are going to add these screens and states in a later article).
I am going to cover the first requirement in this article, and then cover sound effects and playing MP3 and WMA files with the AudioVideoPlayback namespace in the next article. Before we add these features, let's review the capabilities of DirectSound.
DirectSound
The DirectSound namespace only supports playing two-channel waveform audio (PCM) data at fixed sampling rates. While I have no idea what that really means, it is safe to state that you should use DirectSound to play short WAV files and the AudioVideoPlayback namespace for longer MP3 or WMA files. I am not going to cover how to use the sound capturing/recording capabilities of the DirectSound namespace — but remember that they exist so you know where to look if your game requires recording sounds as well as playing them.
The DirectSound namespace provides the ability to play and capture sound with three-dimensional positioning effects. DirectSound also provides the ability to add sound effects to the audio played or recorded. Just like in the Direct3D and DirectInput namespaces, the actual hardware device used is abstracted into a device class. Just like the device classes in those two namespaces, the DirectSound device uses buffers, has a cooperative level, and has device capabilities. I am not going to cover the sound effects in this article, but they are not forgotten; they will be covered in detail in the next article.
Device
A device is the interface to the audio hardware on the computer. You can either create a Device class using a default GUID (DSoundHelper.DefaultPlaybackDevice) or enumerate all the devices on a system. Like the other device classes, each enumerated device has a list of capabilities stored in a Caps structure in the Caps property for the device. Once you have chosen your device, you instantiate the device class using a specific GUID.
Audio devices also have a cooperative level like the input devices did. The three possible values are: Normal, Priority, and Write Primary. These values are set via the CooperativeLevel enumeration.
Cooperative Level | Meaning |
Normal | The application cannot set the format of, or write to, the primary buffer. For all the applications that use this level, the primary buffer setting is locked at 22 kHz, stereo, 8-bit samples. |
Priority | Provides first rights to access hardware resources for mixing, etc., and can change the format of the primary sound buffer. This is the preferred setting for games. |
Write Primary | Provides direct access to the primary sound buffer, but the application must write directly to the primary buffer. |
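Choosing a level is a single call once the device exists. A minimal sketch, assuming gameForm is your main Windows Forms form (the SoundDevice class we build later in this article wraps the same call):

```csharp
using Microsoft.DirectX.DirectSound;

Device device = new Device(); // default playback device
// Priority is the preferred level for games: first rights to hardware
// resources and permission to change the primary buffer's format.
device.SetCooperativeLevel(gameForm, CooperativeLevel.Priority);
```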
Buffers
All the sounds in DirectX are controlled via buffers. These buffers can exist in the memory of the computer or on the sound card itself. The two buffers used in DirectSound are called the primary buffer and secondary buffer.
The primary buffer contains the actual audio data that is sent to the device and is automatically created and managed by the DirectX API. The API mixes the sound in the primary buffer with any secondary buffers. If you need to interact directly with the primary buffer, make sure to change the Cooperative level of the device to Write Primary.
Secondary buffers hold a single audio stream and must be explicitly created by the application; each application needs at least one secondary buffer to store and play sounds. Each secondary buffer has a specific waveform format (described in the WaveFormat structure), and only sound data matching that format can be loaded into it. An application can play sounds of differing formats by creating a separate secondary buffer for each format; to mix two sounds, simply play their buffers at the same time and let the API blend them into a common format in the primary buffer. The only limit on the number of secondary buffers that can be mixed is the processing power of the system, but remember that any additional processing required will also slow down your game. We have not added any AI or physics computations yet, but we should be careful with the available processing power.
Any secondary buffer can be used for the life of the application, or it can be created and destroyed as needed. A single secondary buffer can contain the same data throughout the entire game or it can be loaded with different sounds (as long as they match the format). The sound in the secondary buffer can be played once or set up to loop. If the sound to be played is short, it can be loaded into the buffer in its entirety (called a static buffer), but longer sounds must be streamed. It is the responsibility of the application to manage the streaming of the sound to the buffer.
When a buffer is created, you have to specify the control options for that buffer using the BufferDescription class. If you use a property of the buffer without first enabling it in the control options, an exception is thrown. The control options can be set either individually, by setting each property to true, or combined in the Flags property.
BufferDescription bufferDescription = new BufferDescription();
// Use the separate properties
bufferDescription.ControlVolume = true;
bufferDescription.ControlPan = true;
// or combine them in the Flags property
bufferDescription.Flags = BufferDescriptionFlags.ControlVolume | BufferDescriptionFlags.ControlPan;
Controlling Volume, Pan, and Frequency
To control these settings of the buffer you must first set the ControlPan, ControlVolume, and ControlFrequency properties of the buffer to true. You can then set the pan, volume, and frequency values using the buffer's Pan, Volume, and Frequency properties.
Volume is expressed in hundredths of a decibel and ranges from 0 (full volume) to -10,000 (completely silent). The decibel scale is not linear, so you may reach effective silence well before the volume setting reaches true silence at -10,000. There is also no way to increase the volume of the sound above the volume it was recorded at, so you have to make sure to record the sound with a high enough volume to at least match the desired maximum volume in the game.
Pan is expressed as an integer and ranges from -10,000 (full left) to +10,000 (full right), with 0 being center.
The frequency value is expressed in samples per second and represents the playback speed of the buffer. A larger number plays the sound faster and raises the pitch, while a smaller number slows the speed down and lowers the pitch. To reset the sound to its original frequency, simply set the frequency value to 0. The minimum value for frequency is 100 and the maximum value is 200,000.
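As a quick sketch of these three properties together (the buffer name, the values, and the conversion helper are illustrative, not part of the article's code):

```csharp
// Assumes 'buffer' is a SecondaryBuffer whose BufferDescription enabled
// ControlVolume, ControlPan, and ControlFrequency.
buffer.Volume = -602;      // roughly half amplitude, in hundredths of a dB
buffer.Pan = -5000;        // halfway to the left (-10000 left ... +10000 right)
buffer.Frequency = 44100;  // samples per second; setting 0 restores the original rate

// Illustrative helper: map a linear 0.0-1.0 volume slider to DirectSound's
// hundredths-of-a-decibel scale (20 * log10(linear) dB, times 100).
private static int LinearToVolume(double linear)
{
    if (linear <= 0.0)
        return -10000; // treat zero as full attenuation
    return Math.Max(-10000, (int)(2000.0 * Math.Log10(linear)));
}
```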
At this point you may think that we could use the volume, pan, and frequency settings to manipulate the sound and make it reflect the direction and distance of its origin. This was, after all, one of our original requirements. But instead of us performing the calculations to determine the relative locations and distances of each object, DirectX provides an API to do just that for us.
3D Sound
The 3D features of DirectX allow us to locate sounds in space and apply Doppler shift to moving sounds.
Note Check out this excellent explanation of the Doppler effect at Wikipedia, or use this applet to help yourself visualize what it is.
Before describing how DirectX handles sound in three dimensions, it is probably useful to talk about how we perceive sound. DirectX uses these same principles to make the sound appear as realistic as possible.
- The volume of a sound decreases as its distance from the listener increases. This effect is called rolloff. The relationship between the volume of a sound and its distance from the listener is inversely proportional, meaning that if the distance is halved the volume doubles. (See: http://en.wikipedia.org/wiki/Inverse_square_law)
- The listener perceives sounds coming from the left as louder in the left ear than in the right ear (the interaural intensity difference) and also hears sounds coming from the left sooner in the left ear than in the right ear (the interaural time difference). (See: http://en.wikipedia.org/wiki/Interaural_Intensity_Difference)
- The shape of the ear produces an effect called "muffling": sounds coming from the front are perceived as louder than those coming from the back. This is, of course, because the human ear is directed towards the front of the head.
- The ridges of the earlobe slightly alter the sound arriving from different directions. This provides cues to the brain about the location of the sound source. This effect can be modeled mathematically and is called the Head Related Transfer Function (HRTF). (See: http://en.wikipedia.org/wiki/Head_Related_Transfer_Function)
We already know that DirectX uses a left-handed Cartesian coordinate system and vectors to express position and direction, and DirectSound uses this same system in its computations. One important detail when dealing with 3D sound is that the default unit of measurement for distance is the meter and the default unit for velocity is meters per second, so you need to use a consistent system of measurement throughout the game. You can change the unit by setting the DistanceFactor property of the Listener3D object to the number of meters per application-specified distance unit. If you have been using feet in all your calculations up to this point, simply set this value to 0.3048 (there are 0.3048 meters in a foot).
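For example, if the rest of the game measures distance in feet, one setting on the listener converts the units (a sketch; _listener3d is the Listener3D instance we create later in the SoundListener class):

```csharp
// 1 application distance unit = 1 foot = 0.3048 meters.
_listener3d.DistanceFactor = 0.3048f;
_listener3d.CommitDeferredSettings(); // apply the deferred change
```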
Also, since we are leaving the manipulation of the sound to DirectX (as opposed to changing the volume, pan, and frequency ourselves), we must ensure that the sound source we use is mono, not stereo. Finally, make sure to set the Control3D property of the buffer to true to enable 3D sound.
DirectSound uses two objects to manage 3D sounds in the application: Buffer3D and Listener3D.
3D Buffers
Unlike the SecondaryBuffer, a Buffer3D object does not inherit from the Buffer class. Instead you create a 3D Buffer object by passing it a SecondaryBuffer in the constructor. The 3D Buffer exposes a number of properties that determine how the sound is processed.
The MinDistance property determines the distance inside of which the sound volume no longer increases as the listener approaches. You can also use this setting to make certain sounds appear louder even if they were recorded at the same volume (see the DirectX documentation for a detailed explanation of this). The default value for this property is 1 meter, meaning that the sound is at full volume when the distance between the listener and the sound source equals 1 meter.
The MaxDistance property is the opposite: it determines the distance beyond which the sound no longer decreases in volume. The default value for this property is 1 billion meters, which is well beyond hearing range anyway. To avoid unnecessary processing, you should set this property to a reasonable value and set the Mute3DAtMaximumDistance property of the BufferDescription to true.
Finally, we can also specify values for the sound cone if the sound is directional. A sound cone is almost identical to the cone produced by a spotlight (see article 6). It consists of two angles, one for the inside cone and one for the outside cone, an orientation, and an outside-cone volume. Check out the DirectX documentation for more detail on sound cones.
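As a hedged sketch of a directional sound, assuming the Managed DirectX cone properties on Buffer3D (the angle and volume values are illustrative):

```csharp
// Assumes _3dBuffer is a Buffer3D wrapping a mono SecondaryBuffer.
Angles cone = new Angles();
cone.Inside = 60;    // full volume within a 60-degree inside cone
cone.Outside = 120;  // volume tapers off between 60 and 120 degrees
_3dBuffer.ConeAngles = cone;
_3dBuffer.ConeOrientation = new Vector3(0.0f, 0.0f, 1.0f); // cone points down +z
_3dBuffer.ConeOutsideVolume = -3000; // hundredths of a dB outside the outer cone
```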
3D Listeners
While the 3D Buffer describes the source of a sound, the 3D Listener describes the, well, listener. Just as the position, orientation, and velocity of the buffer affects the sound, so does the position, orientation, and velocity of the listener. The default listener in DirectX is located at the origin pointing toward the positive z-axis, and the top of the head is along the positive y-axis. Each application can only have one Listener3D object.
To change the way the player hears the sound in your game, you manipulate the position, orientation, and velocity of the Listener. You can also control global settings of the acoustic environment like the Doppler shift or Rolloff factor.
Note The DopplerFactor and RolloffFactor properties are a number between 0 and 10. Zero means that the value is turned off. One represents the real world values of these acoustic effects. All other values are multiples, so that a 2 means doubling the real world effect, 3 means tripling it, and so on.
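Setting these global factors is a one-liner each (a sketch; the values are illustrative, and _listener3d is the Listener3D instance from the SoundListener class below):

```csharp
_listener3d.DopplerFactor = 1.0f;  // real-world Doppler shift
_listener3d.RolloffFactor = 2.0f;  // volume falls off twice as fast as in reality
_listener3d.CommitDeferredSettings(); // commit the deferred listener settings
```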
In BattleTank2005 we are going to add three new classes: a SoundDevice class to represent the actual audio device, a SoundListener class to encapsulate the Listener3D object, and a SoundEffects class to represent each separate sound effect. Each tank in BattleTank2005 can have multiple sounds associated with it, such as engine noise and the noise made when firing. To integrate sound we are going to update the Tank class to play these sounds at the appropriate time and with the appropriate position information. I did not add this functionality to the UnitBase class because the stationary objects (the obstacles) will not have any sounds associated with them.
Before we start adding these classes, we need to add a reference to the Microsoft.DirectX.DirectSound.dll assembly. Make sure to choose the 1.0 and not the 2.0 version; BattleTank2005 has not yet been updated to use the beta versions of the 2.0 DirectX API. Add a new class called SoundDevice and add the using statement for Microsoft.DirectX.DirectSound. All three classes will implement the IDisposable interface and use the same Dispose pattern that we used in the Keyboard and Mouse classes. (I omitted that portion from the code samples below to keep them compact and easy to understand.)
Following the now familiar pattern, we add a private variable called _device to the SoundDevice class and instantiate it in the constructor. We also need to set the cooperative level before we can use the device; as discussed earlier, we will use the CooperativeLevel.Priority setting. We also pass a reference to the game form to this method so that the device can receive Windows messages. Finally, we surround the device-creation code with a try/catch, since things can go wrong whenever we work with devices.
using System;
using Microsoft.DirectX.DirectSound;

namespace BattleTank2005
{
    class SoundDevice : IDisposable
    {
        public SoundDevice(System.Windows.Forms.Form parentForm)
        {
            try
            {
                _device = new Microsoft.DirectX.DirectSound.Device();
                _device.SetCooperativeLevel(parentForm, CooperativeLevel.Priority);
            }
            catch
            {
                // Cannot use sounds
            }
        }

        public Microsoft.DirectX.DirectSound.Device AudioDevice
        {
            get { return _device; }
        }

        private Microsoft.DirectX.DirectSound.Device _device;
    }
}
The next class we need is the SoundListener class. After adding the using statements for DirectSound and DirectX, we construct the class by passing in a reference to the SoundDevice class created earlier. The Listener needs to be associated with the primary buffer of the audio card so it can "hear" the final mixed sounds, so we create a BufferDescription with its PrimaryBuffer and Control3D properties set to true. Once the buffer is created we pass it to the constructor of the Listener3D class to create the Listener. The final step is to store the Listener3DSettings of the listener in a local variable so we can access them in the Update method.
The Update method provides the way in which we pass the position information to the Listener object.
The last step is to apply the updated values to the Listener. We do this by calling the CommitDeferredSettings method of the Listener3D class after updating the Listener3DSettings values. This method simply commits all changes made to the Listener3DSettings since the last time the method was called. When all is done, the SoundListener class is positioned at the correct location in the game.
using System;
using Microsoft.DirectX;
using Microsoft.DirectX.DirectSound;

namespace BattleTank2005
{
    class SoundListener : IDisposable
    {
        public SoundListener(SoundDevice soundDevice)
        {
            BufferDescription bufferDescription = new BufferDescription();
            bufferDescription.PrimaryBuffer = true;
            bufferDescription.Control3D = true;

            // Get the primary buffer
            Microsoft.DirectX.DirectSound.Buffer buffer = new Microsoft.DirectX.DirectSound.Buffer(bufferDescription, soundDevice.AudioDevice);

            // Attach the listener to the primary buffer
            _listener3d = new Listener3D(buffer);

            // Store the initial parameters
            _listenerSettings = _listener3d.AllParameters;
        }

        public void Update(Vector3 position)
        {
            _listener3d.Position = position;
            _listener3d.CommitDeferredSettings();
        }

        private Microsoft.DirectX.DirectSound.Listener3D _listener3d;
        private Listener3DSettings _listenerSettings;
    }
}
Now that we have a class representing the audio device and a class to "listen" to the sounds, we need to actually create the sounds. For this we add the SoundEffects class. Once again we need to add the using statement for the DirectSound namespace. We also need to add a using statement for the DirectX namespace because we are going to use the Vector3 class.
A SoundEffects instance is created by passing a reference to the SoundDevice and the path to a WAV file containing a sound. In the constructor, we create a BufferDescription object that "turns on" all of the capabilities we want. For right now we want the ability to position sounds in 3D, adjust the volume, and change the frequency. Next we create a secondary buffer, passing in the path to the sound file and the BufferDescription we just created. To create the Buffer3D that we need to play 3D sounds, we pass the SecondaryBuffer to the Buffer3D constructor, and voila! we have a Buffer3D.
Once the Buffer3D class is set up we can access the properties and change the settings. We set the MaxDistance to a more manageable number. This setting combined with the Mute3DAtMaximumDistance setting on the SecondaryBuffer ensures that sounds too distant to hear are not played at all and don't consume any processing cycles.
To actually use the SoundEffects class in the game we provide three public methods: one to play the sound, one to stop the sound (if it is looping), and one to update the class. The Play method either plays the sound once or loops the sound, depending on the setting of the _isLooping variable. If the sound is looping, we need to be able to stop it, and that is what the Stop method does. The Update method is how we pass the updated position of the tank the sound is associated with to the sound buffer, so the 3D effect can be calculated accurately.
using System;
using Microsoft.DirectX.DirectSound;
using Microsoft.DirectX;

namespace BattleTank2005
{
    public class SoundEffects : IDisposable
    {
        public SoundEffects(SoundDevice soundDevice, string soundFile)
        {
            BufferDescription bufferDescription = new BufferDescription();
            bufferDescription.Control3D = true;
            bufferDescription.ControlVolume = true;
            bufferDescription.ControlFrequency = true;
            bufferDescription.Mute3DAtMaximumDistance = true;

            try
            {
                _secondaryBuffer = new SecondaryBuffer(soundFile, bufferDescription, soundDevice.AudioDevice);
                _3dBuffer = new Buffer3D(_secondaryBuffer);
                _3dBuffer.MaxDistance = 10000;
            }
            catch
            {
                // Cannot use sounds
            }
        }

        public void Update(Vector3 position)
        {
            _3dBuffer.Position = position;
        }

        public void Play()
        {
            if (_isLooping)
                _secondaryBuffer.Play(0, BufferPlayFlags.Looping);
            else
                _secondaryBuffer.Play(0, BufferPlayFlags.Default);
        }

        public void Stop()
        {
            _secondaryBuffer.Stop();
            _secondaryBuffer.SetCurrentPosition(0);
        }

        public bool IsLooping
        {
            set { _isLooping = value; }
        }

        public int Volume
        {
            set { _secondaryBuffer.Volume = value; }
        }

        public int Frequency
        {
            set { _secondaryBuffer.Frequency = value; }
        }

        private SecondaryBuffer _secondaryBuffer;
        private Buffer3D _3dBuffer;
        private bool _isLooping;
    }
}
Now we need to integrate these new classes into the overall game. The first step is to add a private variable to the GameEngine class to hold a reference to the SoundDevice and SoundListener classes because we can only have one of each of these classes. At the bottom of the GameEngine class, add the following code:
private SoundDevice _soundDevice;
private SoundListener _soundListener;
Next, we need to instantiate each class. The Initialize method of the GameEngine is the perfect spot for this, but to keep the sound-related items together we will create a method called ConfigureSounds. In the GameEngine class, add the following method:
private void ConfigureSounds()
{
    _soundDevice = new SoundDevice(this);
    _soundListener = new SoundListener(_soundDevice);
}
Now add a call to the ConfigureSounds method to the Initialize method of the GameEngine right after the call to the ConfigureDevice method.
ConfigureDevice();
ConfigureSounds();
With regard to the SoundDevice class, this is all we have to do. After creation all we need this class for is its reference to the Device object. The SoundListener class, however, needs to be updated with the correct position information in each frame. The Render method of the GameEngine class is the perfect place for this. We simply pass in the position of the camera class since it represents our location in the game. In the Render method of the GameEngine class, add the following code immediately after the call to the Update method of the camera.
_soundListener.Update(_camera.Position );
Each tank gets a unique instance of the SoundEffect class for each sound it can make. First we need to add some private variables to hold the two sound effects we want the tank to make. At the bottom of the Tank class add the following code:
private SoundEffects _engineSound;
private SoundEffects _fireSound;
To make sure each tank has the necessary sounds, we pass them in through the constructor. Update the constructor of the Tank class as follows (the new code is the two SoundEffects parameters and the sound-related assignments):
public Tank(Device device, string meshFile, Vector3 position, float scale,
            float speed, SoundEffects fireSound, SoundEffects engineSound)
    : base(device, meshFile, position, scale)
{
    _speed = speed;
    _engineSound = engineSound;
    _engineSound.IsLooping = true;
    _engineSound.Play();
    _fireSound = fireSound;
}
Once the sound effects are associated with the tank instance we start the engine sound after setting it to looping.
The next step is to change the Update method of the Tank class so that it updates the 3D buffer with the position of the tank. Add the following code to the end of the Update method:
if (_engineSound != null)
{
    _engineSound.Update(base.Position);
}
The shooting sound of the tank is not continuous, and whether or not the tank shoots will be determined by the AI we are going to add later on. For now, all we are going to do is create the necessary framework to play the shooting sound. Add the following method to the Tank class:
public void Shoot()
{
    if (_fireSound != null)
    {
        _fireSound.Update(base.Position);
        _fireSound.Play();
    }
}
The final step for updating the tank class to be sound ready is to update the CreateTanks method of the GameEngine. We need to create the SoundEffect for the engine noise and the firing noise, and pass them to the tank class on creation. Change the CreateTanks method to look as follows:
private void CreateTanks()
{
    _tanks = new List<UnitBase>();

    SoundEffects engineSound1 = new SoundEffects(_soundDevice, @"EngineSound1.wav");
    engineSound1.Volume = -1000;
    engineSound1.Frequency = 100000; // Set to a value between 100 and 200,000
    SoundEffects engineSound2 = new SoundEffects(_soundDevice, @"EngineSound2.wav");
    engineSound2.Volume = -1000;
    engineSound2.Frequency = 10000; // Set to a value between 100 and 200,000
    SoundEffects fireSound = new SoundEffects(_soundDevice, @"Fire.wav");

    Tank newTank1 = new Tank(_device, @"bigship1.x", new Vector3(0.0f, 20.0f, 100.0f), 1f, 10.0f, fireSound, engineSound1);
    Tank newTank2 = new Tank(_device, @"bigship1.x", new Vector3(100.0f, 20.0f, 100.0f), 1f, 10.0f, fireSound, engineSound2);

    _tanks.Add(newTank1);
    _tanks.Add(newTank2);
}
This is where you can experiment with changing the volume or frequency of the sound effect. Fly around and get closer to each spaceship, and the sound will vary in direction and volume according to your position. Change the frequency, and the pitch will change, and you can guess what the volume setting does.
Summary
When you play BattleTank2005 now, you should hear the engine noises of the various enemy tanks properly adjusted for their spatial relationship to the listener (you). At this point, we have integrated the first set of the audio features we want to add to BattleTank2005. You should definitely experiment with the various settings to hear their effect. I have left several optional settings commented in the code so you can easily change them. In the next article we are going to cover how to use the built-in sound effect manipulation of DirectSound to change the sounds, and how to play a regular MP3 file for the soundtrack using the AudioVideoPlayback namespace.
Until then: Happy coding!