New Sound Technology for PCs

New Technology: Software Architecture


Content developers use DirectSound, part of the Microsoft DirectX suite of APIs, for playing sound from a PC. DirectSound collects the independent components of the intended output sound, mixes them together, and sends the mix to the audio codec for output. To accomplish these operations, DirectSound defines primary and secondary buffers. Generally, the application writes the desired sounds into secondary buffers and DirectSound mixes them into the primary buffer. The contents of the primary buffer are what is sent to the audio codec. DirectSound defines parameters for controlling the volume of each sound, the left-right balance, and the sample rate (among others). Note that the sounds in the secondary buffers do not all have to be at the same sample rate; DirectSound takes care of any sample rate conversion required and automatically enlists any hardware support available.
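As a rough sketch of the mixing step described above – not the actual DirectSound API – the C fragment below sums several “secondary buffers” into a “primary buffer,” scaling each by its own volume and clipping the result to the 16-bit sample range. All names and signatures here are illustrative (the real API expresses volume as attenuation rather than a linear gain, and also performs sample rate conversion).

```c
#include <stdint.h>
#include <stddef.h>

/* Clip an accumulated sample back into the 16-bit range. */
static int16_t clip16(int32_t s)
{
    if (s > INT16_MAX) return INT16_MAX;
    if (s < INT16_MIN) return INT16_MIN;
    return (int16_t)s;
}

/* Illustrative stand-in for DirectSound's mixer: combine `count`
   secondary buffers of `frames` mono samples each into `primary`,
   scaling each buffer by its own linear volume (0.0 - 1.0). */
void mix_buffers(int16_t *primary, size_t frames,
                 const int16_t *const *secondary, const float *volume,
                 size_t count)
{
    for (size_t i = 0; i < frames; i++) {
        int32_t acc = 0;
        for (size_t b = 0; b < count; b++)
            acc += (int32_t)((float)secondary[b][i] * volume[b]);
        primary[i] = clip16(acc);
    }
}
```

The clipping step matters: two loud sources can easily sum past the range of a 16-bit sample, and wrapping around instead of clipping would produce a harsh crackle.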

DirectSound solves at least two problems that existed with the previous API for outputting sound. First, DirectSound gets the sound out to the codec more quickly, known in the trade as “low latency.” Low latency improves synchronization between audio and video and makes it harder to perceive a lag between the time you push a button and the time the sound comes out. Second, DirectSound can mix several components together to produce an amalgamated output. Note, however, that DirectSound works with only one application at a time. If one application is producing laser blasts while a background application is trying to play a system alert, only the laser blasts will reach the output.
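To get a feel for the numbers, the delay contributed by queued audio is simply the number of frames buffered divided by the sample rate. This small helper (illustrative arithmetic, not part of any API) makes that explicit:

```c
/* Back-of-the-envelope latency of a FIFO of audio frames:
   frames buffered divided by sample rate, reported in milliseconds.
   The figures fed to it are illustrative, not measurements of any
   particular driver. */
double buffer_latency_ms(unsigned frames_buffered, unsigned sample_rate_hz)
{
    return 1000.0 * (double)frames_buffered / (double)sample_rate_hz;
}
```

At 22.05 kHz, a queue of 11,025 frames amounts to half a second of delay between writing a sound and hearing it; shrinking that queue is what “low latency” comes down to.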

DirectSound3D

DirectSound3D is the extension to DirectSound for positioning sounds around the listener. It can make it sound as if a jet is flying overhead or a monster is approaching from behind. DirectSound3D not only increases the drama of game play, it can actually provide cues essential to play – tipping you off to look behind you, for example.

The principles that make 3D positioning possible are tricky. Most users have only two loudspeakers in their system. Even if you fantasize about a home theater setup with two of its loudspeakers positioned to the rear, you still would not be able to create the impression of sounds above or below the listener. The key to 3D positioning is to emulate, with digital signal processing, the way in which our ears work. We start by measuring the filter characteristics of our ears with the source sound at many different positions. These measurements are known as the “head-related transfer function,” or HRTF. The computer can then filter the sound with the appropriate HRTF based on the desired position of the sound. All implementations of 3D positioning algorithms are based on these principles.
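In practice, the filtering step is a convolution: the mono source is run through a left-ear and a right-ear impulse response (HRIR) measured for the desired direction. The sketch below assumes made-up three-tap filters purely for illustration; real HRIRs are measured and far longer.

```c
#include <stddef.h>

/* Direct-form FIR filter: out[i] = sum over k of taps[k] * in[i-k]. */
void fir_filter(const float *in, size_t n,
                const float *taps, size_t ntaps, float *out)
{
    for (size_t i = 0; i < n; i++) {
        float acc = 0.0f;
        for (size_t k = 0; k < ntaps && k <= i; k++)
            acc += taps[k] * in[i - k];
        out[i] = acc;
    }
}

/* Sketch of HRTF positioning: filter one mono source with the
   left-ear and right-ear impulse responses for the desired
   direction, producing the two-channel signal to present. */
void position_source(const float *mono, size_t n,
                     const float *hrir_l, const float *hrir_r, size_t ntaps,
                     float *left, float *right)
{
    fir_filter(mono, n, hrir_l, ntaps, left);
    fir_filter(mono, n, hrir_r, ntaps, right);
}
```

Moving the sound means swapping in the HRIR pair for the new direction, which is why a library of measured responses at many positions is needed.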

Presenting the result with loudspeakers involves a serious complication. The sound from each loudspeaker reaches both ears, providing a cue to the location of the loudspeaker that swamps the desired cue to the location of the sound effect itself. However, it is possible to cancel the “crosstalk” signal by emitting appropriately delayed, filtered, and inverted signals from the opposite loudspeaker. Ideally, the result of this tricky “crosstalk cancellation” (CTC) scheme is that the left ear hears only the sound from the left loudspeaker and the right ear hears only the sound from the right loudspeaker. Proper operation depends on knowing the location of the user relative to the loudspeakers, but fortunately in multimedia PC applications we know that the user is generally sitting right in front of the monitor to get a good view of the scene.
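A first-order sketch of the idea: for each speaker signal, add to the opposite channel a delayed, attenuated, and inverted copy intended to cancel the wave arriving at the wrong ear. A real CTC filter also applies the head-shadow filtering from the HRTF and must account for the cancellation signal itself crossing over; the gain and delay below are illustrative placeholders.

```c
#include <stddef.h>

/* First-order crosstalk cancellation sketch: subtract from each
   output channel a delayed, attenuated copy of the *other* input
   channel (subtraction = inversion). `delay` is in samples and
   `gain` models the attenuation of the crosstalk path; both are
   made-up values for illustration. */
void ctc_first_order(const float *left_in, const float *right_in, size_t n,
                     size_t delay, float gain,
                     float *left_out, float *right_out)
{
    for (size_t i = 0; i < n; i++) {
        left_out[i]  = left_in[i];
        right_out[i] = right_in[i];
        if (i >= delay) {
            left_out[i]  -= gain * right_in[i - delay];
            right_out[i] -= gain * left_in[i - delay];
        }
    }
}
```

The delay and gain only cancel correctly for one assumed listener position, which is why the scheme leans on the user sitting centered in front of the monitor.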

The filtering required by DirectSound3D can be time-consuming. Special hardware helps by offloading this burden from the host CPU so that it is free to deal with other requirements of the game – graphics, disk transfers, reading user inputs, etc. – in the timely fashion required to make the game feel responsive.


DirectMusic, part of DirectX 6, is the newest component of this arsenal. DirectMusic performs three functions. First, it provides a wavetable synthesizer that runs on the host, ensuring that all modern PCs will be capable of rendering a MIDI score. Second, it provides interactive music composition. This API makes it possible for content developers to specify musical accompaniment in terms of the desired style and musical characteristics – and the developer can change these characteristics interactively by altering parameters that, for example, increase the tempo or add voices.
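The core of a wavetable synthesizer is simple to sketch: to sound a MIDI note, a stored sample is read back at a rate scaled by the ratio of the target pitch to the pitch at which the sample was recorded, interpolating between source samples. This fragment is illustrative only; a real synthesizer adds looping, envelopes, and better interpolation.

```c
#include <stddef.h>

/* Illustrative wavetable playback: read `sample` (recorded at
   `recorded_hz`) back at a rate scaled to produce `target_hz`,
   using linear interpolation between neighboring source samples.
   Returns the number of output samples produced. */
size_t render_note(const float *sample, size_t sample_len,
                   double recorded_hz, double target_hz,
                   float *out, size_t out_len)
{
    double step = target_hz / recorded_hz;  /* playback rate ratio */
    double pos = 0.0;
    size_t written = 0;
    while (written < out_len && pos < (double)(sample_len - 1)) {
        size_t i = (size_t)pos;
        double frac = pos - (double)i;
        out[written++] = (float)((1.0 - frac) * sample[i]
                                 + frac * sample[i + 1]);
        pos += step;
    }
    return written;
}
```

Doubling the target frequency steps through the stored sample twice as fast, raising the note an octave and halving its duration – the familiar trade-off of sample-based synthesis.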

The third function of DirectMusic is downloading sounds. As we discussed above, downloadable sounds increase variety by permitting developers to create their own palettes rather than depending on a standard palette such as General MIDI. Custom palettes also maximize the sound quality achievable with a given amount of memory because they don’t waste space on unneeded samples. Furthermore, sound quality is more consistent across platforms because custom palettes avoid variations due to different “interpretations” of the General MIDI standard. Downloadable sounds will also be useful for sound effects. By including sound effects in the custom palette, the designer can use MIDI to control audio events and can use the wavetable synthesizer to control characteristics of the sound.

New Technology: Hardware Architecture