Writing Windows Audio Drivers

Advancing the Architecture

The introduction of the WDM Audio platform has created a unified architecture for implementing audio drivers for the Windows platform. New features and enhancements to this architecture have helped bring new types of applications to the Windows platform, such as PC-based soft DVD players, streaming of high-quality secure audio content, and home theater multi-channel surround sound support. This article focuses on some of the new challenges that face the audio driver developer when integrating support for these types of applications.

Surround Sound Playback

Many soft DVD players support audio playback in several formats. These formats are typically DirectSound or waveOut, multi-channel speaker configurations, and encoded AC3 digital output. DirectSound and waveOut playback requires no additional support from the driver, but it does not fully exploit the Dolby Digital 5.1 audio featured on most DVD titles. For a true home theater experience, either multi-channel playback or encoded AC3 digital output must be supported.

Supporting AC3 Dolby Digital Output over a Digital Interface

The most popular transport for AC3 encoded Dolby Digital 5.1 data is the Sony/Philips Digital Interface, or SPDIF. SPDIF is a widely used digital interface that is implemented on some high-end sound cards. Including an SPDIF port on a sound card enables the card to support external AC3 decoder speakers for use in a home theater surround sound environment. Sound cards can be designed to transfer compressed AC3 data or raw PCM through the SPDIF interface.

Non-PCM Data Routing

Starting with Windows Me, AC3 encoded data (or any other type of non-PCM data) that is passed through the waveOut or DirectSound API is routed directly to PortCls. This is a departure from previous versions of the Windows operating system, in which all audio content was sent through KMixer. KMixer does not support private or encoded audio data formats such as encoded Dolby Digital AC3.

Driver Requirements

In a WDM audio driver, data flows in and out of an abstract object called a filter. Filters contain one or more arrays of PCPIN_DESCRIPTOR structures. These arrays are called pin factories because this information is used to generate pins on the filter. All data that flows in and out of a filter passes through the filter's pins. To enable your driver to accept AC3 data from an application, you must first associate a pin in your filter with the AC3 data stream. Second, you must correctly define the pin factory that describes your driver's AC3 pin. These structures describe the nature of the data flow through your driver. For example: can there be multiple instances of this pin (that is, multiple streams), does the pin receive or produce data, and what data formats does the pin support?

There are several ways to define the pin and pin factory for your AC3 enabled hardware, but some of these implementations have potential problems. The safest approach is to create a separate pin factory for AC3 formatted data. This allows for the creation of a separate pin that is exclusive to AC3 data. If your hardware supports only a single stream, you may need to arbitrate hardware access between your PCM pin and your AC3 pin. In addition, you must define the AC3 pin on a filter that already supports PCM, or your AC3 pin will not be used by DirectSound 8. The only way a device can support AC3 is to also support PCM.

In addition to the requirements listed above, you must also implement a DataRangeIntersection member function for your AC3 pin. You will need to override the default DataRangeIntersection handler provided by PortCls. The MSVAD sample in the Whistler DDK demonstrates the proper way to implement a DataRangeIntersection routine for accepting AC3 data. Use \tools\avstream\KsStudio.EXE in the Whistler DDK to test your intersection handler. This will allow you to instantiate your AC3 pin manually.

The DataRangeIntersection handler for your AC3 pin should return a properly formatted KSDATAFORMAT_WAVEFORMATEX structure in the ResultantFormat parameter. The KSDATAFORMAT_WAVEFORMATEX structure requires a GUID for the SubFormat field of the KSDATAFORMAT structure. This GUID is created by using the DEFINE_WAVEFORMATEX_GUID macro in ksmedia.h. The macro takes a constant wave format tag, which is defined in mmreg.h. WAVE_FORMAT_DOLBY_AC3_SPDIF is defined in mmreg.h as 0x0092. The following code demonstrates the use of this macro.

#define STATIC_KSDATAFORMAT_SUBTYPE_DOLBY_AC3_SPDIF \
  DEFINE_WAVEFORMATEX_GUID(WAVE_FORMAT_DOLBY_AC3_SPDIF)

DEFINE_GUIDSTRUCT("00000092-0000-0010-8000-00aa00389b71",
  KSDATAFORMAT_SUBTYPE_DOLBY_AC3_SPDIF);

#define KSDATAFORMAT_SUBTYPE_DOLBY_AC3_SPDIF \
  DEFINE_GUIDNAMED(KSDATAFORMAT_SUBTYPE_DOLBY_AC3_SPDIF)
...
// from DataRangeIntersection
INIT_WAVEFORMATEX_GUID(pMyGuid,myWaveFormatTag);
...
if (IS_VALID_WAVEFORMATEX_GUID(aWaveFormatExGuidPtr)) {
    aWaveFormatTag = EXTRACT_WAVEFORMATEX_ID(aWaveFormatExGuidPtr);
}
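
The macros above all operate on one GUID template in which only the first field varies with the wave format tag. The following is a minimal portable sketch of that idea, not the ksmedia.h implementation: SketchGuid and the sketch_ functions are hypothetical stand-ins for GUID, INIT_WAVEFORMATEX_GUID, IS_VALID_WAVEFORMATEX_GUID, and EXTRACT_WAVEFORMATEX_ID, and the template bytes are taken from the GUID string shown in the listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the Windows GUID type (same field layout assumed). */
typedef struct {
    uint32_t Data1;
    uint16_t Data2;
    uint16_t Data3;
    uint8_t  Data4[8];
} SketchGuid;

/* Template 00000000-0000-0010-8000-00aa00389b71; Data1 receives the format tag. */
static const SketchGuid kWaveFormatExTemplate = {
    0x00000000u, 0x0000, 0x0010,
    { 0x80, 0x00, 0x00, 0xaa, 0x00, 0x38, 0x9b, 0x71 }
};

/* Value given in the article for WAVE_FORMAT_DOLBY_AC3_SPDIF. */
#define SKETCH_WAVE_FORMAT_DOLBY_AC3_SPDIF 0x0092u

/* Build the subformat GUID for a 16-bit wave format tag. */
void sketch_init_waveformatex_guid(SketchGuid *guid, uint16_t format_tag)
{
    assert(guid != NULL);
    *guid = kWaveFormatExTemplate;
    guid->Data1 = format_tag;
}

/* A GUID belongs to the family if everything except the low 16 bits
   of Data1 matches the template. */
int sketch_is_valid_waveformatex_guid(const SketchGuid *guid)
{
    assert(guid != NULL);
    return guid->Data1 <= 0xFFFFu
        && guid->Data2 == kWaveFormatExTemplate.Data2
        && guid->Data3 == kWaveFormatExTemplate.Data3
        && memcmp(guid->Data4, kWaveFormatExTemplate.Data4,
                  sizeof guid->Data4) == 0;
}

/* Recover the wave format tag from a GUID in the family. */
uint16_t sketch_extract_waveformatex_id(const SketchGuid *guid)
{
    assert(guid != NULL);
    return (uint16_t)guid->Data1;
}
```

Initializing a SketchGuid with SKETCH_WAVE_FORMAT_DOLBY_AC3_SPDIF yields the 00000092-0000-0010-8000-00aa00389b71 value from the listing, and extracting the ID from that GUID returns 0x0092 again.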

In addition to a DataRangeIntersection handler, your miniport must also handle the format properly in the Init and NewStream member functions. Your stream interface must handle the SetFormat function properly for the AC3 format.

Additional Information

Currently, the only shipping operating system that supports the playback of non-PCM data via waveOut is Windows Me. Windows 2000 will add this support in Service Pack 2. Support for this feature is available for Windows 98 Second Edition by obtaining a Quick Fix Engineering (QFE) package from Microsoft Product Support Services. The QFE is titled 269601USA8.EXE. This QFE is not supported on the original release of Windows 98.

A bug in the system graph builder in Windows 98 requires that your AC3 pin be connected to at least one node. The QFE mentioned above will not work unless your data path has at least one node attached.

Additional information concerning non-PCM playback can be found in the following article:

http://www.Microsoft.com/hwdev/audio/Non-PCM.htm

Supporting Multi-Channel Playback

Most DVD titles and some games support an enhanced listening experience by providing content that is tailored for multi-speaker (channel) installations. In most cases, the audio content is channel specific. This section discusses implications and design strategy for writing multi-channel audio drivers. These drivers would support Dolby Digital 5.1 Surround Sound speaker configurations and the WAVEFORMATEXTENSIBLE format, which is needed for multi-channel support.

Introducing WAVEFORMATEXTENSIBLE

Because the older WAVEFORMAT structure offered only a limited way to specify channel assignments, a new extension was defined that allows for future growth while maintaining backward compatibility. The WAVEFORMATEXTENSIBLE structure allows an audio format to be described in a way that overcomes the previous limitation of the WAVEFORMAT structure. Previously, channel assignments were implied by the number of channels given in WAVEFORMAT.nChannels. This value was typically either two (stereo) or one (mono). With the advent of advanced audio designs and the demand for home theater solutions, it became necessary to move beyond two channels and begin dealing with various speaker configurations. Quadraphonic (four-corner), 3.1 (front left, front center, front right, low-frequency enhancement), and 5.1 (front left, front center, front right, back left, back right, low-frequency enhancement) are some of the more popular speaker configurations. With all of these speaker configurations, there was no way to map specific channels to specific speakers, and the layout of the wave data did not correspond to a specific channel or speaker. To overcome this problem, a default channel ordering was defined. This channel ordering specifies how channels are mapped to speakers and how the audio data is interleaved in the actual wave file. The channel order is:

1) Front Left - FL
2) Front Right - FR
3) Front Center - FC
4) Low Frequency - LF
5) Back Left - BL
6) Back Right - BR
7) Front Left of Center - FLC
8) Front Right of Center - FRC
9) Back Center - BC
10) Side Left - SL
11) Side Right - SR
12) Top Center - TC
13) Top Front Left - TFL
14) Top Front Center - TFC
15) Top Front Right - TFR
16) Top Back Left - TBL
17) Top Back Center - TBC
18) Top Back Right - TBR

The audio channels are interleaved in the order listed. If a channel is not included in the wave file, then the channel is omitted from the interleaved data. This allows for any number of channels (from 1 to 18).
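
Because absent channels are simply omitted, the position of a speaker's sample within one interleaved frame is the number of present speakers that come before it in the default ordering. The sketch below illustrates that rule with a bit mask in which bit 0 is Front Left, bit 1 is Front Right, and so on down the list. The function name sketch_channel_slot is hypothetical, and the SPEAKER_ values are repeated from mmreg.h only so that the sketch stands alone.

```c
#include <assert.h>
#include <stdint.h>

/* Speaker bits in default channel order (values as defined in mmreg.h). */
#define SPEAKER_FRONT_LEFT     0x1u
#define SPEAKER_FRONT_RIGHT    0x2u
#define SPEAKER_FRONT_CENTER   0x4u
#define SPEAKER_LOW_FREQUENCY  0x8u
#define SPEAKER_BACK_LEFT      0x10u
#define SPEAKER_BACK_RIGHT     0x20u

/* Return the zero-based position of one speaker's sample inside an
   interleaved frame, or -1 if the speaker is not part of the mask.
   speaker_bit must have exactly one bit set. */
int sketch_channel_slot(uint32_t channel_mask, uint32_t speaker_bit)
{
    uint32_t bit;
    int slot = 0;

    /* Reject zero and multi-bit arguments. */
    if (speaker_bit == 0 || (speaker_bit & (speaker_bit - 1)) != 0) {
        return -1;
    }
    /* The speaker has no slot if it is absent from the stream. */
    if ((channel_mask & speaker_bit) == 0) {
        return -1;
    }
    /* Count the present speakers that precede this one. */
    for (bit = 1u; bit != speaker_bit; bit <<= 1) {
        if (channel_mask & bit) {
            slot++;
        }
    }
    assert(slot >= 0);
    return slot;
}
```

For a stream that contains only the front left, front right, back left, and back right channels, the back left sample is the third sample of each frame (slot 2), even though Back Left is fifth in the full ordering.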

As you can see below, WAVEFORMATEXTENSIBLE includes WAVEFORMATEX as a member of the structure. This allows WAVEFORMATEXTENSIBLE to provide maximum compatibility with previous methods of specifying the wave format.

typedef struct {
  WAVEFORMATEX Format;
  union {
    WORD wValidBitsPerSample;
    WORD wSamplesPerBlock;
    WORD  wReserved;
  } Samples;
  DWORD  dwChannelMask;
  GUID  SubFormat;
} WAVEFORMATEXTENSIBLE, *PWAVEFORMATEXTENSIBLE;

The dwChannelMask field was designed to provide a mapping of channels to actual speakers that are available. The following constants are defined in mmreg.h:

#define SPEAKER_FRONT_LEFT                0x1
#define SPEAKER_FRONT_RIGHT               0x2
#define SPEAKER_FRONT_CENTER              0x4
#define SPEAKER_LOW_FREQUENCY             0x8
#define SPEAKER_BACK_LEFT                 0x10
#define SPEAKER_BACK_RIGHT                0x20
#define SPEAKER_FRONT_LEFT_OF_CENTER      0x40
#define SPEAKER_FRONT_RIGHT_OF_CENTER     0x80
#define SPEAKER_BACK_CENTER               0x100
#define SPEAKER_SIDE_LEFT                 0x200
#define SPEAKER_SIDE_RIGHT                0x400
#define SPEAKER_TOP_CENTER                0x800
#define SPEAKER_TOP_FRONT_LEFT            0x1000
#define SPEAKER_TOP_FRONT_CENTER          0x2000
#define SPEAKER_TOP_FRONT_RIGHT           0x4000
#define SPEAKER_TOP_BACK_LEFT             0x8000
#define SPEAKER_TOP_BACK_CENTER           0x10000
#define SPEAKER_TOP_BACK_RIGHT            0x20000
#define SPEAKER_RESERVED                  0x80000000

For example, if you have a Quadraphonic (four speaker) configuration, nChannels = 4 and dwChannelMask = 0x00000033. This would specify the playback was intended for the front left, front right, back left and back right speakers. The audio data would be interleaved in that order.
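
In the quadraphonic example, the number of bits set in dwChannelMask equals nChannels. A driver could use that relationship as a sanity check on an incoming format. The sketch below is one possible check under that assumption; the article does not say how a mismatch between the two values should be handled, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Count the speakers named in a channel mask (number of set bits). */
unsigned sketch_count_speakers(uint32_t channel_mask)
{
    unsigned count = 0;

    while (channel_mask != 0) {
        channel_mask &= channel_mask - 1; /* clear the lowest set bit */
        count++;
    }
    assert(count <= 32);
    return count;
}

/* Nonzero when the mask names exactly as many speakers as there are channels. */
int sketch_mask_matches_channels(uint16_t n_channels, uint32_t channel_mask)
{
    return sketch_count_speakers(channel_mask) == n_channels;
}
```

With the values from the example, sketch_mask_matches_channels(4, 0x00000033) is nonzero, while the same mask paired with six channels is rejected.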

For a detailed description of WAVEFORMATEXTENSIBLE, see the article at http://www.microsoft.com/hwdev/audio/multichaud.htm

Modifications Required to Support Multi-Channel Playback

To add support for multi-channel playback in your audio driver, you must make the WDM audio system aware of the capabilities of your hardware. First, you must use an extended version of KSDATAFORMAT_WAVEFORMATEX. The example below shows a typical use of the extended format structure.

DataFormat.FormatSize  = sizeof(KSDATAFORMAT) + sizeof(WAVEFORMATEXTENSIBLE);
DataFormat.Flags       = 0;
DataFormat.SampleSize  = 0;
DataFormat.Reserved    = 0;
DataFormat.MajorFormat = STATICGUIDOF(KSDATAFORMAT_TYPE_AUDIO);
DataFormat.SubFormat   = STATICGUIDOF(KSDATAFORMAT_SUBTYPE_PCM);
DataFormat.Specifier   = STATICGUIDOF(KSDATAFORMAT_SPECIFIER_WAVEFORMATEX);
Format.wFormatTag      = WAVE_FORMAT_EXTENSIBLE;
Format.nChannels       = 4;
Format.nSamplesPerSec  = 44100;
Format.nAvgBytesPerSec = 352800;
Format.nBlockAlign     = 8;
Format.wBitsPerSample  = 16;
Format.cbSize          = sizeof(WAVEFORMATEXTENSIBLE) - sizeof(WAVEFORMATEX);
Format.wValidBitsPerSample = 16;
Format.dwChannelMask   = KSAUDIO_SPEAKER_SURROUND;
Format.SubFormat       = KSDATAFORMAT_SUBTYPE_PCM;
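
The nBlockAlign and nAvgBytesPerSec values in the example above are not independent: for PCM they follow from the channel count, sample size, and sample rate. The sketch below shows that arithmetic using a hypothetical SketchPcmFormat structure that carries only the fields involved; it is not the WAVEFORMATEX definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the WAVEFORMATEX fields used in the example. */
typedef struct {
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint16_t wBitsPerSample;
    uint16_t nBlockAlign;      /* derived: bytes per interleaved frame */
    uint32_t nAvgBytesPerSec;  /* derived: bytes per second of audio   */
} SketchPcmFormat;

/* Fill in the derived fields for PCM data whose sample size is a
   whole number of bytes. */
void sketch_fill_derived_fields(SketchPcmFormat *fmt)
{
    assert(fmt != NULL);
    assert(fmt->wBitsPerSample % 8 == 0);

    fmt->nBlockAlign = (uint16_t)(fmt->nChannels * (fmt->wBitsPerSample / 8u));
    fmt->nAvgBytesPerSec = fmt->nSamplesPerSec * fmt->nBlockAlign;
}
```

For 4 channels of 16-bit samples at 44100 Hz, this gives nBlockAlign = 4 * 2 = 8 and nAvgBytesPerSec = 44100 * 8 = 352800, matching the listing.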

Second, to inform the operating system that you support multi-channel playback, modify the KSDATARANGE_AUDIO structure that is specified for your playback pin. Below is an example of a typical KSDATARANGE_AUDIO structure.

DataRange.FormatSize = sizeof(KSDATARANGE_AUDIO);
DataRange.Flags      = 0;
DataRange.SampleSize = 0;
DataRange.Reserved   = 0;
DataRange.MajorFormat = STATICGUIDOF(KSDATAFORMAT_TYPE_AUDIO);
DataRange.SubFormat   = STATICGUIDOF(KSDATAFORMAT_SUBTYPE_PCM);
DataRange.Specifier   = STATICGUIDOF(KSDATAFORMAT_SPECIFIER_WAVEFORMATEX);
MaximumChannels              = 4;   // max number of channels or -1 for unlimited
MinimumBitsPerSample         = 2;
MaximumBitsPerSample         = 16;
MinimumSampleFrequency       = 5000;
MaximumSampleFrequency       = 48000;

In addition to the above modifications, you will also need to modify the DataRangeIntersection member function of your driver's stream interface object to support the WAVEFORMATEXTENSIBLE format. Refer to the latest Windows 2000 DDK in winddk\src\wdm\audio\ac97 for a multi-channel enabled driver sample.
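
At its core, an intersection handler has to decide whether a requested format fits inside the limits advertised in the data range. The sketch below shows only that range test, using hypothetical SketchAudioRange and SketchRequest structures that mirror the numeric limits in the listing above. It is not the PortCls handler signature and it omits the GUID comparison and the construction of the resulting format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the numeric limits in KSDATARANGE_AUDIO. */
typedef struct {
    uint32_t MaximumChannels;        /* 0xFFFFFFFF (-1) means unlimited */
    uint32_t MinimumBitsPerSample;
    uint32_t MaximumBitsPerSample;
    uint32_t MinimumSampleFrequency;
    uint32_t MaximumSampleFrequency;
} SketchAudioRange;

/* Hypothetical description of the format a client asks for. */
typedef struct {
    uint32_t channels;
    uint32_t bits_per_sample;
    uint32_t sample_rate;
} SketchRequest;

/* Nonzero when the requested format lies inside the advertised range. */
int sketch_format_in_range(const SketchAudioRange *range,
                           const SketchRequest *req)
{
    assert(range != NULL && req != NULL);

    return req->channels >= 1
        && req->channels <= range->MaximumChannels
        && req->bits_per_sample >= range->MinimumBitsPerSample
        && req->bits_per_sample <= range->MaximumBitsPerSample
        && req->sample_rate >= range->MinimumSampleFrequency
        && req->sample_rate <= range->MaximumSampleFrequency;
}
```

Because MaximumChannels is unsigned, the "unlimited" value of -1 compares as the largest possible channel count, so no special case is needed for it in this sketch.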

Other Multi-Channel Considerations

All multi-channel data from the WDM audio system is interleaved. If your hardware does not understand the interleaved format of the rear channels then you will need to de-interleave the data before it is passed to your hardware.
Your hardware must support volume and mute control on all channels. Your hardware may implement a separate volume control for each channel, or it may implement a master volume control for all channels. These controls can be implemented in either the host audio controller or the audio codec. Your topology will need to reflect the volume controls that your hardware actually implements.
If your hardware supports multiple speaker configurations, you may need to update your topology if a speaker assignment is changed. At this time the WDM audio system does not support dynamic topology updates so a reboot may be required if the speaker configuration changes.
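
The de-interleaving step mentioned in the first consideration above can be sketched as a simple copy loop. This is a user-mode illustration for 16-bit samples only; the function name and the per-channel "planes" buffers are hypothetical, and a real driver would work on its own DMA buffers in whatever layout the hardware expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split interleaved 16-bit frames into one buffer per channel.
   planes[c] must have room for `frames` samples. */
void sketch_deinterleave16(const int16_t *interleaved,
                           size_t frames,
                           size_t channels,
                           int16_t *const *planes)
{
    size_t f, c;

    assert(interleaved != NULL && planes != NULL);

    for (f = 0; f < frames; f++) {
        for (c = 0; c < channels; c++) {
            /* Sample c of frame f sits at offset f * channels + c. */
            planes[c][f] = interleaved[f * channels + c];
        }
    }
}
```

For a quadraphonic stream, planes[0] through planes[3] receive the front left, front right, back left, and back right samples in that order, following the default channel ordering.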

Protecting Content with Digital Rights Management

Streaming media and downloadable music are two applications that have created an industry need for the protection of digital media. Digital Rights Management (DRM) is the technology that allows for protection of digital media on the WDM audio platform. The Secure Audio Path (SAP) architecture implements DRM to allow content providers to control the level and manner of protection of their copyrighted content.

DRM and SAP, the Big Picture

DRM uses encryption to copy-protect copyrighted audio content. On Windows 98 and Windows 2000, this encryption is limited to the application level. On Windows Me and later, the DRM encryption is maintained into the operating system kernel. Windows Me and later also support the concept of a trusted device driver. A trusted driver is a device driver that obeys the copy protection that is specified by the content provider and passed through the DRM system. Once the encrypted data reaches the device driver, the audio is played if the driver is trusted. Currently, DRM is implemented only for digital audio. MIDI and DLS sets are not supported under the current DRM system.

Implementing DRM in Your WDM Audio Device Driver

Before you begin adding DRM support to your device driver, you must first download the Windows Millennium Edition DRM Supplement to the Windows 2000 DDK. This supplement is available at http://www.microsoft.com/hwdev/audio/DRMsup.htm. It includes updated header files that define the interfaces required for supporting DRM.

The first step in supporting DRM in your driver is adding the IDrmAudioStream interface to the stream objects of your WaveCyclic and WavePci miniports. Once the driver is loaded, if a secure audio path is requested, the DRM system component above PortCls calls the QueryInterface method on the IMiniportWavePciStream or IMiniportWaveCyclicStream interface. You must return an IDrmAudioStream interface in response to a QueryInterface call that specifies the interface ID IID_IDrmAudioStream. This allows the system to obtain a reference to the stream's IDrmAudioStream interface. Once the DRM system has that reference, it can call the SetContentId method. SetContentId is used to pass the content ID and the rights for a specific stream. The content ID is used by the DRM system to identify protected content in an audio stream.

NTSTATUS
IDrmAudioStream::SetContentId(
  IN ULONG ContentId,
  IN PCDRMRIGHTS DrmRights
  );

The second parameter of SetContentId is a pointer to a DRMRIGHTS structure. This structure contains two flags that describe the associated rights for the ContentId.

typedef struct tagDRMRIGHTS {
  BOOL CopyProtect;
  ULONG Reserved;
  BOOL DigitalOutputDisable;
} DRMRIGHTS , *PDRMRIGHTS;

These two flags are described below.

CopyProtect

Specifies one of the following copy-protection values:

Value	Meaning
TRUE (nonzero)	A software component must not store the content in any form in any nonvolatile storage, and must not pass the content by reference or by value to any other component within the host system that is not authenticated by the DRM system.
FALSE (zero)	There are no restrictions on copying the content.

DigitalOutputDisable

Specifies one of the following digital output protection values:

Value	Meaning
TRUE (nonzero)	A software component must not transfer the content out of the host system through any type of digital interface. Note that digital output protection does not affect USB devices because the host system includes USB devices.
FALSE (zero)	There are no restrictions on transferring the content from the host system to an external component.

Supporting DRM with Advanced Audio Features

The DigitalOutputDisable flag of the DRMRIGHTS structure applies primarily to S/PDIF or AES/EBU outputs. If this flag is TRUE, the digital output must be disabled. This does not apply to USB digital audio. DRM support for USB will be addressed in future versions of Windows. Although it is not required for WHQL DRM certification, it is good practice to set the Serial Copy Management System (SCMS) bits in the SPDIF stream to prohibit copying if DRMRIGHTS.CopyProtect is TRUE.
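
The two rights flags translate into a small set of hardware decisions. The sketch below maps them to a hypothetical SketchOutputPolicy structure under the rules stated in this article: DigitalOutputDisable turns the digital output off, and CopyProtect blocks capture or loopback of the stream and, as the optional good practice described above, requests SCMS copy prohibition. SketchDrmRights only mirrors the layout of DRMRIGHTS; none of these names come from the DDK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the DRMRIGHTS layout shown earlier. */
typedef struct {
    int           CopyProtect;
    unsigned long Reserved;
    int           DigitalOutputDisable;
} SketchDrmRights;

/* Hypothetical summary of what the hardware is allowed to do. */
typedef struct {
    int digital_output_enabled;   /* SPDIF or similar output may run  */
    int capture_loopback_enabled; /* stream may be captured or looped */
    int scms_copy_prohibited;     /* set SCMS to prohibit copying     */
} SketchOutputPolicy;

/* Derive the hardware policy for one stream from its rights. */
SketchOutputPolicy sketch_policy_from_rights(const SketchDrmRights *rights)
{
    SketchOutputPolicy policy;

    assert(rights != NULL);

    policy.digital_output_enabled   = !rights->DigitalOutputDisable;
    policy.capture_loopback_enabled = !rights->CopyProtect;
    policy.scms_copy_prohibited     = rights->CopyProtect ? 1 : 0;
    return policy;
}
```

A stream's SetContentId handler would be a natural place to compute such a policy and apply it to the hardware, since that is where the rights arrive.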

Some audio devices handle multiple audio streams and mix these streams into a single output at the hardware level. In this scenario, the driver must keep track of the content IDs and the associated content rights of all streams. In addition, the driver must also keep track of the composite rights of all of the mixed streams. The system provides the IDrmPort interface to help manage the composite rights of the mixed streams. IDrmPort contains the following methods:

IDrmPort::CreateContentMixed
IDrmPort::DestroyContent
IDrmPort::ForwardContentToFileObject
IDrmPort::ForwardContentToInterface
IDrmPort::GetContentRights

Testing Your DRM Enabled Driver

To qualify as a trusted DRM driver, your device driver must pass the Microsoft Windows Hardware Quality Lab (WHQL) DRM compliance test. This test is performed by WHQL once your driver has passed the standard Audio WHQL Windows logo test. A driver that has obtained the Windows logo will be marked as DRM certified or non-certified. The DRM certification test is performed at the vendor's request. To pass the DRM certification, a driver must do the following:

The audio miniport driver must correctly implement the IDrmAudioStream interface.
The driver must disable the ability to capture the stream currently being played back if DRMRIGHTS.CopyProtect = TRUE. The driver must not store the unprotected digital content to any other medium, including hard disk, EEPROM, memory card, and memory stick. The driver must disable the loopback of digital audio data on the audio device.
The driver must disable any digital output on the device if DRMRIGHTS.DigitalOutputDisable = TRUE. Digital outputs include, but are not limited to, SPDIF, IEEE 1394, serial, parallel, modem, and network ports. Note: This does not currently apply to USB.
The driver must never attach to an untrusted driver when handling secure content. The driver must rely only on components that contain a DRM signature. If protected audio is passed from one module to another, the driver must use the DRM APIs in the kernel to inform the DRM system of the transfer of protected data.

Acronyms

DRM - Digital Rights Management

SPDIF - Sony/Philips Digital Interface

SAP - Secure Audio Path

WHQL - Windows Hardware Quality Lab

Microsoft, DirectX, MS-DOS, Win32, Win64, Windows, and Windows NT are registered trademarks of Microsoft Corporation. Other product and company names mentioned herein may be the trademarks of their respective owners.
