Win32 Audio Background Info
From time to time people ask about the reasons PortAudio is implemented the way it is on Windows. This page doesn’t try to answer that question, but it does outline some of the non-obvious issues and provides links to more information. Please contribute if you know something that isn’t here.
Note that the titles of articles are used as the link text, so if there’s a broken link you can try googling for the text in the link name.
Note that when reading Microsoft documentation, it is sometimes unclear which version(s) of Windows the documentation refers to. Often the information applies only to specific versions. For example, information about KMixer is not always applicable to Windows Vista and later.
KMixer and Windows Audio Architecture background
There is a nice discussion of the changes to the Windows audio architecture under Vista in Creative Labs description of Vista audio architecture. This is a good place to start, as it also reviews the architecture under Windows XP. Although PortAudio needs to run on more than just the latest versions of Windows, Microsoft’s Audio Device Technologies for Windows page has lots of information about current-generation systems. See also the white paper Microsoft Device Driver Interface for HD Audio.
For a user-level view of WDM and kernel mixer issues, check out the Windows Driver Model section of the PC Notes column of Sound on Sound magazine.
> KMixer introduces a latency of 30 milliseconds into an audio stream. This is usually sufficient to absorb jitter resulting from competition for CPU time with ISRs (interrupt service routines) and other high-priority operations.
Windows Driver Kit > Device and Driver Technologies > Design Guide > WDM Audio Support in Different Versions of Windows > Factors Governing Wave-Output Latency > KMixer Latency
Microsoft’s overview of WAVE_FORMAT_EXTENSIBLE, Multiple Channel Audio Data and WAVE Files, is especially relevant for multichannel and high-bit-depth support. See also The evolution of a data structure – the WAVEFORMAT.
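As a rough illustration of why the extensible format matters for multichannel and high-bit-depth audio, here is a minimal C sketch of how the size-related fields relate to each other. The field names follow WAVEFORMATEX and WAVEFORMATEXTENSIBLE, but `ExampleWaveFormat` and `example_make_format` are hypothetical stand-ins so the code compiles without the Windows headers, and the real structure also carries a channel mask and a sub-format GUID that are left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Format tag value used by WAVE_FORMAT_EXTENSIBLE. */
#define EXAMPLE_WAVE_FORMAT_EXTENSIBLE 0xFFFEu

/* Portable stand-in for the size-related fields only (hypothetical type,
   not the Windows header definition). */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;       /* container size of one sample, in bits */
    uint16_t cbSize;               /* extra bytes after the base structure */
    uint16_t wValidBitsPerSample;  /* bits of precision actually used */
} ExampleWaveFormat;

/* Fill in the derived fields for a PCM-style layout. */
ExampleWaveFormat example_make_format(uint16_t channels,
                                      uint32_t sample_rate_hz,
                                      uint16_t container_bits,
                                      uint16_t valid_bits)
{
    ExampleWaveFormat f;

    assert(container_bits % 8 == 0);
    assert(valid_bits <= container_bits);

    f.wFormatTag = EXAMPLE_WAVE_FORMAT_EXTENSIBLE;
    f.nChannels = channels;
    f.nSamplesPerSec = sample_rate_hz;
    f.wBitsPerSample = container_bits;
    f.wValidBitsPerSample = valid_bits;
    /* One frame holds one sample per channel. */
    f.nBlockAlign = (uint16_t)(channels * (container_bits / 8));
    f.nAvgBytesPerSec = sample_rate_hz * f.nBlockAlign;
    /* The extensible part adds 22 bytes to the base WAVEFORMATEX. */
    f.cbSize = 22;
    return f;
}
```

For example, six channels of packed 24-bit audio at 48 kHz give a block alignment of 18 bytes, while the same 24 valid bits stored in 32-bit containers give 24 bytes per frame.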
This page states that each KMixer buffer holds 10 ms of data:
> each buffer contains 10 milliseconds of data, the buffer size per IRP is calculated to be approximately 882 bytes:
> (4 bytes)(22.05 kHz)(10 milliseconds) = 882 bytes
> (The size cannot be exactly 882 bytes because the buffer contains an integral number of four-byte audio frames.)
It’s not clear if KMixer rounds up or down.
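To make the rounding question concrete, here is a small C sketch of the arithmetic in the quote. The function names are made up for illustration and this is not KMixer's actual code; it only computes the two frame-aligned sizes on either side of the nominal 882 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Buffer size in bytes when the duration is rounded DOWN to whole frames. */
uint32_t example_buffer_bytes_round_down(uint32_t sample_rate_hz,
                                         uint32_t duration_ms,
                                         uint32_t frame_bytes)
{
    uint64_t frames;

    assert(frame_bytes > 0);
    frames = ((uint64_t)sample_rate_hz * duration_ms) / 1000u;
    return (uint32_t)(frames * frame_bytes);
}

/* Buffer size in bytes when the duration is rounded UP to whole frames. */
uint32_t example_buffer_bytes_round_up(uint32_t sample_rate_hz,
                                       uint32_t duration_ms,
                                       uint32_t frame_bytes)
{
    uint64_t frames;

    assert(frame_bytes > 0);
    frames = ((uint64_t)sample_rate_hz * duration_ms + 999u) / 1000u;
    return (uint32_t)(frames * frame_bytes);
}
```

For 22.05 kHz, 10 ms and 4-byte frames, rounding down gives 880 bytes (220 frames) and rounding up gives 884 bytes (221 frames). At 44.1 kHz the question doesn't arise: both functions give 1764 bytes (441 frames).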
How KMixer Handles Set-Format Requests
Bypassing KMixer on Windows XP using DirectSound hardware pin
> DirectSound streams that feed into hardware mixer pins bypass KMixer and avoid the latency of software mixing in KMixer.
Overview of DirectSound Hardware Acceleration
It is unclear whether PortAudio correctly engages hardware mixing where available.
It is unclear whether this feature is available on later versions of Windows, or how common it is on Windows XP.
Mixing latency, Vista and Later
As discussed in the first comment on Audio in Windows Vista (2), Vista introduces a new audio architecture that doesn’t use KMixer. This implies that system-level mixer latency is lower on Vista. It has reportedly been improved again in Windows 7:
> In Windows 7 share mode streams run in low-latency mode. The audio engine runs in pull mode with a significant reduction in latency. This is very useful for communication applications that require low audio stream latency for faster streaming.
What’s New for Core Audio APIs in Windows 7
Usually you access audio via high-level APIs such as DirectSound, WMME or WASAPI. WDM/KS refers to directly accessing kernel audio drivers from user space. WDM is the low-level driver model used by all Windows drivers. PortAudio provides a WDM/KS implementation. WDM audio drivers come in three flavours: WaveCyclic, WavePci and WaveRT. WaveRT was introduced in Vista and isn’t available in older operating systems.
WDM is the driver model for Windows audio device drivers in current versions of Windows. APIs like WMME, DirectSound and WASAPI are implemented on top of WDM/KS (Kernel Streaming), but you can also talk to WDM directly from user space. See WDM Audio Architecture: Basic Concepts.
WDM WaveCyclic/WavePci “kernel streaming” source code example from Microsoft: DirectKS Sample Application
The WDM WaveRT real-time streaming driver model is described in A Wave Port Driver for Real-Time Audio Streaming, which also has some information about the older WaveCyclic and WavePci models.
WDM APIs optionally used in PA/WMME and PA/DirectSound
PortAudio can optionally use some WDM facilities to query the driver channel count in the DirectSound and WMME implementations. This works around the problem that the Kernel Mixer will usually return ((short)-1) as the number of supported channels rather than what the driver reports. To enable this feature, make sure PAWIN_USE_WDMKS_DEVICE_INFO is defined in your Makefile or on the compiler command line (for example with a -D option for GCC or a /D option for MSVC).
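The idea can be sketched in C roughly as follows. This is not PortAudio's actual code: `example_resolve_channel_count`, its parameters and the fallback policy are all hypothetical, and the sketch assumes the ((short)-1) value shows up as 0xFFFF in a 16-bit unsigned channel field.

```c
#include <assert.h>
#include <stdint.h>

/* ((short)-1) as it appears in a 16-bit unsigned channel-count field. */
#define EXAMPLE_CHANNEL_COUNT_UNKNOWN 0xFFFFu

/* Pick a usable channel count (hypothetical helper).
   reported_channels: count returned by the high-level API.
   wdmks_channels:    count queried from the driver via WDM/KS,
                      or 0 if that query is unavailable.
   fallback_channels: value to assume when nothing better is known. */
int example_resolve_channel_count(uint16_t reported_channels,
                                  int wdmks_channels,
                                  int fallback_channels)
{
    assert(fallback_channels > 0);

    if (reported_channels != EXAMPLE_CHANNEL_COUNT_UNKNOWN)
        return (int)reported_channels;  /* the API gave a real answer */
    if (wdmks_channels > 0)
        return wdmks_channels;          /* trust what the driver reports */
    return fallback_channels;           /* no information at all */
}
```

With a reported count of ((short)-1) and a driver that reports 6 channels, the helper returns 6; without the WDM/KS query it can only fall back to a guess.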
Our WASAPI implementation works fine now, but it does need some workarounds for Windows bugs such as:
Exclusive mode event-driven render across WOW64 is broken in Vista RTM. All event-driven capture is broken in Vista RTM.