Digital audio represents continuous sound‑pressure variations as a discrete sequence of samples that can be processed, stored, copied, and transmitted by computers and microprocessors. Two parameters characterise linear PCM (Pulse‑Code Modulation): the sample rate (samples per second per channel) and the bit depth (bits per sample).
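// 48 kHz, 24‑bit signed‑integer little‑endian stereo (2 channels)
#include <stdint.h>

struct Sample24 {
    int32_t L : 24;  // left channel, signed 24‑bit bit‑field
    int32_t R : 24;  // right channel, signed 24‑bit bit‑field
};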
Note ▸ In memory, most compilers pad each field of the struct above to 32 bits; external media files use tightly packed 24‑bit (3‑byte) samples.
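The gap between a padded in‑memory sample and a packed 3‑byte sample on disk can be illustrated with a small sketch. The helper names `pack_s24le` and `unpack_s24le` are hypothetical (they do not come from any particular library), and the code assumes the input already fits in the signed 24‑bit range.

```c
#include <stdint.h>

// Hypothetical helper: write one signed 24-bit sample as 3 little-endian bytes.
static void pack_s24le(int32_t sample, uint8_t out[3])
{
    uint32_t u = (uint32_t)sample;          // two's-complement bit pattern
    out[0] = (uint8_t)(u & 0xFFu);          // least-significant byte first
    out[1] = (uint8_t)((u >> 8) & 0xFFu);
    out[2] = (uint8_t)((u >> 16) & 0xFFu);  // top 8 bits of the 32-bit value are dropped
}

// Hypothetical helper: read 3 little-endian bytes and sign-extend to 32 bits.
static int32_t unpack_s24le(const uint8_t in[3])
{
    uint32_t u = (uint32_t)in[0]
               | ((uint32_t)in[1] << 8)
               | ((uint32_t)in[2] << 16);
    // Flip bit 23, then subtract its weight: a portable sign extension.
    return (int32_t)(u ^ 0x800000u) - 0x800000;
}
```

Packing and then unpacking any value between -8388608 and 8388607 should return the original value; anything outside that range is silently truncated by `pack_s24le`.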
Microphone voltage is amplified, anti‑aliased by a low‑pass filter, and then sampled by an ADC. The ADC’s time‑base must be phase‑locked to the system word‑clock.
Typical real‑time processing stages:
The processed bit‑stream is clocked into a DAC, reconstructed by a smoothing filter, and sent to line‑level outputs or power‑amps.
A band‑limited signal (maximum frequency fmax) is perfectly reconstructible when sampleRate > 2 × fmax. Half the sample rate, sampleRate / 2, is the Nyquist frequency.
If spectral content above Nyquist is not removed before sampling, it folds back as false low‑frequency components (aliases). Proper anti‑alias filtering is therefore mandatory.
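To make the fold‑back concrete, the following sketch computes where a pure tone would land if it were sampled with no anti‑alias filter at all. The function name `alias_frequency` is hypothetical, and the code assumes non‑negative frequencies and an ideal sampler.

```c
// Hypothetical helper: apparent frequency (Hz) of a tone at freq_hz after
// ideal sampling at sample_rate_hz. Assumes freq_hz >= 0 and sample_rate_hz > 0.
static double alias_frequency(double freq_hz, double sample_rate_hz)
{
    // Wrap into [0, fs): subtract whole multiples of the sample rate.
    double whole_cycles = (double)(long long)(freq_hz / sample_rate_hz);
    double folded = freq_hz - whole_cycles * sample_rate_hz;

    // Mirror the upper half (fs/2, fs) back below the Nyquist frequency.
    if (folded > sample_rate_hz / 2.0)
        folded = sample_rate_hz - folded;
    return folded;
}
```

For example, at a 48 kHz sample rate a 30 kHz tone folds to 18 kHz, while a 1 kHz tone is left where it is.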
Mapping a continuous amplitude to discrete levels introduces quantisation noise. Adding very low‑level noise (dither) before quantisation randomises the error, which decorrelates it from the signal and improves subjective linearity at low volumes.
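A minimal sketch of quantisation with optional dither follows. It assumes a 16‑bit target, input samples in the range [-1.0, 1.0], and triangular (TPDF) dither built from two calls to the C library's `rand()`. The names `uniform_half_lsb` and `quantise16` are hypothetical, and a production implementation would likely use a better random source.

```c
#include <stdint.h>
#include <stdlib.h>

// Uniform random value in [-0.5, 0.5) LSB, based on rand().
static double uniform_half_lsb(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0) - 0.5;
}

// Quantise x (expected in [-1.0, 1.0]) to signed 16-bit.
// With use_dither != 0, TPDF noise (sum of two uniform values, peak
// just under +/-1 LSB) is added before rounding.
static int16_t quantise16(double x, int use_dither)
{
    double scaled = x * 32767.0;
    if (use_dither)
        scaled += uniform_half_lsb() + uniform_half_lsb();

    // Round to nearest, halves away from zero.
    long q = (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);

    // Clamp to the 16-bit range in case dither pushed the value over.
    if (q > 32767) q = 32767;
    if (q < -32768) q = -32768;
    return (int16_t)q;
}
```

Without dither, an input smaller than half an LSB always rounds to zero and is lost; with dither, the output toggles between neighbouring levels so that its long‑term average tracks the input.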
Container | Typical Extension | Features |
---|---|---|
WAVE | .wav | RIFF‑based; little‑endian; ubiquitous on Windows |
AIFF | .aif/.aiff | Big‑endian; popular on macOS & Pro DAWs |
CAF | .caf | Large‑file support; metadata‑rich; Apple CoreAudio |
RF64 | .wav | >4 GB WAV via 64‑bit size chunks (EBU tech) |
Note ▸ Lossless ≠ uncompressed. FLAC typically achieves 30–60 % size reduction without altering the PCM stream.
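As an illustration of the WAVE row in the table above, the sketch below fills in the common 44‑byte header used for plain integer PCM. The function names are hypothetical. It assumes a single `fmt ` chunk followed directly by `data`, and it uses format tag 1 even for 24‑bit audio, which many tools accept although some writers prefer the extensible format for anything beyond 16‑bit stereo.

```c
#include <stdint.h>
#include <string.h>

// Store 16- and 32-bit values in little-endian byte order.
static void put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

static void put_u32le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)(v >> 24);
}

// Hypothetical helper: fill hdr with a 44-byte PCM WAVE header.
static void write_wav_header(uint8_t hdr[44], uint32_t sample_rate,
                             uint16_t channels, uint16_t bits_per_sample,
                             uint32_t data_bytes)
{
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));

    memcpy(hdr, "RIFF", 4);
    put_u32le(hdr + 4, 36u + data_bytes);            // file size minus 8
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_u32le(hdr + 16, 16u);                        // fmt chunk size for PCM
    put_u16le(hdr + 20, 1u);                         // format tag 1 = integer PCM
    put_u16le(hdr + 22, channels);
    put_u32le(hdr + 24, sample_rate);
    put_u32le(hdr + 28, sample_rate * block_align);  // bytes per second
    put_u16le(hdr + 32, block_align);                // bytes per frame
    put_u16le(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_u32le(hdr + 40, data_bytes);
}
```

Because every size field is 32 bits wide, this layout cannot describe more than 4 GB of data, which is the limitation RF64 addresses.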
All digital‑audio devices must agree on sample‑rate and phase. An unstable clock manifests as jitter, injecting noise and distortion.
Platform | Primary API | Notes |
---|---|---|
Linux | ALSA | PCM & MIDI; snd_pcm_readi() / snd_pcm_writei(). |
macOS / iOS | Core Audio | High‑level AVAudioEngine; low‑level AudioUnit. |
Windows | WASAPI / ASIO | WASAPI shared/exclusive; ASIO bypasses the system mixer for low latency. |
Cross‑platform | JACK, PortAudio | Callback‑driven real‑time graph; used by Ardour, Reaper‑Linux. |
Web | Web Audio API | Graph‑node processing inside browsers; AudioWorklet for DSP. |
Buffers are typically delivered via a real‑time audioCallback() that must finish before the next block is due; otherwise you’ll hear underruns (glitches).
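The shape of such a callback can be sketched as follows. The signature and the `GainState` type are hypothetical: each real API (ALSA, Core Audio, WASAPI, JACK, and so on) defines its own callback prototype and buffer layout. The sketch assumes interleaved 32‑bit float samples.

```c
#include <stddef.h>

// Hypothetical per-stream state passed to the callback.
typedef struct {
    float gain;  // linear gain applied to every sample
} GainState;

// Hypothetical callback: copy input to output with a gain applied.
// in/out hold frames * channels interleaved float samples.
static void audioCallback(const float *in, float *out,
                          size_t frames, size_t channels, void *user)
{
    const GainState *state = (const GainState *)user;
    size_t n = frames * channels;  // total interleaved sample count

    // Keep the loop bounded and free of locks, allocation, and I/O,
    // so the callback reliably finishes before the next block is due.
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] * state->gain;
}
```

The work per call is proportional to the block size and nothing inside the loop can block, which is the property a real‑time callback needs in order to avoid underruns.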
Total round‑trip latency ≈ ADC‑block + processing + DAC‑block. Typical buffer sizes:
Note ▸ Smaller buffers lower latency but raise CPU‑wake‑ups → more context switches and potential underruns on low‑power devices.
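The relationship between buffer size and latency can be worked out directly. The sketch below uses hypothetical function names and assumes the simple model stated above (one input block, processing time, one output block); real hardware adds converter and driver delays on top.

```c
// Duration of one buffer of `frames` sample frames, in milliseconds.
static double buffer_latency_ms(unsigned frames, unsigned sample_rate_hz)
{
    return 1000.0 * (double)frames / (double)sample_rate_hz;
}

// Rough round-trip estimate: input block + processing + output block.
static double round_trip_latency_ms(unsigned frames, unsigned sample_rate_hz,
                                    double processing_ms)
{
    return 2.0 * buffer_latency_ms(frames, sample_rate_hz) + processing_ms;
}
```

For instance, a 256‑frame buffer at 48 kHz lasts about 5.33 ms, so halving the buffer to 128 frames roughly halves the buffering part of the latency while doubling the callback rate.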
While audio streams carry raw PCM or encoded frames, control messages (MIDI, OSC, …) operate at far lower bandwidths. They handle note events, parameter automation, and device discovery.
// Send middle‑C (60) at max velocity
sendMIDI(0x90 /* NoteOn ch1 */, 60, 127);
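The three bytes passed in the call above form a MIDI 1.0 Note On message. A minimal sketch of how they are assembled is shown below; `midi_note_on` is a hypothetical helper and says nothing about how `sendMIDI` itself is implemented.

```c
#include <stdint.h>

// Hypothetical helper: build a 3-byte MIDI 1.0 Note On message.
// channel: 0-15 (0 means "channel 1"); note and velocity: 0-127.
static void midi_note_on(uint8_t channel, uint8_t note, uint8_t velocity,
                         uint8_t msg[3])
{
    msg[0] = (uint8_t)(0x90u | (channel & 0x0Fu));  // status: Note On + channel
    msg[1] = (uint8_t)(note & 0x7Fu);               // data bytes keep bit 7 clear
    msg[2] = (uint8_t)(velocity & 0x7Fu);
}
```

Calling `midi_note_on(0, 60, 127, msg)` yields the bytes 0x90, 60, 127, matching the example above. Note that a Note On with velocity 0 is conventionally treated as a Note Off.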
High‑resolution extensions (MIDI 2.0, MPE) provide per‑note expression for modern synths and DAWs.