This SDL2 / SDL2_ttf program records mono audio from any capture device, streams it to a WAV file, visualises the live waveform, and runs a lightweight autocorrelation‑based tuner that prints the detected pitch (Hz) and the matching musical note.
Defaults: capture at 48000 Hz, save to capture.wav, and process 2048 samples at a time, writing each block to disk.
#define WIN_W 1024 /* window width */
#define WIN_H 400 /* window height */
#define BUF_SAMPLES 8192 /* ring‑buffer size (power‑of‑two) */
#define CHUNK 2048 /* processing hop size */
#define WAV_NAME "capture.wav"
wr (writer) / rd (reader) wrap modulo BUF_SAMPLES.
CHUNK must remain > max lag (~960 samples) for reliable autocorrelation.

main() step‑by‑step
SDL_Init(SDL_INIT_AUDIO | SDL_INIT_VIDEO) and TTF_Init() are mandatory before any I/O or rendering.
The capture device index is read with scanf; it defaults to 0.
SDL_AudioSpec want = {0};
want.freq = 48000; /* sample‑rate */
want.format = AUDIO_F32SYS; /* 32‑bit float */
want.channels = 1; /* mono */
want.samples = CHUNK; /* callback buffer size */
want.callback = capture_cb; /* recording handler */
The returned have spec describes the exact format SDL could provide; the code uses it later for the WAV header values.
An SDL_Window → SDL_Renderer pair is set up.
The font path tries Apple SF first and falls back to DejaVu.
Because the data chunk size is unknown upfront, the code back‑fills RIFF chunk size and data chunk size after recording stops (see Section 6).
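A header with placeholder sizes could be built along these lines. This is a minimal sketch, not the program's actual code: it assumes the canonical 44‑byte layout with a 16‑byte fmt chunk (which is what 36 + bytes_total implies) and format tag 3 for 32‑bit IEEE float samples. The names wav_float_header, put_u16le and put_u32le are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { WAV_HEADER_BYTES = 44, WAV_FORMAT_IEEE_FLOAT = 3 };

/* WAV fields are little-endian regardless of host byte order. */
static void put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_u32le(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)((v >> (8 * i)) & 0xFF);
}

/* Fills a 44-byte header for 32-bit float samples.
   Both size fields are left at 0 and must be patched once recording stops. */
void wav_float_header(uint8_t out[WAV_HEADER_BYTES],
                      uint32_t sample_rate, uint16_t channels)
{
    const uint16_t bits = 32;
    const uint16_t block_align = (uint16_t)(channels * (bits / 8));

    assert(sample_rate > 0 && channels > 0);

    memcpy(out + 0, "RIFF", 4);
    put_u32le(out + 4, 0);                          /* RIFF chunk size (placeholder) */
    memcpy(out + 8, "WAVE", 4);
    memcpy(out + 12, "fmt ", 4);
    put_u32le(out + 16, 16);                        /* fmt chunk length */
    put_u16le(out + 20, WAV_FORMAT_IEEE_FLOAT);
    put_u16le(out + 22, channels);
    put_u32le(out + 24, sample_rate);
    put_u32le(out + 28, sample_rate * block_align); /* byte rate */
    put_u16le(out + 32, block_align);
    put_u16le(out + 34, bits);
    memcpy(out + 36, "data", 4);
    put_u32le(out + 40, 0);                         /* data chunk size (placeholder) */
}
```

With this layout the data chunk size sits at byte offset 40, so that is the position a writer would remember for the later back‑fill.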
The main loop handles SDL_QUIT events.
Whenever ring contains ≥ CHUNK samples, they are copied into local[] and the floats are written to the WAV file.
SDL_Delay(1) yields time to the OS.
capture_cb(void *udata, Uint8 *stream, int len) is invoked by SDL on a real‑time thread.
stream → raw audio bytes (here float*).
len → byte count (÷ 4 for float samples).
Each sample is stored at ring[wr], incrementing wr (which wraps automatically via the power‑of‑two mask).
⚠ Thread‑safety: only wr is modified inside the callback; the main thread reads it without locks, relying on the atomicity of aligned 32‑bit writes. That holds on common desktop CPUs but is not guaranteed by the C standard.
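The single‑producer, single‑consumer pattern described here can be sketched without SDL. The version below is an illustration rather than the program's code: it swaps the plain 32‑bit write for a C11 atomic so that the hand‑off is well defined, and the function names (ring_push_bytes, ring_available, ring_pop_chunk) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define BUF_SAMPLES 8192
#define CHUNK       2048
#define RING_MASK   (BUF_SAMPLES - 1)

static_assert((BUF_SAMPLES & (BUF_SAMPLES - 1)) == 0,
              "BUF_SAMPLES must be a power of two for the mask to work");

static float ring[BUF_SAMPLES];
static atomic_uint wr;   /* write index, advanced only by the producer */
static unsigned rd;      /* read index, touched only by the consumer */

/* Producer side: roughly what a capture callback body could do.
   stream holds raw bytes, len is a byte count. There is no overrun
   check, so a slow consumer loses old samples. */
void ring_push_bytes(const unsigned char *stream, int len)
{
    unsigned w = atomic_load_explicit(&wr, memory_order_relaxed);
    int count = len / (int)sizeof(float);

    for (int i = 0; i < count; i++) {
        float s;
        memcpy(&s, stream + (size_t)i * sizeof s, sizeof s);
        ring[w] = s;
        w = (w + 1) & RING_MASK;   /* wrap via mask */
    }
    /* Publish the samples before the new index becomes visible. */
    atomic_store_explicit(&wr, w, memory_order_release);
}

/* Consumer side: number of unread samples (at most BUF_SAMPLES - 1). */
unsigned ring_available(void)
{
    unsigned w = atomic_load_explicit(&wr, memory_order_acquire);
    return (w - rd) & RING_MASK;
}

/* Copies one CHUNK into local[] and returns 1, or returns 0 if
   fewer than CHUNK samples are waiting. */
int ring_pop_chunk(float *local)
{
    if (ring_available() < CHUNK)
        return 0;
    for (int i = 0; i < CHUNK; i++)
        local[i] = ring[(rd + (unsigned)i) & RING_MASK];
    rd = (rd + CHUNK) & RING_MASK;
    return 1;
}
```

Because both indices are masked, a completely full buffer is indistinguishable from an empty one; keeping the consumer well ahead of BUF_SAMPLES avoids that case.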
float detect_pitch(const float *buf, int n, int sr)
buf – pointer to the CHUNK most recent samples.
n – number of samples analysed.
sr – current sample‑rate.
For each candidate lag between minLag (the period of 1000 Hz, i.e. sr / 1000) and maxLag (the period of 50 Hz, i.e. sr / 50), it computes a straight dot‑product of the signal with a copy of itself shifted by that lag, retaining the lag with the largest sum.
Returned pitch = sr / bestLag, or 0 if silence.
In practice this O(n·lag) scan takes < 2 ms for roughly 2048×900 floating‑point multiply‑adds on modern CPUs.
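The lag scan can be sketched as follows. This is one plausible reading of the description, not the original source: the silence gate (mean energy below 1e‑6) is an assumed threshold, and the clamping of maxLag to n − 1 is added for safety.

```c
#include <assert.h>
#include <stddef.h>

/* Brute-force autocorrelation pitch detector.
   Returns the estimated pitch in Hz, or 0 for silence. */
float detect_pitch(const float *buf, int n, int sr)
{
    assert(buf != NULL && n > 0 && sr > 0);

    int minLag = sr / 1000;   /* shortest period considered: 1000 Hz */
    int maxLag = sr / 50;     /* longest period considered: 50 Hz */
    if (minLag < 1) minLag = 1;
    if (maxLag > n - 1) maxLag = n - 1;

    /* Silence gate: the threshold is an assumption for this sketch. */
    float energy = 0.0f;
    for (int i = 0; i < n; i++)
        energy += buf[i] * buf[i];
    if (energy / (float)n < 1e-6f)
        return 0.0f;

    int bestLag = 0;
    float best = 0.0f;
    for (int lag = minLag; lag <= maxLag; lag++) {
        /* Dot product of the signal with itself shifted by lag. */
        float sum = 0.0f;
        for (int i = 0; i + lag < n; i++)
            sum += buf[i] * buf[i + lag];
        if (sum > best) {
            best = sum;
            bestLag = lag;
        }
    }
    return bestLag > 0 ? (float)sr / (float)bestLag : 0.0f;
}
```

Since the sum is not normalised, longer lags overlap fewer samples and score slightly lower, which makes the scan favour the fundamental period over its multiples. For a square wave with a 100‑sample period at sr = 48000, the best lag is 100 and the result is 480 Hz.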
const char *freq_to_note(float f, char *out, size_t n)
f – detected frequency in Hz.
out – pre‑allocated output buffer.
n – buffer capacity.
Uses the well‑known formula midi = 69 + 12·log2(f / 440 Hz), then maps midi % 12 to the name array {C, C#, … B} and sets octave = (midi/12 – 1).
After the loop exits, the code seeks back to two offsets:
RIFF chunk size field (byte offset 4) → 36 + bytes_total
data_size_pos → bytes_total
This finalises the file so any DAW/player can open it.
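The back‑fill step might look like the sketch below. It assumes a 44‑byte header, so the RIFF size lives at byte offset 4 and equals 36 + bytes_total; the helper names patch_u32le and wav_finalize are hypothetical, and data_size_pos is passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Writes v as four little-endian bytes at absolute offset pos.
   Returns 0 on success, -1 on failure. */
static int patch_u32le(FILE *fp, long pos, uint32_t v)
{
    uint8_t b[4];
    for (int i = 0; i < 4; i++)
        b[i] = (uint8_t)((v >> (8 * i)) & 0xFF);

    if (fseek(fp, pos, SEEK_SET) != 0)
        return -1;
    return fwrite(b, 1, sizeof b, fp) == sizeof b ? 0 : -1;
}

/* Back-fills both size fields once the total payload size is known. */
int wav_finalize(FILE *fp, long data_size_pos, uint32_t bytes_total)
{
    assert(fp != NULL && data_size_pos >= 0);

    if (patch_u32le(fp, 4, 36u + bytes_total) != 0)      /* RIFF chunk size */
        return -1;
    if (patch_u32le(fp, data_size_pos, bytes_total) != 0) /* data chunk size */
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```

A caller would typically invoke wav_finalize(fp, 40, bytes_total) just before fclose(fp), where 40 is the data size offset in the 44‑byte layout.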
Finally, all SDL objects and the audio device are released, and the TTF subsystem is shut down.