Two transports, one job: move live video and audio from a sender to its receivers with as little delay as possible.
A sender streams over shared memory and UDP simultaneously; a receiver picks SHM when it is on the same machine and tells the sender via a flag, so no UDP is wasted on it.
SLM (SLAY Media) is the wire protocol underneath every SLAY tool. It carries video (raw NV12, or hardware-encoded H.264/H.265) and 48 kHz stereo f32 audio from one sender to one or more receivers, over two transports that run at the same time: shared memory and UDP.
Everything in the design bends toward one goal: low latency. In our tests that meant about 1 ms over shared memory and about 3 ms to a laptop over WiFi with H.264 (around 20 ms for uncompressed NV12 on a gigabit LAN). The handshake is a connect-time cost that the steady-state media path never pays again.
A sender always offers both transports. A receiver signals which one it is using with the
SLM_HELLO_FLAG_SHM bit in its HELLO packet; the sender then skips UDP
for that receiver. The shared-memory layout — header, slot ring, frame headers, payloads
— is byte-for-byte identical on every platform; only the OS primitives that create and
signal the region differ.
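
The transport choice described above can be sketched in a few lines. This is a minimal illustration, not the reference implementation: the bit value given to SLM_HELLO_FLAG_SHM, the `Receiver` record, and the `udp_targets` helper are all assumptions made for the example. The real flag position is defined in PROTOCOL.md.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: placeholder bit value; the real one is defined in PROTOCOL.md.
SLM_HELLO_FLAG_SHM = 0x01


@dataclass
class Receiver:
    addr: Tuple[str, int]  # (ip, port) the HELLO came from
    hello_flags: int       # flags copied from the receiver's latest HELLO


def udp_targets(receivers: List[Receiver]) -> List[Receiver]:
    """Return only the receivers that still need media over UDP."""
    return [r for r in receivers if not (r.hello_flags & SLM_HELLO_FLAG_SHM)]


receivers = [
    Receiver(("127.0.0.1", 50000), SLM_HELLO_FLAG_SHM),  # same machine, reads SHM
    Receiver(("192.168.1.20", 50000), 0),                # remote, needs UDP
]

for r in udp_targets(receivers):
    print("send UDP to", r.addr)
```

Because the flag arrives in every HELLO, a sender built this way would need no separate negotiation step: it simply filters its receiver list before each send.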
A UDP session moves through four phases: the sender advertises itself, the receiver proves it owns its address (the v4 return-routability check that defeats spoofed-source reflection attacks), the sender streams, and finally tears down. Validation is a one-time, connect-time cost — once an address is validated the media path adds no further round trips.
Amber arrows are sender → receiver; red arrows are receiver → sender. A spoofer that forges a victim's address never receives the CHALLENGE, so it can never echo the nonce and the flood never starts.
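
The return-routability check can be pictured as a small state machine on the sender. The sketch below is an assumption-laden illustration: the 8-byte nonce width, the `AddressValidator` class, and the returned action names are invented for the example and are not taken from PROTOCOL.md.

```python
import hmac
import secrets

NONCE_BYTES = 8  # assumption: the real nonce width is defined in PROTOCOL.md


class AddressValidator:
    """Tracks which receiver addresses have echoed their CHALLENGE nonce."""

    def __init__(self):
        self.pending = {}       # addr -> nonce sent in the last CHALLENGE
        self.validated = set()  # addrs that echoed the correct nonce

    def on_hello(self, addr, echoed_nonce=None):
        """Return ("stream", None) or ("challenge", nonce) for one HELLO."""
        if addr in self.validated:
            return ("stream", None)  # steady state: no extra round trip
        expected = self.pending.get(addr)
        if (expected is not None and echoed_nonce is not None
                and hmac.compare_digest(expected, echoed_nonce)):
            del self.pending[addr]
            self.validated.add(addr)
            return ("stream", None)
        # Unknown address or wrong nonce: send a fresh CHALLENGE to addr.
        nonce = secrets.token_bytes(NONCE_BYTES)
        self.pending[addr] = nonce
        return ("challenge", nonce)


v = AddressValidator()
real = ("192.168.1.20", 50000)
first_action, nonce = v.on_hello(real)      # first HELLO is challenged
second_action, _ = v.on_hello(real, nonce)  # the echo validates the address

victim = ("203.0.113.9", 50000)
spoof_action, _ = v.on_hello(victim)        # CHALLENGE goes to the victim only
print(first_action, second_action, spoof_action)
```

A forged HELLO costs the victim one small CHALLENGE datagram at most. Media is sent only to addresses in `validated`, which a spoofer cannot enter without seeing the nonce.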
Every UDP datagram starts with the 12-byte common header described below; its type
byte selects one of the packet types in this table. Multi-byte integers are little-endian.
| Type | Name | Direction | Purpose |
|---|---|---|---|
| 0x01 | ANNOUNCE | sender → multicast | Advertise the sender for discovery |
| 0x02 | HELLO | receiver → sender | Register / keepalive; carries the echo nonce |
| 0x03 | CAPS | sender → receiver | Stream format: size, fps, pixel format, codec |
| 0x04 | CHALLENGE | sender → receiver | Random nonce for address validation |
| 0x10 | VIDEO | sender → receivers | One chunk of a video frame |
| 0x11 | AUDIO | sender → receivers | Interleaved f32le PCM, 48 kHz stereo |
| 0x12 | FEC | sender → receivers | XOR parity for the video path (lossy links) |
| 0x20 | KEEPALIVE | receiver → sender | Liveness; structurally identical to HELLO |
| 0x30 | TALLY | receiver → sender | On-air status (off / preview / program) |
| 0x31 | MSG | receiver → sender | Short free-text operator message |
| 0x32 | STAT | receiver → sender | Echoes a send timestamp for latency measurement |
| 0xFF | BYE | sender → receivers | Clean shutdown; flush buffers |
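
The type values in the table map naturally onto an enumeration. The sketch below copies the values from the table; the `PacketType` and `classify` names and the direction strings are hypothetical, and the code takes the type byte as an argument because its offset within the header is specified in PROTOCOL.md, not here.

```python
from enum import IntEnum


class PacketType(IntEnum):
    ANNOUNCE = 0x01
    HELLO = 0x02
    CAPS = 0x03
    CHALLENGE = 0x04
    VIDEO = 0x10
    AUDIO = 0x11
    FEC = 0x12
    KEEPALIVE = 0x20
    TALLY = 0x30
    MSG = 0x31
    STAT = 0x32
    BYE = 0xFF


# Types a receiver sends to the sender; all others originate at the sender.
FROM_RECEIVER = {
    PacketType.HELLO,
    PacketType.KEEPALIVE,
    PacketType.TALLY,
    PacketType.MSG,
    PacketType.STAT,
}


def classify(type_byte):
    """Return (PacketType, direction), or None for an unknown type byte."""
    try:
        ptype = PacketType(type_byte)
    except ValueError:
        return None
    direction = "from_receiver" if ptype in FROM_RECEIVER else "from_sender"
    return ptype, direction


print(classify(0x10))
print(classify(0x30))
print(classify(0x7F))
```

Returning `None` for unknown values lets a caller drop unrecognized packets quietly, which is one reasonable policy but not one the table itself prescribes.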
Twelve bytes prefix every UDP datagram. The random stream_id lets a
receiver notice a sender restart, and the version byte gates the whole
data path: a receiver silently ignores any packet whose version does not match its own.
The common header occupies byte offsets 0..11. The media payloads add their own headers after it: a VIDEO chunk adds 36 bytes (followed by up to 1400 bytes of frame data) and AUDIO adds 24 bytes.
A receiver groups VIDEO chunks by frame_seq and drops incomplete frames after one frame time.
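
A short calculation shows how these header sizes relate to a 1500-byte MTU and how many chunks one raw frame needs. The header sizes and the 1400-byte chunk limit come from the text; the IPv4 and UDP header sizes are the standard 20 and 8 bytes (IPv4 without options), and the 1920×1080 frame is just an example.

```python
import math

COMMON_HEADER = 12          # bytes, prefixes every datagram
VIDEO_HEADER = 36           # bytes, added by a VIDEO chunk
SLM_MAX_UDP_PAYLOAD = 1400  # bytes of frame data per chunk
IPV4_HEADER = 20            # standard IPv4 header without options
UDP_HEADER = 8              # standard UDP header
MTU = 1500

# Largest VIDEO datagram as handed to the UDP socket.
largest_video_datagram = COMMON_HEADER + VIDEO_HEADER + SLM_MAX_UDP_PAYLOAD

# The same datagram on the wire, including IPv4 and UDP headers.
on_wire = largest_video_datagram + IPV4_HEADER + UDP_HEADER

# Example: one uncompressed 1920x1080 NV12 frame (12 bits per pixel).
frame_bytes = 1920 * 1080 * 3 // 2
chunks = math.ceil(frame_bytes / SLM_MAX_UDP_PAYLOAD)
header_overhead = chunks * (COMMON_HEADER + VIDEO_HEADER)

print(largest_video_datagram)  # 1448
print(on_wire, on_wire <= MTU)  # 1476 True
print(frame_bytes, chunks)      # 3110400 2222
print(header_overhead)          # 106656
```

The 24 bytes of headroom below the MTU is why a full chunk goes out without IP fragmentation on a standard Ethernet link.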
The codec field in CAPS selects how a receiver reads VIDEO payloads. A
receiver that does not support the offered codec disconnects cleanly rather than render garbage.
| Codec | VIDEO payload |
|---|---|
| 0 RAW | NV12 (per pixel_fmt), optionally LZ4-compressed per frame |
| 1 H.264 | H.264 Annex B access units, no container |
| 2 H.265 | H.265 Annex B access units, no container |
| Pixel format | Description |
|---|---|
| 1 NV12 | Y plane, then interleaved UV. The only format the current sender produces for RAW. |
| 2 I420 | Planar YUV 4:2:0 |
| 3 BGRA | 32-bit packed |
| 4 YUY2 | Packed 4:2:2 |
Color matrix is BT.709, limited range for YUV. For H.264/H.265 the payload is an encoded bitstream and pixel_fmt is ignored.
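
For RAW streams, the pixel format determines the uncompressed size of each frame. The sketch below uses the pixel_fmt values from the table and the standard bits-per-pixel of each layout; the `PixelFmt` and `raw_frame_bytes` names are hypothetical, and the function assumes even dimensions, as 4:2:0 and 4:2:2 subsampling require.

```python
from enum import IntEnum


class PixelFmt(IntEnum):
    NV12 = 1  # 4:2:0, Y plane then interleaved UV, 12 bits per pixel
    I420 = 2  # 4:2:0, three planes, 12 bits per pixel
    BGRA = 3  # packed, 32 bits per pixel
    YUY2 = 4  # packed 4:2:2, 16 bits per pixel


def raw_frame_bytes(fmt, width, height):
    """Uncompressed frame size in bytes (before any optional LZ4)."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even dimensions")
    if fmt in (PixelFmt.NV12, PixelFmt.I420):
        return width * height * 3 // 2
    if fmt == PixelFmt.BGRA:
        return width * height * 4
    if fmt == PixelFmt.YUY2:
        return width * height * 2
    raise ValueError(f"unknown pixel format: {fmt}")


for fmt in PixelFmt:
    print(fmt.name, raw_frame_bytes(fmt, 1920, 1080))
```

A receiver could compare this figure against the reassembled payload length as a sanity check for uncompressed RAW frames; whether the reference implementations do so is not stated here.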
| Constant | Value | Description |
|---|---|---|
| SLM_MCAST_GROUP | 239.255.77.43 | Multicast group for ANNOUNCE |
| SLM_MCAST_PORT | 9878 | Multicast port |
| SLM_MAX_UDP_PAYLOAD | 1400 B | Max chunk payload; fits a 1500-byte MTU |
| SLM_PROTOCOL_VERSION | 4 | UDP protocol version |
| SLM_SHM_MAGIC | 0x534C4D02 | SHM region magic (distinct from UDP "SLM\x01") |
| SLM_SHM_VERSION | 2 | SHM header version |
| SLM_SHM_SLOTS | 16 | Ring buffer slot count |
| SLM_SHM_STALE_SECS | 5 s | Heartbeat age that marks a dead producer |
| Audio sample rate | 48000 Hz | Fixed for all streams |
| Audio channels | 2 | Stereo, interleaved |
| Receiver expiry | 10 s | HELLO / KEEPALIVE timeout |
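
Two of these constants describe timeouts, and one is a magic number that spells out ASCII text. The sketch below shows both. RECEIVER_EXPIRY_SECS is a hypothetical name (the table gives only the value), the helper functions are illustrative, and treating an age strictly greater than the limit as expired is an assumption. The byte order in which the magic is stored in the SHM region is not assumed; the code only shows that the integer's big-endian bytes read "SLM\x02".

```python
import struct

SLM_SHM_MAGIC = 0x534C4D02
SLM_SHM_STALE_SECS = 5
RECEIVER_EXPIRY_SECS = 10  # hypothetical name for the "Receiver expiry" row

# Most significant byte first, the magic reads as ASCII "SLM" plus 0x02.
magic_bytes = struct.pack(">I", SLM_SHM_MAGIC)


def producer_is_stale(last_heartbeat, now):
    """True when the SHM producer's heartbeat is older than the stale limit."""
    return now - last_heartbeat > SLM_SHM_STALE_SECS


def receiver_expired(last_seen, now):
    """True when no HELLO or KEEPALIVE arrived within the expiry window."""
    return now - last_seen > RECEIVER_EXPIRY_SECS


print(magic_bytes)                    # b'SLM\x02'
print(producer_is_stale(100.0, 104.0))  # False
print(producer_is_stale(100.0, 106.0))  # True
print(receiver_expired(100.0, 111.0))   # True
```

Both helpers take `now` as a parameter so they can be driven by a monotonic clock such as `time.monotonic()` and tested without waiting.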
SLM is open, with reference implementations in C and Rust. The full byte-level specification — every field offset, the SHM region layout, and the version history — lives in PROTOCOL.md. Write your own sender or receiver, or drop the OBS plugin into OBS Studio.