Open Sonos Voice Interfaces for Local, Private Home Assistant Integration

The Issue

To: Sonos, Inc. leadership and product teams

We, a broad coalition of Home Assistant users and Sonos customers, ask Sonos to open local voice interfaces on Sonos speakers so they can act as voice satellites for Home Assistant Assist — with custom wake words and LAN-only audio/text handoff. This small change unlocks a major, privacy-preserving capability for a large existing user base and drives additional Sonos adoption.

Hard number, clear need: An estimated 81,045 active Home Assistant installations use the Sonos integration today (512,941 active installs × 15.8% ≈ 81,045). This community is ready to use local voice features as soon as basic local hooks are available.
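The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the two inputs are the figures quoted in this petition, the function name is ours, and the range assumes the 15.8% share was rounded to one decimal place.

```python
def estimate_installs(total_installs: int, share_percent: float) -> int:
    """Estimate how many installations use an integration, given its share in percent."""
    return round(total_installs * share_percent / 100)


ACTIVE_INSTALLS = 512_941  # active Home Assistant installs quoted in the petition
SONOS_SHARE = 15.8         # percent of installs using the Sonos integration

central = estimate_installs(ACTIVE_INSTALLS, SONOS_SHARE)
# Assumption: 15.8% is a rounded value, so the true share lies in [15.75, 15.85].
low = estimate_installs(ACTIVE_INSTALLS, 15.75)
high = estimate_installs(ACTIVE_INSTALLS, 15.85)

print(central, low, high)  # 81045 80788 81301
```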

What we want Sonos to change (concrete, minimal API)

Local wake-word events
On-device detection emits an immediate LAN event with metadata (device ID, timestamp, confidence).
Optional pre-roll buffer (≈500–1500 ms PCM) for natural speech starts.

In-home audio streaming to Assist
WebRTC (preferred, bidirectional) or local WebSocket/HTTP stream (PCM/Opus).
Include VAD/barge-in side-band signals (start/stop, energy).

Text alternative (on-device ASR → text to Assist)
Optional ASR on the speaker that sends plain text to the Assist intent pipeline.
Further reduces bandwidth and latency.

Feedback & UX hooks
Local API to drive LEDs/mic state, “listening/speaking” indicators, and playback ducking.
Barge-in support so Assist can pause/duck ongoing audio.

Local discovery & pairing
mDNS/SSDP announce (e.g., _sonos-voice._tcp.local).
Local OAuth/PKCE or signed token pairing. LAN-only; no cloud required.

Privacy by design
Fully opt-in; hardware mic-mute remains authoritative.
Clear policy: no audio leaves the LAN without explicit consent.
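
To make the wake-word event request concrete, here is a minimal sketch of how a LAN listener might decode such an event. No such Sonos API exists today, so everything in it is an assumption for illustration: the JSON field names (`device_id`, `timestamp`, `confidence`, `preroll_pcm_b64`), the 16 kHz 16-bit mono PCM format, and the sample device ID are all hypothetical.

```python
import base64
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical audio format for the pre-roll buffer; the petition only says "PCM".
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono


@dataclass(frozen=True)
class WakeWordEvent:
    device_id: str
    timestamp: float                 # seconds since the Unix epoch
    confidence: float                # detector score in [0.0, 1.0]
    preroll: Optional[bytes] = None  # raw PCM captured just before the wake word

    def preroll_ms(self) -> int:
        """Duration of the pre-roll buffer in milliseconds (0 if absent)."""
        if not self.preroll:
            return 0
        samples = len(self.preroll) // BYTES_PER_SAMPLE
        return samples * 1000 // SAMPLE_RATE_HZ


def parse_wake_event(payload: str) -> WakeWordEvent:
    """Decode and validate a JSON wake-word event (hypothetical schema)."""
    data = json.loads(payload)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    preroll_b64 = data.get("preroll_pcm_b64")
    preroll = base64.b64decode(preroll_b64) if preroll_b64 else None
    return WakeWordEvent(
        device_id=str(data["device_id"]),
        timestamp=float(data["timestamp"]),
        confidence=confidence,
        preroll=preroll,
    )


def example_payload() -> str:
    """Build a sample event carrying 500 ms of silent pre-roll."""
    silence = bytes(SAMPLE_RATE_HZ // 2 * BYTES_PER_SAMPLE)
    return json.dumps({
        "device_id": "RINCON_EXAMPLE",  # placeholder, not a real device
        "timestamp": 1760000000.0,
        "confidence": 0.93,
        "preroll_pcm_b64": base64.b64encode(silence).decode("ascii"),
    })


event = parse_wake_event(example_payload())
print(event.device_id, event.confidence, event.preroll_ms())
```

A receiver on the Home Assistant side could use the confidence value to ignore weak detections and prepend the pre-roll audio to the stream it hands to Assist.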

Why this matters
Immediate impact: 80k+ active Sonos+HA setups gain local wake words and private Assist handoff with near-zero cloud dependency.
Superior UX: On-device detection and local processing reduce latency and failure modes versus cloud round-trips.
Customer choice, not lock-in: Many users are moving away from Alexa/Google for privacy and reliability. This gives them a first-class local option while Sonos stays the audio specialist.
Business upside: Home Assistant users are high-intent buyers who commonly deploy multi-room audio. A clear “voice satellite” story sells more Era/One/Move/Beam/Arc units.
Low lift for Sonos: The community will deliver a reference Home Assistant add-on, documentation, and a test matrix across popular Sonos models once the hooks exist.

How it works (minimal reference flow)
Speaker detects wake word → emits LAN wake-word event (+ optional pre-roll).
Either stream audio locally (WebRTC/WS) or send recognized text.
Home Assistant Assist handles NLU/intents and returns TTS/audio to the same speaker.
LEDs/state reflect listening/speaking; media playback ducks and resumes.
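
The flow above can be read as a small state machine. The sketch below is one possible interpretation, not a Sonos or Home Assistant implementation: the state names, the event names (`wake_word`, `utterance_end`, `tts_start`, `tts_end`), and the rule that playback stays ducked whenever the session is not idle are all assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # normal playback, no voice interaction
    LISTENING = auto()   # wake word heard, audio or text is being captured
    PROCESSING = auto()  # Assist is resolving the intent
    SPEAKING = auto()    # TTS response is playing on the speaker


# Allowed transitions, keyed by (current state, event name). Names are hypothetical.
TRANSITIONS = {
    (State.IDLE, "wake_word"): State.LISTENING,
    (State.LISTENING, "utterance_end"): State.PROCESSING,
    (State.PROCESSING, "tts_start"): State.SPEAKING,
    (State.SPEAKING, "tts_end"): State.IDLE,
    (State.SPEAKING, "wake_word"): State.LISTENING,  # barge-in during a response
}


class SatelliteSession:
    """Tracks one speaker's voice interaction and whether media should be ducked."""

    def __init__(self) -> None:
        self.state = State.IDLE

    @property
    def ducked(self) -> bool:
        # Assumption: media stays ducked for the whole interaction.
        return self.state is not State.IDLE

    def handle(self, event: str) -> State:
        # Unknown or out-of-order events leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state


session = SatelliteSession()
for name in ("wake_word", "utterance_end", "tts_start", "tts_end"):
    session.handle(name)
    print(name, session.state.name, session.ducked)
```

Keeping the transitions in a table makes it easy to see where barge-in fits: it is a single extra edge from the speaking state back to listening.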

Our commitment
We will provide a Supervisor add-on for Assist integration, example pipelines (English and German), and cross-device testing (Era series, Beam/Arc, microphone-equipped One, Roam/Move).
We will keep this optional and compatible with Sonos Voice Control and existing “Works with Sonos” integrations.

Call to action
Sonos, please implement the local voice hooks described above (wake-word event, LAN audio/text handoff, feedback signals, discovery/auth, and privacy guarantees). This is a customer-first path that delivers a better everyday experience, strengthens the Sonos ecosystem, and sells more speakers, all without compromising privacy.

Sign this petition to ask Sonos to open local voice interfaces so that the estimated 81,045 active Sonos+Home Assistant setups (and many more to come) can finally use custom wake words and fast, private, on-device voice control.

Rene Wurch, petition starter

Petition created on October 12, 2025