
Spych

License: MIT

Spych (pronounced "speech"): talk to your computer like it's your personal assistant without sending your voice to the cloud.

A lightweight, fully offline Python toolkit for wake word detection, audio transcription, and AI integrations. Built on faster-whisper and PvRecorder.

  • Fully offline: no API keys, no cloud calls, no eavesdropping
  • Multi-threaded wake word detection: overlapping listener windows so you rarely miss a trigger
  • Multiple wake words: map different words to different actions in one listener
  • Live transcription: continuous VAD-gated transcription to .txt and/or .srt files
  • Built-in agents: for Ollama, Claude Code, Codex, Gemini CLI, and OpenCode
  • Multi-agent orchestration: run several agents simultaneously under a single listener, each with its own wake words
  • Extensible: subclass BaseResponder to build your own agents with custom wake words and logic

API Docs: https://connor-makowski.github.io/spych/spych.html

Setup

Installation

Install Spych globally using pipx:

pipx install spych

Alternative: pip

Install using pip (requires Python 3.11+):

pip install spych

CLI

Once installed, spych is available as a command anywhere on your machine. You will still need to set up your respective agents before using them. See the docs below for setup instructions. Navigate to your project directory and launch any agent directly:

cd ~/my_project
spych claude

All agents and their parameters are supported as flags:

spych ollama --model llama3.2:latest
spych claude_sdk --setting-sources user project local
spych codex --listen-duration 8
spych opencode --model anthropic/claude-sonnet-4-5
spych gemini --wake-words gemini "hey gemini"

A global --theme flag controls the terminal colour output and must be placed before the agent name:

spych --theme light claude
spych --theme solarized ollama --model llama3.2:latest

Available themes: dark (default), light, solarized, mono.

Live transcription is also available via the CLI:

spych live
spych live --output-path meeting --output-format srt
spych live --terminate-words "stop recording"
spych live --no-timestamps --whisper-model small.en

Multiple agents can be run by opening one terminal session per agent and giving each a different --wake-words value. For example, you can run three Claude agents, each with its own wake words.

  • A multi-agent mode is also available via the CLI, but it has some limitations.
  • See the "Multi-agent" section below for more details.

Run spych --help or spych <agent> --help to see all available options.


Quick Start: Voice Agents

The fastest path from zero to voice-controlled AI. These one-liners handle everything: wake word detection, transcription, and routing your speech to the target agent.

Ollama

Talk to a local LLM entirely offline. Requires Ollama installed and running.

For this example we'll use the free llama3.2:latest model, but any Ollama model will work. Pull it first with: ollama pull llama3.2:latest.

from spych.agents import ollama

# Pull the model first: ollama pull llama3.2:latest
# Then say "hey llama" to trigger
ollama(model="llama3.2:latest")

Claude Code CLI

Voice-control Claude Code directly from your terminal. Requires Claude Code installed and authenticated. See: https://code.claude.com/docs/en/quickstart. Make sure you can run the claude command in your terminal before trying this.

Note: This can pull from your .claude folder in your user directory or from the project directory, so you can have different settings for different projects if you like.

from spych.agents import claude_code_cli

# Say "hey claude" to trigger
claude_code_cli()

Claude Code SDK

Same as above but uses the Claude Agent SDK via a subprocess worker instead of the CLI. This is great for a lightweight setup with better tool call feedback loops, but you will still need to be authenticated with the SDK and have your tools set up. See: https://platform.claude.com/docs/en/agent-sdk/overview for setup instructions.

Note: This can pull from your .claude folder in your user directory or from the project directory, so you can have different settings for different projects if you like.

from spych.agents import claude_code_sdk

# Say "hey claude" to trigger
claude_code_sdk()

Codex CLI

Voice-control OpenAI's Codex agent. Requires Codex CLI installed and authenticated. Make sure you can run codex commands in your terminal before trying this.

from spych.agents import codex_cli

# Say "hey codex" to trigger
codex_cli()

Gemini CLI

Voice-control Google's Gemini agent. Requires Gemini CLI installed and authenticated. Make sure you can run gemini commands in your terminal before trying this.

from spych.agents import gemini_cli

# Say "hey gemini" to trigger
gemini_cli()

OpenCode CLI

Voice-control the OpenCode agent. Requires OpenCode installed and authenticated. Make sure you can run opencode commands in your terminal before trying this.

from spych.agents import opencode_cli

# Say "hey opencode" to trigger
opencode_cli()

💡 Pro tip: Saying "Hey Llama" or "Hey Claude" tends to trigger more reliably than just the bare wake word.

All agents accept a terminate_words list (default: ["terminate"]). Say the word or use ctrl+c to stop the listener cleanly.

Coding Agent Parameters

| Parameter | claude_code_cli | claude_code_sdk | codex_cli | gemini_cli | opencode_cli | Description |
|---|---|---|---|---|---|---|
| name | Claude | Claude | Codex | Gemini | OpenCode | Custom display name for the agent |
| wake_words | ["claude", "clod", "cloud", "clawed"] | ["claude", "clod", "cloud", "clawed"] | ["codex"] | ["gemini", "google"] | ["opencode", "open code"] | Words that trigger the agent |
| terminate_words | ["terminate"] | ["terminate"] | ["terminate"] | ["terminate"] | ["terminate"] | Words that stop the listener |
| model | - | - | - | - | None | Model in provider/model format |
| listen_duration | 0 | 0 | 0 | 0 | 0 | Seconds to listen after wake word (0 = VAD auto) |
| continue_conversation | True | True | True | True | True | Resume the most recent session |
| setting_sources | - | ["user", "project", "local"] | - | - | - | Claude Code local settings to load |
| show_tool_events | True | True | True | True | True | Print live tool start/end events |
| spych_kwargs | - | - | - | - | - | Extra kwargs passed to Spych |
| spych_wake_kwargs | - | - | - | - | - | Extra kwargs passed to SpychWake |
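As a sketch of how these parameters combine, a coding agent can be launched with custom trigger words and a fixed listening window. The parameter names come from the table above; the specific values here are illustrative, not defaults.

```python
from spych.agents import claude_code_cli

# Illustrative values: extra wake-word variants, an additional
# terminate word, and an 8-second listening window instead of
# VAD auto-detection (listen_duration=0).
claude_code_cli(
    wake_words=["claude", "cloud"],
    terminate_words=["terminate", "shut down"],
    listen_duration=8,
    continue_conversation=False,  # start a fresh session each run
)
```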

Ollama Parameters

| Parameter | Default | Description |
|---|---|---|
| name | "Ollama" | Custom display name for the agent |
| wake_words | ["llama", "ollama", "lama"] | Words that trigger the agent |
| terminate_words | ["terminate"] | Words that stop the listener |
| model | "llama3.2:latest" | Ollama model name |
| listen_duration | 0 | Seconds to listen after wake word (0 = VAD auto) |
| history_length | 10 | Past interactions to include in context |
| host | "http://localhost:11434" | Ollama instance URL |
| spych_kwargs | None | Extra kwargs passed to Spych |
| spych_wake_kwargs | None | Extra kwargs passed to SpychWake |
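A sketch combining several of the parameters above, for instance to reach an Ollama instance on another machine. The host URL and history length here are illustrative assumptions; the parameter names are those documented in the table.

```python
from spych.agents import ollama

# Illustrative: point the agent at a remote Ollama host and keep
# a longer conversation history than the default of 10.
ollama(
    model="llama3.2:latest",
    host="http://192.168.1.50:11434",  # assumption: any reachable Ollama URL
    history_length=20,
    wake_words=["llama", "hey llama"],
)
```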

Live Transcription

SpychLive continuously records from the microphone using VAD and writes the transcript to disk in real time. No wake word required — it transcribes everything until stopped.

Python

from spych.live import SpychLive

live = SpychLive(
    output_format="srt",         # "txt", "srt", or "both"
    output_path="my_transcript", # written to my_transcript.srt
    show_timestamps=True,
    stop_key="q",                # type q + Enter to stop
    terminate_words=["stop recording"],
)
live.start()

CLI

spych live                                           # writes transcript.srt
spych live --output-path meeting --output-format both
spych live --terminate-words "stop recording"
spych live --no-timestamps --whisper-model small.en

SpychLive Parameters

| Parameter | Default | Description |
|---|---|---|
| output_format | "srt" | Output format(s): "txt", "srt", or "both" |
| output_path | "transcript" | Base path without extension; extensions are appended automatically |
| show_timestamps | True | Prepend [HH:MM:SS] timestamps to terminal and .txt output |
| stop_key | "q" | Key (then Enter) to stop the session |
| terminate_words | None | Spoken words that stop the session (detected after transcription, ~1–3s latency) |
| on_terminate | None | No-argument callback executed when a terminate word fires |
| device_index | -1 | Microphone device index; -1 uses system default |
| whisper_model | "base.en" | faster-whisper model name |
| whisper_device | "cpu" | Device for inference: "cpu" or "cuda" |
| whisper_compute_type | "int8" | Compute precision: "int8", "float16", or "float32" |
| no_speech_threshold | 0.3 | Whisper segments with no_speech_prob above this are discarded |
| speech_threshold | 0.5 | Silero VAD probability above which a frame is considered speech onset |
| silence_threshold | 0.35 | Silero VAD probability below which a frame is considered silence during speech |
| silence_frames_threshold | 20 | Consecutive silent frames (~32ms each) required to close a segment (~640ms) |
| speech_pad_frames | 5 | Pre-roll frame count and onset confirmation threshold (~160ms) |
| max_speech_duration_s | 30.0 | Hard cap on a single segment in seconds |
| context_words | 32 | Trailing transcript words passed as initial_prompt for contextual accuracy |
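The on_terminate hook pairs naturally with terminate_words. A minimal sketch, using only parameters from the table above (the callback body is illustrative):

```python
from spych.live import SpychLive

def announce_stop():
    # No-argument callback fired after a terminate word is transcribed.
    print("Terminate word detected - transcript saved.")

# Illustrative combination: both output formats, a spoken stop
# phrase, and a custom callback when the session ends.
live = SpychLive(
    output_format="both",              # writes meeting.txt and meeting.srt
    output_path="meeting",
    terminate_words=["stop recording"],
    on_terminate=announce_stop,
)
live.start()
```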

Multi-agent

Run several agents simultaneously under a single listener, each bound to its own wake words. Say "hey claude" to talk to Claude, "hey llama" to talk to Ollama — all in the same terminal session.

CLI

# Two agents, default wake words
spych multi --agents claude gemini

# Include Ollama with a specific model
spych multi --agents claude ollama --ollama-model llama3.2:latest

# Tune listen duration across all agents
spych multi --agents claude codex --listen-duration 8

Multi-agent CLI Parameters

| Flag | Default | Description |
|---|---|---|
| --agents | (required) | One or more agent names to run: claude (claude_code_cli), claude_sdk (claude_code_sdk), codex (codex_cli), gemini (gemini_cli), opencode (opencode_cli), ollama |
| --terminate-words | ["terminate"] | Words that stop all agents |
| --listen-duration | 5 | Seconds to listen after a wake word |
| --continue-conversation | true | Resume the most recent session for each coding agent |
| --show-tool-events | true | Print live tool start/end events |
| --ollama-model | llama3.2:latest | Ollama model. Only used when ollama is in --agents |
| --ollama-host | http://localhost:11434 | Ollama instance URL. Only used when ollama is in --agents |
| --ollama-history-length | 10 | Ollama context history length. Only used when ollama is in --agents |
| --opencode-model | None | OpenCode model in provider/model format. Only used when opencode_cli is in --agents |
| --setting-sources | ["user", "project", "local"] | Claude Code SDK setting sources. Only used when claude_code_sdk is in --agents |

Python

Use SpychOrchestrator directly to mix any combination of responders with custom wake words.

from spych.core import Spych
from spych.orchestrator import SpychOrchestrator
from spych.agents.claude import LocalClaudeCodeCLIResponder
from spych.agents.ollama import OllamaResponder

spych_object = Spych(whisper_model="base.en")

SpychOrchestrator(
    entries=[
        {
            "responder": LocalClaudeCodeCLIResponder(spych_object=spych_object),
            "wake_words": ["claude", "clod", "cloud", "clawed"],
            "terminate_words": ["terminate"],
        },
        {
            "responder": OllamaResponder(spych_object=spych_object, model="llama3.2:latest"),
            "wake_words": ["llama", "ollama", "lama"],
        },
    ]
).start()

OrchestratorEntry Keys

| Key | Required | Default | Description |
|---|---|---|---|
| responder | ✓ | - | A BaseResponder instance |
| wake_words | ✓ | - | Words that trigger this responder. Must be unique across all entries |
| terminate_words | | ["terminate"] | Words that stop the entire orchestrator. Merged across all entries |

SpychOrchestrator Parameters

| Parameter | Default | Description |
|---|---|---|
| entries | (required) | List of OrchestratorEntry dicts — see table above |
| spych_wake_kwargs | None | Extra kwargs forwarded to SpychWake (e.g. whisper_model, wake_listener_count) |
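A sketch of spych_wake_kwargs in use, forwarding the two kwargs named in the table above to the shared wake listener (the values chosen here are illustrative):

```python
from spych.core import Spych
from spych.orchestrator import SpychOrchestrator
from spych.agents.ollama import OllamaResponder

# Illustrative: a single-entry orchestrator that forwards
# wake-listener settings via spych_wake_kwargs.
SpychOrchestrator(
    entries=[
        {
            "responder": OllamaResponder(
                spych_object=Spych(whisper_model="base.en"),
                model="llama3.2:latest",
            ),
            "wake_words": ["llama"],
        },
    ],
    spych_wake_kwargs={"whisper_model": "tiny.en", "wake_listener_count": 3},
).start()
```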

Building Your Own Agent

Not using any of the above? No problem. Subclass BaseResponder, implement respond, and you're done. Spych handles the rest: listening, transcription, spinner UI, timing, error handling, all of it.

from spych.responders import BaseResponder

class MyResponder(BaseResponder):
    def respond(self, user_input: str) -> str:
        return f"'{self.name}' heard: {user_input}"

A complete working example with a custom wake word:

from spych import Spych, SpychOrchestrator
from spych.responders import BaseResponder

class MyResponder(BaseResponder):
    def respond(self, user_input: str) -> str:
        return f"'{self.name}' heard: {user_input}"

SpychOrchestrator(
    entries=[
        {
            "responder": MyResponder(
                spych_object=Spych(whisper_model="base.en"),
                listen_duration=5,
                name="TestResponder",
            ),
            "wake_words": ["test"],
            "terminate_words": ["terminate"],
        }
    ]
).start()

The orchestrator can also handle multiple custom agents at once, each with its own wake words. For example, you can build translation agents that listen for "Spanish" or "German" and route to the appropriate responder:

Note: To run this example, you will need to have Ollama running and an Ollama model that can do translations. You can use llama3.2:latest or any other model you have set up for this purpose.

from spych import Spych, SpychOrchestrator
from spych.agents import OllamaResponder

class Spanish(OllamaResponder):
    def respond(self, user_input: str) -> str:
        prompt = f"Translate the following text to Spanish and return only the translated text: '{user_input}'"
        return super().respond(prompt)

class German(OllamaResponder):
    def respond(self, user_input: str) -> str:
        prompt = f"Translate the following text to German and return only the translated text: '{user_input}'"
        return super().respond(prompt)

SpychOrchestrator(
    entries=[
        {
            "responder": Spanish(
                spych_object=Spych(whisper_model="base.en"),
                name="SpanishTranslator",
                model="llama3.2:latest",
            ),
            "wake_words": ["spanish"],
            "terminate_words": ["terminate"],
        },
        {
            "responder": German(
                spych_object=Spych(whisper_model="base.en"),
                name="GermanTranslator",
                model="llama3.2:latest",
            ),
            "wake_words": ["german"],
            "terminate_words": ["terminate"],
        }
    ]
).start()

Custom Agent Contributions

Think your agent would be useful to others? Open a PR or file a feature request via a GitHub issue. Contributions are very welcome.


Lower-Level API

Need more control? Use SpychWake and Spych directly.

Listen and Transcribe

Spych records from the mic and returns a transcription string.

from spych import Spych

spych = Spych(
    whisper_model="base.en",  # or tiny, small, medium, large -> all faster-whisper models work
    whisper_device="cpu",     # use "cuda" if you have an Nvidia GPU
)

print(spych.listen(duration=5))

See: https://connor-makowski.github.io/spych/spych/core.html

Wake Word Detection

SpychWake runs multiple overlapping listener threads and fires a callback when a wake word is detected.

from spych import SpychWake, Spych

spych = Spych(whisper_model="base.en", whisper_device="cpu")

def on_wake():
    print("Wake word detected! Listening...")
    print(spych.listen(duration=5))

wake = SpychWake(
    wake_word_map={"speech": on_wake},
    whisper_model="tiny.en",
    whisper_device="cpu",
)

wake.start()

See: https://connor-makowski.github.io/spych/spych/wake.html
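Because wake_word_map is a mapping, one listener can route different wake words to different callbacks, as noted in the feature list. A minimal sketch (callback bodies are illustrative):

```python
from spych import SpychWake

def on_notes():
    # Illustrative action for the "notes" wake word.
    print("notes wake word heard - do something here")

def on_timer():
    # Illustrative action for the "timer" wake word.
    print("timer wake word heard - do something else here")

# One listener, two wake words, two different actions.
SpychWake(
    wake_word_map={"notes": on_notes, "timer": on_timer},
    whisper_model="tiny.en",
).start()
```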


API Reference

Full docs including all parameters and methods: https://connor-makowski.github.io/spych/spych.html


Support

Found a bug or want a new feature? Open an issue on GitHub.


Contributing

Contributions are welcome!

  1. Fork the repo and clone it locally.
  2. Make your changes.
  3. Run tests and make sure they pass.
  4. Commit atomically with clear messages.
  5. Submit a pull request.

Virtual environment setup:

python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
./utils/test.sh
  1"""
  2# Spych
  3[![PyPI version](https://badge.fury.io/py/spych.svg)](https://badge.fury.io/py/spych)
  4[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
  5[![PyPI Downloads](https://img.shields.io/pypi/dm/spych.svg?label=PyPI%20downloads)](https://pypi.org/project/spych/)
  6
  7**Spych** (pronounced "speech"): talk to your computer like its your personal assistant without sending your voice to the cloud.
  8
  9A lightweight, fully offline Python toolkit for wake word detection, audio transcription, and AI integrations. Built on [faster-whisper](https://github.com/SYSTRAN/faster-whisper) and [PvRecorder](https://github.com/Picovoice/pvrecorder).
 10
 11- **Fully offline**: no API keys, no cloud calls, no eavesdropping
 12- **Multi-threaded wake word detection**: overlapping listener windows so you rarely miss a trigger
 13- **Multiple wake words**: map different words to different actions in one listener
 14- **Live transcription**: continuous VAD-gated transcription to `.txt` and/or `.srt` files
 15- **Built-in agents**: for [Ollama](https://ollama.com), [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [Codex](https://github.com/openai/codex), [Gemini CLI](https://github.com/google-gemini/gemini-cli), and [OpenCode](https://opencode.ai)
 16- **Multi-agent orchestration**: run several agents simultaneously under a single listener, each with its own wake words
 17- **Extensible**: subclass `BaseResponder` to build your own agents with custom wake words and logic
 18
 19**API Docs**: https://connor-makowski.github.io/spych/spych.html
 20
 21
 22# Setup
 23
 24## Installation
 25
 26### Recommended: pipx (strongly recommended)
 27
 28Install Spych globally using [pipx](https://pipx.pypa.io/stable/installation/):
 29
 30```bash
 31pipx install spych
 32```
 33
 34### Alternative: pip
 35
 36Install using pip (requires Python 3.11+):
 37
 38```bash
 39pip install spych
 40```
 41
 42---
 43
 44# CLI
 45
 46Once installed, `spych` is available as a command anywhere on your machine. You will still need to set up your respective agents before using them. See the docs below for setup instructions. Navigate to your project directory and launch any agent directly:
 47
 48```bash
 49cd ~/my_project
 50spych claude
 51```
 52
 53All agents and their parameters are supported as flags:
 54
 55```bash
 56spych ollama --model llama3.2:latest
 57spych claude_sdk --setting-sources user project local
 58spych codex --listen-duration 8
 59spych opencode --model anthropic/claude-sonnet-4-5
 60spych gemini --wake-words gemini "hey gemini"
 61```
 62
 63A global `--theme` flag controls the terminal colour output and must be placed before the agent name:
 64
 65```bash
 66spych --theme light claude
 67spych --theme solarized ollama --model llama3.2:latest
 68```
 69
 70Available themes: `dark` (default), `light`, `solarized`, `mono`.
 71
 72Live transcription is also available via the CLI:
 73
 74```bash
 75spych live
 76spych live --output-path meeting --output-format srt
 77spych live --terminate-words "stop recording"
 78spych live --no-timestamps --whisper-model small.en
 79```
 80
 81Multiple agents can be run by creating one terminal session per agent and setting `--wake-words` to be different per agent. In this way you can create 3 claude agents with different wake words.
 82
 83- A Multi agent mode is also available via the CLI, but has some limitations.
 84- See the "Multi-agent" section below for more details.
 85
 86Run `spych --help` or `spych <agent> --help` to see all available options.
 87
 88---
 89
 90# Quick Start: Voice Agents
 91
 92The fastest path from zero to voice-controlled AI. These one-liners handle everything: wake word detection, transcription, and routing your speech to the target agent.
 93
 94## Ollama
 95
 96Talk to a local LLM entirely offline. Requires [Ollama](https://ollama.com) installed and running.
 97
 98For this example, we'll use the free `llama3.2:latest` model, but any Ollama model will work. For this example run: `ollama pull llama3.2:latest`.
 99```python
100from spych.agents import ollama
101
102# Pull the model first: ollama pull llama3.2:latest
103# Then say "hey llama" to trigger
104ollama(model="llama3.2:latest")
105```
106
107## Claude Code CLI
108
109Voice-control Claude Code directly from your terminal. Requires [Claude Code](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated. See: https://code.claude.com/docs/en/quickstart. Make sure you can run `claude code` commands in your terminal before trying this. 
110
111Note: This can pull from your `.claude` folder in your user directory or from the project directory, so you can have different settings for different projects if you like.
112
113
114```python
115from spych.agents import claude_code_cli
116
117# Say "hey claude" to trigger
118claude_code_cli()
119```
120
121## Claude Code SDK
122
123Same as above but uses the Claude Agent SDK via a subprocess worker instead of the CLI. This is great for a lightweight setup with better tool call feedback loops, but you will still need to be authenticated with the SDK and have your tools set up. See: https://platform.claude.com/docs/en/agent-sdk/overview for setup instructions. 
124
125Note: This can pull from your `.claude` folder in your user directory or from the project directory, so you can have different settings for different projects if you like.
126
127```python
128from spych.agents import claude_code_sdk
129
130# Say "hey claude" to trigger
131claude_code_sdk()
132```
133
134## Codex CLI
135
136Voice-control OpenAI's Codex agent. Requires [Codex CLI](https://github.com/openai/codex) installed and authenticated. Make sure you can run `codex` commands in your terminal before trying this.
137
138```python
139from spych.agents import codex_cli
140
141# Say "hey codex" to trigger
142codex_cli()
143```
144
145## Gemini CLI
146
147Voice-control Google's Gemini agent. Requires [Gemini CLI](https://github.com/google-gemini/gemini-cli) installed and authenticated. Make sure you can run `gemini` commands in your terminal before trying this.
148
149```python
150from spych.agents import gemini_cli
151
152# Say "hey gemini" to trigger
153gemini_cli()
154```
155
156## OpenCode CLI
157
158Voice-control the OpenCode agent. Requires [OpenCode](https://opencode.ai) installed and authenticated. Make sure you can run `opencode` commands in your terminal before trying this.
159
160```python
161from spych.agents import opencode_cli
162
163# Say "hey opencode" to trigger
164opencode_cli()
165```
166
167> 💡 **Pro tip:** Saying "Hey Llama" or "Hey Claude" tends to trigger more reliably than just the bare wake word.
168
169All agents accept a `terminate_words` list (default: `["terminate"]`). Say the word or use `ctrl+c` to stop the listener cleanly.
170
171### Coding Agent Parameters
172
173| Parameter | `claude_code_cli` | `claude_code_sdk` | `codex_cli` | `gemini_cli` | `opencode_cli` | Description |
174|---|---|---|---|---|---|---|
175| `name` | `Claude` | `Claude` | `Codex` | `Gemini` | `OpenCode` | Custom display name for the agent |
176| `wake_words` | `["claude", "clod", "cloud", "clawed"]` | `["claude", "clod", "cloud", "clawed"]` | `["codex"]` | `["gemini", "google"]` | `["opencode", "open code"]` | Words that trigger the agent |
177| `terminate_words` | `["terminate"]` | `["terminate"]` | `["terminate"]` | `["terminate"]` | `["terminate"]` | Words that stop the listener |
178| `model` | - | - | - | - | `None` | Model in `provider/model` format |
179| `listen_duration` | `0` | `0` | `0` | `0` | `0` | Seconds to listen after wake word (0 = VAD auto) |
180| `continue_conversation` | `True` | `True` | `True` | `True` | `True` | Resume the most recent session |
181| `setting_sources` | - | `["user", "project", "local"]` | - | - | - | Claude Code local settings to load |
182| `show_tool_events` | `True` | `True` | `True` | `True` | `True` | Print live tool start/end events |
183| `spych_kwargs` | - | - | - | - | - | Extra kwargs passed to `Spych` |
184| `spych_wake_kwargs` | - | - | - | - | - | Extra kwargs passed to `SpychWake` |
185
186### Ollama Parameters
187
188| Parameter | Default | Description |
189|---|---|---|
190| `name` | `"Ollama"` | Custom display name for the agent |
191| `wake_words` | `["llama", "ollama", "lama"]` | Words that trigger the agent |
192| `terminate_words` | `["terminate"]` | Words that stop the listener |
193| `model` | `"llama3.2:latest"` | Ollama model name |
194| `listen_duration` | `0` | Seconds to listen after wake word (0 = VAD auto) |
195| `history_length` | `10` | Past interactions to include in context |
196| `host` | `"http://localhost:11434"` | Ollama instance URL |
197| `spych_kwargs` | `None` | Extra kwargs passed to `Spych` |
198| `spych_wake_kwargs` | `None` | Extra kwargs passed to `SpychWake` |
199
200---
201
202# Live Transcription
203
204`SpychLive` continuously records from the microphone using VAD and writes the transcript to disk in real time. No wake word required — it transcribes everything until stopped.
205
206## Python
207
208```python
209from spych.live import SpychLive
210
211live = SpychLive(
212    output_format="srt",         # "txt", "srt", or "both"
213    output_path="my_transcript", # written to my_transcript.srt
214    show_timestamps=True,
215    stop_key="q",                # type q + Enter to stop
216    terminate_words=["stop recording"],
217)
218live.start()
219```
220
221## CLI
222
223```bash
224spych live                                           # writes transcript.srt
225spych live --output-path meeting --output-format both
226spych live --terminate-words "stop recording"
227spych live --no-timestamps --whisper-model small.en
228```
229
230### `SpychLive` Parameters
231
232| Parameter | Default | Description |
233|---|---|---|
234| `output_format` | `"srt"` | Output format(s): `"txt"`, `"srt"`, or `"both"` |
235| `output_path` | `"transcript"` | Base path without extension; extensions are appended automatically |
236| `show_timestamps` | `True` | Prepend `[HH:MM:SS]` timestamps to terminal and `.txt` output |
237| `stop_key` | `"q"` | Key (then Enter) to stop the session |
238| `terminate_words` | `None` | Spoken words that stop the session (detected after transcription, ~1–3s latency) |
239| `on_terminate` | `None` | No-argument callback executed when a terminate word fires |
240| `device_index` | `-1` | Microphone device index; `-1` uses system default |
241| `whisper_model` | `"base.en"` | faster-whisper model name |
242| `whisper_device` | `"cpu"` | Device for inference: `"cpu"` or `"cuda"` |
243| `whisper_compute_type` | `"int8"` | Compute precision: `"int8"`, `"float16"`, or `"float32"` |
244| `no_speech_threshold` | `0.3` | Whisper segments with `no_speech_prob` above this are discarded |
245| `speech_threshold` | `0.5` | Silero VAD probability above which a frame is considered speech onset |
246| `silence_threshold` | `0.35` | Silero VAD probability below which a frame is considered silence during speech |
247| `silence_frames_threshold` | `20` | Consecutive silent frames (~32ms each) required to close a segment (~640ms) |
248| `speech_pad_frames` | `5` | Pre-roll frame count and onset confirmation threshold (~160ms) |
249| `max_speech_duration_s` | `30.0` | Hard cap on a single segment in seconds |
250| `context_words` | `32` | Trailing transcript words passed as `initial_prompt` for contextual accuracy |
251
252---
253
254# Multi-agent
255
256Run several agents simultaneously under a single listener, each bound to its own wake words. Say "hey claude" to talk to Claude, "hey llama" to talk to Ollama — all in the same terminal session.
257
258## CLI
259
260```bash
261# Two agents, default wake words
262spych multi --agents claude gemini
263
264# Include Ollama with a specific model
265spych multi --agents claude ollama --ollama-model llama3.2:latest
266
267# Tune listen duration across all agents
268spych multi --agents claude codex --listen-duration 8
269```
270
271### Multi-agent CLI Parameters
272
273| Flag | Default | Description |
274|---|---|---|
275| `--agents` | *(required)* | One or more agent names to run: `claude` (`claude_code_cli`), `claude_sdk` (`claude_code_sdk`), `codex` (`codex_cli`), `gemini` (`gemini_cli`), `opencode` (`opencode_cli`), `ollama` |
276| `--terminate-words` | `["terminate"]` | Words that stop all agents |
277| `--listen-duration` | `5` | Seconds to listen after a wake word |
278| `--continue-conversation` | `true` | Resume the most recent session for each coding agent |
279| `--show-tool-events` | `true` | Print live tool start/end events |
280| `--ollama-model` | `llama3.2:latest` | Ollama model. Only used when `ollama` is in `--agents` |
281| `--ollama-host` | `http://localhost:11434` | Ollama instance URL. Only used when `ollama` is in `--agents` |
282| `--ollama-history-length` | `10` | Ollama context history length. Only used when `ollama` is in `--agents` |
283| `--opencode-model` | `None` | OpenCode model in `provider/model` format. Only used when `opencode_cli` is in `--agents` |
284| `--setting-sources` | `["user", "project", "local"]` | Claude Code SDK setting sources. Only used when `claude_code_sdk` is in `--agents` |
285
286## Python
287
288Use `SpychOrchestrator` directly to mix any combination of responders with custom wake words.
289
290```python
291from spych.core import Spych
292from spych.orchestrator import SpychOrchestrator
293from spych.agents.claude import LocalClaudeCodeCLIResponder
294from spych.agents.ollama import OllamaResponder
295
296spych_object = Spych(whisper_model="base.en")
297
298SpychOrchestrator(
299    entries=[
300        {
301            "responder": LocalClaudeCodeCLIResponder(spych_object=spych_object),
302            "wake_words": ["claude", "clod", "cloud", "clawed"],
303            "terminate_words": ["terminate"],
304        },
305        {
306            "responder": OllamaResponder(spych_object=spych_object, model="llama3.2:latest"),
307            "wake_words": ["llama", "ollama", "lama"],
308        },
309    ]
310).start()
311```
312
313### `OrchestratorEntry` Keys
314
315| Key | Required | Default | Description |
316|---|---|---|---|
317| `responder` | ✓ | - | A `BaseResponder` instance |
318| `wake_words` | ✓ | - | Words that trigger this responder. Must be unique across all entries |
319| `terminate_words` | | `["terminate"]` | Words that stop the entire orchestrator. Merged across all entries |
320
321### `SpychOrchestrator` Parameters
322
323| Parameter | Default | Description |
324|---|---|---|
325| `entries` | *(required)* | List of `OrchestratorEntry` dicts — see table above |
326| `spych_wake_kwargs` | `None` | Extra kwargs forwarded to `SpychWake` (e.g. `whisper_model`, `wake_listener_count`) |
327
328---
329
330# Building Your Own Agent
331
332Not using any of the above? No problem. Subclass `BaseResponder`, implement `respond`, and you're done. Spych handles the rest: listening, transcription, spinner UI, timing, error handling, all of it.
333```python
334from spych.responders import BaseResponder
335
336class MyResponder(BaseResponder):
337    def respond(self, user_input: str) -> str:
338        return f"'{self.name}' heard: {user_input}"
339```

A complete working example with a custom wake word:
```python
from spych import Spych, SpychOrchestrator
from spych.responders import BaseResponder

class MyResponder(BaseResponder):
    def respond(self, user_input: str) -> str:
        return f"'{self.name}' heard: {user_input}"

SpychOrchestrator(
    entries=[
        {
            "responder": MyResponder(
                spych_object=Spych(whisper_model="base.en"),
                listen_duration=5,
                name="TestResponder",
            ),
            "wake_words": ["test"],
            "terminate_words": ["terminate"],
        }
    ]
).start()
```

The orchestrator can also handle multiple custom agents at once, each with its own wake words. For example, you can build a translation agent that listens for "Spanish" or "German" and routes to the appropriate responder:

> Note: To run this example, you will need Ollama running and a model that can do translations. You can use `llama3.2:latest` or any other model you have set up for this purpose.

```python
from spych import Spych, SpychOrchestrator
from spych.agents import OllamaResponder

class Spanish(OllamaResponder):
    def respond(self, user_input: str) -> str:
        user_input = f"Translate the following text to Spanish and return only the translated text: '{user_input}'"
        return super().respond(user_input)

class German(OllamaResponder):
    def respond(self, user_input: str) -> str:
        user_input = f"Translate the following text to German and return only the translated text: '{user_input}'"
        return super().respond(user_input)

SpychOrchestrator(
    entries=[
        {
            "responder": Spanish(
                spych_object=Spych(whisper_model="base.en"),
                name="SpanishTranslator",
                model="llama3.2:latest",
            ),
            "wake_words": ["spanish"],
            "terminate_words": ["terminate"],
        },
        {
            "responder": German(
                spych_object=Spych(whisper_model="base.en"),
                name="GermanTranslator",
                model="llama3.2:latest",
            ),
            "wake_words": ["german"],
            "terminate_words": ["terminate"],
        }
    ]
).start()
```

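The two subclasses differ only in the target language, so the prompt construction can be factored into a small helper. This refactor is a suggestion, not part of Spych's API:

```python
def translation_prompt(language: str, text: str) -> str:
    """Build the same instruction template used by the Spanish/German
    responders above, parameterized by target language."""
    return (
        f"Translate the following text to {language} "
        f"and return only the translated text: '{text}'"
    )

print(translation_prompt("German", "good morning"))
# Translate the following text to German and return only the translated text: 'good morning'
```

Each responder's `respond` method could then call `translation_prompt(self.language, user_input)` before delegating to `super().respond(...)`, making it trivial to add more languages.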
## Custom Agent Contributions

Think your agent would be useful to others? Open a PR or file a feature request via a GitHub issue. Contributions are very welcome.

---

# Lower-Level API

Need more control? Use `SpychWake` and `Spych` directly.

## Listen and Transcribe

`Spych` records from the mic and returns a transcription string.
```python
from spych import Spych

spych = Spych(
    whisper_model="base.en",  # or tiny, small, medium, large -> all faster-whisper models work
    whisper_device="cpu",     # use "cuda" if you have an Nvidia GPU
)

print(spych.listen(duration=5))
```

See: https://connor-makowski.github.io/spych/spych/core.html

## Wake Word Detection

`SpychWake` runs multiple overlapping listener threads and fires a callback when a wake word is detected.
```python
from spych import SpychWake, Spych

spych = Spych(whisper_model="base.en", whisper_device="cpu")

def on_wake():
    print("Wake word detected! Listening...")
    print(spych.listen(duration=5))

wake = SpychWake(
    wake_word_map={"speech": on_wake},
    whisper_model="tiny.en",
    whisper_device="cpu",
)

wake.start()
```

See: https://connor-makowski.github.io/spych/spych/wake.html

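Conceptually, each listener window transcribes a short audio chunk and scans the text for wake words. The sketch below mimics only that matching step on plain strings — a simplified illustration of the idea, not Spych's actual detection code, which works on overlapping audio windows across threads:

```python
def match_wake_word(transcript: str, wake_word_map: dict):
    """Return the callback mapped to the first wake word found in a
    transcript chunk, or None (simplified illustration only)."""
    for token in transcript.lower().split():
        # Strip trailing punctuation the transcriber may attach.
        callback = wake_word_map.get(token.strip(".,!?"))
        if callback is not None:
            return callback
    return None

wake_map = {"speech": lambda: "wake!"}
cb = match_wake_word("Hey speech, what's the weather?", wake_map)
# cb is the mapped lambda; cb() -> "wake!"
```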
---

# API Reference

Full docs including all parameters and methods: https://connor-makowski.github.io/spych/spych.html

---

# Support

Found a bug or want a new feature? [Open an issue on GitHub](https://github.com/connor-makowski/spych/issues).

---

# Contributing

Contributions are welcome!

1. Fork the repo and clone it locally.
2. Make your changes.
3. Run the tests and make sure they pass.
4. Commit atomically with clear messages.
5. Submit a pull request.

**Virtual environment setup:**
```bash
python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
./utils/test.sh
```
"""
489
490from .core import Spych
491from .wake import SpychWake
492from .orchestrator import SpychOrchestrator