# Spych

**Spych** (pronounced "speech"): talk to your computer like it's your personal assistant, without sending your voice to the cloud.
A lightweight, fully offline Python toolkit for wake word detection, audio transcription, and AI integrations. Built on [faster-whisper](https://github.com/SYSTRAN/faster-whisper) and [PvRecorder](https://github.com/Picovoice/pvrecorder).
- **Fully offline**: no API keys, no cloud calls, no eavesdropping
- **Multi-threaded wake word detection**: overlapping listener windows so you rarely miss a trigger
- **Multiple wake words**: map different words to different actions in one listener
- **Live transcription**: continuous VAD-gated transcription to `.txt` and/or `.srt` files
- **Built-in agents**: for [Ollama](https://ollama.com), [Claude Code](https://docs.anthropic.com/en/docs/claude-code), [Codex](https://github.com/openai/codex), [Gemini CLI](https://github.com/google-gemini/gemini-cli), and [OpenCode](https://opencode.ai)
- **Multi-agent orchestration**: run several agents simultaneously under a single listener, each with its own wake words
- **Extensible**: subclass `BaseResponder` to build your own agents with custom wake words and logic
**API Docs**: https://connor-makowski.github.io/spych/spych.html
# Setup

## Installation

### Recommended: pipx

Install Spych globally using [pipx](https://pipx.pypa.io/stable/installation/):

```bash
pipx install spych
```

### Alternative: pip

Install using pip (requires Python 3.11+):

```bash
pip install spych
```
# CLI

Once installed, `spych` is available as a command anywhere on your machine. You will still need to set up each agent before using it; see the sections below for setup instructions. Navigate to your project directory and launch any agent directly:

```bash
cd ~/my_project
spych claude
```
All agents and their parameters are supported as flags:

```bash
spych ollama --model llama3.2:latest
spych claude_sdk --setting-sources user project local
spych codex --listen-duration 8
spych opencode --model anthropic/claude-sonnet-4-5
spych gemini --wake-words gemini "hey gemini"
```
A global `--theme` flag controls the terminal colour output and must be placed before the agent name:

```bash
spych --theme light claude
spych --theme solarized ollama --model llama3.2:latest
```

Available themes: `dark` (default), `light`, `solarized`, `mono`.
Live transcription is also available via the CLI:

```bash
spych live
spych live --output-path meeting --output-format srt
spych live --terminate-words "stop recording"
spych live --no-timestamps --whisper-model small.en
```
Multiple agents can also be run by opening one terminal session per agent and giving each a different `--wake-words` value. In this way you can, for example, run three Claude agents with different wake words.

- A multi-agent mode is also available via the CLI, but has some limitations.
- See the "Multi-agent" section below for more details.
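As a sketch, three side-by-side sessions might look like this (the wake words here are hypothetical; any distinct words work):

```shell
# Terminal 1
spych claude --wake-words "hey alpha"

# Terminal 2
spych claude --wake-words "hey bravo"

# Terminal 3
spych claude --wake-words "hey charlie"
```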
Run `spych --help` or `spych <agent> --help` to see all available options.
# Quick Start: Voice Agents
The fastest path from zero to voice-controlled AI. These one-liners handle everything: wake word detection, transcription, and routing your speech to the target agent.
## Ollama

Talk to a local LLM entirely offline. Requires [Ollama](https://ollama.com) installed and running.

We'll use the free `llama3.2:latest` model here, but any Ollama model will work. Pull it first with: `ollama pull llama3.2:latest`.

```python
from spych.agents import ollama

# Pull the model first: ollama pull llama3.2:latest
# Then say "hey llama" to trigger
ollama(model="llama3.2:latest")
```
## Claude Code CLI

Voice-control Claude Code directly from your terminal. Requires [Claude Code](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated. See the quickstart: https://code.claude.com/docs/en/quickstart. Make sure you can run the `claude` command in your terminal before trying this.

Note: This can pull settings from the `.claude` folder in your user directory or from the project directory, so you can have different settings for different projects if you like.

```python
from spych.agents import claude_code_cli

# Say "hey claude" to trigger
claude_code_cli()
```
## Claude Code SDK

Same as above, but uses the Claude Agent SDK via a subprocess worker instead of the CLI. This is great for a lightweight setup with better tool-call feedback loops, but you will still need to be authenticated with the SDK and have your tools set up. See: https://platform.claude.com/docs/en/agent-sdk/overview for setup instructions.

Note: This can pull settings from the `.claude` folder in your user directory or from the project directory, so you can have different settings for different projects if you like.

```python
from spych.agents import claude_code_sdk

# Say "hey claude" to trigger
claude_code_sdk()
```
## Codex CLI

Voice-control OpenAI's Codex agent. Requires [Codex CLI](https://github.com/openai/codex) installed and authenticated. Make sure you can run `codex` commands in your terminal before trying this.

```python
from spych.agents import codex_cli

# Say "hey codex" to trigger
codex_cli()
```
## Gemini CLI

Voice-control Google's Gemini agent. Requires [Gemini CLI](https://github.com/google-gemini/gemini-cli) installed and authenticated. Make sure you can run `gemini` commands in your terminal before trying this.

```python
from spych.agents import gemini_cli

# Say "hey gemini" to trigger
gemini_cli()
```
## OpenCode CLI

Voice-control the OpenCode agent. Requires [OpenCode](https://opencode.ai) installed and authenticated. Make sure you can run `opencode` commands in your terminal before trying this.

```python
from spych.agents import opencode_cli

# Say "hey opencode" to trigger
opencode_cli()
```
> 💡 **Pro tip:** Saying "Hey Llama" or "Hey Claude" tends to trigger more reliably than the bare wake word alone.

All agents accept a `terminate_words` list (default: `["terminate"]`). Say one of those words, or press `ctrl+c`, to stop the listener cleanly.
### Coding Agent Parameters

| Parameter | `claude_code_cli` | `claude_code_sdk` | `codex_cli` | `gemini_cli` | `opencode_cli` | Description |
|---|---|---|---|---|---|---|
| `name` | `Claude` | `Claude` | `Codex` | `Gemini` | `OpenCode` | Custom display name for the agent |
| `wake_words` | `["claude", "clod", "cloud", "clawed"]` | `["claude", "clod", "cloud", "clawed"]` | `["codex"]` | `["gemini", "google"]` | `["opencode", "open code"]` | Words that trigger the agent |
| `terminate_words` | `["terminate"]` | `["terminate"]` | `["terminate"]` | `["terminate"]` | `["terminate"]` | Words that stop the listener |
| `model` | - | - | - | - | `None` | Model in `provider/model` format |
| `listen_duration` | `0` | `0` | `0` | `0` | `0` | Seconds to listen after wake word (0 = VAD auto) |
| `continue_conversation` | `True` | `True` | `True` | `True` | `True` | Resume the most recent session |
| `setting_sources` | - | `["user", "project", "local"]` | - | - | - | Claude Code local settings to load |
| `show_tool_events` | `True` | `True` | `True` | `True` | `True` | Print live tool start/end events |
| `spych_kwargs` | - | - | - | - | - | Extra kwargs passed to `Spych` |
| `spych_wake_kwargs` | - | - | - | - | - | Extra kwargs passed to `SpychWake` |
### Ollama Parameters

| Parameter | Default | Description |
|---|---|---|
| `name` | `"Ollama"` | Custom display name for the agent |
| `wake_words` | `["llama", "ollama", "lama"]` | Words that trigger the agent |
| `terminate_words` | `["terminate"]` | Words that stop the listener |
| `model` | `"llama3.2:latest"` | Ollama model name |
| `listen_duration` | `0` | Seconds to listen after wake word (0 = VAD auto) |
| `history_length` | `10` | Past interactions to include in context |
| `host` | `"http://localhost:11434"` | Ollama instance URL |
| `spych_kwargs` | `None` | Extra kwargs passed to `Spych` |
| `spych_wake_kwargs` | `None` | Extra kwargs passed to `SpychWake` |
# Live Transcription

`SpychLive` continuously records from the microphone using VAD and writes the transcript to disk in real time. No wake word required — it transcribes everything until stopped.
## Python

```python
from spych.live import SpychLive

live = SpychLive(
    output_format="srt",          # "txt", "srt", or "both"
    output_path="my_transcript",  # written to my_transcript.srt
    show_timestamps=True,
    stop_key="q",                 # type q + Enter to stop
    terminate_words=["stop recording"],
)
live.start()
```
## CLI

```bash
spych live                       # writes transcript.srt
spych live --output-path meeting --output-format both
spych live --terminate-words "stop recording"
spych live --no-timestamps --whisper-model small.en
```
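For reference, `.srt` files use `HH:MM:SS,mmm` timecodes. A small helper showing the format (illustrative only, not part of Spych):

```python
def to_srt_timestamp(seconds: float) -> str:
    """Format a duration in seconds as an SRT timecode: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

print(to_srt_timestamp(3661.5))  # 01:01:01,500
```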
### `SpychLive` Parameters

| Parameter | Default | Description |
|---|---|---|
| `output_format` | `"srt"` | Output format(s): `"txt"`, `"srt"`, or `"both"` |
| `output_path` | `"transcript"` | Base path without extension; extensions are appended automatically |
| `show_timestamps` | `True` | Prepend `[HH:MM:SS]` timestamps to terminal and `.txt` output |
| `stop_key` | `"q"` | Key (then Enter) to stop the session |
| `terminate_words` | `None` | Spoken words that stop the session (detected after transcription, ~1–3s latency) |
| `on_terminate` | `None` | No-argument callback executed when a terminate word fires |
| `device_index` | `-1` | Microphone device index; `-1` uses system default |
| `whisper_model` | `"base.en"` | faster-whisper model name |
| `whisper_device` | `"cpu"` | Device for inference: `"cpu"` or `"cuda"` |
| `whisper_compute_type` | `"int8"` | Compute precision: `"int8"`, `"float16"`, or `"float32"` |
| `no_speech_threshold` | `0.3` | Whisper segments with `no_speech_prob` above this are discarded |
| `speech_threshold` | `0.5` | Silero VAD probability above which a frame is considered speech onset |
| `silence_threshold` | `0.35` | Silero VAD probability below which a frame is considered silence during speech |
| `silence_frames_threshold` | `20` | Consecutive silent frames (~32ms each) required to close a segment (~640ms) |
| `speech_pad_frames` | `5` | Pre-roll frame count and onset confirmation threshold (~160ms) |
| `max_speech_duration_s` | `30.0` | Hard cap on a single segment in seconds |
| `context_words` | `32` | Trailing transcript words passed as `initial_prompt` for contextual accuracy |
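To make the frame-based defaults concrete, here is the arithmetic behind the ~640ms and ~160ms figures in the table (assuming the ~32ms frame size noted there):

```python
FRAME_MS = 32  # approximate duration of one VAD frame, per the table above

silence_frames_threshold = 20
speech_pad_frames = 5

# 20 silent frames * 32 ms = 640 ms of trailing silence closes a segment
print(silence_frames_threshold * FRAME_MS)  # 640

# 5 pad frames * 32 ms = 160 ms of pre-roll kept before speech onset
print(speech_pad_frames * FRAME_MS)  # 160
```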
# Multi-agent

Run several agents simultaneously under a single listener, each bound to its own wake words. Say "hey claude" to talk to Claude, "hey llama" to talk to Ollama — all in the same terminal session.
## CLI

```bash
# Two agents, default wake words
spych multi --agents claude gemini

# Include Ollama with a specific model
spych multi --agents claude ollama --ollama-model llama3.2:latest

# Tune listen duration across all agents
spych multi --agents claude codex --listen-duration 8
```
### Multi-agent CLI Parameters

| Flag | Default | Description |
|---|---|---|
| `--agents` | *(required)* | One or more agent names to run: `claude` (`claude_code_cli`), `claude_sdk` (`claude_code_sdk`), `codex` (`codex_cli`), `gemini` (`gemini_cli`), `opencode` (`opencode_cli`), `ollama` |
| `--terminate-words` | `["terminate"]` | Words that stop all agents |
| `--listen-duration` | `5` | Seconds to listen after a wake word |
| `--continue-conversation` | `true` | Resume the most recent session for each coding agent |
| `--show-tool-events` | `true` | Print live tool start/end events |
| `--ollama-model` | `llama3.2:latest` | Ollama model. Only used when `ollama` is in `--agents` |
| `--ollama-host` | `http://localhost:11434` | Ollama instance URL. Only used when `ollama` is in `--agents` |
| `--ollama-history-length` | `10` | Ollama context history length. Only used when `ollama` is in `--agents` |
| `--opencode-model` | `None` | OpenCode model in `provider/model` format. Only used when `opencode` is in `--agents` |
| `--setting-sources` | `["user", "project", "local"]` | Claude Code SDK setting sources. Only used when `claude_sdk` is in `--agents` |
## Python

Use `SpychOrchestrator` directly to mix any combination of responders with custom wake words.

```python
from spych.core import Spych
from spych.orchestrator import SpychOrchestrator
from spych.agents.claude import LocalClaudeCodeCLIResponder
from spych.agents.ollama import OllamaResponder

spych_object = Spych(whisper_model="base.en")

SpychOrchestrator(
    entries=[
        {
            "responder": LocalClaudeCodeCLIResponder(spych_object=spych_object),
            "wake_words": ["claude", "clod", "cloud", "clawed"],
            "terminate_words": ["terminate"],
        },
        {
            "responder": OllamaResponder(spych_object=spych_object, model="llama3.2:latest"),
            "wake_words": ["llama", "ollama", "lama"],
        },
    ]
).start()
```
### `OrchestratorEntry` Keys

| Key | Required | Default | Description |
|---|---|---|---|
| `responder` | ✓ | - | A `BaseResponder` instance |
| `wake_words` | ✓ | - | Words that trigger this responder. Must be unique across all entries |
| `terminate_words` | | `["terminate"]` | Words that stop the entire orchestrator. Merged across all entries |
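Since wake words must be unique across entries, a quick sanity check like the following can catch collisions before starting the orchestrator (this helper is illustrative, not part of Spych):

```python
def assert_unique_wake_words(entries: list[dict]) -> None:
    """Raise if any wake word appears in more than one entry."""
    seen: set[str] = set()
    for entry in entries:
        for word in entry["wake_words"]:
            if word in seen:
                raise ValueError(f"Duplicate wake word across entries: {word!r}")
            seen.add(word)

assert_unique_wake_words([
    {"wake_words": ["claude", "clod"]},
    {"wake_words": ["llama"]},
])  # passes silently
```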
### `SpychOrchestrator` Parameters

| Parameter | Default | Description |
|---|---|---|
| `entries` | *(required)* | List of `OrchestratorEntry` dicts — see table above |
| `spych_wake_kwargs` | `None` | Extra kwargs forwarded to `SpychWake` (e.g. `whisper_model`, `wake_listener_count`) |
# Building Your Own Agent

Not using any of the above? No problem. Subclass `BaseResponder`, implement `respond`, and you're done. Spych handles the rest: listening, transcription, spinner UI, timing, error handling, all of it.

```python
from spych.responders import BaseResponder

class MyResponder(BaseResponder):
    def respond(self, user_input: str) -> str:
        return f"'{self.name}' heard: {user_input}"
```
A complete working example with a custom wake word:

```python
from spych import Spych, SpychOrchestrator
from spych.responders import BaseResponder

class MyResponder(BaseResponder):
    def respond(self, user_input: str) -> str:
        return f"'{self.name}' heard: {user_input}"

SpychOrchestrator(
    entries=[
        {
            "responder": MyResponder(
                spych_object=Spych(whisper_model="base.en"),
                listen_duration=5,
                name="TestResponder",
            ),
            "wake_words": ["test"],
            "terminate_words": ["terminate"],
        }
    ]
).start()
```
The orchestrator can also handle multiple custom agents at once, each with its own wake words. For example, you can make translation agents that listen for "Spanish" or "German" and route to the appropriate responder:

> Note: To run this example, you will need Ollama running and an Ollama model that can do translations. You can use `llama3.2:latest` or any other model you have set up for this purpose.
```python
from spych import Spych, SpychOrchestrator
from spych.agents import OllamaResponder

class Spanish(OllamaResponder):
    def respond(self, user_input: str) -> str:
        prompt = f"Translate the following text to Spanish and return only the translated text: '{user_input}'"
        return super().respond(prompt)

class German(OllamaResponder):
    def respond(self, user_input: str) -> str:
        prompt = f"Translate the following text to German and return only the translated text: '{user_input}'"
        return super().respond(prompt)

SpychOrchestrator(
    entries=[
        {
            "responder": Spanish(
                spych_object=Spych(whisper_model="base.en"),
                name="SpanishTranslator",
                model="llama3.2:latest",
            ),
            "wake_words": ["spanish"],
            "terminate_words": ["terminate"],
        },
        {
            "responder": German(
                spych_object=Spych(whisper_model="base.en"),
                name="GermanTranslator",
                model="llama3.2:latest",
            ),
            "wake_words": ["german"],
            "terminate_words": ["terminate"],
        },
    ]
).start()
```
## Custom Agent Contributions
Think your agent would be useful to others? Open a PR or file a feature request via a GitHub issue. Contributions are very welcome.
# Lower-Level API

Need more control? Use `SpychWake` and `Spych` directly.
## Listen and Transcribe

`Spych` records from the mic and returns a transcription string.

```python
from spych import Spych

spych = Spych(
    whisper_model="base.en",  # or tiny, small, medium, large -> all faster-whisper models work
    whisper_device="cpu",     # use "cuda" if you have an Nvidia GPU
)

print(spych.listen(duration=5))
```
See: https://connor-makowski.github.io/spych/spych/core.html
## Wake Word Detection

`SpychWake` runs multiple overlapping listener threads and fires a callback when a wake word is detected.

```python
from spych import SpychWake, Spych

spych = Spych(whisper_model="base.en", whisper_device="cpu")

def on_wake():
    print("Wake word detected! Listening...")
    print(spych.listen(duration=5))

wake = SpychWake(
    wake_word_map={"speech": on_wake},
    whisper_model="tiny.en",
    whisper_device="cpu",
)

wake.start()
```
See: https://connor-makowski.github.io/spych/spych/wake.html
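Conceptually, `wake_word_map` pairs each trigger word with a callback; matching a transcript against the map can be pictured like this (a simplified illustration, not Spych's actual detection logic):

```python
from typing import Callable, Optional

def match_wake_word(transcript: str, wake_word_map: dict[str, Callable]) -> Optional[Callable]:
    """Return the callback for the first wake word found in the transcript."""
    lowered = transcript.lower()
    for wake_word, callback in wake_word_map.items():
        if wake_word in lowered:
            return callback
    return None

callback = match_wake_word("Hey Speech, what's up?", {"speech": lambda: "woke"})
if callback is not None:
    callback()
```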
# API Reference
Full docs including all parameters and methods: https://connor-makowski.github.io/spych/spych.html
# Support

Found a bug or want a new feature? [Open an issue on GitHub](https://github.com/connor-makowski/spych/issues).
# Contributing

Contributions are welcome!

1. Fork the repo and clone it locally.
2. Make your changes.
3. Run tests and make sure they pass.
4. Commit atomically with clear messages.
5. Submit a pull request.
**Virtual environment setup:**

```bash
python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
./utils/test.sh
```