How it works
One key. No windows, no buttons, no browser tab.
The whole interaction fits between pressing and releasing a single key — by default F12, but any key you like.
Hold F12
While AgentWhisper runs, the key belongs to it exclusively — no other program sees it, so nothing fires by accident.
Speak
A translucent panel floats at the bottom of your screen, green bars dancing to your voice — you always know when the mic is live.
Release
Your words are typed into the window you were working in and land in your clipboard as backup. A notification shows what was heard.
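As a rough illustration of the release step, the sketch below copies a transcript to the clipboard and then types it into the focused window. It assumes the `xclip` and `xdotool` command-line tools from the install list are what does the work; the function names are hypothetical and this is not taken from AgentWhisper's source.

```python
import subprocess


def clipboard_command() -> list[str]:
    # xclip reads the text to store from standard input.
    return ["xclip", "-selection", "clipboard"]


def type_command(text: str) -> list[str]:
    # "--" ends option parsing, so dictated text that starts with "-"
    # is typed literally instead of being read as a flag.
    return ["xdotool", "type", "--clearmodifiers", "--", text]


def deliver(text: str) -> None:
    """Copy the transcript to the clipboard, then type it into the focused window."""
    # Clipboard first: if typing fails, the text is still recoverable.
    subprocess.run(clipboard_command(), input=text.encode("utf-8"), check=True)
    subprocess.run(type_command(text), check=True)
```

Calling `deliver("hello world")` on an X11 session with both tools installed should type the phrase at the cursor and leave a copy in the clipboard.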
Private by architecture
Your voice never leaves this machine.
Speech recognition runs locally through faster-whisper — OpenAI's Whisper model on your own CPU. Nothing to sign up for, nothing phoning home, nothing to subscribe to.
One guided ~140 MB model download on first run, then AgentWhisper works entirely offline. Unplug the router; it won't notice.
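For readers curious what local transcription looks like, here is a minimal sketch using the faster-whisper Python package on the CPU. The model name `base.en`, the `int8` compute type, and both function names are assumptions chosen for illustration; the page does not say which settings AgentWhisper uses.

```python
def join_segments(segments) -> str:
    """Join the text of transcription segments into a single line."""
    parts = (seg.text.strip() for seg in segments)
    return " ".join(p for p in parts if p)


def transcribe_locally(wav_path: str, model_name: str = "base.en") -> str:
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    # device="cpu" keeps inference on this machine; int8 reduces memory use.
    # model_name is an assumed default, not a documented AgentWhisper setting.
    model = WhisperModel(model_name, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(wav_path, language="en")
    return join_segments(segments)
```

The first call downloads the model weights if they are not cached yet; after that, `transcribe_locally("clip.wav")` needs no network access.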
Features
A small tool, finished properly.
Everything you need for daily dictation and nothing you don't — controlled from a tray icon, a config file, or your terminal.
Lives in your system tray
A microphone icon that turns red while recording. Every control is two clicks away — this is the entire menu:
Scriptable from the terminal
The same controls as the tray, for your keybindings and scripts.
Two recording modes
Hold-to-talk, or tap once to start and once to stop. Switch anytime from the tray.
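The difference between the two modes comes down to how key events map to recording state. The sketch below is one way to model it, with hypothetical class and method names; it also ignores keyboard auto-repeat, which is an assumption about how a held key might be reported.

```python
from enum import Enum


class Mode(Enum):
    HOLD = "hold"      # record only while the key is down
    TOGGLE = "toggle"  # tap once to start, tap again to stop


class HotkeyRecorder:
    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False
        self._key_down = False

    def on_press(self) -> None:
        if self._key_down:
            return  # auto-repeat while the key stays down: ignore
        self._key_down = True
        if self.mode is Mode.HOLD:
            self.recording = True
        else:
            self.recording = not self.recording

    def on_release(self) -> None:
        self._key_down = False
        if self.mode is Mode.HOLD:
            self.recording = False
```

In hold mode the release event ends the recording; in toggle mode release is a no-op and only the next press stops it.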
Any key you like
F1–F12, Scroll Lock, Pause… reserved exclusively while it runs.
Configured in one commented file
Pick the key, the model, auto-type behavior and recording limits in ~/.config/agentwhisper/config.toml.
A first run that holds your hand
The speech model downloads in the background with live progress in the tray. Dictations made meanwhile are queued, not lost.
Install
Two minutes, in or out.
Built for Debian/Ubuntu with X11 and Python 3.11+. Everything lands in predictable places, and uninstall.sh removes it all again.
Full walkthrough and troubleshooting in the installation guide.
# system packages for tray, typing, clipboard (most preinstalled on XFCE)
sudo apt install python3-gi python3-gi-cairo \
    gir1.2-ayatanaappindicator3-0.1 xclip xdotool libnotify-bin

# grab the .deb from the latest release, then:
sudo apt install ./agentwhisper_0.3.7_all.deb

# no root needed: installs into your home directory
git clone https://github.com/ChrisSchroedinger/agentwhisper.git
cd agentwhisper
./install.sh
Then start AgentWhisper from your applications menu — the mic appears in your tray.
Roadmap
The name is the roadmap.
v0.3 does everything on the box today. Where it goes next:
Core dictation
Exclusive hotkey, local transcription, auto-type, tray menu, voice visualizer, autostart, guided model download, .deb package.
AppImage
One file that runs on any distro — no install script, no packaging dance.
More languages & Wayland
v0.3 ships English models; the engine underneath already speaks 90+ languages.
Agent mode
Push-to-talk straight into your AI coding agent: speak to your computer, watch it work.