Overview
vocal.nvim lets you record audio from within Neovim and transcribe it with either the OpenAI Whisper API or a local Whisper model. The transcribed text is inserted at the cursor position — or replaces a visual selection — without leaving the editor.
How it works
Run :Vocal to start recording, speak, then run :Vocal again to stop and transcribe. The plugin records via sox, sends the audio to Whisper (local or API), and inserts the result asynchronously so Neovim stays responsive.
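The toggle behaviour described above can be sketched as a small state machine. This is a minimal illustration, not the plugin's actual code: `new_toggle`, `recorder`, `transcribe`, and `insert` are hypothetical names, and the recorder and transcriber are passed in so the flow can be read (and run) without Neovim or sox.

```lua
-- Hypothetical sketch of the :Vocal toggle; names are illustrative only.
-- recorder.start() returns a handle; recorder.stop(handle) returns the
-- path of the recorded file. transcribe(path, callback) is assumed to
-- return immediately and call callback(text) once Whisper has finished.
local function new_toggle(recorder, transcribe, insert)
  local active = nil -- handle of the running recording, nil when idle

  return function()
    if not active then
      -- First call: start recording.
      active = recorder.start()
      return "recording"
    end

    -- Second call: stop, then hand the file to the transcriber.
    local path = recorder.stop(active)
    active = nil
    transcribe(path, insert)
    return "transcribing"
  end
end
```

Inside Neovim, `recorder.start` could wrap a sox process launched with `vim.system`, and `insert` would typically be deferred with `vim.schedule` so that buffer edits happen on the main loop.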
Configuration
require("vocal").setup({
  local_model = {
    model = "base",
    path = "~/whisper",
  },
})
Local model transcription is the default and works without an API key. Swap to the API by setting api_key and omitting local_model.
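The selection rule in the previous paragraph can be written out as a tiny function. This is a sketch of the rule as stated here, not the plugin's internals: `pick_backend` is a hypothetical name, and only the `local_model` and `api_key` option names come from the configuration above.

```lua
-- Hypothetical helper illustrating the backend rule; not plugin API.
local function pick_backend(opts)
  opts = opts or {}
  if opts.local_model then
    -- A local_model table always selects local transcription.
    return "local"
  elseif opts.api_key then
    -- The API is used only when api_key is set and local_model is omitted.
    return "api"
  end
  -- Nothing configured: fall back to the default, local transcription.
  return "local"
end
```

Under that rule, an API setup would look something like `require("vocal").setup({ api_key = os.getenv("OPENAI_API_KEY") })`. Reading the key from an environment variable is an assumption made here to keep it out of the config file.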
What I learned
The gap between "record audio" and "get clean text in the buffer" is bigger than it looks. Sox handles recording fine, but getting asynchronous Whisper inference — especially the local Python subprocess — to report back to Neovim without blocking required careful use of vim.loop and job control. The visual-mode replace path also needed separate handling since nvim_buf_set_text behaves differently with active selections.
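One part of the visual-mode path that is easy to get wrong is index conversion. Visual marks are 1-based with an inclusive end, while `nvim_buf_set_text` takes 0-based rows and byte columns with an exclusive end. The helper below is a minimal sketch of that conversion under stated assumptions. `selection_to_range` is a hypothetical name, and this is not necessarily how the plugin handles it.

```lua
-- Hypothetical helper: convert visual marks to nvim_buf_set_text indices.
-- start_pos and end_pos are {line, col} pairs, 1-based, as found in
-- items 2 and 3 of getpos("'<") and getpos("'>").
-- end_line_len is the byte length of the last selected line.
local function selection_to_range(start_pos, end_pos, end_line_len)
  local start_row = start_pos[1] - 1
  local start_col = start_pos[2] - 1
  local end_row = end_pos[1] - 1
  -- A 1-based inclusive end column equals the 0-based exclusive one.
  -- Clamp it: linewise selections report a very large end column, and
  -- an empty line reports column 1 with a length of 0.
  -- Assumption: the last selected character is a single byte.
  local end_col = math.min(end_pos[2], end_line_len)
  return start_row, start_col, end_row, end_col
end
```

The four values map onto `vim.api.nvim_buf_set_text(0, start_row, start_col, end_row, end_col, lines)`, where `lines` is the transcription split into a list of strings. When that call is made from a process-exit callback, wrapping it in `vim.schedule` keeps the buffer edit on the main loop.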
Stack
- Language: Lua
- Audio: sox
- Transcription: OpenAI Whisper API or local openai-whisper Python package
- Targets: Neovim 0.11+