Plugins·Active

vocal.nvim

Neovim plugin for speech-to-text transcription via Whisper API or local model.

Neovim · Lua · Speech-to-text · Whisper

Overview

vocal.nvim lets you record audio from within Neovim and transcribe it with either the OpenAI Whisper API or a local Whisper model. The transcribed text is inserted at the cursor position — or replaces a visual selection — without leaving the editor.

How it works

Run :Vocal to start recording, speak, then run :Vocal again to stop and transcribe. The plugin records via sox, sends the audio to Whisper (local or API), and inserts the result asynchronously so Neovim stays responsive.
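As a rough illustration of that flow, here is a minimal sketch, not the plugin's actual source. It assumes sox and the openai-whisper `whisper` CLI are on PATH and uses Neovim's vim.system for both subprocesses. The names record_cmd, whisper_cmd, transcribe, and toggle are hypothetical, and the transcript file name is an assumption that may differ between whisper versions.

```lua
-- Sketch only: helper names are hypothetical, not vocal.nvim's real internals.
local state = { proc = nil }

-- sox reads from the default input device (-d) and writes a WAV file.
local function record_cmd(wav_path)
  return { "sox", "-d", wav_path }
end

-- openai-whisper CLI call that writes a plain-text transcript into out_dir.
local function whisper_cmd(wav_path, model, out_dir)
  return { "whisper", wav_path, "--model", model,
           "--output_format", "txt", "--output_dir", out_dir }
end

-- Run whisper in the background, then insert the transcript at the cursor.
local function transcribe(wav_path)
  local out_dir = vim.fs.dirname(wav_path)
  vim.system(whisper_cmd(wav_path, "base", out_dir), { text = true },
    -- schedule_wrap defers the callback until Neovim API calls are safe.
    vim.schedule_wrap(function(res)
      if res.code ~= 0 then
        vim.notify("whisper failed: " .. (res.stderr or ""), vim.log.levels.ERROR)
        return
      end
      -- Assumption: the transcript is named after the recording, with .txt.
      local txt_path = wav_path:gsub("%.wav$", ".txt")
      local file = io.open(txt_path, "r")
      if not file then
        vim.notify("no transcript found at " .. txt_path, vim.log.levels.WARN)
        return
      end
      local text = vim.trim(file:read("*a"))
      file:close()
      vim.api.nvim_put(vim.split(text, "\n", { plain = true }), "c", true, true)
    end))
end

-- First call starts recording; second call stops sox, which triggers transcription.
local function toggle()
  if state.proc then
    state.proc:kill("sigint") -- asks sox to stop and finish writing the file
    state.proc = nil
    return
  end
  local wav_path = vim.fn.tempname() .. ".wav"
  state.proc = vim.system(record_cmd(wav_path), {},
    vim.schedule_wrap(function() transcribe(wav_path) end))
end
```

A function like toggle could be exposed through vim.api.nvim_create_user_command. Because both subprocesses run with an exit callback, nothing here waits on sox or whisper, which is what keeps the editor responsive.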

Configuration

require("vocal").setup({
  local_model = {
    model = "base",     -- Whisper model size
    path = "~/whisper", -- directory for the downloaded model files
  },
})

Local model transcription is the default and works without an API key. Swap to the API by setting api_key and omitting local_model.
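For comparison, an API-based setup might look like the following. This is a sketch: api_key is the option named above, while reading it from an OPENAI_API_KEY environment variable is an assumption made here to keep the key out of the config file.

```lua
require("vocal").setup({
  -- Assumption: the key is exported in the shell as OPENAI_API_KEY.
  api_key = os.getenv("OPENAI_API_KEY"),
  -- local_model is omitted so transcription goes through the API.
})
```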

What I learned

The gap between "record audio" and "get clean text in the buffer" is bigger than it looks. Sox handles recording fine, but getting asynchronous Whisper inference (especially the local Python subprocess) to report back to Neovim without blocking required careful use of vim.loop (exposed as vim.uv in current Neovim) and job control. The visual-mode replace path also needed separate handling, since nvim_buf_set_text behaves differently with active selections.
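One way the visual-mode replace path could look is sketched below. It is an illustration under stated assumptions, not the plugin's code: selection_range and replace_selection are hypothetical names, the '< and '> marks are assumed to be set already (Neovim sets them when visual mode ends), and only charwise and linewise selections are handled.

```lua
-- Convert the '< and '> marks (1-based row, 0-based end-inclusive column)
-- into the 0-based, end-exclusive range that nvim_buf_set_text expects.
local function selection_range(start_mark, end_mark, end_line_len)
  local start_row, start_col = start_mark[1] - 1, start_mark[2]
  local end_row = end_mark[1] - 1
  -- Linewise selections report a very large end column, so clamp it
  -- to the byte length of the last selected line.
  local end_col = math.min(end_mark[2] + 1, end_line_len)
  return start_row, start_col, end_row, end_col
end

-- Replace the last visual selection in the current buffer with `text`.
-- Limitation: assumes the selection ends on a single-byte character.
local function replace_selection(text)
  local buf = vim.api.nvim_get_current_buf()
  local s = vim.api.nvim_buf_get_mark(buf, "<")
  local e = vim.api.nvim_buf_get_mark(buf, ">")
  local end_line = vim.api.nvim_buf_get_lines(buf, e[1] - 1, e[1], true)[1]
  local sr, sc, er, ec = selection_range(s, e, #end_line)
  local lines = vim.split(text, "\n", { plain = true })
  vim.api.nvim_buf_set_text(buf, sr, sc, er, ec, lines)
end
```

Keeping the index arithmetic in a pure function makes the off-by-one cases easy to check outside the editor, while replace_selection stays a thin wrapper around the API calls.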

Stack

  • Language: Lua
  • Audio: sox
  • Transcription: OpenAI Whisper API or local openai-whisper Python package
  • Targets: Neovim 0.11+