Voice Interaction

System-wide voice transcription, commands, and content generation.

Tabby provides a set of voice interaction tools that work system-wide on Windows, in whichever application has focus. They are separate from the interactive Voice Agent and are designed for quick, heads-up tasks.

Voice Interaction Modes

You can toggle voice interaction from anywhere using global shortcuts. Each mode is represented by a compact on-screen indicator.

Transcribe Mode (Ctrl + Alt + T)

The standard mode for converting speech to text.

  • Action: Tabby listens to your voice and types the transcription directly into your active application.
  • Use Case: Ideal for rapid messaging, taking notes, or drafting content without using your hands.

Command Mode

Transform your voice into system-level actions.

  • Action: Tabby parses your intent and executes desktop commands using Windows MCP.
  • Capability: Can open applications ("Open Chrome"), navigate to URLs ("Go to YouTube"), or perform multi-step workflows ("Open Notepad and type a grocery list").
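
To make the multi-step example above concrete, here is a minimal sketch of how a spoken command might be broken into ordered steps. It is a toy keyword-based splitter written for illustration only; it is not how Tabby parses intent, and it does not call Windows MCP. The `Step` type, the `plan_steps` function, and the action names (`open_app`, `navigate`, `type_text`) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    action: str  # hypothetical action name, not a real Tabby or MCP identifier
    target: str  # what the action applies to (app name, site, or text)


def plan_steps(utterance: str) -> list[Step]:
    """Split an utterance on ' and ' and map each clause to a toy step."""
    steps = []
    for clause in utterance.split(" and "):
        clause = clause.strip()
        lowered = clause.lower()
        if lowered.startswith("open "):
            steps.append(Step("open_app", clause[len("open "):]))
        elif lowered.startswith("go to "):
            steps.append(Step("navigate", clause[len("go to "):]))
        elif lowered.startswith("type "):
            steps.append(Step("type_text", clause[len("type "):]))
        else:
            # Anything the toy rules do not recognise is kept, not dropped.
            steps.append(Step("unknown", clause))
    return steps


for step in plan_steps("Open Notepad and type a grocery list"):
    print(step.action, "->", step.target)
```

The point of the sketch is the shape of the result: one utterance becomes an ordered list of small actions, which is what allows a phrase like "Open Notepad and type a grocery list" to run as a workflow rather than a single command.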

Generate Mode

Instruction-based content creation.

  • Action: Tabby treats your transcribed speech as an instruction and generates new content from it.
  • Capability: Optimized for generating professional emails, code snippets, or formatted text based on your personal context and memory.

Global Shortcuts

Shortcut            Action
Ctrl + Alt + T      Toggle Voice Transcription (start/stop listening)
Ctrl + Shift + T    Cycle through modes (Transcribe, Command, Generate)
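
The behavior of the two shortcuts can be modeled as a small state machine. The sketch below is illustrative, not Tabby's implementation. It assumes that Transcribe is the starting mode, that cycling wraps from Generate back to Transcribe, and that changing mode leaves the listening state untouched; none of these details are confirmed by this page. The names `Mode`, `VoiceState`, and `handle_shortcut` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    TRANSCRIBE = "Transcribe"
    COMMAND = "Command"
    GENERATE = "Generate"


# Order as listed in the shortcut table: Transcribe, Command, Generate.
CYCLE_ORDER = [Mode.TRANSCRIBE, Mode.COMMAND, Mode.GENERATE]


@dataclass(frozen=True)
class VoiceState:
    mode: Mode = Mode.TRANSCRIBE  # assumed default mode
    listening: bool = False


def handle_shortcut(state: VoiceState, shortcut: str) -> VoiceState:
    """Return the state that results from pressing a global shortcut."""
    if shortcut == "Ctrl+Alt+T":
        # Toggle listening on or off; the mode is unchanged.
        return VoiceState(state.mode, not state.listening)
    if shortcut == "Ctrl+Shift+T":
        # Advance to the next mode, wrapping around at the end (assumption).
        index = (CYCLE_ORDER.index(state.mode) + 1) % len(CYCLE_ORDER)
        return VoiceState(CYCLE_ORDER[index], state.listening)
    # Unrecognised shortcuts leave the state as it was.
    return state
```

Keeping the toggle and the cycle as independent transitions mirrors the table: one shortcut controls whether Tabby is listening, the other controls what it does with what it hears.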

Visual Indicator

When voice interaction is active, a floating pill-shaped indicator appears on your screen to show:

  • Current Mode: Which of the three modes is currently active.
  • Mic Status: Whether the system is Recording or Processing your audio.
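
As a rough illustration, the indicator can be thought of as a label built from those two pieces of information. The sketch below is an assumption about the data involved, not a description of Tabby's UI code; `MicStatus`, `indicator_label`, and the `|` separator are hypothetical.

```python
from enum import Enum

# The three modes described on this page.
MODES = ("Transcribe", "Command", "Generate")


class MicStatus(Enum):
    RECORDING = "Recording"
    PROCESSING = "Processing"


def indicator_label(mode: str, status: MicStatus) -> str:
    """Build the text a pill-style indicator could display."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # The separator is an arbitrary choice made for this sketch.
    return f"{mode} | {status.value}"


print(indicator_label("Command", MicStatus.RECORDING))
```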

Voice Agent vs. Voice Interaction

The Voice Agent and the Voice Interaction modes serve different purposes:

  • Voice Agent: A full-screen interactive experience for natural, back-and-forth conversations with the AI (similar to a phone call). Accessed via the Action Menu or Ctrl + Alt + J.
  • Voice Interaction: A heads-up, utility-focused tool for transcribing, commanding, or generating text directly into other apps. Accessed via Ctrl + Alt + T.