New: the plugin marketplace is live. Teach it new tricks.

Your Mac just learned to listen.

Press a hotkey and talk. Cursor Voice sees your screen, clicks and types for you, and answers out loud — a native macOS assistant that actually does the thing.

macOS 14+ · Apple Silicon · your own OpenAI key · open source

100%: open source · MIT
~3 MB: native Swift, no Electron
Your key: BYO OpenAI key, your data
v0.8: shipping fast, in public

Just say it

One sentence. Done.

Why it's different

It doesn't just answer.
It acts.

Chatbots talk. Siri shrugs. Cursor Voice reads the actual pixels in front of you and operates your Mac like a careful pair of hands.

Sees your screen

Screenshots on demand, OCR with Apple Vision, and a fresh capture after every action so it can verify what it just did — and self-correct when something didn't take.
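
The act-then-verify loop described above can be sketched in Swift roughly like this. It is a minimal illustration, not Cursor Voice's actual code: `ActionOutcome`, `performVerified`, `act`, and `verify` are hypothetical names, and the fresh screen capture is stubbed out as a closure.

```swift
/// Hypothetical result of one verified action (illustration only).
enum ActionOutcome: Equatable {
    case verified(attempts: Int)
    case gaveUp(attempts: Int)
}

/// Runs `act`, then calls `verify` (standing in for a fresh screen capture
/// plus a check), retrying until it passes or `maxAttempts` is reached.
func performVerified(maxAttempts: Int,
                     act: () -> Void,
                     verify: () -> Bool) -> ActionOutcome {
    var attempts = 0
    while attempts < maxAttempts {
        attempts += 1
        act()
        if verify() { return .verified(attempts: attempts) }
    }
    return .gaveUp(attempts: attempts)
}

// Usage: a fake checkbox click that only "takes" on the second try.
var clicks = 0
var isChecked = false
let outcome = performVerified(maxAttempts: 3,
                              act: { clicks += 1; isChecked = clicks >= 2 },
                              verify: { isChecked })
```

Capping the retries keeps a stubborn UI from trapping the assistant in an endless click loop.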

Clicks by name, not by guess

It targets real UI elements through the macOS Accessibility tree first, falls back to on-screen text, and only simulates the mouse as a last resort.
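
As a rough sketch of that fallback order, assuming hypothetical names throughout (`TargetStrategy`, `click`, and the `attempt` closure are illustrative stand-ins, not the app's real API), the targeting logic might look like this:

```swift
/// Hypothetical targeting strategies, declared in the order the text describes.
enum TargetStrategy: String, CaseIterable {
    case accessibilityTree   // a real UI element found via the Accessibility tree
    case onScreenText        // matching text located on screen
    case simulatedMouse      // raw mouse simulation, the last resort
}

/// Tries each strategy in declaration order and returns the first that succeeds,
/// or nil if none does. `attempt` stands in for the real lookup-and-click logic.
func click(_ label: String,
           attempt: (TargetStrategy, String) -> Bool) -> TargetStrategy? {
    return TargetStrategy.allCases.first { attempt($0, label) }
}

// Usage: pretend the Accessibility lookup fails but on-screen text matches.
let used = click("Send", attempt: { strategy, _ in strategy == .onScreenText })
```

Trying the most precise strategy first means the blunt one runs only when nothing better is available.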

Types & dictates

Dictate into any field, draft replies, fill forms — it types where your cursor is.

Live web answers

Searches and reads pages in real time instead of guessing from stale training data.

Remembers you

Durable local memory across sessions — your apps, paths, and habits. On your disk, not a server.

Extend it with plugins

A plugin is one small JSON file. Install community tools in a click from the marketplace — or publish your own and it ships after an automatic safety review.

Teach it macros

Show it a routine once — “record a macro called morning setup” — and replay it forever with one sentence. Your skills, saved on your Mac.

Costs pennies, shows the meter

You bring your own OpenAI key and watch spend live under the orb — session cost, credit remaining, no subscription, no markup.

Built for hands-free

Vision-assist describes the screen aloud for low-vision use; hands-free mode runs the whole Mac without touching mouse or keyboard.

How it works

Hotkey. Speak. Done.

1. Summon the orb

Press the hotkey and a glowing orb pops up next to your cursor, in any app, over anything. Or just say “Hey Cursor.”

2. Say what you want

Natural language, no commands to memorize. It looks at your screen for context, asks if it's unsure, and you can interrupt it mid-sentence.

3. Watch it work

It clicks, types, opens, searches — narrating just enough — then verifies the result with a fresh look at the screen.

The honest comparison

“Hey Siri” can't do this.

Siri sets timers. Cursor Voice operates your Mac. Full comparison →

Capability                       Cursor Voice   Siri
Sees what's on your screen       yes            no
Clicks & types in any app        yes            no
Holds a real conversation        yes            no
Live web answers with sources    yes            limited
Extensible with plugins          yes            no
Open source (audit every line)   yes            no

Install

Running in under a minute.

Pick your flavor. Paste your OpenAI key in Settings, grant permissions once, and start talking.

$ curl -fsSL https://raw.githubusercontent.com/cursorvoice/cursor-voice/main/install.sh | bash

Downloads the latest release, installs to /Applications, and launches it. That's it.

$ brew install --cask cursorvoice/cursor-voice/cursor-voice
$ brew upgrade --cask cursor-voice # update later

The cask clears quarantine automatically — no right-click-Open dance.

Prefer dragging it in yourself?

Download the DMG

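First launch: right-click the app → Open. On macOS 15 and later that shortcut no longer bypasses Gatekeeper, so try to open the app once and then approve it under System Settings → Privacy & Security → Open Anyway. (It's self-signed, not notarized, and fully open source, so you can read exactly what it does.)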

It asks for mic, screen recording & accessibility. That's the whole point: hear you, see the screen, act for you. Granted once, kept across updates. A guided checklist walks you through it.

Questions

Fair questions, straight answers.

What does it cost to run?
The app is free and open source. You bring your own OpenAI API key and pay OpenAI directly per use — typical sessions cost a few cents. A live cost meter under the orb and a Usage tab show exactly what you're spending, and you can set a credit budget so there are no surprises.
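
A credit budget like the one described can be modeled with a small running total. The sketch below is an assumption-laden illustration: `SpendMeter` and its members are hypothetical names, and the amounts are made-up figures tracked in whole cents to avoid floating-point rounding.

```swift
/// Hypothetical session spend meter (illustration only).
struct SpendMeter {
    let budgetCents: Int
    var spentCents = 0

    init(budgetCents: Int) { self.budgetCents = budgetCents }

    /// Credit left before the budget is exhausted (never negative).
    var remainingCents: Int { max(0, budgetCents - spentCents) }

    /// Records one request's cost; returns false once the budget is exceeded.
    @discardableResult
    mutating func record(costCents: Int) -> Bool {
        spentCents += costCents
        return spentCents <= budgetCents
    }
}

// Usage: a one-dollar budget and a three-cent request (made-up figures).
var meter = SpendMeter(budgetCents: 100)
let withinBudget = meter.record(costCents: 3)
```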
Is my screen being streamed somewhere?
No. Screenshots are taken on demand, when you ask something that needs eyes, and go directly from your Mac to OpenAI's API over your own key. There's no middle server, no account with us, and Cursor Voice stores nothing off your machine. Memory lives in a local file you can read and delete. Privacy policy.
Why does it need accessibility & screen recording?
Screen recording lets it see what you see; accessibility lets it click buttons and type the way assistive tech does. Both are standard macOS permissions you grant once. Risky shell commands are blocked by default, there's a dry-run mode that narrates instead of acting, and the whole codebase is on GitHub if you want to verify any of this.
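
To make the blocked-by-default and dry-run behavior concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `CommandDecision`, `riskyPatterns`, and `decide` are hypothetical names, and the pattern list is a toy example rather than the app's real blocklist.

```swift
import Foundation

/// Hypothetical decision for a shell command the assistant wants to run.
enum CommandDecision: Equatable {
    case blocked(reason: String)
    case narrate(String)   // dry-run: describe the action instead of doing it
    case run(String)
}

/// Illustrative patterns only; a real blocklist would be far more thorough.
let riskyPatterns = ["rm -rf", "sudo ", "mkfs", "dd if="]

/// Blocks risky commands first, then either narrates (dry-run) or runs.
func decide(command: String, dryRun: Bool) -> CommandDecision {
    if let hit = riskyPatterns.first(where: { command.contains($0) }) {
        return .blocked(reason: "matches risky pattern '\(hit)'")
    }
    return dryRun ? .narrate("Would run: \(command)") : .run(command)
}

// Usage: the same harmless command in dry-run mode and in normal mode.
let preview = decide(command: "ls ~/Desktop", dryRun: true)
let real = decide(command: "ls ~/Desktop", dryRun: false)
```

Checking the blocklist before the dry-run flag means a risky command is refused even when nothing would actually be executed.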
Can it really interrupt / be interrupted?
Yes — it streams audio both ways over the OpenAI Realtime API, so you can cut it off mid-sentence and it stops and listens. Echo rejection keeps it from interrupting itself on speakers, and a push-to-talk mode is there if you prefer hold-to-speak.
Is this a Siri replacement?
Different league. Siri answers trivia and sets timers; Cursor Voice operates your Mac — it reads the screen, clicks, types, manages files and windows, and holds a conversation while doing it. See the full comparison. (Not affiliated with Apple or OpenAI.)
It's an early beta — what does that mean?
It ships fast and in public: updates land weekly, sometimes daily. It occasionally misses a click, and it takes a fresh screenshot after every action so it can catch and fix that itself. Found a bug? Open an issue or email support@cursorvoice.app.

Stop clicking.
Start talking.

Free, open source, yours. One command and your Mac is listening.