First-party disclosure: we build transcribe.so, a pay-as-you-go transcription service. This page is the canonical statement of how our pricing works, with verified numbers for the subscription tools people compare us against. Where a subscription is genuinely the better deal, we say so.
Most transcription tools want a monthly subscription. The problem: almost nobody transcribes the same number of hours every month. Students transcribe during exam season. Podcasters skip weeks. Researchers batch interviews into one intense month and then go quiet. A subscription prices all of that as if usage were flat, and the industry's two favorite tricks, monthly minute caps and quiet auto-renewals, make the mismatch expensive.
TL;DR. If your volume is irregular, or below roughly 10 hours a month, pay-as-you-go beats every subscription we measured: a one-hour file costs about $1.12 to $3.23 on transcribe.so depending on the model, billed per minute, with credits that never expire. If you reliably transcribe 15+ hours of English every single month and never need an API, a flat-rate plan like TurboScribe ($120/year, with fair-use throttling) is genuinely cheap. For heavy fixed volume with team seats, a subscription, ours or someone else's, wins on rate.
Why subscriptions fit transcription badly
Three patterns show up across the major tools (all checked June 12, 2026):
- Minute caps inside "paid" plans. Otter Pro is capped at 1,200 monthly minutes and 90 minutes per conversation; the free tier allows 300 minutes a month but only 3 file imports for the lifetime of the account. Notta Pro caps at 1,800 minutes a month. Happy Scribe sells 120 to 6,000 minutes a month by tier, with overage at $0.20/min.
- "Unlimited" with fine print. Trint's pricing FAQ excludes "archival projects, continuous live transcription and bulk volume projects" from its unlimited plans. TurboScribe's help docs describe a High Volume Mode where uploads queue one at a time and can be rejected with "Try again later."
- Auto-renewal complaints. Trustpilot reviews for Otter, Trint, TurboScribe, and Transkriptor repeat the same story: a forgotten monthly or annual renewal, then a refused refund. That risk is structural to subscriptions: you pay for the months you forget.
None of this is fraud. It is just a pricing model optimized for predictable corporate usage, sold to people whose usage is not predictable.
What pay-as-you-go means here
On transcribe.so there is no required subscription and no monthly cap. The model:
- Prepaid wallet. You add credits when you want. The balance never expires. New accounts get $1 of free credit to start, no card required.
- Billed per minute. Each job is priced by audio length and the model you pick. You see the exact quote before you confirm; if the final cost comes in lower, the difference is released back to your wallet.
- No postpaid billing. Nothing renews, nothing invoices you later, and an idle month costs $0.
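One way to picture the quote-then-release flow is the sketch below. It is illustrative only: the `Wallet` class and its method names are hypothetical, not transcribe.so's actual implementation or API, and it assumes the final charge never exceeds the quote.

```python
from decimal import Decimal


class Wallet:
    """Hypothetical prepaid wallet: reserve the quote, settle at the final cost."""

    def __init__(self, balance: Decimal = Decimal("0")) -> None:
        self.balance = balance      # spendable credits; note there is no expiry logic
        self.held = Decimal("0")    # credits reserved for in-flight jobs

    def top_up(self, amount: Decimal) -> None:
        self.balance += amount

    def hold(self, quote: Decimal) -> None:
        """Reserve the quoted amount up front; refuse instead of billing later."""
        if quote > self.balance:
            raise ValueError("insufficient credits: top up before confirming")
        self.balance -= quote
        self.held += quote

    def settle(self, quote: Decimal, final_cost: Decimal) -> None:
        """Charge the final cost (assumed capped at the quote) and release the rest."""
        charged = min(final_cost, quote)
        self.held -= quote
        self.balance += quote - charged


wallet = Wallet(Decimal("1.00"))                  # the $1 starter credit
wallet.hold(Decimal("0.72"))                      # 30 minutes at $0.024/min
wallet.settle(Decimal("0.72"), Decimal("0.70"))   # final cost came in lower
print(wallet.balance)                             # 0.30
```

Because nothing is charged beyond what was reserved, a month with no jobs leaves the balance untouched.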
Current rates by model
Rates from the transcribe.so pricing page, June 12, 2026:
| Model | Rate | Per minute | Best for |
|---|---|---|---|
| Voxtral V2 | $1.12/hr | about $0.019/min | lowest cost, native diarization |
| Qwen3-ASR Flash | $1.44/hr | $0.024/min | default: long files, strong Korean/Japanese/Chinese |
| GPT-4o Transcribe Diarize | $3.23/hr | about $0.054/min | meetings and interviews, best speaker labels |
Speaker diarization is included in the model price on every pipeline that supports it. There is no separate add-on fee, ever.
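To estimate a job before uploading, prorate the hourly rate by audio length. The sketch below uses the rates from the table above; the model keys are made-up identifiers, and whole-minute input with rounding to the nearest cent is an assumption rather than a documented billing rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hourly rates copied from the table above (USD per audio hour).
# The dictionary keys are hypothetical labels, not official model IDs.
HOURLY_RATES = {
    "voxtral-v2": Decimal("1.12"),
    "qwen3-asr-flash": Decimal("1.44"),
    "gpt-4o-transcribe-diarize": Decimal("3.23"),
}


def quote(model: str, audio_minutes: int) -> Decimal:
    """Estimate a job's cost: hourly rate prorated per minute, rounded to cents."""
    cost = HOURLY_RATES[model] * audio_minutes / 60
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for model in HOURLY_RATES:
    print(model, quote(model, 45))   # estimated cost of a 45-minute file per model
```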
For accuracy context: Qwen3-ASR places #4 of 80+ models on the public Open ASR Leaderboard (May 2026 snapshot), and on the published FLEURS benchmark it scores 2.07% word error rate for Korean, 3.09% for Japanese, and 2.38% for Mandarin (Qwen3-ASR paper, Table A.2). Those are published numbers, not our marketing.
The math at three volumes
Annual cost, using each vendor's own published prices (all checked June 12, 2026). transcribe.so rows use Qwen3-ASR Flash at $1.44/hr; the GPT-4o diarized rate is shown where speaker labels matter.
| | Student, 3 hr/mo | Podcaster, 10 hr/mo | Team, 40 hr/mo |
|---|---|---|---|
| transcribe.so pay-as-you-go | $51.84/yr ($116.28 with GPT-4o diarization) | $172.80/yr | $691.20/yr |
| transcribe.so subscription | not needed | Plus $12/mo, about 12 hr included: $144/yr | Pro $49/mo, about 50 hr included: $588/yr |
| TurboScribe Unlimited | $120/yr (annual billing) | $120/yr | $120/yr until fair-use throttling; no API |
| Rev AI | $540/yr at $0.25/min | $1,800/yr pay-per-minute, or Essentials annual $305.90/yr (English and Spanish only) | Essentials $305.90/yr; all 37 languages needs Pro at $575.88/yr |
| Trint | $948/yr (Pro annual) | $948/yr | Team, 2 seats: $1,656/yr |
| Otter Pro | $99.96/yr (but 3 lifetime file imports on free, 10 imports/mo on Pro) | $99.96/yr if under 1,200 min/mo and 90 min/meeting | needs Business: $239.88/user/yr |
| Happy Scribe | Basic annual $102/yr (120 min/mo cap is too small) | Pro annual $228/yr (600 min/mo, exactly 10 hr) | Business annual $708/yr |
Three honest readings of that table:
- At 3 hours a month, pay-as-you-go wins outright. The alternatives in the table run from about $100 to $948 a year for usage that costs about $52 at metered rates.
- At 10 hours a month it depends on caps. Otter Pro at $99.96/yr looks cheapest, but 10 hours of file uploads collides with its import and per-conversation limits, which is exactly the complaint that sends people searching for alternatives. Flat-rate TurboScribe at $120/yr is the real budget pick if Whisper-level English accuracy is enough and you do not need an API or a searchable library.
- At 40 hours a month, subscriptions win on rate, including ours: Pro at $49/mo undercuts our own pay-as-you-go. That is the honest shape of the market: metered pricing for variable usage, plans for fixed volume.
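The transcribe.so rows can be reproduced with a few lines of arithmetic. This is a rough sketch using only figures from the table: the Qwen3-ASR Flash rate and the annual price and approximate included hours of Plus and Pro. It treats a plan as usable only while monthly hours stay within the included amount, which is a simplifying assumption (overage pricing is not modeled), and it omits Starter because its included hours are not listed here.

```python
QWEN_RATE = 1.44  # USD per audio hour, Qwen3-ASR Flash

# (annual price in USD, approximate included hours per month), from the table
PLANS = {"Plus": (144.00, 12), "Pro": (588.00, 50)}


def annual_metered(hours_per_month: float) -> float:
    """Yearly pay-as-you-go cost at the metered Qwen3-ASR Flash rate."""
    return round(hours_per_month * 12 * QWEN_RATE, 2)


def cheapest(hours_per_month: float) -> tuple[str, float]:
    """Pick the cheapest transcribe.so option whose included hours cover the usage."""
    options = {"pay-as-you-go": annual_metered(hours_per_month)}
    for name, (annual_price, included_hours) in PLANS.items():
        if hours_per_month <= included_hours:
            options[name] = annual_price
    best = min(options, key=options.get)
    return best, options[best]


for hours in (3, 10, 40):
    print(hours, "hr/mo:", cheapest(hours))
```

Swapping in another vendor's annual price and cap gives the same comparison for any row of the table.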
When a subscription is actually better
- You transcribe heavy, fixed English volume every month and need nothing but transcripts. TurboScribe Unlimited at $120/yr is hard to beat on price, with the documented fair-use caveats.
- You live inside meetings and want a bot that joins calls. Meeting notetakers like Otter bundle recording, joining, and CRM hooks that a file-first tool does not replace.
- You need certified human accuracy. Rev's human service at $1.99/min with a 99%+ accuracy claim is the standard for legal and compliance work. No AI tool, ours included, should be sold to you for that job.
- Your monthly volume is high and predictable. Use a plan: ours (Starter $5, Plus $12, Pro $49) or a competitor's. Plans exist because at fixed volume they are the better deal.
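A quick way to test "high and predictable" against your own numbers is a break-even calculation: divide a plan's monthly price by the metered rate. The sketch below assumes the Qwen3-ASR Flash rate of $1.44/hr and looks at price alone; it ignores caps, import limits, language coverage, and features, which can matter more than the raw rate near the threshold.

```python
def break_even_hours(monthly_price: float, metered_rate_per_hour: float = 1.44) -> float:
    """Monthly hours above which a flat monthly price beats metered billing."""
    return monthly_price / metered_rate_per_hour


plans = [
    ("TurboScribe Unlimited", 120 / 12),  # $120/yr expressed per month
    ("transcribe.so Plus", 12),
    ("transcribe.so Pro", 49),
]

for name, monthly_price in plans:
    print(f"{name}: break-even near {break_even_hours(monthly_price):.1f} hr/mo")
```

The result only helps if you hit that volume every month: a plan that breaks even at a given number of hours still costs full price in a month where you transcribe nothing.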
What you get beyond the transcript
Pay-as-you-go does not mean bare-bones. Every transcribe.so job, including free-credit jobs, includes AI chapters and section summaries, semantic search across your whole library, and Q&A with timestamped citations back to the exact moment in the audio.
It also plugs into the tools you already use: a Claude connector over MCP, a ChatGPT Custom GPT, a REST API with an OpenAPI spec, and a native macOS app that captures meeting audio via AudioTap on macOS 14.2+. You can paste YouTube and podcast URLs instead of uploading files, and export SRT, VTT, and karaoke VTT subtitles.
Frequently asked questions
Is transcribe.so really subscription-free?
Yes. The wallet is prepaid: add credits, use them whenever, and they never expire. Subscriptions exist but are optional; they add included monthly hours and lower rates for people with steady volume.
What does a one-hour file actually cost?
Between $1.12 (Voxtral V2) and $3.23 (GPT-4o Transcribe Diarize with speaker labels), depending on the model. The default Qwen3-ASR Flash pipeline is $1.44 per hour, billed per minute, and you see the exact quote before you confirm.
Do unused credits expire at the end of the month?
No. Wallet credits have no expiry date. Only the included monthly credit inside an optional subscription resets each billing cycle.
Is speaker diarization extra?
No. Diarization is included in the per-minute price of the models that support it, like GPT-4o Transcribe Diarize and Voxtral V2. There is no add-on fee.
How accurate is it compared to Whisper-based tools?
The default Qwen3-ASR model ranks #4 of 80+ models on the public Open ASR Leaderboard (May 2026), and its paper reports FLEURS word error rates of 2.07% for Korean, 3.09% for Japanese, and 2.38% for Mandarin. Whisper-based tools like TurboScribe and MacWhisper use OpenAI Whisper, which trails it on those benchmarks.
Can I use it from Claude or ChatGPT?
Yes. transcribe.so ships an MCP connector for Claude, a ChatGPT Custom GPT, and a developer API, so agents can submit and search transcriptions with the same per-minute billing.
Comparing a specific tool? See the guides for TurboScribe, Rev (and Rev's pricing explained), Trint, and MacWhisper, or the full pricing page.