Introducing the transcribe.so API: speech-to-text as a Bearer token


You can now transcribe audio with transcribe.so without ever opening the dashboard.

curl -X POST https://transcribe.so/api/v1/transcriptions \
  -H "Authorization: Bearer tsk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "source": "external_url",
    "url": "https://example.com/podcast.mp3",
    "pipeline_code": "qwen3-asr-flash-filetrans"
  }'

That's the whole thing. One Bearer token, one POST, you're transcribing.

Why an API

We built transcribe.so for people who care about transcript quality — meeting notes, podcasts, courseware, voice memos. But the more we shipped the dashboard, the more we kept hearing the same request from the people who use it most:

"I love the output. I just don't want to drag a file into a browser tab forty times a day."

Fair. Some workflows want a UI. Most automation wants a curl.

The use cases we've heard so far:

  • Pipelines that ingest podcasts — auto-transcribe new episodes the moment they hit your S3 bucket.
  • Meeting bots — transcribe Twilio recordings or Zoom dumps as they come in, no human in the loop.
  • Journalist workflows — drop interview audio in a folder, get back a searchable transcript with chapters.
  • Voice-memo automations — your phone records, your laptop transcribes, your second brain stores it.

So here it is.

What's the same as the dashboard

Almost everything.

  • Same models. Qwen3-ASR-Flash, GPT-4o-transcribe-diarize, Voxtral. Pick the one that fits the request — pass pipeline_code and you're set.
  • Same per-minute pricing. No "API tier" markup. The $X/min you see on the pricing page is exactly what an API call costs.
  • Same wallet. Monthly credit drains first; top-up balance covers the rest. If you've got a Pro membership, the 30% discount applies to API calls just like it does to web jobs.
  • Same downstream pipeline. Topics, chapters, summaries, semantic search, Q&A with citations — all of it lives behind GET /transcriptions/:id/result. You're not getting a stripped-down API output. You're getting the full pipeline.

The whole point of the API is that we didn't fork it. The Bearer token swaps in for the session cookie, and you call the same code path the web UI does.

What's different

The handful of things you'd expect.

Bearer auth. Every request carries Authorization: Bearer tsk_live_.... No cookies, no CSRF, no SDK, no setup beyond pasting a key into your env. Keys live at transcribe.so/settings/api-keys, and we show the plaintext exactly once.
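In Python that boils down to a few lines. A minimal sketch — the `TRANSCRIBE_API_KEY` env-var name is just the convention used in the example below, not something the API requires:

```python
import os

def api_headers() -> dict:
    """Build the auth header from the environment.

    Reading the key from an env var (rather than hard-coding it) keeps it
    out of source control; failing loudly here beats a mysterious 401 later.
    """
    key = os.environ.get("TRANSCRIBE_API_KEY")
    if not key:
        raise RuntimeError(
            "set TRANSCRIBE_API_KEY (create one at transcribe.so/settings/api-keys)"
        )
    return {"Authorization": f"Bearer {key}"}
```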

Async with polling. Submit, get back a transcription ID and status: "processing", then poll GET /transcriptions/:id until you see "completed" or "failed". We tried to make this honest: the same statuses you see on the dashboard are the statuses the API returns. (Webhooks ship next — you won't have to poll forever.)
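If you'd rather not hammer the status endpoint every few seconds, a bounded poll with exponential backoff is a reasonable pattern. A sketch — `fetch_status` stands in for any callable that GETs /transcriptions/:id and returns the status string; the timeout and delay values are arbitrary defaults, not API requirements:

```python
import time

def poll_until_done(fetch_status, timeout_s=600.0, base_delay=2.0, max_delay=30.0):
    """Call fetch_status() until it returns 'completed' or 'failed'.

    The delay between polls doubles up to max_delay, so a long job costs
    a handful of requests instead of hundreds. Raises TimeoutError if the
    job doesn't reach a terminal status within timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    delay = base_delay
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
    raise TimeoutError("transcription did not finish in time")
```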

Idempotency-Key header. Send a UUID and we'll cache the response for 24 hours. Retrying on a network blip won't double-charge you or queue the job twice. Standard Stripe pattern.

Per-key visibility. Each key shows you this month: $X.XX and all time: $X.XX on the dashboard. If you hand a key to a teammate or paste one into a script you're not sure about, you can see exactly what it spent. Revoke one and the others keep working.

We deliberately didn't ship per-key spending caps in v1. The wallet itself is the cap — you can't spend money you don't have, and an account-level monthly hard cap is the cleaner pattern when we get there. (If your team needs per-key caps before then, tell us.)

A worked example

Here's a Python script that watches a folder, transcribes every new audio file by URL, and saves the JSON result next to it. It's small enough to read in one breath:

import os, time, requests, json, pathlib

API = "https://transcribe.so/api/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['TRANSCRIBE_API_KEY']}"}

def transcribe(url: str) -> dict:
    # Submit the job; the response includes the new transcription's id.
    r = requests.post(f"{API}/transcriptions", headers=HEADERS, timeout=30, json={
        "source": "external_url",
        "url": url,
        "pipeline_code": "qwen3-asr-flash-filetrans",
    })
    r.raise_for_status()
    job_id = r.json()["id"]

    # Poll until the job reaches a terminal status.
    while True:
        s = requests.get(f"{API}/transcriptions/{job_id}", headers=HEADERS, timeout=30).json()
        if s["status"] in ("completed", "failed"):
            break
        time.sleep(3)

    if s["status"] == "failed":
        raise RuntimeError(f"transcription failed: {s.get('error')}")

    # Fetch the full result: transcript, chapters, topics, and the rest.
    return requests.get(f"{API}/transcriptions/{job_id}/result", headers=HEADERS, timeout=30).json()


if __name__ == "__main__":
    out = transcribe("https://example.com/podcast.mp3")
    pathlib.Path("transcript.json").write_text(json.dumps(out, indent=2))
    print(f"saved {len(out['segments'])} segments")

A screenful of glue, and the full transcript with chapters and topics lands on disk. The same shape works in a Cloudflare Worker (Twilio webhook → API call → write to KV), a GitHub Action (new podcast episode in a release → transcribe → comment on the PR), or a long-running n8n flow.

What's next

The roadmap, in priority order:

  1. File uploads. v1 ships with external-URL transcription. Direct uploads via a presigned S3 PUT land next — useful when you don't want to host the audio yourself.
  2. Webhooks. transcription.completed and transcription.failed, signed with HMAC, with exponential-backoff retries. Polling works; webhooks are nicer.
  3. OpenAPI spec + SDKs. Once the surface stops moving, we'll publish a proper OpenAPI 3.1 spec and generate first-party Python and TypeScript SDKs.
  4. Account-level monthly cap. A single "don't let the whole account spend more than $X this month" hard limit. Applies equally to web UI and API.
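Nothing about the webhook item above is final, but HMAC-SHA256 payload verification almost always looks the same on the receiving end. A sketch with Python's stdlib `hmac` — the signature encoding, header name, and secret format are placeholders until the real spec ships:

```python
import hashlib
import hmac

def verify_signature(payload: bytes, signature_hex: str, secret: str) -> bool:
    """Check an HMAC-SHA256 hex signature over the raw request body.

    Always verify against the raw bytes (not a re-serialized JSON dict),
    and compare with hmac.compare_digest to avoid timing side channels.
    """
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```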

If you have a use case that doesn't fit any of the above, we want to hear it. The API is going to be shaped by what people actually build with it.

Until then — grab a key, paste it into your script, and let us know what you ship.

Ready to transcribe your own content?

No credit card required. Pay only for what you use.
