Ask any video or audio. Get answers with timestamps.

Paste a YouTube link, podcast, or recording. Get cited answers with exact timestamps and human-quality chapters, accurate in 67 languages. Unlimited, one flat price.

The same engine behind a simple HTTP API. Plug cited answers and exact timestamps into your agents, automations, meeting bots, and apps.

Read the API docs · Try it on ChatGPT

See pricing

Start free, no card. · Unlimited from $19/mo. · 67 languages.

Try this real transcript
How to Quit Your Job (and Find Work You Actually Love)
Ali Abdaal
Contents
18 chapters · 57 sections
1. Why I quit my high-paying job with no plan
2. The shame of walking away from success
3. Stop accepting low-grade suffering at work
4. Are you wired for the pathless path?
5. The math behind quitting your job safely
6. Use time off to rediscover who you are
7. How to fund your freedom on a budget
8. Your income streams will evolve over time
9. Turn your skills into immediate cash flow
10. Treat your career break like a life MBA
11. Passion doesn't mean work is easy
12. Align your daily actions with your ideal life
13. Focus on your mode, not your niche
14. Declare yourself retired with the skip test
15. Handling family criticism of your career choices
16. Would you trade wealth for total freedom?
17. Get comfortable with feeling cringe
18. Why traditional job security is a myth
Ask this video
Answer
Paul left because the work had quietly stopped fitting who he was, not because of a single dramatic event. Early on he chased prestige and big salaries, optimizing for impressive internships and the markers of success [00:59–02:18]. By around thirty-two the job had drained his energy and passion, and quitting was mostly about escaping that misalignment and getting himself back [04:37–06:04]. When he ran a self-assessment, he realized he'd drifted from the goals he set in grad school: to avoid becoming money-obsessed and to keep his sense of humor. That made clear how far off course he'd gone [06:05–07:55]. The decision was less “follow your dream” and more “stop betraying your own values.”

We route every language to the best of OpenAI, Qwen, and Mistral.

Works with: YouTube, Google Meet, Zoom, Microsoft Teams, Loom, Voice Memos, Video files, Audio files

Export to: CapCut, Final Cut Pro, Premiere Pro, DaVinci Resolve

Copy to: Notion, Apple Notes, Google Keep, OneNote, Evernote, Obsidian, WhatsApp, Slack, Telegram

Supports MP4, MOV, WebM, MP3, WAV, M4A, AAC, FLAC, OGG, and more.

YouTube to text · Speech to text · Audio to text · Video to text · Voice note to text · Google Meet to text · Loom to text · Lecture video to notes · Subtitle generator · Searchable transcripts

How it works

Three steps from a recording to a searchable library

Real screenshots from a real transcript. No mockups, no marketing fluff. This is what you get.

  1. Drop in your audio

    Paste a link or upload a file. Anything you record, download, or screen-capture works.

    Screenshot: Paste a YouTube URL and pick the best speech-to-text model for your language

  2. Get a structured transcript

    Every transcript comes back with chapters, topic summaries, and timestamps so you can jump straight to what matters.

    Screenshot: Auto-generated chapters and topic summaries with timestamps

  3. Ask anything across your library

    Ask a question and get a cited answer with a timestamp. Search across every transcript in your library. Export subtitles if you need them.

    Screenshot: AI Q&A answer with inline timestamps and cited topic cards

No credit card required. Cancel anytime.

The real cost of long videos

You remember the answer is in there. You just can't find it.

Long lectures and podcasts hold the exact moment you need. The summary won't show it. The timeline won't either. So you scrub, overshoot, and start the video over.

And when the video is in a language your tool barely understands, the transcript is wrong before you even start looking.

Per long video

3-hour lecture you need to study

+ 20–40 min scrubbing to find one explanation

+ Replaying the same 90-second clip three times

+ A summary that skipped the part you wanted

= You learned less than the hours you put in

Per week

5 long videos to get through

+ 20–30 min hunting for moments in each

= Roughly 1.5–2.5 hours of study time gone every week

Library, not a one-off transcript

Build a knowledge library you can actually search

Most tools give you one transcript at a time. transcribe.so turns every lecture, podcast, and video into a searchable, askable library that grows with you.

One searchable library across everything

Every YouTube link, lecture, and podcast joins one library. Find a quote across hours of content in seconds.

Ask anything. Jump to the second they said it.

Ask a question and get a cited answer with a timestamp. Click the citation to jump straight to the moment in playback.
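
For builders who want to act on those citations in their own code, here is a minimal sketch of pulling time ranges out of an answer. It assumes citations look like the `[00:59–02:18]` ranges in the sample answer on this page; accepting a plain hyphen as well as an en dash is an assumption, and `extract_citations` and `to_seconds` are illustrative names, not part of the transcribe.so API.

```python
import re

# Matches citations such as [00:59–02:18] or [1:02:03-1:04:10].
# Accepting both an en dash and a hyphen is an assumption about the format.
CITATION = re.compile(r"\[(\d{1,2}(?::\d{2}){1,2})[–-](\d{1,2}(?::\d{2}){1,2})\]")


def to_seconds(stamp: str) -> int:
    """Convert 'MM:SS' or 'H:MM:SS' to whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def extract_citations(answer: str) -> list[tuple[int, int]]:
    """Return (start, end) second offsets for every citation in the answer."""
    return [(to_seconds(a), to_seconds(b)) for a, b in CITATION.findall(answer)]


answer = "He chased prestige early on [00:59–02:18] and later burned out [04:37–06:04]."
print(extract_citations(answer))  # [(59, 138), (277, 364)]
```

The start offset is what a player would seek to when a citation is clicked.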

Works in any language, automatically

67 languages with measured accuracy per language. We pick the right speech-to-text engine for you, so you focus on studying.

That's roughly 1.5–2.5 hours of study time back every week.

Five surfaces, one engine

Use transcribe.so wherever you already work

Same library, same pricing, same wallet across the web, ChatGPT, Claude, the macOS app, and the API.

Web

Use it from your browser

Paste a YouTube link or upload audio. Get chapters, cited Q&A, and subtitle exports. Free credits on signup.

Open the app

ChatGPT

Use it as a Custom GPT

Same engine inside ChatGPT. Send a YouTube link or file in chat. Wallet, history, and pipelines stay in sync.

Open the GPT

Claude

Use it as a Claude Connector

Add transcribe.so to Claude as a custom connector via MCP. Transcribe and search your library straight from the chat.

Add the connector

macOS

Use it as a native Mac app

Capture meetings via AudioTap on macOS 14.2 and up. No virtual drivers, no extra software. Sign in once, then dictate or capture.

Get the macOS app

API + MCP

Use it from your code

One Bearer token, one HTTP endpoint, an OpenAPI spec, and an MCP server. Drops into any agent, automation, or backend.

Read the API docs

For power users

Under the hood: the speech-to-text engines we route to

You don't need to pick a model. We route each file to the right engine for your language automatically. If you want to override the default, here's what powers your library: GPT-4o, Qwen3-ASR-Flash, and Voxtral. Chapters, library search, cited Q&A, subtitles, and exports work the same way across all of them.

Premium
GPT-4o Transcribe Diarize
Best-in-class diarization with built-in speaker labels
Built-in speaker identification (who said what)
58 languages, sentence timestamps
Hosted by OpenAI for enterprise reliability
OpenAI

Top-Tier
Qwen3-ASR-Flash
Leaderboard-leading accuracy with word-level timestamps
#4 of 80+ on the Hugging Face Open ASR Leaderboard (6.37% avg WER)
33 languages, word timestamps (10 languages)
Emotion detection, long-form audio
Alibaba Qwen3

New
Voxtral Mini Transcribe
Word-level timestamps with speaker labels
Word-level timestamps in 13 languages
Speaker labels & context biasing
13 languages, lowest cost per minute
Mistral AI

Search backbone
Semantic Search & AI Q&A
Powers search by meaning and AI Q&A across every transcript, no matter which ASR model produced it.
Hybrid retrieval with second-stage reranking
Citation-grounded answers with timestamps
Find moments by meaning, not just keywords
Frontier embedding + LLM stack
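
This page does not say how the hybrid retrieval step combines its results. As an illustration only, here is reciprocal rank fusion (RRF), one common way to merge a keyword ranking with a semantic ranking before a reranker sees the top candidates. The segment IDs, the `k = 60` constant, and the function name are assumptions for this sketch, not a description of transcribe.so's internals.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of segment IDs into one fused ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, segment_id in enumerate(ranking, start=1):
            scores[segment_id] += 1.0 / (k + rank)
    # Higher fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda s: (-scores[s], s))


keyword_hits = ["seg-12", "seg-03", "seg-40"]   # hypothetical keyword results
semantic_hits = ["seg-03", "seg-27", "seg-12"]  # hypothetical embedding results
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
# ['seg-03', 'seg-12', 'seg-27', 'seg-40']
```

Segments that appear in both lists rise to the top, which is the point of hybrid retrieval: a match by meaning and a match by keyword reinforce each other.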

A note from the maker

Hey, I'm Seunghun 👋

In 2023 I left Spotify to work on the problem of finding the useful 90 seconds inside a three-hour podcast. We built goodlisten.co, ran out of runway, and I went back to a desk job.

But I kept needing it myself. English was the easy part. The audio I actually cared about was harder: Korean podcasts where the host slips into English, Japanese conversations with three speakers, Spanish lectures recorded in noisy rooms. I was tired of spending two hours just to find the two minutes that mattered.

So in 2025 I stopped trying to build for “the market” and built the tool I wished existed, for one very specific user: me. If it saves you time, tell me. If it doesn't, tell me directly. That's how it gets better.

Seunghun

Who it's for

Built for learners. With an API for builders.

Whether you're studying from a 3-hour lecture or a foreign-language podcast, or building an AI agent that needs to listen, the same engine handles it.

Students and lifelong learners

Turn every lecture, podcast, and video you study into a searchable library. Get cited answers tied to the exact second they were said.

Learners studying in any language

Korean MOOCs, Japanese podcasts, Spanish talks, or English lectures. We pick the right speech-to-text engine for each of 67 languages, with measured accuracy.

Developers building AI apps

One HTTP API plus a Claude and ChatGPT MCP surface. Plug the same engine into agents, automations, voice memo apps, and meeting bots.

Use it in your app

One HTTP API. Plus an MCP server for Claude and ChatGPT.

Same engine that powers your library. Chapters, library search, cited Q&A, subtitles, exports. Hit it from any agent, video tool, meeting bot, or voice app.

One curl, full pipeline

curl https://transcribe.so/api/v1/transcriptions \
  -H "Authorization: Bearer tsk_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "source": "youtube",
    "url": "https://youtu.be/dQw4w9WgXcQ",
    "pipeline_code": "qwen3-asr-flash-filetrans"
  }'

YouTube, file upload via presigned S3, or any direct audio URL. Same response shape.
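
If you would rather call it from Python, here is a rough equivalent of the curl example using only the standard library. The endpoint, headers, and JSON fields are copied from the curl above; `build_transcription_request` is an illustrative helper name, and the sketch builds the request without sending it because the response shape is not shown on this page.

```python
import json
import urllib.request

API_URL = "https://transcribe.so/api/v1/transcriptions"  # from the curl example


def build_transcription_request(
    api_key: str, youtube_url: str, pipeline_code: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"source": "youtube", "url": youtube_url, "pipeline_code": pipeline_code}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_transcription_request(
    api_key="tsk_live_...",  # placeholder, as in the curl example
    youtube_url="https://youtu.be/dQw4w9WgXcQ",
    pipeline_code="qwen3-asr-flash-filetrans",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request). Check the API docs for the response fields.
```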

  • AI agents

    Drop a transcript into your agent's context. Claude, ChatGPT, Cursor, anything that calls HTTP.

  • Video editors and tools

    Word-level timestamps, burn-in captions, SRT/VTT export. Same engine as the dashboard.

  • Meeting bots and call platforms

    Transcribe Zoom, Twilio, or any recording the moment a call ends. Webhook fires when ready.

  • Voice memos, podcasts, language apps

    67 languages with measured accuracy per language. Auto-detect or pin a specific code per request.

Use cases

From transcription to something actually useful

Whether you are publishing, editing, researching, or learning, transcribe.so helps you get usable output from long-form content faster.

Podcast and interview transcription

Search long conversations, find strong quotes, and jump straight to important moments with chapters, citations, and playback.

Subtitle creation for videos

Generate subtitles that are easier to use in your editing workflow, with more control than rough auto-captions.

Learning from YouTube and lectures

Turn long videos into structured content with chapters, cited answers, and searchable playback so you can study faster.

Meeting and recording review

Upload calls, notes, or voice recordings and quickly find decisions, highlights, and follow-up moments without re-listening to everything.

FAQ

Before you try transcribe.so

Start your library with one link.

Paste any lecture, podcast, or video. Free credits to start. See the exact cost before you confirm.

Product features in depth

What you get with every transcript

Every upload joins your searchable library. You get chapters, cited Q&A with timestamps, library-wide search, subtitles for any editor, and full exports. The right engine for your language is picked automatically.

In the box

Library-wide search across every transcript you own
Cited Q&A with timestamps. Click to jump to the second they said it
AI-generated chapters and section detection
AI summary and key takeaways
Speaker labels on multi-speaker audio
Entity extraction (people, places, brands)
67 languages with measured accuracy per language
Subtitle exports (SRT, WebVTT, karaoke VTT, JSON)
Encrypted Cloudflare R2 storage. Your audio is never used for training
Power-user toggle: override the default engine (GPT-4o, Qwen3-ASR-Flash, or Voxtral)

Subtitles

Subtitles ready for any editor

Every transcript ships with word-level timestamps, formatted as SRT or WebVTT for CapCut, Premiere Pro, DaVinci Resolve, or Final Cut Pro. Pick a platform preset or tune every parameter.

Platform Presets

One-click presets tuned for each platform's readability standards. Each preset controls characters per line, max lines, reading speed (CPS), timing gaps, and more.

YouTube: Long-form captions optimized for readability (20 CPS · 2 lines)
TikTok / Shorts: Short, punchy single-line captions (20 CPS · 1 line)
Netflix-style: Professional broadcast with strict reading speed (17 CPS · 2 lines)
Podcast: Longer segments with speaker labels (15 CPS · 2 lines)
Broadcast / TV: Traditional broadcast standards (15 CPS · 2 lines)
Custom: Full control over every parameter
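
To make the CPS numbers concrete, here is a small sketch of checking a cue against a preset. The CPS and line limits come from the preset list above; the dictionary keys, the `Preset` class, and the choice to count spaces but not line breaks as characters are assumptions for illustration, not the product's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    max_cps: int    # reading speed limit, characters per second
    max_lines: int  # lines allowed per cue


# CPS and line counts are taken from the preset list above.
PRESETS = {
    "youtube": Preset(20, 2),
    "tiktok_shorts": Preset(20, 1),
    "netflix_style": Preset(17, 2),
    "podcast": Preset(15, 2),
    "broadcast_tv": Preset(15, 2),
}


def reading_speed(text: str, start: float, end: float) -> float:
    """Characters per second for a cue; line breaks are not counted."""
    return len(text.replace("\n", "")) / (end - start)


def fits(text: str, start: float, end: float, preset: Preset) -> bool:
    """True when the cue respects the preset's CPS and line limits."""
    within_lines = text.count("\n") + 1 <= preset.max_lines
    return within_lines and reading_speed(text, start, end) <= preset.max_cps


cue = "Stop accepting low-grade\nsuffering at work"  # 41 characters over 2 lines
print(fits(cue, 0.0, 2.5, PRESETS["youtube"]))  # True  (16.4 CPS, limit 20)
print(fits(cue, 0.0, 2.5, PRESETS["podcast"]))  # False (16.4 CPS, limit 15)
```

The same cue can pass one preset and fail another, which is why a stricter preset needs longer on-screen time or shorter text.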

Export Formats

Export in the format your video editor needs. SRT and WebVTT import directly into CapCut, Premiere Pro, DaVinci Resolve, and Final Cut Pro.

SRT: CapCut, Premiere Pro, DaVinci Resolve, Final Cut Pro & more
WebVTT: Web players, CapCut, and editors with styling support
Karaoke VTT: Word-by-word highlight timing
JSON: Full data with word timestamps
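
For a sense of how the two main text formats differ, here is a minimal sketch that writes the same cues as SRT and as WebVTT. It follows the public formats (SRT numbers its cues and uses a comma before the milliseconds; WebVTT starts with a `WEBVTT` header and uses a dot). The function names and the sample cue are illustrative, not the exporter transcribe.so ships.

```python
def timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{secs:02}{sep}{ms:03}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = [
        f"{i}\n{timestamp(start, ',')} --> {timestamp(end, ',')}\n{text}"
        for i, (start, end, text) in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


def to_webvtt(cues: list[tuple[float, float, str]]) -> str:
    """WebVTT: 'WEBVTT' header, dot before the milliseconds."""
    blocks = [
        f"{timestamp(start, '.')} --> {timestamp(end, '.')}\n{text}"
        for start, end, text in cues
    ]
    return "WEBVTT\n\n" + "\n\n".join(blocks) + "\n"


cues = [(59.0, 61.5, "Early on he chased prestige.")]  # hypothetical cue
print(to_srt(cues))
print(to_webvtt(cues))
```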

Powered by Word-Level Timestamps

Unlike simple text-splitting tools, our subtitle engine uses precise word-level timestamps from your transcription to build optimally timed cues.

Line breaks chosen for readability, not character count
Smart line breaking at natural pauses
CPS-aware reading speed optimization
Automatic gap and duration enforcement
Speaker label support for multi-speaker content
Live preview before export
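
As a rough picture of what building cues from word timestamps can look like, here is a greedy sketch. It starts a new cue at a long pause or when the text would get too long, then stretches short cues so reading speed stays under a CPS limit without running into the next cue. The `Word` class, the 42-character and 0.6-second defaults, and the greedy strategy itself are assumptions for illustration; the actual engine is not documented here, and this sketch does not choose line breaks inside a cue.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def build_cues(words, max_chars=42, pause=0.6, max_cps=20.0):
    """Greedily group timed words into (start, end, text) cues."""
    groups = []
    current = []
    for word in words:
        if current:
            text_len = len(" ".join(w.text for w in current)) + 1 + len(word.text)
            long_pause = word.start - current[-1].end >= pause
            if text_len > max_chars or long_pause:
                groups.append(current)
                current = []
        current.append(word)
    if current:
        groups.append(current)

    cues = []
    for i, group in enumerate(groups):
        text = " ".join(w.text for w in group)
        start, end = group[0].start, group[-1].end
        # Stretch short cues so reading speed stays under max_cps,
        # but never past the start of the next cue.
        needed_end = start + len(text) / max_cps
        limit = groups[i + 1][0].start if i + 1 < len(groups) else needed_end
        end = max(end, min(needed_end, limit))
        cues.append((start, end, text))
    return cues


words = [  # hypothetical word timings
    Word("Stop", 0.0, 0.2), Word("accepting", 0.25, 0.7), Word("suffering.", 0.75, 1.0),
    Word("Quit", 2.0, 2.3), Word("safely.", 2.35, 2.8),
]
print(build_cues(words))
# [(0.0, 1.25, 'Stop accepting suffering.'), (2.0, 2.8, 'Quit safely.')]
```

The first cue is spoken in one second but held for 1.25 seconds, because 25 characters at 20 CPS need that long to read.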

Privacy First

Your private files stay private

Worried about uploading sensitive audio? Privacy is built into the platform from the ground up.

Encrypted Storage

Your files are stored in private Cloudflare R2 buckets with time-limited access links. Only you can view your transcriptions.

Instant Deletion

Delete anytime. All data is instantly removed from our servers. No backups, no retention, completely gone.

Trusted Infrastructure

Inference and embeddings via trusted enterprise providers (OpenAI, Mistral, and partners). Storage on Cloudflare R2. No other third parties involved.

Your Data, Your Control

We don't use your content for AI training. Your transcriptions are private and never shared or made public.

Questions about privacy? Contact us

Export & Share

Copy or export anything you read

Export anything in Markdown. Chapters, search results, and Q&A history all carry timestamps that link back to the source.

Table of Contents
Chapters
Search Results
Q&A History
One-click copy · Markdown download · Playable YouTube links · Direct timestamps · Time ranges
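
Here is a sketch of what a Markdown chapter export with playable links could look like. It relies on YouTube's `t` query parameter, which starts playback at a given second. The function names, the output layout, and the chapter times in the example are assumptions; the actual export format may differ.

```python
def youtube_link(video_id: str, seconds: int) -> str:
    """Link that starts playback at the given offset (YouTube's `t` parameter)."""
    return f"https://youtu.be/{video_id}?t={seconds}"


def clock(seconds: int) -> str:
    """Render seconds as MM:SS, or H:MM:SS past the hour."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02}:{secs:02}" if hours else f"{minutes:02}:{secs:02}"


def chapters_to_markdown(video_id: str, chapters: list[tuple[str, int, int]]) -> str:
    """Render (title, start, end) chapters as a Markdown list with time ranges."""
    lines = []
    for title, start, end in chapters:
        span = f"{clock(start)}–{clock(end)}"
        lines.append(f"- [{span}]({youtube_link(video_id, start)}) {title}")
    return "\n".join(lines)


# Hypothetical chapter times; the video ID is the one from the curl example.
chapters = [("Why I quit my high-paying job with no plan", 59, 138)]
print(chapters_to_markdown("dQw4w9WgXcQ", chapters))
# - [00:59–02:18](https://youtu.be/dQw4w9WgXcQ?t=59) Why I quit my high-paying job with no plan
```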