TranscribeHQ API
Agentic video transcription service with AI-powered quality correction, visual context analysis, and AI processing (transcript cleaning and task briefs) for every supported platform. Supported platforms include YouTube, Vimeo, Loom, TikTok, Facebook, Instagram, and Twitter/X.
Authentication
All API v1 endpoints require an API key. The health check endpoint is public. You can provide your key using any of these three methods:
Method 1 — Authorization: Bearer header (recommended)
Authorization: Bearer YOUR_API_KEY
Method 2 — x-api-key header
x-api-key: YOUR_API_KEY
Method 3 — Query parameter
GET /api/v1/jobs?api_key=YOUR_API_KEY
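The three methods above can be sketched in Python as small helpers. This is a minimal illustration, not an official client: the function names `auth_headers` and `with_query_key` are hypothetical, and only the header names and the `api_key` query parameter come from this page.

```python
from urllib.parse import urlencode


def auth_headers(api_key: str, method: str = "bearer") -> dict:
    """Build the request headers for Method 1 or Method 2 (hypothetical helper)."""
    if method == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if method == "x-api-key":
        return {"x-api-key": api_key}
    raise ValueError(f"unknown auth method: {method!r}")


def with_query_key(url: str, api_key: str) -> str:
    """Append the key as the api_key query parameter (Method 3)."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{urlencode({'api_key': api_key})}"
```

Query-string keys tend to end up in server logs and browser history, which is one common reason to prefer a header when the client allows it.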
Supported Platforms
Tier 1 — Video Hosting
| Platform | Notes |
|---|---|
| YouTube | Full support including long videos |
| Vimeo | Public videos with browser impersonation |
| Dailymotion | Public videos |
| Twitch | VODs and clips |
| Streamable | Short-form video hosting |
| Rumble | Public videos |
| Bilibili | Public videos |
| TikTok | Short-form video |
Tier 1 — Screen Recording
| Platform | Notes |
|---|---|
| Loom | Direct API extraction + visual context |
| ScreenPal | Public recordings |
| Komodo | Public recordings |
| Screencastify | Shared recordings |
| Vidyard | Public/shared videos |
| Wistia | Public videos |
| BerryPal | Shared recordings |
| Jumpshare | Shared recordings |
| Zight (CloudApp) | Shared recordings |
| Sendspark | Shared videos |
| Arcade | Interactive demos |
| Scribe | Step-by-step recordings |
| Tella | Shared recordings |
| Cap | Open-source screen recorder |
| Claap | Meeting/async recordings |
| Pitch | Presentation recordings |
| Screen Studio | macOS screen recordings |
| Droplr | Shared recordings/screenshots |
Tier 1 — Cloud Storage
| Platform | Notes |
|---|---|
| Dropbox | Shared video links |
| Google Drive | Shared video files |
| OneDrive | Shared video files |
Tier 2 — Social Media (variable reliability)
| Platform | Notes |
|---|---|
| Facebook | Public videos only, may be unreliable |
| Instagram | Reels and video posts |
| Twitter / X | Video tweets, may be unreliable |
| | Video posts |
| | Public video posts |
| Snapchat | Public stories/spotlights |
| Threads | Video posts |
Tier 1 platforms are reliable and fully supported. Tier 2 platforms may have variable availability, so the response for them includes a warning in its warning field. Unsupported URLs are rejected with a 400 error.
Error Handling
| Status | Meaning | Description |
|---|---|---|
| 400 | Bad Request | Invalid URL or unsupported platform |
| 401 | Unauthorized | Missing API key |
| 403 | Forbidden | Invalid API key |
| 404 | Not Found | Job ID not found |
| 503 | Unavailable | API keys not configured (production) |
Error responses use this format:
{
  "status": "error",
  "message": "Invalid or missing API key"
}
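A client might turn this error format into an exception along the following lines. This is a sketch under assumptions: the `TranscribeHQError` class and `parse_error` function are hypothetical names, and the fallback to the status meaning when the body is not JSON is a design choice of this example, not documented behavior.

```python
import json

# Status meanings copied from the table above.
STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    503: "Unavailable",
}


class TranscribeHQError(Exception):
    """Hypothetical exception carrying the HTTP status and the API message."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message


def parse_error(status_code: int, body: str) -> TranscribeHQError:
    """Build an exception from an error response body."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    message = payload.get("message") if isinstance(payload, dict) else None
    # Fall back to the generic meaning if the body has no usable message.
    return TranscribeHQError(
        status_code, message or STATUS_MEANINGS.get(status_code, "Unknown error")
    )
```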
Endpoints
GET /api/healthz
Returns the server health status. No authentication required.
Response 200
{
  "status": "ok"
}
Example
# curl
curl https://your-domain/api/healthz
POST /api/v1/transcribe
Submit a video URL to start the transcription pipeline. Optionally provide a webhook URL to receive a notification when processing completes.
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | Video URL to transcribe |
| webhook_url | string | No | HTTPS URL for completion webhook |
Request Example
curl -X POST https://your-domain/api/v1/transcribe \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "webhook_url": "https://your-server.com/webhook"
  }'
Response 202
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "platform": "YouTube",
  "tier": 1,
  "warning": null,
  "poll_url": "/api/v1/jobs/550e8400-e29b-41d4-a716-446655440000"
}
Error Responses
| Status | Condition |
|---|---|
| 400 | Missing URL, invalid URL format, or unsupported platform |
| 401 | Missing API key |
| 403 | Invalid API key |
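The same submission can be prepared from Python with only the standard library. The sketch below builds the request without sending it; `build_transcribe_request` is a hypothetical helper, and the client-side HTTPS check on `webhook_url` is an assumption based on the field description, not a documented server rule.

```python
import json
import urllib.request
from typing import Optional


def build_transcribe_request(
    base_url: str,
    api_key: str,
    video_url: str,
    webhook_url: Optional[str] = None,
) -> urllib.request.Request:
    """Prepare a POST to /api/v1/transcribe (hypothetical helper)."""
    payload = {"url": video_url}
    if webhook_url is not None:
        # The field is documented as an HTTPS URL; checking here is our choice.
        if not webhook_url.startswith("https://"):
            raise ValueError("webhook_url must be an HTTPS URL")
        payload["webhook_url"] = webhook_url
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/transcribe",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; a 202 response body can then be decoded with `json.load` to read `job_id` and `poll_url`.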
GET /api/v1/jobs
Returns a summary list of all transcription jobs.
Response 200
{
  "jobs": [
    {
      "job_id": "550e8400-...",
      "url": "https://www.youtube.com/watch?v=...",
      "platform": "YouTube",
      "status": "complete",
      "progress": 100,
      "is_loom": false,
      "created_at": "2026-03-13T01:00:00.000Z"
    }
  ],
  "total": 1
}
Example
curl https://your-domain/api/v1/jobs \
  -H "Authorization: Bearer YOUR_API_KEY"
GET /api/v1/jobs/{jobId}
Returns the full job status and results. The response includes transcripts, quality scores, corrections, cleaned transcripts, task briefs, and visual context annotations.
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| jobId | string (UUID) | The job ID returned by the transcribe endpoint |
Response 200 — Complete Job
{
  "job_id": "550e8400-...",
  "status": "complete",
  "platform": "Loom",
  "is_loom": true,
  "progress": 100,
  "raw_transcript": "So today I want to show you...",
  "corrected_transcript": "So today I want to show you...",
  "cleaned_transcript": "Today I want to show you...",
  "quality_score": 88,
  "corrections": [
    {
      "timestamp": "00:01:23",
      "original": "their",
      "issue": "Homophone error",
      "suggested_correction": "there",
      "confidence": "high"
    }
  ],
  "brief": {
    "summary": "Client requests changes to...",
    "platforms": ["WordPress", "Elementor"],
    "task_type": "Update",
    "priority": "Normal",
    "requirements": ["1. Update hero section..."],
    "ambiguities": ["Which color scheme?"],
    "suggested_hero": "Web Hero",
    "hero_reasoning": "WordPress/Elementor task"
  },
  "visual_context": {
    "confirmed_annotations": [
      {
        "timestamp": "00:00:42",
        "trigger_phrase": "this button right here",
        "platform": "WordPress",
        "ui_element": "CTA button in hero section",
        "context": "Elementor editor showing hero block",
        "visible_text": "Get Started Now",
        "confidence": "high"
      }
    ],
    "unverified_annotations": [],
    "platform_summary": { "WordPress": 2 },
    "total_confirmed": 2,
    "total_dismissed": 1
  },
  "activity_log": [
    {
      "step": "transcription",
      "label": "Deepgram Transcription",
      "status": "complete",
      "elapsed": 3.2
    }
  ],
  "error": null
}
Job Status Values
| Status | Description |
|---|---|
| pending | Job created, waiting to start |
| extracting | Downloading and extracting audio |
| transcribing | Running Deepgram speech-to-text |
| auditing | AI quality audit in progress |
| correcting | AI auto-correction |
| verifying | Second opinion verification |
| cleaning | Transcript cleaning |
| briefing | Task brief generation |
| complete | All processing finished |
| error | Processing failed |
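Clients that do not use webhooks typically poll the job until it reaches one of the two terminal statuses, complete or error. The loop below is a minimal sketch: `wait_for_job`, the 2-second interval, and the poll limit are assumptions of this example, and `fetch_job` stands in for whatever function performs the authenticated GET on the job's poll_url and returns the decoded JSON.

```python
import time
from typing import Callable

# Terminal values taken from the status table above.
TERMINAL_STATUSES = {"complete", "error"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is complete or failed, then return the job body."""
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Any other status (pending, extracting, ...) means work is ongoing.
        sleep(interval)
    raise TimeoutError(f"job did not finish after {max_polls} polls")
```

Injecting `sleep` keeps the loop easy to test without real delays. Callers should still inspect the returned status, since a job that ended in error is returned rather than raised.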
Webhooks
When you provide a webhook_url in the transcribe request, TranscribeHQ sends a POST notification when the job completes or fails.
Webhook Payload
{
  "event": "transcription.complete",
  "job_id": "550e8400-...",
  "status": "complete",
  "platform": "YouTube",
  "is_loom": false,
  "quality_score": 92,
  "error": null,
  "poll_url": "/api/v1/jobs/550e8400-...",
  "timestamp": "2026-03-13T01:05:00.000Z"
}
Webhook Headers
| Header | Value |
|---|---|
| Content-Type | application/json |
| X-TranscribeHQ-Event | transcription.complete or transcription.failed |
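On the receiving side, a handler can read the event header and the JSON payload shown above. The function below is a framework-agnostic sketch: `handle_webhook` and the shape of its return value are hypothetical, and it performs no signature verification because this page does not describe one.

```python
import json

# The two event names listed for the X-TranscribeHQ-Event header.
KNOWN_EVENTS = {"transcription.complete", "transcription.failed"}


def handle_webhook(headers: dict, body: bytes) -> dict:
    """Extract the fields a receiver usually needs from a webhook delivery."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    event = normalized.get("x-transcribehq-event")
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unexpected event: {event!r}")
    payload = json.loads(body)
    return {
        "job_id": payload["job_id"],
        "succeeded": event == "transcription.complete",
        "poll_url": payload.get("poll_url"),
        "error": payload.get("error"),
    }
```

The payload carries only a summary, so a receiver that needs transcripts would follow `poll_url` with an authenticated request.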
Retry Policy
Failed deliveries are retried up to 3 times with increasing backoff (1s, 2s, 3s). Each attempt has a 10-second timeout.
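To simulate this delivery schedule in tests of a webhook receiver, something like the following may help. It is a sketch under a stated assumption: one initial attempt followed by up to three retries, waiting 1s, 2s, and 3s before each retry. The page does not say whether the first attempt counts toward the three, and `deliver_with_retries` is a hypothetical name, not the service's code.

```python
import time
from typing import Callable, Sequence

# Delays quoted in the retry policy above, in seconds.
RETRY_DELAYS = (1.0, 2.0, 3.0)


def deliver_with_retries(
    send: Callable[[], bool],
    delays: Sequence[float] = RETRY_DELAYS,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns True; return the number of attempts made."""
    attempts = 0
    # A leading 0.0 represents the initial attempt, which happens immediately.
    for delay in (0.0, *delays):
        if delay:
            sleep(delay)
        attempts += 1
        if send():
            return attempts
    raise RuntimeError(f"delivery failed after {attempts} attempts")
```

Because each attempt times out after 10 seconds, receivers generally do best to acknowledge quickly and process the payload afterwards.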
Visual Context Agent
The Visual Context Agent automatically detects moments in a video where the speaker references something on screen (e.g., "this button right here", "as you can see"). It extracts video frames at those timestamps and uses AI vision to analyze what's visible.
How It Works
- Scans the transcript for 23 trigger phrases like "this right here", "as you can see", "this button"
- Uses AI to classify whether each phrase is an active screen reference
- Extracts 5 frames around each confirmed trigger timestamp
- Analyzes frames with AI vision to identify platforms, UI elements, and visible text
- Splits results into confirmed (high/medium confidence) and unverified (low confidence) buckets
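The first and last steps above can be illustrated with a small sketch. It is not the agent's implementation: the real phrase list has 23 entries that this page does not enumerate, so `EXAMPLE_TRIGGERS` holds only the three phrases quoted here, and plain substring matching stands in for the AI classification step.

```python
# Only the three phrases quoted above; the full list is not published here.
EXAMPLE_TRIGGERS = ("this right here", "as you can see", "this button")


def find_triggers(transcript: str, triggers=EXAMPLE_TRIGGERS) -> list:
    """Return the trigger phrases that occur in the transcript (case-insensitive)."""
    lowered = transcript.lower()
    return [phrase for phrase in triggers if phrase in lowered]


def split_annotations(annotations: list) -> dict:
    """Bucket annotations by confidence: high/medium confirmed, low unverified."""
    confirmed = [a for a in annotations if a.get("confidence") in ("high", "medium")]
    unverified = [a for a in annotations if a.get("confidence") == "low"]
    return {
        "confirmed_annotations": confirmed,
        "unverified_annotations": unverified,
    }
```

The returned keys mirror the `confirmed_annotations` and `unverified_annotations` fields of the `visual_context` object in the job response.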
Annotation Fields
| Field | Type | Description |
|---|---|---|
| timestamp | string | When it occurs (e.g., "00:01:23") |
| trigger_phrase | string | The phrase that triggered capture |
| platform | string | Detected platform (e.g., "WordPress") |
| ui_element | string | Specific UI element identified |
| context | string | Description of what's visible on screen |
| visible_text | string? | Any readable text on screen |
| confidence | enum | high, medium, or low |
AI Intelligence
All videos receive full AI processing beyond standard transcription:
Transcript Cleaning
AI removes filler words, fixes run-on sentences, and formats the transcript into clean paragraphs while preserving all meaning.
Task Brief Generation
AI generates a structured task brief from the video transcript, including:
| Field | Description |
|---|---|
| summary | 2-3 sentence overview of the request |
| platforms | Technologies mentioned (e.g., WordPress, Elementor) |
| task_type | Bug Fix, New Build, Update, Edit, Question, or Review Request |
| priority | Urgent, Normal, or Low |
| requirements | Numbered action items extracted from the video |
| ambiguities | Questions that need client clarification |
| suggested_hero | Web Hero, Automation Hero, Graphic Design Hero, or Video Hero |
| hero_reasoning | Why that hero type was recommended |
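Three of these fields take a fixed set of values, so a consumer may want to check them before routing a brief. The sketch below assumes the value lists in the table are exhaustive, which this page does not state outright; `invalid_brief_fields` is a hypothetical helper.

```python
# Allowed values copied from the field table above.
ALLOWED_VALUES = {
    "task_type": {"Bug Fix", "New Build", "Update", "Edit", "Question", "Review Request"},
    "priority": {"Urgent", "Normal", "Low"},
    "suggested_hero": {"Web Hero", "Automation Hero", "Graphic Design Hero", "Video Hero"},
}


def invalid_brief_fields(brief: dict) -> list:
    """Return the names of enumerated fields whose value is missing or unexpected."""
    return [
        field
        for field, allowed in ALLOWED_VALUES.items()
        if brief.get(field) not in allowed
    ]
```

An empty list means all three fields hold a documented value; free-text fields such as summary and hero_reasoning are not checked.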
Quality Assurance
The brief goes through a self-check (verifying completeness against the transcript) and a visual verification pass (ensuring the brief references specific platforms and UI elements from visual annotations).
Pipeline Stages
Every transcription job progresses through these stages.
| # | Stage | Engine | Description |
|---|---|---|---|
| 1 | Audio Extraction | yt-dlp / ffmpeg | Download and extract audio from video |
| 2 | Transcription | Deepgram Nova-3 | Speech-to-text with smart formatting |
| 3 | Quality Audit | Gemini 2.5 Flash | Identify potential transcription errors |
| 4 | Auto-Correction | Gemini 2.5 Flash | Fix identified issues in the transcript |
| 5 | Verification | Gemini 2.5 Flash | Second opinion QA + quality score |
| 6 | Visual Context | ffmpeg + Gemini Vision | Screen capture + AI vision analysis |
| 7 | Cleaning | Gemini 2.5 Flash | Remove fillers, format paragraphs |
| 8 | Task Brief | Gemini 2.5 Flash | Generate structured task brief |
| 9 | Self-Check | Gemini 2.5 Flash | Verify brief completeness |
| 10 | Visual Verify | Gemini 2.5 Flash | Ensure brief uses visual context |