TranscribeHQ API

Agentic video transcription service with AI-powered quality correction, visual context analysis, transcript cleaning, and task brief generation. Supported sources include YouTube, Vimeo, Loom, TikTok, Facebook, Instagram, and Twitter/X; see Supported Platforms for the full list.

Base URL /api

Authentication

All /api/v1 endpoints require an API key. Only the health check endpoint (/api/healthz) is public. You can provide your key using any of these three methods:

Method 1 — Authorization: Bearer header (recommended)

Authorization: Bearer YOUR_API_KEY

Method 2 — x-api-key header

x-api-key: YOUR_API_KEY

Method 3 — Query parameter

GET /api/v1/jobs?api_key=YOUR_API_KEY

Server-to-server ready: All three authentication methods work with Postman, cURL, AI agents, and any HTTP client. No browser-specific headers required.
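
The three methods above can be sketched in Python using only the standard library. This is an illustrative helper, not an official client: the function name build_request and the BASE_URL placeholder are assumptions, and the host mirrors the your-domain placeholder used in the curl examples.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://your-domain/api"  # placeholder host (assumption)


def build_request(path: str, api_key: str, method: str = "bearer") -> urllib.request.Request:
    """Attach the API key using one of the three documented methods."""
    url = f"{BASE_URL}{path}"
    headers = {}
    if method == "bearer":        # Method 1 (recommended)
        headers["Authorization"] = f"Bearer {api_key}"
    elif method == "x-api-key":   # Method 2
        headers["x-api-key"] = api_key
    elif method == "query":       # Method 3
        separator = "&" if "?" in url else "?"
        url = f"{url}{separator}{urlencode({'api_key': api_key})}"
    else:
        raise ValueError(f"unknown auth method: {method}")
    return urllib.request.Request(url, headers=headers)
```

Pass the returned object to urllib.request.urlopen to send it, for example build_request("/v1/jobs", "YOUR_API_KEY").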

Supported Platforms

Tier 1 — Video Hosting

Platform	Notes
YouTube	Full support including long videos
Vimeo	Public videos with browser impersonation
Dailymotion	Public videos
Twitch	VODs and clips
Streamable	Short-form video hosting
Rumble	Public videos
Bilibili	Public videos
TikTok	Short-form video

Tier 1 — Screen Recording

Platform	Notes
Loom	Direct API extraction + visual context
ScreenPal	Public recordings
Komodo	Public recordings
Screencastify	Shared recordings
Vidyard	Public/shared videos
Wistia	Public videos
BerryPal	Shared recordings
Jumpshare	Shared recordings
Zight (CloudApp)	Shared recordings
Sendspark	Shared videos
Arcade	Interactive demos
Scribe	Step-by-step recordings
Tella	Shared recordings
Cap	Open-source screen recorder
Claap	Meeting/async recordings
Pitch	Presentation recordings
Screen Studio	macOS screen recordings
Droplr	Shared recordings/screenshots

Tier 1 — Cloud Storage

Platform	Notes
Dropbox	Shared video links
Google Drive	Shared video files
OneDrive	Shared video files

Tier 2 — Social Media (variable reliability)

Platform	Notes
Facebook	Public videos only, may be unreliable
Instagram	Reels and video posts
Twitter / X	Video tweets, may be unreliable
Reddit	Video posts
LinkedIn	Public video posts
Snapchat	Public stories/spotlights
Threads	Video posts

Tier 1 platforms are reliable and fully supported. Tier 2 platforms may have variable availability; for these, the response includes a warning. Unsupported URLs are rejected with a 400 error.

Error Handling

Status	Meaning	Description
400	Bad Request	Invalid URL or unsupported platform
401	Unauthorized	Missing API key
403	Forbidden	Invalid API key
404	Not Found	Job ID not found
503	Unavailable	API keys not configured (production)

Error responses use this format:

{
  "status": "error",
  "message": "Invalid or missing API key"
}
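
A client can turn this error format into an exception. The sketch below is one possible approach: the class name TranscribeHQError and the function parse_error are hypothetical, and the status meanings are copied from the table above.

```python
import json

# Status codes and meanings copied from the Error Handling table.
STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    503: "Unavailable",
}


class TranscribeHQError(Exception):
    """Hypothetical exception type carrying the HTTP status and API message."""

    def __init__(self, status_code: int, message: str):
        self.status_code = status_code
        self.message = message
        meaning = STATUS_MEANINGS.get(status_code, "Unknown")
        super().__init__(f"{status_code} {meaning}: {message}")


def parse_error(status_code: int, body: str) -> TranscribeHQError:
    """Build an exception from an error response body."""
    try:
        data = json.loads(body)
        message = data.get("message", "") if isinstance(data, dict) else ""
    except json.JSONDecodeError:
        message = body  # fall back to the raw body if it is not JSON
    return TranscribeHQError(status_code, message or "no message provided")
```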

Endpoints

GET /api/healthz - Server health check (Public)

Returns the server health status. No authentication required.

Response 200

{
  "status": "ok"
}

Example

# curl
curl https://your-domain/api/healthz

POST /api/v1/transcribe - Start transcription job (Auth)

Submit a video URL to start the transcription pipeline. Optionally provide a webhook URL to receive a notification when the job completes or fails.

Request Body

Field	Type	Required	Description
`url`	string	Yes	Video URL to transcribe
`webhook_url`	string	No	HTTPS URL for completion webhook

Request Example

curl -X POST https://your-domain/api/v1/transcribe \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "webhook_url": "https://your-server.com/webhook"
  }'

Response 202

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "platform": "YouTube",
  "tier": 1,
  "warning": null,
  "poll_url": "/api/v1/jobs/550e8400-e29b-41d4-a716-446655440000"
}

Error Responses

Status	Condition
400	Missing URL, invalid URL format, or unsupported platform
401	Missing API key
403	Invalid API key
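
The same request can be built from Python. This is a minimal sketch with the standard library; the function names and the BASE_URL placeholder are assumptions, and error handling is left to the caller (urllib raises HTTPError for the 4xx responses listed above).

```python
import json
import urllib.request

BASE_URL = "https://your-domain/api"  # placeholder host (assumption)


def build_transcribe_request(api_key, video_url, webhook_url=None):
    """Build the POST /api/v1/transcribe request without sending it."""
    payload = {"url": video_url}
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url  # optional field
    return urllib.request.Request(
        f"{BASE_URL}/v1/transcribe",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(api_key, video_url, webhook_url=None):
    """Send the request and return the decoded 202 response body."""
    request = build_transcribe_request(api_key, video_url, webhook_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

The job_id and poll_url fields of the returned dictionary identify the job for later polling.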

GET /api/v1/jobs - List all transcription jobs (Auth)

Returns a summary list of all transcription jobs.

Response 200

{
  "jobs": [
    {
      "job_id": "550e8400-...",
      "url": "https://www.youtube.com/watch?v=...",
      "platform": "YouTube",
      "status": "complete",
      "progress": 100,
      "is_loom": false,
      "created_at": "2026-03-13T01:00:00.000Z"
    }
  ],
  "total": 1
}

Example

curl https://your-domain/api/v1/jobs \
  -H "Authorization: Bearer YOUR_API_KEY"
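
Once the list response is decoded, it can be summarized client-side. The helpers below are illustrative only (their names are not part of the API) and operate on a dictionary shaped like the Response 200 example above.

```python
from collections import Counter


def count_by_status(jobs_response: dict) -> Counter:
    """Count jobs per status value in a GET /api/v1/jobs response."""
    return Counter(job["status"] for job in jobs_response.get("jobs", []))


def unfinished_job_ids(jobs_response: dict) -> list:
    """Return IDs of jobs that are neither complete nor failed."""
    return [
        job["job_id"]
        for job in jobs_response.get("jobs", [])
        if job["status"] not in ("complete", "error")
    ]
```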

GET /api/v1/jobs/:jobId - Get job status and results (Auth)

Returns the full job status and results. The response includes transcripts, quality scores, corrections, cleaned transcripts, task briefs, and visual context annotations.

Path Parameters

Parameter	Type	Description
`jobId`	string (UUID)	The job ID returned by the transcribe endpoint

Response 200 — Complete Job

{
  "job_id": "550e8400-...",
  "status": "complete",
  "platform": "Loom",
  "is_loom": true,
  "progress": 100,
  "raw_transcript": "So today I want to show you...",
  "corrected_transcript": "So today I want to show you...",
  "cleaned_transcript": "Today I want to show you...",
  "quality_score": 88,
  "corrections": [
    {
      "timestamp": "00:01:23",
      "original": "their",
      "issue": "Homophone error",
      "suggested_correction": "there",
      "confidence": "high"
    }
  ],
  "brief": {
    "summary": "Client requests changes to...",
    "platforms": ["WordPress", "Elementor"],
    "task_type": "Update",
    "priority": "Normal",
    "requirements": ["1. Update hero section..."],
    "ambiguities": ["Which color scheme?"],
    "suggested_hero": "Web Hero",
    "hero_reasoning": "WordPress/Elementor task"
  },
  "visual_context": {
    "confirmed_annotations": [
      {
        "timestamp": "00:00:42",
        "trigger_phrase": "this button right here",
        "platform": "WordPress",
        "ui_element": "CTA button in hero section",
        "context": "Elementor editor showing hero block",
        "visible_text": "Get Started Now",
        "confidence": "high"
      }
    ],
    "unverified_annotations": [],
    "platform_summary": { "WordPress": 2 },
    "total_confirmed": 2,
    "total_dismissed": 1
  },
  "activity_log": [
    {
      "step": "transcription",
      "label": "Deepgram Transcription",
      "status": "complete",
      "elapsed": 3.2
    }
  ],
  "error": null
}
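
A complete job response carries a lot of detail, so a client may want a compact summary. The sketch below reads only fields shown in the example above; the function name is hypothetical, and treating elapsed values as summable durations is an assumption because their unit is not stated here.

```python
def summarize_job(job: dict) -> dict:
    """Condense a complete job response into a few headline values."""
    corrections = job.get("corrections") or []
    activity_log = job.get("activity_log") or []
    brief = job.get("brief") or {}
    return {
        "quality_score": job.get("quality_score"),
        "high_confidence_corrections": sum(
            1 for c in corrections if c.get("confidence") == "high"
        ),
        # Assumption: elapsed values share one unit and can be added up.
        "total_elapsed": round(sum(e.get("elapsed", 0) for e in activity_log), 2),
        "open_questions": brief.get("ambiguities", []),
    }
```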

Job Status Values

Status	Description
`pending`	Job created, waiting to start
`extracting`	Downloading and extracting audio
`transcribing`	Running Deepgram speech-to-text
`auditing`	AI quality audit in progress
`correcting`	AI auto-correction
`verifying`	Second opinion verification
`cleaning`	Transcript cleaning
`briefing`	Task brief generation
`complete`	All processing finished
`error`	Processing failed
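
Because complete and error are the only terminal statuses in the table above, a polling loop can stop on either. This is a minimal sketch: the function name, the 5-second interval, and the poll limit are assumptions, and fetch_job stands in for whatever call retrieves GET /api/v1/jobs/:jobId in your client.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"complete", "error"}


def wait_for_job(
    fetch_job: Callable[[], dict],
    interval: float = 5.0,   # assumed polling interval in seconds
    max_polls: int = 120,    # assumed upper bound on polls
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status, then return it."""
    for attempt in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"job did not finish after {max_polls} polls")
```

Passing sleep as a parameter keeps the loop easy to test without real delays.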

Webhooks

When you provide a webhook_url in the transcribe request, TranscribeHQ sends a POST notification when the job completes or fails.

Webhook Payload

{
  "event": "transcription.complete",
  "job_id": "550e8400-...",
  "status": "complete",
  "platform": "YouTube",
  "is_loom": false,
  "quality_score": 92,
  "error": null,
  "poll_url": "/api/v1/jobs/550e8400-...",
  "timestamp": "2026-03-13T01:05:00.000Z"
}

Webhook Headers

Header	Value
`Content-Type`	application/json
`X-TranscribeHQ-Event`	transcription.complete or transcription.failed
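
On the receiving side, a handler can branch on the X-TranscribeHQ-Event header and read the payload fields shown above. The function below is a framework-agnostic sketch with a hypothetical name; wiring it into a web server is left to the reader.

```python
import json

EVENT_HEADER = "X-TranscribeHQ-Event"
KNOWN_EVENTS = {"transcription.complete", "transcription.failed"}


def handle_webhook(headers: dict, body: bytes) -> dict:
    """Validate the event header and extract the fields a client needs."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    event = normalized.get(EVENT_HEADER.lower())
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unexpected webhook event: {event!r}")
    payload = json.loads(body)
    return {
        "job_id": payload["job_id"],
        "succeeded": event == "transcription.complete",
        "poll_url": payload.get("poll_url"),
        "error": payload.get("error"),
    }
```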

Retry Policy

Failed deliveries are retried up to 3 times with increasing backoff delays of 1s, 2s, and 3s. Each attempt has a 10-second timeout.
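
To make the retry schedule concrete, here is a sketch of a delivery loop using the delays and timeout listed above. It is not the service's implementation: the names are hypothetical, and it assumes one initial attempt followed by up to three retries.

```python
import time

RETRY_DELAYS = (1, 2, 3)   # seconds to wait before each retry, as listed above
ATTEMPT_TIMEOUT = 10       # seconds allowed per attempt


def _attempt(send) -> bool:
    """Run one delivery attempt; any exception counts as a failure."""
    try:
        return bool(send(ATTEMPT_TIMEOUT))
    except Exception:
        return False


def deliver_with_retries(send, sleep=time.sleep) -> bool:
    """Call send(timeout) once, then retry after each delay until it succeeds.

    Assumption: the 3 retries come in addition to the first attempt.
    """
    if _attempt(send):
        return True
    for delay in RETRY_DELAYS:
        sleep(delay)
        if _attempt(send):
            return True
    return False
```

Here send is any callable that performs the POST with the given timeout and returns a truthy value on success.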

SSRF Protection: Webhook URLs pointing to localhost, private IPs (10.x, 172.x, 192.168.x), metadata endpoints (169.254.169.254), and .internal/.local domains are blocked.
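
You can pre-check a webhook URL before submitting it. The sketch below approximates the rules above with Python's ipaddress module; it is an assumption-laden client-side check, not the server's logic. In particular, it uses the standard private ranges (so only 172.16.0.0/12 rather than all of 172.x), requires HTTPS as stated for webhook_url, and does not resolve hostnames through DNS.

```python
import ipaddress
from urllib.parse import urlparse

BLOCKED_HOSTS = {"localhost"}
BLOCKED_SUFFIXES = (".internal", ".local")


def is_webhook_url_allowed(url: str) -> bool:
    """Return False for URLs that the SSRF rules above would likely reject."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False
    if host in BLOCKED_HOSTS or host.endswith(BLOCKED_SUFFIXES):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; DNS is not checked here.
        return True
    return not (address.is_private or address.is_loopback or address.is_link_local)
```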

Visual Context Agent

The Visual Context Agent automatically detects moments in a video where the speaker references something on screen (e.g., "this button right here", "as you can see"). It extracts video frames at those timestamps and uses AI vision to analyze what's visible.

How It Works

1. Scans the transcript for 23 trigger phrases like "this right here", "as you can see", "this button"
2. Uses AI to classify whether each phrase is an active screen reference
3. Extracts 5 frames around each confirmed trigger timestamp
4. Analyzes frames with AI vision to identify platforms, UI elements, and visible text
5. Splits results into confirmed (high/medium confidence) and unverified (low confidence) buckets
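
The first step, scanning for trigger phrases, can be illustrated with a small text search. This is only a sketch: it uses the three example phrases quoted above rather than the full set of 23, and it reports character offsets where the real pipeline works with timestamps.

```python
import re

# Only the three phrases quoted in this section; the full list is not shown here.
SAMPLE_TRIGGERS = ("this right here", "as you can see", "this button")


def find_trigger_candidates(transcript: str, triggers=SAMPLE_TRIGGERS) -> list:
    """Return (character_offset, phrase) pairs for each trigger found."""
    # Try longer phrases first so overlapping alternatives prefer the longest.
    ordered = sorted(triggers, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in ordered), re.IGNORECASE)
    return [(m.start(), m.group(0).lower()) for m in pattern.finditer(transcript)]
```

Each candidate would then go to the classification step, which decides whether the phrase is an active screen reference.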

Annotation Fields

Field	Type	Description
`timestamp`	string	When it occurs (e.g., "00:01:23")
`trigger_phrase`	string	The phrase that triggered capture
`platform`	string	Detected platform (e.g., "WordPress")
`ui_element`	string	Specific UI element identified
`context`	string	Description of what's visible on screen
`visible_text`	string?	Any readable text on screen
`confidence`	enum	high, medium, or low
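
The annotation fields map naturally onto a small data class. The sketch below is one way a client might model them; the class and function names are hypothetical, and the bucket split mirrors the confirmed (high/medium) and unverified (low) rule described under How It Works.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    timestamp: str
    trigger_phrase: str
    platform: str
    ui_element: str
    context: str
    confidence: str                     # "high", "medium", or "low"
    visible_text: Optional[str] = None  # nullable per the table above


def split_by_confidence(annotations):
    """Separate annotations into confirmed and unverified lists."""
    confirmed = [a for a in annotations if a.confidence in ("high", "medium")]
    unverified = [a for a in annotations if a.confidence == "low"]
    return confirmed, unverified


def timestamp_to_seconds(timestamp: str) -> int:
    """Convert an "HH:MM:SS" timestamp to seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds
```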

AI Intelligence

All videos receive full AI processing beyond standard transcription:

Transcript Cleaning

AI removes filler words, fixes run-on sentences, and formats the transcript into clean paragraphs while preserving all meaning.

Task Brief Generation

AI generates a structured task brief from the video transcript, including:

Field	Description
`summary`	2-3 sentence overview of the request
`platforms`	Technologies mentioned (e.g., WordPress, Elementor)
`task_type`	Bug Fix, New Build, Update, Edit, Question, or Review Request
`priority`	Urgent, Normal, or Low
`requirements`	Numbered action items extracted from the video
`ambiguities`	Questions that need client clarification
`suggested_hero`	Web Hero, Automation Hero, Graphic Design Hero, or Video Hero
`hero_reasoning`	Why that hero type was recommended
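
Since several brief fields take a fixed set of values, a client can sanity-check a brief before acting on it. The validator below is a hypothetical helper; the allowed values are copied from the table above, and treating platforms, requirements, and ambiguities as lists follows the job response example.

```python
TASK_TYPES = {"Bug Fix", "New Build", "Update", "Edit", "Question", "Review Request"}
PRIORITIES = {"Urgent", "Normal", "Low"}
HERO_TYPES = {"Web Hero", "Automation Hero", "Graphic Design Hero", "Video Hero"}


def validate_brief(brief: dict) -> list:
    """Return a list of problems found in a task brief (empty if none)."""
    problems = []
    enumerated = (
        ("task_type", TASK_TYPES),
        ("priority", PRIORITIES),
        ("suggested_hero", HERO_TYPES),
    )
    for field, allowed in enumerated:
        if brief.get(field) not in allowed:
            problems.append(f"{field}: unexpected value {brief.get(field)!r}")
    for field in ("platforms", "requirements", "ambiguities"):
        if not isinstance(brief.get(field), list):
            problems.append(f"{field}: expected a list")
    return problems
```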

Quality Assurance

The brief goes through a self-check (verifying completeness against the transcript) and a visual verification pass (ensuring the brief references specific platforms and UI elements from visual annotations).

Pipeline Stages

Every transcription job progresses through these stages.

#	Stage	Engine	Description
1	Audio Extraction	yt-dlp / ffmpeg	Download and extract audio from video
2	Transcription	Deepgram Nova-3	Speech-to-text with smart formatting
3	Quality Audit	Gemini 2.5 Flash	Identify potential transcription errors
4	Auto-Correction	Gemini 2.5 Flash	Fix identified issues in the transcript
5	Verification	Gemini 2.5 Flash	Second opinion QA + quality score
6	Visual Context	ffmpeg + Gemini Vision	Screen capture + AI vision analysis
7	Cleaning	Gemini 2.5 Flash	Remove fillers, format paragraphs
8	Task Brief	Gemini 2.5 Flash	Generate structured task brief
9	Self-Check	Gemini 2.5 Flash	Verify brief completeness
10	Visual Verify	Gemini 2.5 Flash	Ensure brief uses visual context