TranscribeHQ API

Agentic video transcription service with AI-powered quality correction, visual context analysis, and AI-generated cleaned transcripts and task briefs. Supports YouTube, Vimeo, Loom, TikTok, Facebook, Instagram, Twitter/X, and the other platforms listed under Supported Platforms.

Base URL /api

Authentication

All API v1 endpoints require an API key. The health check endpoint is public. You can provide your key using any of these three methods:

Method 1 — Authorization: Bearer header (recommended)

Authorization: Bearer YOUR_API_KEY

Method 2 — x-api-key header

x-api-key: YOUR_API_KEY

Method 3 — Query parameter

GET /api/v1/jobs?api_key=YOUR_API_KEY

Server-to-server ready: All three authentication methods work with Postman, cURL, AI agents, and any HTTP client. No browser-specific headers required.
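
The three authentication methods can be sketched in Python as follows. This is a minimal illustration rather than an official client: build_request and BASE_URL are hypothetical names, and the host is the "your-domain" placeholder used in the curl examples. The request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host, as in the curl examples; replace with your deployment.
BASE_URL = "https://your-domain/api"


def build_request(path: str, api_key: str, method: str = "bearer") -> Request:
    """Build (but do not send) a GET request using one documented auth method."""
    url = f"{BASE_URL}{path}"
    headers = {}
    if method == "bearer":          # Method 1 (recommended)
        headers["Authorization"] = f"Bearer {api_key}"
    elif method == "x-api-key":     # Method 2
        headers["x-api-key"] = api_key
    elif method == "query":         # Method 3
        url = f"{url}?{urlencode({'api_key': api_key})}"
    else:
        raise ValueError(f"unknown auth method: {method}")
    return Request(url, headers=headers, method="GET")


request = build_request("/v1/jobs", "YOUR_API_KEY")
```

Passing the result to urllib.request.urlopen would send it. The header methods keep the key out of URLs, which tend to end up in access logs.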

Supported Platforms

Tier 1 — Video Hosting

Platform     Notes
YouTube      Full support including long videos
Vimeo        Public videos with browser impersonation
Dailymotion  Public videos
Twitch       VODs and clips
Streamable   Short-form video hosting
Rumble       Public videos
Bilibili     Public videos
TikTok       Short-form video

Tier 1 — Screen Recording

Platform          Notes
Loom              Direct API extraction + visual context
ScreenPal         Public recordings
Komodo            Public recordings
Screencastify     Shared recordings
Vidyard           Public/shared videos
Wistia            Public videos
BerryPal          Shared recordings
Jumpshare         Shared recordings
Zight (CloudApp)  Shared recordings
Sendspark         Shared videos
Arcade            Interactive demos
Scribe            Step-by-step recordings
Tella             Shared recordings
Cap               Open-source screen recorder
Claap             Meeting/async recordings
Pitch             Presentation recordings
Screen Studio     macOS screen recordings
Droplr            Shared recordings/screenshots

Tier 1 — Cloud Storage

Platform      Notes
Dropbox       Shared video links
Google Drive  Shared video files
OneDrive      Shared video files

Tier 2 — Social Media (variable reliability)

Platform     Notes
Facebook     Public videos only, may be unreliable
Instagram    Reels and video posts
Twitter / X  Video tweets, may be unreliable
Reddit       Video posts
LinkedIn     Public video posts
Snapchat     Public stories/spotlights
Threads      Video posts

Tier 1 platforms are reliable and fully supported. Tier 2 platforms may have variable availability — a warning is included in the response. Unsupported URLs are rejected with a 400 error.

Error Handling

Status  Meaning       Description
400     Bad Request   Invalid URL or unsupported platform
401     Unauthorized  Missing API key
403     Forbidden     Invalid API key
404     Not Found     Job ID not found
503     Unavailable   API keys not configured (production)

Error responses use this format:

{
  "status": "error",
  "message": "Invalid or missing API key"
}
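
A client can turn this error format into exceptions. The sketch below assumes only the documented "status" and "message" fields; TranscribeHQError and raise_for_error are hypothetical names, not part of the API.

```python
import json

# Status meanings copied from the error table in this document.
STATUS_MEANINGS = {
    400: "Bad Request",
    401: "Unauthorized",
    403: "Forbidden",
    404: "Not Found",
    503: "Unavailable",
}


class TranscribeHQError(Exception):
    """Hypothetical exception type carrying the HTTP status and API message."""

    def __init__(self, status_code: int, message: str):
        meaning = STATUS_MEANINGS.get(status_code, "Error")
        super().__init__(f"{status_code} {meaning}: {message}")
        self.status_code = status_code
        self.message = message


def raise_for_error(status_code: int, body: str) -> None:
    """Raise TranscribeHQError for 4xx/5xx responses; do nothing otherwise."""
    if status_code < 400:
        return
    try:
        payload = json.loads(body)
        message = payload.get("message", "") if isinstance(payload, dict) else ""
    except json.JSONDecodeError:
        message = body  # fall back to the raw body if it is not JSON
    raise TranscribeHQError(status_code, message or "unknown error")
```

Because 401 and 403 are distinguished, a caller can tell a missing key from an invalid one by inspecting status_code.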

Endpoints

GET /api/healthz - Server health check (Public)

Returns the server health status. No authentication required.

Response 200

{
  "status": "ok"
}

Example

# curl
curl https://your-domain/api/healthz

POST /api/v1/transcribe - Start transcription job (Auth required)

Submit a video URL to start the transcription pipeline. Optionally provide a webhook URL to receive a notification when processing completes.

Request Body

Field        Type    Required  Description
url          string  Yes       Video URL to transcribe
webhook_url  string  No        HTTPS URL for completion webhook

Request Example

curl -X POST https://your-domain/api/v1/transcribe \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "webhook_url": "https://your-server.com/webhook"
  }'

Response 202

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "platform": "YouTube",
  "tier": 1,
  "warning": null,
  "poll_url": "/api/v1/jobs/550e8400-e29b-41d4-a716-446655440000"
}
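
The same submission can be made from Python's standard library. This is a sketch under assumptions: build_transcribe_request and submit are hypothetical helper names, and the base URL is the placeholder from the curl example.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen


def build_transcribe_request(base_url: str, api_key: str, video_url: str,
                             webhook_url: Optional[str] = None) -> Request:
    """Build the POST /v1/transcribe request described above."""
    body = {"url": video_url}
    if webhook_url is not None:
        # The request body table specifies an HTTPS URL for the webhook.
        if not webhook_url.startswith("https://"):
            raise ValueError("webhook_url must be an HTTPS URL")
        body["webhook_url"] = webhook_url
    return Request(
        f"{base_url}/v1/transcribe",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def submit(request: Request) -> dict:
    """Send the request and return the parsed 202 body (job_id, poll_url, ...)."""
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

urlopen raises urllib.error.HTTPError for the 400, 401, and 403 cases, so callers would normally wrap submit in a try/except.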

Error Responses

Status  Condition
400     Missing URL, invalid URL format, or unsupported platform
401     Missing API key
403     Invalid API key

GET /api/v1/jobs - List all transcription jobs (Auth required)

Returns a summary list of all transcription jobs.

Response 200

{
  "jobs": [
    {
      "job_id": "550e8400-...",
      "url": "https://www.youtube.com/watch?v=...",
      "platform": "YouTube",
      "status": "complete",
      "progress": 100,
      "is_loom": false,
      "created_at": "2026-03-13T01:00:00.000Z"
    }
  ],
  "total": 1
}
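
Once parsed, the listing is straightforward to summarize. The helper below is a hypothetical sketch that relies only on the "jobs", "total", "job_id", and "status" fields shown in the response above.

```python
from collections import Counter


def summarize_jobs(listing: dict) -> dict:
    """Count jobs per status and collect the IDs of jobs still in progress."""
    jobs = listing.get("jobs", [])
    return {
        "total": listing.get("total", len(jobs)),
        "by_status": dict(Counter(job["status"] for job in jobs)),
        # "complete" and "error" are the two terminal status values.
        "unfinished": [
            job["job_id"] for job in jobs
            if job["status"] not in ("complete", "error")
        ],
    }
```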

Example

curl https://your-domain/api/v1/jobs \
  -H "Authorization: Bearer YOUR_API_KEY"

GET /api/v1/jobs/:jobId - Get job status and results (Auth required)

Returns the full job status and results. The response includes transcripts, quality scores, corrections, cleaned transcripts, task briefs, and visual context annotations.

Path Parameters

Parameter  Type           Description
jobId      string (UUID)  The job ID returned by the transcribe endpoint

Response 200 — Complete Job

{
  "job_id": "550e8400-...",
  "status": "complete",
  "platform": "Loom",
  "is_loom": true,
  "progress": 100,
  "raw_transcript": "So today I want to show you...",
  "corrected_transcript": "So today I want to show you...",
  "cleaned_transcript": "Today I want to show you...",
  "quality_score": 88,
  "corrections": [
    {
      "timestamp": "00:01:23",
      "original": "their",
      "issue": "Homophone error",
      "suggested_correction": "there",
      "confidence": "high"
    }
  ],
  "brief": {
    "summary": "Client requests changes to...",
    "platforms": ["WordPress", "Elementor"],
    "task_type": "Update",
    "priority": "Normal",
    "requirements": ["1. Update hero section..."],
    "ambiguities": ["Which color scheme?"],
    "suggested_hero": "Web Hero",
    "hero_reasoning": "WordPress/Elementor task"
  },
  "visual_context": {
    "confirmed_annotations": [
      {
        "timestamp": "00:00:42",
        "trigger_phrase": "this button right here",
        "platform": "WordPress",
        "ui_element": "CTA button in hero section",
        "context": "Elementor editor showing hero block",
        "visible_text": "Get Started Now",
        "confidence": "high"
      }
    ],
    "unverified_annotations": [],
    "platform_summary": { "WordPress": 2 },
    "total_confirmed": 2,
    "total_dismissed": 1
  },
  "activity_log": [
    {
      "step": "transcription",
      "label": "Deepgram Transcription",
      "status": "complete",
      "elapsed": 3.2
    }
  ],
  "error": null
}
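
As an illustration of consuming a completed job, the sketch below turns the response into a few readable lines. summarize_job is a hypothetical helper. It reads only fields shown in the example above, and it makes no assumption about the unit of "elapsed", which this document does not state.

```python
def summarize_job(job: dict) -> list:
    """Return human-readable lines describing a completed job."""
    lines = [f"quality score: {job.get('quality_score')}"]
    for correction in job.get("corrections") or []:
        lines.append(
            f"[{correction['timestamp']}] {correction['original']!r} -> "
            f"{correction['suggested_correction']!r} "
            f"({correction['issue']}, {correction['confidence']} confidence)"
        )
    # Sum the per-step "elapsed" values reported in the activity log.
    total_elapsed = sum(step.get("elapsed", 0) for step in job.get("activity_log") or [])
    lines.append(f"logged pipeline time: {total_elapsed:.1f}")
    return lines
```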

Job Status Values

Status        Description
pending       Job created, waiting to start
extracting    Downloading and extracting audio
transcribing  Running Deepgram speech-to-text
auditing      AI quality audit in progress
correcting    AI auto-correction
verifying     Second opinion verification
cleaning      Transcript cleaning
briefing      Task brief generation
complete      All processing finished
error         Processing failed
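
Clients that do not use webhooks can poll the job until it reaches one of the two terminal statuses. The loop below is a sketch: wait_for_job is a hypothetical name, the interval and timeout defaults are arbitrary choices rather than documented recommendations, and fetch_job stands in for whatever function performs the authenticated GET on poll_url.

```python
import time
from typing import Callable

# "complete" and "error" are the only statuses after which nothing changes.
TERMINAL_STATUSES = {"complete", "error"}


def wait_for_job(fetch_job: Callable[[], dict],
                 interval: float = 5.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_job repeatedly until the job is complete or has failed."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

A job returned with status "error" carries the failure reason in its "error" field, so callers should check the status before reading transcripts.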

Webhooks

When you provide a webhook_url in the transcribe request, TranscribeHQ sends a POST notification when the job completes or fails.

Webhook Payload

{
  "event": "transcription.complete",
  "job_id": "550e8400-...",
  "status": "complete",
  "platform": "YouTube",
  "is_loom": false,
  "quality_score": 92,
  "error": null,
  "poll_url": "/api/v1/jobs/550e8400-...",
  "timestamp": "2026-03-13T01:05:00.000Z"
}

Webhook Headers

Header                Value
Content-Type          application/json
X-TranscribeHQ-Event  transcription.complete or transcription.failed
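
On the receiving side, a handler might validate the event header and pull out the fields it needs. parse_webhook below is a hypothetical, framework-independent sketch. This document describes no signature header, so a cautious receiver may prefer to treat the webhook as a hint and fetch the authoritative result from poll_url.

```python
import json

KNOWN_EVENTS = {"transcription.complete", "transcription.failed"}


def parse_webhook(headers: dict, raw_body: bytes) -> dict:
    """Validate the event header and extract key fields from the payload."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    event = lowered.get("x-transcribehq-event")
    if event not in KNOWN_EVENTS:
        raise ValueError(f"unexpected webhook event: {event!r}")
    payload = json.loads(raw_body)
    return {
        "event": event,
        "job_id": payload["job_id"],
        "succeeded": event == "transcription.complete",
        "error": payload.get("error"),
        "poll_url": payload.get("poll_url"),
    }
```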

Retry Policy

Failed deliveries are retried up to 3 times with increasing backoff delays (1s, 2s, 3s). Each attempt has a 10-second timeout.
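
To make the schedule concrete, the sketch below models one plausible reading of the policy: an initial attempt followed by up to three retries after 1s, 2s, and 3s. It illustrates the timing only and is not the service's actual delivery code. deliver_with_retries and the send callable are hypothetical, and the 10-second timeout is assumed to be enforced inside send.

```python
import time
from typing import Callable

RETRY_DELAYS = (1, 2, 3)  # seconds, as listed in the retry policy


def deliver_with_retries(send: Callable[[], bool],
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Try send() once, then retry after each delay; return attempts used."""
    attempts = 0
    for delay in (0, *RETRY_DELAYS):
        if delay:
            sleep(delay)  # wait before each retry, not before the first try
        attempts += 1
        if send():
            return attempts
    raise RuntimeError(f"delivery failed after {attempts} attempts")
```

Under this reading a receiver that is briefly unavailable has roughly six seconds of backoff, plus per-attempt timeouts, to recover before the notification is dropped. Polling poll_url remains the fallback.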

SSRF Protection: Webhook URLs pointing to localhost, private IPs (10.x, 172.x, 192.168.x), metadata endpoints (169.254.169.254), and .internal/.local domains are blocked.
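
A client can pre-check a webhook URL before submitting it. The function below approximates the rules above with Python's ipaddress module and is not the server's actual filter: is_webhook_url_allowed is a hypothetical name, hostnames are not resolved through DNS, and Python's private-range test covers 172.16.0.0/12 rather than every 172.x address.

```python
import ipaddress
from urllib.parse import urlparse


def is_webhook_url_allowed(url: str) -> bool:
    """Best-effort local check mirroring the documented SSRF rules."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if parsed.scheme != "https" or not host:
        return False  # webhook_url is documented as an HTTPS URL
    if host == "localhost" or host.endswith((".internal", ".local")):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname, not an IP literal; DNS is not checked here
    # Covers 10.x, 192.168.x, 172.16-31.x, loopback, and 169.254.x (metadata).
    return not (address.is_private or address.is_loopback or address.is_link_local)
```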

Visual Context Agent

The Visual Context Agent automatically detects moments in a video where the speaker references something on screen (e.g., "this button right here", "as you can see"). It extracts video frames at those timestamps and uses AI vision to analyze what's visible.

How It Works

  • Scans the transcript for 23 trigger phrases like "this right here", "as you can see", "this button"
  • Uses AI to classify whether each phrase is an active screen reference
  • Extracts 5 frames around each confirmed trigger timestamp
  • Analyzes frames with AI vision to identify platforms, UI elements, and visible text
  • Splits results into confirmed (high/medium confidence) and unverified (low confidence) buckets
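
The first and last steps above can be sketched as follows. This simplified illustration is not the agent's implementation: only the three trigger phrases quoted in this document are included (the full list of 23 is not published here), the AI classification and frame analysis steps are omitted, and both function names are hypothetical.

```python
import re

# Three trigger phrases quoted in this document; the real list has 23.
TRIGGER_PHRASES = ["this right here", "as you can see", "this button"]


def find_triggers(transcript: str) -> list:
    """Return (character offset, phrase) pairs for each trigger phrase found."""
    lowered = transcript.lower()
    hits = []
    for phrase in TRIGGER_PHRASES:
        for match in re.finditer(re.escape(phrase), lowered):
            hits.append((match.start(), phrase))
    return sorted(hits)


def split_by_confidence(annotations: list) -> tuple:
    """Split annotations into confirmed (high/medium) and unverified (low)."""
    confirmed = [a for a in annotations if a.get("confidence") in ("high", "medium")]
    unverified = [a for a in annotations if a.get("confidence") == "low"]
    return confirmed, unverified


hits = find_triggers("As you can see, this button right here is broken.")
```

In the real pipeline each hit would map to a transcript timestamp rather than a character offset, since frames are extracted around timestamps.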

Annotation Fields

Field           Type     Description
timestamp       string   When it occurs (e.g., "00:01:23")
trigger_phrase  string   The phrase that triggered capture
platform        string   Detected platform (e.g., "WordPress")
ui_element      string   Specific UI element identified
context         string   Description of what's visible on screen
visible_text    string?  Any readable text on screen
confidence      enum     high, medium, or low

AI Intelligence

All videos receive full AI processing beyond standard transcription:

Transcript Cleaning

AI removes filler words, fixes run-on sentences, and formats the transcript into clean paragraphs while preserving all meaning.

Task Brief Generation

AI generates a structured task brief from the video transcript, including:

Field           Description
summary         2-3 sentence overview of the request
platforms       Technologies mentioned (e.g., WordPress, Elementor)
task_type       Bug Fix, New Build, Update, Edit, Question, or Review Request
priority        Urgent, Normal, or Low
requirements    Numbered action items extracted from the video
ambiguities     Questions that need client clarification
suggested_hero  Web Hero, Automation Hero, Graphic Design Hero, or Video Hero
hero_reasoning  Why that hero type was recommended
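
Because several brief fields take a fixed set of values, a consumer can validate a brief before routing it. The check below is a sketch: validate_brief is a hypothetical helper and the allowed values are copied from the table above.

```python
# Allowed values copied from the brief field table.
TASK_TYPES = {"Bug Fix", "New Build", "Update", "Edit", "Question", "Review Request"}
PRIORITIES = {"Urgent", "Normal", "Low"}
HERO_TYPES = {"Web Hero", "Automation Hero", "Graphic Design Hero", "Video Hero"}

ENUM_FIELDS = {
    "task_type": TASK_TYPES,
    "priority": PRIORITIES,
    "suggested_hero": HERO_TYPES,
}
LIST_FIELDS = ("platforms", "requirements", "ambiguities")


def validate_brief(brief: dict) -> list:
    """Return a list of problems found; an empty list means the brief looks valid."""
    problems = []
    for field, allowed in ENUM_FIELDS.items():
        if brief.get(field) not in allowed:
            problems.append(f"{field}: unexpected value {brief.get(field)!r}")
    for field in LIST_FIELDS:
        if not isinstance(brief.get(field), list):
            problems.append(f"{field}: expected a list")
    return problems
```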

Quality Assurance

The brief goes through a self-check (verifying completeness against the transcript) and a visual verification pass (ensuring the brief references specific platforms and UI elements from visual annotations).

Pipeline Stages

Every transcription job progresses through these stages.

#   Stage             Engine                  Description
1   Audio Extraction  yt-dlp / ffmpeg         Download and extract audio from video
2   Transcription     Deepgram Nova-3         Speech-to-text with smart formatting
3   Quality Audit     Gemini 2.5 Flash        Identify potential transcription errors
4   Auto-Correction   Gemini 2.5 Flash        Fix identified issues in the transcript
5   Verification      Gemini 2.5 Flash        Second opinion QA + quality score
6   Visual Context    ffmpeg + Gemini Vision  Screen capture + AI vision analysis
7   Cleaning          Gemini 2.5 Flash        Remove fillers, format paragraphs
8   Task Brief        Gemini 2.5 Flash        Generate structured task brief
9   Self-Check        Gemini 2.5 Flash        Verify brief completeness
10  Visual Verify     Gemini 2.5 Flash        Ensure brief uses visual context