AssemblyAI JS SDK Reference

AssemblyAI JavaScript SDK

The AssemblyAI JavaScript SDK provides an easy-to-use interface for interacting with the AssemblyAI API, which supports async and streaming transcription. It is written primarily for Node.js in TypeScript with all types exported, but it is also compatible with other runtimes.

Claude Code

This repository includes a CLAUDE.md file that provides context to Claude Code about this SDK — key APIs, common patterns, and gotchas. When you open this repo in Claude Code, it automatically reads this file to give better assistance.

Using with AI coding agents

If you're integrating this SDK with Claude Code, Cursor, Copilot, or another AI coding assistant, give your agent current API context so it doesn't generate code against outdated model names or parameters.

The most effective option is project instructions. Add this to your CLAUDE.md, .cursorrules, AGENTS.md, or equivalent agent instructions file:

Always fetch https://assemblyai.com/docs/llms.txt before writing AssemblyAI code. The API has changed, do not rely on memorized parameter names.

For on-demand documentation lookups during a session, connect the AssemblyAI docs MCP server:

claude mcp add assemblyai-docs --transport http https://mcp.assemblyai.com/docs

For deep SDK context in Claude Code specifically, install the AssemblyAI skill:

claude install-skill https://github.com/AssemblyAI/assemblyai-skill

See Coding agent prompts for Cursor setup, MCP tool details, and tips for best results.

Documentation

Visit the AssemblyAI documentation for step-by-step instructions and more detail about our AI models and API. Explore the SDK API reference for details on the SDK types, functions, and classes.

Quickstart

Install the AssemblyAI SDK using your preferred package manager:

npm install assemblyai

yarn add assemblyai

pnpm add assemblyai

bun add assemblyai

Then, import the assemblyai module and create an AssemblyAI object with your API key:

import { AssemblyAI } from "assemblyai";

const baseUrl = "https://api.assemblyai.com";

const client = new AssemblyAI({
  apiKey: "YOUR_API_KEY",
  baseUrl: baseUrl,
});

You can now use the client object to interact with the AssemblyAI API.
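The snippet above hardcodes a placeholder key. A minimal sketch of reading the key from the environment instead is shown here; `requireApiKey` and the variable name `ASSEMBLYAI_API_KEY` are assumptions for illustration and are not part of the SDK.

```typescript
// Hypothetical helper, not part of the SDK: read the key from an env-like object
// and fail early with a clear message when it is missing.
function requireApiKey(
  env: Record<string, string | undefined>,
  name = "ASSEMBLYAI_API_KEY", // assumed variable name; use whatever your deployment defines
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}; set it before creating the client.`);
  }
  return value;
}

// Usage in Node.js (not executed here):
// const client = new AssemblyAI({ apiKey: requireApiKey(process.env) });
```

Passing the environment in as an argument keeps the helper easy to test and avoids depending on a specific runtime.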

Using a CDN

You can use automatic CDNs like UNPKG to load the library from a script tag.

Replace :version with the desired version or latest.
Remove .min to load the non-minified version.
Remove .streaming to load the entire SDK. Keep .streaming to load the Streaming STT specific version.
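The three rules above can be expressed as a small URL builder. This is only a sketch: `unpkgScriptUrl` is a hypothetical helper, and the URL pattern is taken from the script tags in this section.

```typescript
// Hypothetical helper that applies the three rules above to build a script URL.
function unpkgScriptUrl(
  version: string, // an exact version, or "latest"
  options: { minified?: boolean; streamingOnly?: boolean } = {},
): string {
  const streaming = options.streamingOnly ? ".streaming" : "";
  const min = options.minified ? ".min" : "";
  return `https://www.unpkg.com/assemblyai@${version}/dist/assemblyai${streaming}.umd${min}.js`;
}

// Full SDK, minified
const fullMinified = unpkgScriptUrl("latest", { minified: true });
// Streaming STT only, unminified
const streamingOnly = unpkgScriptUrl("latest", { streamingOnly: true });
```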

<!-- Unminified full SDK -->
<script src="https://www.unpkg.com/assemblyai@:version/dist/assemblyai.umd.js"></script>
<!-- Minified full SDK -->
<script src="https://www.unpkg.com/assemblyai@:version/dist/assemblyai.umd.min.js"></script>
<!-- Unminified Streaming STT only -->
<script src="https://www.unpkg.com/assemblyai@:version/dist/assemblyai.streaming.umd.js"></script>
<!-- Minified Streaming STT only -->
<script src="https://www.unpkg.com/assemblyai@:version/dist/assemblyai.streaming.umd.min.js"></script>

The script creates a global assemblyai variable containing all the services. Here's how you create a StreamingTranscriber object.

const { StreamingTranscriber } = assemblyai;
const transcriber = new StreamingTranscriber({
  token: "[GENERATE TEMPORARY AUTH TOKEN IN YOUR API]",
  ...
});

For type support in your IDE, see Reference types from JavaScript.

Speech-To-Text

Transcribe audio and video files

Transcribe an audio file with a public URL

When you create a transcript, you can either pass in a URL to an audio file or upload a file directly.

// Transcribe file at remote URL
const audioFile = "https://assembly.ai/sports_injuries.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-5-pro", "universal-2"],
  language_detection: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);
  console.log(transcript.text);
};

run();

[!NOTE] You can also pass a local file path, a stream, or a buffer as the audio property.

transcribe queues a transcription job and polls it until the status is completed or error.

If you don't want to wait until the transcript is ready, you can use submit:

let transcript = await client.transcripts.submit({
  audio: "https://assembly.ai/espn.m4a",
  speech_models: ["universal-3-5-pro", "universal-2"],
  language_detection: true,
});

Transcribe a local audio file

When you create a transcript, you can either pass in a URL to an audio file or upload a file directly.

// Upload a file via local path and transcribe
let transcript = await client.transcripts.transcribe({
  audio: "./news.mp4",
  speech_models: ["universal-3-5-pro", "universal-2"],
  language_detection: true,
});

[!NOTE] You can also pass a file URL, a stream, or a buffer as the audio property.

transcribe queues a transcription job and polls it until the status is completed or error.

If you don't want to wait until the transcript is ready, you can use submit:

let transcript = await client.transcripts.submit({
  audio: "./news.mp4",
  speech_models: ["universal-3-5-pro", "universal-2"],
  language_detection: true,
});

Enable additional Speech Understanding models

You can extract even more insights from the audio by enabling any of our Speech Understanding models using transcription options. For example, here's how to enable the Speaker Diarization model to detect who said what.

import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  apiKey: "<YOUR_API_KEY>",
});

const audioFile = "https://assembly.ai/wildfires.mp3";

const params = {
  audio: audioFile,
  speech_models: ["universal-3-5-pro", "universal-2"],
  language_detection: true,
  speaker_labels: true,
};

const run = async () => {
  const transcript = await client.transcripts.transcribe(params);

  for (const utterance of transcript.utterances!) {
    console.log(`Speaker ${utterance.speaker}: ${utterance.text}`);
  }
};

run();
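Diarized output often contains several consecutive utterances from the same speaker. The sketch below merges them into one block per speaker turn. `UtteranceLike` models only the `speaker` and `text` fields used above, and `mergeBySpeaker` is a hypothetical helper, not an SDK function.

```typescript
// Minimal shape assumed for this sketch; the SDK's utterance type has more fields.
interface UtteranceLike {
  speaker: string;
  text: string;
}

// Merge consecutive utterances from the same speaker into a single entry.
function mergeBySpeaker(utterances: UtteranceLike[]): UtteranceLike[] {
  const merged: UtteranceLike[] = [];
  for (const { speaker, text } of utterances) {
    const last = merged[merged.length - 1];
    if (last && last.speaker === speaker) {
      last.text += ` ${text}`;
    } else {
      merged.push({ speaker, text }); // copy, so the input array is not mutated
    }
  }
  return merged;
}
```

Calling it as `mergeBySpeaker(transcript.utterances ?? [])` also covers the case where no utterances are returned, which the non-null assertion in the example above would not catch.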

Get a transcript

This will return the transcript object in its current state. If the transcript is still processing, the status field will be queued or processing. Once the transcript is complete, the status field will be completed.

// Use a new name: a const cannot reference itself in its own initializer.
const latest = await client.transcripts.get(transcript.id);

If you created a transcript using .submit(), you can still poll until the transcript status is completed or error using .waitUntilReady():

const readyTranscript = await client.transcripts.waitUntilReady(transcript.id, {
  // How frequently the transcript is polled in ms. Defaults to 3000.
  pollingInterval: 1000,
  // How long to wait in ms until the "Polling timeout" error is thrown. Defaults to infinite (-1).
  pollingTimeout: 5000,
});
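To make the two options concrete, here is a sketch of a generic polling loop with an interval and a timeout. It illustrates the general technique only and is not the SDK's implementation; `pollUntilReady` and the injected `fetchStatus` callback are hypothetical.

```typescript
type TranscriptStatus = "queued" | "processing" | "completed" | "error";

// Hypothetical stand-in for waitUntilReady. `fetchStatus` is injected so the loop
// can run without the API; with the SDK it could wrap client.transcripts.get.
async function pollUntilReady(
  fetchStatus: () => Promise<TranscriptStatus>,
  pollingInterval = 3000, // ms between checks
  pollingTimeout = -1, // ms before giving up; -1 waits indefinitely
): Promise<TranscriptStatus> {
  const startedAt = Date.now();
  for (;;) {
    const status = await fetchStatus();
    if (status === "completed" || status === "error") return status;
    if (pollingTimeout >= 0 && Date.now() - startedAt >= pollingTimeout) {
      throw new Error("Polling timeout");
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollingInterval));
  }
}
```

Note that a terminal status of error is returned rather than thrown in this sketch, so the caller still has to check the status before reading the text.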

Get sentences and paragraphs

const { sentences } = await client.transcripts.sentences(transcript.id);
const { paragraphs } = await client.transcripts.paragraphs(transcript.id);

for (const paragraph of paragraphs) {
  console.log(paragraph.text);
}

for (const sentence of sentences) {
  console.log(sentence.text);
}

Get subtitles

const charsPerCaption = 32;
let srt = await client.transcripts.subtitles(transcript.id, "srt");
srt = await client.transcripts.subtitles(transcript.id, "srt", charsPerCaption);

let vtt = await client.transcripts.subtitles(transcript.id, "vtt");
vtt = await client.transcripts.subtitles(transcript.id, "vtt", charsPerCaption);

List transcripts

This will return a page of transcripts you created.

const page = await client.transcripts.list();

You can also paginate over all pages.

let previousPageUrl: string | null = null;
do {
  const page = await client.transcripts.list(previousPageUrl);
  previousPageUrl = page.page_details.prev_url;
} while (previousPageUrl !== null);

[!NOTE] To paginate over all pages, use page.page_details.prev_url, because transcripts are returned in descending order by creation date and time. The first page contains the most recent transcripts, and each "previous" page contains older ones.
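If you want every transcript in one array, the loop above can be wrapped in a helper. This is a sketch under assumptions: `PageLike` models only the fields used here (the `transcripts` field name is assumed), `collectAllTranscripts` and `maxPages` are hypothetical, and the page fetcher is injected so the logic can be exercised without the API.

```typescript
// Minimal page shape assumed for this sketch; only the fields used here are modeled.
interface PageLike<T> {
  transcripts: T[]; // assumed field name for the items on a page
  page_details: { prev_url: string | null };
}

// Collect items from every page, newest first, by following prev_url until it is null.
// With the SDK, `listPage` could wrap client.transcripts.list.
async function collectAllTranscripts<T>(
  listPage: (url?: string) => Promise<PageLike<T>>,
  maxPages = 100, // hypothetical safety cap against runaway loops
): Promise<T[]> {
  const all: T[] = [];
  let url: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page = await listPage(url);
    all.push(...page.transcripts);
    const prev = page.page_details.prev_url;
    if (prev === null) break;
    url = prev;
  }
  return all;
}
```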

Delete a transcript

const res = await client.transcripts.delete(transcript.id);

Transcribe audio synchronously

client.sync posts a whole audio file and returns the finished transcript in one round trip — no job id, no polling. Use it for short clips where you want the answer inline; use client.transcripts for long-form audio, URLs, or the rich audio-intelligence features the sync API doesn't expose.

const result = await client.sync.transcribe("./call.wav");
console.log(result.text, result.session_id);

The input can be a local file path, raw audio bytes, a Blob, or a readable stream — but not a URL.

Configure the transcription

const result = await client.sync.transcribe("./call.wav", {
  prompt: "Transcribe verbatim. Preserve disfluencies.", // max 4096 chars
  keyterms_prompt: ["AssemblyAI", "Lemur"], // max 2048 chars total
  language_codes: ["es"], // or e.g. ["en", "es"] for multilingual; defaults to English
  conversation_context: [
    // prior turns, oldest first
    "I'd like to book a flight to Denver.",
    "Sure, what date were you thinking?",
  ],
});

Raw S16LE PCM audio needs sample_rate and channels; WAV reads them from its header.

const result = await client.sync.transcribe(rawPcmBytes, {
  sample_rate: 16_000,
  channels: 1,
});
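Raw PCM has no header, so the byte count alone says nothing about duration. The arithmetic below shows why both values are needed: each S16LE sample is 2 bytes, and each frame holds one sample per channel. `pcmDurationMs` is a hypothetical helper for sanity-checking your inputs, not an SDK function.

```typescript
// Duration in milliseconds of raw S16LE PCM: 2 bytes per sample, one sample per channel per frame.
function pcmDurationMs(byteLength: number, sampleRate: number, channels: number): number {
  const bytesPerFrame = 2 * channels;
  if (byteLength % bytesPerFrame !== 0) {
    throw new Error("Byte length is not a whole number of S16LE frames");
  }
  return (byteLength / bytesPerFrame / sampleRate) * 1000;
}

// 32,000 bytes of 16 kHz mono audio is exactly one second.
const oneSecondMs = pcmDurationMs(32000, 16000, 1);
```

The same 32,000 bytes interpreted as stereo would be half a second, which is why a wrong channels value produces garbled results.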

Get word timestamps

Word timestamps are opt-in. By default each word in result.words carries text and confidence only — start/end are absent. Set timestamps: true to get accurate per-word timings at a small latency cost.

const result = await client.sync.transcribe("./call.wav", {
  timestamps: true,
});
for (const word of result.words) {
  console.log(word.text, word.start, word.end); // milliseconds
}
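Because start and end may be missing, code that formats words should handle both cases. A minimal sketch follows; `SyncWordLike` is a local type based on the description above, and `describeWord` is a hypothetical helper.

```typescript
// Shape based on the description above: start and end are present only with timestamps: true.
interface SyncWordLike {
  text: string;
  confidence: number;
  start?: number | null; // milliseconds
  end?: number | null; // milliseconds
}

// Format a word with its timing when available, and without it otherwise.
function describeWord(word: SyncWordLike): string {
  if (word.start == null || word.end == null) {
    return `${word.text} (no timing)`;
  }
  const seconds = (ms: number): string => (ms / 1000).toFixed(2);
  return `${word.text} [${seconds(word.start)}s to ${seconds(word.end)}s]`;
}
```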

Pre-warm the connection

The sync API is a single request/response, so a transcribe() that connects on demand pays the full DNS + TCP + TLS handshake on the critical path. Call warm() as soon as you know audio is coming — for example while it is still being recorded — so the next transcribe() reuses the open connection.

await client.sync.warm(); // fire as recording starts
const audio = await recordUntilDone();
const result = await client.sync.transcribe(audio); // reuses the hot connection

Handle errors

Failures throw a SyncTranscriptError with the HTTP status, a machine-readable errorCode (bad_audio, audio_too_large, capacity_exceeded, …), and retryAfter (seconds) on 429/503 responses.

import { SyncTranscriptError } from "assemblyai";

try {
  const result = await client.sync.transcribe("./call.wav");
} catch (error) {
  if (error instanceof SyncTranscriptError) {
    console.error(error.status, error.errorCode, error.retryAfter);
  }
}
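Since 429 and 503 responses carry retryAfter, a caller can wait and try again. The wrapper below is a sketch under assumptions: it checks the error structurally so it runs without the SDK (in real code you would use `error instanceof SyncTranscriptError`), and `withRetry`, `maxAttempts`, and `fallbackDelaySeconds` are hypothetical.

```typescript
// Structural stand-in for SyncTranscriptError, so the sketch is self-contained.
interface RetryableError {
  status: number;
  retryAfter?: number; // seconds
}

function isRetryable(error: unknown): error is RetryableError {
  if (typeof error !== "object" || error === null) return false;
  const status = (error as { status?: unknown }).status;
  return status === 429 || status === 503;
}

// Retry `attempt` on 429/503, waiting retryAfter seconds between tries.
async function withRetry<T>(
  attempt: () => Promise<T>,
  maxAttempts = 3, // hypothetical default
  fallbackDelaySeconds = 1, // hypothetical default when retryAfter is missing
): Promise<T> {
  for (let i = 1; ; i++) {
    try {
      return await attempt();
    } catch (error) {
      if (!isRetryable(error) || i >= maxAttempts) throw error;
      const seconds = typeof error.retryAfter === "number" ? error.retryAfter : fallbackDelaySeconds;
      await new Promise<void>((resolve) => setTimeout(resolve, seconds * 1000));
    }
  }
}

// Usage with the client (not executed here):
// const result = await withRetry(() => client.sync.transcribe("./call.wav"));
```

Errors such as bad_audio are rethrown at once, since retrying the same input would not help.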

Transcribe streaming audio

Refer to AssemblyAI's streaming documentation for full code examples.

Create the streaming transcriber.

const transcriber = client.streaming.transcriber({
  speechModel: "universal-3-5-pro",
  sampleRate: 16_000,
});

[!WARNING] Storing your API key in client-facing applications exposes your API key. Generate a temporary auth token on the server and pass it to your client.

Server code:

const token = await client.streaming.createTemporaryToken({
  expires_in_seconds: 60,
});
// TODO: return token to client

Client code:

import { StreamingTranscriber } from "assemblyai";
// TODO: implement getToken to retrieve token from server
const token = await getToken();
const transcriber = new StreamingTranscriber({
  token,
});

You can configure the following events.

transcriber.on("open", ({ id, expires_at }) => console.log("Session ID:", id, "Expires at:", expires_at));
transcriber.on("close", (code: number, reason: string) => console.log("Closed", code, reason));
transcriber.on("turn", ({ transcript }) => console.log("Transcript:", transcript));
transcriber.on("error", (error: Error) => console.error("Error", error));

After configuring your events, connect to the server.

await transcriber.connect();

Send audio data via chunks.

// Pseudo code for getting audio
getAudio((chunk) => {
  transcriber.sendAudio(chunk);
});
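If your audio is already in memory rather than arriving from a microphone, you can cut it into fixed-duration chunks before sending. The sketch below assumes 16-bit mono PCM, which this section does not state, and the 50 ms default is an arbitrary choice; `chunkPcm` is a hypothetical helper.

```typescript
// Split 16-bit mono PCM into fixed-duration chunks (assumed encoding; 2 bytes per sample).
function chunkPcm(audio: Uint8Array, sampleRate: number, chunkMs = 50): Uint8Array[] {
  const bytesPerChunk = Math.floor((sampleRate * chunkMs) / 1000) * 2;
  if (bytesPerChunk <= 0) throw new Error("Chunk duration is too short");
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.length; offset += bytesPerChunk) {
    // slice copies, so each chunk owns its own buffer
    chunks.push(audio.slice(offset, offset + bytesPerChunk));
  }
  return chunks;
}

// At 16 kHz, a 50 ms chunk is 800 samples, or 1600 bytes.
const pcmChunks = chunkPcm(new Uint8Array(4000), 16000);
```

Each chunk can then be passed to transcriber.sendAudio. Whether it expects the typed array or its underlying buffer is worth checking against the SDK types.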

Or send audio data via a stream:

audioStream.pipeTo(transcriber.stream());

Close the connection when you're finished.

await transcriber.close();

Contributing

If you want to contribute to the JavaScript SDK, follow the guidelines in CONTRIBUTING.md.