WebSocket API for Streaming Transcription


To start a new stream, the connection must first be set up. A WebSocket connection starts with an HTTP GET request carrying the header fields Upgrade: websocket and Connection: Upgrade, as specified in RFC 6455.

GET /asr/v0.1/stream HTTP/1.1
Host: api.myrtle.ai
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Protocol: stream.asr.api.myrtle.ai
Sec-WebSocket-Version: 13

If the handshake succeeds, the server responds with 101 Switching Protocols.

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Protocol: stream.asr.api.myrtle.ai

The server will return HTTP/1.1 400 Bad Request if the request is invalid.
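A client can check the handshake by recomputing the Sec-WebSocket-Accept value from the key it sent. The sketch below follows the RFC 6455 rule: append the fixed GUID to the key, take the SHA-1 hash, and Base64-encode it. The function name expected_accept is illustrative and not part of this API.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value a server should send for this key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Key from the request example above; the result matches the response example.
print(expected_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

Most WebSocket client libraries perform this check internally, so it is mainly useful when debugging a failed upgrade.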

Request Parameters

Parameters are query-encoded in the request URL.

Content Type


Requests can specify the audio format with the content_type parameter. If the content type is not specified then the server will attempt to infer it. Currently only audio/x-raw is supported.

Supported content types are:

  • audio/x-raw: Unstructured and uncompressed raw audio data. If raw audio is used then additional parameters must be provided by adding:
    • format: The format of audio samples. Only S16LE is currently supported
    • rate: The sample rate of the audio. Only 16000 is currently supported
    • channels: The number of channels. Only 1 channel is currently supported

As a query parameter, this would look like:
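A minimal sketch of one possible encoding follows. It assumes, without confirmation from this page, that format, rate and channels are appended to the content type as semicolon-separated media-type parameters, and that the endpoint is reached over wss. The helper name build_stream_url is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_stream_url(fmt: str = "S16LE", rate: int = 16000, channels: int = 1) -> str:
    """Build a stream URL with a percent-encoded content_type parameter."""
    # Assumption: the raw-audio parameters ride along inside content_type.
    content_type = f"audio/x-raw;format={fmt};rate={rate};channels={channels}"
    query = urlencode({"content_type": content_type})
    # Host and path are taken from the handshake example; the wss scheme is assumed.
    return f"wss://api.myrtle.ai/asr/v0.1/stream?{query}"


url = build_stream_url()
# Decode the query again to show the value the server would see.
decoded = parse_qs(urlsplit(url).query)["content_type"][0]
print(url)
print(decoded)
```

Using urlencode keeps the "/", ";" and "=" characters inside the value from being confused with URL syntax.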


Model Identifier


Requests can specify a transcription model identifier.

Model Version


Requests can specify the transcription model version, either "latest" or a specific version id.

Model Language


The BCP 47 language tag for the speech in the audio.

Max Number of Alternatives


The maximum number of alternative transcriptions to provide.

Supported Models

Model id    Version    Supported Languages

Request Frames

For audio/x-raw audio, raw audio samples in the format specified in the format parameter should be sent in WebSocket Binary frames without padding. Frames can be any length greater than zero.

A WebSocket Binary frame of length zero is treated as an end-of-stream (EOS) message.
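The framing rules above can be sketched as a generator that splits S16LE, 16000 Hz, mono audio into non-empty payloads and finishes with the zero-length EOS payload. The 100 ms chunk size and the choice to split on sample boundaries are assumptions made for illustration; this page only requires frames longer than zero bytes.

```python
from typing import Iterator

BYTES_PER_SAMPLE = 2   # S16LE: signed 16-bit little-endian samples
SAMPLE_RATE = 16000    # the only sample rate currently supported


def audio_frames(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield binary frame payloads for raw audio, then an empty EOS payload."""
    if len(pcm) % BYTES_PER_SAMPLE:
        raise ValueError("S16LE audio must contain a whole number of 2-byte samples")
    chunk_bytes = SAMPLE_RATE * chunk_ms // 1000 * BYTES_PER_SAMPLE
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms is too small to hold a single sample")
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]   # never empty inside this loop
    yield b""                                    # zero-length frame signals EOS


# One second of silence becomes ten 3200-byte frames followed by EOS.
frames = list(audio_frames(bytes(32000)))
print(len(frames), len(frames[0]), frames[-1])
```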

Response Frames

Response frames are sent as WebSocket Text frames containing JSON.

{
  "start": 0.0,
  "end": 2.0,
  "is_provisional": false,
  "alternatives": [
    {
      "transcript": "hello world",
      "confidence": 1.0
    }
  ]
}
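A client might decode each text frame and keep the highest-confidence alternative, as sketched below. The field names come from the example response. The helper name best_transcript is hypothetical, and treating is_provisional as "may still be revised" is an assumption this page does not spell out.

```python
import json
from typing import Optional


def best_transcript(frame_text: str) -> Optional[dict]:
    """Parse one response frame and return its highest-confidence alternative."""
    message = json.loads(frame_text)
    alternatives = message.get("alternatives", [])
    if not alternatives:
        return None
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return {
        "start": message["start"],
        "end": message["end"],
        # Assumption: provisional results may be superseded by later frames.
        "final": not message["is_provisional"],
        "transcript": best["transcript"],
    }


example = (
    '{"start": 0.0, "end": 2.0, "is_provisional": false, '
    '"alternatives": [{"transcript": "hello world", "confidence": 1.0}]}'
)
result = best_transcript(example)
print(result)
```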

Closing the Connection

The client should not close the WebSocket connection itself. Instead, it should send an EOS message and wait for a WebSocket Close frame from the server. Closing the connection before receiving the server's Close frame may cause transcription results to be dropped.

An end-of-stream (EOS) message can be sent by sending a zero-length binary frame.
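The shutdown sequence can be sketched as follows: send the audio, send the zero-length EOS frame, then keep reading until the server closes the connection. The code assumes a connection object with an awaitable send method and asynchronous iteration that ends when the server's Close frame arrives, which is how connections from the third-party websockets package behave. FakeConnection is an in-memory stand-in so the sketch runs without a server.

```python
import asyncio


async def stream_and_drain(ws, chunks):
    """Send audio, signal EOS, then collect results until the server closes."""
    for chunk in chunks:
        if chunk:                  # zero-length frames are reserved for EOS
            await ws.send(chunk)
    await ws.send(b"")             # EOS: zero-length binary frame
    results = []
    async for message in ws:       # assumed to end on the server's Close frame
        results.append(message)
    return results


class FakeConnection:
    """In-memory stand-in used only to exercise stream_and_drain."""

    def __init__(self, replies):
        self.sent = []
        self._replies = list(replies)

    async def send(self, data):
        self.sent.append(data)

    def __aiter__(self):
        return self._iterate()

    async def _iterate(self):
        for reply in self._replies:
            yield reply


conn = FakeConnection(['{"alternatives": []}'])
results = asyncio.run(stream_and_drain(conn, [b"\x00\x00", b"\x01\x00"]))
print(conn.sent, results)
```

The important design point is that the client never initiates the close: it only stops reading once the server has finished sending results.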


If an error occurs, the server will send a WebSocket Close frame, with error details in the body.

Error Code    Details
400           Invalid parameters passed.
503           Maximum number of simultaneous connections reached.