Class: RecognizeStream

new RecognizeStream(options)

pipe()-able Node.js Duplex stream - accepts binary audio and emits text or objects in its data events.

Uses WebSockets under the hood. For audio with no recognizable speech, no data events are emitted.

By default, only finalized text is emitted in the data events; however, when objectMode/readableObjectMode and interim_results are enabled, both interim and final result objects are emitted. WriteableElementStream uses this, for example, to live-update the DOM with word-by-word transcriptions.

Note that the WebSocket connection is not established until the first chunk of data is received. This allows for auto-detection of the content type (for wav/flac/opus audio).
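The auto-detection idea can be sketched as follows. This helper is illustrative only (not the library's actual implementation): each supported container format begins with a distinctive four-byte magic string in its file header.

```javascript
// Illustrative sketch (not the library's actual code): infer the audio
// content type from the magic bytes at the start of the first chunk.
function guessContentType(chunk) {
  const magic = chunk.subarray(0, 4).toString('ascii');
  if (magic === 'RIFF') return 'audio/wav';             // WAV container
  if (magic === 'fLaC') return 'audio/flac';            // FLAC stream
  if (magic === 'OggS') return 'audio/ogg;codecs=opus'; // Ogg page (Opus assumed)
  return null; // unrecognized: a content-type option must be given explicitly
}
```

When detection fails (returns null here), the content-type option (default 'audio/wav') would have to supply the type instead.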

Parameters:
Name Type Description
options Object
Properties
Name Type Attributes Default Description
model String <optional>
'en-US_BroadbandModel'

voice model to use. Microphone streaming only supports broadband models.

url String <optional>
'wss://stream.watsonplatform.net/speech-to-text/api'

base URL for service

token String <optional>

Auth token

headers Object <optional>

Only works in Node.js, not in browsers. Allows custom headers to be set, including an Authorization header (avoiding the need for auth tokens)

content-type String <optional>
'audio/wav'

content type of the audio; can be automatically determined from the file header in most cases. Only wav, flac, and ogg/opus are supported

interim_results Boolean <optional>
true

Send back non-final previews of each "sentence" as it is being processed. These results are ignored in text mode.

continuous Boolean <optional>
true

set to false to automatically stop the transcription after the first "sentence"

word_confidence Boolean <optional>
false

include confidence scores with results. Defaults to true when in objectMode.

timestamps Boolean <optional>
false

include timestamps with results. Defaults to true when in objectMode.

max_alternatives Number <optional>
1

maximum number of alternative transcriptions to include. Defaults to 3 when in objectMode.

keywords Array.<String> <optional>

a list of keywords to search for in the audio

keywords_threshold Number <optional>

Number between 0 and 1 representing the minimum confidence before including a keyword in the results. Required when options.keywords is set

word_alternatives_threshold Number <optional>

Number between 0 and 1 representing the minimum confidence before including an alternative word in the results. Must be set to enable word alternatives.

profanity_filter Boolean <optional>
false

set to true to filter out profanity and replace the words with asterisks

inactivity_timeout Number <optional>
30

how many seconds of silence before automatically closing the stream (even if continuous is true). use -1 for infinity

readableObjectMode Boolean <optional>
false

emit result objects instead of string Buffers for the data events. Does not affect input (which must be binary)

objectMode Boolean <optional>
false

alias for options.readableObjectMode

X-Watson-Learning-Opt-Out Boolean <optional>
false

set to true to opt out of allowing Watson to use this request to improve its services

smart_formatting Boolean <optional>
false

formats numeric values such as dates, times, currency, etc.

customization_id String <optional>

not yet supported on the public STT service


Methods

stop()

Prevents any more audio from being sent over the WebSocket and gracefully closes the connection. Additional data may still be emitted up until the end event is triggered.
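A graceful-shutdown sketch based on the behavior described above; `stream` is assumed to be an active RecognizeStream instance, and `finishRecording` is a hypothetical helper name:

```javascript
// Sketch of a graceful shutdown. `stream` is assumed to be an active
// RecognizeStream instance; `finishRecording` is a hypothetical helper.
function finishRecording(stream, onDone) {
  // Results already in flight may still arrive after stop() is called,
  // so wait for the 'end' event before treating the transcript as complete.
  stream.on('end', onDone);
  stream.stop();
}
```

For example: `finishRecording(stream, () => console.log('transcription complete'));`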


Events

close

Parameters:
Name Type Description
reasonCode Number
description String

data

Finalized text

Parameters:
Name Type Description
transcript String

data

Object with interim or final results, possibly including confidence scores, alternatives, and word timing.

Parameters:
Name Type Description
data Object

error

Parameters:
Name Type Attributes Description
msg String

custom error message

frame * <optional>

unprocessed frame (should have a .data property with either string or binary data)

err Error <optional>

listening

Emitted when the Watson service indicates readiness to transcribe audio. Any audio sent before this point is buffered until the service is ready.


message

Emitted for any message received over the wire; mainly used for debugging.

Parameters:
Name Type Attributes Description
message Object

frame object with a data attribute that's either a string or a Buffer/TypedArray

data Object <optional>

parsed JSON object (if possible)


open

emitted once the WebSocket connection has been established


send-data

Emitted for any binary data sent to the service from the client. Mainly used for debugging.

Parameters:
Name Type Description
msg Object

send-json

Emits any JSON object sent to the service from the client. Mainly used for debugging.

Parameters:
Name Type Description
msg Object
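The debugging events (message, send-data, send-json) can be wired up as in this sketch; `attachDebugLogging` is a hypothetical helper name, and the handler bodies are illustrative:

```javascript
// Debugging sketch: log traffic in both directions.
// `stream` is assumed to be a RecognizeStream instance.
function attachDebugLogging(stream) {
  stream.on('message', (message) => {
    // raw frame received from the service; data may be a string or Buffer
    console.log('received:', message.data);
  });
  stream.on('send-json', (msg) => {
    console.log('sent JSON:', JSON.stringify(msg));
  });
  stream.on('send-data', () => {
    console.log('sent a binary audio chunk');
  });
  return stream;
}
```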

stop

Event emitted when the stop method is called. Mainly for synchronising with file reading and playback.
