JSDoc: Class: RecognizeStream

new RecognizeStream(options)

pipe()-able Node.js Readable/Writable stream: accepts binary audio and emits text or objects in its data events.

Uses WebSockets under the hood. For audio with no recognizable speech, no data events are emitted.

By default, only finalized text is emitted in the data events. In readableObjectMode (usually just objectMode when using a helper method), the data events instead carry result objects, which may be interim or final.

An interim result looks like this:

 { alternatives:
   [ { timestamps:
        [ [ 'it', 20.9, 21.04 ],
          [ 'is', 21.04, 21.17 ],
          [ 'a', 21.17, 21.25 ],
          [ 'site', 21.25, 21.56 ],
          [ 'that', 21.56, 21.7 ],
          [ 'hardly', 21.7, 22.06 ],
          [ 'anyone', 22.06, 22.49 ],
          [ 'can', 22.49, 22.67 ],
          [ 'behold', 22.67, 23.13 ],
          [ 'without', 23.13, 23.46 ],
          [ 'some', 23.46, 23.67 ],
          [ 'sort', 23.67, 23.91 ],
          [ 'of', 23.91, 24 ],
          [ 'unwanted', 24, 24.58 ],
          [ 'emotion', 24.58, 25.1 ] ],
       transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotion ' } ],
  final: false,
  result_index: 3 }

While a final result looks like this (some features only appear in final results):

  { alternatives:
     [ { word_confidence:
          [ [ 'it', 1 ],
            [ 'is', 0.956286624429304 ],
            [ 'a', 0.8105753725270362 ],
            [ 'site', 1 ],
            [ 'that', 1 ],
            [ 'hardly', 1 ],
            [ 'anyone', 1 ],
            [ 'can', 1 ],
            [ 'behold', 0.5273598005406737 ],
            [ 'without', 1 ],
            [ 'some', 1 ],
            [ 'sort', 1 ],
            [ 'of', 1 ],
            [ 'unwanted', 1 ],
            [ 'emotion', 0.49401837076320887 ] ],
         confidence: 0.881,
         transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotion ',
         timestamps:
          [ [ 'it', 20.9, 21.04 ],
            [ 'is', 21.04, 21.17 ],
            [ 'a', 21.17, 21.25 ],
            [ 'site', 21.25, 21.56 ],
            [ 'that', 21.56, 21.7 ],
            [ 'hardly', 21.7, 22.06 ],
            [ 'anyone', 22.06, 22.49 ],
            [ 'can', 22.49, 22.67 ],
            [ 'behold', 22.67, 23.13 ],
            [ 'without', 23.13, 23.46 ],
            [ 'some', 23.46, 23.67 ],
            [ 'sort', 23.67, 23.91 ],
            [ 'of', 23.91, 24 ],
            [ 'unwanted', 24, 24.58 ],
            [ 'emotion', 24.58, 25.1 ] ] },
       { transcript: 'it is a sight that hardly anyone can behold without some sort of unwanted emotion ' },
       { transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotions ' } ],
    final: true,
    result_index: 3 }
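
As a rough illustration of how the result objects above might be consumed, the sketch below picks the first alternative and lists its low-confidence words. The helper name `summarizeResult` and the 0.6 cutoff are assumptions made for this example; neither comes from the library.

```javascript
// Hypothetical helper (not part of the library): summarize one result object
// shaped like the examples above.
function summarizeResult(result, minConfidence = 0.6) {
  const best = result.alternatives[0];
  // word_confidence only appears in final results, so default to an empty list.
  const lowConfidenceWords = (best.word_confidence || [])
    .filter(([, score]) => score < minConfidence)
    .map(([word]) => word);
  return {
    final: result.final,
    transcript: best.transcript.trim(),
    lowConfidenceWords,
  };
}

// Abbreviated sample data taken from the final result shown above.
const sample = {
  alternatives: [
    {
      word_confidence: [
        ['it', 1],
        ['is', 0.956286624429304],
        ['behold', 0.5273598005406737],
        ['emotion', 0.49401837076320887],
      ],
      confidence: 0.881,
      transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotion ',
    },
  ],
  final: true,
  result_index: 3,
};

const summary = summarizeResult(sample);
console.log(summary.lowConfidenceWords);
```

Because interim results carry no `word_confidence` array, the helper returns an empty list for them rather than failing.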

Parameters:

Name	Type	Description
`options`	Object

Properties

Name	Type	Attributes	Default	Description
`model`	String	<optional>	'en-US_BroadbandModel'	voice model to use. Microphone streaming only supports broadband models.
`url`	String	<optional>	'wss://stream.watsonplatform.net/speech-to-text/api'	base URL for service
`content-type`	String	<optional>	'audio/wav'	content type of audio; can be automatically determined from the file header in most cases. Only wav, flac, and ogg/opus are supported.
`interim_results`	Boolean	<optional>	true	Send back non-final previews of each "sentence" as it is being processed. These results are ignored in text mode.
`continuous`	Boolean	<optional>	true	set to false to automatically stop the transcription after the first "sentence"
`word_confidence`	Boolean	<optional>	false	include confidence scores with results. Defaults to true when in objectMode.
`timestamps`	Boolean	<optional>	false	include timestamps with results. Defaults to true when in objectMode.
`max_alternatives`	Number	<optional>	1	maximum number of alternative transcriptions to include. Defaults to 3 when in objectMode.
`keywords`	Array.<String>	<optional>		a list of keywords to search for in the audio
`keywords_threshold`	Number	<optional>		Number between 0 and 1 representing the minimum confidence before including a keyword in the results. Required when options.keywords is set
`word_alternatives_threshold`	Number	<optional>		Number between 0 and 1 representing the minimum confidence before including an alternative word in the results. Must be set to enable word alternatives.
`profanity_filter`	Boolean	<optional>	false	set to true to filter out profanity and replace the words with asterisks
`inactivity_timeout`	Number	<optional>	30	how many seconds of silence before automatically closing the stream (even if continuous is true). use -1 for infinity
`readableObjectMode`	Boolean	<optional>	false	emit `result` objects instead of string Buffers for the `data` events. Changes several other defaults.
`X-WDC-PL-OPT-OUT`	Number	<optional>	0	set to 1 to opt out of allowing Watson to use this request to improve its services
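
The interaction between `readableObjectMode` and the other defaults can be easier to see in code. The sketch below is only an illustration of the defaults listed in the table, not the library's own option handling; the function name `resolveOptions` is hypothetical.

```javascript
// Hypothetical helper: merges caller options with the defaults listed in the table above.
function resolveOptions(options = {}) {
  const objectMode = Boolean(options.readableObjectMode);
  const resolved = Object.assign(
    {
      model: 'en-US_BroadbandModel',
      'content-type': 'audio/wav',
      interim_results: true,
      continuous: true,
      // These three defaults change in object mode, per the table.
      word_confidence: objectMode,
      timestamps: objectMode,
      max_alternatives: objectMode ? 3 : 1,
      profanity_filter: false,
      inactivity_timeout: 30,
      readableObjectMode: false,
    },
    options
  );
  // The table marks keywords_threshold as required whenever keywords is set.
  if (resolved.keywords && typeof resolved.keywords_threshold !== 'number') {
    throw new Error('keywords_threshold is required when keywords is set');
  }
  return resolved;
}

const textModeOptions = resolveOptions();
const objectModeOptions = resolveOptions({ readableObjectMode: true });
console.log(textModeOptions.max_alternatives, objectModeOptions.max_alternatives);
```

Explicit values always win over the computed defaults, so a caller can still request `timestamps: false` in object mode.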

Source:

speech-to-text/recognize-stream.js, line 128

Methods

(inner) flowForResults(event)

Listening for results events puts the stream in flowing mode, just as listening for data events does.

Parameters:

Name	Type	Description
`event`	String

Source:

speech-to-text/recognize-stream.js, line 141
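
One way such behavior could be achieved with Node's standard `newListener` event is sketched below. It uses a plain object-mode `Readable` as a stand-in and is an assumption about the general technique, not a copy of the code in recognize-stream.js.

```javascript
const { Readable } = require('stream');

// Stand-in object-mode stream; the real RecognizeStream is not constructed here.
const stream = new Readable({ objectMode: true, read() {} });

// Hypothetical re-creation of the documented behavior: when a listener for the
// 'results' event is added, switch the stream into flowing mode, as Node
// already does for 'data' listeners.
function flowForResults(event) {
  if (event === 'results') {
    stream.removeListener('newListener', flowForResults);
    stream.resume();
  }
}
stream.on('newListener', flowForResults);

const flowingBefore = stream.readableFlowing; // null: no consumer attached yet
stream.on('results', () => {});
const flowingAfter = stream.readableFlowing; // true once resume() has been called
console.log(flowingBefore, flowingAfter);
```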

Events

close

Parameters:

Name	Type	Description
`reasonCode`	Number
`description`	String

Source:

speech-to-text/recognize-stream.js, line 234

connection-close

Parameters:

Name	Type	Description
`reasonCode`	Number
`description`	String

Deprecated:

Source:

speech-to-text/recognize-stream.js, line 240

data

Finalized text

Parameters:

Name	Type	Description
`transcript`	String

Source:

speech-to-text/recognize-stream.js, line 322

data

Object with interim or final results, possibly including confidence scores, alternatives, and word timing.

Parameters:

Name	Type	Description
`data`	Object

Source:

speech-to-text/recognize-stream.js, line 315
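
The sketch below shows one possible way to handle object-mode `data` events, keeping only final results. It uses a plain `EventEmitter` as a stand-in for the stream so the example stays self-contained; `createTranscriptCollector` is a hypothetical name, and the emitted objects are shortened versions of the result shape documented above.

```javascript
const { EventEmitter } = require('events');

// Hypothetical handler factory: keeps only final results and joins their transcripts.
function createTranscriptCollector() {
  const finals = [];
  return {
    onData(result) {
      // Interim results (final: false) are previews that may be revised, so skip them.
      if (result.final) {
        finals.push(result.alternatives[0].transcript.trim());
      }
    },
    text() {
      return finals.join(' ');
    },
  };
}

// Stand-in for an object-mode RecognizeStream; a real stream would emit these
// objects itself as audio is transcribed.
const fakeStream = new EventEmitter();
const collector = createTranscriptCollector();
fakeStream.on('data', collector.onData);

fakeStream.emit('data', { alternatives: [{ transcript: 'it is a ' }], final: false, result_index: 3 });
fakeStream.emit('data', { alternatives: [{ transcript: 'it is a site ' }], final: true, result_index: 3 });
console.log(collector.text());
```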

error

Parameters:

Name	Type	Attributes	Description
`msg`	String		custom error message
`frame`	*	<optional>	unprocessed frame (should have a .data property with either string or binary data)
`err`	Error	<optional>

Source:

speech-to-text/recognize-stream.js, line 249

receive-json

Parameters:

Name	Type	Description
`msg`	Object	the raw JSON received from Watson - sometimes useful for debugging

Source:

speech-to-text/recognize-stream.js, line 277

results

Object with interim or final results, possibly including confidence scores, alternatives, and word timing.

Parameters:

Name	Type	Description
`results`	Object

Deprecated:

- use objectMode and listen for the 'data' event instead

Source:

speech-to-text/recognize-stream.js, line 307

results

Object with array of interim or final results, possibly including confidence scores, alternatives, and word timing. May have no results at all for empty audio files.

Parameters:

Name	Type	Description
`results`	Object

Deprecated:

- use objectMode and listen for the 'data' event instead

Source:

speech-to-text/recognize-stream.js, line 296

send-json

Parameters:

Name	Type	Description
`msg`	Object	the raw JSON sent to Watson - sometimes useful for debugging

Source:

speech-to-text/recognize-stream.js, line 339
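
For debugging, the `send-json` and `receive-json` events can be recorded side by side. The following is a minimal sketch using an `EventEmitter` stand-in; `attachJsonLog` is a hypothetical helper, and the two payloads are illustrative only, since this page does not specify the message contents.

```javascript
const { EventEmitter } = require('events');

// Hypothetical debugging helper: records every raw JSON message in both directions.
function attachJsonLog(stream, log = []) {
  stream.on('send-json', (msg) => log.push({ direction: 'sent', msg }));
  stream.on('receive-json', (msg) => log.push({ direction: 'received', msg }));
  return log;
}

// Stand-in for a RecognizeStream; a real stream emits these events itself.
const debugStream = new EventEmitter();
const jsonLog = attachJsonLog(debugStream);

// Illustrative payloads (assumed for this example).
debugStream.emit('send-json', { action: 'start' });
debugStream.emit('receive-json', { state: 'listening' });

for (const entry of jsonLog) {
  console.log(entry.direction, JSON.stringify(entry.msg));
}
```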