Class: RecognizeStream

RecognizeStream

new RecognizeStream(options)

pipe()-able Node.js Readable/Writable stream - accepts binary audio and emits text/objects in its data events.

Uses WebSockets under the hood. For audio with no recognizable speech, no data events are emitted.

By default, only finalized text is emitted in the data events; in readableObjectMode (usually just objectMode when using a helper method), result objects are emitted instead, including interim results when interim_results is enabled.
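
For example, a minimal sketch of the default (non-objectMode) usage; the module path and audio.wav file are assumptions, and authentication (e.g. a token or Authorization header) is assumed to be handled elsewhere, since it is not covered by this reference:

  var fs = require('fs');
  var RecognizeStream = require('./recognize-stream'); // path is an assumption

  var recognizeStream = new RecognizeStream({
    'content-type': 'audio/wav'
  });

  // binary audio in, finalized transcript text out
  fs.createReadStream('audio.wav')
    .pipe(recognizeStream)
    .pipe(process.stdout);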

An interim result looks like this (assuming all features are enabled):

 { alternatives:
   [ { timestamps:
        [ [ 'it', 20.9, 21.04 ],
          [ 'is', 21.04, 21.17 ],
          [ 'a', 21.17, 21.25 ],
          [ 'site', 21.25, 21.56 ],
          [ 'that', 21.56, 21.7 ],
          [ 'hardly', 21.7, 22.06 ],
          [ 'anyone', 22.06, 22.49 ],
          [ 'can', 22.49, 22.67 ],
          [ 'behold', 22.67, 23.13 ],
          [ 'without', 23.13, 23.46 ],
          [ 'some', 23.46, 23.67 ],
          [ 'sort', 23.67, 23.91 ],
          [ 'of', 23.91, 24 ],
          [ 'unwanted', 24, 24.58 ],
          [ 'emotion', 24.58, 25.1 ] ],
       transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotion ' } ],
  final: false,
  result_index: 3 }

While a final result looks like this (again, assuming all features are enabled):

  { alternatives:
     [ { word_confidence:
          [ [ 'it', 1 ],
            [ 'is', 0.956286624429304 ],
            [ 'a', 0.8105753725270362 ],
            [ 'site', 1 ],
            [ 'that', 1 ],
            [ 'hardly', 1 ],
            [ 'anyone', 1 ],
            [ 'can', 1 ],
            [ 'behold', 0.5273598005406737 ],
            [ 'without', 1 ],
            [ 'some', 1 ],
            [ 'sort', 1 ],
            [ 'of', 1 ],
            [ 'unwanted', 1 ],
            [ 'emotion', 0.49401837076320887 ] ],
         confidence: 0.881,
         transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotion ',
         timestamps:
          [ [ 'it', 20.9, 21.04 ],
            [ 'is', 21.04, 21.17 ],
            [ 'a', 21.17, 21.25 ],
            [ 'site', 21.25, 21.56 ],
            [ 'that', 21.56, 21.7 ],
            [ 'hardly', 21.7, 22.06 ],
            [ 'anyone', 22.06, 22.49 ],
            [ 'can', 22.49, 22.67 ],
            [ 'behold', 22.67, 23.13 ],
            [ 'without', 23.13, 23.46 ],
            [ 'some', 23.46, 23.67 ],
            [ 'sort', 23.67, 23.91 ],
            [ 'of', 23.91, 24 ],
            [ 'unwanted', 24, 24.58 ],
            [ 'emotion', 24.58, 25.1 ] ] },
       { transcript: 'it is a sight that hardly anyone can behold without some sort of unwanted emotion ' },
       { transcript: 'it is a site that hardly anyone can behold without some sort of unwanted emotions ' } ],
    final: true,
    result_index: 3 }
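
A sketch of consuming these objects directly, assuming readableObjectMode (in which case interim_results defaults to true, per the options below) and the same hypothetical file, module path, and authentication setup as the sketch above:

  var fs = require('fs');
  var RecognizeStream = require('./recognize-stream'); // path is an assumption

  var objectStream = new RecognizeStream({
    'content-type': 'audio/wav',
    readableObjectMode: true
  });

  fs.createReadStream('audio.wav').pipe(objectStream);

  objectStream.on('data', function(result) {
    // each result has the shape shown above; `final` distinguishes
    // interim previews from finalized results
    var transcript = result.alternatives[0].transcript;
    console.log((result.final ? 'final:   ' : 'interim: ') + transcript);
  });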
Parameters:
Name Type Description
options Object
Properties
Name Type Attributes Default Description
model String <optional>
'en-US_BroadbandModel'

voice model to use. Microphone streaming only supports broadband models.

url String <optional>
'wss://stream.watsonplatform.net/speech-to-text/api'

base URL for service

content-type String <optional>
'audio/wav'

content type of audio; can be automatically determined from the file header in most cases. Only wav, flac, and ogg/opus are supported

interim_results Boolean <optional>
false

Send back non-final previews of each "sentence" as it is being processed. Defaults to true when in objectMode.

continuous Boolean <optional>
true

set to false to automatically stop the transcription after the first "sentence"

word_confidence Boolean <optional>
false

include confidence scores with results. Defaults to true when in objectMode.

timestamps Boolean <optional>
false

include timestamps with results. Defaults to true when in objectMode.

max_alternatives Number <optional>
1

maximum number of alternative transcriptions to include. Defaults to 3 when in objectMode.

inactivity_timeout Number <optional>
30

how many seconds of silence before automatically closing the stream (even if continuous is true). Use -1 for infinity

readableObjectMode Boolean <optional>
false

emit result objects instead of string Buffers for the data events. Changes several other defaults.

X-WDC-PL-OPT-OUT Number <optional>
0

set to 1 to opt out of allowing Watson to use this request to improve its services

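A sketch exercising the documented options with non-default values; names and defaults mirror the table above, and authentication is again assumed to be handled elsewhere:

  var RecognizeStream = require('./recognize-stream'); // path is an assumption

  var recognizeStream = new RecognizeStream({
    model: 'en-US_BroadbandModel',
    url: 'wss://stream.watsonplatform.net/speech-to-text/api',
    'content-type': 'audio/flac',
    interim_results: true,
    continuous: true,
    word_confidence: true,
    timestamps: true,
    max_alternatives: 3,
    inactivity_timeout: -1,     // never close the stream on silence
    readableObjectMode: true,
    'X-WDC-PL-OPT-OUT': 1       // opt out of having this request improve Watson's services
  });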


Methods

(inner) flowForResults(event)

Listening for 'results' events puts the stream into flowing mode, just like listening for 'data' events.

Parameters:
Name Type Description
event String
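In practice this means that, given a recognizeStream constructed as in the sketches above, attaching a 'results' listener is enough to start results flowing; no additional 'data' listener or resume() call should be needed:

  recognizeStream.on('results', function(results) {
    // deprecated event; prefer objectMode and the 'data' event (see below)
    console.log(JSON.stringify(results, null, 2));
  });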

Events

close

Parameters:
Name Type Description
reasonCode Number
description String

connection-close

Parameters:
Name Type Description
reasonCode Number
description String
Deprecated:
  • Yes

data

Finalized transcript text (emitted when readableObjectMode is not enabled)

Parameters:
Name Type Description
transcript String

data

Object with interim or final results, possibly including confidence scores, alternatives, and word timing.

Parameters:
Name Type Description
data Object

error

Parameters:
Name Type Attributes Description
msg String

custom error message

frame * <optional>

unprocessed frame (should have a .data property with either string or binary data)

err Error <optional>
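A sketch of an error handler following the documented parameters (a custom message first, then the optional unprocessed frame and underlying Error; the exact arguments passed may vary):

  recognizeStream.on('error', function(msg, frame, err) {
    console.error('recognize error:', msg);
    if (err) {
      console.error(err.stack || err);
    }
  });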

results

Object with array of interim or final results, possibly including confidence scores, alternatives, and word timing. May have no results at all for empty audio files.

Parameters:
Name Type Description
results Object
Deprecated:
  • use objectMode and listen for the 'data' event instead

results

Object with interim or final results, possibly including confidence scores, alternatives, and word timing.

Parameters:
Name Type Description
results Object
Deprecated:
  • use objectMode and listen for the 'data' event instead