# Sofya Transcription

**Sofya Transcription** is a JavaScript library for real-time audio transcription. It transcribes audio streams, integrates into web applications, and can also capture audio from media elements.

## Features

-   **Real-Time Transcription**: Transcribe audio streams in real time with high accuracy.
-   **Flexible Integration**: Seamlessly integrates with your web applications.
-   **Media Element Audio Capture**: Capture audio from media elements such as `<video>` and `<audio>`.
-   **Multiple Provider Support**: Supports the Sofya Compliance, Sofya as Service, and STT WVAD transcription providers.
-   **Type-Safe Configuration**: TypeScript definitions for provider-specific configurations.

## Installation

Install **Sofya Transcription** with npm:

`npm install sofya.transcription` 

## Usage

Here's a basic example of how to use **Sofya Transcription** in your project:

1.  **Import the Library**:
    
    `import { MediaElementAudioCapture, SofyaTranscriber } from 'sofya.transcription';` 
    
2.  **Create a Transcription Service Instance**:
    
    ```typescript
    // Using API key connection
    const transcriber = new SofyaTranscriber({
      apiKey: 'YOUR_API_KEY',
      config: {
        language: 'en-US'
      }
    });
    
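    // Or using a specific provider (named differently here so that both
    // declarations can coexist in one scope; create only the one you need)
    const complianceTranscriber = new SofyaTranscriber({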
      provider: 'sofya_compliance',
      endpoint: 'YOUR_ENDPOINT',
      config: {
        language: 'en-US',
        token: 'YOUR_TOKEN',
        compartmentId: 'YOUR_COMPARTMENT_ID',
        region: 'YOUR_REGION'
      }
    });
    ```
    
3.  **Initialize and Start Transcription**:
 
    ```typescript
    // Wait for the transcriber to be ready
    transcriber.on('ready', () => {
      // Get media stream
      navigator.mediaDevices.getUserMedia({ audio: true })
        .then(mediaStream => {
          // Start transcription
          transcriber.startTranscription(mediaStream);
        })
        .catch(error => {
          console.error('Error accessing microphone:', error);
        });
    });
    ```
    
4.  **Handle Transcription Events**:
    
    ```typescript
    transcriber.on('recognizing', (text) => {
      console.log('Recognizing: ' + text);
    });
    
    transcriber.on('recognized', (text) => {
      console.log('Recognized: ' + text);
    });
    
    transcriber.on('error', (error) => {
      console.error('Transcription error:', error);
    });
    
    transcriber.on('stopped', () => {
      console.log('Transcription stopped');
    });
    ```
    
5.  **Control Transcription**:
        
    ```typescript
    // Pause transcription
    transcriber.pauseTranscription();
    
    // Resume transcription
    transcriber.resumeTranscription();
    
    // Stop transcription
    await transcriber.stopTranscription();
    ```
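
The pause and resume calls from step 5 are often wired to a single button. The sketch below shows one way to do that. `createPauseToggle` and `PausableTranscriber` are hypothetical names and not part of `sofya.transcription`; the helper tracks the paused state locally because this README does not say whether the library exposes it.

```typescript
// Minimal structural type: only the two control methods shown in step 5.
interface PausableTranscriber {
  pauseTranscription(): void;
  resumeTranscription(): void;
}

// Hypothetical helper (not part of sofya.transcription): returns a function
// that flips between paused and running and reports the new state.
function createPauseToggle(transcriber: PausableTranscriber): () => boolean {
  // Assumption: transcription is already running when the toggle is created.
  let paused = false;
  return () => {
    if (paused) {
      transcriber.resumeTranscription();
    } else {
      transcriber.pauseTranscription();
    }
    paused = !paused;
    return paused; // true when transcription is now paused
  };
}
```

Calling `createPauseToggle(transcriber)` once and attaching the returned function to a click handler should be enough for a simple pause/resume button.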

## API

### `SofyaTranscriber`

-   **constructor(connection: Connection)**: Creates a new instance of the transcription service with a connection object.
    
-   **startTranscription(mediaStream: MediaStream): void**: Starts the transcription process with a given `MediaStream`.
    
-   **stopTranscription(): void**: Stops the transcription process. The usage examples `await` this call, so it may return a promise rather than `void`.

-   **pauseTranscription(): void**: Pauses the transcription process.
    
-   **resumeTranscription(): void**: Resumes the transcription process.
    
-   **on(event: string, callback: Function): this**: Registers an event handler for transcription events. Possible events include:
    
    -   `recognizing`: Fired with the in-progress text while transcription is underway.
    -   `recognized`: Fired with the recognized text when transcription is complete.
    -   `error`: Fired when an error occurs.
    -   `ready`: Fired when the transcription service is ready to start.
    -   `stopped`: Fired when the transcription process is stopped.
    -   `connected`: Fired when the transcription service is connected to the provider.
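
As an illustration of how the `recognizing` and `recognized` events might be combined, the sketch below keeps finalized text separate from interim text. `TextEventSource`, `trackTranscript`, and `renderTranscript` are hypothetical names. The sketch also assumes that each `recognized` event delivers one newly finalized chunk rather than the whole transcript so far, which this README does not confirm.

```typescript
// Structural subset of the `on` method described above; the string payload
// matches how the usage examples consume `recognizing` and `recognized`.
interface TextEventSource {
  on(event: 'recognizing' | 'recognized', callback: (text: string) => void): unknown;
}

interface TranscriptState {
  finalized: string[]; // text received through `recognized`
  interim: string;     // latest text received through `recognizing`
}

// Hypothetical helper: registers both handlers and returns the live state.
// Assumption: each `recognized` event carries one newly finalized chunk,
// not the whole transcript so far.
function trackTranscript(source: TextEventSource): TranscriptState {
  const state: TranscriptState = { finalized: [], interim: '' };
  source.on('recognizing', (text) => {
    state.interim = text;
  });
  source.on('recognized', (text) => {
    state.finalized.push(text);
    state.interim = ''; // the interim text has been superseded
  });
  return state;
}

// Joins the finalized chunks and the current interim text for display.
function renderTranscript(state: TranscriptState): string {
  return [...state.finalized, state.interim]
    .filter((part) => part.length > 0)
    .join(' ');
}
```

If `recognized` turns out to deliver the full transcript each time, replacing the `push` with an assignment would be the natural adjustment.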

### Connection Types

The SDK supports different connection modes based on the provider:

#### API Key Connection

```typescript
{
  apiKey: string;
  config?: BaseConfig;
}
```

#### Sofya Compliance Provider Connection

```typescript
{
  provider: "sofya_compliance";
  endpoint: string;
  config: SofyaComplianceConfig;
}
```

#### Sofya As Service Provider Connection

```typescript
{
  provider: "sofya_as_service";
  endpoint: string;
  config: SofyaSpeechConfig;
}
```

#### STT WVAD Provider Connection

```typescript
{
  provider: "stt_wvad";
  endpoint: string;
  config: SofyaSpeechConfig;
}
```
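
The four shapes above can be modeled as one union and told apart at compile time. The following is a sketch only: the config interfaces are copied from this README so the block is self-contained, the `Connection` union is an assumption about how the library combines the shapes, and `describeConnection` is a hypothetical helper.

```typescript
// Copies of the configuration types documented in this README.
interface BaseConfig {
  language: string;
}
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}
interface SofyaSpeechConfig extends BaseConfig {}

type ApiKeyConnection = { apiKey: string; config?: BaseConfig };
type ComplianceConnection = {
  provider: 'sofya_compliance';
  endpoint: string;
  config: SofyaComplianceConfig;
};
type SpeechConnection = {
  provider: 'sofya_as_service' | 'stt_wvad';
  endpoint: string;
  config: SofyaSpeechConfig;
};

// Assumption: the library's `Connection` type is a union of these shapes.
type Connection = ApiKeyConnection | ComplianceConnection | SpeechConnection;

// Hypothetical helper: returns a short label, using narrowing to reach
// the provider-specific fields safely.
function describeConnection(connection: Connection): string {
  if ('apiKey' in connection) {
    return 'API key connection';
  }
  if (connection.provider === 'sofya_compliance') {
    // `config.region` is only available after narrowing to this provider.
    return `sofya_compliance at ${connection.endpoint} (region ${connection.config.region})`;
  }
  return `${connection.provider} at ${connection.endpoint}`;
}
```

Because `provider` is a literal type, the compiler rejects a `sofya_compliance` connection whose `config` lacks `token`, `compartmentId`, or `region`.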

### Configuration Types

#### BaseConfig

```typescript
interface BaseConfig {
  language: string;
}
```

#### SofyaComplianceConfig

```typescript
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}
```

#### SofyaSpeechConfig

```typescript
interface SofyaSpeechConfig extends BaseConfig {}
```
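
When configuration values come from user input or environment settings, it can help to check them before constructing the transcriber. The sketch below checks a `SofyaComplianceConfig` for missing or blank fields. `missingComplianceFields` is a hypothetical helper and not part of the library, and the interfaces are repeated here only so the block is self-contained.

```typescript
interface BaseConfig {
  language: string;
}
interface SofyaComplianceConfig extends BaseConfig {
  token: string;
  compartmentId: string;
  region: string;
}

// Hypothetical helper: lists the required fields that are absent or blank.
// An empty result means the object satisfies SofyaComplianceConfig.
function missingComplianceFields(config: Partial<SofyaComplianceConfig>): string[] {
  const required: Array<keyof SofyaComplianceConfig> = [
    'language',
    'token',
    'compartmentId',
    'region',
  ];
  return required.filter((key) => {
    const value = config[key];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

For example, passing `{ language: 'en-US', token: 'abc' }` reports `compartmentId` and `region` as missing.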

## React Example

```tsx
import React from 'react'
import { SofyaTranscriber } from 'sofya.transcription'

const App = () => {
  const transcriberRef = React.useRef<SofyaTranscriber | null>(null)
  const [transcription, setTranscription] = React.useState('')
  const transcriptionRef = React.useRef('')

  const getMediaStream = async () => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true })
    return stream
  }

  const startTranscription = async () => {
    try {
      const stream = await getMediaStream()
      
      // Create transcriber with API key connection
      const transcriber = new SofyaTranscriber({
        apiKey: 'your_api_key',
        config: {
          language: 'en-US'
        }
      })
      
      transcriberRef.current = transcriber

      transcriber.on("ready", () => {
        transcriber.startTranscription(stream)
      })
      
      transcriber.on('recognizing', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })
      
      transcriber.on('recognized', (result: string) => {
        transcriptionRef.current = result
        setTranscription(result)
      })
      
      transcriber.on('error', (error: Error) => {
        console.error('Transcription error:', error)
      })
    } catch (error) {
      console.error('Error starting transcription:', error)
    }
  }

  const stopTranscription = async () => {
    if (transcriberRef.current) {
      await transcriberRef.current.stopTranscription()
    }
  }

  return (
    <div>
      <button onClick={startTranscription}>Start Transcription</button>
      <button onClick={stopTranscription}>Stop Transcription</button>
      <div>
        <h3>Transcription:</h3>
        <p>{transcription}</p>
      </div>
    </div>
  )
}

export default App
```
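
This README does not say whether `stopTranscription()` also releases the microphone. If it does not, the stream obtained from `getUserMedia` keeps recording until its tracks are stopped. The sketch below shows one way to do that with the standard `getTracks()` and `stop()` methods. `releaseStream` and the two interfaces are hypothetical names, written structurally so that a real `MediaStream` can be passed in.

```typescript
// Structural subsets of MediaStreamTrack and MediaStream, declared here so
// the sketch does not depend on DOM typings.
interface StoppableTrack {
  stop(): void;
}
interface TrackSource {
  getTracks(): StoppableTrack[];
}

// Hypothetical helper: stops every track of the stream and returns how many
// tracks were stopped.
function releaseStream(stream: TrackSource): number {
  const tracks = stream.getTracks();
  tracks.forEach((track) => track.stop());
  return tracks.length;
}
```

In the component above, this would mean keeping the stream in a ref and calling `releaseStream` on it after `stopTranscription()` resolves.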

## License

This project is licensed under the MIT License - see the LICENSE file for details.