# 🎙️ MBZ Voice SDK

> **Speak. Think. Respond. Seamlessly.**

MBZ-Voice-SDK is a developer tool for integrating voice input, AI understanding (via Gemini), and spoken responses into any modern web app. Whether you're building a chatbot, an AI assistant, or a voice-powered UI, the SDK is designed to be plug-and-play.

---

## 📋 Table of Contents

- [Features](#-features)
- [Requirements](#-requirements)
- [Installation](#-install-the-sdk)
- [Backend Setup](#️-backend-setup-guide)
- [Usage Examples](#-sdk-usage-example)
- [API Documentation](#-api-documentation)
- [Troubleshooting](#-troubleshooting)
- [Contributing](#-contributing)
- [Security Notice](#-security-notice)
- [Tools Used](#-tools-used)
- [License](#-license)
- [Support](#-support)

---

## 🔥 Features

✅ **Voice Input**: Capture user speech via browser microphone using Web Speech API  
✅ **AI Processing**: Gemini-powered AI backend built with FastAPI  
✅ **Voice Response**: Convert AI text responses to spoken words using Web Speech TTS  
✅ **Audio Controls**: Easily toggle mute/unmute functionality  
✅ **Conversation Memory**: Store the last 3 Q&A exchanges using localStorage  
✅ **Framework Agnostic**: Seamlessly integrate with plain JavaScript, React, Vue, or any modern frontend framework  
✅ **Customizable**: Configure language, voice type, and response behavior  
✅ **Lightweight**: Minimal dependencies for optimal performance  

## 💻 Requirements

- Modern web browser with support for:
  - Web Speech API (SpeechRecognition)
  - Web Speech API (SpeechSynthesis)
  - localStorage
- Node.js 14+ (for development)
- Python 3.8+ (for backend)
- Gemini API key from Google AI Studio

## 📦 Install the SDK

### NPM Installation

Once the package is published on npm, run:

```bash
npx mbz-voice-sdk init
```

### Yarn Installation

```shellscript
yarn add mbz-voice-sdk
```

### Local Installation (if cloned)

```shellscript
cd mbz-voice-sdk/sdk
npm install
```

### CDN Usage

```html
<script src="https://unpkg.com/mbz-voice-sdk@latest/dist/mbz-voice-sdk.min.js"></script>
```

## ⚙️ Backend Setup Guide

This SDK requires a backend API endpoint connected to Gemini (Google AI). We've provided a ready-to-use FastAPI backend in the `/backend` folder.

### 1️⃣ Navigate to the backend directory

```shellscript
cd ../backend
```

### 2️⃣ Install Python dependencies

```shellscript
pip install -r requirements.txt
```

### 3️⃣ Add Your Gemini API Key

Create a `.env` file in the backend folder and paste your Gemini API key:

```plaintext
GEMINI_API_KEY=your_google_gemini_api_key_here
```

👉 Get your key from: [https://makersuite.google.com/app/apikey](https://makersuite.google.com/app/apikey)

### 4️⃣ Run the server

```shellscript
uvicorn main:app --reload
```

Now your backend is live at:

```plaintext
http://localhost:8000/ask
```
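Before wiring up the SDK, it can help to send one request to the endpoint by hand. The sketch below is only a guess at the wire format: the `question` request field and the `answer` response field are hypothetical names, so check the backend's `main` module for the real schema. `askBackend` is not part of the SDK, and `fetchImpl` is injectable so the function can be exercised without a running server.

```javascript
// Hypothetical helper: POST a question to the backend and return the reply text.
// ASSUMPTION: the request body is { question } and the response is { answer }.
// The real field names are defined by the backend, not by this sketch.
async function askBackend(apiUrl, question, fetchImpl = fetch) {
  const res = await fetchImpl(apiUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });
  // Surface HTTP errors (4xx/5xx) instead of trying to parse them as a reply.
  if (!res.ok) {
    throw new Error(`Backend returned HTTP ${res.status}`);
  }
  const data = await res.json();
  return data.answer;
}
```

With the server running, `askBackend("http://localhost:8000/ask", "Hello").then(console.log)` should print the reply, provided the assumed field names match your backend.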

## 🧠 SDK Usage Example

### Basic Usage

```javascript
import { MBZVoiceAgent } from "mbz-voice-sdk";

const agent = new MBZVoiceAgent({
  apiUrl: "http://localhost:8000/ask",
  lang: "en-US",
  speak: true
});

agent.onTranscript((text) => {
  console.log("User said:", text);
});

agent.onResponse((reply) => {
  console.log("AI replied:", reply);
});

document.getElementById("start-btn").onclick = () => agent.listen();
```

### React Integration

```javascriptreact
import React, { useEffect, useState } from 'react';
import { MBZVoiceAgent } from 'mbz-voice-sdk';

function VoiceAssistant() {
  const [transcript, setTranscript] = useState('');
  const [response, setResponse] = useState('');
  const [isListening, setIsListening] = useState(false);
  const [agent, setAgent] = useState(null);

  useEffect(() => {
    // Initialize the agent
    const voiceAgent = new MBZVoiceAgent({
      apiUrl: "http://localhost:8000/ask",
      lang: "en-US",
      speak: true
    });

    // Set up event handlers
    voiceAgent.onTranscript((text) => {
      setTranscript(text);
    });

    voiceAgent.onResponse((reply) => {
      setResponse(reply);
    });

    voiceAgent.onListeningChange((listening) => {
      setIsListening(listening);
    });

    setAgent(voiceAgent);

    // Cleanup on unmount
    return () => {
      voiceAgent.cleanup();
    };
  }, []);

  const handleListen = () => {
    if (agent) {
      agent.listen();
    }
  };

  return (
    <div className="voice-assistant">
      <button 
        onClick={handleListen}
        className={isListening ? 'listening' : ''}
      >
        {isListening ? '🔴 Listening...' : '🎙️ Start Talking'}
      </button>
      
      {transcript && (
        <div className="transcript">
          <h3>You said:</h3>
          <p>{transcript}</p>
        </div>
      )}
      
      {response && (
        <div className="response">
          <h3>AI response:</h3>
          <p>{response}</p>
        </div>
      )}
    </div>
  );
}

export default VoiceAssistant;
```

## 🧪 HTML Quick Test

```html
<button id="start-btn">🎙️ Start Talking</button>
<div id="transcript"></div>
<div id="response"></div>

<script type="module">
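  // Note: a bare specifier such as 'mbz-voice-sdk' resolves in the browser
  // only through a bundler or an import map.
  import { MBZVoiceAgent } from 'mbz-voice-sdk';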

  const agent = new MBZVoiceAgent({ 
    apiUrl: 'http://localhost:8000/ask',
    speak: true
  });

  const transcriptEl = document.getElementById('transcript');
  const responseEl = document.getElementById('response');

  agent.onTranscript(text => {
    console.log("🎤", text);
    transcriptEl.textContent = `You said: ${text}`;
  });
  
  agent.onResponse(reply => {
    console.log("🤖", reply);
    responseEl.textContent = `AI says: ${reply}`;
  });

  document.getElementById("start-btn").onclick = () => agent.listen();
</script>
```

## 📚 API Documentation

### `MBZVoiceAgent` Class

The main class for interacting with the SDK.

#### Constructor

```javascript
const agent = new MBZVoiceAgent(options);
```

#### Options

| Option | Type | Default | Description
|-----|-----|-----|-----
| `apiUrl` | String | Required | The URL of your backend API endpoint
| `lang` | String | 'en-US' | The language for speech recognition
| `speak` | Boolean | true | Whether to speak the AI's response
| `voiceIndex` | Number | 0 | Index of the voice to use for speech synthesis
| `pitch` | Number | 1.0 | The pitch of the voice (0.1 to 2.0)
| `rate` | Number | 1.0 | The speed of the voice (0.1 to 10.0)
| `volume` | Number | 1.0 | The volume of the voice (0.0 to 1.0)
| `maxHistory` | Number | 3 | Maximum number of Q&A pairs to store in history
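
The defaults and ranges in the table can be written down as a small normalization step. This is a sketch of how an application might validate options before passing them to the constructor. `normalizeOptions`, `clamp`, and `DEFAULTS` are hypothetical names, not part of the SDK, and the SDK itself may validate differently.

```javascript
// Hypothetical helpers, not part of the SDK.
// Defaults and ranges are copied from the options table above.
const DEFAULTS = {
  lang: "en-US",
  speak: true,
  voiceIndex: 0,
  pitch: 1.0,
  rate: 1.0,
  volume: 1.0,
  maxHistory: 3,
};

// Restrict a number to the inclusive range [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

function normalizeOptions(options = {}) {
  // apiUrl has no default, so fail early when it is missing.
  if (typeof options.apiUrl !== "string" || options.apiUrl === "") {
    throw new TypeError("apiUrl is required");
  }
  const merged = { ...DEFAULTS, ...options };
  return {
    ...merged,
    pitch: clamp(merged.pitch, 0.1, 2.0),
    rate: clamp(merged.rate, 0.1, 10.0),
    volume: clamp(merged.volume, 0.0, 1.0),
  };
}
```

For example, `normalizeOptions({ apiUrl: "http://localhost:8000/ask", rate: 20 })` keeps the documented defaults and caps `rate` at 10.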


#### Methods

| Method | Parameters | Description
|-----|-----|-----
| `listen()` | None | Start listening for voice input
| `stop()` | None | Stop listening for voice input
| `mute()` | None | Mute the voice response
| `unmute()` | None | Unmute the voice response
| `cleanup()` | None | Clean up resources and event listeners
| `onTranscript(callback)` | Function | Set callback for transcript events
| `onResponse(callback)` | Function | Set callback for AI response events
| `onListeningChange(callback)` | Function | Set callback for listening state changes
| `onError(callback)` | Function | Set callback for error events
| `getHistory()` | None | Get the conversation history
| `clearHistory()` | None | Clear the conversation history
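
How `getHistory()` and `clearHistory()` relate to `maxHistory` can be pictured as a bounded list persisted to a `localStorage`-like object. The sketch below illustrates that idea and is not the SDK's actual implementation: `createHistory`, the `mbz-history` key, and the `{ question, answer }` entry shape are all assumptions.

```javascript
// Illustrative only: a bounded Q&A history backed by any localStorage-like
// object (getItem / setItem / removeItem).
// ASSUMPTIONS: the storage key name and the { question, answer } entry shape.
function createHistory(storage, maxHistory = 3, key = "mbz-history") {
  const read = () => JSON.parse(storage.getItem(key) || "[]");
  return {
    add(question, answer) {
      const all = [...read(), { question, answer }];
      // Keep only the most recent `maxHistory` exchanges.
      const next = maxHistory > 0 ? all.slice(-maxHistory) : [];
      storage.setItem(key, JSON.stringify(next));
    },
    get: read,
    clear() {
      storage.removeItem(key);
    },
  };
}
```

In a browser this would be called as `createHistory(window.localStorage)`. After five calls to `add`, `get()` returns only the last three entries.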


## 🔧 Troubleshooting

### Microphone Not Working

- Ensure your browser has permission to access the microphone
- Check if your microphone is properly connected and working
- Try using a different browser (Chrome and Edge have the best support)


### Speech Recognition Not Starting

- Make sure you're using a supported browser (Chrome, Edge, Safari)
- Check your internet connection
- Verify that your site is served over HTTPS (required for production)
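
A quick feature check can rule out an unsupported browser before you debug further. Some browsers expose the recognizer only under the prefixed name `webkitSpeechRecognition`, so the sketch checks both names. `detectSpeechSupport` is a hypothetical helper, not an SDK method.

```javascript
// Hypothetical helper: report which of the required browser features
// the given global object exposes.
function detectSpeechSupport(globalObj) {
  return {
    // Speech-to-text: standard name or the webkit-prefixed one.
    recognition: Boolean(
      globalObj.SpeechRecognition || globalObj.webkitSpeechRecognition
    ),
    // Text-to-speech.
    synthesis: "speechSynthesis" in globalObj,
    // Conversation history persistence.
    storage: "localStorage" in globalObj,
  };
}
```

In the browser console, run `detectSpeechSupport(window)`. If `recognition` is `false`, switching browsers is likely the fix.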


### Backend Connection Issues

- Confirm your backend server is running
- Check for CORS issues (the backend should allow requests from your frontend)
- Verify your API URL is correct in the SDK initialization
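
To narrow down a connection problem, a small probe can separate "the server answered" from "the request never got a readable response". In browsers, a blocked CORS request and a server that is down both reject `fetch` with a `TypeError`, so the sketch reports them together. `probeBackend` is a hypothetical helper, and the `GET` request is an assumption made only to see whether anything answers.

```javascript
// Hypothetical diagnostic, not part of the SDK.
// Any HTTP status (even 405 for a wrong method) means the server is up and
// the response was readable. A rejected fetch means the server is down, the
// URL is wrong, or the browser blocked the response (for example, CORS).
async function probeBackend(apiUrl, fetchImpl = fetch) {
  try {
    const res = await fetchImpl(apiUrl, { method: "GET" });
    return { reachable: true, status: res.status };
  } catch (err) {
    return {
      reachable: false,
      reason: err instanceof TypeError ? "network-or-cors" : "unknown",
    };
  }
}
```

If `probeBackend("http://localhost:8000/ask")` returns `reachable: false` in the browser while the same URL responds from a terminal, the backend's CORS configuration is the most likely cause.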


### Voice Response Not Working

- Check if your device's volume is turned on
- Make sure the `speak` option is set to `true`
- Try using a different voice by changing the `voiceIndex`
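
`voiceIndex` presumably indexes into the list returned by `speechSynthesis.getVoices()`. That list differs by browser and operating system, and it can be empty until the browser fires the `voiceschanged` event. A defensive lookup might look like the sketch below, where `pickVoice` is a hypothetical helper rather than an SDK method.

```javascript
// Hypothetical helper: choose a voice by index, falling back to the first
// available voice. Returns null when the list is empty, which can happen
// when the browser has not loaded its voices yet.
function pickVoice(voices, voiceIndex = 0) {
  if (!Array.isArray(voices) || voices.length === 0) return null;
  const inRange =
    Number.isInteger(voiceIndex) && voiceIndex >= 0 && voiceIndex < voices.length;
  return inRange ? voices[voiceIndex] : voices[0];
}
```

In the browser, `pickVoice(window.speechSynthesis.getVoices(), 2)` returns the third voice when it exists. A `null` result suggests waiting for `voiceschanged` and trying again.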


## 🤝 Contributing

Contributions are welcome! Here's how you can help:

1. **Fork the repository**
2. **Create a feature branch**:

   ```shellscript
   git checkout -b feature/amazing-feature
   ```

3. **Commit your changes**:

   ```shellscript
   git commit -m 'Add some amazing feature'
   ```

4. **Push to the branch**:

   ```shellscript
   git push origin feature/amazing-feature
   ```

5. **Open a Pull Request**


### Development Setup

```shellscript
# Clone the repository
git clone https://github.com/ProMBZ/mbz-voice-sdk.git

# Install dependencies
cd mbz-voice-sdk
npm install

# Run development server
npm run dev

# Build for production
npm run build
```

## 🔐 Security Notice

This SDK does not ship with a built-in Gemini key.

🔐 You are responsible for adding your own Gemini key to the backend.

Never include your Gemini key in frontend code.

## 🧰 Tools Used

- **Frontend**:
  - JavaScript (SpeechRecognition + TTS APIs)
  - localStorage for conversation persistence
  - Rollup for bundling
- **Backend**:
  - FastAPI (Python)
  - Google Generative AI SDK (Gemini 1.5 Flash)
  - Python-dotenv for environment variables

## 📄 License

MIT © 2025. Developed by Muhammad (MBZ-Voice-SDK)  
🔗 GitHub: @ProMBZ

## 💬 Support

If you have questions, suggestions, or want to collaborate:

📧 Email: [muhammadzohaib1415@gmail.com](mailto:muhammadzohaib1415@gmail.com)  
🌍 Portfolio: [https://kzml8bqhnxp4cn0duf08.lite.vusercontent.net/](https://kzml8bqhnxp4cn0duf08.lite.vusercontent.net/)

---

Made with ❤️ by Muhammad
