# Document Extraction Service

`document-extraction-service` is a Node.js library for integrating with document processing APIs. It prepares requests, validates responses, and handles callbacks for document extraction workflows.

---

## Installation

To install the package, run:

```bash
npm install document-extraction-service
```

---

## Quick Start

### Configure the Request Validator

```javascript
const { createRequestValidator } = require('document-extraction-service');

const config = {
  endpoint: 'https://your-extraction-api.com',
  headers: {
    'Content-Type': 'multipart/form-data',
    'callback_url_pattern': 'https://your-service.com/callback/{{streamId}}/{{extractionStrategyId}}',
    'trace_id': '{{traceId}}'
  },
  requestBody: {
    'strategies_batch_id': '{{strategiesBatchId}}',
    'doc_id': '{{docId}}',
    'url': '{{file_url}}',
    'document_meta': '{{content}}',
  },
  timeout_days: 2,
  max_retries: 3
};

const requestValidator = createRequestValidator(config);
```
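
The `{{...}}` placeholders in the configuration are presumably filled with per-request values when a request is prepared. This README does not show how the library does that, so the following is only a minimal sketch of this kind of substitution. `fillTemplate` is a hypothetical helper, not part of the package, and the value `strategy789` is made up for illustration.

```javascript
// Hypothetical helper, not part of document-extraction-service:
// replaces each {{name}} placeholder with the matching value.
function fillTemplate(template, values) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error(`Missing required field: ${name}`);
    }
    // Encoding is an assumption that suits URL path segments.
    return encodeURIComponent(String(values[name]));
  });
}

const callbackUrl = fillTemplate(
  'https://your-service.com/callback/{{streamId}}/{{extractionStrategyId}}',
  { streamId: 'stream456', extractionStrategyId: 'strategy789' }
);
console.log(callbackUrl);
// https://your-service.com/callback/stream456/strategy789
```

URL encoding fits placeholders that end up in a URL. For placeholders in `requestBody`, a similar helper would likely insert the raw value instead.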

### Create the Callback Validator

```javascript
const { createCallbackValidator } = require('document-extraction-service');

const callbackValidator = createCallbackValidator();
```

### Processing a Document

```javascript
const axios = require('axios'); // any HTTP client works; axios is used here

const processDocument = async () => {
  const docId = 'doc123';
  const content = { text: 'Your document content' };
  const streamId = 'stream456';

  try {
    // Prepare request parameters
    const requestParams = requestValidator.prepareRequest(docId, content, streamId);

    // Make API call (using your preferred HTTP client, e.g., axios)
    const response = await axios(requestParams);

    // Handle response
    const result = requestValidator.handleResponse(response, requestParams.headers['X-Trace-ID']);
    console.log('Document processing initiated:', result);
  } catch (error) {
    console.error('Error processing document:', error);
  }
};
```
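
The HTTP call itself is made with your own client, and this README does not state whether `max_retries` is applied by the library or left to the caller. If you need to retry the call yourself, a sketch with exponential backoff could look like the following. `withRetries` and its `baseDelayMs` option are hypothetical names, not part of the package.

```javascript
// Hypothetical helper: retries an async operation with exponential backoff.
// `operation` is any function returning a promise, e.g. () => axios(requestParams).
// maxRetries counts retries after the first attempt, so maxRetries = 3
// allows up to 4 attempts in total.
async function withRetries(operation, { maxRetries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break;
      // Delay doubles each time: baseDelayMs, 2x, 4x, ...
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

Inside `processDocument`, the call would then become `const response = await withRetries(() => axios(requestParams), { maxRetries: config.max_retries });`.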

### Handling Callback

```javascript
const handleCallback = async (callbackData) => {
  try {
    const result = await callbackValidator.handleCallback(callbackData);
    if (!result.success) {
      console.error(result.error);
    } else {
      console.log('Callback processed:', result);
    }
  } catch (error) {
    console.error('Error processing callback:', error);
  }
};
```
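
How the callback payload reaches `handleCallback` is not shown here. Assuming the extraction API sends a JSON `POST` to your callback URL, a request listener for Node's built-in `http` module might be sketched as below. `createCallbackListener` is a hypothetical name, and the status codes are illustrative choices rather than anything the package prescribes.

```javascript
// Hypothetical listener factory. `validator` is expected to expose
// handleCallback(data) returning a promise of { success, ... }.
function createCallbackListener(validator) {
  return (req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405);
      res.end();
      return;
    }
    const parts = [];
    req.on('data', (part) => parts.push(part));
    req.on('end', async () => {
      try {
        // Assumes the callback body is JSON.
        const callbackData = JSON.parse(Buffer.concat(parts).toString('utf8'));
        const result = await validator.handleCallback(callbackData);
        res.writeHead(result.success ? 200 : 422, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify(result));
      } catch (error) {
        // Malformed JSON or an unexpected error from the validator.
        res.writeHead(400, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ success: false, error: error.message }));
      }
    });
  };
}

// Usage (port 3000 is an arbitrary choice):
// const http = require('http');
// http.createServer(createCallbackListener(callbackValidator)).listen(3000);
```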

---

## API Documentation

### Configuration Object

```javascript
const config = {
  endpoint: 'https://api.example.com', // Required - API endpoint
  headers: {
    'Authorization': 'Bearer token',
    'callback_url_pattern': 'https://callback.com/{{docId}}/{{streamId}}' // Required
  },
  timeout_days: 2, // Optional - default: 2
  max_retries: 3 // Optional - default: 3
};
```

### Request Preparation

```javascript
const params = requestValidator.prepareRequest(docId, content, streamId);
```
#### Returns:
```javascript
{
  url: string,
  method: 'POST',
  headers: {
    'X-Document-ID': string,
    'X-Trace-ID': string,
    'X-Callback-URL': string,
    ...other headers
  },
  data: {
    content: any,
    streamId: string
  }
}
```

### Response Handling

```javascript
const result = requestValidator.handleResponse(response, traceId);
```
#### Returns:
```javascript
{
  success: boolean,
  docId: string,
  traceId: string,
  message?: string,
  error?: string
}
```

### Callback Handling

```javascript
const result = await callbackValidator.handleCallback(callbackData);
```
#### Input Format:
```javascript
{
  doc_id: string,
  trace_id: string,
  chunk_data: Array<{
    content: string,
    index: number,
    chunkId: string,
    chunkText: string
  }>,
  last_batch: boolean
}
```
#### Returns:
```javascript
{
  success: boolean,
  docId: string,
  traceId: string,
  isLastBatch: boolean,
  chunks: Array<ProcessedChunk>,
  metadata: {
    processedAt: string,
    chunksCount: number
  }
}
```
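
Because results arrive in batches and the final one is marked with `isLastBatch`, callers may want to accumulate chunks until the document is complete. The sketch below shows one way to do that. `createChunkCollector` is a hypothetical helper, and it assumes that each `ProcessedChunk` carries the `index` field from the input format and that the batch flagged as last is also the last one delivered.

```javascript
// Hypothetical helper: accumulates chunks per document across callback
// results and returns the index-ordered list once the last batch arrives.
function createChunkCollector() {
  const pending = new Map(); // docId -> chunks received so far

  return function collect(result) {
    const chunks = pending.get(result.docId) || [];
    chunks.push(...result.chunks);

    if (!result.isLastBatch) {
      pending.set(result.docId, chunks);
      return null; // more batches expected
    }

    pending.delete(result.docId);
    return chunks.sort((a, b) => a.index - b.index);
  };
}

const collect = createChunkCollector();
collect({ docId: 'doc123', isLastBatch: false, chunks: [{ index: 1, chunkText: 'b' }] });
const all = collect({ docId: 'doc123', isLastBatch: true, chunks: [{ index: 0, chunkText: 'a' }] });
console.log(all.map((chunk) => chunk.chunkText).join('')); // ab
```

Keeping the state in a `Map` keyed by `docId` lets several documents be in flight at once. A production version would also need to evict entries whose last batch never arrives.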

---

## Additional Utilities

### Chunk Validation

```javascript
const {
  ChunkData,
  ExtractionConfig,
  CustomExtractorFactory
} = require('document-extraction-service');

// Validate chunks
ChunkData.validateResponse(chunksData);
ChunkData.validateChunk(chunk);
```
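
The checks performed by `ChunkData.validateChunk` are not described in this README. As a rough illustration of what validating against the documented callback chunk shape could involve, here is a standalone sketch. `findChunkProblems` is a hypothetical function, and requiring `index` to be a non-negative integer is an assumption (the input format only says `number`).

```javascript
// Hypothetical check mirroring the documented callback chunk shape.
// Returns a list of problems; an empty list means the chunk looks valid.
function findChunkProblems(chunk) {
  if (chunk === null || typeof chunk !== 'object') {
    return ['chunk must be an object'];
  }
  const problems = [];
  for (const field of ['content', 'chunkId', 'chunkText']) {
    if (typeof chunk[field] !== 'string') {
      problems.push(`${field} must be a string`);
    }
  }
  // Assumption: indices are non-negative integers.
  if (!Number.isInteger(chunk.index) || chunk.index < 0) {
    problems.push('index must be a non-negative integer');
  }
  return problems;
}

console.log(findChunkProblems({ content: 'c', chunkId: 'id1', chunkText: 't', index: 0 })); // []
console.log(findChunkProblems({ content: 'c', chunkId: 'id1', index: -1 }));
// [ 'chunkText must be a string', 'index must be a non-negative integer' ]
```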

### Custom Configurations

```javascript
const config = new ExtractionConfig({...});
const factory = new CustomExtractorFactory();

const customRequestValidator = factory.createRequestValidator(config);
const customCallbackValidator = factory.createCallbackValidator();
```

---

## Error Handling

### Validating Input

```javascript
try {
  const requestParams = requestValidator.prepareRequest(docId, content, streamId);
} catch (error) {
  if (error.message.includes('Missing required field')) {
    // Handle validation error
  } else {
    // Handle other errors
  }
}
```

### Handling Callback Validation Errors

```javascript
try {
  const result = await callbackValidator.handleCallback(callbackData);
  if (!result.success) {
    // Handle validation failure
    console.error(result.error);
  }
} catch (error) {
  // Handle unexpected errors
  console.error(error);
}
```

---

## Testing

Run the provided test suite:

```bash
npm test
```

---

## Features
- **Request Preparation**: Builds API requests with the configured headers and parameters.
- **Response Validation**: Checks that API responses have the expected format.
- **Callback Processing**: Validates callback payloads and returns the processed chunks with metadata.
- **Customizable Configuration**: Supports configurable timeouts, retry limits, and callback URL patterns.

---

## License
This project is licensed under the MIT License. Contributions are welcome!

