# WordSensor v2.0.0 🚀

[![npm version](https://badge.fury.io/js/word-sensor.svg)](https://badge.fury.io/js/word-sensor)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![TypeScript](https://img.shields.io/badge/TypeScript-007ACC?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
[![Tests](https://img.shields.io/badge/Tests-36%20passed-brightgreen)](https://github.com/asruldev/word-sensor)

**WordSensor** is a word filtering library for JavaScript and TypeScript. It detects, replaces, or removes forbidden words in text, and supports regex patterns, detection statistics, batch processing, and preset filters.

## ✨ Features

- 🔍 **Advanced Detection**: Detect prohibited words with precise positioning
- 🚫 **Multiple Filtering Modes**: Replace, remove, or highlight forbidden words
- 🎭 **Smart Masking**: Full, partial, or smart masking options
- 📊 **Statistics & Analytics**: Track detections and get detailed insights
- 🔧 **Regex Support**: Use custom regex patterns for complex filtering
- 📦 **Batch Processing**: Process multiple texts efficiently
- 🎯 **Preset Filters**: Ready-to-use profanity, spam, and phishing filters
- 🔄 **Custom Replacers**: Create custom replacement functions
- 📈 **Real-time Monitoring**: Log and track all detections
- 🌐 **API Integration**: Load forbidden words from external APIs
- 📁 **File Support**: Import word lists from files
- ⚡ **High Performance**: Optimized for speed and memory efficiency
- 🎨 **Emoji Replacers**: Replace words with emojis
- 🔒 **Word Boundaries**: Configurable word boundary detection
- 📝 **TypeScript Support**: Full TypeScript definitions included

## 📦 Installation

```bash
npm install word-sensor
```

or

```bash
yarn add word-sensor
```

## 🚀 Quick Start

### Basic Usage

```typescript
import { WordSensor } from 'word-sensor';

// Create a sensor with forbidden words
const sensor = new WordSensor({
  words: ['badword', 'offensive', 'rude'],
  maskChar: '*',
  caseInsensitive: true,
  logDetections: true
});

// Filter text
const result = sensor.filter('This is a badword test.');
console.log(result); // "This is a ******* test."
```

### Using Preset Filters

```typescript
import { createProfanityFilter, createSpamFilter, createPhishingFilter } from 'word-sensor';

// Create specialized filters
const profanityFilter = createProfanityFilter();
const spamFilter = createSpamFilter();
const phishingFilter = createPhishingFilter();

// Use them
console.log(profanityFilter.filter('This is badword content.')); // "This is ******* content."
console.log(spamFilter.filter('Buy now! Free money!')); // "#### now! #### money!"
```

## 📚 API Reference

### WordSensor Class

#### Constructor

```typescript
new WordSensor(config?: WordSensorConfig)
```

**Configuration Options:**
- `words?: string[]` - Initial list of forbidden words
- `maskChar?: string` - Character used for masking (default: "*")
- `caseInsensitive?: boolean` - Case-insensitive matching (default: true)
- `logDetections?: boolean` - Enable detection logging (default: false)
- `enableRegex?: boolean` - Enable regex pattern support (default: false)
- `wordBoundary?: boolean` - Use word boundaries (default: true)
- `customReplacer?: (word: string, context: string) => string` - Custom replacement function

#### Core Methods

##### `filter(text: string, mode?: "replace" | "remove" | "highlight", maskType?: "full" | "partial" | "smart"): string`

Filter text with specified mode and masking type.

```typescript
// Replace with full masking
sensor.filter('This is badword.'); // "This is *******."

// Remove forbidden words
sensor.filter('This is badword.', 'remove'); // "This is ."

// Highlight forbidden words
sensor.filter('This is badword.', 'highlight'); // "This is [FILTERED: badword]."

// Smart masking
sensor.filter('This is badword.', 'replace', 'smart'); // "This is b****d."
```
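
The three modes can be thought of as three different replacement callbacks over the same match. The sketch below reproduces the documented outputs for `replace`, `remove`, and `highlight` in standalone code; `applyMode` is a hypothetical function, and the library's internals may differ.

```typescript
// Illustrative only: mirrors the documented outputs, not the library's internals.
type FilterMode = "replace" | "remove" | "highlight";

function applyMode(text: string, words: string[], mode: FilterMode, maskChar = "*"): string {
  if (words.length === 0) return text;
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp(`\\b(?:${escaped.join("|")})\\b`, "gi");

  return text.replace(pattern, (match) => {
    switch (mode) {
      case "remove":
        return ""; // drops the word but leaves surrounding whitespace untouched
      case "highlight":
        return `[FILTERED: ${match}]`;
      default:
        return maskChar.repeat(match.length); // "replace" with full masking
    }
  });
}

const sampleWords = ["badword"];
console.log(applyMode("This is badword.", sampleWords, "replace"));   // "This is *******."
console.log(applyMode("This is badword.", sampleWords, "remove"));    // "This is ."
console.log(applyMode("This is badword.", sampleWords, "highlight")); // "This is [FILTERED: badword]."
```

Note that `remove` leaves the space before the period in place, so callers who care about spacing may want to normalize whitespace afterwards.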

##### `detect(text: string): string[]`

Detect all forbidden words in text.

```typescript
const detected = sensor.detect('This contains badword and offensive content.');
console.log(detected); // ["badword", "offensive"]
```

##### `detectWithPositions(text: string): Array<{word: string, start: number, end: number}>`

Detect forbidden words with their positions.

```typescript
const positions = sensor.detectWithPositions('This badword is offensive.');
console.log(positions);
// [
//   { word: "badword", start: 5, end: 12 },
//   { word: "offensive", start: 16, end: 25 }
// ]
```
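
The `start` and `end` values in the example behave like half-open offsets: `start` is the index of the first character and `end` is one past the last. A minimal standalone sketch of that convention, using a hypothetical `findPositions` function rather than the library itself:

```typescript
interface WordPosition {
  word: string;
  start: number; // index of the first character
  end: number;   // index just past the last character
}

function findPositions(text: string, words: string[]): WordPosition[] {
  if (words.length === 0) return []; // an empty alternation would match everywhere
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp(`\\b(?:${escaped.join("|")})\\b`, "gi");
  const found: WordPosition[] = [];

  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push({
      word: match[0],
      start: match.index,
      end: match.index + match[0].length,
    });
  }
  return found;
}

const sampleText = "This badword is offensive.";
const hits = findPositions(sampleText, ["badword", "offensive"]);
console.log(hits);
// [ { word: "badword", start: 5, end: 12 }, { word: "offensive", start: 16, end: 25 } ]

// Because `end` is exclusive, slice() recovers the matched word directly.
console.log(sampleText.slice(hits[0].start, hits[0].end)); // "badword"
```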

#### Word Management

```typescript
// Add words
sensor.addWord('newbadword', '###'); // With custom mask
sensor.addWords(['word1', 'word2']);

// Remove words
sensor.removeWord('badword');
sensor.removeWords(['word1', 'word2']);

// Check words
sensor.hasWord('badword'); // true/false
sensor.getWords(); // Get all forbidden words
sensor.clearWords(); // Clear all words
```

#### Regex Patterns

```typescript
// Enable regex support
const regexSensor = new WordSensor({ enableRegex: true });

// Add regex patterns
regexSensor.addRegexPattern('\\b\\w+@\\w+\\.\\w+\\b', '[EMAIL]');
regexSensor.addRegexPattern('\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b', '[CARD]');

// Filter with regex
const result = regexSensor.filter('Contact me at test@example.com');
console.log(result); // "Contact me at [EMAIL]"
```
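
To try patterns like these outside the library, the following standalone sketch applies a list of pattern and replacement pairs in order. `PatternRule`, `makeRule`, and `applyRules` are hypothetical names; how word-sensor stores and applies patterns internally is not shown in this README.

```typescript
// Hypothetical shape for a pattern rule; word-sensor's internal storage may differ.
interface PatternRule {
  pattern: RegExp;
  replacement: string;
}

function makeRule(source: string, replacement: string): PatternRule {
  // The "g" flag makes replace() substitute every occurrence, not just the first.
  return { pattern: new RegExp(source, "g"), replacement };
}

function applyRules(text: string, rules: PatternRule[]): string {
  // Each rule sees the output of the previous one.
  return rules.reduce(
    (current, rule) => current.replace(rule.pattern, rule.replacement),
    text
  );
}

const sampleRules = [
  makeRule("\\b\\w+@\\w+\\.\\w+\\b", "[EMAIL]"),
  makeRule("\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b", "[CARD]"),
];

console.log(applyRules("Mail test@example.com, card 1234-5678-9012-3456", sampleRules));
// "Mail [EMAIL], card [CARD]"
```

The email pattern above is deliberately simple: `\w+` does not cover dots or hyphens, so an address such as `first.last@mail.example.com` would only be partially masked. Tighten the pattern if full coverage matters.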

#### Statistics & Monitoring

```typescript
// Get detection statistics
const stats = sensor.getStats();
console.log(stats);
// {
//   totalDetections: 5,
//   uniqueWords: ["badword", "offensive"],
//   detectionCounts: { "badword": 3, "offensive": 2 },
//   lastDetectionTime: Date
// }

// Get detection logs
const logs = sensor.getDetectionLogs();
console.log(logs); // ["badword", "offensive", "badword", ...]

// Reset statistics
sensor.resetStats();
```
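
For readers who want to see how such a statistics object could be accumulated, here is a small standalone tracker. The field names follow the `getStats()` output shown above; the `DetectionTracker` class is hypothetical and is not the library's implementation.

```typescript
// Field names follow the getStats() output shown above; the class itself is hypothetical.
interface DetectionStats {
  totalDetections: number;
  uniqueWords: string[];
  detectionCounts: Record<string, number>;
  lastDetectionTime: Date | null;
}

class DetectionTracker {
  private counts = new Map<string, number>();
  private lastTime: Date | null = null;

  record(word: string): void {
    this.counts.set(word, (this.counts.get(word) ?? 0) + 1);
    this.lastTime = new Date();
  }

  snapshot(): DetectionStats {
    const detectionCounts: Record<string, number> = {};
    let totalDetections = 0;
    this.counts.forEach((count, word) => {
      detectionCounts[word] = count;
      totalDetections += count;
    });
    return {
      totalDetections,
      uniqueWords: Object.keys(detectionCounts),
      detectionCounts,
      lastDetectionTime: this.lastTime,
    };
  }

  reset(): void {
    this.counts.clear();
    this.lastTime = null;
  }
}

const tracker = new DetectionTracker();
["badword", "offensive", "badword", "badword", "offensive"].forEach((w) => tracker.record(w));
console.log(tracker.snapshot().detectionCounts); // { badword: 3, offensive: 2 }
```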

#### Configuration Methods

```typescript
// Update configuration
sensor.setMaskChar('#');
sensor.setCaseInsensitive(false);
sensor.setLogDetections(true);
sensor.setCustomReplacer((word) => `[${word.toUpperCase()}]`);
```

#### Utility Methods

```typescript
// Check if text is clean
sensor.isClean('This is clean text.'); // true
sensor.isClean('This has badword.'); // false

// Get clean percentage
sensor.getCleanPercentage('This badword is offensive.'); // 50

// Sanitize text (quick filter)
sensor.sanitizeText('This is badword.'); // "This is *******."
```
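
The README does not define how the clean percentage is calculated. One reading that is consistent with the `50` in the example above is "the share of whitespace-separated tokens that are not forbidden words". The sketch below implements that assumption in a hypothetical `cleanPercentage` function; the library may compute it differently.

```typescript
// Assumption: "clean percentage" = share of whitespace-separated tokens
// that are not forbidden words, rounded to two decimals.
function cleanPercentage(text: string, forbidden: string[]): number {
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  if (tokens.length === 0) return 100; // nothing to flag in empty input

  const lowered = forbidden.map((w) => w.toLowerCase());
  const dirty = tokens.filter((token) => {
    // Strip leading/trailing punctuation so "offensive." still counts as "offensive".
    const bare = token.toLowerCase().replace(/^\W+|\W+$/g, "");
    return lowered.indexOf(bare) !== -1;
  }).length;

  return Math.round(((tokens.length - dirty) / tokens.length) * 10000) / 100;
}

console.log(cleanPercentage("This badword is offensive.", ["badword", "offensive"])); // 50
console.log(cleanPercentage("This is clean text.", ["badword"])); // 100
```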

### Utility Functions

#### Preset Filters

```typescript
import { 
  createProfanityFilter, 
  createSpamFilter, 
  createPhishingFilter,
  PRESET_WORDS 
} from 'word-sensor';

// Create specialized filters
const profanityFilter = createProfanityFilter('*');
const spamFilter = createSpamFilter('#');
const phishingFilter = createPhishingFilter('!');

// Access preset word lists
console.log(PRESET_WORDS.profanity);
console.log(PRESET_WORDS.spam);
console.log(PRESET_WORDS.phishing);
```

#### Batch Processing

```typescript
import { batchFilter, batchDetect, getBatchStats } from 'word-sensor';

const texts = [
  'This is bad.',
  'This is offensive.',
  'This is clean.'
];

// Batch filter (assumes `sensor` has 'bad' and 'offensive' in its word list)
const filtered = batchFilter(texts, sensor);
console.log(filtered);
// ["This is ***.", "This is *********.", "This is clean."]

// Batch detect
const detected = batchDetect(texts, sensor);
console.log(detected);
// [
//   { text: "This is bad.", detected: ["bad"] },
//   { text: "This is offensive.", detected: ["offensive"] },
//   { text: "This is clean.", detected: [] }
// ]

// Get batch statistics
const stats = getBatchStats(texts, sensor);
console.log(stats);
// {
//   totalTexts: 3,
//   cleanTexts: 1,
//   dirtyTexts: 2,
//   totalDetections: 2,
//   averageCleanPercentage: 66.67
// }
```
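
The count fields in the batch statistics can be derived from per-text detection results alone. Here is a standalone sketch with a hypothetical `summarizeBatch` function that takes any detector function in place of a sensor. It leaves out `averageCleanPercentage`, since the README does not say how that value is defined.

```typescript
// Hypothetical stand-in for sensor.detect(): any function from text to detected words.
type Detector = (text: string) => string[];

interface BatchSummary {
  totalTexts: number;
  cleanTexts: number;
  dirtyTexts: number;
  totalDetections: number;
}

function summarizeBatch(texts: string[], detect: Detector): BatchSummary {
  const perText = texts.map((text) => detect(text).length);
  const dirtyTexts = perText.filter((n) => n > 0).length;
  return {
    totalTexts: texts.length,
    cleanTexts: texts.length - dirtyTexts,
    dirtyTexts,
    totalDetections: perText.reduce((sum, n) => sum + n, 0),
  };
}

// Toy detector for the demo: case-insensitive whole-word lookup.
const toyDetect: Detector = (text) =>
  text.toLowerCase().match(/\b(?:bad|offensive)\b/g) ?? [];

const summary = summarizeBatch(
  ["This is bad.", "This is offensive.", "This is clean."],
  toyDetect
);
console.log(summary);
// { totalTexts: 3, cleanTexts: 1, dirtyTexts: 2, totalDetections: 2 }
```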

#### Custom Replacers

```typescript
import { createCustomReplacer, createEmojiReplacer } from 'word-sensor';

// Create custom replacer
const customReplacer = createCustomReplacer({
  'bad': 'good',
  'offensive': 'appropriate',
  'rude': 'polite'
});

// Create emoji replacer
const emojiReplacer = createEmojiReplacer();

// Use with sensor. Only one custom replacer is active at a time, so pick one:
sensor.setCustomReplacer(customReplacer);
// sensor.setCustomReplacer(emojiReplacer);
```
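
If the built-in factories do not fit, a replacer can also be written by hand. The sketch below builds a dictionary-based replacer with the `(word, context) => string` signature listed under the `customReplacer` option. `makeDictionaryReplacer` is a hypothetical helper, and the meaning of `context` (presumably the surrounding text) is an assumption, so the sketch only passes it through.

```typescript
// Signature taken from the `customReplacer` option; the factory below is hypothetical.
type Replacer = (word: string, context: string) => string;

function makeDictionaryReplacer(
  dictionary: Record<string, string>,
  fallback: Replacer = (word) => "*".repeat(word.length)
): Replacer {
  return (word, context) => {
    const key = word.toLowerCase(); // dictionary keys are expected in lower case
    // hasOwnProperty guards against inherited keys such as "constructor".
    return Object.prototype.hasOwnProperty.call(dictionary, key)
      ? dictionary[key]
      : fallback(word, context);
  };
}

const soften = makeDictionaryReplacer({ bad: "good", rude: "polite" });
console.log(soften("Bad", "Bad idea"));        // "good"
console.log(soften("offensive", "offensive")); // "*********"
```

Falling back to a mask for unmapped words keeps the replacer safe to use even when the dictionary covers only part of the word list.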

#### Regex Utilities

```typescript
import { validateRegexPattern, escapeRegexSpecialChars } from 'word-sensor';

// Validate regex pattern
validateRegexPattern('\\b\\w+\\b'); // true
validateRegexPattern('invalid['); // false

// Escape special characters
escapeRegexSpecialChars('test.com'); // "test\\.com"
escapeRegexSpecialChars('test*test'); // "test\\*test"
```
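
Both utilities correspond to well-known JavaScript idioms: validation by attempting to construct a `RegExp`, and escaping by prefixing metacharacters with a backslash. The standalone equivalents below use the hypothetical names `isValidPattern` and `escapeLiteral`; the library's own functions may differ in detail.

```typescript
// Standalone equivalents for illustration; the library's own versions may differ.
function isValidPattern(source: string): boolean {
  try {
    new RegExp(source);
    return true;
  } catch {
    return false; // RegExp throws SyntaxError on malformed patterns
  }
}

function escapeLiteral(text: string): string {
  // Prefix every regex metacharacter with a backslash.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

console.log(isValidPattern("\\b\\w+\\b")); // true
console.log(isValidPattern("invalid["));   // false
console.log(escapeLiteral("test.com"));    // test\.com

// An escaped word is safe to embed in a larger pattern:
const exact = new RegExp(`^${escapeLiteral("a+b")}$`);
console.log(exact.test("a+b")); // true
console.log(exact.test("aab")); // false
```

Escaping matters whenever a forbidden word containing characters such as `.` or `+` is turned into a pattern, since unescaped it would match more than the literal word.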

#### API Integration

```typescript
import { loadForbiddenWordsFromAPI, loadWordsFromFile } from 'word-sensor';

// Load from API
await loadForbiddenWordsFromAPI(
  'https://api.example.com/forbidden-words',
  'data.words',
  sensor
);

// Load from file (browser)
const fileInput = document.getElementById('file') as HTMLInputElement;
const file = fileInput.files?.[0]; // `files` can be null, so use optional chaining
if (file) {
  const words = await loadWordsFromFile(file);
  sensor.addWords(words);
}
```
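
The second argument in the API example, `'data.words'`, looks like a dot-separated path into the JSON response. That reading is an assumption, as the README does not spell it out. Under that assumption, resolving the path could look like the following standalone sketch, where `getByPath` and `extractWords` are hypothetical helpers and no network call is made:

```typescript
// Assumption: the path argument is a dot-separated list of property names.
function getByPath(source: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>((current, key) => {
    if (current !== null && typeof current === "object" && key in current) {
      return (current as Record<string, unknown>)[key];
    }
    return undefined; // path does not exist in this payload
  }, source);
}

function extractWords(payload: unknown, path: string): string[] {
  const value = getByPath(payload, path);
  // Keep only string entries so a malformed response cannot inject other types.
  return Array.isArray(value)
    ? value.filter((item): item is string => typeof item === "string")
    : [];
}

const samplePayload = { data: { words: ["badword", "offensive", 42] } };
console.log(extractWords(samplePayload, "data.words"));   // ["badword", "offensive"]
console.log(extractWords(samplePayload, "data.missing")); // []
```

Returning an empty list for a missing path lets the caller decide whether an empty result is an error, rather than throwing from inside the helper.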

## 🎯 Advanced Examples

### Content Moderation System

```typescript
import { WordSensor, createProfanityFilter, createSpamFilter } from 'word-sensor';

class ContentModerator {
  private profanityFilter: WordSensor;
  private spamFilter: WordSensor;
  private customFilter: WordSensor;

  constructor() {
    this.profanityFilter = createProfanityFilter();
    this.spamFilter = createSpamFilter();
    this.customFilter = new WordSensor({
      enableRegex: true,
      wordBoundary: false
    });

    // Add custom patterns
    this.customFilter.addRegexPattern('\\b\\w+@\\w+\\.\\w+\\b', '[EMAIL]');
    this.customFilter.addRegexPattern('\\b\\d{10,}\\b', '[PHONE]');
  }

  moderateContent(content: string): {
    isClean: boolean;
    filteredContent: string;
    violations: string[];
    stats: any;
  } {
    // Apply all filters
    let filteredContent = content;
    const violations: string[] = [];

    // Check profanity
    const profanityDetected = this.profanityFilter.detect(content);
    if (profanityDetected.length > 0) {
      violations.push('profanity');
      filteredContent = this.profanityFilter.filter(filteredContent);
    }

    // Check spam
    const spamDetected = this.spamFilter.detect(content);
    if (spamDetected.length > 0) {
      violations.push('spam');
      filteredContent = this.spamFilter.filter(filteredContent);
    }

    // Apply custom filters
    filteredContent = this.customFilter.filter(filteredContent);

    return {
      isClean: violations.length === 0,
      filteredContent,
      violations,
      stats: {
        profanity: this.profanityFilter.getStats(),
        spam: this.spamFilter.getStats(),
        custom: this.customFilter.getStats()
      }
    };
  }
}

// Usage
const moderator = new ContentModerator();
const result = moderator.moderateContent('This is badword spam content with test@example.com');
console.log(result);
```

### Real-time Chat Filter

```typescript
import { WordSensor, createEmojiReplacer } from 'word-sensor';

class ChatFilter {
  private sensor: WordSensor;
  private messageHistory: string[] = [];

  constructor() {
    this.sensor = new WordSensor({
      words: ['badword', 'offensive'],
      logDetections: true,
      customReplacer: createEmojiReplacer()
    });
  }

  processMessage(message: string, userId: string): {
    filteredMessage: string;
    isClean: boolean;
    warning: string | null;
  } {
    const filteredMessage = this.sensor.filter(message);
    const isClean = this.sensor.isClean(message);
    
    // Check user history (entries are stored as "<userId>: <message>")
    const prefix = `${userId}: `;
    const userViolations = this.messageHistory.filter(msg =>
      msg.startsWith(prefix) && !this.sensor.isClean(msg.slice(prefix.length))
    ).length;

    let warning: string | null = null;
    if (!isClean) {
      if (userViolations >= 3) {
        warning = 'You have been warned multiple times. Further violations may result in a ban.';
      } else {
        warning = 'Please keep the chat appropriate.';
      }
    }

    // Log message
    this.messageHistory.push(`${userId}: ${message}`);

    return { filteredMessage, isClean, warning };
  }

  getModerationStats() {
    return this.sensor.getStats();
  }
}
```
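
Re-scanning the full message history on every message grows more expensive as the chat gets longer. One possible alternative, sketched here as a standalone class, is to keep a running violation count per user. `ViolationCounter` is a hypothetical helper and not part of word-sensor; the caller supplies the result of `sensor.isClean(message)`.

```typescript
// Hypothetical alternative to scanning message history: keep a running count per user.
class ViolationCounter {
  private counts = new Map<string, number>();
  private readonly threshold: number;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Returns the warning to show, or null when the message was clean.
  record(userId: string, messageIsClean: boolean): string | null {
    if (messageIsClean) return null;

    // Compare against violations recorded before this message.
    const previous = this.counts.get(userId) ?? 0;
    this.counts.set(userId, previous + 1);

    return previous >= this.threshold
      ? "You have been warned multiple times. Further violations may result in a ban."
      : "Please keep the chat appropriate.";
  }
}

const counter = new ViolationCounter();
for (let i = 0; i < 3; i++) counter.record("user-1", false); // three mild warnings
console.log(counter.record("user-1", false)); // escalated warning on the fourth violation
console.log(counter.record("user-2", true));  // null
```

The counts here live in memory only; a real deployment would likely need expiry or persistent storage, which is outside the scope of this sketch.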

### Batch Content Analysis

```typescript
import { WordSensor, batchDetect, getBatchStats } from 'word-sensor';

class ContentAnalyzer {
  private sensor: WordSensor;

  constructor() {
    this.sensor = new WordSensor({
      words: ['inappropriate', 'spam', 'offensive'],
      logDetections: true
    });
  }

  analyzeBatch(contentList: string[]): {
    summary: any;
    details: Array<{
      content: string;
      isClean: boolean;
      detectedWords: string[];
      cleanPercentage: number;
    }>;
  } {
    const batchResults = batchDetect(contentList, this.sensor);
    const batchStats = getBatchStats(contentList, this.sensor);

    const details = contentList.map((content, index) => ({
      content,
      isClean: batchResults[index].detected.length === 0,
      detectedWords: batchResults[index].detected,
      cleanPercentage: this.sensor.getCleanPercentage(content)
    }));

    return {
      summary: {
        ...batchStats,
        sensorStats: this.sensor.getStats()
      },
      details
    };
  }
}
```

## 🧪 Testing

```bash
# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Run tests with coverage
npm run test:coverage
```

## 📦 Build

```bash
# Build for production
npm run build

# Build in watch mode
npm run dev

# Clean build artifacts
npm run clean
```

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📄 License

This project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.

## 👨‍💻 Author

Developed by [Asrul Harahap](https://github.com/asruldev).

- GitHub: [@asruldev](https://github.com/asruldev)
- Twitter: [@asruldev](https://twitter.com/asruldev)

## 🙏 Acknowledgments

- Thanks to all contributors who helped improve this library
- Inspired by the need for better content moderation tools
- Built with TypeScript for better developer experience

## 📈 Changelog

### v2.0.0
- ✨ **Major Release**: Complete rewrite with advanced features
- 🔧 **New Constructor**: Config-based initialization
- 📊 **Statistics**: Comprehensive detection tracking
- 🔍 **Regex Support**: Custom regex pattern filtering
- 📦 **Batch Processing**: Efficient multi-text processing
- 🎯 **Preset Filters**: Ready-to-use specialized filters
- 🎨 **Custom Replacers**: Flexible replacement functions
- 📈 **Position Detection**: Get exact word positions
- 🔄 **Smart Masking**: Intelligent masking algorithms
- 🌐 **API Integration**: External word list loading
- 📁 **File Support**: Import word lists from files
- 🎨 **Emoji Replacers**: Fun emoji-based replacements
- 📝 **Enhanced Types**: Better TypeScript support
- 🧪 **Comprehensive Tests**: 36 test cases covering all features

### v1.0.5
- 🐛 Bug fixes and improvements
- 📝 Better documentation

---

⭐ **Star this repository if you find it useful!**