# Research-Driven AI Safety Testing

This task guides comprehensive safety testing for AI agents through research-driven methodology, focusing on discovering current safety standards and testing approaches rather than prescriptive static implementations.

## Research-First Safety Assessment

[[LLM: Begin by researching current AI safety testing methodologies, standards, and emerging threats. Understand the specific safety requirements and regulatory landscape before implementing safety testing solutions.]]

### 1. Research Safety Testing Approaches

**Safety Framework Research Areas**:

- Current AI safety testing frameworks and methodologies (OWASP AI, NIST AI RMF, etc.)
- Latest developments in AI alignment and safety research
- Industry-standard safety metrics and evaluation techniques
- Emerging AI threats and attack vectors (prompt injection, jailbreaking, etc.)
- Regulatory compliance requirements and safety standards

**Testing Methodology Research**:

- Adversarial testing techniques for AI systems
- Bias detection and fairness evaluation methods
- Privacy protection testing approaches for LLM applications
- Content safety filtering and moderation techniques
- Red team testing strategies for AI agents

### 2. Research-Based Implementation Strategy

[[LLM: Based on your research findings, implement safety testing using current best practices. Focus on:

1. **Safety Framework Selection**: Choose safety frameworks based on researched industry standards and project requirements
2. **Testing Methodology Design**: Design safety tests using current evaluation techniques and threat models
3. **Tool Integration**: Integrate safety testing tools based on researched capabilities and effectiveness
4. **Compliance Validation**: Ensure compliance using current regulatory requirements and safety standards
5. **Continuous Monitoring**: Implement safety monitoring using research-backed detection and mitigation strategies

Document your safety implementation choices and rationale based on the research conducted.]]

### 3. Safety Assessment Framework

**Research Current Safety Approaches**:

- Investigate content safety detection and filtering techniques
- Study bias evaluation methodologies for AI systems
- Research privacy protection mechanisms for LLM applications
- Analyze adversarial robustness testing approaches for AI agents

**Implementation Areas**:

- Establish safety baselines using researched evaluation methodologies
- Configure adversarial testing scenarios based on current threat intelligence
- Set up bias detection using research-informed fairness metrics
- Implement privacy protection using current best practices

### 4. Validation and Continuous Monitoring

**Research Validation Methodologies**:

- Investigate safety regression testing techniques for AI systems
- Study continuous safety monitoring approaches for production AI applications
- Research incident response methodologies for AI safety violations
- Analyze safety optimization strategies based on evaluation data

**Implementation Validation**:

- Apply research-backed safety testing methodologies to validate agent safety
- Use current safety analysis techniques to identify vulnerabilities
- Implement safety monitoring and alerting based on researched best practices
- Establish continuous safety improvement processes using current mitigation strategies

---

**Note**: This task emphasizes research-driven safety testing over prescriptive static implementations. Always research current AI safety standards and adapt to your specific agent architecture and safety requirements.
