# Web Carbon Analyzer

A tool to measure the carbon footprint of websites using CO2.js and Playwright. This tool loads websites in a headless browser, tracks all resources loaded, and calculates the carbon emissions associated with each website visit. It also integrates Lighthouse for performance and accessibility audits, providing a comprehensive analysis of web pages.

## Features

- **Website Carbon Measurement**: Calculate the carbon footprint of any website
- **Detailed Resource Analysis**: Track and analyze all resources loaded by the website
- **Green Hosting Detection**: Identify whether resources are served from green-powered hosting
- **User Behavior Simulation**: Simulate realistic user interactions including scrolling
- **Comprehensive Results**: Get detailed breakdowns by resource type, domain, and more
- **Lighthouse Integration**: Audit web pages for performance, accessibility, and best practices
- **Flexible API**: Use as a library in your own projects or as a standalone CLI tool
- **Docker Support**: Run easily in containerized environments

## Installation

### As a Library

```bash
npm install @pierrad/web-carbon-analyzer
```

### As a CLI Tool

For global installation:

```bash
npm install -g @pierrad/web-carbon-analyzer
```

### From Source

1. Clone the repository:
   ```bash
   git clone <repository-url>
   cd web-carbon-analyzer
   ```

2. Install dependencies:
   ```bash
   npm install
   ```

3. Build the project:
   ```bash
   npm run build
   ```

4. Start the application:
   ```bash
   npm run start
   ```

## Usage

### As a Library

```javascript
// CommonJS
const { CarbonFootprintAnalyzer } = require('web-carbon-analyzer');

// ES Modules
import { CarbonFootprintAnalyzer } from 'web-carbon-analyzer';

async function analyzeWebsite() {
  // Create analyzer with default configuration
  const analyzer = new CarbonFootprintAnalyzer();
  
  // Or with custom configuration
  const customAnalyzer = new CarbonFootprintAnalyzer({
    browser: {
      headless: true,
      timeout: 30000,
      viewport: { width: 1440, height: 900 }
    },
    co2: {
      model: 'swd',
      includeGreenHostingCheck: true
    },
    output: {
      includeResourceDetails: true
    },
    // Lighthouse integration
    lighthouse: {
      enabled: true,
      port: 9222,  // Port for Chrome remote debugging
      thresholds: {
        performance: 70,
        accessibility: 80,
        'best-practices': 80,
        seo: 80
      },
      reports: {
        formats: {
          json: true,
          html: true
        }
      }
    }
  });
  
  try {
    // Analyze a website
    const results = await analyzer.analyze('https://example.com');
    
    // Access results
    console.log(`Total CO2: ${results.summary.totalEmissions}g`);
    console.log(`Total Size: ${results.summary.totalSize} bytes`);
    console.log(`Green Hosting: ${results.summary.percentageGreen}%`);
    
    // Examine detailed breakdowns
    for (const [type, data] of Object.entries(results.breakdowns.byType)) {
      console.log(`${type}: ${data.emissions}g CO2, ${data.size} bytes`);
    }
    
    // Access Lighthouse results if available
    if (results.lighthouse) {
      console.log(`Performance Score: ${results.lighthouse.scores.performance}`);
      console.log(`Accessibility Score: ${results.lighthouse.scores.accessibility}`);
      console.log(`First Contentful Paint: ${results.lighthouse.metrics.firstContentfulPaint}ms`);
      console.log(`Largest Contentful Paint: ${results.lighthouse.metrics.largestContentfulPaint}ms`);
    }
    
    return results;
  } catch (error) {
    console.error('Analysis failed:', error);
    throw error;
  }
}

analyzeWebsite();
```

### As a CLI Tool

```bash
# Basic usage
carbon-analyzer --url https://example.com

# Save results to a file
carbon-analyzer --url https://example.com --output results.json

# With detailed resource breakdown
carbon-analyzer --url https://example.com --detailed

# Custom viewport and scrolling
carbon-analyzer --url https://example.com --viewport 1440x900 --scroll-depth 10

# Use a specific CO2 calculation model
carbon-analyzer --url https://example.com --model 1byte

# Run with Lighthouse audit
carbon-analyzer --url https://example.com --lighthouse

# Run with Lighthouse audit and custom performance thresholds
carbon-analyzer --url https://example.com --lighthouse --lighthouse-performance-threshold 80
```

### Docker Usage

```bash
# Build the Docker image
docker build -t web-carbon-analyzer .
```

```bash
# Basic usage
docker run web-carbon-analyzer --url https://example.com

# Save results to a mounted volume
docker run -v $(pwd)/results:/data web-carbon-analyzer --url https://example.com --output /data/results.json

# With detailed options
docker run web-carbon-analyzer --url https://example.com --detailed --viewport 1440x900 --scroll-depth 10
```

## Lighthouse Integration

Web Carbon Analyzer can run a Lighthouse audit alongside the carbon footprint analysis, providing performance and accessibility metrics for your website.

### Lighthouse Metrics

The following metrics are included in the Lighthouse report:

- **Performance Score**: Overall performance score (0-100)
- **Accessibility Score**: Accessibility compliance score (0-100)
- **First Contentful Paint**: Time until the first content is rendered
- **Largest Contentful Paint**: Time until the largest content element is rendered
- **Total Blocking Time**: Sum of all time periods between FCP and TTI
- **Speed Index**: How quickly the contents of a page are visibly populated
- **Time to Interactive**: Time it takes for the page to become fully interactive
- **Cumulative Layout Shift**: Measures visual stability of a page

### Enabling Lighthouse

#### As a Library

```javascript
const analyzer = new CarbonFootprintAnalyzer({
  lighthouse: {
    enabled: true,
    port: 9222,  // Port for Chrome remote debugging
    thresholds: {
      performance: 70,
      accessibility: 80,
      'best-practices': 80,
      seo: 80
    }
  }
});
```

#### As a CLI Tool

```bash
carbon-analyzer --url https://example.com --lighthouse
```

You can also set performance thresholds:

```bash
carbon-analyzer --url https://example.com --lighthouse --lighthouse-performance-threshold 80 --lighthouse-accessibility-threshold 90
```

## API Reference

### `CarbonFootprintAnalyzer`

The main class for analyzing websites.

#### Constructor

```typescript
new CarbonFootprintAnalyzer(config?: Partial<AppConfig>)
```

- `config`: Optional configuration object to customize the analyzer behavior.

#### Methods

##### `analyze(url: string): Promise<FormattedResults>`

Analyzes a website and returns the carbon footprint data.

- `url`: The URL of the website to analyze.
- Returns: A promise that resolves to the analysis results.

##### `getConfig(): AppConfig`

Gets the current configuration.

- Returns: The current configuration object.

##### `updateConfig(newConfig: Partial<AppConfig>): void`

Updates the current configuration.

- `newConfig`: Partial configuration object to merge with the current configuration.

##### `loadCookies(cookiesPath: string): Promise<void>`

Loads cookies from a JSON file for authentication or session management.

- `cookiesPath`: Path to a JSON file containing cookies.

### Configuration Options

The `AppConfig` interface contains the following configuration sections:

#### `browser`

Browser and page configuration options.

- `type`: Browser type ('chromium', 'firefox', or 'webkit') (default: 'chromium')
- `headless`: Whether to run the browser in headless mode (default: true)
- `timeout`: Page load timeout in milliseconds (default: 30000)
- `networkIdleTimeout`: Wait time for network idle in milliseconds (default: 1000)
- `waitForNetworkIdle`: Whether to wait for network idle (default: true)
- `viewport`: Browser viewport dimensions (default: { width: 1920, height: 1080 })
- `userAgent`: Custom user agent string (optional)
- `httpCredentials`: HTTP authentication credentials (optional)

#### `userBehavior`

User behavior simulation options.

- `scrollDepth`: Number of page heights to scroll (default: 3)
- `scrollDelay`: Delay between scrolls in milliseconds (default: 500)
- `maxScrollTime`: Maximum scroll time in milliseconds (default: 10000)

#### `co2`

Carbon calculation options.

- `model`: CO2 calculation model ('swd' or '1byte') (default: 'swd')
- `includeGreenHostingCheck`: Whether to check for green hosting (default: true)

#### `output`

Output format options.

- `format`: Output format ('json' or 'csv') (default: 'json')
- `includeResourceDetails`: Whether to include detailed resource breakdown (default: false)
- `includeComparisons`: Whether to include comparisons (default: true)

#### `lighthouse`

Lighthouse integration options.

- `enabled`: Whether to enable Lighthouse audits (default: false)
- `port`: Port for Chrome remote debugging (default: 9222)
- `thresholds`: Performance thresholds for audits
  - `performance`: Minimum performance score (default: 70)
  - `accessibility`: Minimum accessibility score (default: 80)
  - `best-practices`: Minimum best practices score (default: 80)
  - `seo`: Minimum SEO score (default: 80)
- `reports`: Report generation options
  - `formats`: Report formats to generate (default: { json: true, html: true })

### Result Structure

The `FormattedResults` interface contains the following sections:

#### `metadata`

Metadata about the analysis.

- `url`: The URL that was analyzed
- `timestamp`: When the analysis was performed
- `duration`: Duration of the analysis in milliseconds
- `configuration`: Configuration used for the analysis

#### `summary`

Summary of the carbon footprint.

- `totalEmissions`: Total CO2 emissions in grams
- `totalSize`: Total page size in bytes
- `totalRequests`: Total number of requests
- `percentageGreen`: Percentage of resources from green hosting
- `comparisons`: Comparisons with real-world equivalents

#### `breakdowns`

Detailed breakdowns of carbon emissions.

- `byType`: Emissions broken down by resource type
- `byDomain`: Emissions broken down by domain
- `byHosting`: Emissions broken down by hosting type (green/non-green)

#### `resources`

(Optional) Detailed information about each resource.

#### `lighthouse`

(Optional) Lighthouse audit results.

- `scores`: Audit scores for performance, accessibility, best practices, and SEO
- `metrics`: Key performance metrics such as First Contentful Paint and Largest Contentful Paint
- `stackPacks`: Lighthouse stack packs for detailed analysis

## Command Line Options

| Option | Description |
|--------|-------------|
| `-u, --url <url>` | URL to analyze (required) |
| `-o, --output <file>` | Output file path |
| `-f, --format <format>` | Output format (json) |
| `-t, --timeout <ms>` | Page load timeout in milliseconds |
| `-w, --wait <ms>` | Wait time for network idle in milliseconds |
| `-m, --model <model>` | CO2 calculation model (swd, 1byte) |
| `-s, --scroll-depth <number>` | Number of page heights to scroll |
| `-v, --verbose` | Enable verbose logging |
| `-q, --quiet` | Suppress console output |
| `--detailed` | Include detailed resource breakdown in output |
| `--no-headless` | Run browser in non-headless mode |
| `--viewport <size>` | Viewport size (e.g., 1920x1080) |
| `--user-agent <string>` | Custom user agent string |
| `--cookies <file>` | Path to JSON file containing cookies |
| `--htaccess-username <username>` | Username for htaccess authentication |
| `--htaccess-password <password>` | Password for htaccess authentication |
| `--no-green-hosting-check` | Disable green hosting checks |
| `--lighthouse` | Enable Lighthouse audits |
| `--lighthouse-port <port>` | Port for Chrome remote debugging (default: 9222) |
| `--lighthouse-performance-threshold <score>` | Minimum performance score threshold (default: 70) |
| `--lighthouse-accessibility-threshold <score>` | Minimum accessibility score threshold (default: 80) |
| `--lighthouse-best-practices-threshold <score>` | Minimum best practices score threshold (default: 80) |
| `--lighthouse-seo-threshold <score>` | Minimum SEO score threshold (default: 80) |

## Example Usage Patterns

### Multiple Website Analysis

You can find example usage patterns in the `examples` directory, including:
- Basic usage example
- Multiple website analysis in parallel
- Web app example

## License

MIT
