# Firestore to BigQuery export
NPM package for copying and converting [Cloud Firestore](https://firebase.google.com/docs/firestore/) data to [BigQuery](https://cloud.google.com/bigquery/docs/).

<p align="center">
  <a href="LICENSE">
    <img src="https://img.shields.io/badge/license-MIT-brightgreen.svg?" alt="Software License" />
  </a>
  <a href="https://npmjs.org/package/firestore-to-bigquery-export">
    <img src="https://img.shields.io/npm/v/firestore-to-bigquery-export.svg?" alt="npm version" />
  </a>
  <a href="https://npmjs.org/package/firestore-to-bigquery-export">
    <img src="https://img.shields.io/npm/dm/firestore-to-bigquery-export.svg?" alt="npm downloads" />
  </a>
  <a href="https://github.com/Johannes-Berggren/firestore-to-bigquery-export/issues">
    <img src="https://img.shields.io/github/issues/Johannes-Berggren/firestore-to-bigquery-export.svg?" alt="Issues" />
  </a>
</p>

Firestore is awesome. BigQuery is awesome. But transferring data from Firestore to BigQuery sucks.
This package lets you plug and play your way out of config hell.

- Create a BigQuery dataset with tables corresponding to your Firestore collections.
- Table schemas are automatically generated based on your document property data types.
- Convert and copy your Firestore collections to BigQuery.

This package doesn't write anything to Firestore.

## Contents
  * [Installation](#installation)
  * [How to](#how-to)
    + [API](#api)
    + [Examples](#examples)
  * [Keep in mind](#keep-in-mind)
  * [Limitations](#limitations)
  * [Issues](#issues)
  * [To-do](#to-do)

## Installation
> npm i firestore-to-bigquery-export

```javascript
import bigExport from 'firestore-to-bigquery-export'

// or

const bigExport = require('firestore-to-bigquery-export')

// then

const GCPSA = require('./Your-Service-Account-File.json')
bigExport.setBigQueryConfig(GCPSA)
bigExport.setFirebaseConfig(GCPSA)
```

## How to

### API
```javascript
bigExport.setBigQueryConfig(
  serviceAccountFile // JSON
)
```

```javascript
bigExport.setFirebaseConfig(
  serviceAccountFile // JSON
)
```


```javascript
bigExport.createBigQueryTable(
  datasetID, // String
  collectionName, // String
  verbose // boolean
)
// returns Promise<Array>
```

```javascript
bigExport.copyToBigQuery(
  datasetID, // String
  collectionName, // String
  snapshot // firebase.firestore.QuerySnapshot
)
// returns Promise<number>
```

```javascript
bigExport.deleteBigQueryTable(
  datasetID, // String
  tableName // String
)
// returns Promise<Array>
```


### Examples
```javascript
/* Create table 'accounts' in BigQuery dataset 'firestore'. You have to create the dataset beforehand.
 * The given table name has to match the Firestore collection name.
 * The table schema will be autogenerated based on the data types found in the collection's documents.
 */
await bigExport.createBigQueryTable('firestore', 'accounts')
```

Then, you can transport your data:
```javascript
/* Copy and convert all documents in the given Firestore collection snapshot.
 * Each document is inserted as a row in the table with the same name as the collection, in the dataset named 'firestore'.
 * Cells (document properties) that don't match the table schema will be rejected.
 */
// `firestore` is your initialised Firestore instance, e.g. admin.firestore() from firebase-admin.
const snapshot = await firestore.collection('payments').get()
const result = await bigExport.copyToBigQuery('firestore', 'payments', snapshot)
console.log('Copied ' + result + ' documents to BigQuery.')

/*
 * You can copy multiple collections in the same run, like this.
 * The loop handles one collection at a time. If you parallelize it and get error messages,
 * you should probably copy fewer collections at a time.
 */
const collectionNames = ['payments', 'profiles', 'ratings', 'users']

for (const name of collectionNames) {
  const snapshot = await firestore.collection(name).get()
  await bigExport.copyToBigQuery('firestore', name, snapshot)
}
```
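
If copying many collections at once produces errors, one option is to cap how many copies run in parallel. The following is a minimal sketch, not part of this package: `copyInChunks` is a hypothetical helper, and `copyOne` is whatever function you supply to copy a single collection.

```javascript
// Hypothetical helper (not part of firestore-to-bigquery-export).
// Runs `copyOne` for every collection name, with at most `chunkSize`
// copies in flight at the same time. Results keep the input order.
async function copyInChunks (collectionNames, chunkSize, copyOne) {
  const results = []
  for (let i = 0; i < collectionNames.length; i += chunkSize) {
    const chunk = collectionNames.slice(i, i + chunkSize)
    // Collections inside one chunk are copied in parallel;
    // the next chunk starts only when this one has finished.
    const chunkResults = await Promise.all(chunk.map(name => copyOne(name)))
    results.push(...chunkResults)
  }
  return results
}

// Usage, assuming `firestore` is your Firestore instance and `bigExport` is configured:
// const counts = await copyInChunks(['payments', 'profiles', 'ratings', 'users'], 2, async name => {
//   const snapshot = await firestore.collection(name).get()
//   return bigExport.copyToBigQuery('firestore', name, snapshot)
// })
```

Lowering `chunkSize` trades speed for fewer simultaneous requests; a value of 1 behaves like the sequential loop.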

After that, you may want to refresh your data. For the time being, the quick and dirty way is to delete your tables and make new ones:
```javascript
// Deleting the given BigQuery table.
await bigExport.deleteBigQueryTable('firestore', 'accounts')
```

## Keep in mind
* If even one value of a property is a FLOAT in your collection during schema generation, the column will be set to FLOAT.
* If all values of a property are integers, the column will be set to INTEGER.
* All columns will be NULLABLE.

## Limitations
* Your Firestore data model should be consistent. If a property has different data types across documents in the same collection, you'll get errors.
* Patching existing BigQuery tables isn't supported (yet). To refresh your data, you can call `deleteBigQueryTable()`, then `createBigQueryTable()` and then `copyToBigQuery()`.
* Changed your Firestore data model? Delete the corresponding BigQuery table and run `createBigQueryTable()` to create a table with a new schema.
* When running this package in a Cloud Function, the function may time out (Deadline Exceeded) if your Firestore database is large. You can then:
    * Increase the timeout for your Cloud Function in the [Google Cloud Platform Cloud Function Console](https://console.cloud.google.com/functions).
    * Run your function locally, using `firebase serve --only functions`.
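
The delete, create and copy cycle for refreshing a table can be sketched as one function, using the function names from the API section. This is a hedged example: `refreshTable` is a hypothetical helper, and `bigExport` and `firestore` are passed in so the sketch makes no assumption about how you initialised them.

```javascript
// Hypothetical helper (not part of the package): rebuilds one BigQuery table
// from the Firestore collection with the same name.
async function refreshTable (bigExport, firestore, datasetID, collectionName) {
  // 1. Drop the old table, including its outdated schema.
  await bigExport.deleteBigQueryTable(datasetID, collectionName)
  // 2. Recreate it; the schema is generated from the current documents.
  await bigExport.createBigQueryTable(datasetID, collectionName)
  // 3. Copy the current documents into the fresh table.
  const snapshot = await firestore.collection(collectionName).get()
  return bigExport.copyToBigQuery(datasetID, collectionName, snapshot)
}

// Usage:
// const copied = await refreshTable(bigExport, firestore, 'firestore', 'accounts')
```

Note that the table is unavailable between the delete and the end of the copy, so queries running in that window may fail or see partial data.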

## Issues
Please use the [issue tracker](https://github.com/Johannes-Berggren/firestore-to-bigquery-export/issues).

## To-do
* Improve the handling of arrays.
* Implement patching of tables.
