# @jackdbd/eleventy-plugin-text-to-speech

[![npm version](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech.svg)](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech)
![Snyk Vulnerabilities for npm package](https://img.shields.io/snyk/vulnerabilities/npm/@jackdbd%2Feleventy-plugin-text-to-speech)

Eleventy plugin that synthesizes **any text** you want, on **any page** of your Eleventy site, using the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech). You can either self-host the audio assets this plugin generates, or host them on [Cloud Storage](https://cloud.google.com/storage).

> :warning: The Cloud Text-to-Speech API has a [limit of 5000 characters](https://cloud.google.com/text-to-speech/quotas).
>
> See also:
>
> - [this issue of the Wavenet for Chrome extension](https://github.com/wavenet-for-chrome/extension/issues/12)
>
> - [this discussion on Google Groups](https://groups.google.com/g/google-translate-api/c/2JsRdq0tEdA)

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
<details><summary>Table of Contents</summary>

- [Installation](#installation)
- [Preliminary Operations](#preliminary-operations)
  - [Enable the Text-to-Speech API](#enable-the-text-to-speech-api)
  - [Set up authentication via a service account](#set-up-authentication-via-a-service-account)
  - [Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)](#optional-create-cloud-storage-bucket-only-if-you-want-to-host-audio-files-on-cloud-storage)
- [Usage](#usage)
  - [Self-hosting the generated audio assets](#self-hosting-the-generated-audio-assets)
  - [Hosting the generated audio assets on Cloud Storage](#hosting-the-generated-audio-assets-on-cloud-storage)
  - [Multiple hosts](#multiple-hosts)
- [Configuration](#configuration)
  - [Required parameters](#required-parameters)
  - [Options](#options)
- [Debug](#debug)
- [Credits](#credits)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
</details>

## Installation

```sh
npm install --save-dev @jackdbd/eleventy-plugin-text-to-speech
```

## Preliminary Operations

### Enable the Text-to-Speech API

Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

```sh
gcloud services enable texttospeech.googleapis.com
```

### Set up authentication via a service account

This plugin uses the [official Node.js client library for the Text-to-Speech API](https://github.com/googleapis/nodejs-text-to-speech). To authenticate to any Google Cloud API you need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.

First, create a service account that can use the Text-to-Speech API, or reuse an existing one. If you plan to host the audio files on Cloud Storage, this service account also needs the IAM permissions to create and delete objects in the bucket. You can grant it the [Storage Object Admin predefined IAM role](https://cloud.google.com/storage/docs/access-control/iam-roles).

```sh
gcloud iam service-accounts create sa-text-to-speech-user \
  --display-name "Text-to-Speech user SA"
```

Second, [download the JSON key of this service account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and store it somewhere safe. Do **not** track this file in git.

### Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

Create a Cloud Storage bucket in your desired [location](https://cloud.google.com/storage/docs/locations). Enable [uniform bucket-level access](https://cloud.google.com/storage/docs/uniform-bucket-level-access) and use the `nearline` [storage class](https://cloud.google.com/storage/docs/storage-classes).

```sh
gsutil mb \
  -p $GCP_PROJECT_ID \
  -l $CLOUD_STORAGE_LOCATION \
  -c nearline \
  -b on \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

If you want, you can check that uniform bucket-level access is **enabled** using this command:

```sh
gsutil uniformbucketlevelaccess get \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

Make the bucket's objects publicly readable, otherwise visitors will not be able to listen to or download the audio files:

```sh
gsutil iam ch allUsers:objectViewer \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

## Usage

Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the [environment variable](https://developers.cloudflare.com/pages/platform/build-configuration/#environment-variables) `CF_PAGES_URL`.

### Self-hosting the generated audio assets

If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: process.env.CF_PAGES_URL
      ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
      : new URL('http://localhost:8090/assets/audio')
  })

  // some more eleventy configuration...
}
```

### Hosting the generated audio assets on Cloud Storage

If you want to host the audio assets on a Cloud Storage bucket and configure the rules that determine which texts are converted into speech, you could register the plugin using something like this:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      // synthesize the text contained in all <h1> tags, in all posts
      {
        regex: new RegExp('posts\\/.*\\.html$'),
        cssSelectors: ['h1']
      },
      // synthesize the text contained in all <p> tags that start with "Once upon a time", in all HTML pages, except the 404.html page
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    voice: 'en-GB-Wavenet-C'
  })

  // some more eleventy configuration...
}
```

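To see which pages a rule would apply to, you can test its `regex` against a few paths before running a build. This is a minimal sketch that assumes the plugin matches each rule's `regex` against the output path of a page. The paths below are made up for illustration.

```js
// The two regular expressions used in the rules above.
const postsRegex = new RegExp('posts\\/.*\\.html$')
const allButNotFoundRegex = new RegExp('^((?!404).)*\\.html$')

// Hypothetical output paths, used only for illustration.
const outputPaths = [
  '_site/posts/hello-world/index.html',
  '_site/about/index.html',
  '_site/404.html'
]

// Print, for each path, which of the two rules would match it.
for (const outputPath of outputPaths) {
  console.log(outputPath, {
    posts: postsRegex.test(outputPath),
    allButNotFound: allButNotFoundRegex.test(outputPath)
  })
}
```

Note that the second regular expression rejects any path that contains `404` anywhere, not only `404.html`, so a post whose path includes `404` would be skipped as well.
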
### Multiple hosts

If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

- self-host some audio assets, and host others on a Cloud Storage bucket
- host all audio assets on Cloud Storage, but split them across different buckets

Have a look at the Eleventy configuration of the [demo-site in this monorepo](../demo-site/README.md).

## Configuration

### Required parameters

| Parameter | Explanation |
| --- | --- |
| `audioHost` | Where to host the generated audio assets: a `URL` when self-hosting, or an object with a `bucketName` when hosting on Cloud Storage. Each audio host should have a matching writer responsible for writing/uploading the assets to the host. |

### Options

| Option | Default | Explanation |
| --- | --- | --- |
| `audioEncodings` | `['OGG_OPUS', 'MP3']` | List of [audio encodings](https://cloud.google.com/speech-to-text/docs/encoding#audio-encodings) to use when generating audio assets from text matches. |
| `audioInnerHTML` | see in [src/dom.ts](./src/dom.ts) | Function to use to generate the innerHTML of the `<audio>` tag to inject in the page for each text match. |
| `cacheExpiration` | `365d` | Expiration for the 11ty AssetCache. See the [Eleventy Fetch documentation on cache duration](https://www.11ty.dev/docs/plugins/fetch/#change-the-cache-duration). |
| `collectionName` | `audio-items` | Name of the 11ty collection created by this plugin. |
| `keyFilename` | `process.env.GOOGLE_APPLICATION_CREDENTIALS` | Path to the service account JSON key used as credentials for the Cloud Text-to-Speech API (and for the Cloud Storage API if you don't set it in `audioHost`). |
| `rules` | see in [src/constants.ts](./src/constants.ts) | Rules that determine which texts to convert into speech. |
| `transformName` | `inject-audio-tags-into-html` | Name of the 11ty transform created by this plugin. |
| `voice` | `en-US-Standard-J` | Voice to use when generating audio assets from text matches. The Text-to-Speech API supports [these voices](https://cloud.google.com/text-to-speech/docs/voices), and might have different [pricing](https://cloud.google.com/text-to-speech/pricing) for different voices. |

> :warning: Don't forget to set either `keyFilename` or the `GOOGLE_APPLICATION_CREDENTIALS` environment variable on your build server.
>
> *Tip*: check what I did in the Eleventy configuration file for the [demo-site](../demo-site/README.md) of this monorepo.

## Debug

This plugin uses the [debug](https://github.com/debug-js/debug) library for logging. You can control what's logged using the `DEBUG` environment variable. For example, if you set your environment variables in a `.envrc` file, you could do:

```sh
# print all logging statements
export DEBUG=eleventy-plugin-text-to-speech/*

# print just the logging statements from the dom module and the writers module
export DEBUG=eleventy-plugin-text-to-speech/dom,eleventy-plugin-text-to-speech/writers

# print all logging statements, except the ones from the dom module and the transforms module
export DEBUG=eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms
```

## Credits

I had the idea for this plugin while reading the code of the plugin of the same name, [eleventy-plugin-text-to-speech](https://github.com/larryhudson/eleventy-plugin-text-to-speech) by [Larry Hudson](https://larryhudson.io/). There are a few differences between the two plugins. The main one is that this plugin uses the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech), while Larry's plugin uses the [Microsoft Azure Speech SDK](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk).
