# @jackdbd/eleventy-plugin-text-to-speech

[![npm version](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech.svg)](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech)
![Snyk Vulnerabilities for npm package](https://img.shields.io/snyk/vulnerabilities/npm/@jackdbd%2Feleventy-plugin-text-to-speech)

Eleventy plugin that synthesizes **any text** you want, on **any page** of your Eleventy site, using the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech). You can either self-host the audio assets this plugin generates, or host them on [Cloud Storage](https://cloud.google.com/storage).

> :warning: The Cloud Text-to-Speech API has a [limit of 5000 characters](https://cloud.google.com/text-to-speech/quotas).
>
> See also:
>
> - [this issue of the Wavenet for Chrome extension](https://github.com/wavenet-for-chrome/extension/issues/12)
>
> - [this discussion on Google Groups](https://groups.google.com/g/google-translate-api/c/2JsRdq0tEdA)
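
If a text match is longer than this limit, one possible workaround is to split it into smaller chunks before synthesizing it. The following is only a minimal sketch: `splitIntoChunks` is a hypothetical helper and not part of this plugin's API. It also counts JavaScript string length, which is only an approximation of how the API measures a request.

```js
// Hypothetical helper, NOT part of this plugin's API.
// Assumption: the limit is 5000 characters per request.
const MAX_CHARS = 5000

function splitIntoChunks(text, maxChars = MAX_CHARS) {
  // Split after sentence-ending punctuation followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/)
  const chunks = []
  let current = ''

  for (const sentence of sentences) {
    // A single sentence longer than the limit is cut at fixed offsets.
    if (sentence.length > maxChars) {
      if (current) {
        chunks.push(current)
        current = ''
      }
      for (let i = 0; i < sentence.length; i += maxChars) {
        chunks.push(sentence.slice(i, i + maxChars))
      }
      continue
    }

    // Otherwise, pack as many whole sentences as possible into each chunk.
    const candidate = current ? `${current} ${sentence}` : sentence
    if (candidate.length > maxChars) {
      chunks.push(current)
      current = sentence
    } else {
      current = candidate
    }
  }

  if (current) chunks.push(current)
  return chunks
}
```

Each chunk could then be synthesized separately. Splitting on sentence boundaries keeps the pauses between chunks less noticeable than cutting at arbitrary offsets.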

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
<details><summary>Table of Contents</summary>

- [Installation](#installation)
- [Preliminary Operations](#preliminary-operations)
  - [Enable the Text-to-Speech API](#enable-the-text-to-speech-api)
  - [Set up authentication via a service account](#set-up-authentication-via-a-service-account)
  - [Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)](#optional-create-cloud-storage-bucket-only-if-you-want-to-host-audio-files-on-cloud-storage)
- [Usage](#usage)
  - [Self-hosting the generated audio assets](#self-hosting-the-generated-audio-assets)
  - [Hosting the generated audio assets on Cloud Storage](#hosting-the-generated-audio-assets-on-cloud-storage)
  - [Multiple hosts](#multiple-hosts)
- [Configuration](#configuration)
  - [Required parameters](#required-parameters)
  - [Options](#options)
- [Debug](#debug)
- [Credits](#credits)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
</details>

## Installation

```sh
npm install --save-dev @jackdbd/eleventy-plugin-text-to-speech
```

## Preliminary Operations

### Enable the Text-to-Speech API

Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

```sh
gcloud services enable texttospeech.googleapis.com
```

### Set up authentication via a service account

This plugin uses the [official Node.js client library for the Text-to-Speech API](https://github.com/googleapis/nodejs-text-to-speech). Authenticating to any Google Cloud API requires credentials. At the moment this plugin only supports authentication via a service account JSON key.

First, create a service account that can use the Text-to-Speech API, or reuse an existing one. If you plan to host the audio files on Cloud Storage, this service account also needs the IAM permissions to create and delete objects in the bucket. You can grant it the [Storage Object Admin predefined IAM role](https://cloud.google.com/storage/docs/access-control/iam-roles).

```sh
gcloud iam service-accounts create sa-text-to-speech-user \
  --display-name "Text-to-Speech user SA"
```

Second, [download the JSON key of this service account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and store it somewhere safe. Do **not** track this file in git.

### Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

Create a Cloud Storage bucket in your desired [location](https://cloud.google.com/storage/docs/locations). Enable [uniform bucket-level access](https://cloud.google.com/storage/docs/uniform-bucket-level-access) and use the `nearline` [storage class](https://cloud.google.com/storage/docs/storage-classes).

```sh
gsutil mb \
  -p $GCP_PROJECT_ID \
  -l $CLOUD_STORAGE_LOCATION \
  -c nearline \
  -b on \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

If you want, you can check that uniform bucket-level access is **enabled** using this command:

```sh
gsutil uniformbucketlevelaccess get \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

Make the bucket's objects publicly readable, otherwise visitors will not be able to listen to or download the audio files:

```sh
gsutil iam ch allUsers:objectViewer \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

## Usage

Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the [environment variable](https://developers.cloudflare.com/pages/platform/build-configuration/#environment-variables) `CF_PAGES_URL`.

### Self-hosting the generated audio assets

If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: process.env.CF_PAGES_URL
      ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
      : new URL('http://localhost:8090/assets/audio')
  })

  // some more eleventy configuration...
}
```

### Hosting the generated audio assets on Cloud Storage

If you want to host the audio assets on a Cloud Storage bucket and configure the rules that select which texts to synthesize, you could register the plugin using something like this:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      // synthesize the text contained in all <h1> tags, in all posts
      {
        regex: new RegExp('posts\\/.*\\.html$'),
        cssSelectors: ['h1']
      },
      // synthesize the text contained in all <p> tags that start with "Once upon a time", in all HTML pages, except the 404.html page
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    voice: 'en-GB-Wavenet-C'
  })

  // some more eleventy configuration...
}
```
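
To get a feel for which pages each `regex` in the example above selects, you can test the two patterns against a few paths. This is just a sketch: the paths and the `name` field are made up for illustration, and it assumes that each rule's `regex` is tested against the output path of a page.

```js
// The two patterns from the example above. The `name` field is hypothetical
// and only used here to label the results.
const rules = [
  { name: 'h1 in posts', regex: new RegExp('posts\\/.*\\.html$') },
  { name: 'paragraphs, except 404', regex: new RegExp('^((?!404).)*\\.html$') }
]

// Return the names of all rules whose regex matches the given path.
function matchingRuleNames(outputPath) {
  return rules
    .filter((rule) => rule.regex.test(outputPath))
    .map((rule) => rule.name)
}

console.log(matchingRuleNames('_site/posts/hello/index.html'))
// [ 'h1 in posts', 'paragraphs, except 404' ]
console.log(matchingRuleNames('_site/about/index.html'))
// [ 'paragraphs, except 404' ]
console.log(matchingRuleNames('_site/404.html'))
// []
```

Note that the negative lookahead `(?!404)` rejects any path containing `404` anywhere, not only `404.html`. A path such as `_site/posts/http-404-explained/index.html` would match the first pattern but not the second.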

### Multiple hosts

If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

- self-host some audio assets, and host some others on a Cloud Storage bucket
- host all audio assets on Cloud Storage, but host some on one bucket, and some others on a different bucket.

Have a look at the Eleventy configuration of the [demo-site in this monorepo](../demo-site/README.md).
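
As a rough sketch of the first scenario, the two registrations from the previous sections could be combined as follows. Giving each registration its own `collectionName` and `transformName` is an assumption made here to keep the two from clashing; the demo-site configuration remains the reference.

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // First registration: self-host the audio generated from <h1> tags in posts.
  eleventyConfig.addPlugin(tts, {
    audioHost: process.env.CF_PAGES_URL
      ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
      : new URL('http://localhost:8090/assets/audio'),
    rules: [
      { regex: new RegExp('posts\\/.*\\.html$'), cssSelectors: ['h1'] }
    ],
    // Assumption: distinct names, so the registrations do not overwrite each other.
    collectionName: 'audio-items-self-hosted',
    transformName: 'inject-audio-tags-self-hosted'
  })

  // Second registration: host the audio generated from matching <p> tags
  // on a Cloud Storage bucket.
  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    // Assumption: distinct names, as above.
    collectionName: 'audio-items-cloud-storage',
    transformName: 'inject-audio-tags-cloud-storage'
  })
}
```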

## Configuration

### Required parameters

| Parameter | Explanation |
| --- | --- |
| `audioHost` | Where the generated audio assets are hosted: a `URL` if you self-host them, or an object with a `bucketName` if you host them on Cloud Storage. Each audio host should have a matching writer responsible for writing/uploading the assets to the host. |

### Options

| Option | Default | Explanation |
| --- | --- | --- |
| `audioEncodings` | `['OGG_OPUS', 'MP3']` | List of [audio encodings](https://cloud.google.com/speech-to-text/docs/encoding#audio-encodings) to use when generating audio assets from text matches. |
| `audioInnerHTML` | see in [src/dom.ts](./src/dom.ts) | Function that generates the innerHTML of the `<audio>` tag injected in the page for each text match. |
| `cacheExpiration` | `365d` | Expiration for the 11ty AssetCache. See [here](https://www.11ty.dev/docs/plugins/fetch/#change-the-cache-duration). |
| `collectionName` | `audio-items` | Name of the 11ty collection created by this plugin. |
| `keyFilename` | `process.env.GOOGLE_APPLICATION_CREDENTIALS` | Path to the service account JSON key used as credentials for the Cloud Text-to-Speech API (and for the Cloud Storage API if you don't set it in `audioHost`). |
| `rules` | see in [src/constants.ts](./src/constants.ts) | Rules that determine which texts to convert into speech. |
| `transformName` | `inject-audio-tags-into-html` | Name of the 11ty transform created by this plugin. |
| `voice` | `en-US-Standard-J` | Voice to use when generating audio assets from text matches. The Text-to-Speech API supports [these voices](https://cloud.google.com/text-to-speech/docs/voices), and might have different [pricing](https://cloud.google.com/text-to-speech/pricing) for different voices. |

> :warning: Don't forget to set either `keyFilename` or the `GOOGLE_APPLICATION_CREDENTIALS` environment variable on your build server.
>
> *Tip*: check what I did in the Eleventy configuration file for the [demo-site](../demo-site/README.md) of this monorepo.
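
If you want the build to fail early when the credentials are missing, you could add a small sanity check to your Eleventy configuration. The sketch below is not part of this plugin: `checkServiceAccountKey` is a hypothetical helper that only verifies that the file exists and looks like a service account JSON key.

```js
const fs = require('node:fs')

// Hypothetical helper, NOT part of this plugin's API.
// Checks that the given path points to a service account JSON key.
function checkServiceAccountKey(
  keyFilename = process.env.GOOGLE_APPLICATION_CREDENTIALS
) {
  if (!keyFilename) {
    return {
      ok: false,
      reason: 'neither keyFilename nor GOOGLE_APPLICATION_CREDENTIALS is set'
    }
  }
  if (!fs.existsSync(keyFilename)) {
    return { ok: false, reason: `file not found: ${keyFilename}` }
  }

  let key
  try {
    key = JSON.parse(fs.readFileSync(keyFilename, 'utf8'))
  } catch (err) {
    return { ok: false, reason: `not valid JSON: ${err.message}` }
  }

  // Service account keys declare their type in the "type" field.
  if (!key || key.type !== 'service_account') {
    return { ok: false, reason: 'the JSON file is not a service account key' }
  }
  return { ok: true, reason: `found a key for ${key.client_email}` }
}
```

You could call it at the top of your Eleventy configuration and throw an error when `ok` is `false`, so that a misconfigured build server is noticed before any API call is made.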

## Debug

This plugin uses the [debug](https://github.com/debug-js/debug) library for logging. You can control what's logged using the `DEBUG` environment variable. For example, if you set your environment variables in a `.envrc` file, you could do:

```sh
# print all logging statements
export DEBUG=eleventy-plugin-text-to-speech/*

# print just the logging statements from the dom module and the writers module
export DEBUG=eleventy-plugin-text-to-speech/dom,eleventy-plugin-text-to-speech/writers

# print all logging statements, except the ones from the dom module and the transforms module
export DEBUG=eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms
```

## Credits

I got the idea for this plugin while reading the code of the plugin of the same name, [eleventy-plugin-text-to-speech](https://github.com/larryhudson/eleventy-plugin-text-to-speech) by [Larry Hudson](https://larryhudson.io/). There are a few differences between the two plugins. The main one is that this plugin uses the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech), while Larry's plugin uses the [Microsoft Azure Speech SDK](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk).