# @jackdbd/eleventy-plugin-text-to-speech

[![npm version](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech.svg)](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech)
![Snyk Vulnerabilities for npm package](https://img.shields.io/snyk/vulnerabilities/npm/@jackdbd%2Feleventy-plugin-text-to-speech)

Eleventy plugin that synthesizes **any text** you want, on **any page** of your Eleventy site, using the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech). You can either self-host the audio assets this plugin generates, or host them on [Cloud Storage](https://cloud.google.com/storage).

> :warning: The Cloud Text-to-Speech API has a [limit of 5000 characters](https://cloud.google.com/text-to-speech/quotas) per request.
>
> See also:
>
> - [this issue of the Wavenet for Chrome extension](https://github.com/wavenet-for-chrome/extension/issues/12)
> - [this discussion on Google Groups](https://groups.google.com/g/google-translate-api/c/2JsRdq0tEdA)

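Because of this limit, long passages may need to be split before they are synthesized. The following is a minimal sketch of one way to do that. It does not describe what this plugin does internally: `splitIntoChunks` and its sentence-boundary heuristic are hypothetical, and the sketch counts JavaScript string length, which may differ from how the API counts characters.

```js
// Hypothetical helper (not part of this plugin): split a long text into
// chunks of at most `maxLength` characters, preferring sentence boundaries.
function splitIntoChunks(text, maxLength = 5000) {
  // Split after sentence-ending punctuation that is followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/)
  const chunks = []
  let current = ''

  for (const sentence of sentences) {
    let rest = sentence
    // A single sentence longer than the limit is cut at fixed offsets.
    while (rest.length > maxLength) {
      if (current) {
        chunks.push(current)
        current = ''
      }
      chunks.push(rest.slice(0, maxLength))
      rest = rest.slice(maxLength)
    }
    // Append the sentence to the current chunk if it still fits.
    const candidate = current ? `${current} ${rest}` : rest
    if (candidate.length > maxLength) {
      chunks.push(current)
      current = rest
    } else {
      current = candidate
    }
  }

  if (current) chunks.push(current)
  return chunks
}

const sample = 'First sentence. Second sentence! Third sentence?'
console.log(splitIntoChunks(sample, 20))
```

Each chunk could then be sent to the API as a separate request. How the resulting audio files are joined or presented is left open here.
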
<details><summary>Table of Contents</summary>

- [Installation](#installation)
- [Preliminary Operations](#preliminary-operations)
  - [Enable the Text-to-Speech API](#enable-the-text-to-speech-api)
  - [Set up authentication via a service account](#set-up-authentication-via-a-service-account)
  - [Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)](#optional-create-cloud-storage-bucket-only-if-you-want-to-host-audio-files-on-cloud-storage)
- [Usage](#usage)
  - [Self-hosting the generated audio assets](#self-hosting-the-generated-audio-assets)
  - [Hosting the generated audio assets on Cloud Storage](#hosting-the-generated-audio-assets-on-cloud-storage)
  - [Multiple hosts](#multiple-hosts)
- [Configuration](#configuration)
  - [Required parameters](#required-parameters)
  - [Options](#options)
- [Debug](#debug)
- [Credits](#credits)

</details>

## Installation

```sh
npm install --save-dev @jackdbd/eleventy-plugin-text-to-speech
```

## Preliminary Operations

### Enable the Text-to-Speech API

Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

```sh
gcloud services enable texttospeech.googleapis.com
```

### Set up authentication via a service account

This plugin uses the [official Node.js client library for the Text-to-Speech API](https://github.com/googleapis/nodejs-text-to-speech). To authenticate to any Google Cloud API you need some kind of credentials. At the moment this plugin supports only authentication via a service account JSON key.

First, create a service account that can use the Text-to-Speech API, or reuse an existing one. If you plan to host the audio files on Cloud Storage, this service account also needs the IAM permissions to create and delete objects in the bucket. You can grant it the [Storage Object Admin predefined IAM role](https://cloud.google.com/storage/docs/access-control/iam-roles).

```sh
gcloud iam service-accounts create sa-text-to-speech-user \
  --display-name "Text-to-Speech user SA"
```

Second, [download the JSON key of this service account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and store it somewhere safe. Do **not** track this file in git.

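Before running a build, it can help to confirm that the key file is where you expect it and looks like a service account key. The sketch below is a hypothetical pre-flight check, not something this plugin provides. `checkServiceAccountKey` is an illustrative name, and the check relies only on the `type`, `client_email` and `private_key` fields found in downloaded service account keys.

```js
const fs = require('fs')

// Hypothetical pre-flight check (not part of this plugin): verify that a file
// looks like a service account JSON key before starting the Eleventy build.
function checkServiceAccountKey(keyFilename) {
  if (!keyFilename) {
    return { ok: false, reason: 'no key file path provided' }
  }
  if (!fs.existsSync(keyFilename)) {
    return { ok: false, reason: `file not found: ${keyFilename}` }
  }

  let key
  try {
    key = JSON.parse(fs.readFileSync(keyFilename, 'utf8'))
  } catch (err) {
    return { ok: false, reason: `not valid JSON: ${err.message}` }
  }

  // Downloaded service account keys contain these three fields.
  const looksValid =
    key && key.type === 'service_account' && key.client_email && key.private_key
  if (!looksValid) {
    return { ok: false, reason: 'file does not look like a service account key' }
  }
  return { ok: true, clientEmail: key.client_email }
}

const result = checkServiceAccountKey(process.env.GOOGLE_APPLICATION_CREDENTIALS)
console.log(result.ok ? `using ${result.clientEmail}` : `warning: ${result.reason}`)
```

The check only inspects the shape of the file. It does not verify that the key is still valid or that the service account has the required permissions.
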
### Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

Create a Cloud Storage bucket in your desired [location](https://cloud.google.com/storage/docs/locations). Enable [uniform bucket-level access](https://cloud.google.com/storage/docs/uniform-bucket-level-access) and use the `nearline` [storage class](https://cloud.google.com/storage/docs/storage-classes).

```sh
gsutil mb \
  -p $GCP_PROJECT_ID \
  -c nearline \
  -b on \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

If you want, you can check that uniform bucket-level access is **enabled** using this command:

```sh
gsutil uniformbucketlevelaccess get \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):

```sh
gsutil iam ch allUsers:objectViewer \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

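Once the objects are publicly readable, each one is typically reachable at a URL of the form `https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME`. As a small illustration, the sketch below builds such a URL. `publicObjectUrl` and the object name `some-file.mp3` are hypothetical, and this is not necessarily how the plugin itself constructs the URLs it injects.

```js
// Hypothetical helper (not part of this plugin): build the public URL of an
// object stored in a publicly readable Cloud Storage bucket.
function publicObjectUrl(bucketName, objectName) {
  // Percent-encode each path segment, but keep the slashes between them.
  const encodedPath = objectName.split('/').map(encodeURIComponent).join('/')
  return `https://storage.googleapis.com/${bucketName}/${encodedPath}`
}

console.log(
  publicObjectUrl('bkt-eleventy-plugin-text-to-speech-audio-files', 'some-file.mp3')
)
```

Opening the printed URL in a browser is a quick way to confirm that public read access works as intended.
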
## Usage

Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the [environment variable](https://developers.cloudflare.com/pages/platform/build-configuration/#environment-variables) `CF_PAGES_URL`.

### Self-hosting the generated audio assets

If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: process.env.CF_PAGES_URL
      ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
      : new URL('http://localhost:8090/assets/audio')
  })

  // some more eleventy configuration...
}
```

### Hosting the generated audio assets on Cloud Storage

If you want to host the audio assets on a Cloud Storage bucket and configure the rules that select which text to synthesize, you could register the plugin using something like this:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      // synthesize the text contained in all <h1> tags, in all posts
      {
        regex: new RegExp('posts\\/.*\\.html$'),
        cssSelectors: ['h1']
      },
      // synthesize the text contained in all <p> tags that start with "Once upon a time", in all HTML pages, except the 404.html page
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    voice: 'en-GB-Wavenet-C'
  })

  // some more eleventy configuration...
}
```

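To see which pages the two `regex` values above would select, you can try them against a few paths. The sketch below assumes the regular expressions are tested against the output path of each page. That is an assumption based on the comments in the example, and the `_site/...` paths are hypothetical.

```js
// The two regular expressions from the rules above.
const postsRule = new RegExp('posts\\/.*\\.html$')
const allHtmlExcept404 = new RegExp('^((?!404).)*\\.html$')

// Hypothetical output paths of an Eleventy site.
const outputPaths = [
  '_site/index.html',
  '_site/404.html',
  '_site/posts/hello/index.html',
  '_site/posts/http-404-explained/index.html',
  '_site/feed.xml'
]

for (const outputPath of outputPaths) {
  console.log(outputPath, {
    postsRule: postsRule.test(outputPath),
    allHtmlExcept404: allHtmlExcept404.test(outputPath)
  })
}
```

Note that the negative lookahead in the second expression rejects any path containing `404`, not only `404.html`, so a post whose path includes `404` would be skipped by that rule as well.
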
### Multiple hosts

If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

- self-host some audio assets, and host some other assets on a Cloud Storage bucket
- host all audio assets on Cloud Storage, but host some on one bucket, and some others on a different bucket

Have a look at the Eleventy configuration of the [demo-site in this monorepo](../demo-site/README.md).

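As a rough sketch of the first scenario, the plugin could be registered twice with different options. The helper name `registerTextToSpeech` is hypothetical. Giving each registration its own `collectionName` and `transformName` is an assumption made here to avoid name clashes, so check the demo-site configuration for the approach that is actually used.

```js
// Options for a self-hosted registration: <h1> tags in all posts.
const selfHostedOptions = {
  audioHost: new URL('http://localhost:8090/assets/audio'),
  rules: [{ regex: new RegExp('posts\\/.*\\.html$'), cssSelectors: ['h1'] }],
  // Assumption: distinct names keep the two registrations from clashing.
  collectionName: 'audio-items-self-hosted',
  transformName: 'inject-audio-tags-self-hosted'
}

// Options for a Cloud Storage registration: selected <p> tags in all pages.
const cloudStorageOptions = {
  audioHost: { bucketName: 'some-bucket-containing-publicly-readable-files' },
  rules: [
    {
      regex: new RegExp('^((?!404).)*\\.html$'),
      xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
    }
  ],
  collectionName: 'audio-items-cloud-storage',
  transformName: 'inject-audio-tags-cloud-storage'
}

// Hypothetical helper: register the plugin once per audio host.
// `tts` is the `plugin` export of this package.
function registerTextToSpeech(eleventyConfig, tts) {
  eleventyConfig.addPlugin(tts, selfHostedOptions)
  eleventyConfig.addPlugin(tts, cloudStorageOptions)
}
```

In an Eleventy configuration file this could be called as `registerTextToSpeech(eleventyConfig, tts)`, where `tts` comes from `const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')`.
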
## Configuration

### Required parameters

| Parameter | Explanation |
| --- | --- |
| `audioHost` | Each audio host should have a matching writer responsible for writing/uploading the assets to the host. |

### Options

| Option | Default | Explanation |
| --- | --- | --- |
| `audioEncodings` | `['OGG_OPUS', 'MP3']` | List of [audio encodings](https://cloud.google.com/speech-to-text/docs/encoding#audio-encodings) to use when generating audio assets from text matches. |
| `audioInnerHTML` | see in [src/dom.ts](./src/dom.ts) | Function to use to generate the innerHTML of the `<audio>` tag to inject in the page for each text match. |
| `cacheExpiration` | `365d` | Expiration for the 11ty AssetCache. See the [Eleventy Fetch documentation](https://www.11ty.dev/docs/plugins/fetch/#change-the-cache-duration). |
| `collectionName` | `audio-items` | Name of the 11ty collection created by this plugin. |
| `keyFilename` | `process.env.GOOGLE_APPLICATION_CREDENTIALS` | Credentials for the Cloud Text-to-Speech API (and for the Cloud Storage API if you don't set it in `audioHost`). |
| `rules` | see in [src/constants.ts](./src/constants.ts) | Rules that determine which texts to convert into speech. |
| `transformName` | `inject-audio-tags-into-html` | Name of the 11ty transform created by this plugin. |
| `voice` | `en-US-Standard-J` | Voice to use when generating audio assets from text matches. The Text-to-Speech API supports [these voices](https://cloud.google.com/text-to-speech/docs/voices), and might have different [pricing](https://cloud.google.com/text-to-speech/pricing) for different voices. |

> :warning: Don't forget to set either `keyFilename` or the `GOOGLE_APPLICATION_CREDENTIALS` environment variable on your build server.
>
> *Tip*: check what I did in the Eleventy configuration file for the [demo-site](../demo-site/README.md) of this monorepo.

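For reference, the options object below spells out the defaults from the table explicitly, together with the required `audioHost`. It is only an illustration: `ttsOptions` is a hypothetical name, the `audioHost` URL is the local one used in the self-hosting example, and `audioInnerHTML` and `rules` are omitted so that the plugin's own defaults apply.

```js
// Options object that repeats the documented defaults explicitly.
const ttsOptions = {
  // Required: where the generated audio assets are hosted.
  audioHost: new URL('http://localhost:8090/assets/audio'),
  audioEncodings: ['OGG_OPUS', 'MP3'],
  cacheExpiration: '365d',
  collectionName: 'audio-items',
  // Path to the service account JSON key.
  keyFilename: process.env.GOOGLE_APPLICATION_CREDENTIALS,
  transformName: 'inject-audio-tags-into-html',
  voice: 'en-US-Standard-J'
}

console.log(ttsOptions)
```

Such an object would be passed as the second argument of `eleventyConfig.addPlugin`. In practice you only need to list the options whose defaults you want to change.
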
## Debug

This plugin uses the [debug](https://github.com/debug-js/debug) library for logging. You can control what's logged using the `DEBUG` environment variable. For example, if you set your environment variables in a `.envrc` file, you could do:

```sh
# print all logging statements
export DEBUG=eleventy-plugin-text-to-speech/*

# print just the logging statements from the dom module and the writers module
export DEBUG=eleventy-plugin-text-to-speech/dom,eleventy-plugin-text-to-speech/writers

# print all logging statements, except the ones from the dom module and the transforms module
export DEBUG=eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms
```

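If the include/exclude syntax is unfamiliar, the sketch below approximates how a `DEBUG` value selects namespaces. It is a simplified re-implementation written for illustration, not the code of the debug library: `isEnabled` is a hypothetical function that treats `*` as a wildcard and a leading `-` as an exclusion.

```js
// Simplified approximation (not the debug library's code) of how a DEBUG
// value selects namespaces: comma-separated patterns, `*` as a wildcard,
// and a leading `-` to exclude a namespace.
function isEnabled(namespace, debugValue) {
  const patterns = debugValue.split(/[\s,]+/).filter(Boolean)

  // Escape regex metacharacters, then turn `*` into "match anything".
  const toRegExp = (pattern) =>
    new RegExp(
      '^' +
        pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*') +
        '$'
    )

  const excluded = patterns
    .filter((p) => p.startsWith('-'))
    .map((p) => toRegExp(p.slice(1)))
  const included = patterns.filter((p) => !p.startsWith('-')).map(toRegExp)

  // Exclusions win over inclusions.
  if (excluded.some((re) => re.test(namespace))) return false
  return included.some((re) => re.test(namespace))
}

const value =
  'eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms'
for (const name of ['dom', 'transforms', 'writers']) {
  const namespace = `eleventy-plugin-text-to-speech/${name}`
  console.log(namespace, isEnabled(namespace, value))
}
```

With the third `DEBUG` value from the examples above, only the `writers` namespace of the three tested here stays enabled.
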
## Credits

I had the idea for this plugin while reading the code of [eleventy-plugin-text-to-speech](https://github.com/larryhudson/eleventy-plugin-text-to-speech), a plugin of the same name by [Larry Hudson](https://larryhudson.io/). There are a few differences between the two plugins. The main one is that this plugin uses the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech), while Larry's plugin uses the [Microsoft Azure Speech SDK](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk).