# @jackdbd/eleventy-plugin-text-to-speech

[![npm version](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech.svg)](https://badge.fury.io/js/@jackdbd%2Feleventy-plugin-text-to-speech)
![Snyk Vulnerabilities for npm package](https://img.shields.io/snyk/vulnerabilities/npm/@jackdbd%2Feleventy-plugin-text-to-speech)

Eleventy plugin that synthesizes **any text** you want, on **any page** of your Eleventy site, using the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech). You can either self-host the audio assets this plugin generates, or host them on [Cloud Storage](https://cloud.google.com/storage).

> :warning: The Cloud Text-to-Speech API has a [limit of 5000 characters](https://cloud.google.com/text-to-speech/quotas).
>
> See also:
>
> - [this issue of the Wavenet for Chrome extension](https://github.com/wavenet-for-chrome/extension/issues/12)
>
> - [this discussion on Google Groups](https://groups.google.com/g/google-translate-api/c/2JsRdq0tEdA)
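
Because of this limit, a longer text has to be split before it is sent to the API. The following is a minimal sketch of one way to do that, not something this plugin is known to do: `chunkText` is a hypothetical helper, and it counts JavaScript string length, which may not match exactly how the API counts characters.

```js
// Hypothetical helper, not part of this plugin: split `text` into chunks of at
// most `maxChars` characters, breaking at sentence boundaries where possible.
function chunkText(text, maxChars = 5000) {
  // Split on whitespace that follows sentence-ending punctuation.
  const sentences = text.split(/(?<=[.!?])\s+/)
  const chunks = []
  let current = ''

  for (const sentence of sentences) {
    // A single sentence longer than the limit is hard-split by length.
    if (sentence.length > maxChars) {
      if (current) {
        chunks.push(current)
        current = ''
      }
      for (let i = 0; i < sentence.length; i += maxChars) {
        chunks.push(sentence.slice(i, i + maxChars))
      }
      continue
    }

    // Otherwise, append the sentence to the current chunk if it still fits.
    const candidate = current ? `${current} ${sentence}` : sentence
    if (candidate.length > maxChars) {
      chunks.push(current)
      current = sentence
    } else {
      current = candidate
    }
  }

  if (current) chunks.push(current)
  return chunks
}
```

Each chunk could then be synthesized separately. A simpler alternative is to write rules that only match short pieces of text, such as headings or single paragraphs.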

<details><summary>Table of Contents</summary>

- [Installation](#installation)
- [Preliminary Operations](#preliminary-operations)
  - [Enable the Text-to-Speech API](#enable-the-text-to-speech-api)
  - [Set up authentication via a service account](#set-up-authentication-via-a-service-account)
  - [Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)](#optional-create-cloud-storage-bucket-only-if-you-want-to-host-audio-files-on-cloud-storage)
- [Usage](#usage)
  - [Self-hosting the generated audio assets](#self-hosting-the-generated-audio-assets)
  - [Hosting the generated audio assets on Cloud Storage](#hosting-the-generated-audio-assets-on-cloud-storage)
  - [Multiple hosts](#multiple-hosts)
- [Configuration](#configuration)
  - [Required parameters](#required-parameters)
  - [Options](#options)
- [Debug](#debug)
- [Credits](#credits)

</details>

## Installation

```sh
npm install --save-dev @jackdbd/eleventy-plugin-text-to-speech
```

## Preliminary Operations

### Enable the Text-to-Speech API

Before you can begin using the Text-to-Speech API, you must enable it. You can enable the API with the following command:

```sh
gcloud services enable texttospeech.googleapis.com
```

### Set up authentication via a service account

This plugin uses the [official Node.js client library for the Text-to-Speech API](https://github.com/googleapis/nodejs-text-to-speech). To authenticate to any Google Cloud API you need credentials. At the moment this plugin supports only authentication via a service account JSON key.

First, create a service account that can use the Text-to-Speech API (you can also reuse an existing one). If you plan to host the audio files on Cloud Storage, this service account also needs the IAM permissions to create and delete objects in the bucket. You can grant it the [Storage Object Admin predefined IAM role](https://cloud.google.com/storage/docs/access-control/iam-roles).

```sh
gcloud iam service-accounts create sa-text-to-speech-user \
  --display-name "Text-to-Speech user SA"
```

Second, [download the JSON key of this service account](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and store it somewhere safe. Do **not** track this file in git.

### Optional: Create Cloud Storage bucket (only if you want to host audio files on Cloud Storage)

Create a Cloud Storage bucket in your desired [location](https://cloud.google.com/storage/docs/locations). Enable [uniform bucket-level access](https://cloud.google.com/storage/docs/uniform-bucket-level-access) and use the `nearline` [storage class](https://cloud.google.com/storage/docs/storage-classes).

```sh
gsutil mb \
  -p $GCP_PROJECT_ID \
  -l $CLOUD_STORAGE_LOCATION \
  -c nearline \
  -b on \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

If you want, you can check that uniform bucket-level access is **enabled** using this command:

```sh
gsutil uniformbucketlevelaccess get \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

Make the bucket's objects publicly readable (otherwise people will not be able to listen to or download the audio files):

```sh
gsutil iam ch allUsers:objectViewer \
  gs://bkt-eleventy-plugin-text-to-speech-audio-files
```

## Usage

Let's say that you are hosting your Eleventy website on Cloudflare Pages. Your current deployment is at the URL indicated by the [environment variable](https://developers.cloudflare.com/pages/platform/build-configuration/#environment-variables) `CF_PAGES_URL`.

### Self-hosting the generated audio assets

If you want to self-host the audio assets that this plugin generates and use all default options, you can register the plugin with this code:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: process.env.CF_PAGES_URL
      ? new URL(`${process.env.CF_PAGES_URL}/assets/audio`)
      : new URL('http://localhost:8090/assets/audio')
  })

  // some more eleventy configuration...
}
```

### Hosting the generated audio assets on Cloud Storage

If you want to host the audio assets on a Cloud Storage bucket and configure the rules for the audio matches, you could register the plugin using something like this:

```js
const { plugin: tts } = require('@jackdbd/eleventy-plugin-text-to-speech')

module.exports = function (eleventyConfig) {
  // some eleventy configuration...

  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      // synthesize the text contained in all <h1> tags, in all posts
      {
        regex: new RegExp('posts\\/.*\\.html$'),
        cssSelectors: ['h1']
      },
      // synthesize the text contained in all <p> tags that start with "Once upon a time", in all HTML pages, except the 404.html page
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    voice: 'en-GB-Wavenet-C'
  })

  // some more eleventy configuration...
}
```
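
Before running a build, it can help to check which pages a rule's `regex` would select. The sketch below runs the two regexes from the example above against a few made-up paths. It assumes the plugin tests each regex against the page's output path, and the paths themselves are hypothetical.

```js
// The two regexes from the rules above.
const postsRegex = new RegExp('posts\\/.*\\.html$')
const not404Regex = new RegExp('^((?!404).)*\\.html$')

// Hypothetical output paths, for illustration only.
const outputPaths = [
  '_site/index.html',
  '_site/posts/hello-world/index.html',
  '_site/404.html',
  '_site/posts/fixing-404-errors/index.html',
  '_site/assets/style.css'
]

// For each path, record which of the two regexes match it.
const report = outputPaths.map((outputPath) => ({
  outputPath,
  matchesPosts: postsRegex.test(outputPath),
  matchesNot404: not404Regex.test(outputPath)
}))

console.table(report)
```

Note that the second regex rejects any path containing `404`, not just `404.html`, so a post such as `posts/fixing-404-errors/index.html` would be skipped by that rule as well.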

### Multiple hosts

If you want to host the generated audio assets on multiple hosts, register this plugin multiple times. Here are a few examples:

- self-host some audio assets, and host some others on a Cloud Storage bucket
- host all audio assets on Cloud Storage, but split them across two different buckets

Have a look at the Eleventy configuration of the [demo-site in this monorepo](../demo-site/README.md).
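
As a rough sketch of the first case, the hypothetical `registerAudioHosts` function below registers the plugin twice, once per host. The rules, URL and bucket name are placeholders taken from the examples above. Giving each registration its own `collectionName` and `transformName` is an assumption, made so that the two registrations do not clash; the demo-site configuration is the authoritative example.

```js
// Sketch only. `eleventyConfig` is the object Eleventy passes to your
// configuration function; `tts` is the plugin, imported as in the examples above.
function registerAudioHosts(eleventyConfig, tts) {
  // Self-host the audio generated from the <h1> tags of all posts.
  eleventyConfig.addPlugin(tts, {
    audioHost: new URL('http://localhost:8090/assets/audio'),
    rules: [
      { regex: new RegExp('posts\\/.*\\.html$'), cssSelectors: ['h1'] }
    ],
    // Assumption: distinct names keep the two registrations from clashing.
    collectionName: 'audio-items-self-hosted',
    transformName: 'inject-audio-tags-self-hosted'
  })

  // Host on Cloud Storage the audio generated from matching <p> tags.
  eleventyConfig.addPlugin(tts, {
    audioHost: {
      bucketName: 'some-bucket-containing-publicly-readable-files'
    },
    rules: [
      {
        regex: new RegExp('^((?!404).)*\\.html$'),
        xPathExpressions: ['//p[starts-with(., "Once upon a time")]']
      }
    ],
    // Assumption: see the note on distinct names above.
    collectionName: 'audio-items-cloud-storage',
    transformName: 'inject-audio-tags-cloud-storage'
  })
}
```

You would call `registerAudioHosts(eleventyConfig, tts)` from inside your Eleventy configuration function.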

## Configuration

### Required parameters

| Parameter | Explanation |
| --- | --- |
| `audioHost` | Where to host the generated audio assets: a `URL` if you self-host them, or an object with a `bucketName` if you host them on Cloud Storage (see [Usage](#usage)). Each audio host should have a matching writer responsible for writing/uploading the assets to the host. |

### Options

| Option | Default | Explanation |
| --- | --- | --- |
| `audioEncodings` | `['OGG_OPUS', 'MP3']` | List of [audio encodings](https://cloud.google.com/speech-to-text/docs/encoding#audio-encodings) to use when generating audio assets from text matches. |
| `audioInnerHTML` | see in [src/dom.ts](./src/dom.ts) | Function to use to generate the innerHTML of the `<audio>` tag to inject in the page for each text match. |
| `cacheExpiration` | `365d` | Expiration for the 11ty AssetCache. See [how to change the cache duration](https://www.11ty.dev/docs/plugins/fetch/#change-the-cache-duration). |
| `collectionName` | `audio-items` | Name of the 11ty collection created by this plugin. |
| `keyFilename` | `process.env.GOOGLE_APPLICATION_CREDENTIALS` | Path to the service account JSON key used for the Cloud Text-to-Speech API (and for the Cloud Storage API, if you don't set credentials in `audioHost`). |
| `rules` | see in [src/constants.ts](./src/constants.ts) | Rules that determine which texts to convert into speech. |
| `transformName` | `inject-audio-tags-into-html` | Name of the 11ty transform created by this plugin. |
| `voice` | `en-US-Standard-J` | Voice to use when generating audio assets from text matches. The Text-to-Speech API supports [these voices](https://cloud.google.com/text-to-speech/docs/voices), and [pricing](https://cloud.google.com/text-to-speech/pricing) might differ between voices. |

> :warning: Don't forget to set either `keyFilename` or the `GOOGLE_APPLICATION_CREDENTIALS` environment variable on your build server.
>
> *Tip*: check what I did in the Eleventy configuration file for the [demo-site](../demo-site/README.md) of this monorepo.
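
One way to catch missing credentials early is to resolve them before registering the plugin and fail the build with a clear message. This is only a sketch: `resolveKeyFilename` is a hypothetical helper, not part of this plugin, and it only checks that a value is set, not that the file exists or is a valid key.

```js
// Hypothetical helper, not part of this plugin: return the explicit key path
// if given, otherwise fall back to GOOGLE_APPLICATION_CREDENTIALS, and throw
// when neither is set.
function resolveKeyFilename(explicitKeyFilename, env = process.env) {
  const keyFilename = explicitKeyFilename || env.GOOGLE_APPLICATION_CREDENTIALS
  if (!keyFilename) {
    throw new Error(
      'Set the keyFilename option or the GOOGLE_APPLICATION_CREDENTIALS environment variable'
    )
  }
  return keyFilename
}
```

You could then pass `keyFilename: resolveKeyFilename()` in the options object when calling `eleventyConfig.addPlugin`.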

## Debug

This plugin uses the [debug](https://github.com/debug-js/debug) library for logging. You can control what's logged using the `DEBUG` environment variable. For example, if you set your environment variables in a `.envrc` file, you could do:

```sh
# print all logging statements
export DEBUG=eleventy-plugin-text-to-speech/*

# print just the logging statements from the dom module and the writers module
export DEBUG=eleventy-plugin-text-to-speech/dom,eleventy-plugin-text-to-speech/writers

# print all logging statements, except the ones from the dom module and the transforms module
export DEBUG=eleventy-plugin-text-to-speech/*,-eleventy-plugin-text-to-speech/dom,-eleventy-plugin-text-to-speech/transforms
```

## Credits

I had the idea for this plugin while reading the code of the plugin of the same name, [eleventy-plugin-text-to-speech](https://github.com/larryhudson/eleventy-plugin-text-to-speech) by [Larry Hudson](https://larryhudson.io/). There are a few differences between the two plugins; the main one is that this plugin uses the [Google Cloud Text-to-Speech API](https://cloud.google.com/text-to-speech), while Larry's plugin uses the [Microsoft Azure Speech SDK](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-sdk).