# Implementation notes

This document discusses, at a high level, how the pieces of jsxlate
work together from an internal perspective.

## The parts:

- [Extraction](#extraction)
- [Transformation](#transform-plugin)
- [Translation](#translation)

---

## Extraction

The general goal of the extraction plugin is to:

- find marked strings/elements for extraction
- validate the strings/elements
- sanitize elements with unsafe attributes

This is achieved by processing the source through Babel, identifying AST nodes
corresponding to `i18n()` or `<I18N>...</I18N>` and extracting their contents.

### Extracting i18n()

For `i18n()` the process is currently simple: assert that there is exactly one
argument to the function and that it is a `StringLiteral`. All other cases fail.
The single argument will be extracted verbatim.

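As a rough illustration of the check described above, the sketch below validates a hand-built, Babel-style call node. `extractI18nCall` is a hypothetical helper name used only for this example, not a function taken from the jsxlate source.

```javascript
// Hypothetical helper: `node` is a Babel-style CallExpression AST node.
function extractI18nCall(node) {
  const args = node.arguments;
  if (args.length !== 1) {
    throw new Error(`i18n() expects exactly one argument, got ${args.length}`);
  }
  const [arg] = args;
  if (arg.type !== 'StringLiteral') {
    throw new Error(`i18n() argument must be a StringLiteral, got ${arg.type}`);
  }
  // The string value is extracted verbatim.
  return arg.value;
}

// Hand-built node resembling what Babel produces for i18n("Hello, world").
const callNode = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'i18n' },
  arguments: [{ type: 'StringLiteral', value: 'Hello, world' }],
};
const extracted = extractI18nCall(callNode);
```

Anything other than a single string literal (a template literal, a concatenation, a variable) throws, which mirrors the "all other cases fail" rule.
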
### Extracting `<I18N>`

#### Validation

The first step of extraction is to validate the source string. The following
constraints are checked for each message:

- No nesting of `<I18N>` messages is allowed
- Any element with unsafe attributes must have an `i18n-id`
- Multiple `<ReactComponent>`s of the same type must have (distinct) `i18n-id`s
- Only `Identifier` and simple `MemberExpression` nodes are allowed inside
  `JSXExpressionContainer`s

These checks are enforced via a `babel-traverse` path which maintains a context
in order to track all of the names/ids of React Components encountered.

#### Sanitization

By default, the following attributes are whitelisted (tag: [attributes]):

```javascript
whitelistedAttributes: {
  a: ['href'],
  img: ['alt'],
  '*': ['title', 'placeholder', 'alt', 'summary', 'i18n-id'],
  'Pluralize': ['on'],
  'Match': ['when'],
},
```

Any attributes for `*` will be merged with attributes for a specific tag or
component. It [should be possible](../TODO.md) to specify these via the
`.babelrc` [plugin options](https://babeljs.io/docs/plugins/#plugin-options),
but this is not yet implemented.

Any attribute not present in the whitelist will be removed.

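The merge-then-filter behaviour can be sketched as follows. This is a simplified model that treats attributes as a plain object rather than AST nodes; `allowedAttributes` and `sanitizeAttributes` are hypothetical names introduced for the example.

```javascript
const whitelistedAttributes = {
  a: ['href'],
  img: ['alt'],
  '*': ['title', 'placeholder', 'alt', 'summary', 'i18n-id'],
  Pluralize: ['on'],
  Match: ['when'],
};

// Hypothetical helper: the '*' entries are merged with the tag-specific ones.
function allowedAttributes(tagName, whitelist = whitelistedAttributes) {
  return new Set([...(whitelist['*'] || []), ...(whitelist[tagName] || [])]);
}

// Hypothetical helper: drop every attribute that is not whitelisted.
function sanitizeAttributes(tagName, attributes) {
  const allowed = allowedAttributes(tagName);
  return Object.fromEntries(
    Object.entries(attributes).filter(([name]) => allowed.has(name))
  );
}

// `onClick` and `className` are not whitelisted for <a>, so they are removed.
const sanitized = sanitizeAttributes('a', {
  href: '/potatoes',
  title: 'Potatoes',
  onClick: 'handleClick',
  className: 'link',
});
```
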
### Message Output

Finally, `babel-generator` is used on each message AST node after validation
and sanitization. Comments are stripped during this phase. Messages are
collected per source file in `extractFromSource()` and merged across files
in `extractFromPaths()`.

---

## Transform plugin

The transform plugin is intended to be used directly in the Babel compilation
chain, unlike the rest of the jsxlate plugins. It is expected to be specified in
the user's `.babelrc`. Its purpose is to identify `<I18N>` components and
transform them into message lookup sites. When authoring with jsxlate,
messages are wrapped in `<I18N>...</I18N>` components, but when executed,
these are actually turned into self-closing `<I18N/>` components, which have
props of `message`, `context`, `args`, and `fallback`.

### The transformation process

The transform plugin has a single-node-type visitor which looks only at
`JSXElement` nodes. If the node is an `I18N` component that still has children
(that is, it has not yet been transformed), then it will be transformed. A node
without children is one that was just transformed: Babel immediately re-visits
paths after they are replaced.

The transformation process itself consists of a few steps:

- determining the free variables present in the message
- extracting the message
- constructing a fallback renderer
- assembling these into the new `I18N` callsite

#### Determining the free variables

Given the following message:

```js
<I18N>{name} sold me a ${this.props.amount} potato.</I18N>
```

The free variables are `name` and `this`: their definitions are not present within
the message itself, and so must be supplied as arguments to the render function.
The transformed source would be:

```js
<I18N message="{name} sold me a ${this.props.amount} potato."
      context={this}
      args={[name]}
      fallback={() => <span>{name} sold me a ${this.props.amount} potato.</span>}
/>
```

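One way to picture the free-variable step is to take each expression in the message and walk down to the root of its member chain. The sketch below does this over hand-built, Babel-style nodes; `rootOf` and `freeVariables` are hypothetical names, and the sketch only considers expression containers, which may be narrower than what the real plugin inspects.

```javascript
// Follow the `object` side of a MemberExpression until reaching its root.
function rootOf(expression) {
  let node = expression;
  while (node.type === 'MemberExpression') {
    node = node.object;
  }
  return node;
}

// Collect the distinct roots: identifiers by name, `this` as the string 'this'.
function freeVariables(expressions) {
  const names = new Set();
  for (const expression of expressions) {
    const root = rootOf(expression);
    if (root.type === 'ThisExpression') {
      names.add('this');
    } else if (root.type === 'Identifier') {
      names.add(root.name);
    }
  }
  return [...names];
}

// Nodes for {name} and {this.props.amount} from the message above.
const nameExpr = { type: 'Identifier', name: 'name' };
const amountExpr = {
  type: 'MemberExpression',
  object: {
    type: 'MemberExpression',
    object: { type: 'ThisExpression' },
    property: { type: 'Identifier', name: 'props' },
  },
  property: { type: 'Identifier', name: 'amount' },
};
const free = freeVariables([nameExpr, amountExpr]);
```
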
#### Extracting the message

The extraction is performed as before, with One Weird Trick: because the
extraction process actually mutates the AST by removing sanitized attributes,
it cannot be used on the same AST that is going to be used by the transformation
process. Thus, the code is generated to a string and then re-parsed with the
extraction plugin, in a hacky form of immutability :)

#### Constructing a fallback

The fallback is a render function that is unchanged from the original source,
and is what is generally used in the original language deployment. (Note: it
is possible to translate the original-language strings by adding bundled
messages to that language, but it is not recommended.)

There are two steps to construct the fallback:

- change the container element to a `<span>`
- strip all `i18n-id`s, whether in namespace or attribute form

The first is accomplished with direct AST manipulation, and the second is
performed using `babel-traverse` with a visitor that calls `stripI18nId` on
each `JSXElement` node.

#### Assembling the new `<I18N>` callsite

The final step is to take all these pieces and glue them together. This is
accomplished using the excellent `babel-template` library. `babel-template`
allows you to interpolate AST variables into a string template, saving a bunch
of time and boilerplate. From `transformation.js`, here is how it is used:

```js
const transformElementMarker = template(`
  <I18N message={MESSAGE} context={this} args={ARGS} fallback={function() { return FALLBACK; }}/>
`, {plugins: ['jsx']});
```

This creates a function named `transformElementMarker` which accepts an
object parameter containing the keys `MESSAGE`, `ARGS`, and `FALLBACK`. Each of
the corresponding values will be interpolated into the AST of the parsed
template. (See the git history of `src/transform.js` to see the full glory of
the pre-`babel-template` version.)

---

## Translation

The goal of the translation process is to construct a bundle of functions, keyed
on the extracted messages of the source code, which will render that message in
a given language.

For each marker node the following steps are performed:

- extract the message from the node
- look up the corresponding translation in the input
- if a translation is found,
  - validate the translation
  - find any free variables therein
  - generate a renderer for that message
- else mark it as missing

Message extraction proceeds exactly as in the Transform plugin.

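The overall control flow of the per-marker steps can be sketched as below. This is only an outline under simplifying assumptions: messages are plain strings, validation and free-variable discovery are folded into a caller-supplied `makeRenderer` callback, and `buildBundle` is a hypothetical name rather than a jsxlate function.

```javascript
// Build a { message: renderer } bundle and record messages with no translation.
function buildBundle(messages, translations, makeRenderer) {
  const bundle = {};
  const missing = [];
  for (const message of messages) {
    if (Object.prototype.hasOwnProperty.call(translations, message)) {
      // Validation and free-variable discovery would happen here.
      bundle[message] = makeRenderer(message, translations[message]);
    } else {
      missing.push(message);
    }
  }
  return { bundle, missing };
}

// Stub renderer factory: the "renderer" simply returns the translated string.
const result = buildBundle(
  ['Hello', 'Goodbye'],
  { Hello: 'Hola' },
  (message, translation) => () => translation
);
```
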
### Validate the translation

To validate a translated message, all of the requirements of the extraction
validation must be met. In addition, the following checks are performed to
ensure consistency between the original source and the translation:

- there must be the same number of React Components of a given name in both the
  source and the translation
- there must be the same number of `i18n-id`s in both the source and translation
- there must be the same number of named expression definitions in both the
  source and the translation

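The three consistency checks amount to comparing counts gathered from the source message and from the translation. A minimal sketch, assuming each message has already been summarized into a hypothetical `{ components, ids, expressions }` object of string arrays (this shape and the function names are illustrative, not jsxlate's):

```javascript
// Count how many times each item occurs.
function tally(items) {
  const counts = new Map();
  for (const item of items) {
    counts.set(item, (counts.get(item) || 0) + 1);
  }
  return counts;
}

// True when both lists contain the same items the same number of times.
function sameTally(a, b) {
  const left = tally(a);
  const right = tally(b);
  if (left.size !== right.size) return false;
  for (const [key, count] of left) {
    if (right.get(key) !== count) return false;
  }
  return true;
}

// Returns a list of problems; an empty list means the translation is consistent.
function validateTranslation(source, translation) {
  const errors = [];
  if (!sameTally(source.components, translation.components)) {
    errors.push('component counts differ');
  }
  if (source.ids.length !== translation.ids.length) {
    errors.push('i18n-id counts differ');
  }
  if (source.expressions.length !== translation.expressions.length) {
    errors.push('named expression counts differ');
  }
  return errors;
}
```

Word order often changes between languages, so the comparison is deliberately order-insensitive.
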
#### Named Expression Definitions

Named expression definitions are simply anything inside a `JSXExpressionContainer`
node, e.g. `{foo}`, `{this.bar}`, or `{bar.baz.quux}`. These simple expressions
can consist only of `Identifier`, `MemberExpression`, and `ThisExpression` nodes;
anything else is invalid.

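A recursive check over Babel-style nodes is one way to express this rule. In the sketch below, `isSimpleExpression` is a hypothetical name, and rejecting computed access such as `foo[bar]` is an assumption about what "simple" means here.

```javascript
// True only for chains built from Identifier, ThisExpression and
// non-computed MemberExpression nodes, e.g. foo, this.bar, bar.baz.quux.
function isSimpleExpression(node) {
  switch (node.type) {
    case 'Identifier':
    case 'ThisExpression':
      return true;
    case 'MemberExpression':
      // Assumption: computed access (foo[bar]) is not considered simple.
      return !node.computed
        && node.property.type === 'Identifier'
        && isSimpleExpression(node.object);
    default:
      return false;
  }
}

// {this.bar}
const thisBar = {
  type: 'MemberExpression',
  computed: false,
  object: { type: 'ThisExpression' },
  property: { type: 'Identifier', name: 'bar' },
};

// {foo()} is a CallExpression and therefore invalid.
const fooCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'foo' },
  arguments: [],
};
```
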
### Generating a renderer for a given node + translation

The translation process can seem quite challenging but is in fact fairly simple.
The key here is that the translator's input is almost a valid JSX expression; it
just needs to have namespace syntax removed and any missing attributes added to
it. As such, the process is:

- strip namespaces from element names and remove `i18n-id` attributes
- find all sanitized attributes in the original source and create a mapping of
  `i18n-id`/`ComponentName` to these attributes
- `traverse()` the translated message and append sanitized attributes from the
  previous step to any nodes that match

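The steps above can be sketched on a simplified element tree. Everything about the shape here is an assumption made for illustration: elements are plain `{ name, attributes, children }` objects instead of AST nodes, the namespace form is taken to be `tag:id`, and `restoreAttributes` is a hypothetical name.

```javascript
// Strip the i18n-id (attribute or namespace form) from a translated element
// and re-attach the attributes that were sanitized out of the original source.
function restoreAttributes(element, sanitizedByKey) {
  const { 'i18n-id': id, ...attributes } = element.attributes;
  // Assumed namespace form: "a:potato-link" is element "a" with id "potato-link".
  const [name, namespacedId] = element.name.split(':');
  // Look up by i18n-id first, falling back to the component/tag name.
  const key = id || namespacedId || name;
  return {
    name,
    attributes: { ...attributes, ...(sanitizedByKey[key] || {}) },
    children: element.children.map(child =>
      typeof child === 'string' ? child : restoreAttributes(child, sanitizedByKey)
    ),
  };
}

// Attributes removed from the original source, keyed by i18n-id.
const sanitizedByKey = {
  'potato-link': { className: 'link', onClick: 'handleClick' },
};

// A translated message roughly like: <span><a:potato-link href="/patatas">patata</a:potato-link></span>
const translated = {
  name: 'span',
  attributes: {},
  children: [
    { name: 'a:potato-link', attributes: { href: '/patatas' }, children: ['patata'] },
  ],
};
const restored = restoreAttributes(translated, sanitizedByKey);
```

The translator keeps control of whitelisted attributes such as `href`, while everything else comes back from the original source.
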