# Implementation notes

This document discusses, at a high level, how the pieces of jsxlate work
together from an internal perspective.

## The parts:

- [Extraction](#extraction)
- [Transformation](#transform-plugin)
- [Translation](#translation)

---

## Extraction

The general goal of the extraction plugin is to:

- find marked strings/elements for extraction
- validate the strings/elements
- sanitize elements with unsafe attributes

This is achieved by processing the source through Babel, identifying AST nodes
corresponding to `i18n()` or `<I18N>...</I18N>` and extracting their contents.

### Extracting i18n()

For `i18n()` the process is currently simple: assert that there is exactly one
argument to the function and that it is a `StringLiteral`. All other cases fail.
The single argument will be extracted verbatim.
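
As a rough illustration of that assertion, here is a minimal sketch that works on
a hand-built node in the Babel AST shape (`CallExpression.arguments`,
`StringLiteral.value`). The helper name `extractFunctionMessage` and the error
messages are assumptions made for this example, not jsxlate's actual code.

```js
// Hypothetical helper: checks an i18n() call node and returns its message.
function extractFunctionMessage(callNode) {
    const args = callNode.arguments;
    if (args.length !== 1) {
        throw new Error(`i18n() expects exactly one argument, got ${args.length}`);
    }
    const arg = args[0];
    if (arg.type !== 'StringLiteral') {
        throw new Error(`i18n() expects a string literal, got ${arg.type}`);
    }
    return arg.value; // the literal's value is taken verbatim
}

// Hand-built node equivalent to: i18n("Hello, world")
const helloCall = {
    type: 'CallExpression',
    callee: {type: 'Identifier', name: 'i18n'},
    arguments: [{type: 'StringLiteral', value: 'Hello, world'}],
};
console.log(extractFunctionMessage(helloCall));
```

Anything other than a single string literal (a template literal, a variable, a
concatenation) is rejected rather than guessed at.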

### Extracting &lt;I18N&gt;

#### Validation

The first step of extraction is to validate the source string. The following
constraints are checked for each message:

- No nesting of `<I18N>` messages is allowed
- Any element with unsafe attributes must have an `i18n-id`
- Multiple `<ReactComponent>`s of the same type must have (distinct) `i18n-id`s
- Only `Identifier` and simple `MemberExpression` nodes are allowed inside
  `JSXExpressionContainer`s

These checks are enforced via a `babel-traverse` path which maintains a context
in order to track all of the names/ids of React Components encountered.

#### Sanitization

By default, the following attributes are whitelisted (tag: [attributes]):

```javascript
whitelistedAttributes: {
    a:   ['href'],
    img: ['alt'],
    '*': ['title', 'placeholder', 'alt', 'summary', 'i18n-id'],
    'Pluralize': ['on'],
    'Match': ['when'],
},
```

Any attributes for `*` will be merged with attributes for a specific tag or
component. It [should be possible](../TODO.md) to specify these via the
`.babelrc` [plugin options](https://babeljs.io/docs/plugins/#plugin-options),
but this is not yet implemented.

Any attribute not present in the whitelist will be removed.
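
The merge-then-filter behaviour described above might look roughly like the
sketch below. It works on plain attribute names rather than AST nodes, and the
function names `allowedAttributesFor` and `sanitizeAttributes` are hypothetical,
chosen only for this example.

```js
const whitelistedAttributes = {
    a:   ['href'],
    img: ['alt'],
    '*': ['title', 'placeholder', 'alt', 'summary', 'i18n-id'],
    'Pluralize': ['on'],
    'Match': ['when'],
};

// Hypothetical helper: the '*' entries merged with the tag-specific entries.
function allowedAttributesFor(tagName) {
    const hasOwn = Object.prototype.hasOwnProperty.call(whitelistedAttributes, tagName);
    const specific = hasOwn ? whitelistedAttributes[tagName] : [];
    return new Set([...whitelistedAttributes['*'], ...specific]);
}

// Hypothetical helper: keeps only the whitelisted attribute names.
function sanitizeAttributes(tagName, attributeNames) {
    const allowed = allowedAttributesFor(tagName);
    return attributeNames.filter(name => allowed.has(name));
}

console.log(sanitizeAttributes('a', ['href', 'target', 'title']));
```

In this sketch `target` is dropped from the `<a>` because it appears neither
under `a` nor under `*`.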

### Message Output

Finally, `babel-generator` is used on each message AST node after validation
and sanitization. Comments are stripped during this phase. Messages are
collected per source file in `extractFromSource()` and merged across files
in `extractFromPaths()`.


---


## Transform plugin

The transform plugin is intended to be used directly in the Babel compilation
chain, unlike the rest of the jsxlate plugins. It is expected to be specified in
the user's `.babelrc`. Its purpose is to identify `<I18N>` components and
transform them into message lookup sites. When authoring with jsxlate,
messages are wrapped in `<I18N>...</I18N>` components, but when executed,
these are actually turned into self-closing `<I18N/>` components, which have
props of `message`, `context`, `args`, and `fallback`.

### The transformation process

The transform plugin has a single-node-type visitor which looks only at
`JSXElement` nodes. If the node is an `I18N` component that still has children,
meaning it has not yet been transformed, then it will be transformed. (It may
not have children if it is the node that was just transformed: Babel
immediately re-visits paths after they are replaced.)
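
The re-visit guard can be sketched as a small predicate over a Babel-shaped
`JSXElement` node. The name `shouldTransform` is an assumption for illustration;
the nodes below are built by hand rather than parsed.

```js
// Hypothetical guard: true only for <I18N> elements that still have children.
function shouldTransform(node) {
    if (node.type !== 'JSXElement') return false;
    const name = node.openingElement.name;
    return name.type === 'JSXIdentifier'
        && name.name === 'I18N'
        && node.children.length > 0;
}

// As authored: <I18N>Hello</I18N>
const authored = {
    type: 'JSXElement',
    openingElement: {name: {type: 'JSXIdentifier', name: 'I18N'}, selfClosing: false},
    children: [{type: 'JSXText', value: 'Hello'}],
};
// After transformation: <I18N ... /> (re-visited by Babel, must be skipped)
const alreadyTransformed = {
    type: 'JSXElement',
    openingElement: {name: {type: 'JSXIdentifier', name: 'I18N'}, selfClosing: true},
    children: [],
};
console.log(shouldTransform(authored), shouldTransform(alreadyTransformed));
```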

The transformation process itself consists of a few steps:

- determining the free variables present in the message
- extracting the message
- constructing a fallback renderer
- assembling these into the new `I18N` callsite

#### Determining the free variables

Given the following message:

```js
<I18N>{name} sold me a ${this.props.amount} potato.</I18N>
```

The free variables are `name` and `this`: their definitions are not present
within the message itself, and so must be supplied as arguments to the render
function. The transformed source would be:

```js
<I18N message="{name} sold me a ${this.props.amount} potato."
      context={this}
      args={[name]}
      fallback={() => <span>{name} sold me a ${this.props.amount} potato.</span>}
      />
```
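
One way to picture how `context` and `args` are derived from the example above is
to take the leftmost node of each expression: `this` goes to the context, and any
other identifier becomes an argument. The sketch below assumes hand-built Babel
AST nodes and only looks at expression containers; `rootOf` and `freeVariables`
are hypothetical names, and the real plugin may well also treat other things
(such as component names) as free variables.

```js
// Returns the leftmost node of a (possibly nested) MemberExpression.
function rootOf(node) {
    while (node.type === 'MemberExpression') {
        node = node.object;
    }
    return node;
}

// Hypothetical helper: splits expression roots into `this` usage and names.
function freeVariables(expressions) {
    const names = [];
    let usesThis = false;
    for (const expression of expressions) {
        const root = rootOf(expression);
        if (root.type === 'ThisExpression') {
            usesThis = true;
        } else if (root.type === 'Identifier' && !names.includes(root.name)) {
            names.push(root.name);
        }
    }
    return {usesThis, names};
}

// {name} and {this.props.amount}, written out as AST nodes
const nameExpr = {type: 'Identifier', name: 'name'};
const amountExpr = {
    type: 'MemberExpression',
    object: {
        type: 'MemberExpression',
        object: {type: 'ThisExpression'},
        property: {type: 'Identifier', name: 'props'},
    },
    property: {type: 'Identifier', name: 'amount'},
};
console.log(freeVariables([nameExpr, amountExpr]));
```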


#### Extracting the message

The extraction is performed as before, with One Weird Trick: because the
extraction process actually mutates the AST by removing sanitized attributes,
it cannot be used on the same AST that is going to be used by the transformation
process. Thus, the code is generated to a string and then re-parsed with the
extraction plugin, in a hacky form of immutability :)


#### Constructing a fallback

The fallback is a render function that is unchanged from the original source,
and is what is generally used in the original language deployment. (Note: it
is possible to translate the original-language strings by adding bundled
messages to that language, but it is not recommended.)

There are two steps to construct the fallback:

- change the container element to a `<span>`
- strip all `i18n-id`s, whether in namespace or attribute form

The first is accomplished with direct AST manipulation, and the second is
performed using `babel-traverse` with a visitor that calls `stripI18nId` on
each `JSXElement` node.


#### Assembling the new &lt;I18N&gt; callsite

The final step is to take all these pieces and glue them together. This is
accomplished using the excellent `babel-template` library. `babel-template`
allows you to interpolate AST variables into a string template, saving a bunch
of time and boilerplate. From `transformation.js`, here is how it is used:

```js
const transformElementMarker = template(`
    <I18N message={MESSAGE} context={this} args={ARGS} fallback={function() { return FALLBACK; }}/>
`, {plugins: ['jsx']});
```

This creates a function named `transformElementMarker` which will accept an
object parameter containing the keys `MESSAGE`, `ARGS`, and `FALLBACK`. Each of
the corresponding values will be interpolated into the AST of the parsed
template. (See the git history of `src/transform.js` to see the full glory of
the pre-`babel-template` version.)


---


## Translation

The goal of the translation process is to construct a bundle of functions, keyed
on the extracted messages of the source code, each of which will render its
message in a given language.

For each marker node the following steps are performed:

- extract the message from the node
- look up the corresponding translation in the input
- if a translation is found,
  - validate the translation
  - find any free variables therein
  - generate a renderer for that message
- otherwise, mark it as missing

Message extraction proceeds exactly as in the Transform plugin.

### Validate the translation

To validate a translated message, all of the requirements of the extraction
validation must be met. In addition, the following checks are performed to
ensure consistency between the original source and the translation:

- there must be the same number of React Components of a given name in both the
  source and the translation
- there must be the same number of `i18n-id`s in both the source and translation
- there must be the same number of named expression definitions in both the
  source and the translation
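
The three consistency checks can be sketched as simple count comparisons. The
input shape used here (`{components, ids, expressions}`, each a list of strings
already collected from a message) and the function names are assumptions for
this example; jsxlate gathers this information by traversing the AST.

```js
// Counts occurrences of each name in a list.
function countByName(names) {
    const counts = new Map();
    for (const name of names) {
        counts.set(name, (counts.get(name) || 0) + 1);
    }
    return counts;
}

// Hypothetical check: returns a list of problems, empty when consistent.
function validateTranslation(source, translation) {
    const problems = [];
    const sourceCounts = countByName(source.components);
    const translatedCounts = countByName(translation.components);
    const names = new Set([...sourceCounts.keys(), ...translatedCounts.keys()]);
    for (const name of names) {
        if (sourceCounts.get(name) !== translatedCounts.get(name)) {
            problems.push(`component count differs for <${name}>`);
        }
    }
    if (source.ids.length !== translation.ids.length) {
        problems.push('i18n-id count differs');
    }
    if (source.expressions.length !== translation.expressions.length) {
        problems.push('expression count differs');
    }
    return problems;
}

const sourceSummary = {components: ['Pluralize'], ids: ['my-link'], expressions: ['name']};
const droppedExpression = {components: ['Pluralize'], ids: ['my-link'], expressions: []};
console.log(validateTranslation(sourceSummary, droppedExpression));
```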

#### Named Expression Definitions

Named expression definitions are simply anything inside a `JSXExpressionContainer`
node, e.g. `{foo}`, `{this.bar}`, or `{bar.baz.quux}`. These simple expressions
can consist only of `Identifier`, `MemberExpression`, and `ThisExpression` nodes;
anything else is invalid.
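
A recursive check for such expressions could look like the sketch below, again
over hand-built Babel AST nodes. The name `isSimpleExpression` is hypothetical,
and rejecting computed access such as `foo[0]` is an assumption about what
"simple" means here.

```js
// Hypothetical check: Identifier, ThisExpression, or dotted member access only.
function isSimpleExpression(node) {
    switch (node.type) {
        case 'Identifier':
        case 'ThisExpression':
            return true;
        case 'MemberExpression':
            // Assumption: computed access (foo[bar]) does not count as simple.
            return !node.computed
                && node.property.type === 'Identifier'
                && isSimpleExpression(node.object);
        default:
            return false;
    }
}

// {this.bar}
const thisBar = {
    type: 'MemberExpression',
    computed: false,
    object: {type: 'ThisExpression'},
    property: {type: 'Identifier', name: 'bar'},
};
// {foo()} is a CallExpression and therefore rejected
const fooCall = {
    type: 'CallExpression',
    callee: {type: 'Identifier', name: 'foo'},
    arguments: [],
};
console.log(isSimpleExpression(thisBar), isSimpleExpression(fooCall));
```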

### Generating a renderer for a given node + translation

The translation process can seem quite challenging but is in fact fairly simple.
The key here is that the translator's input is almost a valid JSX expression; it
just needs to have namespace syntax removed and any missing attributes added to
it. As such, the process is:

- strip namespaces from element names and remove `i18n-id` attributes
- find all sanitized attributes in the original source and create a mapping of
  `i18n-id`/`ComponentName` to these attributes
- `traverse()` the translated message and append sanitized attributes from the
  previous step to any nodes that match.
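
The steps above can be sketched on a deliberately simplified element shape,
`{name, id, attributes, removed, children}`, where `removed` holds the
attributes that sanitization took out of the source. This shape and the helper
names are assumptions made for illustration only; the real implementation works
on Babel AST nodes.

```js
// Elements are matched by i18n-id when present, otherwise by element name.
function keyOf(element) {
    return element.id || element.name;
}

// Builds the mapping from i18n-id/ComponentName to the sanitized attributes.
function collectRemovedAttributes(element, byKey = new Map()) {
    if (element.removed && Object.keys(element.removed).length > 0) {
        byKey.set(keyOf(element), element.removed);
    }
    for (const child of element.children || []) {
        if (typeof child !== 'string') collectRemovedAttributes(child, byKey);
    }
    return byKey;
}

// Walks the translation, dropping ids and re-attaching the mapped attributes.
function restoreAttributes(element, byKey) {
    return {
        name: element.name,
        attributes: {...element.attributes, ...(byKey.get(keyOf(element)) || {})},
        children: (element.children || []).map(child =>
            typeof child === 'string' ? child : restoreAttributes(child, byKey)),
    };
}

// Source element with `target` sanitized away, and its translated counterpart.
const sourceLink = {name: 'a', id: 'buy', attributes: {href: '/buy'},
                    removed: {target: '_blank'}, children: ['Buy a potato']};
const translatedLink = {name: 'a', id: 'buy', attributes: {href: '/acheter'},
                        children: ['Acheter une patate']};
console.log(restoreAttributes(translatedLink, collectRemovedAttributes(sourceLink)));
```

The translator never sees `target`, yet it reappears on the rendered element
because it is looked up by `i18n-id` from the original source.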
