# Intl MessageFormat Parser

Parses [ICU Message strings][icu] into an AST via JavaScript.

[![npm Version](https://badgen.net/npm/v/intl-messageformat-parser)](https://www.npmjs.com/package/intl-messageformat-parser)
[![size](https://badgen.net/bundlephobia/minzip/intl-messageformat-parser)](https://bundlephobia.com/result?p=intl-messageformat-parser)

## Overview

This package implements a JavaScript parser that turns industry-standard [ICU Message strings][icu], used for internationalization, into an AST. A compiler such as [`intl-messageformat`][intl-mf] can then use that AST to produce localized, formatted strings for display to users.

This parser is written in [PEG.js][], a parser generator for JavaScript.

## Usage

```ts
import {parse} from 'intl-messageformat-parser';
const ast = parse('this is {count, plural, one{# dog} other{# dogs}}');
```

### Example

Given an ICU Message string like this:

```
On {takenDate, date, short} {name} took {numPhotos, plural,
  =0 {no photos.}
  =1 {one photo.}
  other {# photos.}
}
```

```js
// Assume `msg` is the string above.
parse(msg);
```

This parser will produce this AST:

```json
[
  {
    "type": 0,
    "value": "On "
  },
  {
    "type": 3,
    "style": "short",
    "value": "takenDate"
  },
  {
    "type": 0,
    "value": " "
  },
  {
    "type": 1,
    "value": "name"
  },
  {
    "type": 0,
    "value": " took "
  },
  {
    "type": 6,
    "pluralType": "cardinal",
    "value": "numPhotos",
    "offset": 0,
    "options": [
      {
        "id": "=0",
        "value": [
          {
            "type": 0,
            "value": "no photos."
          }
        ]
      },
      {
        "id": "=1",
        "value": [
          {
            "type": 0,
            "value": "one photo."
          }
        ]
      },
      {
        "id": "other",
        "value": [
          {
            "type": 0,
            "value": "# photos."
          }
        ]
      }
    ]
  }
]
```
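
A consumer of this AST usually walks it recursively, because plural options contain nested ASTs of their own. The sketch below is one possible way to list the placeholder names a message expects. It assumes only the node shape visible in the JSON above (numeric `type` codes, with `0` marking literal text, and `options` as an array); `SketchNode`, `SketchOption` and `collectArgumentNames` are hypothetical names for illustration, not part of the package's API.

```ts
// Hypothetical minimal node shape, inferred only from the JSON above.
// The package's real type definitions may differ.
interface SketchOption {
  id: string;
  value: SketchNode[];
}

interface SketchNode {
  type: number;
  value?: string;
  style?: string;
  options?: SketchOption[];
}

// In the example output above, type 0 is plain literal text.
const LITERAL_TYPE = 0;

// Walk the AST depth-first and collect each placeholder name once.
function collectArgumentNames(ast: SketchNode[], found: string[] = []): string[] {
  for (const node of ast) {
    const name = node.value;
    if (node.type !== LITERAL_TYPE && typeof name === 'string' && found.indexOf(name) === -1) {
      found.push(name);
    }
    // Plural nodes carry a nested AST inside every option.
    for (const option of node.options || []) {
      collectArgumentNames(option.value, found);
    }
  }
  return found;
}

// A shortened copy of the AST shown above.
const sampleAst: SketchNode[] = [
  {type: 0, value: 'On '},
  {type: 3, style: 'short', value: 'takenDate'},
  {type: 0, value: ' '},
  {type: 1, value: 'name'},
  {type: 0, value: ' took '},
  {
    type: 6,
    value: 'numPhotos',
    options: [
      {id: '=0', value: [{type: 0, value: 'no photos.'}]},
      {id: 'other', value: [{type: 0, value: '# photos.'}]},
    ],
  },
];

console.log(collectArgumentNames(sampleAst));
// → [ 'takenDate', 'name', 'numPhotos' ]
```

A list like this can be handy for checking that callers supply every value a message needs before it is formatted.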

## Supported DateTime Skeleton

ICU provides a [wide array of patterns](https://www.unicode.org/reports/tr35/tr35-dates.html#Date_Field_Symbol_Table) to customize date and time formats. However, not all of them are available via ECMA-402's Intl API, so the parser supports only the following patterns:

| Symbol | Meaning                       | Notes                     |
| ------ | ----------------------------- | ------------------------- |
| G      | Era designator                |                           |
| y      | Year                          |                           |
| M      | Month in year                 |                           |
| L      | Stand-alone month in year     |                           |
| d      | Day in month                  |                           |
| E      | Day of week                   |                           |
| e      | Local day of week             | `e..eee` is not supported |
| c      | Stand-alone local day of week | `c..ccc` is not supported |
| a      | AM/PM marker                  |                           |
| h      | Hour [1-12]                   |                           |
| H      | Hour [0-23]                   |                           |
| K      | Hour [0-11]                   |                           |
| k      | Hour [1-24]                   |                           |
| m      | Minute                        |                           |
| s      | Second                        |                           |
| z      | Time Zone                     |                           |
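
To make the table concrete, here is a rough sketch of how a skeleton such as `yMMMd` could be translated into options for the built-in `Intl.DateTimeFormat`. It is not the package's implementation: `skeletonToOptions`, `SketchDateTimeOptions` and the two width helpers are hypothetical names, the width rules (for example, four letters meaning `long`) are a simplified reading of the ICU field table, and quoted literals in a skeleton are ignored.

```ts
// Hypothetical option shape for this sketch. Its fields are a subset of the
// options accepted by the built-in Intl.DateTimeFormat constructor.
interface SketchDateTimeOptions {
  era?: 'short' | 'long' | 'narrow';
  year?: 'numeric' | '2-digit';
  month?: 'numeric' | '2-digit' | 'short' | 'long' | 'narrow';
  day?: 'numeric' | '2-digit';
  weekday?: 'short' | 'long' | 'narrow';
  hour12?: boolean;
  hourCycle?: 'h11' | 'h12' | 'h23' | 'h24';
  hour?: 'numeric' | '2-digit';
  minute?: 'numeric' | '2-digit';
  second?: 'numeric' | '2-digit';
  timeZoneName?: 'short' | 'long';
}

// Assumed rule: 4 letters = long, 5 letters = narrow, anything else = short.
function textWidth(len: number): 'short' | 'long' | 'narrow' {
  return len === 4 ? 'long' : len === 5 ? 'narrow' : 'short';
}

// Assumed rule: exactly 2 letters = zero-padded, anything else = numeric.
function numericWidth(len: number): 'numeric' | '2-digit' {
  return len === 2 ? '2-digit' : 'numeric';
}

function skeletonToOptions(skeleton: string): SketchDateTimeOptions {
  const options: SketchDateTimeOptions = {};
  // Split into runs of one repeated letter, e.g. "yMMMd" -> ["y", "MMM", "d"].
  const runs: string[] = skeleton.match(/([a-zA-Z])\1*/g) || [];
  for (const run of runs) {
    const len = run.length;
    switch (run.charAt(0)) {
      case 'G': options.era = textWidth(len); break;
      case 'y': options.year = numericWidth(len); break;
      case 'M':
      case 'L': options.month = len <= 2 ? numericWidth(len) : textWidth(len); break;
      case 'd': options.day = numericWidth(len); break;
      case 'E': options.weekday = textWidth(len); break;
      case 'e':
      case 'c':
        // The table above marks the 1-3 letter forms as unsupported.
        if (len < 4) throw new RangeError('Unsupported skeleton field: ' + run);
        options.weekday = textWidth(len);
        break;
      case 'a': options.hour12 = true; break;
      case 'h': options.hourCycle = 'h12'; options.hour = numericWidth(len); break;
      case 'H': options.hourCycle = 'h23'; options.hour = numericWidth(len); break;
      case 'K': options.hourCycle = 'h11'; options.hour = numericWidth(len); break;
      case 'k': options.hourCycle = 'h24'; options.hour = numericWidth(len); break;
      case 'm': options.minute = numericWidth(len); break;
      case 's': options.second = numericWidth(len); break;
      case 'z': options.timeZoneName = len < 4 ? 'short' : 'long'; break;
      default: throw new RangeError('Unsupported skeleton symbol: ' + run);
    }
  }
  return options;
}

console.log(skeletonToOptions('yMMMd'));
// → { year: 'numeric', month: 'short', day: 'numeric' }
```

The resulting object could then be passed as the second argument to `new Intl.DateTimeFormat(locale, options)`. Symbols outside the table throw a `RangeError` in this sketch, which mirrors the idea that only patterns expressible through the Intl API are accepted.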

## Benchmarks

```
complex_msg AST length 2053
normal_msg AST length 410
simple_msg AST length 79
string_msg AST length 36
complex_msg x 3,926 ops/sec ±2.37% (90 runs sampled)
normal_msg x 27,641 ops/sec ±3.93% (86 runs sampled)
simple_msg x 100,764 ops/sec ±5.35% (79 runs sampled)
string_msg x 120,362 ops/sec ±7.11% (74 runs sampled)
```

[icu]: http://userguide.icu-project.org/formatparse/messages
[intl-mf]: https://github.com/formatjs/formatjs
[peg.js]: https://pegjs.org/