# Transliteration

[![Build Status](https://img.shields.io/circleci/project/github/dzcpy/transliteration/master.svg)](https://circleci.com/gh/dzcpy/transliteration)
[![Dependencies](https://img.shields.io/david/dzcpy/transliteration.svg)](https://github.com/dzcpy/transliteration/blob/master/package.json)
[![Dev Dependencies](https://img.shields.io/david/dev/dzcpy/transliteration.svg)](https://github.com/dzcpy/transliteration/blob/master/package.json)
[![Coverage Status](https://coveralls.io/repos/github/dzcpy/transliteration/badge.svg?branch=master)](https://coveralls.io/github/dzcpy/transliteration?branch=master)
[![NPM Version](https://img.shields.io/npm/v/transliteration.svg)](https://www.npmjs.com/package/transliteration)
[![NPM Download](https://img.shields.io/npm/dm/transliteration.svg)](https://www.npmjs.com/package/transliteration)
[![License](https://img.shields.io/npm/l/transliteration.svg)](https://github.com/dzcpy/transliteration/blob/master/LICENSE.txt)

Universal Unicode -> Latin transliteration / slugify module. Works with all major languages and on all platforms.

## Demo

[Try it out](http://dzcpy.github.io/transliteration)

### Compatibility / Browser support

IE 10+ and all modern browsers.

Runs in Node.js, in the browser, in Web Workers, in React Native and as a CLI.

## Installation

### Node.js / React Native

```bash
npm install transliteration --save
```

```javascript
import { transliterate as tr, slugify } from 'transliteration';

tr('你好, world!'); // Ni Hao , world!
slugify('你好, world!'); // ni-hao-world
```

### Browser

__CDN:__

```html
<!-- UMD build -->
<script async defer src="https://cdn.jsdelivr.net/npm/transliteration@2.0.0-alpha1/dist/browser/bundle.umd.min.js"></script>
<!-- ESM build -->
<script async defer src="https://cdn.jsdelivr.net/npm/transliteration@2.0.0-alpha1/dist/browser/bundle.esm.min.js" type="module"></script>
<script type="module">
  import { transl } from './bundle.esm.min.js';
  console.log(transl('你好'));
</script>
```

`transliteration` can be loaded as an AMD / CommonJS module, or as global variables (UMD).

When used in the browser, it creates global variables on the `window` object by default:

```javascript
transl('你好, World'); // window.transl
// or
slugify('Hello, 世界'); // window.slugify
```

### CLI

```bash
npm install transliteration -g

transliterate 你好 # Ni Hao
slugify 你好 # ni-hao
echo 你好 | slugify -S # ni-hao
```

## Usage

### transliterate(str, [options])

Transliterates the string `str` and returns the result. Characters that this module does not recognise are replaced with the placeholder given by the `unknown` option, which defaults to an empty string (`''`).

__Options:__ (optional)

```javascript
{
  /**
   * Leave a list of strings untouched
   * @example tr('你好,世界', { ignore: ['你'] }) // 你 Hao , Shi Jie
   */
  ignore?: string[];
  /**
   * Replace a list of strings / regexes in the source string with the provided target strings before transliteration
   * The option can be either an array or an object
   * @example tr('你好,世界', { replace: {你: 'You'} }) // You Hao , Shi Jie
   * @example tr('你好,世界', { replace: [['你', 'You']] }) // You Hao , Shi Jie
   * @example tr('你好,世界', { replace: [[/你/g, 'You']] }) // You Hao , Shi Jie
   */
  replace?: OptionReplaceCombined;
  /**
   * Same as `replace`, but applied after transliteration
   */
  replaceAfter?: OptionReplaceCombined;
  /**
   * Whether to trim the result string after transliteration
   * @default false
   */
  trim?: boolean;
  /**
   * Characters not known by this library are replaced by the `unknown` string
   * @default ''
   */
  unknown?: string;
}
```

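The `replace` option accepts either an object or an array of pairs, and each search term may be a string or a regex. The sketch below shows one way those shapes could be normalized and applied to a source string. It is an illustration only, not the library's source: `normalizeReplace`, `escapeRegExp` and `applyReplace` are hypothetical helper names, and matching plain strings globally is an assumption.

```javascript
// Illustrative only: not the library's actual implementation.
// Normalize the two accepted shapes of `replace` into [search, target] pairs.
function normalizeReplace(replace) {
  if (Array.isArray(replace)) return replace;
  return Object.entries(replace || {});
}

// Escape a plain string so it can be used safely inside a RegExp.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Apply every pair in order. Plain strings are matched globally (an assumption).
function applyReplace(source, replace) {
  return normalizeReplace(replace).reduce((result, [search, target]) => {
    const pattern = search instanceof RegExp
      ? search
      : new RegExp(escapeRegExp(search), 'g');
    return result.replace(pattern, target);
  }, source);
}

console.log(applyReplace('你好,世界', { 你: 'You' }));    // You好,世界
console.log(applyReplace('你好,世界', [['你', 'You']]));  // You好,世界
console.log(applyReplace('你好,世界', [[/你/g, 'You']])); // You好,世界
```

In this sketch the replacement happens before any transliteration step, which mirrors the description of `replace`; a `replaceAfter` equivalent would run the same helper on the transliterated output instead.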
### transliterate.config([optionsObj])

Binds options globally so that any following calls use `optionsObj` by default. If the `optionsObj` argument is omitted, it returns the current default options object.

```javascript
transliterate.config({ replace: [['你好', 'Hello']] });
transliterate('你好, world!'); // Result: 'Hello, world!'. This equals transliterate('你好, world!', { replace: [['你好', 'Hello']] });
```

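The general pattern behind a `.config()` method like this can be sketched as a function that carries its own mutable defaults. This is a minimal illustration under stated assumptions, not the library's code: `makeConfigurable` and `shout` are hypothetical names, and merging new defaults into the old ones (rather than replacing them) is an assumption.

```javascript
// Illustrative only: a hypothetical helper, not part of the library.
function makeConfigurable(fn, builtInDefaults) {
  let defaults = { ...builtInDefaults };

  // Per-call options override the globally bound ones.
  const wrapped = (str, options = {}) => fn(str, { ...defaults, ...options });

  // With an argument: bind new defaults. Without: return a copy of the current ones.
  wrapped.config = (optionsObj) => {
    if (optionsObj === undefined) return { ...defaults };
    defaults = { ...defaults, ...optionsObj };
    return { ...defaults };
  };
  return wrapped;
}

// A toy function standing in for `transliterate` / `slugify`.
const shout = makeConfigurable(
  (str, { suffix }) => str.toUpperCase() + suffix,
  { suffix: '!' }
);

console.log(shout('hello'));                  // HELLO!
shout.config({ suffix: '?' });
console.log(shout('hello'));                  // HELLO?
console.log(shout('hello', { suffix: '.' })); // HELLO.
console.log(shout.config());                  // { suffix: '?' }
```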
__Examples:__

```javascript
import { transliterate as tr } from 'transliteration';
tr('你好,世界'); // Ni Hao , Shi Jie
tr('Γεια σας, τον κόσμο'); // Geia sas, ton kosmo
tr('안녕하세요, 세계'); // annyeonghaseyo, segye
tr('你好,世界', { replace: {你: 'You'}, ignore: ['好'] }); // You 好, Shi Jie
tr('你好,世界', { replace: [['你', 'You']], ignore: ['好'] }); // You 好, Shi Jie (option in array form)
// or use configurations
tr.config({ replace: [['你', 'You']], ignore: ['好'] });
tr('你好,世界'); // You 好, Shi Jie
// get configurations
console.log(tr.config());
```

### slugify(str, [options])

Converts a Unicode string to a slug so that it can be safely used in a URL or file name.

__Options:__ (optional)

```javascript
{
  /**
   * Leave a list of strings untouched
   * @example tr('你好,世界', { ignore: ['你'] }) // 你 Hao , Shi Jie
   */
  ignore?: string[];
  /**
   * Replace a list of strings / regexes in the source string with the provided target strings before transliteration
   * The option can be either an array or an object
   * @example tr('你好,世界', { replace: {你: 'You'} }) // You Hao , Shi Jie
   * @example tr('你好,世界', { replace: [['你', 'You']] }) // You Hao , Shi Jie
   * @example tr('你好,世界', { replace: [[/你/g, 'You']] }) // You Hao , Shi Jie
   */
  replace?: OptionReplaceCombined;
  /**
   * Same as `replace`, but applied after transliteration
   */
  replaceAfter?: OptionReplaceCombined;
  /**
   * Whether to trim the result string after transliteration
   * @default false
   */
  trim?: boolean;
  /**
   * Characters not known by this library are replaced by the `unknown` string
   * @default ''
   */
  unknown?: string;
  /**
   * Whether the result needs to be converted to lowercase
   * @default true
   */
  lowercase?: boolean;
  /**
   * Whether the result needs to be converted to uppercase
   * @default false
   */
  uppercase?: boolean;
  /**
   * Custom separator string
   * @default '-'
   */
  separator?: string;
  /**
   * Allowed characters.
   * When `allowedChars` is set to `'abc'`, only characters matching `/[abc]/g` are preserved.
   * All other characters are converted to `separator`
   * @default 'a-zA-Z0-9-_.~'
   */
  allowedChars?: string;
}
```

If `options` is not provided, the default values above are used.

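To make the interaction between `allowedChars`, `separator`, `lowercase` and `uppercase` more concrete, here is a rough sketch of the slug-building step on text that is already Latin. It is an illustration under stated assumptions, not the library's source: `slugifyLatin` is a hypothetical name, and the details (collapsing repeated separators, stripping them from both ends, applying case conversion last) are assumptions.

```javascript
// Illustrative only: works on already-transliterated text; not the library's code.
function slugifyLatin(text, options = {}) {
  const {
    lowercase = true,
    uppercase = false,
    separator = '-',
    allowedChars = 'a-zA-Z0-9-_.~',
  } = options;

  // Escape the separator so it is safe inside a RegExp.
  const sep = separator.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

  // Every run of characters outside the allowed set becomes one separator.
  let slug = text.replace(new RegExp(`[^${allowedChars}]+`, 'g'), separator);

  if (sep) {
    slug = slug
      // Collapse repeated separators, then strip them from both ends.
      .replace(new RegExp(`(?:${sep})+`, 'g'), separator)
      .replace(new RegExp(`^(?:${sep})+|(?:${sep})+$`, 'g'), '');
  }

  if (lowercase) slug = slug.toLowerCase();
  if (uppercase) slug = slug.toUpperCase();
  return slug;
}

console.log(slugifyLatin('Ni Hao , Shi Jie'));
// ni-hao-shi-jie
console.log(slugifyLatin('Ni Hao , Shi Jie', { lowercase: false, separator: '_' }));
// Ni_Hao_Shi_Jie
console.log(slugifyLatin('a b x c', { allowedChars: 'abc' }));
// a-b-c
```

Because `allowedChars` is interpolated into a character class, it is written in character-class syntax (ranges such as `a-z`), which matches the `/[abc]/g` wording in the option description.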
### slugify.config([optionsObj])

Binds options globally so that any following calls use `optionsObj` by default. If the `optionsObj` argument is omitted, it returns the current default options object.

```javascript
slugify.config({ replace: [['你好', 'Hello']] });
slugify('你好, world!'); // Result: 'hello-world'. This equals slugify('你好, world!', { replace: [['你好', 'Hello']] });
```

__Example:__

### Node.js / webpack

```javascript
import { slugify } from 'transliteration';
slugify('你好,世界'); // ni-hao-shi-jie
slugify('你好,世界', { lowercase: false, separator: '_' }); // Ni_Hao_Shi_Jie
slugify('你好,世界', { replace: {你好: 'Hello', 世界: 'world'}, separator: '_' }); // hello_world
slugify('你好,世界', { replace: [['你好', 'Hello'], ['世界', 'world']], separator: '_' }); // hello_world (option in array form)
slugify('你好,世界', { ignore: ['你好'] }); // 你好shi-jie
// or use configurations
slugify.config({ lowercase: false, separator: '_' });
slugify('你好,世界'); // Ni_Hao_Shi_Jie
// get configurations
console.log(slugify.config());
```

If the variable names conflict with other libraries in your project, or you prefer not to use global variables, use `noConflict()` before loading the libraries that contain the conflicting variables.

### CLI

```
➜ ~ transliterate --help
Usage: transliterate <unicode> [options]

Options:
  --version      Show version number                  [boolean]
  -u, --unknown  Placeholder for unknown characters   [string] [default: ""]
  -r, --replace  Custom string replacement            [array] [default: []]
  -i, --ignore   String list to ignore                [array] [default: []]
  -S, --stdin    Use stdin as input                   [boolean] [default: false]
  -h, --help                                          [boolean]

Examples:
  transliterate "你好, world!" -r 好=good -r   Replace `,` into `!`, `world` into `shijie`.
  "world=Shi Jie"                             Result: Ni good, Shi Jie!
  transliterate "你好,世界!" -i 你好 -i ,     Ignore `你好` and `,`.
                                              Result: 你好,Shi Jie !
```

```
➜ ~ slugify --help
Usage: slugify <unicode> [options]

Options:
  --version        Show version number                  [boolean]
  -U, --unknown    Placeholder for unknown characters   [string] [default: ""]
  -l, --lowercase  Returns result in lowercase          [boolean] [default: true]
  -u, --uppercase  Returns result in uppercase          [boolean] [default: false]
  -s, --separator  Separator of the slug                [string] [default: "-"]
  -r, --replace    Custom string replacement            [array] [default: []]
  -i, --ignore     String list to ignore                [array] [default: []]
  -S, --stdin      Use stdin as input                   [boolean] [default: false]
  -h, --help                                            [boolean]

Examples:
  slugify "你好, world!" -r 好=good -r "world=Shi   Replace `,` into `!` and `world` into
  Jie"                                             `shijie`.
                                                   Result: ni-good-shi-jie
  slugify "你好,世界!" -i 你好 -i ,                Ignore `你好` and `,`.
                                                   Result: 你好,shi-jie
```


## Change log

### 2.0.0

* **CDN file path changes**
* The entire module was refactored in TypeScript, with a big performance improvement as well as a reduced package size.
* Better code quality. 100% unit tested.
* `bower` support was dropped. Please use the CDN, or use the package together with a JS bundler like `webpack` or `rollup`.
* In accordance with RFC 3986, more characters (`/a-zA-Z0-9-_.~/`) are kept in the result of `slugify`.

### 1.6.6

* Added support for `TypeScript`. #77

### 1.5.0

* Minimum node requirement: 6.0+

### 1.0.0

* The code has been entirely refactored since version 1.0.0. Be careful when you plan to upgrade from v0.1.x or v0.2.x to v1.0.x.

__Changes:__

* The `options` parameter of `transliterate` is now an `Object` (in 0.1.x it was a string, `unknown`).
* Added `transliterate.config` and `slugify.config`.
* Unknown characters are transliterated as `[?]` instead of `?`.
* In the browser, global variables have been changed to `window.transl` and `window.slugify`. Other global variables have been removed.


## Caveats

Currently, `transliteration` only supports a 1-to-1 code map (from Unicode to Latin). This is the simplest approach to implement, but it has limitations when dealing with polyphonic characters. It does not work well with all languages, so please test all possible situations before using it. Some known issues are:

* __Chinese:__ Polyphonic characters are not always transliterated correctly. Alternative: `pinyin`.

* __Japanese:__ Most Japanese Kanji characters are transliterated into Chinese Pinyin because of the overlapping code map in Unicode. There are also many polyphonic characters in Japanese, which makes it impossible to transliterate Japanese Kanji correctly without tokenizing the sentence. Consider using `kuroshiro` for a better Kanji -> Romaji conversion.

* __Thai:__ Thai is currently not working. If you know how to make it work, please contact me.

* __Cyrillic:__ Cyrillic characters overlap between a few languages. The result might be inaccurate in some specific languages, for example Bulgarian.

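The polyphonic-character limitation above follows directly from looking up each character on its own. The sketch below uses a hypothetical three-entry code map (not the library's data) and a hypothetical `naiveTransliterate` helper to show why a 1-to-1 map cannot pick a reading from context: 行 is read "xíng" in 行走 but "háng" in 银行, yet a single map entry can hold only one of them.

```javascript
// Illustrative only: a tiny hypothetical code map, not the library's data.
const codeMap = new Map([
  ['银', 'Yin'],
  ['行', 'Xing'], // only one reading can be stored per character
  ['走', 'Zou'],
]);

// Look each character up on its own; the surrounding context is never consulted.
function naiveTransliterate(str) {
  return Array.from(str, (ch) => codeMap.get(ch) ?? ch).join(' ');
}

console.log(naiveTransliterate('行走')); // Xing Zou (matches the usual reading, xíng zǒu)
console.log(naiveTransliterate('银行')); // Yin Xing (the usual reading is yín háng)
```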
If you find any other issues, please raise a ticket.

### License

MIT