<h1 align="center">truncate-html</h1>

<h5 align="center">Truncate an HTML string (even one containing emoji) while keeping its tags intact. You can customize the ellipsis sign, ignore unwanted elements, and truncate HTML by words.</h5>
<div align="center">
  <a href="https://github.com/oe/truncate-html/actions/workflows/main.yml">
    <img src="https://github.com/oe/truncate-html/actions/workflows/main.yml/badge.svg" alt="Github Actions">
  </a>
  <a href="#readme">
    <img src="https://img.shields.io/badge/%3C%2F%3E-typescript-blue" alt="code with typescript">
  </a>
  <a href="#readme">
    <img src="https://badge.fury.io/js/truncate-html.svg" alt="npm version">
  </a>
  <a href="https://www.npmjs.com/package/truncate-html">
    <img src="https://img.shields.io/npm/dm/truncate-html.svg" alt="npm downloads">
  </a>
</div>

<br>

**Notice** This is a Node.js module that depends on [cheerio](https://github.com/cheeriojs/cheerio) and _can only run on Node.js_. If you need a browser version, consider [truncate](https://github.com/pathable/truncate) or [nodejs-html-truncate](https://github.com/huang47/nodejs-html-truncate).

```javascript
const truncate = require('truncate-html')
truncate('<p><img src="xxx.jpg">Hello from earth!</p>', 2, { byWords: true })
// => <p><img src="xxx.jpg">Hello from ...</p>
```

## Installation

`npm install truncate-html` <br>
or <br>
`yarn add truncate-html`

## Try it online

Click **<https://npm.runkit.com/truncate-html>** to try.

## API

```javascript
/**
 * truncate html
 * @method truncate(html, [length], [options])
 * @param {String|CheerioStatic} html html string to truncate, or an existing cheerio instance (aka cheerio $)
 * @param {Object|number} length how many letters (words if `byWords` is true) you want to reserve
 * @param {Object|null} options
 * @param {Boolean} [options.stripTags] remove all tags, default false
 * @param {String} [options.ellipsis] ellipsis sign, default '...'
 * @param {Boolean} [options.decodeEntities] decode html entities (e.g. convert `&amp;` to `&`) before
 *   counting length, default false
 * @param {String|Array} [options.excludes] selector(s) of the elements you want to ignore
 * @param {Number} [options.length] how many letters (words if `byWords` is true)
 *   you want to reserve
 * @param {Boolean} [options.byWords] if true, length means how many words to reserve
 * @param {Boolean|Number} [options.reserveLastWord] what to do when the cut falls in the middle of a word
 *   1. by default, just cut at that position.
 *   2. set it to true to reserve the last word, exceeding `length` by at most 10 letters
 *   3. set it to a positive number to decide how many letters may exceed `length` to reserve the last word
 *   4. set it to a negative number to remove the last word if the cut falls in the middle of it.
 * @param {Boolean} [options.trimTheOnlyWord] whether to trim the only word when `reserveLastWord` < 0
 *   if reserveLastWord is set to a negative number and there is only one word in the html string,
 *   setting trimTheOnlyWord to true cuts the extra letters when the word is longer
 *   than `length`.
 *   see issue #23 for more details
 * @param {Boolean} [options.keepWhitespaces] keep whitespaces; by default continuous
 *   spaces are replaced with one space.
 *   set it to true to reserve them; continuous spaces still count as one
 * @return {String}
 */
truncate(html, [length], [options])
// use truncate.setup to change the default options
truncate.setup(options)
```
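The four `reserveLastWord` cases listed above can be easier to follow on a plain string. The sketch below is only an illustration of those documented rules, not the library's implementation: `cutWithReserve` is a hypothetical helper, it ignores tags, the ellipsis and `trimTheOnlyWord`, and capping the overflow at the allowed number of letters when the word is longer is an assumption.

```js
// Hypothetical helper (not part of truncate-html): applies the documented
// `reserveLastWord` rules to a plain string without tags or ellipsis.
function cutWithReserve(text, length, reserveLastWord = false) {
  if (text.length <= length) return text

  const head = text.slice(0, length)
  const rest = text.slice(length)
  // The cut is "in the middle of a word" when both sides of it are non-space.
  const midWord = /\S$/.test(head) && /^\S/.test(rest)

  // Case 1: default behaviour, or the cut already falls on a word boundary.
  if (!reserveLastWord || !midWord) return head

  // Case 4: a negative number drops the partially cut last word.
  if (reserveLastWord < 0) return head.replace(/\S+$/, '')

  // Cases 2 and 3: allow up to `maxExceed` extra letters to finish the word.
  const maxExceed = reserveLastWord === true ? 10 : reserveLastWord
  const remainderOfWord = rest.match(/^\S+/)[0]
  // Assumption: a longer remainder is capped at `maxExceed` letters.
  return head + remainderOfWord.slice(0, maxExceed)
}

console.log(cutWithReserve('Hello from earth!', 8))       // Hello fr
console.log(cutWithReserve('Hello from earth!', 8, true)) // Hello from
console.log(cutWithReserve('Hello from earth!', 8, 1))    // Hello fro
```

With a negative value the same input yields `'Hello '` in this sketch; whether the library also trims the trailing space is not stated in the docs above.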

### Default options

```js
{
  stripTags: false,
  ellipsis: '...',
  decodeEntities: false,
  excludes: '',
  byWords: false,
  reserveLastWord: false,
  trimTheOnlyWord: false,
  keepWhitespaces: false
}
```

You can change the default options with `truncate.setup`, e.g.:

```js
truncate.setup({ stripTags: true, length: 10 })
truncate('<p><img src="xxx.jpg">Hello from earth!</p>')
// => Hello from
```

Or use an existing [cheerio instance](https://github.com/cheeriojs/cheerio#loading):

```js
import * as cheerio from 'cheerio'
truncate.setup({ stripTags: true, length: 10 })
// the truncate option `decodeEntities` has no effect here;
// configure it in the cheerio options yourself
const $ = cheerio.load('<p><img src="xxx.jpg">Hello from earth!</p>', {
  /** set decodeEntities if you need it */
  decodeEntities: true
  /* any cheerio instance options */
}, false) // the third parameter is the `isDocument` option; set it to false to get rid of extra wrappers, see cheerio's docs for details
truncate($)
// => Hello from
```

## Notice

### Typescript support

This lib is written in TypeScript and ships with a type definition file. ~~You may need to update your `tsconfig.json` by adding `"esModuleInterop": true` to the `compilerOptions` if you encounter some typing errors, see [#19](https://github.com/oe/truncate-html/issues/19).~~

### About final string length

If the length of the html string's content is shorter than `options.length`, no ellipsis is appended to the final html string. If it is longer, the final string length is `options.length` plus the length of `options.ellipsis`. If you set `reserveLastWord` to `true` or a non-zero number, the final length may vary.
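As a quick sanity check of the rule above, the following sketch computes the expected visible text length of a result. `expectedTextLength` is a hypothetical helper, not part of the library; it ignores `reserveLastWord`, and treating content exactly `length` letters long as "not truncated" is an assumption.

```js
// Hypothetical helper (not part of truncate-html): models the documented
// length rule for the visible text of the result, ignoring `reserveLastWord`.
function expectedTextLength(contentLength, { length, ellipsis = '...' }) {
  // Assumption: content exactly `length` letters long is left untouched.
  if (contentLength <= length) return contentLength
  return length + ellipsis.length
}

// 'This is a string for test.' has 26 letters; cut at 10 it becomes
// 'This is a ...' which is 10 + 3 = 13 letters long.
console.log(expectedTextLength(26, { length: 10 }))                // 13
console.log(expectedTextLength(26, { length: 10, ellipsis: '~' })) // 11
console.log(expectedTextLength(5, { length: 10 }))                 // 5
```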

### About html comments

All html comments `<!-- xxx -->` will be removed.

### About dealing with non-alphabetic languages

Non-alphabetic languages such as Chinese, Japanese, and Korean don't separate words with whitespace, so the options `byWords` and `reserveLastWord` only work well with alphabetic languages.

Also, `cheerio`, the only dependency of this project, has an issue when dealing with non-alphabetic languages; see [Known Issues](#known-issues) for details.
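To see why word-based options struggle with CJK text, consider a naive whitespace-based word counter. This is only an illustration of the general problem; `countWords` is a hypothetical helper, and the library's actual word-splitting logic may differ.

```js
// Hypothetical helper (not the library's code): counts words by splitting
// on runs of whitespace, which is how "words" are usually delimited.
function countWords(text) {
  const trimmed = text.trim()
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length
}

console.log(countWords('This is a string for test.')) // 6
// A Chinese sentence contains no whitespace, so it counts as a single "word".
console.log(countWords('这是一个用于测试的字符串')) // 1
```

A whole CJK sentence therefore behaves like one very long word, so a word limit cannot fall inside it.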

### Using existing cheerio instance

If you want to use an existing cheerio instance, the truncate option `decodeEntities` will not work; set it in your own cheerio instance instead:

```js
const cheerio = require('cheerio')
const truncate = require('truncate-html')

const html = '<p><img src="abc.png">This is a string</p> for test.'
const $ = cheerio.load(html, {
  decodeEntities: true
  /** other cheerio options */
}, false) // the third parameter is the `isDocument` option; set it to false to get rid of extra wrappers, see cheerio's docs for details
truncate($, 10)
```

## Examples

```javascript
var truncate = require('truncate-html')

// truncate html
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10)
// returns: <p><img src="abc.png">This is a ...</p>

// truncate string with emojis
var string = '<p>poo 💩💩💩💩💩</p>'
truncate(string, 6)
// returns: <p>poo 💩💩...</p>

// with options, remove all tags
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { stripTags: true })
// returns: This is a ...

// with options, truncate by words.
// if you try to truncate a non-alphabetic language (like CJK)
// it will not act as you wish
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 3, { byWords: true })
// returns: <p><img src="abc.png">This is a ...</p>

// with options, keep whitespaces
var html = '<p> <img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { keepWhitespaces: true })
// returns: <p> <img src="abc.png">This is a ...</p>

// combine length and options
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  stripTags: true
})
// returns: This is a ...

// custom ellipsis sign
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~'
})
// returns: <p><img src="abc.png">This is a ~</p>

// exclude some special elements (by selector); they are removed before counting the content's length
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~',
  excludes: 'img'
})
// returns: <p>This is a ~</p>

// exclude more than one category of elements
var html =
  '<p><img src="abc.png">This is a string</p><div class="something-unwanted"> unwanted string inserted ( ´•̥̥̥ω•̥̥̥` )</div> for test.'
truncate(html, {
  length: 20,
  stripTags: true,
  ellipsis: '~',
  excludes: ['img', '.something-unwanted']
})
// returns: This is a string for~

// handling encoded characters
var html = '<p>&nbsp;test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; encode...</p>

// when decodeEntities is set to false
var html = '<p>&nbsp;test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: false // this is the default value
})
// returns: <p>&nbsp;test for &lt;p...</p>

// there may be a surprise when setting `decodeEntities` to true while handling CJK characters
var html = '<p>&nbsp;test for &lt;p&gt; 中文 string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; &#x4E2D;&#x6587; str...</p>
// to fix this, see below for instructions
```
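The examples above call `truncate` in two shapes: `truncate(html, 10, { ... })` and `truncate(html, { length: 10, ... })`. If you wrap `truncate` in your own function and want to accept both shapes, the arguments could be normalized roughly as below. `normalizeArgs` is a hypothetical helper, not part of the library's API, and letting a numeric second argument win over `options.length` is an assumption.

```js
// Hypothetical helper (not part of truncate-html): folds the two call shapes
// (length, options) and (optionsWithLength) into a single options object.
function normalizeArgs(length, options) {
  // Shape 2: the second argument is already an options object.
  if (typeof length === 'object' && length !== null) {
    return { ...length }
  }
  // Shape 1: a numeric length plus an optional options object.
  const merged = { ...(options || {}) }
  if (typeof length === 'number') merged.length = length
  return merged
}

console.log(normalizeArgs(10, { stripTags: true }))
// { stripTags: true, length: 10 }
console.log(normalizeArgs({ length: 10, ellipsis: '~' }))
// { length: 10, ellipsis: '~' }
```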

For more usage examples, check [truncate.spec.ts](./test/truncate.spec.ts).

## Credits

Thanks to:

- [@calebeno](https://github.com/calebeno) es6 support and unit tests
- [@aaditya-thakkar](https://github.com/aaditya-thakkar) emoji truncating support