<h1 align="center">truncate-html</h1>

<h5 align="center">Truncate an HTML string (even one containing emoji) while keeping its tags intact. You can customize the ellipsis sign, ignore unwanted elements, and truncate HTML by words.</h5>
<div align="center">
  <a href="https://github.com/oe/truncate-html/actions/workflows/main.yml">
    <img src="https://github.com/oe/truncate-html/actions/workflows/main.yml/badge.svg" alt="GitHub Actions">
  </a>
  <a href="#readme">
    <img src="https://img.shields.io/badge/%3C%2F%3E-typescript-blue" alt="code with typescript">
  </a>
  <a href="#readme">
    <img src="https://badge.fury.io/js/truncate-html.svg" alt="npm version">
  </a>
  <a href="https://www.npmjs.com/package/truncate-html">
    <img src="https://img.shields.io/npm/dm/truncate-html.svg" alt="npm downloads">
  </a>
</div>

<br>

**Notice** This is a Node.js module that depends on [cheerio](https://github.com/cheeriojs/cheerio), so it _can only run on Node.js_. If you need a browser version, consider [truncate](https://github.com/pathable/truncate) or [nodejs-html-truncate](https://github.com/huang47/nodejs-html-truncate).

```javascript
const truncate = require('truncate-html')
truncate('<p><img src="xxx.jpg">Hello from earth!</p>', 2, { byWords: true })
// => <p><img src="xxx.jpg">Hello from ...</p>
```

## Installation

`npm install truncate-html` <br>
or <br>
`yarn add truncate-html`

## Try it online

Click **<https://npm.runkit.com/truncate-html>** to try.

## API

```javascript
/**
 * truncate html
 * @method truncate(html, [length], [options])
 * @param {String|CheerioStatic} html HTML string to truncate, or an existing cheerio instance (aka cheerio $)
 * @param {Object|number} length how many letters (words if `byWords` is true) you want to keep
 * @param {Object|null} options
 * @param {Boolean} [options.stripTags] remove all tags, default false
 * @param {String} [options.ellipsis] ellipsis sign, default '...'
 * @param {Boolean} [options.decodeEntities] decode HTML entities (e.g. convert `&amp;` to `&`) before
 *   counting length, default false
 * @param {String|Array} [options.excludes] selector(s) of the elements you want to ignore
 * @param {Number} [options.length] how many letters (words if `byWords` is true)
 *   you want to keep
 * @param {Boolean} [options.byWords] if true, length means how many words to keep
 * @param {Boolean|Number} [options.reserveLastWord] what to do when the cut falls in the middle of a word
 *   1. by default, just cut at that position.
 *   2. set it to true to let the result exceed `length` by at most 10 letters in order to keep the last word
 *   3. set it to a positive number to decide how many letters the result may exceed `length` by in order to keep the last word
 *   4. set it to a negative number to remove the last word if the cut falls in the middle of it.
 * @param {Boolean} [options.trimTheOnlyWord] whether to trim the only word when `reserveLastWord` < 0
 *   if reserveLastWord is set to a negative number and there is only one word in the HTML string,
 *   setting trimTheOnlyWord to true cuts off the extra letters when the word is longer
 *   than `length`.
 *   see issue #23 for more details
 * @param {Boolean} [options.keepWhitespaces] keep whitespaces; by default, consecutive
 *   spaces are replaced with one space
 *   set it to true to keep them; consecutive spaces will still count as one
 * @return {String}
 */
truncate(html, [length], [options])
// and truncate.setup to change default options
truncate.setup(options)
```

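To make the effect of `options.decodeEntities` on length counting concrete, here is a minimal sketch in plain JavaScript. `decodeBasicEntities` is a hypothetical helper written for this illustration, not part of truncate-html or cheerio, and it only handles the few entities used here. It shows why the same text can have two different lengths depending on whether entities are decoded before counting.

```javascript
// Hypothetical helper for illustration only; not part of truncate-html.
// It decodes just the handful of entities needed for this example.
const ENTITIES = { '&lt;': '<', '&gt;': '>', '&amp;': '&', '&quot;': '"' }

function decodeBasicEntities(text) {
  return text.replace(/&(?:lt|gt|amp|quot);/g, (entity) => ENTITIES[entity])
}

const raw = 'test for &lt;p&gt;'
const decoded = decodeBasicEntities(raw)

console.log(raw.length)     // 18: each entity is counted as several characters
console.log(decoded.length) // 12: each entity is counted as one character
```

With a fixed `length`, counting decoded text therefore leaves room for more visible characters than counting the raw entity text.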
### Default options

```js
{
  stripTags: false,
  ellipsis: '...',
  decodeEntities: false,
  excludes: '',
  byWords: false,
  reserveLastWord: false,
  trimTheOnlyWord: false,
  keepWhitespaces: false
}
```

You can change default options by using `truncate.setup`

e.g.

```js
truncate.setup({ stripTags: true, length: 10 })
truncate('<p><img src="xxx.jpg">Hello from earth!</p>')
// => Hello from
```

or use an existing [cheerio instance](https://github.com/cheeriojs/cheerio#loading)

```js
import * as cheerio from 'cheerio'
truncate.setup({ stripTags: true, length: 10 })
// the truncate option `decodeEntities` has no effect on an existing cheerio instance;
// configure it in the cheerio options yourself
const $ = cheerio.load('<p><img src="xxx.jpg">Hello from earth!</p>', {
  /** set decodeEntities if you need it */
  decodeEntities: true
  /* any other cheerio instance options */
}, false) // the third parameter is the `isDocument` option; set it to false to avoid extra wrapper elements, see cheerio's docs for details
truncate($)
// => Hello from
```

## Notice

### Typescript support

This library is written in TypeScript and ships with a type definition file. ~~You may need to update your `tsconfig.json` by adding `"esModuleInterop": true` to the `compilerOptions` if you encounter some typing errors, see [#19](https://github.com/oe/truncate-html/issues/19).~~

### About final string length

If the HTML content is shorter than `options.length`, no ellipsis is appended to the final HTML string. If it is longer, the length of the final text is `options.length` plus the length of `options.ellipsis`. If you set `reserveLastWord` to true or to a non-zero number, the final length may vary.

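The length rule above can be checked against one of the outputs documented in the Examples section. The sketch below is illustrative only: `visibleLength` is a hypothetical helper (not part of truncate-html) that strips tags with a simple regular expression, which is adequate for this well-formed sample but is not a general HTML parser.

```javascript
// Hypothetical helper for illustration; not part of truncate-html.
// Strips tags with a simple regex and counts Unicode code points
// rather than UTF-16 code units.
function visibleLength(html) {
  const text = html.replace(/<[^>]*>/g, '')
  return Array.from(text).length
}

const maxLength = 10
const ellipsis = '...'
// Output documented in the Examples section for `truncate(html, 10)`.
const truncated = '<p><img src="abc.png">This is a ...</p>'

console.log(visibleLength(truncated))                // 13
console.log(maxLength + Array.from(ellipsis).length) // 13
```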
### About html comments

All HTML comments (`<!-- xxx -->`) will be removed.

### About dealing with non-alphabetic languages

Non-alphabetic languages such as Chinese, Japanese, and Korean do not separate words with whitespace, so the `byWords` and `reserveLastWord` options only work well with alphabetic languages.

Also, `cheerio`, the only dependency of this project, has an issue when dealing with non-alphabetic languages; see [Known Issues](#known-issues) for details.

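A quick way to see the problem is to count words by splitting on whitespace. The sketch below is an assumption about the general approach, not truncate-html's actual code, and `countWordsByWhitespace` is a hypothetical helper: an English sentence splits into several words, while a Chinese sentence of similar meaning is treated as a single word.

```javascript
// Naive whitespace-based word counting, for illustration only.
// This is an assumption about the general approach, not truncate-html's code.
function countWordsByWhitespace(text) {
  return text.trim().split(/\s+/).filter(Boolean).length
}

console.log(countWordsByWhitespace('This is a string')) // 4
console.log(countWordsByWhitespace('这是一个字符串'))      // 1
```

Any word-based limit applied to the second string would either keep the whole sentence or drop it entirely.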
### Using existing cheerio instance

If you pass an existing cheerio instance, the truncate option `decodeEntities` has no effect; set it in your own cheerio instance instead:

```js
const cheerio = require('cheerio')
const truncate = require('truncate-html')

var html = '<p><img src="abc.png">This is a string</p> for test.'
const $ = cheerio.load(html, {
  decodeEntities: true
  /** other cheerio options */
}, false) // the third parameter is the `isDocument` option; set it to false to avoid extra wrapper elements, see cheerio's docs for details
truncate($, 10)
```

## Examples

```javascript
var truncate = require('truncate-html')

// truncate html
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10)
// returns: <p><img src="abc.png">This is a ...</p>

// truncate string with emojis
var string = '<p>poo 💩💩💩💩💩</p>'
truncate(string, 6)
// returns: <p>poo 💩💩...</p>

// with options, remove all tags
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { stripTags: true })
// returns: This is a ...

// with options, truncate by words.
// if you try to truncate a non-alphabetic language (like CJK)
// it will not act as you wish
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 3, { byWords: true })
// returns: <p><img src="abc.png">This is a ...</p>

// with options, keep whitespaces
var html = '<p> <img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { keepWhitespaces: true })
// returns: <p> <img src="abc.png">This is a ...</p>

// combine length and options
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  stripTags: true
})
// returns: This is a ...

// custom ellipsis sign
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~'
})
// returns: <p><img src="abc.png">This is a ~</p>

// exclude some special elements (by selector); they will be removed before counting the content's length
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~',
  excludes: 'img'
})
// returns: <p>This is a ~</p>

// exclude more than one category of elements
var html =
  '<p><img src="abc.png">This is a string</p><div class="something-unwanted"> unwanted string inserted ( ´•̥̥̥ω•̥̥̥` )</div> for test.'
truncate(html, {
  length: 20,
  stripTags: true,
  ellipsis: '~',
  excludes: ['img', '.something-unwanted']
})
// returns: This is a string for~

// handling encoded characters
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; encode...</p>

// when decodeEntities is set to false
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: false // this is the default value
})
// returns: <p> test for &lt;p...</p>

// and there may be a surprise when setting `decodeEntities` to true while handling CJK characters
var html = '<p> test for &lt;p&gt; 中文 string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; &#x4E2D;&#x6587; str...</p>
// to fix this, see below for instructions
```

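The emoji example is worth a closer look, because JavaScript strings are indexed by UTF-16 code units and each 💩 occupies two of them. The sketch below uses only built-in string and array methods to show why counting by code point gives the result documented above while a naive `slice` does not. It illustrates the general pitfall and makes no claim about how truncate-html is implemented internally.

```javascript
const text = 'poo 💩💩💩💩💩'

// String.prototype.slice works on UTF-16 code units; each 💩 takes two,
// so an odd cut position could even split an emoji in half.
const byCodeUnits = text.slice(0, 6)
// Array.from iterates by code point, so each 💩 counts as one character.
const byCodePoints = Array.from(text).slice(0, 6).join('')

console.log(byCodeUnits)  // poo 💩
console.log(byCodePoints) // poo 💩💩
```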
For more usage examples, see [truncate.spec.ts](./test/truncate.spec.ts).

## Credits

Thanks to:

- [@calebeno](https://github.com/calebeno) es6 support and unit tests
- [@aaditya-thakkar](https://github.com/aaditya-thakkar) emoji truncating support