<h1 align="center">truncate-html</h1>

<h5 align="center"> Truncate an HTML string (even one that contains emoji) while keeping its tags intact. You can customize the ellipsis sign, exclude unwanted elements, and truncate by words. </h5>
<div align="center">
<a href="https://github.com/oe/truncate-html/actions/workflows/main.yml">
<img src="https://github.com/oe/truncate-html/actions/workflows/main.yml/badge.svg" alt="Github Actions">
</a>
<a href="#readme">
<img src="https://img.shields.io/badge/%3C%2F%3E-typescript-blue" alt="code with typescript">
</a>
<a href="#readme">
<img src="https://badge.fury.io/js/truncate-html.svg" alt="npm version">
</a>
<a href="https://www.npmjs.com/package/truncate-html">
<img src="https://img.shields.io/npm/dm/truncate-html.svg" alt="npm downloads">
</a>
</div>

<br>

**Notice** This is a Node.js module that depends on [cheerio](https://github.com/cheeriojs/cheerio), so it _can only run on Node.js_. If you need a browser version, consider [truncate](https://github.com/pathable/truncate) or [nodejs-html-truncate](https://github.com/huang47/nodejs-html-truncate).

```javascript
const truncate = require('truncate-html')
truncate('<p><img src="xxx.jpg">Hello from earth!</p>', 2, { byWords: true })
// => <p><img src="xxx.jpg">Hello from ...</p>
```

## Installation

`npm install truncate-html` <br>
or <br>
`yarn add truncate-html`

## Try it online

Click **<https://npm.runkit.com/truncate-html>** to try.

## API

```javascript
/**
 * truncate html
 * @method truncate(html, [length], [options])
 * @param  {String|CheerioStatic}  html    html string to truncate, or an existing cheerio instance (aka cheerio $)
 * @param  {Object|number}         length  how many letters (words if `byWords` is true) you want to keep
 * @param  {Object|null}           options
 * @param  {Boolean}               [options.stripTags] remove all tags, default false
 * @param  {String}                [options.ellipsis] ellipsis sign, default '...'
 * @param  {Boolean}               [options.decodeEntities] decode html entities (e.g. convert `&amp;` to `&`) before
 *                                                          counting length, default false
 * @param  {String|Array}          [options.excludes] selector(s) of the elements you want to ignore
 * @param  {Number}                [options.length] how many letters (words if `byWords` is true)
 *                                                  you want to keep
 * @param  {Boolean}               [options.byWords] if true, length means how many words to keep
 * @param  {Boolean|Number}        [options.reserveLastWord] what to do when the cut falls in the middle of a word
 *                                   1. by default, just cut at that position.
 *                                   2. set it to true to keep the last word, exceeding the length by at most 10 letters
 *                                   3. set it to a positive number to decide how many letters may be exceeded to keep the last word
 *                                   4. set it to a negative number to remove the last word if the cut falls in the middle of it.
 * @param  {Boolean}               [options.trimTheOnlyWord] whether to trim the only word when `reserveLastWord` < 0
 *                                   if reserveLastWord is set to a negative number and there is only one word in the html string,
 *                                   then when trimTheOnlyWord is true, the extra letters will be cut off if the word is longer
 *                                   than `length`.
 *                                   see issue #23 for more details
 * @param  {Boolean}               [options.keepWhitespaces] keep whitespaces; by default, consecutive
 *                                   spaces are replaced with one space.
 *                                   set it to true to preserve them; consecutive spaces still count as one
 * @return {String}
 */
truncate(html, [length], [options])
// and truncate.setup to change default options
truncate.setup(options)
```

### Default options

```js
{
  stripTags: false,
  ellipsis: '...',
  decodeEntities: false,
  excludes: '',
  byWords: false,
  reserveLastWord: false,
  trimTheOnlyWord: false,
  keepWhitespaces: false
}
```

You can change the default options with `truncate.setup`.

For example:

```js
truncate.setup({ stripTags: true, length: 10 })
truncate('<p><img src="xxx.jpg">Hello from earth!</p>')
// => Hello from
```

Or use an existing [cheerio instance](https://github.com/cheeriojs/cheerio#loading):

```js
import * as cheerio from 'cheerio'
truncate.setup({ stripTags: true, length: 10 })
// the truncate option `decodeEntities` has no effect here;
// configure it in the cheerio options yourself
const $ = cheerio.load('<p><img src="xxx.jpg">Hello from earth!</p>', {
  /** set decodeEntities if you need it */
  decodeEntities: true
  /* any cheerio instance options */
}, false) // the third parameter is the `isDocument` option; set it to false to avoid extra wrappers, see cheerio's docs for details
truncate($)
// => Hello from
```

## Notice

### Typescript support

This lib is written in TypeScript and ships with a type definition file. ~~You may need to update your `tsconfig.json` by adding `"esModuleInterop": true` to the `compilerOptions` if you encounter some typing errors, see [#19](https://github.com/oe/truncate-html/issues/19).~~

### About final string length

If the text content of the html string is shorter than `options.length`, no ellipsis is appended to the final html string. If it is longer, the length of the final text is `options.length` plus the length of `options.ellipsis`. If you set `reserveLastWord` to `true` or a non-zero number, the final length may vary.
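
The rule above can be sketched in plain JavaScript. This is only a model of the arithmetic described in this section, not the library's code: `expectedTextLength` is a hypothetical helper, it assumes `reserveLastWord` and `byWords` are left at their defaults, and treating content of exactly `length` letters as "fits" is an assumption.

```js
// Hypothetical helper that models the length rule described above.
// Assumes reserveLastWord is false and byWords is false.
function expectedTextLength(contentLength, length, ellipsis = '...') {
  // Content that already fits is returned without an ellipsis.
  // (Counting the equal-length case as fitting is an assumption.)
  if (contentLength <= length) return contentLength
  // Otherwise `length` letters are kept and the ellipsis is appended.
  return length + ellipsis.length
}

console.log(expectedTextLength(5, 10)) // 5
console.log(expectedTextLength(26, 10)) // 13
console.log(expectedTextLength(26, 10, '~')) // 11
```

For instance, `This is a string for test.` has 26 letters, so truncating it to 10 with the default ellipsis yields 13 letters of text, which matches `This is a ...` in the examples.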

### About html comments

All html comments `<!-- xxx -->` will be removed.

### About dealing with non-alphabetic languages

Non-alphabetic languages such as Chinese, Japanese, and Korean do not separate words with whitespace, so the options `byWords` and `reserveLastWord` only work well with alphabetic languages.

The only dependency of this project, `cheerio`, also has an issue when dealing with non-alphabetic languages; see [Known Issues](#known-issues) for details.
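
To see why whitespace-based word handling breaks down for CJK text, here is a small plain-JavaScript sketch. It only illustrates the whitespace point made in this section; `countWords` is a hypothetical helper, and the library's own word detection may differ.

```js
// Hypothetical helper: count words by splitting on whitespace,
// as a rough model of word-based truncation.
function countWords(text) {
  const trimmed = text.trim()
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length
}

console.log(countWords('Hello from earth!')) // 3
console.log(countWords('你好，来自地球！')) // 1
```

Because the Chinese sentence contains no whitespace, it is seen as a single word, so a word-based limit cannot cut it at a meaningful position.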

### Using existing cheerio instance

If you want to use an existing cheerio instance, the truncate option `decodeEntities` will not work; set it on your own cheerio instance instead:

```js
const cheerio = require('cheerio')
const truncate = require('truncate-html')

var html = '<p><img src="abc.png">This is a string</p> for test.'
const $ = cheerio.load(html, {
  decodeEntities: true
  /** other cheerio options */
}, false) // the third parameter is the `isDocument` option; set it to false to avoid extra wrappers, see cheerio's docs for details
truncate($, 10)
```

## Examples

```javascript
var truncate = require('truncate-html')

// truncate html
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10)
// returns: <p><img src="abc.png">This is a ...</p>

// truncate string with emojis
var string = '<p>poo 💩💩💩💩💩</p>'
truncate(string, 6)
// returns: <p>poo 💩💩...</p>

// with options, remove all tags
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { stripTags: true })
// returns: This is a ...

// with options, truncate by words.
// if you try to truncate a non-alphabetic language (like CJK)
// it will not behave as you expect
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, 3, { byWords: true })
// returns: <p><img src="abc.png">This is a ...</p>

// with options, keep whitespaces
var html = '<p> <img src="abc.png">This is a string</p> for test.'
truncate(html, 10, { keepWhitespaces: true })
// returns: <p> <img src="abc.png">This is a ...</p>

// combine length and options
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  stripTags: true
})
// returns: This is a ...

// custom ellipsis sign
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~'
})
// returns: <p><img src="abc.png">This is a ~</p>

// exclude some elements (by selector); they are removed before the content's length is counted
var html = '<p><img src="abc.png">This is a string</p> for test.'
truncate(html, {
  length: 10,
  ellipsis: '~',
  excludes: 'img'
})
// returns: <p>This is a ~</p>

// exclude more than one kind of element
var html =
  '<p><img src="abc.png">This is a string</p><div class="something-unwanted"> unwanted string inserted ( ´•̥̥̥ω•̥̥̥` )</div> for test.'
truncate(html, {
  length: 20,
  stripTags: true,
  ellipsis: '~',
  excludes: ['img', '.something-unwanted']
})
// returns: This is a string for~

// handling encoded characters
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; encode...</p>

// with decodeEntities set to false
var html = '<p> test for &lt;p&gt; encoded string</p>'
truncate(html, {
  length: 20,
  decodeEntities: false // this is the default value
})
// returns: <p> test for &lt;p...</p>

// setting `decodeEntities` to true may give a surprising result when handling CJK characters
var html = '<p> test for &lt;p&gt; 中文 string</p>'
truncate(html, {
  length: 20,
  decodeEntities: true
})
// returns: <p> test for &lt;p&gt; &#x4E2D;&#x6587; str...</p>
// to fix this, see below for instructions
```

For more usage examples, check [truncate.spec.ts](./test/truncate.spec.ts).

## Credits

Thanks to:

- [@calebeno](https://github.com/calebeno) es6 support and unit tests
- [@aaditya-thakkar](https://github.com/aaditya-thakkar) emoji truncating support