## HTML cleaner and beautifier

[![NPM Stats](https://nodei.co/npm/clean-html.png?downloads=true&downloadRank=true)](https://npmjs.org/packages/clean-html/)

Do you have crappy HTML? I do!

```html
<table width="100%" border="0" cellspacing="0" cellpadding="0">
 <tr>
 <td height="31"><b>Currently we have these articles available:</b>

 <blockquote>
 <p><a href="foo.html">The History of Foo</a><br />
 An <span color="red">informative</span> piece of <FONT FACE="ARIAL">information</FONT>.</p>
 <p><a href="bar.html">A Horse Walked Into a Bar</a><br/> The bartender said
 "Why the long face?"</p>
 </blockquote>
 </td>
 </tr>
 </table>
```

Just look at those blank lines and random line breaks, trailing spaces, mixed tabs, deprecated tags - it's outrageous!

Let's clean it up...

```bash
$ npm install clean-html
```
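
```javascript
var cleaner = require('clean-html'),
    fs = require('fs'),
    file = process.argv[2]; // path to the HTML file, passed on the command line

fs.readFile(file, 'utf-8', function (err, data) {
    // Stop early if the file could not be read
    if (err) throw err;

    // Write the cleaned markup to standard output
    process.stdout.write(cleaner.clean(data) + '\n');
});
```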

Sanity restored!

```html
<table>
  <tr>
    <td>
      <b>Currently we have these articles available:</b>
      <blockquote>
        <p>
          <a href="foo.html">The History of Foo</a><br>
          An <span>informative</span> piece of information.
        </p>
        <p>
          <a href="bar.html">A Horse Walked Into a Bar</a><br>
          The bartender said "Why the long face?"
        </p>
      </blockquote>
    </td>
  </tr>
</table>
```

## Options

### attr-to-remove

Attributes to remove from markup.

Type: Array
Default: `['align', 'valign', 'bgcolor', 'color', 'width', 'height', 'border', 'cellpadding', 'cellspacing']`
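
To make the effect of this list concrete, here is a minimal sketch of attribute stripping. It is not how clean-html is implemented: `removeAttributes` is a hypothetical helper, and the regex approach assumes plain attribute names and attribute values that do not contain `>`.

```javascript
// Hypothetical helper, not part of clean-html: strips the listed attributes
// from every opening tag in an HTML string.
function removeAttributes(html, attrsToRemove) {
    // Only look inside opening tags, so ordinary text is left alone
    return html.replace(/<[a-zA-Z][^>]*>/g, function (tag) {
        attrsToRemove.forEach(function (name) {
            // Whitespace, the attribute name, "=", then a quoted or bare value
            var pattern = new RegExp('\\s+' + name + '\\s*=\\s*("[^"]*"|\'[^\']*\'|[^\\s>]+)', 'gi');
            tag = tag.replace(pattern, '');
        });
        return tag;
    });
}

var attrs = ['align', 'valign', 'bgcolor', 'color', 'width', 'height', 'border', 'cellpadding', 'cellspacing'];

console.log(removeAttributes('<table width="100%" border="0" cellspacing="0" cellpadding="0">', attrs));
// <table>
console.log(removeAttributes('<a href="foo.html">Foo</a>', attrs));
// <a href="foo.html">Foo</a>
```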

### block-tags

Block-level element tags. Line breaks are added before and after each of these tags, and nested content is indented. Note: this option has no effect unless `pretty` is enabled.

Type: Array
Default: `['div', 'p', 'table', 'tr', 'td', 'blockquote', 'hr']`

### empty-tags

Empty element tags. Trailing slashes are removed, so `<br />` becomes `<br>`.

Type: Array
Default: `['br', 'hr', 'img']`
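
As a rough illustration of what removing trailing slashes means, the sketch below rewrites self-closed empty tags with a regular expression. `removeTrailingSlashes` is a hypothetical name, and this is only one way it could be done, not necessarily what clean-html does internally.

```javascript
// Hypothetical helper, not part of clean-html: turns <br /> into <br>
// for the tag names listed in emptyTags.
function removeTrailingSlashes(html, emptyTags) {
    // "<", one of the tag names, optional attributes, optional spaces, then "/>"
    var pattern = new RegExp('<(' + emptyTags.join('|') + ')\\b([^>]*?)\\s*/>', 'gi');
    return html.replace(pattern, '<$1$2>');
}

var emptyTags = ['br', 'hr', 'img'];

console.log(removeTrailingSlashes('The History of Foo<br />', emptyTags));
// The History of Foo<br>
console.log(removeTrailingSlashes('<img src="a.png" />', emptyTags));
// <img src="a.png">
```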

### pretty

Pretty prints the output by adding line breaks and indentation.

Type: Boolean
Default: `true`

### remove-comments

Removes comments.

Type: Boolean
Default: `false`
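
For a sense of what comment removal involves, here is a small sketch. `removeComments` is a hypothetical helper written for this example, and the regex is an assumption about one simple way to do it. Note that it would also strip conditional comments.

```javascript
// Hypothetical helper, not part of clean-html: deletes every HTML comment.
function removeComments(html) {
    // [\s\S] also matches newlines, and *? stops at the first closing "-->"
    return html.replace(/<!--[\s\S]*?-->/g, '');
}

console.log(removeComments('<p>Hi</p><!-- note --><p>Bye</p>'));
// <p>Hi</p><p>Bye</p>
```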

### tags-to-remove

Tags to remove from markup.

Type: Array
Default: `['font']`
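
In the example at the top, the `<FONT>` tags disappear while the text inside them stays. A minimal sketch of that behaviour follows. `removeTags` is a hypothetical helper, not the library's own code, and it assumes the tag names are plain words.

```javascript
// Hypothetical helper, not part of clean-html: drops the opening and closing
// tags for the listed names but keeps whatever sits between them.
function removeTags(html, tagsToRemove) {
    // Optional "/" covers closing tags; \b avoids matching e.g. <fontx>
    var pattern = new RegExp('</?(' + tagsToRemove.join('|') + ')\\b[^>]*>', 'gi');
    return html.replace(pattern, '');
}

console.log(removeTags('piece of <FONT FACE="ARIAL">information</FONT>.', ['font']));
// piece of information.
```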

## Adding values to option lists

These options are provided for convenience: each one adds values to the corresponding option list instead of replacing it.

### add-attr-to-remove

Additional attributes to remove from markup.

Type: Array
Default: `null`

### add-block-tags

Additional block level element tags.

Type: Array
Default: `null`

### add-empty-tags

Additional empty element tags.

Type: Array
Default: `null`

### add-tags-to-remove

Additional tags to remove from markup.

Type: Array
Default: `null`
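
To show how an `add-*` option relates to its base list, here is a sketch of one plausible merge. `resolveLists` and the `defaults` object are hypothetical and exist only for this example. Combining an explicitly supplied list with its `add-*` counterpart in this way is an assumption, not documented behaviour.

```javascript
// Default lists copied from the option descriptions in this README
var defaults = {
    'attr-to-remove': ['align', 'valign', 'bgcolor', 'color', 'width', 'height', 'border', 'cellpadding', 'cellspacing'],
    'block-tags': ['div', 'p', 'table', 'tr', 'td', 'blockquote', 'hr'],
    'empty-tags': ['br', 'hr', 'img'],
    'tags-to-remove': ['font']
};

// Hypothetical helper, not part of clean-html: works out the effective lists
function resolveLists(options) {
    var resolved = {};

    Object.keys(defaults).forEach(function (key) {
        // An explicit list replaces the default; an add-* list extends it
        var base = options[key] || defaults[key];
        var extra = options['add-' + key] || [];
        resolved[key] = base.concat(extra);
    });

    return resolved;
}

var lists = resolveLists({'add-tags-to-remove': ['center']});
console.log(lists['tags-to-remove']); // [ 'font', 'center' ]
console.log(lists['empty-tags']);     // [ 'br', 'hr', 'img' ]
```

The point of the `add-*` form is that you do not have to repeat the defaults just to add one more value.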