## HTML cleaner and beautifier

[![NPM Stats](https://nodei.co/npm/clean-html.png?downloads=true&downloadRank=true)](https://npmjs.org/packages/clean-html/)

Do you have crappy HTML? I do!

```html
<table width="100%" border="0" cellspacing="0" cellpadding="0">
 <tr>
 <td height="31"><b>Currently we have these articles available:</b>

 <blockquote>
 <!-- List articles -->
 <p><a href="foo.html">The History of Foo</a><br />
 An <span color="red">informative</span> piece of <FONT FACE="ARIAL">information</FONT>.</p>
 <p><a href="bar.html">A Horse Walked Into a Bar</a><br/> The bartender said
 "Why the long face?"</p>
 </blockquote>
 </td>
 </tr>
 </table>
```

Just look at those blank lines and random line breaks, trailing spaces, mixed tabs, deprecated tags - it's outrageous!

Let's clean it up...

```bash
$ npm install clean-html
```

```javascript
var cleaner = require('clean-html'),
    fs = require('fs'),
    file = process.argv[2];

// Read the file named on the command line and print the cleaned markup.
fs.readFile(file, 'utf-8', function (err, data) {
  if (err) {
    throw err;
  }

  cleaner.clean(data, function (html) {
    console.log(html);
  });
});
```
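
If you prefer promises to callbacks, the call above can be wrapped. This is a minimal sketch, not part of clean-html: `promisifyClean` is a hypothetical helper, and it assumes only the `clean(html, callback)` shape shown in the example. A stand-in cleaner is used so the sketch runs without the package installed.

```javascript
// Hypothetical helper (not part of clean-html): turns a function with the
// clean(html, callback) shape shown above into one that returns a Promise.
function promisifyClean(clean) {
  return function (html) {
    return new Promise(function (resolve) {
      // The callback receives only the cleaned markup, so there is no
      // error argument to reject with.
      clean(html, resolve);
    });
  };
}

// Stand-in cleaner so the sketch runs on its own. With the real package
// you would pass something like:
//   function (html, cb) { require('clean-html').clean(html, cb); }
var cleanAsync = promisifyClean(function (html, callback) {
  callback(html.trim());
});

cleanAsync('  <p>Hello</p>  ').then(function (html) {
  console.log(html); // the stand-in only trims: <p>Hello</p>
});
```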

Sanity restored!

```html
<table>
  <tr>
    <td>
      <b>Currently we have these articles available:</b>
      <blockquote>
        <!-- List articles -->
        <p>
          <a href="foo.html">The History of Foo</a><br>
          An <span>informative</span> piece of information.
        </p>
        <p>
          <a href="bar.html">A Horse Walked Into a Bar</a><br>
          The bartender said "Why the long face?"
        </p>
      </blockquote>
    </td>
  </tr>
</table>
```

## Options

### attr-to-remove

Attributes to remove from markup.

Type: Array
Default: `['align', 'bgcolor', 'border', 'cellpadding', 'cellspacing', 'color', 'disabled', 'height', 'target', 'valign', 'width']`

### block-tags

Block level element tags. Line breaks are added before and after, and nested content is indented.

Type: Array
Default: `['blockquote', 'div', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hr', 'p', 'table', 'td', 'tr']`

### break-around-comments

Adds line breaks before and after comments.

Type: Boolean
Default: `true`

### break-after-br

Adds a line break after each `br` tag.

Type: Boolean
Default: `true`

### empty-tags

Empty (void) element tags, which have no closing tag.

Type: Array
Default: `['br', 'hr', 'img']`

### indent

The string to use for indentation, e.g. a tab character or one or more spaces.

Type: String
Default: `'  '` (two spaces)
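
To illustrate how the `indent` string relates to nesting depth, here is a minimal sketch. It is not clean-html's implementation: `indentLine` is a hypothetical helper, and it assumes the indent string is simply repeated once per nesting level.

```javascript
// Hypothetical helper: prefix a line with `indent` repeated `depth` times.
function indentLine(line, depth, indent) {
  return indent.repeat(depth) + line;
}

// JSON.stringify makes the leading whitespace visible in the output.
console.log(JSON.stringify(indentLine('<td>', 2, '  '))); // "    <td>"
console.log(JSON.stringify(indentLine('<td>', 2, '\t'))); // "\t\t<td>"
```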

### remove-comments

Removes comments.

Type: Boolean
Default: `false`

### remove-empty-paras

Removes empty paragraph tags.

Type: Boolean
Default: `false`

### tags-to-remove

Tags to remove from markup.

Type: Array
Default: `['center', 'font']`
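
Because the option names are hyphenated strings, a typo is easy to miss. The sketch below checks an options object against the names documented in this README before it is handed to the cleaner. `unknownOptions` is a hypothetical helper, not part of clean-html, and this README does not show how options are passed to `clean`; the optional second argument mentioned in the final comment is an assumption to verify against your installed version.

```javascript
// Option names documented in this README, including the add-* variants.
var KNOWN_OPTIONS = [
  'attr-to-remove', 'block-tags', 'break-around-comments', 'break-after-br',
  'empty-tags', 'indent', 'remove-comments', 'remove-empty-paras',
  'tags-to-remove', 'add-attr-to-remove', 'add-block-tags',
  'add-empty-tags', 'add-tags-to-remove'
];

// Hypothetical helper: returns the option names that are not documented.
function unknownOptions(options) {
  return Object.keys(options).filter(function (name) {
    return KNOWN_OPTIONS.indexOf(name) === -1;
  });
}

var options = {
  'indent': '\t',
  'remove-comments': true,
  'tag-to-remove': ['center', 'font', 'marquee'] // typo: tags-to-remove
};

console.log(unknownOptions(options)); // [ 'tag-to-remove' ]

// Assumption: options go in as an optional second argument, e.g.
//   cleaner.clean(data, options, function (html) { ... });
```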

## Adding values to option lists

These options add values to the corresponding lists above, so you can extend a default list without restating it.

### add-attr-to-remove

Additional attributes to remove from markup.

Type: Array
Default: `null`

### add-block-tags

Additional block level element tags.

Type: Array
Default: `null`

### add-empty-tags

Additional empty element tags.

Type: Array
Default: `null`

### add-tags-to-remove

Additional tags to remove from markup.

Type: Array
Default: `null`
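
As a rough illustration of what the `add-*` options mean, the sketch below appends each `add-*` list to its base list. This is not clean-html's code: `mergeLists` is a hypothetical helper, and the assumption that an `add-*` list is concatenated onto the corresponding base list is inferred from the option descriptions.

```javascript
// Base option name -> its documented add-* counterpart.
var ADD_OPTIONS = {
  'attr-to-remove': 'add-attr-to-remove',
  'block-tags': 'add-block-tags',
  'empty-tags': 'add-empty-tags',
  'tags-to-remove': 'add-tags-to-remove'
};

// Hypothetical helper: returns a new object in which each list from
// `defaults` has the matching add-* list from `options` appended.
// A null or missing add-* option leaves the list unchanged.
function mergeLists(defaults, options) {
  var merged = {};
  Object.keys(defaults).forEach(function (name) {
    var extra = options[ADD_OPTIONS[name]];
    merged[name] = Array.isArray(extra)
      ? defaults[name].concat(extra)
      : defaults[name];
  });
  return merged;
}

var merged = mergeLists(
  { 'tags-to-remove': ['center', 'font'], 'empty-tags': ['br', 'hr', 'img'] },
  { 'add-tags-to-remove': ['marquee'] }
);

console.log(merged['tags-to-remove']); // [ 'center', 'font', 'marquee' ]
console.log(merged['empty-tags']);     // [ 'br', 'hr', 'img' ]
```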