## HTML cleaner and beautifier

[![NPM Stats](https://nodei.co/npm/clean-html.png?downloads=true&downloadRank=true)](https://npmjs.org/packages/clean-html/)

Do you have crappy HTML? I do!

```html
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td height="31"><b>Currently we have these articles available:</b>

<blockquote>
<!-- List articles -->
<p><a href="foo.html">The History of Foo</a><br />
An <span color="red">informative</span> piece of <FONT FACE="ARIAL">information</FONT>.</p>
<p><a href="bar.html">A Horse Walked Into a Bar</a><br/> The bartender said
"Why the long face?"</p>
</blockquote>
</td>
</tr>
</table>
```

Just look at those blank lines and random line breaks, trailing spaces, mixed tabs, deprecated tags - it's outrageous!

Let's clean it up...

```bash
$ npm install clean-html
```

```javascript
var cleaner = require('clean-html'),
    fs = require('fs'),
    file = process.argv[2];

fs.readFile(file, 'utf-8', function (err, data) {
    // Bail out if the file could not be read
    if (err) {
        throw err;
    }
    process.stdout.write(cleaner.clean(data) + '\n');
});
```

Sanity restored!

```html
<table>
  <tr>
    <td>
      <b>Currently we have these articles available:</b>
      <blockquote>
        <!-- List articles -->
        <p>
          <a href="foo.html">The History of Foo</a><br>
          An <span>informative</span> piece of information.
        </p>
        <p>
          <a href="bar.html">A Horse Walked Into a Bar</a><br>
          The bartender said "Why the long face?"
        </p>
      </blockquote>
    </td>
  </tr>
</table>
```

If you like, you can even close the empty tags, lose the comments and get rid of that nasty presentational markup:

```javascript
var options = {
    'close-empty-tags': true,
    'remove-comments': true,
    'add-tags-to-remove': ['table', 'tr', 'td', 'blockquote']
};

// `data` is the file content read in the previous example
process.stdout.write(cleaner.clean(data, options) + '\n');
```

Voila!

```html
<b>Currently we have these articles available:</b>
<p>
  <a href="foo.html">The History of Foo</a><br/>
  An <span>informative</span> piece of information.
</p>
<p>
  <a href="bar.html">A Horse Walked Into a Bar</a><br/>
  The bartender said "Why the long face?"
</p>
```

## Options

### attr-to-remove

Attributes to remove from markup.

Type: Array
Default: `['align', 'valign', 'bgcolor', 'color', 'width', 'height', 'border', 'cellpadding', 'cellspacing']`

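
As a rough illustration of what this option means, the sketch below filters a parsed attribute map against a removal list. `removeAttributes` and the plain-object representation of attributes are hypothetical and not part of clean-html's API; the library's actual implementation may work differently.

```javascript
// Hypothetical helper, not part of clean-html: drops every attribute
// whose name appears in the removal list (exact name match).
function removeAttributes(attrs, attrToRemove) {
  var kept = {};
  Object.keys(attrs).forEach(function (name) {
    if (attrToRemove.indexOf(name) === -1) {
      kept[name] = attrs[name];
    }
  });
  return kept;
}

// The default list documented above
var defaults = ['align', 'valign', 'bgcolor', 'color', 'width', 'height',
                'border', 'cellpadding', 'cellspacing'];

// Attributes of the <table> tag from the example at the top:
// all four are on the list, so none survive
var tableAttrs = { width: '100%', border: '0', cellspacing: '0', cellpadding: '0' };
console.log(removeAttributes(tableAttrs, defaults));

// href is not on the list, so it is kept while color is dropped
console.log(removeAttributes({ href: 'foo.html', color: 'red' }, defaults));
```
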
### block-tags

Block level element tags. Line breaks are added before and after, and nested content is indented. Note: this option has no effect unless `pretty` is set to `true`.

Type: Array
Default: `['div', 'p', 'table', 'tr', 'td', 'blockquote', 'hr']`

### break-after-br

Adds line breaks after br tags. Note: this option has no effect unless `pretty` is set to `true`.

Type: Boolean
Default: `true`

### close-empty-tags

If set to `true`, adds trailing slashes to empty tags. Otherwise removes trailing slashes.

Type: Boolean
Default: `false`

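
The effect is easiest to see on a single tag. The sketch below is only an illustration of the two output forms; `renderEmptyTag` is a hypothetical helper, not a function exported by clean-html.

```javascript
// Hypothetical helper, not part of clean-html: renders an empty element
// with or without a trailing slash, mirroring the close-empty-tags option.
function renderEmptyTag(name, closeEmptyTags) {
  return '<' + name + (closeEmptyTags ? '/>' : '>');
}

console.log(renderEmptyTag('br', false)); // <br>
console.log(renderEmptyTag('br', true));  // <br/>
```

These are the same two forms of `<br>` that appear in the cleaned examples at the top of this README.
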
### empty-tags

Empty element tags.

Type: Array
Default: `['br', 'hr', 'img']`

### indent

The string to use for indentation, e.g. a tab character or one or more spaces. Note: this option has no effect unless `pretty` is set to `true`.

Type: String
Default: `'  '` (two spaces)

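
To make the role of the indent string concrete, here is a minimal sketch that repeats it once per nesting level. `indentLine` and its `depth` parameter are hypothetical and only illustrate the idea; they are not part of clean-html.

```javascript
// Hypothetical helper, not part of clean-html: prefixes a line with the
// indent string repeated once per nesting level.
function indentLine(line, depth, indent) {
  return indent.repeat(depth) + line;
}

// JSON.stringify makes the whitespace visible in the output
console.log(JSON.stringify(indentLine('<td>', 2, '  '))); // "    <td>"
console.log(JSON.stringify(indentLine('<td>', 2, '\t'))); // "\t\t<td>"
```

With the library itself, tab indentation would be requested by passing `{ 'indent': '\t' }` as the options object.
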
### pretty

Pretty prints the output by adding line breaks and indentation.

Type: Boolean
Default: `true`

### remove-comments

Removes comments.

Type: Boolean
Default: `false`

### tags-to-remove

Tags to remove from markup.

Type: Array
Default: `['font']`

## Adding values to option lists

These convenience options add values to the default lists above, so you don't have to restate the defaults.

### add-attr-to-remove

Additional attributes to remove from markup.

Type: Array
Default: `null`

### add-block-tags

Additional block level element tags.

Type: Array
Default: `null`

### add-empty-tags

Additional empty element tags.

Type: Array
Default: `null`

### add-tags-to-remove

Additional tags to remove from markup.

Type: Array
Default: `null`
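
The relationship between a list option and its `add-` counterpart can be sketched as follows. This is an assumption about the semantics, not clean-html's actual code: `resolveList` and the `defaults` object are hypothetical, and the sketch assumes that setting a list option directly replaces its default while the `add-` variant appends to it.

```javascript
// Hypothetical sketch, not clean-html's actual code: an 'add-<name>' list
// is appended to the '<name>' list rather than replacing it.
var defaults = {
  'tags-to-remove': ['font']
};

function resolveList(name, options) {
  var base = options[name] || defaults[name];
  var extra = options['add-' + name] || [];
  return base.concat(extra);
}

// Appending keeps the default 'font' entry
console.log(resolveList('tags-to-remove', { 'add-tags-to-remove': ['table', 'tr'] }));
// [ 'font', 'table', 'tr' ]

// Assumed behavior: setting the list directly replaces the default
console.log(resolveList('tags-to-remove', { 'tags-to-remove': ['center'] }));
// [ 'center' ]
```

This matches the example near the top of this README, where `add-tags-to-remove` strips the table markup while the `font` tag is still removed.
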