## HTML cleaner and beautifier

[![NPM Stats](https://nodei.co/npm/clean-html.png?downloads=true&downloadRank=true)](https://npmjs.org/packages/clean-html/)

Do you have crappy HTML? I do!

```html
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td height="31"><b>Currently we have these articles available:</b>

<blockquote>
<p><a href="foo.html">The History of Foo</a><br />
An <span color="red">informative</span> piece of <FONT FACE="ARIAL">information</FONT>.</p>
<p><a href="bar.html">A Horse Walked Into a Bar</a><br/> The bartender said
"Why the long face?"</p>
</blockquote>
</td>
</tr>
</table>
```

Just look at those blank lines and random line breaks, trailing spaces, mixed tabs, deprecated tags - it's outrageous!

Let's clean it up...

```bash
$ npm install clean-html
```

```javascript
var cleaner = require('clean-html'),
    fs = require('fs'),
    file = process.argv[2]; // path to the HTML file, given on the command line

fs.readFile(file, 'utf-8', function (err, data) {
    if (err) {
        throw err; // e.g. the file is missing or unreadable
    }
    // clean() returns the cleaned markup as a string
    process.stdout.write(cleaner.clean(data) + '\n');
});
```

Sanity restored!

```html
<table>
  <tr>
    <td>
      <b>Currently we have these articles available:</b>
      <blockquote>
        <p>
          <a href="foo.html">The History of Foo</a><br>
          An <span>informative</span> piece of information.
        </p>
        <p>
          <a href="bar.html">A Horse Walked Into a Bar</a><br>
          The bartender said "Why the long face?"
        </p>
      </blockquote>
    </td>
  </tr>
</table>
```

## Options

### attr-to-remove

Attributes to remove from markup.

Type: Array
Default: `['align', 'valign', 'bgcolor', 'color', 'width', 'height', 'border', 'cellpadding', 'cellspacing']`

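
To make the effect of this option concrete, here is a minimal standalone sketch of attribute stripping. It is not how clean-html is implemented: `stripAttributes` is a hypothetical helper, and it assumes attribute values are double-quoted and contain no `>` character.

```javascript
// Hypothetical helper, not part of clean-html: removes the listed
// attributes from every opening tag in a string of HTML.
// Assumes double-quoted attribute values without ">" inside them.
function stripAttributes(html, attrs) {
    var names = attrs.map(function (a) { return a.toLowerCase(); });

    // Match opening tags only; closing tags start with "</" and are skipped.
    return html.replace(/<[a-zA-Z][^>]*>/g, function (tag) {
        // Inside the tag, drop each name="value" pair whose name is listed.
        return tag.replace(/\s+([a-zA-Z-]+)="[^"]*"/g, function (pair, name) {
            return names.indexOf(name.toLowerCase()) === -1 ? pair : '';
        });
    });
}

var messy = '<table width="100%" border="0"><tr><td height="31" class="note">Hi</td></tr></table>';

console.log(stripAttributes(messy, ['width', 'border', 'height']));
// prints: <table><tr><td class="note">Hi</td></tr></table>
```

Attributes that are not in the list, such as `class` here, pass through unchanged.
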
### block-tags

Block level element tags. Line breaks are added before and after these tags, and their nested content is indented. Note: this option has no effect unless pretty print is enabled.

Type: Array
Default: `['div', 'p', 'table', 'tr', 'td', 'blockquote', 'hr']`

### break-after-br

Adds a line break after each `br` tag. Note: this option has no effect unless pretty print is enabled.

Type: Boolean
Default: `true`

### close-empty-tags

If set to true, adds trailing slashes to empty tags. Otherwise removes trailing slashes.

Type: Boolean
Default: `false`

### empty-tags

Empty element tags. Used in combination with the `close-empty-tags` option.

Type: Array
Default: `['br', 'hr', 'img']`

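
The interplay between `close-empty-tags` and `empty-tags` can be illustrated with a small standalone sketch. `normalizeEmptyTags` is a hypothetical helper written for this example, not clean-html's own code, and the exact spacing it emits before the slash is an assumption.

```javascript
// Hypothetical helper, not part of clean-html: rewrites every tag named in
// emptyTags so that it ends with a trailing slash (close = true) or
// without one (close = false). Other tags are left untouched.
function normalizeEmptyTags(html, emptyTags, close) {
    var names = emptyTags.map(function (t) { return t.toLowerCase(); });

    // Capture the tag name, then the attributes, then an optional "/" before ">".
    return html.replace(/<([a-zA-Z][a-zA-Z0-9]*)([^>]*?)\s*\/?>/g, function (tag, name, rest) {
        if (names.indexOf(name.toLowerCase()) === -1) {
            return tag; // not listed as an empty element
        }
        return '<' + name + rest + (close ? '/>' : '>');
    });
}

var snippet = '<p>One<br />Two<br/>Three<hr></p>';

console.log(normalizeEmptyTags(snippet, ['br', 'hr', 'img'], false));
// prints: <p>One<br>Two<br>Three<hr></p>

console.log(normalizeEmptyTags(snippet, ['br', 'hr', 'img'], true));
// prints: <p>One<br/>Two<br/>Three<hr/></p>
```

With `close` set to `false` the result matches the `<br>` form shown in the cleaned example near the top of this README.
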
### indent

The string to use for indentation, e.g., a tab character or one or more spaces. A falsy value indicates that the output should not be indented.

Type: String
Default: ` `

### pretty

Pretty prints the output by adding line breaks and indentation.

Type: Boolean
Default: `true`

### remove-comments

Removes comments.

Type: Boolean
Default: `false`

### tags-to-remove

Tags to remove from markup.

Type: Array
Default: `['font']`

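
In the example at the top of this README, the `FONT` tags disappear but the text inside them stays. A minimal standalone sketch of that behaviour follows. `removeTags` is a hypothetical helper for illustration only, not clean-html's implementation.

```javascript
// Hypothetical helper, not part of clean-html: drops the opening and closing
// tags of each listed element while keeping the content between them.
function removeTags(html, tags) {
    var names = tags.map(function (t) { return t.toLowerCase(); });

    // Match both opening (<font ...>) and closing (</font>) tags.
    return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g, function (tag, name) {
        return names.indexOf(name.toLowerCase()) === -1 ? tag : '';
    });
}

var line = 'An <span>informative</span> piece of <FONT FACE="ARIAL">information</FONT>.';

console.log(removeTags(line, ['font']));
// prints: An <span>informative</span> piece of information.
```

The comparison is case-insensitive, so `FONT` and `font` are treated alike.
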
## Adding values to option lists

These convenience options add values to the corresponding option lists above, so you do not have to restate the defaults.

### add-attr-to-remove

Additional attributes to remove from markup.

Type: Array
Default: `null`

### add-block-tags

Additional block level element tags.

Type: Array
Default: `null`

### add-empty-tags

Additional empty element tags.

Type: Array
Default: `null`

### add-tags-to-remove

Additional tags to remove from markup.

Type: Array
Default: `null`

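
One way to picture how an `add-` option relates to its base list is the sketch below. It assumes the additional values are simply appended to the base list. The `effectiveList` helper and the `center`, `ul`, and `li` values are hypothetical and chosen for illustration; this is not clean-html's own code.

```javascript
// Default lists copied from the Options section of this README.
var defaults = {
    'attr-to-remove': ['align', 'valign', 'bgcolor', 'color', 'width', 'height', 'border', 'cellpadding', 'cellspacing'],
    'block-tags': ['div', 'p', 'table', 'tr', 'td', 'blockquote', 'hr'],
    'empty-tags': ['br', 'hr', 'img'],
    'tags-to-remove': ['font']
};

// Hypothetical helper, not part of clean-html: returns the effective list
// for an option by appending any values from its "add-" counterpart.
// Assumption: additions are appended, never substituted.
function effectiveList(options, name) {
    var base = options[name] || defaults[name];
    var extra = options['add-' + name] || []; // the documented default is null
    return base.concat(extra);
}

var options = {
    'add-tags-to-remove': ['center'],
    'add-block-tags': ['ul', 'li']
};

console.log(effectiveList(options, 'tags-to-remove'));
// prints: [ 'font', 'center' ]
```

Under this assumption, setting `tags-to-remove` directly replaces the default list, while `add-tags-to-remove` keeps `font` and extends it.
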