## PostCSS Architecture

General overview of the PostCSS architecture.
It can be useful for everyone who wishes to contribute to the core or develop a better understanding of the tool.

**Table of Contents**

- [Overview](#overview)
- [Workflow](#workflow)
- [Core Structures](#core-structures)
    * [Tokenizer](#tokenizer--libtokenizees6-)
    * [Parser](#parser--libparsees6-libparseres6-)
    * [Processor](#processor--libprocessores6-)
    * [Stringifier](#stringifier--libstringifyes6-libstringifieres6-)
- [API](#api-reference)

### Overview

> This section describes the ideas behind PostCSS.

Before diving deeper into the development of PostCSS, let's briefly describe what PostCSS is and what it is not.

**PostCSS**

- *is **NOT** a style preprocessor like `Sass` or `Less`.*

    It does not define a custom syntax or semantics, so it is not actually a language.
    PostCSS works with CSS and can be easily integrated with the tools mentioned above. That being said, any valid CSS can be processed by PostCSS.

- *is a tool for CSS syntax transformations*

    It allows you to define custom CSS-like syntax that can be understood and transformed by plugins. In other words, PostCSS is not strictly about the CSS spec but about the way CSS syntax is defined. This lets you define custom syntax constructs, such as at-rules, that can be very helpful for tools built around PostCSS. PostCSS plays the role of a framework for building outstanding tools for CSS manipulation.

- *is a big player in the CSS ecosystem*

    A large number of lovely tools like `Autoprefixer`, `Stylelint`, and `CSSnano` were built on the PostCSS ecosystem. There is a good chance that you already use it implicitly; just check your `node_modules` :smiley:

### Workflow

This is a high-level overview of the whole PostCSS workflow:

<img width="300" src="https://upload.wikimedia.org/wikipedia/commons/thumb/a/aa/PostCSS_scheme.svg/512px-PostCSS_scheme.svg.png" alt="workflow">

As you can see from the diagram above, the PostCSS architecture is pretty straightforward, but some parts of it could be misunderstood.

The diagram contains a part called *Parser*. It will be described in detail later on; for now, think of it as a structure that can understand your CSS-like syntax and create an object representation of it.

That being said, there are a few ways to write a parser:

 - *Write a single file with string to AST transformation*

    This method is quite popular; for example, the [Rework analyzer](https://github.com/reworkcss/css/blob/master/lib/parse/index.js) was written in this style. But with a large code base, the code becomes hard to read and pretty slow.

 - *Split it into lexical analysis/parsing steps (source string → tokens → AST)*

    This is the way we do it in PostCSS, and also the most popular one.
    A lot of parsers, like [`@babel/parser` (parser behind Babel)](https://github.com/babel/babel/tree/master/packages/babel-parser) and [`CSSTree`](https://github.com/csstree/csstree), were written this way.
    The main reasons to separate tokenization from parsing are performance and abstracting complexity.

Let's think about why the second way is better for our needs.

First of all, the string to tokens step takes more time than the parsing step. We operate on a large source string and process it char by char, which is a very inefficient operation in terms of performance, so we should perform it only once.

On the other hand, the tokens to AST transformation is logically more complex, so with such separation we can write a very fast tokenizer (at the cost of sometimes hard-to-read code) and an easy-to-read (but slow) parser.

Summing up, splitting the work into two steps improves both performance and code readability.

So now let's look more closely at the structures that play the main role in the PostCSS workflow.

### Core Structures

 - #### Tokenizer ( [lib/tokenize.es6](https://github.com/postcss/postcss/blob/master/lib/tokenize.es6) )

    The Tokenizer (aka Lexer) plays an important role in syntax analysis.

    It accepts a CSS string and returns a list of tokens.

    A token is a simple structure that describes some part of the syntax, like `at-rule`, `comment` or `word`. It can also contain positional information for more descriptive errors.

    For example, consider the following CSS:

    ```css
    .className { color: #FFF; }
    ```

    The corresponding token representation from PostCSS will be:

    ```js
    [
        ["word", ".className", 1, 1, 1, 10],
        ["space", " "],
        ["{", "{", 1, 12],
        ["space", " "],
        ["word", "color", 1, 14, 1, 18],
        [":", ":", 1, 19],
        ["space", " "],
        ["word", "#FFF", 1, 21, 1, 24],
        [";", ";", 1, 25],
        ["space", " "],
        ["}", "}", 1, 27]
    ]
    ```

    As you can see from the example above, a single token is represented as a list, and the `space` token doesn't have positional information.

    Let's look more closely at a single token like `word`. As mentioned, each token is represented as a list and follows this pattern:

    ```js
    const token = [
        // represents token type
        'word',

        // represents matched word
        '.className',

        // These two numbers represent the start position of the token.
        // They are optional: as we saw in the example above,
        // tokens like `space` don't have such information.

        // Here the first number is the line number and the second one is the corresponding column.
        1, 1,

        // The next two numbers are also optional and represent the end position
        // for multichar tokens like this one. They follow the same rule as described above.
        1, 10
    ]
    ```
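
    To make the layout concrete, here is a minimal sketch of how a consumer could read these fields, for example when building an error message. The `describeToken` helper is hypothetical and not part of PostCSS; it only assumes the list layout described above.

    ```js
    // Hypothetical helper, not part of PostCSS: describe a token for a message.
    function describeToken(token) {
        const [type, value, startLine, startColumn, endLine, endColumn] = token

        // Tokens such as `space` carry no position at all.
        if (startLine === undefined) {
            return `${type} ${JSON.stringify(value)} (no position)`
        }

        // Single-character tokens such as `{` only have a start position,
        // so reuse it as the end position.
        const end = endLine === undefined
            ? `${startLine}:${startColumn}`
            : `${endLine}:${endColumn}`

        return `${type} ${JSON.stringify(value)} at ${startLine}:${startColumn}-${end}`
    }

    console.log(describeToken(['word', '.className', 1, 1, 1, 10]))
    // prints: word ".className" at 1:1-1:10
    console.log(describeToken(['space', ' ']))
    // prints: space " " (no position)
    ```
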

    There are many patterns for how tokenization could be done; the PostCSS motto is performance and simplicity. Tokenization is a computationally complex operation and takes a large share of the syntax analysis time (~90%), which is why PostCSS' Tokenizer looks dirty: it was optimized for speed. Any high-level constructs like classes could dramatically slow down the tokenizer.

    PostCSS' Tokenizer uses a sort of streaming/chaining API that exposes the [`nextToken()`](https://github.com/postcss/postcss/blob/master/lib/tokenize.es6#L48-L308) method to the Parser. In this manner we provide a clean interface for the Parser and reduce memory usage by storing only a few tokens rather than the whole list of tokens.
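
    The streaming idea can be illustrated with a heavily simplified sketch. This is not the real PostCSS tokenizer: the `tokenizer` function below is a toy that only recognizes spaces, words and the `{`, `}`, `:`, `;` characters on a single line, and its `endOfFile` helper and internal names are assumptions made for the example. It does show the shape of the interface, where the consumer pulls one token at a time with `nextToken()` and can return a token with `back()`.

    ```js
    // Toy tokenizer, NOT PostCSS' real one: single line, tiny grammar.
    function tokenizer(css) {
        let pos = 0          // index of the next unread character
        const returned = []  // tokens handed back by the consumer

        function endOfFile() {
            return returned.length === 0 && pos >= css.length
        }

        function nextToken() {
            // Serve tokens that were pushed back first.
            if (returned.length > 0) return returned.pop()
            if (pos >= css.length) return undefined

            const start = pos
            const char = css[pos]

            if (char === ' ') {
                while (css[pos] === ' ') pos += 1
                return ['space', css.slice(start, pos)]
            }

            if ('{}:;'.includes(char)) {
                pos += 1
                return [char, char, 1, start + 1]
            }

            // Anything else is a word that runs until a space or punctuation.
            while (pos < css.length && !' {}:;'.includes(css[pos])) pos += 1
            return ['word', css.slice(start, pos), 1, start + 1, 1, pos]
        }

        function back(token) {
            returned.push(token)
        }

        return { nextToken, back, endOfFile }
    }

    const tokens = tokenizer('.className { color: #FFF; }')
    const first = tokens.nextToken()
    tokens.back(first)               // un-read the token
    console.log(tokens.nextToken())  // the same `.className` word token again
    ```

    Only the current position and the few pushed-back tokens are kept in memory; the full token list is never materialized unless the consumer collects it.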

- #### Parser ( [lib/parse.es6](https://github.com/postcss/postcss/blob/master/lib/parse.es6), [lib/parser.es6](https://github.com/postcss/postcss/blob/master/lib/parser.es6) )

    The Parser is the main structure responsible for [syntax analysis](https://en.wikipedia.org/wiki/Parsing) of incoming CSS. It produces a structure called the [Abstract Syntax Tree (AST)](https://en.wikipedia.org/wiki/Abstract_syntax_tree) that can then be transformed by plugins.

    The Parser works together with the Tokenizer and operates on tokens rather than the source string, since working on the source string directly would be very inefficient.

    It mostly uses the `nextToken` and `back` methods provided by the Tokenizer to obtain one or more tokens and then constructs a part of the AST called a `Node`.

    There are multiple Node types that PostCSS can produce, but all of them inherit from the base Node [class](https://github.com/postcss/postcss/blob/master/lib/node.es6#L34).
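
    As a rough illustration of the tokens to AST step, the sketch below consumes tokens one at a time and builds plain objects. It is a toy, not the PostCSS parser: `fromList` and `parse` are hypothetical names, nodes are plain objects rather than Node subclasses, `back` is left out for brevity, and only a single level of `selector { prop: value; }` rules is understood.

    ```js
    // Toy stand-in for the Tokenizer interface, backed by a fixed token list.
    function fromList(list) {
        let index = 0
        return {
            nextToken: () => list[index++],
            endOfFile: () => index >= list.length
        }
    }

    // Toy parser, NOT PostCSS' real one.
    function parse(tokens) {
        const root = { type: 'root', nodes: [] }
        let rule = null   // rule currently being filled, if any
        let words = []    // words collected since the last `{`, `:`, `;` or `}`
        let prop = null   // declaration property seen before `:`

        while (!tokens.endOfFile()) {
            const [type, value] = tokens.nextToken()

            if (type === 'word') {
                words.push(value)
            } else if (type === '{') {
                rule = { type: 'rule', selector: words.join(' '), nodes: [] }
                root.nodes.push(rule)
                words = []
            } else if (type === ':' && rule && prop === null) {
                prop = words.join(' ')
                words = []
            } else if (type === ';' || type === '}') {
                if (rule && prop !== null) {
                    rule.nodes.push({ type: 'decl', prop, value: words.join(' ') })
                }
                prop = null
                words = []
                if (type === '}') rule = null
            }
            // `space` tokens are simply skipped in this sketch.
        }

        return root
    }

    const root = parse(fromList([
        ['word', '.className', 1, 1, 1, 10], ['space', ' '],
        ['{', '{', 1, 12], ['space', ' '],
        ['word', 'color', 1, 14, 1, 18], [':', ':', 1, 19], ['space', ' '],
        ['word', '#FFF', 1, 21, 1, 24], [';', ';', 1, 25], ['space', ' '],
        ['}', '}', 1, 27]
    ]))
    console.log(JSON.stringify(root, null, 2))
    ```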

- #### Processor ( [lib/processor.es6](https://github.com/postcss/postcss/blob/master/lib/processor.es6) )

    The Processor is a very plain structure that initializes plugins and runs syntax transformations. A plugin is just a function registered with a [postcss.plugin](https://github.com/postcss/postcss/blob/master/lib/postcss.es6#L109) call.

    It exposes only a few public API methods. Their description can be found at [api.postcss.org/Processor](http://api.postcss.org/Processor.html).
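
    The core idea, a list of plugin functions applied in order to the same AST, can be sketched as follows. `MiniProcessor` and `lowercaseValues` are hypothetical names for this example only. The real Processor works on CSS input and returns a result object, whereas this toy takes an already-built tree of plain objects to stay self-contained.

    ```js
    // Toy processor, NOT the real PostCSS class: it stores plugin functions
    // and calls each of them, in order, with the same AST root.
    class MiniProcessor {
        constructor(plugins = []) {
            this.plugins = [...plugins]
        }

        use(plugin) {
            this.plugins.push(plugin)
            return this   // allow chaining
        }

        process(root) {
            for (const plugin of this.plugins) plugin(root)
            return root
        }
    }

    // Hypothetical plugin: lower-case every declaration value.
    function lowercaseValues(root) {
        for (const rule of root.nodes) {
            for (const decl of rule.nodes) decl.value = decl.value.toLowerCase()
        }
    }

    const root = {
        type: 'root',
        nodes: [{
            type: 'rule',
            selector: '.className',
            nodes: [{ type: 'decl', prop: 'color', value: '#FFF' }]
        }]
    }

    new MiniProcessor().use(lowercaseValues).process(root)
    console.log(root.nodes[0].nodes[0].value) // prints "#fff"
    ```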

- #### Stringifier ( [lib/stringify.es6](https://github.com/postcss/postcss/blob/master/lib/stringify.es6), [lib/stringifier.es6](https://github.com/postcss/postcss/blob/master/lib/stringifier.es6) )

    The Stringifier is a base class that translates the modified AST into a pure CSS string. It traverses the AST starting from the provided Node and generates a raw string representation of it by calling the corresponding methods.

    The most essential method is [`Stringifier.stringify`](https://github.com/postcss/postcss/blob/master/lib/stringifier.es6#L25-L27)
    that accepts the initial Node and a semicolon indicator.
    You can learn more by checking [stringifier.es6](https://github.com/postcss/postcss/blob/master/lib/stringifier.es6).
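
    The dispatch described above, one method per node type with `stringify` as the entry point, can be sketched like this. `MiniStringifier` is a hypothetical toy, not the real class: it handles only `root`, `rule` and `decl` plain objects, uses fixed spacing, and assumes a `builder` callback that receives each string piece.

    ```js
    // Toy stringifier, NOT the real PostCSS class.
    class MiniStringifier {
        constructor(builder) {
            this.builder = builder
        }

        // Entry point: call the method named after the node type.
        stringify(node, semicolon) {
            this[node.type](node, semicolon)
        }

        root(node) {
            for (const child of node.nodes) this.stringify(child)
        }

        rule(node) {
            this.builder(node.selector + ' { ')
            for (const child of node.nodes) {
                // Assumption of this sketch: every declaration gets a semicolon.
                this.stringify(child, true)
                this.builder(' ')
            }
            this.builder('}')
        }

        decl(node, semicolon) {
            this.builder(node.prop + ': ' + node.value + (semicolon ? ';' : ''))
        }
    }

    let css = ''
    new MiniStringifier(piece => { css += piece }).stringify({
        type: 'root',
        nodes: [{
            type: 'rule',
            selector: '.className',
            nodes: [{ type: 'decl', prop: 'color', value: '#FFF' }]
        }]
    })
    console.log(css) // prints ".className { color: #FFF; }"
    ```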

### API Reference

More descriptive API documentation can be found [here](http://api.postcss.org/).
