# JavaScript embedded effects compiler

This is a JavaScript-to-JavaScript transpiler. It extends the JavaScript
language with various effects by means of runtime libraries, without even
using compiler plugins.

There are such libraries for:

 * [Asynchronous programming](https://github.com/awto/mfjs-promise)
 * [Reactive programming (with RxJS)](https://github.com/awto/mfjs-rx)
 * [Logical programming](https://github.com/awto/mfjs-logic)
 * [Multi-prompt delimited continuations](https://github.com/awto/mfjs-cc)

These are small libraries; some of them are just tiny wrappers around
well-known interfaces, such as Promises and Rx Observables.

The compiler converts ES5 to ES5 without any syntax extensions, so it may be
applied to the output of other compilers targeting JS, such as CoffeeScript,
TypeScript, Babel etc.

It stratifies the input JavaScript code into two levels, namely the object
level and the meta level.

Object-level syntax is represented in the generated code by calls to abstract
higher-order functions. Their concrete implementations are loaded from
specific effects libraries at runtime. The interface is based on the
hierarchy of monad interfaces from Haskell (Functor, Applicative,
Alternative, Monad).

The other level, the meta level, is still plain JavaScript code executed by
some JS engine in a browser or with node.js. Code of this level constructs
objects representing the syntax of the object level. I will call them monadic
values in this doc. For well-known libraries these are Promise, Rx Observable
types etc. I will also abuse the term *pure* for meta-level values and code.
This doesn't mean the code is really pure, of course. It is the original
JavaScript and it may use the side effects already embedded in JavaScript,
including IO, references, exceptions etc.

The transformation is selective: there are means to specify which parts of
the program are effectful or pure. By default there are two translation
modes: one where only expressions wrapped in a special function call are
treated as effectful, and one where all function calls are.

The tool is not intended for implementing the most used Haskell monads, like
State, Reader, Exception, IO etc. The toolset is only for adding effects that
are practically useful in JavaScript but not already available there.


## Usage

### Command line tool

In the current version it is very simple: it only runs the transform and
saves the results to another folder or file.

    $ npm install @mfjs/compiler
    $ mfjsc input.js --output dist

The remaining options are used to augment the default configuration object,
which is constructed using the
[optimist](https://www.npmjs.com/package/optimist) command line encoding. For
example, the namespace for the core library and the translation profile may
be specified via the command line too:

    $ mfjsc input.js --output dist --transform.packageVar=M --transform.start=defaultFull

It will use stdin/stdout if no input/output files are specified.

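To make the dotted option encoding concrete, here is a minimal sketch of how
flags such as `--transform.packageVar=M` map onto a nested configuration
object. The helpers `setPath` and `parseDottedArgs` are hypothetical and only
illustrate the resulting shape; the tool itself relies on optimist for the
actual parsing.

```javascript
// Hypothetical helper: assigns a value at a dotted path, creating
// intermediate objects as needed.
function setPath(target, dottedKey, value) {
  const keys = dottedKey.split(".");
  let node = target;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
  return target;
}

// Hypothetical helper: turns ["--a.b=c", ...] into a nested object.
function parseDottedArgs(args) {
  const config = {};
  for (const arg of args) {
    const m = /^--([^=]+)=(.*)$/.exec(arg);
    if (m) setPath(config, m[1], m[2]);
  }
  return config;
}

const overrides = parseDottedArgs([
  "--transform.packageVar=M",
  "--transform.start=defaultFull",
]);
console.log(JSON.stringify(overrides));
// {"transform":{"packageVar":"M","start":"defaultFull"}}
```
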
### Browserify transform

    $ npm install @mfjs/compiler
    $ browserify -t @mfjs/compiler/monadify input.js -o index.js

Or with options:

    $ browserify -t [@mfjs/compiler/monadify --transform.packageVar=M --transform.start=defaultFull] input.js -o index.js

### Hook for require in node.js

The hook is implemented in `@mfjs/compiler/nastyRegister`. It overrides all
currently registered extensions and post-processes their output with the
transpiler. It is a bad idea to use it in production.

For example, for mocha:

    $ mocha --compilers js:@mfjs/compiler/nastyRegister

### Other options

 * [gulp plugin](https://github.com/awto/mfjs-gulp)
 * [grunt plugin](https://github.com/awto/mfjs-grunt)


## Code transformation

By default it doesn't touch any code, so it is safe to apply it to all JS
files in the project. The translation is activated when it encounters a
CommonJS require of the core library module. By default this is `@mfjs/core`,
but it may be configured. The require call expression must be assigned to
some namespace variable, and that namespace name is used to detect
translation directives further in the code. I will use the name `M` for the
core namespace in this doc, but it may be any other.

In the default mode (after the require) it still doesn't translate anything.
To start the actual translation, specify how eager you want it to be with the
`M.profile` directive, which receives a string with the name of the mode. For
example, the "defaultFull" mode may be used to treat all function calls as
effectful in all function definitions except the current one.

In short, the tool converts effectful monadic values in the code into their
inner pure values. For example, for promises it converts a Promise object
into the value it is going to be resolved to (or is already resolved to). In
the generated code this is converted into the appropriate `then` usages. The
backward translation is also needed. In the promises example, the code may
need access to the original promise object, for example to wait for a few
promises in parallel.

### From pure to effectful

 * `M.reify` - expects a function expression; the function is executed
    immediately and its returned value isn't translated to a pure value,
    so it may be passed to a function expecting an effectful value.
 * `M.p` - the same as `M.reify`, but its argument is not a function,
   just some value the compiler will treat as already pure.
 * `M.$` - similar to `M.reify` with an argument function containing all the
   statements in the current block (delimited with curly braces) before the
   statement with `M.$` as argument. It may also be used as a function. In
   this case the expression in the argument will be returned as the result
   of the reified monadic value.

For example, if the implementation library has some `sum` function, it may be
called like this:

```javascript
const k = getValue();
const s = M.$(k).sum();
```

### From effectful to pure

The general backward translation from an effectful value to a pure value can
be performed using the `M` directive, which translates a monadic value into a
pure one even in the "minimal" profile. In the "full" profile all function
calls are immediately reflected into pure values.


### Coercions

Since JavaScript is a dynamically typed language, we cannot know at compile
time whether a function will indeed return a monadic value, or whether it may
return some pure value or no value at all. For this to work the library must
keep checking the type of the value and construct a monadic value (with the
`M.pure` function) if it is not already monadic.

This of course adds overhead, so there is an option to disable coercions. In
this case the generated code will always return effectful values from
effectful functions, and will not add runtime checks. This requires stronger
discipline for functions not translated with the compiler, using some strict
code style conventions or probably some future type checker. If a
non-effectful function is called accidentally, the program will crash.

There is also exception coercion. If it is enabled (it is disabled by
default), every function invocation is wrapped in a `try..catch` block, and
if an exception is thrown its value is translated into `M.raise`.

The coercion level is defined using the `coerce` option; possible values are
"none", "all" or "value".

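The two coercions described above can be sketched in a few lines. This is an
illustration only: the `Box` class and the standalone `pure`, `raise`, `M` and
`coerce` functions are hypothetical stand-ins that follow the descriptions in
this README, not the code of any actual effects library.

```javascript
// Hypothetical monadic value: holds either an inner value or a raised error.
class Box {
  constructor(value, error) {
    this.value = value;
    this.error = error; // undefined unless this Box represents a throw
  }
}

// Assumed shape of the core functions described above.
function pure(v) { return new Box(v, undefined); }
function raise(e) { return new Box(undefined, e); }

// Value coercion: return monadic values as-is, wrap everything else.
function M(v) { return v instanceof Box ? v : pure(v); }

// Exception coercion: run `f`, turning a thrown exception into `raise`.
function coerce(f) {
  try {
    return M(f());
  } catch (e) {
    return raise(e);
  }
}

const already = pure(1);
const wrapped = M(2);
const failed = coerce(() => { throw new Error("boom"); });
console.log(M(already) === already, wrapped.value, failed.error.message);
// true 2 boom
```

With coercion disabled, a generated call site would skip the `M(...)` check
and use the returned value directly as a monadic value, which is why calling
a function that returns a plain value crashes in that mode.
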
### Variables scope

Some monads may re-execute some parts of the control path several times. The
most typical examples are the logical programming monad, reactive programming
and continuations.

Programmers would expect variables to revert to their previous values on
backtracking, as in a typical logical programming language. The compiler
tries to capture the values of some local variables for this to work.
Non-local variables are treated as global references (the reference-related
effects are already embedded in JavaScript). If you apply a mutating
operation to the value of a local variable, the change will also be visible
after backtracking. This is a good reason to move to immutable data
structures, though.

To avoid capturing local variables, use the `M.ref` directive, specifying the
variables as arguments. If the code will be used only with monads where
variable capturing is not needed (like Promise), set the `varCapt` option to
false to avoid the capturing overhead.

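The difference between a captured variable value and a mutated data structure
can be modelled without any library. In the sketch below `choose` is a
hypothetical backtracking primitive that runs the rest of the computation
once per alternative; it is a rough model of the behaviour described above,
not the code the compiler generates.

```javascript
// Hypothetical backtracking primitive: runs the rest of the computation
// (`rest`) once per alternative and collects the answers.
function choose(alternatives, rest) {
  return alternatives.map((alt) => rest(alt));
}

// A local value captured at the choice point: every branch starts from it.
function threaded() {
  const count = 0;
  return choose(["a", "b"], (alt) => {
    const next = count + 1; // each branch sees count === 0
    return alt + next;
  });
}

// A mutable array shared by all branches: the mutation survives backtracking.
function shared() {
  const seen = [];
  return choose(["a", "b"], (alt) => {
    seen.push(alt); // in-place mutation, not reverted between branches
    return seen.length;
  });
}

console.log(threaded()); // -> ["a1", "b1"]
console.log(shared());   // -> [1, 2]
```
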
### Applicative vs Monad interface

There is an interface hierarchy Functor <- Applicative <- Monad. Functor only
allows changing the inner value of an effectful value, Applicative allows
combining several effectful values into one, and Monad, the most generic one,
allows changing the structure of the effectful value depending on its inner
value. In mfjs the main corresponding functions for the interfaces are
`mapply`, `M.arr` and `mbind` respectively.

Looking at the differences between `mapply` and `mbind`, the function passed
to the first one always returns a pure value, while the function passed to
the second (with coercion enabled) may return either a pure or an effectful
value. In fact, if coercion is enabled and the function passed to `mbind`
returns a pure value, it is semantically equivalent to `mapply`. So one may
wonder why `mapply` is needed at all. Monad generalizes the Applicative
interface, so if something has a Monad interface it is automatically a
Functor and an Applicative. But a monadic expression is not suitable for
static analysis, since parts of it depend on the inner value, and the inner
value is not known until some future point at runtime.

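As an illustration of the three levels, here is a minimal sketch using a
list-like value with several answers. The `Many` class and the standalone
`arr` function are hypothetical; they only mirror the roles of `mapply`,
`mbind` and `M.arr` described above.

```javascript
// Hypothetical list-like monadic value: holds zero or more answers.
class Many {
  constructor(items) { this.items = items; }
  // Functor: replaces each inner value, never changes the number of answers.
  mapply(f) { return new Many(this.items.map(f)); }
  // Monad: `f` may return another Many, so the structure can change.
  mbind(f) {
    return new Many(this.items.flatMap((v) => {
      const r = f(v);
      return r instanceof Many ? r.items : [r]; // coercion of pure results
    }));
  }
}

// Applicative: combines independent values; for lists, all combinations.
function arr(values) {
  let combos = [[]];
  for (const m of values) {
    combos = combos.flatMap((xs) => m.items.map((x) => [...xs, x]));
  }
  return new Many(combos);
}

const xs = new Many([1, 2, 3]);
const doubled = xs.mapply((x) => x * 2);                    // still 3 answers
const evens = xs.mbind((x) => (x % 2 ? new Many([]) : x));  // 1 answer left
const pairs = arr([new Many([1, 2]), new Many(["a", "b"])]); // 4 answers
console.log(doubled.items, evens.items, pairs.items.length);
```

Note that `arr` never inspects the inner values to decide what to do next,
which is what makes the Applicative form analyzable ahead of time.
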
For example, for parser combinators this means that Monad-based ones allow
introducing context dependencies (where, depending on some already parsed
part, you may return the grammar for the next part), while Applicative
combinators don't offer such context dependency, but they allow building much
more efficient parsers because they may analyze grammars during parser
construction (calculate FOLLOW, FIRST sets etc). That is impossible for a
Monad parser because parts of its grammar will be known only during parsing.

The Applicative interface allows implicit program parallelization, like in
[haxl](https://github.com/facebook/Haxl).

For some functional reactive libraries that build a data-flow graph, monadic
application means graph rebuilding (switching). The equivalent applicative
version will have an always static graph. Take `if-then-else`, for example:
if it is implemented as a Monad, the graph will be rebuilt on every new value
of the condition, and only one branch will be constructed. For the
Applicative-based one, both branches will always be built and only signal
propagation will continue to a single branch depending on the input value.
This may be more efficient for libraries where graph construction is
expensive, because of some possible optimizations.

In the present version the compiler always tries to translate expressions
into Applicative form, unless they are logical operators (&&, ||) or
conditional (?:), or they change the value of some variable using direct
assignment or update (+=, ++, =). This diverges from JavaScript semantics
because the order of operations, and hence the order of side effects, isn’t
defined. It is bad coding practice to have such operations in a single
expression anyway. But you may still disable this by setting the option
`expr: "seq"`.

The compiler will translate a for-loop into `M.forPar` if its test and update
expressions are pure, and the test and body don’t change any variable
(assignment is allowed). Some monad implementations may run the iterations in
parallel.

In future versions the compiler will try to translate more code patterns into
Applicative form.

### Alternatives

Many monads may support multiple inner values in a single monadic value.
These are reactive programming, logical programming monads etc.

If a monad supports this, you may use either methods from the interface
directly, the directives (`M.answer` or `M.yield`), or a `yield` expression.
All three are aliases and have the same encoding. It acts similarly to a
return statement, but allows continuing execution of the same function after
the point where it was invoked, adding more answers to the result of the
function.

A result with no answers is produced by the `M.empty` function call. Yield
discharges no-answer values within the current block (between {}). So, within
a single block, nothing is executed between the point where `M.empty` is
executed and the next `M.yield`, inclusive.

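The idea of a function producing several answers can be modelled with plain
arrays. In this sketch `answers` and `emit` are hypothetical names: `emit`
plays the role described above for `M.yield` (add an answer and keep going),
and a body that never emits corresponds to a result with no answers.

```javascript
// Hypothetical answer collector modelling a function body with several yields.
function answers(body) {
  const out = [];
  body((v) => { out.push(v); });
  return out;
}

const evens = answers((emit) => {
  for (let i = 0; i < 6; i++) {
    if (i % 2 === 0) emit(i); // add an answer and continue the same function
  }
});

const none = answers(() => {}); // nothing emitted: no answers

console.log(evens, none.length); // -> [0, 2, 4] and 0
```
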
## Configuration

There is a global configuration object, and it is possible to augment it from
the command line, browserify transform arguments, gulp/grunt options, or
directly from some loaded module if `M.require` is used. The tool also looks
for a file `mfjs-config.json` in the folder where the input file is located
and in its ancestor folders, using only the closest one found. It is parsed
and merged into the configuration object too.

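The lookup and merge steps can be sketched as follows. Both helpers,
`findClosestConfig` and `mergeConfig`, are hypothetical illustrations of the
behaviour described above, not functions exported by the compiler; the
`exists` parameter is only there so the lookup can be tried without touching
the file system.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: walks from `startDir` up to the filesystem root and
// returns the path of the closest `mfjs-config.json`, or null if none exists.
function findClosestConfig(startDir, exists = fs.existsSync) {
  let dir = path.resolve(startDir);
  for (;;) {
    const candidate = path.join(dir, "mfjs-config.json");
    if (exists(candidate)) return candidate;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the root
    dir = parent;
  }
}

// Hypothetical helper: recursively merges `extra` into `base`.
function mergeConfig(base, extra) {
  for (const [key, value] of Object.entries(extra)) {
    const both = value && typeof value === "object" && !Array.isArray(value)
      && base[key] && typeof base[key] === "object";
    base[key] = both ? mergeConfig(base[key], value) : value;
  }
  return base;
}

const merged = mergeConfig(
  { transform: { start: "minimal", verbose: false } },
  { transform: { start: "defaultFull" } }
);
console.log(JSON.stringify(merged));
// {"transform":{"start":"defaultFull","verbose":false}}
```
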
Top-level fields of the configuration object:

 * `parser` - passed directly to the esprima parser as options
 * `printer` - passed directly to the escodegen printer as options
 * `transform` - options for the transformation, described in this section
     * `packageVar` - name of the namespace variable used for importing the
                    core library with CommonJS require; it may be specified
                    if no require is used
     * `packageName` - name of the core library package, used to detect the
                     CommonJS require call to guess `packageVar` in sources
     * `start` - initial state name; by default it is a state where the tool
               doesn't touch anything
     * `verbose` - debug output
     * `policyTrace` - outputs details about how the options for a specific
                     AST node were chosen
     * `states` - description of the rules for deriving translation options
                for each AST node

To set some options from another module you may use `M.require`. During
compilation the module is loaded with `M.compileTime === true`. The tool
expects it to export an object with a `_compile` method, which is called with
`this` bound to the compiler's `Scope` object and with the name of the
variable the require call is assigned to as its argument. The `M.require`
call is replaced with a `require` call in the resulting code. The `Scope`
object has `profile` and `option` methods similar to the corresponding
directives.


```javascript
var M = require("@mfjs/core");
if (M.compileTime) {
  module.exports = {
    // `v` is the name of the variable the `M.require` call is assigned to
    _compile: function(v) {
      var p = {};
      p[v] = true;
      this.profile("defaultMinimal");
      this.option({minimal:{CallExpression:{match:{name:p}}}});
    }
  };
}
```

And in some other file:

```javascript
var P = M.require("lib");
// here translation is in defaultMinimal state and effectful
// values may be marked with P function call
```

## Interface

There is a reference implementation library,
[@mfjs/core](https://github.com/awto/mfjs-core),
with documentation for each function, but it is not required to use that
library.

The generated code expects a monadic value to have the following methods:

 * `mbind` - takes a function argument and applies it to the inner value of
             `this`, returning the coerced monadic result (Haskell’s `>>=`
             function from the Monad class)
 * `mapply`  - takes a function argument and applies it to the inner value of
               `this`, replacing that inner value with the result without
               changing the structure of the monadic value (Haskell’s `fmap`
               function from the Functor class)
 * `mopt` - adds `undefined` to the set of answers of `this`
 * `mfinally` - takes a function and executes it after control exits the
                `this` block
 * `mhandle` - takes a function and executes it if `this` throws an exception
 * `mconst` - a helper for variables threading,
              same as `v => this.mapply(() => v)`
 * `munshiftTo` - a helper for variables threading, same as
                  `arr => this.mapply(v => arr.unshift(v), v)`

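To show how little is needed to satisfy this method list, here is a minimal
sketch of a synchronous value that may fail. The `Result` class, its static
helpers and the decision to catch exceptions inside `mbind` are assumptions
made for the illustration; they are not taken from `@mfjs/core`.

```javascript
// Hypothetical monadic value for computations that may fail.
class Result {
  constructor(ok, payload) { this.ok = ok; this.payload = payload; }
  static pure(v) { return new Result(true, v); }
  static raise(e) { return new Result(false, e); }
  static from(v) { return v instanceof Result ? v : Result.pure(v); }

  // Monad bind: `f` may return a pure value or another Result.
  mbind(f) {
    if (!this.ok) return this;
    try { return Result.from(f(this.payload)); }
    catch (e) { return Result.raise(e); }
  }
  // Functor map: the result of `f` is always treated as a pure value.
  mapply(f) { return this.mbind((v) => Result.pure(f(v))); }
  // Runs `f` on the raised exception, if there is one.
  mhandle(f) { return this.ok ? this : Result.from(f(this.payload)); }
  // Runs `f` whether or not `this` succeeded, keeping the original outcome.
  mfinally(f) { f(); return this; }
  // Helper: replaces the inner value with a constant.
  mconst(v) { return this.mapply(() => v); }
}

const log = [];
const r = Result.pure(20)
  .mapply((x) => x + 1)
  .mbind((x) => (x > 20 ? Result.raise(new Error("too big")) : x))
  .mhandle(() => -1)
  .mfinally(() => log.push("done"));
console.log(r.ok, r.payload, log); // true -1 and log containing "done"
```
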
And the imported core library should have the following free functions:

 * `M` - coerces a value: it returns the argument as-is if it is already
         monadic, or `M.pure(v)` otherwise
 * `M.pure` - returns a monadic value with the inner value from the argument
              and with no effects
 * `M.raise` - returns a monadic value representing an exception throw
 * `M.coerce` - wraps the function in the argument in a `try-catch` block and
                converts thrown exceptions into `M.raise`
 * `M.empty` - monadic value with no answers
 * `M.repeat` - takes a function and initial arguments for it, and applies
                that function infinitely, threading output arguments to the
                input of the next function invocation
 * `M.forPar` - takes a test function, an iteration body function and an
                update function, with the first iteration arguments; the test
                and update functions are pure, all of them receive the
                current iteration arguments, and the update function returns
                the iteration arguments for the next iteration
 * `M.block` - takes a function and executes it, giving it another function
               as an argument for exiting the block; it is used for the
               `break`, `continue` and `yield` encodings
 * `M.scope` - same as `M.block` but for the whole function, for the `return`
               encoding
 * `M.arr` - takes an array of monadic values and returns a monadic value of
             an array of the inner values corresponding to the input array;
             this is from Haskell’s Applicative interface, but adapted for
             more convenient usage in JavaScript
 * `M.spread` - simply converts a function receiving several arguments into a
                function receiving an array of arguments; it will be replaced
                with modern ES spreads once the tool starts generating them

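The way `M.arr` and `M.spread` fit together can be sketched with a trivial
single-value monad. The `Id` class and the standalone `arr` and `spread`
functions are hypothetical and only follow the descriptions in the list
above.

```javascript
// Hypothetical identity-like monadic value holding exactly one inner value.
class Id {
  constructor(value) { this.value = value; }
  mapply(f) { return new Id(f(this.value)); }
}

// Sketch of `M.arr` for Id: array of monadic values -> monadic array.
function arr(values) { return new Id(values.map((m) => m.value)); }

// Sketch of `M.spread`: adapts f(a, b, ...) to accept a single array.
function spread(f) {
  return function (args) { return f.apply(this, args); };
}

// Three independent effectful values combined, then consumed positionally.
const sum = arr([new Id(1), new Id(2), new Id(3)])
  .mapply(spread((a, b, c) => a + b + c));
console.log(sum.value); // 6
```
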
The compiler requires a specific iterator interface. It is not compatible
with ES iterators because they are mutable, while for some monads execution
control may backtrack to some already passed position. In fact, if ES allowed
cloning such iterators, this compiler wouldn't be needed.

The iterator is a function object with a `value` field for the current value.

 * `M.iterator` - takes an ES iterable object and returns an mfjs compatible
    iterator; it is just an interface adapter, but it works like the ES one:
    taking the next iterator invalidates previous ones. The returned iterator
    already points to the first element, or is null if the input collection
    is empty
 * `M.iteratorBuf` - same as `M.iterator` but it keeps all passed values,
    so all iterators stay valid
 * `M.forInIterator` - returns an mfjs compatible iterator for the `for-in`
    statement

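A minimal sketch of an iterator in this style, over a plain array, is shown
below. The function `arrayIterator` is hypothetical, and the calling
convention (invoking the iterator returns the iterator for the next position,
or null at the end) is an assumption based on the description above.

```javascript
// Hypothetical immutable iterator over an array: a function object whose
// `value` field holds the current element; calling it returns the iterator
// for the next position, or null at the end. Old positions stay valid.
function arrayIterator(items, index = 0) {
  if (index >= items.length) return null;
  const next = () => arrayIterator(items, index + 1);
  next.value = items[index];
  return next;
}

const first = arrayIterator(["a", "b", "c"]);
const second = first();
const third = second();
console.log(first.value, second.value, third.value, third()); // a b c null
console.log(first().value); // b (the `first` position was not invalidated)
```

Because every position is a separate immutable object, a backtracking monad
can resume from `first` any number of times, which is the property a mutable
ES iterator cannot offer.
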
## Selective transform

Because of coercions it is pretty much fine to transform everything with the
"full" profile, but the overhead of some heavy monads is quite noticeable.
There are means to apply transformation options to only some parts of the
code. There are the predefined profiles mentioned before, but there is also a
flexible tuning utility. You may specify a very custom project policy based
on the code conventions of your project.

The policy tool is based on an extended but still very low-level finite state
machine. It is very simple and flexible, but some policy definitions may be
big because it is so low-level. There are plans to implement a higher-level
system; however, it is probably a good idea to refrain from creating complex
policies, as they may lead to incomprehensible code.

States and transitions are defined in the `transform.states` map in the
configuration. The default one is defined in `config.js`.

### M.profile

The directive simply moves the policy machine into the specified state. Its
scope is a block delimited with curly brackets. On exit from the current
block the old state is restored. There are a few predefined states.

 * "defaultMinimal" - in subsequent function definitions, doesn’t translate
    anything unless it is specified as an exception in the configuration
    (for example the `M` function).
 * "defaultFull" - in subsequent function definitions, treats all function
    calls as effectful except the exceptions from the configuration (by
    default, for example, `window`, `process` and `console` functions are
    exceptions)
 * "full" - same as "defaultFull" but starts immediately
 * "minimal" - same as "defaultMinimal" but starts immediately

### M.option

The function augments the policy state machine definition. It simply merges
the object from its argument into `transform.states` and recalculates caches.
So the function allows creating new states and changing the options and
matchers in the current state (for example to add more exceptions to the
"full" and "minimal" policies). Its scope is the function definition scope.
When translation exits the function definition where `M.option` was called,
it reverts the states object to the one used before entering the function.

During transformation, each AST node is matched using a custom predicate
specific to the current state. If it is matched, it may further instruct the
system to move to some other state, or just return some set of options used
to guide the transformation process.

The transition system is encoded as a plain JS object. Each field is either
an option value or some key used to specify a more exact match. On the first
level, the map keys are
[ESTree](https://github.com/estree/estree/blob/master/spec.md) node type
names and default options for all nodes. On the second level, an arbitrary
predicate may be used for each node type (its name is specified in the
`select` field). It may be registered by adding a function to the
`require('@mfjs/compiler/policy').selector` map. However, in the present
version it is better to use only the predefined ones because of likely
changes in the near future.

To match a declaration or a function call by name, use the `matchDeclName` or
`matchCallName` selectors. As configuration they use the `match` and `cases`
fields. The first assigns some key to some name pattern, and the second maps
the key to the options of the next level. The pattern is an object with
`prefix`, `postfix`, `name`, `package` and `qname` fields. These are also
objects, with fields corresponding to the name’s prefix, postfix, full name,
package name (the first name of a qualified name) and qname (fully qualified
name) respectively, and the values of the fields are the keys to be looked up
in `cases` for further instructions.

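The name pattern lookup can be pictured with the following sketch. The
function `matchName` is hypothetical, and the order in which the pattern
fields are tried is an assumption made for the illustration; the actual
selectors in the compiler may resolve conflicts differently.

```javascript
// Hypothetical matcher: returns the key assigned to `qname` by `pattern`,
// or undefined when nothing matches.
function matchName(pattern, qname) {
  const parts = qname.split(".");
  const name = parts[parts.length - 1];
  const pkg = parts.length > 1 ? parts[0] : undefined;
  const has = (table, key) =>
    table && Object.prototype.hasOwnProperty.call(table, key);
  if (has(pattern.qname, qname)) return pattern.qname[qname];
  if (has(pattern.name, name)) return pattern.name[name];
  if (pkg !== undefined && has(pattern.package, pkg)) return pattern.package[pkg];
  for (const [prefix, key] of Object.entries(pattern.prefix || {})) {
    if (name.startsWith(prefix)) return key;
  }
  for (const [postfix, key] of Object.entries(pattern.postfix || {})) {
    if (name.endsWith(postfix)) return key;
  }
  return undefined;
}

// Names ending in "M" get the key `true`; everything else is unmatched.
console.log(matchName({ postfix: { M: true } }, "loadM")); // true
console.log(matchName({ postfix: { M: true } }, "load"));  // undefined
```

The returned key would then be looked up in `cases` to find the options or
the next state for the matched node.
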
There are `sub` and `next` fields to specify the state to transition to if
the node is matched. With a `sub` transition the state is reverted after the
node is handled; with a `next` transition it is kept until the next jump or
the exit from the function definition.

For example, here we augment the `minimal` mode to translate from monadic to
pure all function calls where the function name ends with the letter M.

```javascript
M.option({minimal:{CallExpression:{match:{postfix:{M:true}}}}});
```

More details will follow in the upcoming reference.

Here is the set of possible options:

 * `bind` - the main option; it specifies whether the current node is to be
   treated as effectful or not, i.e. whether the compiler needs to inject
   code translating the effectful value to a pure one for it
 * `compile` - compiles (transforms) the function definition
 * `coerce` - `enum {none, value, all}`, specifies whether the compiler needs
    to add coercions where needed
 * `expr` - if it is equal to `seq`, expressions will follow the JS rules for
    the order of execution of sub-expressions
 * `bindAssoc` - if the value is "right" it will prefer generating right
   associative binds. If the monad definition satisfies the monad laws, the
   generated code should have the same semantics as the default left
   associative one, but the code may be cleaner because some variable
   threading may be avoided.
 * `loop` - if its value is "seq" the compiler will not generate `M.forPar`
    loops
 * `subScope` - if true it will detect the JS trick for variable scopes
   (i.e. `(function() { /*...*/ })()`) and treat it as a scope, not as a
   function declaration (default is true).
 * `keepScope` - if it is true the compiler doesn’t remove useless `M.scope`
    calls
 * `varCapt` - doesn't do variable capturing if false
 * `keepForOf` - for pure functions, don't translate `for-of` statements


## Directives

The toolset doesn't introduce any syntax extensions; however, it uses a set
of predefined functions as directives to provide some translation options.
They are executed at compile time. Here is the list of currently used ones:

 * `M/M.reflect` - converts from an effectful expression to a pure one
 * `M.p` - converts from a pure expression to an effectful one
 * `M.reify` - same as `M.p` but receives a function expression which is to
    be called immediately
 * `M.$` - placeholder position
 * `M.option` - changes state transition definitions
 * `M.profile` - changes the current state
 * `M.ref` - treats the variables in its arguments as references and doesn't
    capture them
 * `M.require` - replacement for CommonJS require allowing some compile
    time actions defined in an external module
 * `M.answer/M.yield` - may be used as a replacement for the `yield`
    expression

## LICENSE

Copyright © 2016 Vitaliy Akimov

Distributed under the terms of The MIT License (MIT).
511
512Copyright © 2016 Vitaliy Akimov
513
514Distributed under the terms of the The MIT License (MIT).
515
516