*dataflo.ws*: dataflow processing for node.js
=============================================

example
-------------------------------

you can see a working example by running

    npm install -g dataflo.ws
    cd $NODE_PATH/dataflo.ws/example/yql/ # example project directory
    dataflows daemon yql

abstract
-------------------------------

every application is based on a series of dataflows. if your dataflows are
hardwired in program code, you have a problem: people don't want to screw up,
but they do. `dataflo.ws` is an abstract async processing framework for
describing dataflows in simple configuration.

you can use a dataflow to describe programmatic processes, or real-life ones.

add a DSL to your own taste.

concept
-------------------------------

this project's concept was born by combining the [flow-based paradigm](http://en.wikipedia.org/wiki/Flow-based_programming) and
[responsibility-driven design](http://en.wikipedia.org/wiki/Responsibility-driven_design).

typically, dataflo.ws is designed for use in client-server applications.
an external system (client) makes a request and an initiator receives it.
after parsing the request, the initiator decides which dataflow is responsible
for this type of request. once the decision is made and a dataflow is found,
the initiator starts that dataflow. a dataflow consists of a series of tasks.
tasks have input parameters and may produce a response value. tasks don't talk
one-to-one; instead, output from a task is delivered to the dataflow
(a controlling routine) via event data (a message), and input parameters are
provided to the task by the dataflow.

thus, tasks have [loose coupling](http://en.wikipedia.org/wiki/Coupling_(computer_science))
and can be [distributed](http://en.wikipedia.org/wiki/Distributed_data_flow).

tasks must be designed with [functional cohesion](http://en.wikipedia.org/wiki/Cohesion_(computer_science))
in mind, which leads to a good system design.

terms
------------------------------

### initiator ###

it's an entry point generator. every dataflow has one entry point
from one initiator.

for example, an initiator can be an http daemon, an amqp listener,
a file system watcher (file changed), a process signal handler,
a timer or any other external thing.

### entry point ###

it's a trigger for a specific dataflow. each initiator has its own entry point
description.

for example, an entry point for an http daemon is a url. httpd may have a simple
url match config or a complex route configuration; you decide which initiator handles requests.

### dataflow ###

a set of tasks, which leads the dataflow to success or failure.

### task ###

a black processing box.

every `task` has its own requirements. requirements can be `entry point` data (for
example, a query string in the case of an httpd `initiator`) or another `task`'s data.
if the requirements are met, the `task` starts execution. after execution the `task`
produces a dataset.

real life example
-------------------------------

you want to visit a conference. your company does all the dirty bureaucratic
work for you, but you must do some bureaucratic things too:

1. receive approval from your boss.
2. wait for all the documents needed for the conference (foreign visa, travel
tickets, hotel booking, conference fee payment)
3. get a travel allowance
4. visit the conference and make a report for the finance department (hotel
invoice, airline tickets, taxi receipts and so on)
5. give a presentation about the conference

then the tasks look like:

    conferenceRequest (
        conferenceInfo
    ) -> approve

after approval we can book a hotel

    hotelBookingRequest (
        approve
    ) -> hotelBooking

documents for the visa must already contain the conference info and the booking

    visaRequest (
        approve, conferenceInfo, hotelBooking
    ) -> visa

we pay for the conference and the tickets after the visa is received

    conferenceFeePay (
        approve, visa
    ) -> conferenceFee

    ticketsPay (
        approve, visa
    ) -> tickets


synopsis
-------------------------------

    var httpd = require ('initiator/http');

    var httpdConfig = {
        "dataflows": [{
            "url": "/save",
            "tasks": [{
                "$class": "post",
                "request": "{$request}",
                "$set": "data.body"
            }, {
                "$class": "render",
                "type": "json",
                "data": "{$data.body}",
                "output": "{$response}"
            }]
        }, {
            "url": "/entity/tickets/list.json",
            "tasks": [{
                "$class": "mongoRequest",
                "connector": "mongo",
                "collection": "messages",
                "$set": "data.filter"
            }, {
                "$class": "render",
                "type": "json",
                "data": "{$data.filter}",
                "output": "{$response}"
            }]
        }]
    };

    var initiator = new httpd (httpdConfig);


implementation details
-----------------------

### initiator ###

an initiator makes a request object, which contains all the basic info about a particular request. "basic info" doesn't mean you have received all the data, only everything required to complete the request data.

example: using the httpd initiator, you receive all GET data (i.e. the query string), but in the case of a POST request, you'll need to receive the POST body yourself (using a task within a dataflow).

### task ###

every task has its own state and requirements. all the task states:

* scarce - the task is in its initial state; there is not enough data to fulfill its requirements
* ready - the task is ready to run (all its requirements are met)
* running - the dataflow has decided to launch this task
* idle - not implemented
* completed - the task completed without errors
* error - the task completed with errors
* skipped - the task was skipped, because another execution branch was selected (see below)
* exception - an exception was thrown somewhere


### flow ###

the flow module checks task requirements and switches a task's state to `ready`.
if any running slots are available, the flow starts task execution.


how to write your own task
--------------------------

assume we have a task for checking file stats.

first of all, we need to load the task base class along with the `fs` and `util` node modules:

    var task = require ('task/base'),
        util = require ('util'),
        fs   = require ('fs');

next, we write the constructor of our class. a constructor receives all
of the task parameters and must call `this.init (config)` after all preparations.

    var statTask = module.exports = function (config) {
        this.path = config.path;
        this.init (config);
    };

next, we inherit from the task base class and add a `run` method to our stat class:

    util.inherits (statTask, task);

    util.extend (statTask.prototype, {

        run: function () {

            var self = this;

            fs.stat (self.path, function (err, stats) {
                if (err) {
                    self.emit ('warn', 'stat error for: ' + self.path);
                    self.failed (err);
                    return;
                }

                self.emit ('log', 'stat done for: ' + self.path);
                self.completed (stats);
            });
        }

    });

in the code above i've used these methods:

* emit - emits a message to subscribers
* failed - marks the task as failed
* completed - marks the task as completed successfully

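with the class in place, the task can be referenced from a dataflow config
much like the synopsis above. this fragment is an assumption about the
wiring: the `$class` value and the parameter interpolation mirror the
synopsis, but how a class name resolves to a module depends on your project
layout.

```javascript
// hypothetical config using the stat task from above
var httpdConfig = {
	"dataflows": [{
		"url": "/stat",
		"tasks": [{
			"$class": "stat",            // assumed to resolve to our statTask module
			"path": "/var/log/app.log",  // becomes config.path in the constructor
			"$set": "data.stats"         // where the fs.Stats result is stored
		}, {
			"$class": "render",
			"type": "json",
			"data": "{$data.stats}",
			"output": "{$response}"
		}]
	}]
};
```

the `render` task then serializes whatever `completed (stats)` produced.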

see also
---------------------------

<http://docs.constructibl.es/specs/js/>

license
---------------------------

[MIT License](http://opensource.org/licenses/mit-license.html)