---
title: DataSources
layout: default
category: Kettle
---
# DataSources

A DataSource is an Infusion component which meets a simple contract for read/write access to indexed data.
The DataSource semantic is simple, broadly the same as that encoded in
[CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete), although the current DataSource semantic does
not provide explicitly for deletion.

The concrete DataSources in Kettle provide support for HTTP endpoints (with a particular variety specialised for
accessing CouchDB databases with CRUD-like semantics) as well as the filesystem, with an emphasis on JSON payloads.

The DataSource API consists of the following two methods. A read-only DataSource will implement just `get`, and a
writeable DataSource will implement both `get` and `set`:

```javascript
/* @param directModel {Object} A JSON structure holding the "coordinates" of the state to be read -
 * this model is morally equivalent to (the substitutable parts of) a file path or URL
 * @param options {Object} [Optional] A JSON structure holding configuration options good for just
 * this request. These will be specially interpreted by the particular concrete grade of DataSource -
 * there are no options valid across all implementations of this grade.
 * @return {Promise} A promise representing successful or unsuccessful resolution of the read state
 */
dataSource.get(directModel, options);
/* @param directModel {Object} As for get
 * @param model {Object} The state to be written to the coordinates
 * @param options {Object} [Optional] A JSON structure holding configuration options good for just
 * this request. These will be specially interpreted by the
 * particular concrete grade of DataSource - there are no options valid across all implementations
 * of this grade. For example, a URL DataSource will accept an option `writeMethod` which will
 * allow the user to determine which HTTP method (PUT or POST) will be used to implement the write
 * operation.
 * @return {Promise} A promise representing resolution of the written state,
 * which may also optionally resolve to any returned payload from the write process
 */
dataSource.set(directModel, model, options);
```

## Simple example of using an HTTP dataSource

In this example we define and instantiate a simple HTTP-backed dataSource accepting one argument to configure a URL
segment:

```javascript
var fluid = require("infusion"),
    kettle = require("../../kettle.js"),
    examples = fluid.registerNamespace("examples");


fluid.defaults("examples.httpDataSource", {
    gradeNames: "kettle.dataSource.URL",
    url: "http://jsonplaceholder.typicode.com/posts/%postId",
    termMap: {
        postId: "%directPostId"
    }
});

var myDataSource = examples.httpDataSource();
var promise = myDataSource.get({directPostId: 42});

promise.then(function (response) {
    console.log("Got dataSource response of ", response);
}, function (error) {
    console.error("Got dataSource error response of ", error);
});
```

You can run this snippet from our code samples by running `node simpleDataSource.js` from
[examples/simpleDataSource](../examples/simpleDataSource) in our samples area.
This contacts the JSON placeholder API service at
[`jsonplaceholder.typicode.com`](http://jsonplaceholder.typicode.com/)
to retrieve a small JSON document holding some placeholder text. If you get a 404 or an error, please contact us and
we'll update this sample to contact a new service.

An interesting element in this snippet is the `termMap` configured in the options of our dataSource. This sets up an
indirection between the `directModel` supplied as the argument to the `dataSource.get` call, and the URL issued in the
HTTP request. The keys in the `termMap` are interpolation variables in the URL, which in the URL are prefixed by `%`.
The values in the `termMap` represent either:

* Plain values to be interpolated as strings directly into the URL, or
* If the first character of the value in the `termMap` is `%`, a path (the remainder of the string) which will
  be dereferenced from the `directModel` argument to the current `set` or `get` request.

In addition, if the term value has the prefix `noencode:`, it will be interpolated without any
[URI encoding](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent).
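
The interpolation rules above can be sketched in a few lines of plain JavaScript. This is an illustrative
approximation only, not Kettle's implementation: the names `getPath` and `resolveTemplate` are hypothetical, the
example URL is a placeholder, and the assumption that the `noencode:` prefix is written before any `%` path marker
is ours.

```javascript
// Hypothetical helper: dereference a dotted path such as "user.id" from an object
function getPath(model, path) {
    return path.split(".").reduce(function (value, segment) {
        return value === undefined || value === null ? undefined : value[segment];
    }, model);
}

// Illustrative sketch of termMap interpolation (not Kettle's implementation)
function resolveTemplate(template, termMap, directModel) {
    var terms = {};
    Object.keys(termMap).forEach(function (key) {
        var entry = String(termMap[key]);
        var encode = true;
        // Assumption: the "noencode:" prefix comes before any "%" path marker
        if (entry.indexOf("noencode:") === 0) {
            encode = false;
            entry = entry.substring("noencode:".length);
        }
        // A leading "%" means: look the remainder up as a path in directModel
        if (entry.charAt(0) === "%") {
            entry = String(getPath(directModel, entry.substring(1)));
        }
        terms[key] = encode ? encodeURIComponent(entry) : entry;
    });
    // Replace each %key in the template with its resolved value
    return template.replace(/%([A-Za-z0-9_]+)/g, function (match, key) {
        return Object.prototype.hasOwnProperty.call(terms, key) ? terms[key] : match;
    });
}

var resolved = resolveTemplate(
    "http://example.com/%kind/%postId/%suffix",
    {kind: "posts", postId: "%directPostId", suffix: "noencode:%rawSuffix"},
    {directPostId: "a b", rawSuffix: "x/y"}
);
console.log(resolved);
```

Here `kind` is a plain value, `postId` is looked up in the `directModel` and URI encoded (so the space becomes
`%20`), and `suffix` is looked up but left unencoded, so its `/` survives intact.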

We document these configuration options in the next section:

## Configuration options accepted by `kettle.dataSource.URL`

<table>
    <thead>
        <tr>
            <th colspan="3">Supported configurable options for a <code>kettle.dataSource.URL</code></th>
        </tr>
        <tr>
            <th>Option Path</th>
            <th>Type</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><code>writable</code></td>
            <td><code>Boolean</code> (default: <code>false</code>)</td>
            <td>If this option is set to <code>true</code>, a <code>set</code> method will be fabricated for this
                dataSource. Otherwise, it will implement only a <code>get</code> method.</td>
        </tr>
        <tr>
            <td><code>writeMethod</code></td>
            <td><code>String</code> (default: <code>PUT</code>)</td>
            <td>The HTTP method to be used when the <code>set</code> method is operated on this writable DataSource
                (with <code>writable: true</code>). This defaults to <code>PUT</code> but
                <code>POST</code> is another option. Note that this option can also be supplied within the
                <code>options</code> argument to the <code>set</code> method itself.</td>
        </tr>
        <tr>
            <td><code>url</code></td>
            <td><code>String</code></td>
            <td>A URL template, with interpolable elements expressed by terms beginning with the <code>%</code>
                character, for the URL which will be operated by the <code>get</code> and <code>set</code> methods of
                this dataSource.</td>
        </tr>
        <tr>
            <td><code>termMap</code></td>
            <td><code>Object</code> (map of <code>String</code> to <code>String</code>)</td>
            <td>A map, of which the keys are some of the interpolation terms held in the <code>url</code> string,
                and the values will be used to perform the interpolation. If a value begins with <code>%</code>,
                the remainder of the string represents a
                <a href="http://docs.fluidproject.org/infusion/development/FrameworkConcepts.html#el-paths">path</a>
                into the <code>directModel</code> argument accepted by the <code>get</code> and <code>set</code>
                methods of the DataSource. By default any such values looked up will be
                <a href="https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent">
                URI Encoded</a> before being interpolated into the URL, unless their value in the termMap is
                prefixed by the string <code>noencode:</code>.</td>
        </tr>
        <tr>
            <td><code>notFoundIsEmpty</code></td>
            <td><code>Boolean</code> (default: <code>false</code>)</td>
            <td>If this option is set to <code>true</code>, a fetch of a nonexistent resource (that is, a
                nonexistent file, or an HTTP resource giving a 404) will result in a <code>resolve</code> with an empty
                payload rather than a <code>reject</code> response.</td>
        </tr>
        <tr>
            <td><code>censorRequestOptionsLog</code></td>
            <td><code>Object</code> (map of <code>String</code> to <code>Boolean</code>) (default:
                <code>{auth: true, "headers.Authorization": true}</code>)
            </td>
            <td>A map of paths into the <a href="https://nodejs.org/api/http.html#http_http_request_options_callback">
                request options</a> which should be censored from appearing in logs. Any path which maps to
                <code>true</code> will not appear in the logging output, whether derived from the request options
                parsed from the url or from the url itself.
            </td>
        </tr>
        <tr>
            <td><code>components.encoding.type</code></td>
            <td><code>String</code> (grade name)</td>
            <td>A <code>kettle.dataSource.URL</code> has a subcomponent named <code>encoding</code> which the user can
                override in order to choose the encoding used to read and write the <code>model</code>
                object to and from the textual form in persistence. This defaults to
                <code>kettle.dataSource.encoding.JSON</code>. Other built-in encodings include
                <code>kettle.dataSource.encoding.formenc</code> operating HTML
                <a href="http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded">form
                encoding</a> and <code>kettle.dataSource.encoding.none</code> which applies no encoding.
                More details in <a href="#using-content-encodings-with-a-datasource">Using Content Encodings with a
                DataSource</a>.</td>
        </tr>
        <tr>
            <td><code>setResponseTransforms</code></td>
            <td><code>Array of String</code> (default: <code>["encoding"]</code>)</td>
            <td>Contains a list of the namespaces of the transform elements (see section
                <a href="#transforming-promise-chains">transforming promise chains</a>) that are to be applied if there
                is a response payload from the <code>set</code> method, which is often the case with an HTTP backend.
                With a JSON encoding this encoding typically happens symmetrically: with a JSON request one will
                receive a JSON response. However, with other encodings such as
                <a href="http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded">form
                encoding</a> this is often not the case and one might like to defeat the effect of trying to decode
                the HTTP response as a form. In this case, for example, one can override
                <code>setResponseTransforms</code> with the empty array <code>[]</code>.</td>
        </tr>
        <tr>
            <td><code>charEncoding</code></td>
            <td><code>String</code> (default: <code>utf8</code>)</td>
            <td>The character encoding of the incoming HTTP stream used to convert its data to characters. This will
                be sent directly to the
                <a href="https://nodejs.org/api/stream.html#stream_readable_setencoding_encoding"><code>setEncoding</code></a>
                method of the response stream.</td>
        </tr>
        <tr>
            <td><code>invokers.resolveUrl</code></td>
            <td><a href="http://docs.fluidproject.org/infusion/development/Invokers.html"><code>IoC Invoker</code></a>
                (default: <code>kettle.dataSource.URL.resolveUrl</code>)</td>
            <td>This invoker can be overridden to customise the process of building the url for a dataSource request.
                The default implementation uses an invocation of
                <a href="http://docs.fluidproject.org/infusion/development/CoreAPI.html#fluid-stringtemplate-template-terms-"><code>fluid.stringTemplate</code></a>
                to interpolate elements from <code>termMap</code> and the <code>directModel</code> argument into the
                template string held in <code>url</code>. By overriding this invoker, the user can implement a
                strategy of their choosing. The supplied arguments to the invoker consist of the values
                <code>(url, termMap, directModel)</code> taken from these options and the dataSource request arguments,
                but the override can replace these with any IoC-sourced values in the invoker definition.</td>
        </tr>
    </tbody>
</table>

In addition, a `kettle.dataSource.URL` component will accept any options accepted by node's native
[`http.request`](https://nodejs.org/api/http.html#http_http_request_options_callback) constructor. Supported in
addition to the above are `protocol`, `host`, `port`, `headers`, `hostname`, `family`, `localAddress`, `socketPath`,
`auth` and `agent`. All of these options will be overridden by options of the same names in the `options` object
supplied as the last argument to the dataSource's `get` and `set` methods. This is a good way, for example, to send
custom HTTP headers along with a URL dataSource request. Note that any of these component-level options (e.g. `port`,
`protocol`, etc.) that could also be derived from parsing the `url` option will override the value from the url.
Compare this setup with the very similar one operated in the testing framework for
[`kettle.test.request.http`](KettleTestingFramework.md#kettle.test.request.http).
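
The order of precedence described above (values parsed from the `url`, then component-level options, then
per-request `options`) can be illustrated with a small standalone sketch. This is not Kettle's implementation: the
function name `resolveRequestOptions` is hypothetical, and the shallow merge used here is a simplifying assumption
chosen only to show the ordering.

```javascript
// Illustrative sketch of option precedence (not Kettle's implementation).
// Assumption: a shallow merge is enough to show the ordering; the real merge may differ.
function resolveRequestOptions(componentOptions, requestOptions) {
    var parsed = new URL(componentOptions.url);
    // Lowest precedence: values derived from parsing the url
    var fromUrl = {
        protocol: parsed.protocol,
        hostname: parsed.hostname,
        port: parsed.port,
        path: parsed.pathname + parsed.search
    };
    // Component-level options override parsed values; per-request options override both
    return Object.assign({}, fromUrl, componentOptions, requestOptions);
}

var merged = resolveRequestOptions(
    {url: "http://localhost:8080/posts/42", port: 9090, headers: {Accept: "application/json"}},
    {headers: {"X-Custom-Header": "demo"}}
);
console.log(merged.port, merged.headers);
```

In this sketch the component-level `port` wins over the port parsed from the url, and the `headers` supplied with
the request replace those configured on the component. With a real dataSource, the per-request options are simply
passed as the last argument to `get` or `set`.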

## Configuration options accepted by `kettle.dataSource.file`

An alternative dataSource implementation is `kettle.dataSource.file`. This is backed by the node filesystem API to
allow files to be read and written in various encodings. The interpolation support based on `termMap` is very similar
to that for `kettle.dataSource.URL`, but with the location template option named `path` representing an absolute
filesystem path rather than the `url` property of `kettle.dataSource.URL` representing
a URL.

Exactly the same scheme based on the subcomponent named `encoding` can be used to control content encoding for a
`kettle.dataSource.file` as for a `kettle.dataSource.URL`. Similarly, `kettle.dataSource.file` supports
a further option named `charEncoding` which can select between various of the character encodings supported by node.js.

<table>
    <thead>
        <tr>
            <th colspan="3">Supported configurable options for a <code>kettle.dataSource.file</code></th>
        </tr>
        <tr>
            <th>Option Path</th>
            <th>Type</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><code>writable</code></td>
            <td><code>Boolean</code> (default: <code>false</code>)</td>
            <td>If this option is set to <code>true</code>, a <code>set</code> method will be fabricated for this
                dataSource. Otherwise, it will implement only a <code>get</code> method.</td>
        </tr>
        <tr>
            <td><code>path</code></td>
            <td><code>String</code></td>
            <td>An (absolute) file path template, with interpolable elements expressed by terms beginning with the
                <code>%</code> character, for the file which will be read and written by the <code>get</code> and
                <code>set</code> methods of this dataSource.</td>
        </tr>
        <tr>
            <td><code>termMap</code></td>
            <td><code>Object</code> (map of <code>String</code> to <code>String</code>)</td>
            <td>A map, of which the keys are some of the interpolation terms held in the <code>path</code> string, and
                the values, if prefixed by <code>%</code>, are paths into the <code>directModel</code> argument
                accepted by the <code>get</code> and <code>set</code> methods of the DataSource.</td>
        </tr>
        <tr>
            <td><code>charEncoding</code></td>
            <td><code>String</code> (default: <code>utf8</code>)</td>
            <td>The character encoding of the file used to convert its data to characters. This is one of the values
                supported by the <a href="https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options">node
                filesystem API</a>. Values it advertises include <code>utf8</code>, <code>ascii</code> and
                <code>base64</code>. There is also evidence of support for <code>ucs2</code>.</td>
        </tr>
    </tbody>
</table>

A helpful mixin grade for `kettle.dataSource.file` is `kettle.dataSource.file.moduleTerms` which will allow
interpolation by any module name registered with the Infusion module system
[`fluid.module.register`](http://docs.fluidproject.org/infusion/development/NodeAPI.html#fluid-module-register-name-basedir-modulerequire-),
e.g. `%kettle/tests/data/couchDataSourceError.json`.

## Using content encodings with a DataSource

`kettle.dataSource.URL` has a subcomponent named `encoding` which the user can override in order to choose the content
encoding used to convert the model seen at the `get/set` API to the textual (character) form in which it is
transmitted by the dataSource. The encoding subcomponent will also correctly set the
[`Content-Type`](http://www.w3.org/Protocols/rfc1341/4_Content-Type.html) header of the outgoing HTTP request in the
case of a `set` request. The encoding defaults to a JSON encoding represented by a subcomponent of type
`kettle.dataSource.encoding.JSON`. Here is an example of choosing a different encoding to submit
[form encoded](http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded) data to an HTTP
endpoint:

```javascript
fluid.defaults("examples.formDataSource", {
    gradeNames: "kettle.dataSource.URL",
    url: "http://httpbin.org/post",
    writable: true,
    writeMethod: "POST",
    components: {
        encoding: {
            type: "kettle.dataSource.encoding.formenc"
        }
    },
    setResponseTransforms: [] // Do not parse the "set" response as formenc - it is in fact JSON
});

var myDataSource = examples.formDataSource();
var promise = myDataSource.set(null, {myField1: "myValue1", myField2: "myValue2"});

promise.then(function (response) {
    console.log("Got dataSource response of ", JSON.parse(response));
}, function (error) {
    console.error("Got dataSource error response of ", error);
});
```

In this example we set up a form-encoded, writable dataSource targeted at the popular HTTP testing site `httpbin.org`
sending a simple payload encoding two form elements. We use Kettle's built-in form encoding grade by configuring an
`encoding` subcomponent of type `kettle.dataSource.encoding.formenc`. You can try out this sample live in its place in
the [examples directory](examples/formDataSource/formDataSource.js). Note that since this particular endpoint sends a
JSON response rather than a form-encoded response,
we need to defeat the dataSource's attempt to apply the inverse decoding in the response by writing
`setResponseTransforms: []`.

## Built-in content encodings

Kettle features four built-in content encoding grades which can be configured as the subcomponent of a dataSource
named `encoding` in order to determine what encoding it applies to models. They are described in this table:

|Grade name| Encoding type | Content-Type header |
|----------|---------------|----------------|
|`kettle.dataSource.encoding.JSON`|[JSON](http://json.org)|`application/json`|
|`kettle.dataSource.encoding.JSON5`|[JSON5](http://json5.org)|`application/json5`|
|`kettle.dataSource.encoding.formenc`|[form encoding](http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded)|`application/x-www-form-urlencoded`|
|`kettle.dataSource.encoding.none`|No encoding|`text/plain`|

## Elements of an encoding component

You can operate a custom encoding by implementing a grade with the following elements, and using it as the `encoding`
subcomponent in place of one of the built-in implementations in the above table:

|Member name| Type | Description |
|-----------|------|-------------|
|`parse`|`Function (String) -> Any`| Parses the textual form of the data from its encoded form into the in-memory form|
|`render`|`Function (Any) -> String`| Renders the in-memory form of the data into its textual form|
|`contentType`|`String`| Holds the value that should be supplied in the [`Content-Type`](http://www.w3.org/Protocols/rfc1341/4_Content-Type.html) of an outgoing HTTP request whose body is encoded in this form|
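
As a rough illustration of this contract, the following sketch shows the three members for a made-up encoding which
stores an array of strings as newline-separated text. The name `linesEncoding` and the encoding itself are
hypothetical, and a plain object is used here only so that the sketch runs on its own; in Kettle these members would
belong to an Infusion grade used as the `encoding` subcomponent.

```javascript
// Hypothetical encoding: an array of strings stored as newline-separated text.
// A plain object stands in for an Infusion grade so that the sketch is self-contained.
var linesEncoding = {
    // Value intended for the Content-Type header of an outgoing request
    contentType: "text/plain",
    // Textual form -> in-memory form
    parse: function (text) {
        return text === "" ? [] : text.split("\n");
    },
    // In-memory form -> textual form
    render: function (model) {
        return model.join("\n");
    }
};

var text = linesEncoding.render(["alpha", "beta"]);
var roundTripped = linesEncoding.parse(text);
console.log(JSON.stringify(text), roundTripped);
```

The important property is that `parse` and `render` are inverses of each other, so that a model written through
`set` can be recovered through `get`.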

## The `kettle.dataSource.CouchDB` mixin grade

Kettle includes a further mixin grade, `kettle.dataSource.CouchDB`, which is suitable for reading and writing to the
[`doc`](http://docs.couchdb.org/en/1.6.1/api/document/common.html) URL space of a [CouchDB](http://couchdb.apache.org/)
database.
This can be applied to either a `kettle.dataSource.URL` or a `kettle.dataSource.file` (the latter clearly only useful
for testing purposes). This is a basic implementation which simply adapts the base documents in this API to a simple
CRUD contract, taking care of:

* Packaging and unpackaging the special `_id` and `_rev` fields which appear at top level in a CouchDB document
  * The user's document is in fact escaped in a top-level path named `value` to avoid conflicts between its keys and
    any of those of the CouchDB machinery. If you wish to change this behavior, you can do so by providing different
    [model transformation rules](http://docs.fluidproject.org/infusion/development/ModelTransformationAPI.html) in
    `options.rules.readPayload` and `options.rules.writePayload`.
* Applying a "read-before-write" of the `_rev` field to minimise (but not eliminate completely) the possibility for a
  Couch-level conflict

This grade is not properly tested and still carries some (though very small) risk of a conflict during update, so it
should be used with caution. Please contact the development team if you are interested in improved Couch-specific
functionality.
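
The packaging described in the list above can be pictured with the following standalone sketch. It is only an
illustration under stated assumptions: the function names `toCouchDocument` and `fromCouchDocument` are hypothetical,
the sample revision string is made up, and the exact shape Kettle writes (for example whether `_id` is included in
the body) is assumed rather than taken from the implementation.

```javascript
// Illustrative sketch only: function names are hypothetical, not Kettle's API.
// Wrap the user's document under "value", alongside CouchDB's own fields.
function toCouchDocument(userDocument, id, existing) {
    var couchDocument = {_id: id, value: userDocument};
    // "Read-before-write": reuse the revision of the document currently stored, if any
    if (existing && existing._rev) {
        couchDocument._rev = existing._rev;
    }
    return couchDocument;
}

// Unwrap the user's document from the stored CouchDB document
function fromCouchDocument(couchDocument) {
    return couchDocument.value;
}

// A document as it might already be stored (the revision string is made up)
var stored = {_id: "prefs", _rev: "1-abc", value: {theme: "dark"}};
var updated = toCouchDocument({theme: "light"}, "prefs", stored);
console.log(updated, fromCouchDocument(updated));
```

Because the revision is read shortly before the write rather than atomically with it, another writer can still
update the document in between, which is why the risk of a conflict is reduced but not removed.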

## Advanced implementation notes on DataSources

This section contains a few notes for advanced users of DataSources who are interested in extending their
functionality or else in issuing I/O in Kettle by other means.

### Transforming promise chains

The detailed implementation of the Kettle DataSource is structured around a particular device taken from the Infusion
Promises library, the concept of a
["transforming promise chain"](http://docs.fluidproject.org/infusion/development/PromisesAPI.html#fluid-promise-firetransformevent-event-payload-options-).
The core DataSource grade implements two events, `onRead` and `onWrite`. These events are fired during the `get` and
`set` operations of the DataSource, respectively.
These events are better described as "pseudoevents" since they are not fired in the conventional way. Rather than each
event listener receiving the same signature, each instead receives the payload returned by the previous listener. It
may then transform this payload and produce its own return in the form of a promise. Any promise rejection terminates
the listener notification chain and propagates the failure to the caller. The DataSource implementation in fact fires
these events by invoking the
[`fireTransformEvent`](http://docs.fluidproject.org/infusion/development/PromisesAPI.html#fluid-promise-firetransformevent-event-payload-options-)
function from Infusion's Promises API.

The virtue of this implementation strategy is that extra stages of processing
for the DataSource can be inserted and removed from any part of the processing chain by means of supplying suitable
event [priorities](http://docs.fluidproject.org/infusion/development/Priorities.html) to
the event's
[listeners](http://docs.fluidproject.org/infusion/development/InfusionEventSystem.html#registering-a-listener-to-an-event).
Both the JSON encoding/decoding and CouchDB wrapping/unwrapping facilities for the DataSources are implemented in
terms of event listeners of this type, rather than in terms of conditional implementation code. This is a powerful and
open implementation strategy which we plan to extend in future.
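
The idea of a transforming chain can be sketched with native promises. This is not Infusion's `fireTransformEvent`:
the function name `fireTransformChain`, the listener record shape and the numeric priority scheme (a larger number
runs earlier) are all assumptions made for the purpose of illustration.

```javascript
// Illustrative sketch of a "transforming promise chain" using native Promises.
// Assumptions: listeners are plain records, and a larger priority number runs earlier.
function fireTransformChain(listeners, initialPayload) {
    var ordered = listeners.slice().sort(function (a, b) {
        return b.priority - a.priority;
    });
    // Each listener receives the value produced by the previous one;
    // a rejection skips all remaining listeners and reaches the caller.
    return ordered.reduce(function (promise, listener) {
        return promise.then(listener.transform);
    }, Promise.resolve(initialPayload));
}

var readChain = [
    // Unwrapping stage: runs second because of its lower priority
    {priority: 1, transform: function (model) { return model.value; }},
    // Decoding stage: runs first because of its higher priority
    {priority: 2, transform: function (text) { return JSON.parse(text); }}
];

var chainResult = fireTransformChain(readChain, "{\"value\": {\"title\": \"hello\"}}");
chainResult.then(function (model) {
    console.log(model);
});
```

A new stage can be added to `readChain` at any position simply by choosing its priority, without editing the
existing stages, which mirrors the extensibility argument made above.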
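
### Callback wrapping in DataSources

It's important that Kettle's inbuilt DataSources are used whenever possible when performing I/O from a Kettle
application, since it is crucial that any running implementation code is always properly contextualised by its
appropriate [request component](RequestHandlersAndApps.md#request-components). Kettle guarantees that the
[IoC context](http://docs.fluidproject.org/infusion/development/Contexts.html) `{request}` will always be resolvable
onto the appropriate request component from any code executing within that request. If arbitrary callbacks are supplied
to node I/O APIs, the code executing in them will not be properly contextualised. If for some reason a DataSource is
not appropriate, you can manually wrap any callbacks that you use by supplying them to the API `kettle.wrapCallback`.
[Get in touch](../README.md#getting-started-and-community) with the dev team if you find yourself in this situation.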
403to node I/O APIs, the code executing in them will not be properly contextualised. If for some reason a DataSource is
404not appropriate, you can manually wrap any callbacks that you use by supplying them to the API `kettle.wrapCallback`.
405[Get in touch](../README.md#getting-started-and-community) with the dev team if you find yourself in this situation.