---
title: DataSources
layout: default
category: Kettle
---

## DataSources

A DataSource is an Infusion component which meets a simple contract for read/write access to indexed data. The DataSource semantic is broadly the same as that
encoded in [CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete), although the current DataSource semantic does not provide explicitly for deletion.

The concrete DataSources in Kettle provide support for HTTP endpoints (with a particular variety specialised for accessing CouchDB databases with CRUD-like semantics) as well as the filesystem, with
an emphasis on JSON payloads.

The DataSource API consists of the following two methods. A read-only DataSource implements only `get`, and a writable DataSource implements both `get` and `set`:

    /* @param directModel {Object} A JSON structure holding the "coordinates" of the state to be read -
     * this model is morally equivalent to (the substitutable parts of) a file path or URL
     * @param options {Object} [Optional] A JSON structure holding configuration options good for just
     * this request. These will be specially interpreted by the particular concrete grade of DataSource
     * - there are no options valid across all implementations of this grade.
     * @return {Promise} A promise representing successful or unsuccessful resolution of the read state
     */
    dataSource.get(directModel, options);
    /* @param directModel {Object} As for get
     * @param model {Object} The state to be written to the coordinates
     * @param options {Object} [Optional] A JSON structure holding configuration options good for just
     * this request. These will be specially interpreted by the
     * particular concrete grade of DataSource - there are no options valid across all implementations
     * of this grade. For example, a URL DataSource will accept an option `writeMethod` which will
     * allow the user to determine which HTTP method (PUT or POST) will be used to implement the write
     * operation.
     * @return {Promise} A promise representing resolution of the written state,
     * which may also optionally resolve to any returned payload from the write process
     */
    dataSource.set(directModel, model, options);
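
To make the contract concrete, here is a minimal sketch of a plain JavaScript object that honours the same `get`/`set` shape using native Promises. It is not an Infusion component and is not part of Kettle; the names `makeMemoryDataSource` and the `key` field of `directModel` are hypothetical, chosen only for illustration.

```javascript
// A minimal in-memory object honouring the get/set contract described above.
// Hypothetical illustration only: not an Infusion component, not Kettle code.
function makeMemoryDataSource(options) {
    var store = {};
    var dataSource = {
        // Resolves with the stored model, or rejects if nothing is stored at the coordinates
        get: function (directModel) {
            var key = directModel.key;
            return Object.prototype.hasOwnProperty.call(store, key) ?
                Promise.resolve(store[key]) :
                Promise.reject({isError: true, message: "No value stored at key " + key});
        }
    };
    // Only a writable source receives a set method
    if (options && options.writable) {
        dataSource.set = function (directModel, model) {
            store[directModel.key] = model;
            return Promise.resolve(model);
        };
    }
    return dataSource;
}

var memorySource = makeMemoryDataSource({writable: true});
var roundTrip = memorySource.set({key: "greeting"}, {text: "hello"}).then(function () {
    return memorySource.get({key: "greeting"});
});

roundTrip.then(function (model) {
    console.log("Read back", model);
});
```

Note how the read-only variant simply lacks a `set` member, mirroring the way a DataSource that is not writable implements only `get`.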

### Simple example of using an HTTP dataSource

In this example we define and instantiate a simple HTTP-backed dataSource accepting one argument to configure a URL segment:

```javascript
var fluid = require("infusion"),
    kettle = require("../../kettle.js"),
    examples = fluid.registerNamespace("examples");

// Define a dataSource grade whose URL contains one interpolated term, %postId
fluid.defaults("examples.httpDataSource", {
    gradeNames: "kettle.dataSource.URL",
    url: "http://jsonplaceholder.typicode.com/posts/%postId",
    termMap: {
        // The leading % means: look this value up in the directModel at path "directPostId"
        postId: "%directPostId"
    }
});

var myDataSource = examples.httpDataSource();
// Issues a GET to http://jsonplaceholder.typicode.com/posts/42
var promise = myDataSource.get({directPostId: 42});

promise.then(function (response) {
    console.log("Got dataSource response of ", response);
}, function (error) {
    console.error("Got dataSource error response of ", error);
});
```

You can run this snippet from our code samples by running `node simpleDataSource.js` from [examples/simpleDataSource](../examples/simpleDataSource) in our samples area.
This contacts the JSON placeholder API service at [`jsonplaceholder.typicode.com`](http://jsonplaceholder.typicode.com/) to retrieve a small JSON document holding some placeholder text. If you get
a 404 or an error, please contact us and we'll update this sample to contact a new service.

An interesting element in this snippet is the `termMap` configured in the options of our dataSource. This sets up an indirection between the `directModel` supplied as the
argument to the `dataSource.get` call, and the URL issued in the HTTP request. The keys in the `termMap` are interpolation variables in the URL, where they are
prefixed by `%`. The values in the `termMap` represent either:

* Plain values to be interpolated as strings directly into the URL, or
* If the first character of the value in the `termMap` is `%`, a path (the remainder of the string) which will be dereferenced from the `directModel` argument to the current `set` or `get` request.

In addition, if the term value has the prefix `noencode:`, it will be interpolated without any [URI encoding](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent).
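
The interpolation rules above can be sketched in plain JavaScript as follows. This is an illustrative re-implementation under stated assumptions, not Kettle's own code: the helper names `sketchGetPath`, `sketchEscapeRegExp` and `sketchResolveUrl` are hypothetical, paths are assumed to be dot-separated, and plain (non-`%`) values are assumed to be URI encoded as well unless prefixed by `noencode:`.

```javascript
// Hypothetical sketch of termMap interpolation; not the Kettle implementation.

// Dereference a dot-separated path such as "a.b" from a model object
function sketchGetPath(model, path) {
    return path.split(".").reduce(function (current, segment) {
        return current === undefined || current === null ? undefined : current[segment];
    }, model);
}

// Escape characters that are special in regular expressions
function sketchEscapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function sketchResolveUrl(url, termMap, directModel) {
    var keys = Object.keys(termMap);
    if (keys.length === 0) {
        return url;
    }
    var values = {};
    keys.forEach(function (key) {
        var entry = String(termMap[key]);
        var encode = true;
        if (entry.indexOf("noencode:") === 0) {
            encode = false;
            entry = entry.substring("noencode:".length);
        }
        // A leading % marks a path into the directModel; anything else is a plain value
        var value = entry.charAt(0) === "%" ? sketchGetPath(directModel, entry.substring(1)) : entry;
        values[key] = encode ? encodeURIComponent(value) : String(value);
    });
    // Replace all terms in a single pass, trying longer keys first so that
    // %postId is not mistaken for %post followed by "Id"
    var alternatives = keys.sort(function (a, b) {
        return b.length - a.length;
    }).map(sketchEscapeRegExp).join("|");
    return url.replace(new RegExp("%(" + alternatives + ")", "g"), function (match, key) {
        return values[key];
    });
}

console.log(sketchResolveUrl(
    "http://jsonplaceholder.typicode.com/posts/%postId",
    {postId: "%directPostId"},
    {directPostId: 42}
)); // http://jsonplaceholder.typicode.com/posts/42
```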

We document these configuration options in the next section:

### Configuration options accepted by `kettle.dataSource.URL`

<table>
    <thead>
        <tr>
            <th colspan="3">Supported configurable options for a <code>kettle.dataSource.URL</code></th>
        </tr>
        <tr>
            <th>Option Path</th>
            <th>Type</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><code>writable</code></td>
            <td><code>Boolean</code> (default: <code>false</code>)</td>
            <td>If this option is set to <code>true</code>, a <code>set</code> method will be fabricated for this dataSource; otherwise, it will implement only a <code>get</code> method.</td>
        </tr>
        <tr>
            <td><code>writeMethod</code></td>
            <td><code>String</code> (default: <code>PUT</code>)</td>
            <td>The HTTP method to be used when the <code>set</code> method is operated on this writable DataSource (with <code>writable: true</code>). This defaults to <code>PUT</code> but
            <code>POST</code> is another option. Note that this option can also be supplied within the <code>options</code> argument to the <code>set</code> method itself.</td>
        </tr>
        <tr>
            <td><code>url</code></td>
            <td><code>String</code></td>
            <td>A URL template, with interpolable elements expressed by terms beginning with the <code>%</code> character, for the URL which will be operated by the <code>get</code> and
            <code>set</code> methods of this dataSource.</td>
        </tr>
        <tr>
            <td><code>termMap</code></td>
            <td><code>Object</code> (map of <code>String</code> to <code>String</code>)</td>
            <td>A map, of which the keys are some of the interpolation terms held in the <code>url</code> string, and the values will be used to perform the interpolation. If a value begins with <code>%</code>, the remainder of the string
            represents a <a href="http://docs.fluidproject.org/infusion/development/FrameworkConcepts.html#el-paths">path</a> into the <code>directModel</code> argument
            accepted by the <code>get</code> and <code>set</code> methods of the DataSource. By default any such values looked up will be
            <a href="https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent">URI Encoded</a> before being interpolated into the URL, unless their value in the termMap is prefixed by the string <code>noencode:</code>.</td>
        </tr>
        <tr>
            <td><code>notFoundIsEmpty</code></td>
            <td><code>Boolean</code> (default: <code>false</code>)</td>
            <td>If this option is set to <code>true</code>, a fetch of a nonexistent resource (that is, a nonexistent file, or an HTTP resource giving a 404) will result in a <code>resolve</code> with an empty
            payload rather than a <code>reject</code> response.</td>
        </tr>
        <tr>
            <td><code>components.encoding.type</code></td>
            <td><code>String</code> (grade name)</td>
            <td>A <code>kettle.dataSource.URL</code> has a subcomponent named <code>encoding</code> which the user can override in order to choose the encoding used to read and write the <code>model</code>
            object to and from the textual form in persistence. This defaults to <code>kettle.dataSource.encoding.JSON</code>. Other built-in encodings are <code>kettle.dataSource.encoding.formenc</code>, which operates
            HTML <a href="http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded">form encoding</a>, and <code>kettle.dataSource.encoding.none</code>, which applies no encoding.
            More details in <a href="#using-content-encodings-with-a-datasource">Using Content Encodings with a DataSource</a>.</td>
        </tr>
        <tr>
            <td><code>setResponseTransforms</code></td>
            <td><code>Array of String</code> (default: <code>["encoding"]</code>)</td>
            <td>Contains a list of the namespaces of the transform elements (see section <a href="#transforming-promise-chains">transforming promise chains</a>) that are to be applied if there is a response payload
            from the <code>set</code> method, which is often the case with an HTTP backend. With a JSON encoding this typically happens symmetrically (with a JSON request one will receive a JSON response).
            However, with other encodings such as <a href="http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded">form encoding</a> this is often not the case, and one might like to
            defeat the effect of trying to decode the HTTP response as a form. In this case, for example, one can override <code>setResponseTransforms</code> with the empty array <code>[]</code>.</td>
        </tr>
        <tr>
            <td><code>charEncoding</code></td>
            <td><code>String</code> (default: <code>utf8</code>)</td>
            <td>The character encoding of the incoming HTTP stream used to convert its data to characters. This will be sent directly to the <a href="https://nodejs.org/api/stream.html#stream_readable_setencoding_encoding"><code>setEncoding</code></a> method of
            the response stream.</td>
        </tr>
        <tr>
            <td><code>invokers.resolveUrl</code></td>
            <td><a href="http://docs.fluidproject.org/infusion/development/Invokers.html"><code>IoC Invoker</code></a> (default: <code>kettle.dataSource.URL.resolveUrl</code>)</td>
            <td>This invoker can be overridden to customise the process of building the url for a dataSource request. The default implementation uses an invocation of
            <a href="http://docs.fluidproject.org/infusion/development/CoreAPI.html#fluid-stringtemplate-template-terms-"><code>fluid.stringTemplate</code></a> to interpolate elements from <code>termMap</code> and the <code>directModel</code>
            argument into the template string held in <code>url</code>. By overriding this invoker, the user can implement a strategy of their choosing. The supplied arguments to the invoker consist of the values
            <code>(url, termMap, directModel)</code> taken from these options and the dataSource request arguments, but the override can replace these with any IoC-sourced values in the invoker definition.</td>
        </tr>
    </tbody>
</table>

In addition, a `kettle.dataSource.URL` component will accept any options accepted by node's native
[`http.request`](https://nodejs.org/api/http.html#http_http_request_options_callback) constructor. Supported in addition to the above are
`protocol`, `host`, `port`, `headers`, `hostname`, `family`, `localAddress`, `socketPath`, `auth` and `agent`. All of these options will be overridden by options of the same names in the `options` object
supplied as the last argument to the dataSource's `get` and `set` methods. This is a good way, for example, to send custom HTTP headers along with a URL dataSource request.
Note that any of these component-level options (e.g. `port`, `protocol`, etc.) that can be derived from parsing the `url` option will override the value from the url. Compare this setup with
the very similar one operated in the testing framework for [`kettle.test.request.http`](KettleTestingFramework.md#kettle.test.request.http).
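
The precedence described in this paragraph can be sketched as a simple layered merge: values parsed from the URL are the weakest, component-level options override them, and per-request options override both. The helper `sketchRequestOptions` below is hypothetical and uses a shallow merge for brevity (so a per-request `headers` object replaces the component-level one wholesale); how Kettle itself combines nested options is not covered here.

```javascript
// Hypothetical illustration of option precedence; not Kettle's implementation.
function sketchRequestOptions(componentOptions, requestOptions) {
    // Weakest layer: values that can be derived from parsing the url option
    var parsed = new URL(componentOptions.url);
    var fromUrl = {
        protocol: parsed.protocol,
        hostname: parsed.hostname,
        port: parsed.port,
        path: parsed.pathname + parsed.search
    };
    // Middle layer: every component-level option other than the url itself
    var fromComponent = Object.assign({}, componentOptions);
    delete fromComponent.url;
    // Later sources win: URL < component options < per-request options
    return Object.assign({}, fromUrl, fromComponent, requestOptions);
}

var merged = sketchRequestOptions({
    url: "http://localhost:8080/status",
    port: 9090,
    headers: {"X-Example": "component"}
}, {
    headers: {"X-Example": "request"}
});

console.log(merged.hostname, merged.port, merged.headers);
```

Here the component-level `port` of 9090 wins over the 8080 parsed from the URL, and the header supplied with the individual request wins over the component-level one.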

### Configuration options accepted by `kettle.dataSource.file`

An alternative dataSource implementation is `kettle.dataSource.file`. This is backed by the node filesystem API to allow files to be read and written in various encodings. The interpolation support based on `termMap`
is very similar to that for `kettle.dataSource.URL`, except that the location template option is named `path` and represents an absolute filesystem path, where the `url` property of `kettle.dataSource.URL` represents
a URL.

Exactly the same scheme based on the subcomponent named `encoding` can be used to control content encoding for a `kettle.dataSource.file` as for a `kettle.dataSource.URL`. Similarly, `kettle.dataSource.file` supports
a further option named `charEncoding` which selects among the character encodings supported by node.js.

<table>
    <thead>
        <tr>
            <th colspan="3">Supported configurable options for a <code>kettle.dataSource.file</code></th>
        </tr>
        <tr>
            <th>Option Path</th>
            <th>Type</th>
            <th>Description</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><code>writable</code></td>
            <td><code>Boolean</code> (default: <code>false</code>)</td>
            <td>If this option is set to <code>true</code>, a <code>set</code> method will be fabricated for this dataSource; otherwise, it will implement only a <code>get</code> method.</td>
        </tr>
        <tr>
            <td><code>path</code></td>
            <td><code>String</code></td>
            <td>An (absolute) file path template, with interpolable elements expressed by terms beginning with the <code>%</code> character, for the file which will be read and written by the <code>get</code> and
            <code>set</code> methods of this dataSource.</td>
        </tr>
        <tr>
            <td><code>termMap</code></td>
            <td><code>Object</code> (map of <code>String</code> to <code>String</code>)</td>
            <td>A map, of which the keys are some of the interpolation terms held in the <code>path</code> string, and the values, if prefixed by <code>%</code>, are paths into the <code>directModel</code> argument
            accepted by the <code>get</code> and <code>set</code> methods of the DataSource.</td>
        </tr>
        <tr>
            <td><code>charEncoding</code></td>
            <td><code>String</code> (default: <code>utf8</code>)</td>
            <td>The character encoding of the file used to convert its data to characters. This is one of the values supported by the <a href="https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options">node filesystem API</a>;
            values it advertises include <code>utf8</code>, <code>ascii</code> or <code>base64</code>. There is also evidence of support for <code>ucs2</code>.</td>
        </tr>
    </tbody>
</table>

A helpful mixin grade for `kettle.dataSource.file` is `kettle.dataSource.file.moduleTerms`, which allows interpolation by any module name registered with the Infusion module system
[`fluid.module.register`](http://docs.fluidproject.org/infusion/development/NodeAPI.html#fluid-module-register-name-basedir-modulerequire-), e.g. `%kettle/tests/data/couchDataSourceError.json`.

### Using content encodings with a DataSource

`kettle.dataSource.URL` has a subcomponent named `encoding` which the user can override in order to choose the content encoding used to convert the model seen at the `get/set` API to the textual (character) form in which it is
transmitted by the dataSource. The encoding subcomponent will also correctly set the [`Content-Type`](http://www.w3.org/Protocols/rfc1341/4_Content-Type.html) header of the outgoing HTTP request in the
case of a `set` request. The encoding defaults to a JSON encoding represented by a subcomponent of type `kettle.dataSource.encoding.JSON`. Here is an example of choosing a different encoding to submit
[form encoded](http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded) data to an HTTP endpoint:

```javascript
fluid.defaults("examples.formDataSource", {
    gradeNames: "kettle.dataSource.URL",
    url: "http://httpbin.org/post",
    writable: true,
    writeMethod: "POST",
    components: {
        encoding: {
            type: "kettle.dataSource.encoding.formenc"
        }
    },
    setResponseTransforms: [] // Do not parse the "set" response as formenc - it is in fact JSON
});

var myDataSource = examples.formDataSource();
var promise = myDataSource.set(null, {myField1: "myValue1", myField2: "myValue2"});

promise.then(function (response) {
    console.log("Got dataSource response of ", JSON.parse(response));
}, function (error) {
    console.error("Got dataSource error response of ", error);
});
```

In this example we set up a form-encoded, writable dataSource targeted at the popular HTTP testing site `httpbin.org`, sending a simple payload encoding two form elements. We use Kettle's built-in form encoding
grade by configuring an `encoding` subcomponent of type `kettle.dataSource.encoding.formenc`. You can try out this
sample live in its place in the [examples directory](examples/formDataSource/formDataSource.js). Note that since this particular endpoint sends a JSON response rather than a form-encoded response,
we need to defeat the dataSource's attempt to apply the inverse decoding in the response by writing `setResponseTransforms: []`.
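
For readers unfamiliar with form encoding, the following sketch shows what the two-field payload from this example looks like on the wire, using the standard `URLSearchParams` built-in. It is independent of Kettle, whose `formenc` grade may differ in details such as its handling of nested values.

```javascript
// Standard application/x-www-form-urlencoded rendering of the example payload.
// Uses only JavaScript built-ins; this is not Kettle's formenc implementation.
var payload = {myField1: "myValue1", myField2: "myValue2"};

var encoded = new URLSearchParams(payload).toString();
console.log(encoded); // myField1=myValue1&myField2=myValue2

// The inverse operation recovers a flat object of strings
var decoded = Object.fromEntries(new URLSearchParams(encoded));
console.log(decoded);
```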

### Built-in content encodings

Kettle features four built-in content encoding grades which can be configured as the subcomponent of a dataSource named `encoding` in order to determine what encoding it applies to models. They are described in this table:

|Grade name| Encoding type | Content-Type header |
|----------|---------------|----------------|
|`kettle.dataSource.encoding.JSON`|[JSON](http://json.org)|`application/json`|
|`kettle.dataSource.encoding.JSON5`|[JSON5](http://json5.org)|`application/json5`|
|`kettle.dataSource.encoding.formenc`|[form encoding](http://www.w3.org/TR/html401/interact/forms.html#didx-applicationx-www-form-urlencoded)|`application/x-www-form-urlencoded`|
|`kettle.dataSource.encoding.none`|No encoding|`text/plain`|

### Elements of an encoding component

You can operate a custom encoding by implementing a grade with the following elements, and using it as the `encoding` subcomponent in place of one of the built-in implementations in the above table:

|Member name| Type | Description |
|-----------|------|-------------|
|`parse`|`Function (String) -> Any`| Parses the textual form of the data from its encoded form into the in-memory form|
|`render`|`Function (Any) -> String`| Renders the in-memory form of the data into its textual form|
|`contentType`|`String`| Holds the value that should be supplied in the [`Content-Type`](http://www.w3.org/Protocols/rfc1341/4_Content-Type.html) of an outgoing HTTP request whose body is encoded in this form|
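
As a rough illustration of these three members, the sketch below shows a plain object with the same shape, implementing a hypothetical "one string per line" encoding. The name `sketchLinesEncoding` is invented for this example, and wiring such members into an actual Infusion grade for use as the `encoding` subcomponent is not shown here.

```javascript
// Hypothetical encoding with the parse/render/contentType shape from the table above.
// A plain object for illustration only; it is not registered as an Infusion grade.
var sketchLinesEncoding = {
    // Textual form -> in-memory form: an array holding one string per line
    parse: function (text) {
        return text === "" ? [] : text.split("\n");
    },
    // In-memory form -> textual form
    render: function (model) {
        return model.join("\n");
    },
    contentType: "text/plain"
};

var rendered = sketchLinesEncoding.render(["first", "second"]);
var reparsed = sketchLinesEncoding.parse(rendered);
console.log(JSON.stringify(rendered), reparsed);
```

The key property to preserve in any custom encoding is that `parse` inverts `render`, since the same encoding is applied in both directions unless `setResponseTransforms` is overridden.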

### The `kettle.dataSource.CouchDB` mixin grade

Kettle includes a further mixin grade, `kettle.dataSource.CouchDB`, which is suitable for reading and writing to the [`doc`](http://docs.couchdb.org/en/1.6.1/api/document/common.html) URL space of a [CouchDB](http://couchdb.apache.org/) database.
This can be applied to either a `kettle.dataSource.URL` or a `kettle.dataSource.file` (the latter clearly only useful for testing purposes).
This is a basic implementation which simply adapts the base documents in this API to a simple CRUD contract, taking care of:

* Packaging and unpackaging the special `_id` and `_rev` fields which appear at top level in a CouchDB document
* Escaping the user's document in a top-level path named `value`, to avoid conflicts between its keys and any of those of the CouchDB machinery. If you wish to change this behavior, you can do so by providing different [model transformation rules](http://docs.fluidproject.org/infusion/development/ModelTransformationAPI.html) in `options.rules.readPayload` and `options.rules.writePayload`.
* Applying a "read-before-write" of the `_rev` field to minimise (but not eliminate completely) the possibility of a Couch-level conflict

This grade is not properly tested and still carries some (though very small) risk of a conflict during update, so it should be used with caution. Please contact the development team if
you are interested in improved Couch-specific functionality.
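
The wrapping and "read-before-write" steps listed above can be sketched against a fake in-memory database as follows. Everything here is hypothetical (the helpers `wrapForCouch`, `unwrapFromCouch`, `sketchCouchSet` and `makeFakeDb`, and the simplified revision strings); it illustrates the idea only and is not the code of `kettle.dataSource.CouchDB`.

```javascript
// Hypothetical sketch of value-wrapping and read-before-write; not Kettle's implementation.

// Escape the user's model under "value" alongside CouchDB's own _id and _rev fields
function wrapForCouch(model, id, rev) {
    var doc = {_id: id, value: model};
    if (rev !== undefined) {
        doc._rev = rev;
    }
    return doc;
}

function unwrapFromCouch(doc) {
    return doc.value;
}

// A fake database which, like CouchDB, rejects writes carrying a stale _rev
function makeFakeDb() {
    var docs = {};
    return {
        read: function (id) {
            return Promise.resolve(docs[id]);
        },
        write: function (doc) {
            var current = docs[doc._id];
            if (current && current._rev !== doc._rev) {
                return Promise.reject({isError: true, statusCode: 409, message: "Document update conflict"});
            }
            var revNumber = current ? parseInt(current._rev, 10) + 1 : 1;
            docs[doc._id] = Object.assign({}, doc, {_rev: revNumber + "-fake"});
            return Promise.resolve({ok: true, id: doc._id, rev: docs[doc._id]._rev});
        }
    };
}

// Read the current _rev first, then write the wrapped document with it
function sketchCouchSet(db, id, model) {
    return db.read(id).then(function (existing) {
        return db.write(wrapForCouch(model, id, existing ? existing._rev : undefined));
    });
}

var demoDb = makeFakeDb();
sketchCouchSet(demoDb, "settings", {theme: "dark"}).then(function () {
    return sketchCouchSet(demoDb, "settings", {theme: "light"});
}).then(function (response) {
    console.log("Second write accepted with revision", response.rev);
});
```

The residual risk mentioned above is visible in `sketchCouchSet`: another writer may update the document between the `read` and the `write`, in which case the write is still rejected with a conflict.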

## Advanced implementation notes on DataSources

In this section are a few notes for advanced users of DataSources who are interested in extending their functionality, or else in issuing I/O in Kettle by other means.

### Transforming promise chains

The detailed implementation of the Kettle DataSource is structured around a particular device taken from the Infusion Promises library, the concept of a ["transforming promise chain"](http://docs.fluidproject.org/infusion/development/PromisesAPI.html#fluid-promise-firetransformevent-event-payload-options-). The core
DataSource grade implements two events, `onRead` and `onWrite`. These events are fired during the `get` and `set` operations of the DataSource, respectively.
These events are better described as "pseudoevents" since they are not fired in the conventional way. Rather than each event
listener receiving the same signature, each instead receives the payload returned by the previous listener. It may then transform this payload and produce its own return in the form
of a promise. Any promise rejection terminates the listener notification chain and propagates the failure to the caller. The DataSource implementation in fact fires these events by invoking
the [`fireTransformEvent`](http://docs.fluidproject.org/infusion/development/PromisesAPI.html#fluid-promise-firetransformevent-event-payload-options-) function from Infusion's Promises API.

The virtue of this implementation strategy is that extra stages of processing
for the DataSource can be inserted into and removed from any part of the processing chain by supplying suitable event [priorities](http://docs.fluidproject.org/infusion/development/Priorities.html) to
the event's [listeners](http://docs.fluidproject.org/infusion/development/InfusionEventSystem.html#registering-a-listener-to-an-event). Both the JSON encoding/decoding and CouchDB wrapping/unwrapping
facilities for the DataSources are implemented in terms of event listeners of this type, rather than in terms of conditional implementation code. This is a powerful and open
implementation strategy which we plan to extend in future.
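
The chaining behaviour described in this section can be approximated with native Promises as in the sketch below. It is a simplified stand-in, not Infusion's `fireTransformEvent`: the function `sketchTransformChain`, the listener record shape, and the convention that a higher numeric `priority` runs earlier are all assumptions made for the example.

```javascript
// Hypothetical sketch of a transforming promise chain; Infusion's own
// fireTransformEvent differs in detail, for example in how priorities are expressed.
function sketchTransformChain(listeners, payload) {
    // Assumption for this sketch: a higher numeric priority runs earlier
    var sorted = listeners.slice().sort(function (a, b) {
        return b.priority - a.priority;
    });
    // Each listener receives the value produced by the previous one; a rejection
    // or thrown error skips all remaining listeners and reaches the caller
    return sorted.reduce(function (promise, listener) {
        return promise.then(function (current) {
            return listener.transform(current);
        });
    }, Promise.resolve(payload));
}

var sketchOnRead = [{
    namespace: "unwrap",
    priority: 5,
    transform: function (doc) {
        return Promise.resolve(doc.value);
    }
}, {
    namespace: "encoding",
    priority: 10,
    transform: function (text) {
        return JSON.parse(text);
    }
}];

var chainResult = sketchTransformChain(sketchOnRead, "{\"value\": {\"text\": \"hello\"}}");
chainResult.then(function (model) {
    console.log("Chain produced", model);
});
```

In this picture, an option such as `setResponseTransforms: []` corresponds to running the chain with the `encoding` listener filtered out, so that the raw payload passes through untouched.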

### Callback wrapping in DataSources

Kettle's built-in DataSources should be used whenever possible when performing I/O from a Kettle application, since it is crucial that any running implementation
code is always properly contextualised by its appropriate [request component](RequestHandlersAndApps.md#request-components). Kettle guarantees that the [IoC context](http://docs.fluidproject.org/infusion/development/Contexts.html) `{request}`
will always be resolvable onto the appropriate request component from any code executing within that request. If arbitrary callbacks are supplied to node I/O APIs, the code executing in them
will not be properly contextualised. If for some reason a DataSource is not appropriate, you can manually wrap any callbacks that you use by supplying them to the API `kettle.wrapCallback`.
[Get in touch](../README.md#getting-started-and-community) with the dev team if you find yourself in this situation.