BigML Node.js Bindings
======================

[BigML](https://bigml.com) makes machine learning easy by taking care
of the details required to add data-driven decisions and predictive
power to your company. Unlike other machine learning services, BigML
creates [beautiful predictive models](https://bigml.com/gallery/models)
that can be easily understood and interacted with.

These BigML Node.js bindings allow you to interact with BigML.io, the API
for BigML. You can use them to easily create, retrieve, list, update, and
delete BigML resources (e.g., sources, datasets, models and
predictions).

This module is licensed under the [Apache License, Version
2.0](http://www.apache.org/licenses/LICENSE-2.0.html).

Support
-------

Please report problems and bugs to our [BigML.io issue
tracker](https://github.com/bigmlcom/io/issues).

Discussions about the different bindings take place in the general
[BigML mailing list](http://groups.google.com/group/bigml). You can also
join us in our [Campfire chatroom](https://bigmlinc.campfirenow.com/f20a0).

Requirements
------------

Node 0.10 is currently supported by these bindings.

The only mandatory third-party dependencies are the
[request](https://github.com/mikeal/request.git),
[winston](https://github.com/flatiron/winston.git) and
[form-data](https://github.com/felixge/node-form-data.git) libraries.

The testing environment requires the additional
[mocha](https://github.com/visionmedia/mocha) package that can be installed
with the following command:

    $ sudo npm install -g mocha

Installation
------------

To install the latest stable release with
[npm](https://npmjs.org/):

    $ npm install bigml

You can also install the development version of the bindings by cloning the
Git repository to your local computer and issuing:

    $ npm install .

Testing
-------

The test suite runs with `mocha` as its test framework. Because all the
tested API objects make one or more connections to remote resources in
bigml.com, you may have to increase the default timeout that `mocha` applies
to each test. For instance:

    $ mocha -t 20000

sets the timeout limit to 20 seconds.
This limit should typically be enough, but you can change it to fit
the latency of your connection. You can also add the `-R spec` flag to see
the description of each step as it runs.

Importing the modules
---------------------

To use the library, import it with `require`:

    $ node
    > bigml = require('bigml');

This will give you access to the following library structure:

- `bigml.constants`: common constants
- `bigml.BigML`: connection object
- `bigml.Resource`: common API methods
- `bigml.Source`: Source API methods
- `bigml.Dataset`: Dataset API methods
- `bigml.Model`: Model API methods
- `bigml.Ensemble`: Ensemble API methods
- `bigml.Prediction`: Prediction API methods
- `bigml.BatchPrediction`: BatchPrediction API methods
- `bigml.Evaluation`: Evaluation API methods
- `bigml.Cluster`: Cluster API methods
- `bigml.Centroid`: Centroid API methods
- `bigml.BatchCentroid`: BatchCentroid API methods
- `bigml.Anomaly`: Anomaly detector API methods
- `bigml.AnomalyScore`: Anomaly score API methods
- `bigml.BatchAnomalyScore`: BatchAnomalyScore API methods
- `bigml.Project`: Project API methods
- `bigml.Sample`: Sample API methods
- `bigml.Correlation`: Correlation API methods
- `bigml.StatisticalTests`: StatisticalTest API methods
- `bigml.LogisticRegression`: LogisticRegression API methods
- `bigml.Association`: Association API methods
- `bigml.AssociationSet`: AssociationSet API methods
- `bigml.TopicModel`: Topic Model API methods
- `bigml.TopicDistribution`: Topic Distribution API methods
- `bigml.BatchTopicDistribution`: Batch Topic Distribution API methods
- `bigml.Deepnet`: Deepnet API methods
- `bigml.Fusion`: Fusion API methods
- `bigml.PCA`: PCA API methods
- `bigml.Projection`: Projection API methods
- `bigml.BatchProjection`: Batch Projection API methods
- `bigml.LinearRegression`: Linear Regression API methods
- `bigml.ExternalConnector`: External Connector API methods
- `bigml.Script`: Script API methods
- `bigml.Execution`: Execution API methods
- `bigml.Library`: Library API methods
- `bigml.LocalModel`: Model for local predictions
- `bigml.LocalEnsemble`: Ensemble for local predictions
- `bigml.LocalCluster`: Cluster for local centroids
- `bigml.LocalAnomaly`: Anomaly detector for local anomaly scores
- `bigml.LocalLogisticRegression`: Logistic regression model for local predictions
- `bigml.LocalAssociation`: Association model for association rules
- `bigml.LocalTopicModel`: Topic Model for local predictions
- `bigml.LocalTimeSeries`: Time Series for local forecasts
- `bigml.LocalDeepnet`: Deepnets for local predictions
- `bigml.LocalFusion`: Fusions for local predictions
- `bigml.LocalPCA`: PCA for local projections
- `bigml.LocalLinearRegression`: Linear Regression for local predictions

Authentication
--------------

All requests to BigML.io must be authenticated using your username
and [API key](https://bigml.com/account/apikey) and are always
transmitted over HTTPS.

This module will look for your username and API key in the environment
variables `BIGML_USERNAME` and `BIGML_API_KEY` respectively. You can
add the following lines to your `.bashrc` or `.bash_profile` to set
those variables automatically when you log in:

    export BIGML_USERNAME=myusername
    export BIGML_API_KEY=ae579e7e53fb9abd646a6ff8aa99d4afe83ac291

With that environment set up, connecting to BigML is a breeze:

    connection = new bigml.BigML();

Otherwise, you can pass the credentials directly when instantiating the BigML
class as follows:

    connection = new bigml.BigML('myusername',
                                 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291');

Connection information
----------------------

If your BigML installation is not in `bigml.io` you can adapt your connection
to point to your customized domain. For instance, if your user is in the
Australian site, your domain should point to `au.bigml.io`. This can be
achieved by setting the following environment variable:

    export BIGML_DOMAIN=au.bigml.io

Your connection object will take its value and build the corresponding URLs.
You can also set this value dynamically (together with the protocol used, if
you need to change it):

    connection = new bigml.BigML('myusername',
                                 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291',
                                 {domain: 'au.bigml.io',
                                  protocol: 'https'});

If no domain or protocol information is provided, the connection
uses `bigml.io` and `https` by default.

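The lookup order for credentials and domain can be sketched in plain JavaScript. This is only an illustration, not the bindings' actual code: `resolveConnectionSettings` is a hypothetical helper, and the assumption that explicit arguments take precedence over environment variables is ours. The variable names and the `bigml.io` and `https` defaults are the ones mentioned in this document.

```js
// Hypothetical helper (not part of the bigml module): resolves the settings a
// connection would use. Assumed precedence: explicit values first, then the
// environment variables, then the documented defaults.
function resolveConnectionSettings(username, apiKey, options, env) {
  env = env || process.env;
  options = options || {};
  return {
    username: username || env.BIGML_USERNAME,
    apiKey: apiKey || env.BIGML_API_KEY,
    domain: options.domain || env.BIGML_DOMAIN || 'bigml.io',
    protocol: options.protocol || 'https'
  };
}

// Credentials come from the (simulated) environment, the domain from options.
var settings = resolveConnectionSettings(undefined, undefined,
  {domain: 'au.bigml.io'},
  {BIGML_USERNAME: 'myusername', BIGML_API_KEY: 'my-api-key'});
console.log(settings.protocol + '://' + settings.domain);
// prints "https://au.bigml.io"
```

Passing the environment as a parameter keeps the sketch easy to try without touching your real shell settings.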
Also, you can set a local directory to be used as storage. Using this
mechanism, any resource you download using this connection object is stored
as a JSON file in the directory. The name of the file is the resource ID
string with the slash replaced by an underscore:

    connection = new bigml.BigML('myusername',
                                 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291',
                                 {storage: './my_storage'});

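As a small illustration of the naming rule for stored resources, the sketch below derives a file name from a resource ID. `storageFileName` is a hypothetical helper written for this example; whether the bindings append an extension to the name is not stated here, so none is added.

```js
// Hypothetical helper (not part of the bigml module): maps a resource ID
// to the file name described in the text by replacing '/' with '_'.
function storageFileName(resourceId) {
  return resourceId.replace(/\//g, '_');
}

console.log(storageFileName('source/51b34c3c37203f4678000020'));
// prints "source_51b34c3c37203f4678000020"
```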
Note that the old `devMode` parameter has been deprecated and will no longer
be accepted.

Quick Start
-----------

Let's see the steps that will lead you from [this csv
file](https://static.bigml.com/csv/iris.csv) containing the [Iris
flower dataset](http://en.wikipedia.org/wiki/Iris_flower_data_set) to
predicting the species of a flower whose `petal length` is `1`.
By default, BigML considers the last field
(`species`) in the row as the
objective field (i.e., the field that you want to generate predictions
for). The csv structure is:

    sepal length,sepal width,petal length,petal width,species
    5.1,3.5,1.4,0.2,Iris-setosa
    4.9,3.0,1.4,0.2,Iris-setosa
    4.7,3.2,1.3,0.2,Iris-setosa
    ...

The steps required to generate a prediction are creating a set of
source, dataset and model objects:

```js
var bigml = require('bigml');
var source = new bigml.Source();
source.create('./data/iris.csv', function(error, sourceInfo) {
  if (!error && sourceInfo) {
    var dataset = new bigml.Dataset();
    dataset.create(sourceInfo, function(error, datasetInfo) {
      if (!error && datasetInfo) {
        var model = new bigml.Model();
        model.create(datasetInfo, function (error, modelInfo) {
          if (!error && modelInfo) {
            var prediction = new bigml.Prediction();
            prediction.create(modelInfo, {'petal length': 1});
          }
        });
      }
    });
  }
});
```

Note that in our example the `prediction.create` call has no associated
callback. All the CRUD methods of any resource accept a callback as
the last parameter, but if you don't provide one, the default action is to
print the resulting resource or the error. For the `create` method:

    > result:
    { code: 201,
      object:
       { category: 0,
         code: 201,
         content_type: 'text/csv',
         created: '2013-06-08T15:22:36.834797',
         credits: 0,
         description: '',
         fields_meta: { count: 0, limit: 1000, offset: 0, total: 0 },
         file_name: 'iris.csv',
         md5: 'd1175c032e1042bec7f974c91e4a65ae',
         name: 'iris.csv',
         number_of_datasets: 0,
         number_of_ensembles: 0,
         number_of_models: 0,
         number_of_predictions: 0,
         private: true,
         resource: 'source/51b34c3c37203f4678000020',
         size: 4608,
         source_parser: {},
         status:
          { code: 1,
            message: 'The request has been queued and will be processed soon' },
         subscription: false,
         tags: [],
         type: 0,
         updated: '2013-06-08T15:22:36.834844' },
      resource: 'source/51b34c3c37203f4678000020',
      location: 'https://localhost:1026/andromeda/source/51b34c3c37203f4678000020',
      error: null }

The generated objects can be retrieved, updated and deleted through the
corresponding REST methods. For instance, in the previous example you would
use:

```js
var bigml = require('bigml');
var source = new bigml.Source();
source.get('source/51b25fb237203f4410000010', function (error, resource) {
  if (!error && resource) {
    console.log(resource);
  }
});
```

to recover and show the source information.

When a resource `create` call is sent,
the request creates an evolving resource that goes through several stages
until it ends up finished or faulty.
BigML's API gives asynchronous access to the resource at any time,
so the `create` method response might contain an in-process resource that
lacks some of the properties it will have when finished.
That is helpful to build any-time models and to use non-blocking
calls. However, to get the complete information that a finished
resource contains, you will probably need to wait until it
reaches its final state. The `createAndWait` methods provide alternatives
to `create` that wait for the resource to be finished before returning the
result.

```js
var bigml = require('bigml');
var dataset = new bigml.Dataset();
dataset.createAndWait('source/51b25fb237203f4410000010',
  function (error, resource) {
    if (!error && resource) {
      console.log("The dataset has been completely created.");
      console.log(resource);
    }
  });
```

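To make the idea of waiting for a final state more concrete, here is a minimal polling sketch that runs without the bindings. It is not how `createAndWait` is implemented. `waitUntilFinal` and `fakeGet` are hypothetical names, and the numeric codes for finished (`5`) and faulty (`-1`) are assumptions made for the example; in real code use the values exposed by `bigml.constants`.

```js
// Assumed status codes, for illustration only.
var FINISHED = 5;
var FAULTY = -1;

// Hypothetical helper: calls getResource(resourceId, cb) until the resource
// reaches a final state or the retries run out. `schedule` controls the wait
// between attempts (one second through setTimeout by default).
function waitUntilFinal(getResource, resourceId, retriesLeft, callback, schedule) {
  schedule = schedule || function (fn) { setTimeout(fn, 1000); };
  getResource(resourceId, function (error, resource) {
    if (error) { return callback(error, null); }
    var code = resource.object.status.code;
    if (code === FINISHED || code === FAULTY) {
      return callback(null, resource);
    }
    if (retriesLeft <= 0) {
      return callback(new Error('Resource not finished after retries'), resource);
    }
    schedule(function () {
      waitUntilFinal(getResource, resourceId, retriesLeft - 1, callback, schedule);
    });
  });
}

// Stand-in for a remote `get` call: reports "queued" (1) twice, then finished.
var fakeCalls = 0;
function fakeGet(resourceId, cb) {
  fakeCalls += 1;
  cb(null, {resource: resourceId,
            object: {status: {code: fakeCalls < 3 ? 1 : FINISHED}}});
}

waitUntilFinal(fakeGet, 'dataset/53b0aa6837203f4341000034', 5,
  function (error, resource) {
    console.log(error ? error.message
                      : 'final status code: ' + resource.object.status.code);
  },
  function (fn) { fn(); });  // retry immediately in this demo
```

Injecting the `schedule` function keeps the demo instantaneous; omitting it falls back to a one-second pause between attempts.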
You can work with different credentials by setting them in the connection
object, as explained in the Authentication section.

```js
var bigml = require('bigml');
var connection = new bigml.BigML('myusername',
                                 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291');
// the new Source object will use the connection credentials for remote
// authentication.
var source = new bigml.Source(connection);
source.get('source/51b25fb237203f4410000010', function (error, resource) {
  if (!error && resource) {
    console.log(resource);
  }
});
```

You can also generate local predictions using the information of your
models:

```js
var bigml = require('bigml');
var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
localModel.predict({'petal length': 1},
                   function(error, prediction) {console.log(prediction)});
```

And similarly, for your ensembles:

```js
var bigml = require('bigml');
var localEnsemble = new bigml.LocalEnsemble('ensemble/51901f4337203f3a9a000215');
localEnsemble.predict({'petal length': 1}, 0,
                      function(error, prediction) {console.log(prediction)});
```

or any of the other modeling resources offered by BigML. The previous example
generates a prediction for a `Decision Forest` ensemble
by combining the predictions of each of the models
it encloses. The example uses the `plurality` combination method, whose code
is `0` (check the docs for more information about the available combination
methods). All three kinds of ensembles available in BigML
(`Decision Forest`,
`Random Decision Forest` and `Boosting Trees`) can be used to predict locally
through this `LocalEnsemble` object.

Types of resources
------------------

Currently these are the types of resources in bigml.com:

- **external connectors** Contain the information to connect to an external
database manager. They can be used to upload data to BigML to build `sources`.
These resources are handled through `bigml.ExternalConnector`.

- **sources** Contain the data uploaded from your local data file, inline data
or an external data source after processing (interpreting field types
or missing characters, for instance).
You can set their locale settings or their field names or types. These resources
are handled through `bigml.Source`.

- **datasets** Contain the information of the source in a structured summarized
way according to their field types (numeric, categorical, text or datetime).
These resources are handled through `bigml.Dataset`.

- **models** They are tree-like structures extracted from a dataset in order to
predict one field, the objective field, according to the values of other
fields, the input fields. These resources
are handled through `bigml.Model`.

- **predictions** Are the representation of the predicted value for the
objective field obtained by applying the model to an input data set. These
resources are handled through `bigml.Prediction`.

- **ensembles** Are a group of models extracted from a single dataset to be
used together in order to predict the objective field. BigML offers three
kinds of ensembles:
`Decision Forests`, `Random Decision Forests` and `Boosting Trees`.
All these resources are handled through `bigml.Ensemble`.

- **evaluations** Are a set of measures of performance defined on your model
or ensemble by checking predictions for the objective field of
a test dataset with its provided values. These resources
are handled through `bigml.Evaluation`.

- **batch predictions** Are groups of predictions for the objective field
obtained by applying the model or ensemble to a dataset resource. These
resources are handled through `bigml.BatchPrediction`.

- **clusters** They are unsupervised learning models that define groups of
instances in the training dataset according to the similarity of their
features. Each group has a central instance, named Centroid, and all
instances in the group form a new dataset. These resources are handled
through `bigml.Cluster`.

- **centroids** Are the central instances of the groups defined in a cluster.
They are the values predicted by the cluster when new input data is given.
These resources are handled through `bigml.Centroid`.

- **batch centroids** Are lists of centroids obtained by using the cluster to
classify a dataset of input data. They are analogous to the batch
predictions generated from models, but for clusters. These resources
are handled through `bigml.BatchCentroid`.

- **anomaly detectors** They are unsupervised learning models
that detect instances in the training dataset that are anomalous.
The information they return includes a `top_anomalies` block
that contains a list of the most anomalous
points. For each instance, a `score` from 0 to 1 is computed. The closer to 1,
the more anomalous. These resources are handled
through `bigml.Anomaly`.

- **anomaly scores** Are scores computed for any user-given input data using
an anomaly detector. These resources are handled through `bigml.AnomalyScore`.

- **batch anomaly scores** Are lists of anomaly scores obtained by using
the anomaly detector to
classify a dataset of input data. They are analogous to the batch
predictions generated from models, but for anomalies. These resources
are handled through `bigml.BatchAnomalyScore`.

- **projects** These resources are meant for organizational purposes only.
The rest of resources can be related to one `project` that groups them.
Only sources can be assigned to a `project`; the rest of resources inherit
the `project` reference from their originating source. Projects are handled
through `bigml.Project`.

- **samples** These resources provide quick access to your raw data. They are
objects cached in-memory by the server that can be queried for subsets
of data by limiting
their size, the fields or the rows returned. Samples are handled
through `bigml.Sample`.

- **correlations** These resources contain a series of computations that
reflect the
degree of dependence between the field set as objective for your predictions
and the rest of fields in your dataset. The dependence degree is obtained by
comparing the distributions in every objective and non-objective field pair,
as independent fields should have probabilistically
independent distributions. Depending on the types of the fields to compare,
the metrics used to compute the correlation degree will change. Check the
[developers documentation](https://bigml.com/api/correlations#retrieving-correlation)
for a detailed description. Correlations are handled
through `bigml.Correlation`.

- **statistical tests** These resources contain a series of statistical tests
that compare the
distribution of data in each numeric field to certain canonical distributions,
such as the
[normal distribution](https://en.wikipedia.org/wiki/Normal_distribution)
or [Benford's law](https://en.wikipedia.org/wiki/Benford%27s_law)
distribution. Statistical tests are handled
through `bigml.StatisticalTest`.

- **logistic regressions** These resources are models to solve classification
problems by predicting one field of the dataset, the objective field,
based on the values of the other fields, the input fields. The prediction
is made using a logistic function whose argument is a linear combination
of the predictors' values. Check the
[developers documentation](https://bigml.com/api/logisticregressions)
for a detailed description. These resources
are handled through `bigml.LogisticRegression`.

- **associations** These resources are models to discover the existing
associations between the field values in your dataset. Check the
[developers documentation](https://bigml.com/api/associations)
for a detailed description. These resources
are handled through `bigml.Association`.

- **association sets** These resources are the sets of items associated to
the ones in your input data and their score. Check the
[developers documentation](https://bigml.com/api/associationsets)
for a detailed description. These resources
are handled through `bigml.AssociationSet`.

- **topic models** These resources are models to discover topics underlying a
collection of documents. Check the
[developers documentation](https://bigml.com/api/topicmodels)
for a detailed description. These resources
are handled through `bigml.TopicModel`.

- **topic distributions** These resources contain the
probabilities
for a document to belong to each one of the topics in a `topic model`.
Check the
[developers documentation](https://bigml.com/api/topicdistributions)
for a detailed description. These resources
are handled through `bigml.TopicDistribution`.

- **batch topic distributions** These resources contain a list of
the probabilities
for a collection of documents to belong to each one of the topics in
a `topic model`. Check the
[developers documentation](https://bigml.com/api/batchtopicdistributions)
for a detailed description. These resources
are handled through `bigml.BatchTopicDistribution`.

- **time series** These resources are models to discover the patterns in
the properties of a sequence of ordered data. Check the
[developers documentation](https://bigml.com/api/timeseries)
for a detailed description. These resources
are handled through `bigml.TimeSeries`.

- **forecasts** These resources contain forecasts for the numeric
fields in a dataset as predicted by a `timeseries` model.
Check the
[developers documentation](https://bigml.com/api/forecasts)
for a detailed description. These resources
are handled through `bigml.Forecast`.

- **deepnets** These resources are classification and regression models based
on deep neural networks. Check the
[developers documentation](https://bigml.com/api/deepnets)
for a detailed description. These resources
are handled through `bigml.Deepnet`.

- **fusions** These resources are classification and regression models based
on mixed supervised models. Check the
[developers documentation](https://bigml.com/api/fusions)
for a detailed description. These resources
are handled through `bigml.Fusion`.

- **PCAs** These resources are models for dimensionality reduction. Check the
[developers documentation](https://bigml.com/api/pcas)
for a detailed description. These resources
are handled through `bigml.PCA`.

- **projections** These resources are the result of applying PCAs to get
a smaller feature set covering the variance in data. Check the
[developers documentation](https://bigml.com/api/projections)
for a detailed description. These resources
are handled through `bigml.Projection`.

- **batch projections** These resources are the result of applying PCAs to
a dataset to get a smaller feature set covering the variance in data.
Check the
[developers documentation](https://bigml.com/api/batchprojections)
for a detailed description. These resources
are handled through `bigml.BatchProjection`.

- **linear regressions** These resources are regression models based on the
assumption of a linear relation between the predictors and the outcome.
Check the
[developers documentation](https://bigml.com/api/linearregressions)
for a detailed description. These resources
are handled through `bigml.LinearRegression`.

- **scripts** These resources are WhizzML scripts that can be created
to handle workflows, which provide a means of automating the creation and
management of the rest of resources. Check the
[developers documentation](https://bigml.com/api/scripts)
for a detailed description. These resources
are handled through `bigml.Script`.

- **executions** These resources are executions of WhizzML scripts that
can be created to run the workflows defined in the `WhizzML scripts`.
Check the
[developers documentation](https://bigml.com/api/executions)
for a detailed description. These resources
are handled through `bigml.Execution`.

- **libraries** These resources are WhizzML libraries that
can be created to store definitions of constants and functions which can
be imported and used in the `WhizzML scripts`.
Check the
[developers documentation](https://bigml.com/api/libraries)
for a detailed description. These resources
are handled through `bigml.Library`.

Creating resources
------------------

As you've seen in the quick start section, each resource has its own creation
method available in the corresponding resource object. Sources are created
by uploading a local csv file:

```js
var bigml = require('bigml');
var source = new bigml.Source();
source.create('./data/iris.csv', {name: 'my source'}, true,
  function(error, sourceInfo) {
    if (!error && sourceInfo) {
      console.log(sourceInfo);
    }
  });
```

The first argument of the `create` method of `bigml.Source` is the csv
file. The next one is an object that sets some of the source properties
(in this case its name), followed by a boolean that determines whether
retries will be used if a resumable error occurs, and finally the chosen
callback. The properties object, the retry flag and the callback are optional
(for this method and for the `create` methods of the rest of resources).

It is important to instantiate a new resource object (`new bigml.Source()` in
this case) for each different resource, because each one stores internally the
parameters used in the last REST call. They are available
to be used if the call needs to be retried. For instance, if your internet
connection drops for a while, the `create` call will be retried a limited number
of times using this information unless you explicitly command it by using the


To create a dataset you need a source object or id, another dataset
object or id, a list of dataset ids or a cluster id as the first argument
of the `create` method.
In the first case, it
generates a dataset using the data of the source and in the second,
the method is used to generate new datasets by splitting the original one.
For instance,

```js
var bigml = require('bigml');
var dataset = new bigml.Dataset();
dataset.create('source/51b25fb237203f4410000010',
  {name: 'my dataset', size: 1024}, true,
  function(error, datasetInfo) {
    if (!error && datasetInfo) {
      console.log(datasetInfo);
    }
  });
```

will create a dataset named `my dataset` with the first 1024
bytes of the source. And

```js
dataset.create('dataset/51b3c4c737203f16230000d1',
  {name: 'split dataset', sample_rate: 0.8}, true,
  function(error, datasetInfo) {
    if (!error && datasetInfo) {
      console.log(datasetInfo);
    }
  });
```

will create a new dataset by sampling 80% of the data in the original dataset.

Clusters can also be used to generate datasets containing the instances
grouped around each centroid. You will need the cluster id and the centroid id
to reference the dataset to be created. For instance,

```js
cluster.create(datasetId, function (error, data) {
  var clusterId = data.resource;
  var centroidId = '000000';
  dataset.create(clusterId, {centroid: centroidId},
    function (error, data) {
      console.log(data);
    });
});
```

would generate a new dataset containing the subset of instances in the cluster
associated to the centroid id `000000`.

All datasets can be exported to a local CSV file using the `download`
method of `Dataset` objects:

```js
var bigml = require('bigml');
var dataset = new bigml.Dataset();
dataset.download('dataset/53b0aa6837203f4341000034',
                 'my_exported_file.csv',
                 function (error, data) {
                   console.log("data:" + data);
                 });
```

Similarly, to create models, ensembles, logistic regressions, deepnets or any
other modeling resource, you will need a dataset as the first argument.

Evaluations need a model as the first argument and a dataset as the second
one, and predictions need a model as the first argument too:

```js
var bigml = require('bigml');
var evaluation = new bigml.Evaluation();
evaluation.create('model/51922d0b37203f2a8c000010',
                  'dataset/51b3c4c737203f16230000d1',
                  {'name': 'my evaluation'}, true,
  function(error, evaluationInfo) {
    if (!error && evaluationInfo) {
      console.log(evaluationInfo);
    }
  });
```

Newly-created resources are returned in an object with the following
keys:

- **code**: If the request is successful you will get a
  `constants.HTTP_CREATED` (201) status code. Otherwise, it will be
  one of the standard HTTP error codes [detailed in the
  documentation](https://bigml.com/api/status_codes).
- **resource**: The identifier of the new resource.
- **location**: The location of the new resource.
- **object**: The resource itself, as computed by BigML.
- **error**: If an error occurs and the resource cannot be created, it
  will contain an additional code and a description of the error. In
  this case, **location** and **resource** will be `null`.

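A callback can use these keys to decide what to do next. The sketch below is a hypothetical helper written for this example, not part of the bindings. It only relies on the keys listed above and on the `201` value given for `constants.HTTP_CREATED`.

```js
// Value stated in the text for constants.HTTP_CREATED.
var HTTP_CREATED = 201;

// Hypothetical helper: turns the object received by a `create` callback
// into a one-line summary.
function summarizeCreateResponse(response) {
  if (response.code === HTTP_CREATED && !response.error) {
    return 'created ' + response.resource + ' (' + response.location + ')';
  }
  // On failure `resource` and `location` are null, so only report the code.
  return 'creation failed with HTTP code ' + response.code;
}

// Sample shaped like the Quick Start output, trimmed to the documented keys.
console.log(summarizeCreateResponse({
  code: 201,
  resource: 'source/51b34c3c37203f4678000020',
  location: 'https://localhost:1026/andromeda/source/51b34c3c37203f4678000020',
  object: {name: 'iris.csv'},
  error: null
}));
```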
Bigml.com will answer your `create` call immediately, even if the resource
is not finished yet (see the
[documentation on status
codes](https://bigml.com/api/status_codes) for the listing of
potential states and their semantics). To retrieve a finished resource,
you'll need to use the `get` method described in the next section.

Getting resources
-----------------

To retrieve an existing resource, you use the `get`
method of the corresponding class. Let's see an example of model retrieval:

```js
var bigml = require('bigml');
var model = new bigml.Model();
model.get('model/51b3c45a37203f16230000b5',
          true,
          'only_model=true;limit=-1',
          function (error, resource) {
            if (!error && resource) {
              console.log(resource);
            }
          });
```

The first parameter is the model id, and the rest of parameters are
optional. Passing a `true` value as the second argument (as in the example)
forces the `get` method to retry until it
retrieves a finished model. In the previous section we saw that, right after
creation, resources evolve
through a series of states until they end up in a `FINISHED` (or `FAULTY`)
state.
Setting this boolean to `true` will force the `get` method to wait for
the resource to be finished before
executing the corresponding callback (default is set to `false`).
The third parameter is a query string
that can be used to filter the fields returned. In the example we set the
fields to be retrieved to those used in the model (default is an empty string).
The callback parameter is set to
a default printing function if absent.


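If you assemble the query string from a set of options, a small helper keeps the `key=value;key=value` shape used in the example. `buildQuery` is a hypothetical function written for this illustration. It does no URL-encoding, which we assume is unnecessary for simple values like the ones shown.

```js
// Hypothetical helper (not part of the bigml module): joins key/value pairs
// with ';' to reproduce the query string format used in the `get` example.
function buildQuery(params) {
  return Object.keys(params).map(function (key) {
    return key + '=' + params[key];
  }).join(';');
}

console.log(buildQuery({only_model: true, limit: -1}));
// prints "only_model=true;limit=-1"
```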
Updating Resources
------------------

Each type of resource has a set of properties whose values can be updated.
Check the properties subsection of each resource in the [developers
documentation](https://bigml.com/developers) to see which are marked as
updatable. The `update` method of each resource class will let you modify
such properties. For instance,

```js
var bigml = require('bigml');
var ensemble = new bigml.Ensemble();
ensemble.update('ensemble/51901f4337203f3a9a000215',
  {name: 'my name', tags: 'code example'}, true,
  function (error, resource) {
    if (!error && resource) {
      console.log(resource);
    }
  });
```

will set the name `my name` to your ensemble and add the
tags `code` and `example`. The callback function is optional and a default
printing function will be used if absent.

If you have a look at the returned resource
you will see that its status code will
be `constants.HTTP_ACCEPTED` if the resource can be updated without
problems or one of the HTTP standard error codes otherwise.

787Deleting Resources
788------------------
789
790Resources can be deleted individually using the `delete` method of the
791corresponding class.
792
793```js
794 var bigml = require('bigml');
795 var source = new bigml.Source();
796 source.delete('source/51b25fb237203f4410000010', true,
797 function (error, result) {
798 if (!error && result) {
799 console.log(result);
800 }
801 })
802```
803
804The call will return an object with the following keys:
805
- **code** If the request is successful, the code will be a
  `constants.HTTP_NO_CONTENT` (204) status code. Otherwise, it will be
  one of the standard HTTP error codes. See the [documentation on
  status codes](https://bigml.com/api/status_codes) for more
  info.
- **error** If the request does not succeed, it will contain an
  object with an error code and a message. It will be `null`
  otherwise.
814
815The callback parameter is optional and a printing function is used as default.
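
As a hedged illustration of the result object described above, the following
sketch turns it into a short message. The `describeDeleteResult` helper and
the sample objects are hypothetical (they are not part of the bindings), and
the `message` key of the error object is an assumption based on the
description above.

```js
    // Hypothetical helper, not part of the bindings: summarizes the object
    // received by the `delete` callback.
    var HTTP_NO_CONTENT = 204; // value given above for constants.HTTP_NO_CONTENT

    function describeDeleteResult(result) {
      if (result && result.code === HTTP_NO_CONTENT && !result.error) {
        return 'deleted';
      }
      // Assumed error shape: {code: ..., message: ...}
      var detail = result && result.error ? result.error.message : 'unknown error';
      return 'not deleted: ' + detail;
    }

    console.log(describeDeleteResult({code: 204, error: null}));
    console.log(describeDeleteResult(
      {code: 404, error: {code: 404, message: 'resource not found'}}));
```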
816
Downloading Batch Predictions' (or Centroids') output
-----------------------------------------------------
819
820Using batch predictions you can obtain the predictions given by a model or
821ensemble on a dataset. Similarly, using batch centroids you will get the
822centroids predicted by a cluster for each instance of a dataset.
823The output is accessible through a BigML URL and can
824be stored in a local file by using the download method.
825
```js
    var bigml = require('bigml');
    var batchPrediction = new bigml.BatchPrediction(),
        tmpFileName = '/tmp/predictions.csv';
    // batch prediction creation call
    batchPrediction.create('model/52e4680f37203f20bb000da7',
                           'dataset/52e6bd1a37203f3eac000392',
                           {'name': 'my batch prediction'},
        function (error, batchPredictionInfo) {
          if (!error && batchPredictionInfo) {
            // retrieving batch prediction finished resource
            batchPrediction.get(batchPredictionInfo, true,
              function (error, batchPredictionInfo) {
                // guard against errors before reading the resource status
                if (!error && batchPredictionInfo &&
                    batchPredictionInfo.object.status.code === bigml.constants.FINISHED) {
                  // retrieving the batch prediction output file and storing it
                  // in the local file system
                  batchPrediction.download(batchPredictionInfo,
                                           tmpFileName,
                    function (error, cbFilename) {
                      console.log(cbFilename);
                    });
                }
              });
          }
        });
```
852
853If no `filename` is given, the callback receives the error and the
854request object used to download the url.
855
Listing, Filtering and Ordering Resources
-----------------------------------------
858
Each type of resource has its own `list` method that allows you to
retrieve groups of available resources of that kind. You can also add
filters to select specific subsets of them and even order the results.
By default, the returned list shows the 20 most recent resources. That limit
can be modified by setting the `limit` argument in the query string. For more
information about the syntax of query string filters and orderings, check
the fields labeled as *filterable* and *sortable* in the listings section of
the [BigML documentation](https://bigml.com/developers) for each resource.
As an example, we can see how to list the first 20 sources:

870```js
871 var bigml = require('bigml');
872 var source = new bigml.Source();
873 source.list(function (error, list) {
874 if (!error && list) {
875 console.log(list);
876 }
877 })
878```
879
880and if you want the first 5 sources created before April 1st,
8812013:
882
883```js
884 var bigml = require('bigml');
885 var source = new bigml.Source();
886 source.list('limit=5;created__lt=2013-04-1',
887 function (error, list) {
888 if (!error && list) {
889 console.log(list);
890 }
891 })
892```
893
894and if you want to select the first 5 as ordered by name:
895
896```js
897 var bigml = require('bigml');
898 var source = new bigml.Source();
899 source.list('limit=5;created__lt=2013-04-1;order_by=name',
900 function (error, list) {
901 if (!error && list) {
902 console.log(list);
903 }
904 })
905```
906
907In this method, both parameters are optional and, if no callback is given,
908a basic printing function is used instead.
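
Query strings like the ones above can also be assembled from an object. The
following is a minimal sketch: `buildQueryString` is a hypothetical helper
(not part of the bindings) that only joins `key=value` pairs with `;`, the
separator used in the examples above.

```js
    // Hypothetical helper: joins filter and ordering options with `;`.
    function buildQueryString(options) {
      return Object.keys(options).map(function (key) {
        return key + '=' + options[key];
      }).join(';');
    }

    var query = buildQueryString({limit: 5,
                                  created__lt: '2013-04-1',
                                  order_by: 'name'});
    console.log(query); // limit=5;created__lt=2013-04-1;order_by=name
```

The resulting string could then be passed as the first argument of
`source.list`.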
909
910The list object will have the following structure:
911
912- **code**: If the request is successful you will get a
913 `constants.HTTP_OK` (200) status code. Otherwise, it will be one of
914 the standard HTTP error codes. See [BigML documentation on status
915 codes](https://bigml.com/api/status_codes) for more info.
916- **meta**: An object including the following keys that can help you
917 paginate listings:
918
919 - **previous**: Path to get the previous page or `null` if there
920 is no previous page.
921 - **next**: Path to get the next page or `null` if there is no
922 next page.
923 - **offset**: How far off from the first entry in the resources is
924 the first one listed in the resources key.
925 - **limit**: Maximum number of resources that you will get listed in
926 the resources key.
927 - **total\_count**: The total number of resources in BigML.
928
- **resources**: A list of resources as returned by BigML.
- **error**: If an error occurs and the resources cannot be listed, it
  will contain an additional code and a description of the error. In
  this case, **meta** and **resources** will be `null`.

A simple example of what a `list` call would retrieve is this one, where
we asked for the 2 most recent sources:
936
937```js
938 var bigml = require('bigml');
939 var source = new bigml.Source();
940 source.list('limit=2',
941 function (error, list) {
942 if (!error && list) {
943 console.log(list);
944 }
945 })
946 > { code: 200,
947 meta:
948 { limit: 2,
949 next: '/andromeda/source?username=mmerce&api_key=c972018dc5f2789e65c74ba3170fda31d02e00c0&limit=2&offset=2',
950 offset: 0,
951 previous: null,
952 total_count: 653 },
953 resources:
954 [ { category: 0,
955 code: 200,
956 content_type: 'text/csv',
957 created: '2013-06-11T00:01:51.526000',
958 credits: 0,
959 description: '',
960 file_name: 'iris.csv',
961 md5: 'd1175c032e1042bec7f974c91e4a65ae',
962 name: 'iris.csv',
963 number_of_datasets: 0,
964 number_of_ensembles: 0,
965 number_of_models: 0,
966 number_of_predictions: 0,
967 private: true,
968 resource: 'source/51b668ef37203f50a4000005',
969 size: 4608,
970 source_parser: [Object],
971 status: [Object],
972 subscription: false,
973 tags: [],
974 type: 0,
975 updated: '2013-06-11T00:02:06.381000' },
976 { category: 0,
977 code: 200,
978 content_type: 'text/csv',
979 created: '2013-06-09T00:15:00.574000',
980 credits: 0,
981 description: '',
982 file_name: 'iris.csv',
983 md5: 'd1175c032e1042bec7f974c91e4a65ae',
984 name: 'my source',
985 number_of_datasets: 0,
986 number_of_ensembles: 0,
987 number_of_models: 0,
988 number_of_predictions: 0,
989 private: true,
990 resource: 'source/51b3c90437203f16230000dd',
991 size: 4608,
992 source_parser: [Object],
993 status: [Object],
994 subscription: false,
995 tags: [],
996 type: 0,
997 updated: '2013-06-09T00:15:00.780000' } ],
998 error: null }
999```
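
The `meta` object shown above can be used to paginate. The sketch below
derives the query string for the next page from `meta.next`. The
`nextPageQuery` helper is hypothetical, the credentials in the sample path
are placeholders, and whether `list` accepts an `offset` argument in its
query string is an assumption taken from the shape of the `next` path. It
also assumes a Node.js version that provides the global `URL` class.

```js
    // Hypothetical helper: builds the query string for the next page from
    // the `meta` object of a `list` result, or returns null on the last page.
    function nextPageQuery(meta) {
      if (!meta || !meta.next) {
        return null;
      }
      // `meta.next` is a path, so a placeholder base is needed to parse it.
      var params = new URL(meta.next, 'https://example.invalid').searchParams;
      return 'limit=' + params.get('limit') + ';offset=' + params.get('offset');
    }

    var meta = {limit: 2,
                next: '/andromeda/source?username=myuser&api_key=mykey&limit=2&offset=2',
                offset: 0,
                previous: null,
                total_count: 653};
    console.log(nextPageQuery(meta)); // limit=2;offset=2
```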
1000
Local Predictions: file system or cache storage
-----------------------------------------------
1003
Every model available in BigML is white-box and can be downloaded from the
API as JSON. This response can be stored either in the file system as a file
or in a cache-manager. In both cases, the bindings provide a class
for every model type that can interpret this JSON and predict (supervised
models), or assign centroids, compute anomaly scores, etc. Thus, these classes
allow you to use your BigML model locally, with no connection whatsoever to
BigML's servers.
1011
1012The following sections explain each of these classes and their methods. As a
1013general summary, in order to create a local model from a BigML `model`
1014resource, the class to use is called `LocalModel`. You can make a local model
1015from a remote model by providing its ID.
1016
1017```js
1018 var bigml = require('bigml');
1019 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
1020```
1021
1022In this case, the localModel object will create a default connection object
1023that will retrieve your credentials from the environment variables and will
1024create a `./storage` directory in your current directory to be used as
1025local storage. Once this is done, it will download the JSON of the model
1026into the `./storage` directory, naming the file after the model ID
1027(replacing `/` by `_`). This stored file will be used the next time you
1028instantiate the `LocalModel` class for the same model ID. Thus, the connection
1029to BigML's servers is only needed the first time you use a model to download
1030its JSON information and once it's stored, the local version in your file
1031system will be used.
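
To make the naming rule concrete, here is a small sketch of where the JSON
for a given model ID would end up. `storedModelPath` is a hypothetical helper
written for this illustration; it only applies the rule stated above
(replacing `/` by `_`) and is not the bindings' own code.

```js
    var path = require('path');

    // Hypothetical helper: path of the stored JSON for a resource ID,
    // following the naming rule described above.
    function storedModelPath(storageDir, resourceId) {
      return path.join(storageDir, resourceId.replace(/\//g, '_'));
    }

    console.log(storedModelPath('./storage', 'model/51922d0b37203f2a8c000010'));
```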
1032
1033If you prefer to use another directory to store your models, you can provide
1034a different connection object that specifies the folder to be used:
1035
1036
1037```js
1038 var bigml = require('bigml');
1039 var localModel = new bigml.LocalModel(
1040 'model/51922d0b37203f2a8c000010',
1041 new bigml.BigML(undefined,
1042 undefined,
1043 {storage: './my_storage_dir'}));
1044```
1045
1046If you stored the model JSON in your local system by any other means, you
1047can use the path to the file as first argument too:
1048
1049```js
1050 var bigml = require('bigml');
1051 var localModel = new bigml.LocalModel(
1052 '/my_dir/my_model_json');
1053```
1054
1055You can also use some cache-manager to store the JSON. In that case, the
1056`storage` attribute in the connection should contain the cache-manager
1057object that provides the `.get` and `.set` methods to manage the cache.
1058The cache-manager mechanism itself is not included in the bindings code
1059or its dependencies. Here's an example of use:
1060
1061```js
1062 var bigml = require('bigml');
1063 var cacheManager = require('cache-manager');
1064 var memoryCache = cacheManager.caching({store: 'memory',
1065 max: 100,
1066 ttl: 100});
1067 var localModel = new bigml.LocalModel(
1068 'model/51922d0b37203f2a8c000010',
1069 new bigml.BigML(undefined,
1070 undefined,
1071 {storage: memoryCache}));
1072```
1073
1074Other types of local model classes (LocalCluster, LocalAnomaly,
1075etc.) offer the same kind of mechanisms.
1076Please, check the following sections for details.
1077
1078
Local Models
------------
1081
1082A remote model encloses all the information required to make
1083predictions. Thus, once you retrieve a remote model, you can build its local
1084version and predict locally. This can be easily done using
1085the `LocalModel` class.
1086
1087```js
1088 var bigml = require('bigml');
1089 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
1090 localModel.predict({'petal length': 1},
1091 function(error, prediction) {console.log(prediction)});
1092```
1093
1094As you see, the first parameter to the `LocalModel` constructor is a model id
1095(or object or the path to a JSON file containing the full model information).
1096The constructor allows a second optional argument, a connection
1097object (as described in the [Authentication section](#authentication)).
1098
1099```js
1100 var bigml = require('bigml');
1101 var myUser = 'myuser';
1102 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1103 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010',
1104 new bigml.BigML(myUser, myKey));
1105 localModel.predict({'petal length': 1},
1106 function(error, prediction) {console.log(prediction)});
1107```
1108
1109The connection object can also include a storage directory. Setting that
1110will cause the `LocalModel` to check whether it can find a local model JSON
1111file in this directory before trying to download it from the server. This
1112means that your model information will only be downloaded the first time
1113you use it in a `LocalModel` instance. Instances that use the same connection
1114object will read the local file instead.
1115
1116```js
1117 var bigml = require('bigml');
1118 var myUser = 'myuser';
1119 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1120 var my_storage = './my_storage'
1121 var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
1122 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010',
1123 connection);
1124 localModel.predict({'petal length': 1},
1125 function(error, prediction) {console.log(prediction)});
1126```
1127
1128
The `predict` method also accepts input data keyed by the corresponding
field id instead of the field name.
1131
1132```js
1133 var bigml = require('bigml');
1134 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
1135 localModel.predict({'000002': 1},
1136 function(error, prediction) {console.log(prediction)});
1137```
1138
When the first argument is a finished model object, the constructor
immediately creates a `LocalModel` instance ready to predict. Then, the
`LocalModel.predict` method can be called right away in a synchronous way.
1143
1144
1145```js
1146 var bigml = require('bigml');
1147 var model = new bigml.Model();
1148 model.get('model/51b3c45a37203f16230000b5',
1149 true,
1150 'only_model=true;limit=-1',
1151 function (error, resource) {
1152 if (!error && resource) {
1153 var localModel = new bigml.LocalModel(resource);
1154 var prediction = localModel.predict({'petal length': 3});
1155 console.log(prediction);
1156 }
1157 })
1158```
Note that the `get` method's second and third arguments ensure, respectively,
that the retrieval waits for the model to be finished and that all
the fields used in the model are downloaded. Beware of using
models with filtered fields to instantiate a local model. If an important
field is missing (because it has been excluded or
filtered), an exception will be raised. In this example, the connection to
BigML is used only in the `get` method call to retrieve the remote model
information. The callback code, where the `localModel` and predictions are
built, is strictly local.
1168
On the other hand, when the first argument for the `LocalModel` constructor
is a model id, it internally calls the `bigml.Model.get` method to retrieve
the remote model information. As this is an asynchronous procedure, the
`LocalModel.predict` method must wait for the build process to complete
before making predictions. When using the previous callback syntax, this
condition is ensured internally and you need not worry about these details.
However, you may want to use the synchronous version of the `predict` method
in this case too. Then you must be aware that the `LocalModel`
`ready` event is triggered on completion and, at the same time, the
`LocalModel.ready` attribute is set to `true`. You can wait for
the `ready` event and make predictions synchronously from then on, as in:
1181
1182```js
1183 var bigml = require('bigml');
1184 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
1185 function doPredictions() {
1186 var prediction = localModel.predict({'petal length': 1});
1187 console.log(prediction);
1188 }
1189 if (localModel.ready) {
1190 doPredictions();
1191 } else {
1192 localModel.on('ready', function () {doPredictions()});
1193 }
1194```
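
The check-then-listen pattern above can be factored into a small helper. The
sketch below is only illustrative: `whenReady` is a hypothetical function,
and `fakeLocalModel` is a stand-in built on Node's `EventEmitter` (with a
made-up prediction) so that the code runs without a BigML connection. It
relies only on the `ready` attribute and the `ready` event described above.

```js
    var EventEmitter = require('events').EventEmitter;

    // Hypothetical helper: runs `fn` as soon as `local` is ready.
    function whenReady(local, fn) {
      if (local.ready) {
        fn(local);
      } else {
        local.on('ready', function () { fn(local); });
      }
    }

    // Stand-in object, not a real LocalModel.
    var fakeLocalModel = new EventEmitter();
    fakeLocalModel.ready = false;
    fakeLocalModel.predict = function (inputData) { return 'Iris-setosa'; };

    var predictions = [];
    whenReady(fakeLocalModel, function (model) {
      predictions.push(model.predict({'petal length': 1}));
    });

    // Simulate the end of the asynchronous build.
    fakeLocalModel.ready = true;
    fakeLocalModel.emit('ready');
    console.log(predictions);
```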
1195
1196You can also create a `LocalModel` from a JSON file containing the model
1197structure by setting the path to the file as first parameter:
1198
1199```js
1200 var bigml = require('bigml');
1201 var localModel = new bigml.LocalModel('my_dir/my_model.json');
1202 localModel.predict({'000002': 1},
1203 function(error, prediction) {console.log(prediction)});
1204```
1205
For classifications, the prediction of a local model will be one of the
available categories in the objective field and an associated `confidence`
or `probability` that is used to decide which is the predicted category.
If you want the model predictions to be based on one of them in particular,
you can use the `operatingKind` argument in the `predict` method. Here's an
example that bases predictions on `confidence`:
1212
1213```js
1214 localModel.predict({'000002': 1},
1215 undefined,
1216 undefined,
1217 undefined,
1218 undefined,
1219 "confidence",
1220 function(error, prediction) {console.log(prediction)});
1221```
1222
1223
Predictions' Missing Strategy
-----------------------------
1226
There are two different strategies for dealing with missing values
in the input data for the fields used in the model rules. The default
strategy, when a missing value is found for the field used to split a node,
is to return the prediction of the previous node.
1231
1232```js
1233 var bigml = require('bigml');
1234 var LAST_PREDICTION = 0;
1235 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
1236 localModel.predict({'petal length': 1}, LAST_PREDICTION,
1237 function(error, prediction) {console.log(prediction)});
1238```
1239
The other strategy, when a missing value is found, is to consider both splits
valid and follow the rules down to the final leaves of the tree. The
prediction is built from the predictions of all the leaves reached, averaging
them in regressions and selecting the majority class in classifications.
1244
1245```js
1246 var bigml = require('bigml');
1247 var PROPORTIONAL = 1;
1248 var localModel = new bigml.LocalModel('model/51922d0b37203f2a8c000010');
1249 localModel.predict({'petal length': 1}, PROPORTIONAL,
1250 function(error, prediction) {console.log(prediction)});
1251```
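
The difference between both strategies can be seen on a toy tree. The
following sketch is not the bindings' implementation: the tree, its field
names and values, and the helper functions are all hypothetical, and
weighting the leaves by their instance count when averaging is an assumption
of this example.

```js
    // Toy regression tree (made-up data). Each node stores its own
    // prediction and the number of training instances (`count`) it covers.
    var tree = {
      prediction: 10, count: 10, field: 'petal length', threshold: 2.5,
      left: {
        prediction: 4, count: 4, field: 'petal width', threshold: 1.0,
        left: {prediction: 2, count: 2},
        right: {prediction: 6, count: 2}
      },
      right: {
        prediction: 14, count: 6, field: 'petal width', threshold: 1.0,
        left: {prediction: 11, count: 3},
        right: {prediction: 17, count: 3}
      }
    };

    function isMissing(value) {
      return value === undefined || value === null;
    }

    // Last prediction: stop at the node whose split field is missing.
    function predictLastPrediction(node, inputData) {
      while (node.left && node.right) {
        var value = inputData[node.field];
        if (isMissing(value)) {
          return node.prediction;
        }
        node = value <= node.threshold ? node.left : node.right;
      }
      return node.prediction;
    }

    // Proportional: follow both branches when the split field is missing
    // and collect every leaf reached.
    function reachedLeaves(node, inputData) {
      if (!node.left || !node.right) {
        return [node];
      }
      var value = inputData[node.field];
      if (isMissing(value)) {
        return reachedLeaves(node.left, inputData)
          .concat(reachedLeaves(node.right, inputData));
      }
      return reachedLeaves(value <= node.threshold ? node.left : node.right,
                           inputData);
    }

    // Regression case: average of the reached leaves, weighted by count.
    function predictProportional(node, inputData) {
      var leaves = reachedLeaves(node, inputData);
      var total = leaves.reduce(function (sum, leaf) {
        return sum + leaf.count;
      }, 0);
      return leaves.reduce(function (sum, leaf) {
        return sum + leaf.prediction * leaf.count;
      }, 0) / total;
    }

    var inputData = {'petal width': 0.5}; // 'petal length' is missing
    console.log(predictLastPrediction(tree, inputData)); // 10
    console.log(predictProportional(tree, inputData));   // 7.4
```

With `petal length` missing, the last prediction strategy stops at the root,
while the proportional one still uses `petal width` to narrow down the leaves
reached in both branches.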
1252
Operating point's predictions
-----------------------------
1255
1256In classification problems,
1257Models, Ensembles and Logistic Regressions can be used at different
1258operating points, that is, associated to particular thresholds. Each
1259operating point is then defined by the kind of property you use as threshold,
1260its value and the class that is supposed to be predicted if the threshold
1261is reached.
1262
Let's assume you have a binary problem, with classes `True`
and `False` as possible outcomes. Imagine you want to be very sure when
predicting the `True` outcome, so you don't want to predict it unless its
associated probability is over `0.8`. You can achieve this with any
classification model by creating an operating point:
1268
1269```js
1270 var operatingPoint = {kind: 'probability',
1271 positiveClass: 'True',
1272 threshold: 0.8};
1273```
1274
To predict using this restriction, you can use the `predictOperating`
method:
1277
1278```js
1279
1280 var prediction = localModel.predictOperating(inputData,
1281 missingStrategy,
1282 operatingPoint,
1283 cb);
1284```
1285
1286where `inputData` should contain the values for which you want to predict
1287and
1288`missingStrategy` is the strategy to use when values are missing (please,
1289check the previous section for details about this parameter).
1290
1291Local models allow two kinds of operating points: `probability` and
1292`confidence`. For both of them, the threshold can be set to any number
1293in the `[0, 1]` range.
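
Conceptually, an operating point turns per-class scores into a decision. The
sketch below illustrates the idea only: `applyOperatingPoint` is a
hypothetical function, the scores are made up, and both the strict `>`
comparison and the fallback to the best-scoring remaining class are
assumptions rather than the bindings' documented behavior.

```js
    // Hypothetical sketch: `scores` maps each class to its probability
    // (or confidence).
    function applyOperatingPoint(scores, operatingPoint) {
      var positive = operatingPoint.positiveClass;
      if (scores[positive] > operatingPoint.threshold) {
        return positive;
      }
      // Otherwise, pick the best-scoring class other than the positive one.
      return Object.keys(scores)
        .filter(function (name) { return name !== positive; })
        .reduce(function (best, name) {
          return scores[name] > scores[best] ? name : best;
        });
    }

    var operatingPoint = {kind: 'probability',
                          positiveClass: 'True',
                          threshold: 0.8};
    console.log(applyOperatingPoint({'True': 0.7, 'False': 0.3},
                                    operatingPoint));  // False
    console.log(applyOperatingPoint({'True': 0.85, 'False': 0.15},
                                    operatingPoint));  // True
```

Note that in the first call `True` has the highest probability, yet it is not
predicted because it does not exceed the `0.8` threshold.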
1294
1295You can also use the `predict` method in its most general form:
1296
1297```js
1298 var prediction = localModel.predict(inputData,
1299 missingStrategy,
1300 median,
1301 addUnusedFields,
1302 operatingPoint,
1303 cb);
1304
1305```
1306
Local Ensembles
---------------
1309
1310As in the local model case, remote ensembles can also be used locally through
1311the `LocalEnsemble` class to make local predictions. The simplest way to
1312create a `LocalEnsemble` is:
1313
1314```js
1315 var bigml = require('bigml');
1316 var localEnsemble = new bigml.LocalEnsemble('ensemble/51901f4337203f3a9a000215');
1317 localEnsemble.predict({'petal length': 1},
1318 function(error, prediction) {console.log(prediction)});
1319```
1320
1321This call will download all the ensemble related info (and each of its
1322component models) and use it to predict by combining the predictions
1323of each individual
1324model. The algorithm used to combine these predictions depends
1325on the ensemble type (`Decision Forest`,
1326`Random Decision Forest` or `Boosting Trees`) and you can learn more about
1327them in the
1328[ensembles section of the API documents](https://bigml.com/api/ensembles).
1329The example shows
1330a `Decision Forest` using a majority system (classifications)
1331or an average system
1332(regressions) to combine the models' predictions.
1333
For classifications, the prediction will be one of the categories in the
objective field. When each model in the ensemble
is used to predict, each category has a confidence, a
probability or a vote associated to this prediction.
Then, across the collection of models in the
ensemble, each category gets an averaged confidence, an averaged probability
and a number of votes. Thus you can decide whether to operate the ensemble
using the `confidence`, the `probability` or the `votes`, so that the
predicted category is the one that scores highest in the chosen quantity. The
criterion can be set using the `operatingKind` option:
1346
1347
1348```js
1349 var bigml = require('bigml');
1350 var localEnsemble = new bigml.LocalEnsemble('ensemble/51901f4337203f3a9a000215');
1351 localEnsemble.predict({'petal length': 1}, undefined,
1352 {operatingKind: "probability"},
1353 function(error, prediction) {console.log(prediction)});
1354```
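
To see why the choice of `operatingKind` matters, consider the following
sketch. It is not the bindings' implementation: `combine` is a hypothetical
function and `modelOutputs` is made-up data that stands for what each model
in an ensemble might return.

```js
    // Made-up per-model outputs: the class each model votes for and the
    // probability it assigns to every class.
    var modelOutputs = [
      {prediction: 'Iris-setosa',
       probability: {'Iris-setosa': 0.9, 'Iris-versicolor': 0.1}},
      {prediction: 'Iris-versicolor',
       probability: {'Iris-setosa': 0.45, 'Iris-versicolor': 0.55}},
      {prediction: 'Iris-versicolor',
       probability: {'Iris-setosa': 0.45, 'Iris-versicolor': 0.55}}
    ];

    // Accumulates a score per class (votes or summed probabilities) and
    // returns the class with the highest score. Summing instead of averaging
    // gives the same winner, since every class is divided by the same
    // number of models.
    function combine(outputs, operatingKind) {
      var scores = {};
      outputs.forEach(function (output) {
        if (operatingKind === 'votes') {
          scores[output.prediction] = (scores[output.prediction] || 0) + 1;
        } else {
          Object.keys(output[operatingKind]).forEach(function (name) {
            scores[name] = (scores[name] || 0) + output[operatingKind][name];
          });
        }
      });
      return Object.keys(scores).reduce(function (best, name) {
        return scores[name] > scores[best] ? name : best;
      });
    }

    console.log(combine(modelOutputs, 'votes'));       // Iris-versicolor
    console.log(combine(modelOutputs, 'probability')); // Iris-setosa
```

Two of the three models vote for `Iris-versicolor`, but the first model is so
confident about `Iris-setosa` that this class wins once probabilities are
combined.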
1355
1356
The first argument in the `LocalEnsemble.predict` method
is the input data to predict from. The second argument is a legacy
parameter, to be deprecated, that was used to decide the combination method.
This parameter has been superseded by the `operatingKind` option that can be
sent in the third argument, which is an object where you can specify some
additional configuration values, such as the missing strategy used in each
model's prediction:
1365
1366```js
1367 var bigml = require('bigml');
1368 var localEnsemble = new bigml.LocalEnsemble('ensemble/51901f4337203f3a9a000215');
1369 localEnsemble.predict({'petal length': 1},
1370 undefined,
1371 {missingStrategy: 1,
1372 operatingKind: "confidence"},
1373 function(error, prediction) {console.log(prediction)});
1374```
In this case, the proportional missing strategy will be applied (the default
would be the last prediction missing strategy).
1377
1378As in `LocalModel`, the constructor of `LocalEnsemble` has as
1379first argument the ensemble id (or object) or a list of model ids (or objects)
1380as well as a second optional connection
1381argument. Building a `LocalEnsemble` is an asynchronous process because the
1382constructor will need to call the `get` methods of the remote ensemble object
1383and its component models. Thus, the `LocalEnsemble.predict` method will have
1384to wait for the object to be entirely built before making the prediction. This
1385is internally done when you use the callback syntax for the `predict` method.
1386In case you want to call the `LocalEnsemble.predict` method as a synchronous
1387function, you should first make sure that the constructor has finished building
1388the object by checking the `LocalEnsemble.ready` attribute and listening
1389to the `ready` event. For instance,
1390
1391```js
1392 var bigml = require('bigml');
1393 var localEnsemble = new bigml.LocalEnsemble('ensemble/51901f4337203f3a9a000215');
1394 function doPredictions() {
1395 var prediction = localEnsemble.predict({'petal length': 1});
1396 console.log(prediction);
1397 }
1398 if (localEnsemble.ready) {
1399 doPredictions();
1400 } else {
1401 localEnsemble.on('ready', function () {doPredictions()});
1402 }
1403```
1404would first download the remote ensemble and its component models, then
1405construct a local model for each one and predict using these local models.
1406
1407The same can be done for an array containing a list of models (only bagging
1408ensembles and random decision forests can be built this way), regardless of
1409whether they belong to a remote ensemble or not:
1410
1411```js
1412 var bigml = require('bigml');
1413 var localEnsemble = new bigml.LocalEnsemble([
1414 'model/51bb69b437203f02b50004ce', 'model/51bb69b437203f02b50004d0']);
1415 localEnsemble.predict({'petal length': 1},
1416 function(error, prediction) {console.log(prediction)});
1417```
1418
1419Operating point predictions are also available for local ensembles and an
1420example of it would be:
1421
1422```js
1423 var operatingPoint = {kind: 'probability',
1424 positiveClass: 'True',
1425 threshold: 0.8};
1426 var prediction = localEnsemble.predictOperating(inputData,
1427 missingStrategy,
1428 operatingPoint,
1429 cb);
1430```
1431
1432or using the `predict` method:
1433
1434```js
1435 var prediction = localEnsemble.predict(inputData,
1436 undefined,
1437 {operatingPoint: operatingPoint},
1438 cb);
1439```
1440
You can check the
[Operating point's predictions](#operating-points-predictions) section
to learn about operating points. For ensembles, three kinds of operating
points are available: `votes`, `probability` and `confidence`. The `votes`
option uses as threshold the number of models in the ensemble that vote for
the positive class. The other two are explained in the section mentioned
above.
1449
1450
The local ensemble constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting the latter
will cause the `LocalEnsemble` to check whether it can find a local ensemble
JSON file in this directory before trying to download it from the server. This
means that your ensemble information will only be downloaded the first time
you use it in a `LocalEnsemble` instance. Instances that use the same
connection object will read the local file instead.
1460
1461```js
1462 var bigml = require('bigml');
1463 var myUser = 'myuser';
1464 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1465 var my_storage = './my_storage';
1466 var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
1467 var localEnsemble = new bigml.LocalEnsemble(
1468 'ensemble/51922d0b37203f2a8c000010', connection);
1469```
1470
Local Logistic Regressions
--------------------------
1473
1474A remote logistic regression model encloses all the information
1475required to predict the categorical value of the objective field associated
1476to a given input data set.
1477Thus, you can build a local version of
1478a logistic regression model and predict the category locally using
1479the `LocalLogisticRegression` class.
1480
1481```js
1482 var bigml = require('bigml');
1483 var localLogisticRegression = new bigml.LocalLogisticRegression(
1484 'logisticregression/51922d0b37203f2a8c001010');
1485 localLogisticRegression.predict({'petal length': 1, 'petal width': 1,
1486 'sepal length': 1, 'sepal width': 1},
1487 function(error, prediction) {
1488 console.log(prediction)});
1489```
1490
Note that, to find the associated prediction, input data cannot contain missing
values in numeric fields. The `predict` method also accepts input data keyed
by the corresponding field id instead of the field name.
1494
1495As you see, the first parameter to the `LocalLogisticRegression` constructor
1496is a logistic regression id (or object). The constructor allows a second
1497optional argument, a connection
1498object (as described in the [Authentication section](#authentication)).
1499
1500```js
1501 var bigml = require('bigml');
1502 var myUser = 'myuser';
1503 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1504 var localLogisticRegression = new bigml.LocalLogisticRegression(
1505 'logisticregression/51922d0b37203f2a8c001010',
1506 new bigml.BigML(myUser, myKey));
1507 localLogisticRegression.predict({'000000': 1, '000001': 1,
1508 '000002': 1, '000003': 1},
1509 function(error, prediction) {
1510 console.log(prediction)});
1511```
1512
When the first argument is a finished logistic regression object,
the constructor immediately creates
a `LocalLogisticRegression` instance ready to predict. Then,
the `LocalLogisticRegression.predict`
method can be called right away in a synchronous way.
1518
1519
1520```js
1521 var bigml = require('bigml');
1522 var logisticRegression = new bigml.LogisticRegression();
1523 logisticRegression.get('logisticregression/51b3c45a37203f16230000b5', true,
1524 'only_model=true;limit=-1',
1525 function (error, resource) {
1526 if (!error && resource) {
1527 var localLogisticRegression = new bigml.LocalLogisticRegression(
1528 resource);
1529 var prediction = localLogisticRegression.predict(
1530 {'000000': 1, '000001': 1,
1531 '000002': 1, '000003': 1});
1532 console.log(prediction);
1533 }
1534 })
1535```
Note that the `get` method's second and third arguments ensure, respectively,
that the retrieval waits for the logistic regression to be finished and that
all the fields used in the logistic regression are downloaded.
Beware of using logistic regressions with filtered fields to instantiate a
local logistic regression object. If an important field
is missing (because it has been excluded or
filtered), an exception will be raised. In this example, the connection to
BigML is used only in the `get` method call to retrieve the remote logistic
regression information. The callback code, where the
`localLogisticRegression` and predictions are built, is strictly local.
1549
1550Operating point predictions are also available for local logistic regressions
1551and an example of it would be:
1552
```js
    var operatingPoint = {kind: 'probability',
                          positiveClass: 'True',
                          threshold: 0.8};
    localLogisticRegression.predictOperating(inputData,
                                             operatingPoint,
                                             cb);
```
1561
1562or using the `predict` method:
1563
```js
    localLogisticRegression.predict(inputData,
                                    undefined,
                                    operatingPoint,
                                    cb);
```
1570
You can check the
[Operating point's predictions](#operating-points-predictions) section
to learn about operating points. For logistic regressions, the only available
kind is `probability`, which sets the probability threshold that must be
reached for the prediction to be the positive class.
1577
1578
The local logistic regression constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting the latter
will cause the `LocalLogisticRegression` to check whether it can find a local
logistic regression JSON file in this directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalLogisticRegression` instance. Instances that use the
same connection object will read the local file instead.
1589
```js
    var bigml = require('bigml');
    var myUser = 'myuser';
    var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
    var my_storage = './my_storage';
    var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
    // the class is `LocalLogisticRegression`, as in the previous examples
    var localLogisticRegression = new bigml.LocalLogisticRegression(
      'logisticregression/51922d0b37203f2a8c000010', connection);
```
1599
Local Deepnets
--------------
1602
A remote deepnet model encloses all the information
required to predict the value of the objective field associated
to a given input data set.
Thus, you can build a local version of
a deepnet model and predict that value locally using
the `LocalDeepnet` class.
1609
1610```js
1611 var bigml = require('bigml');
1612 var localDeepnet = new bigml.LocalDeepnet(
1613 'deepnet/51922d0b37203f2a8c001010');
1614 localDeepnet.predict({'petal length': 1, 'petal width': 1,
1615 'sepal length': 1, 'sepal width': 1},
1616 function(error, prediction) {
1617 console.log(prediction)});
1618```
1619
1620As you see, the first parameter to the `LocalDeepnet` constructor
1621is a deepnet id (or object). The constructor allows a second
1622optional argument, a connection
1623object (as described in the [Authentication section](#authentication)).
1624
1625```js
1626 var bigml = require('bigml');
1627 var myUser = 'myuser';
1628 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1629 var localDeepnet = new bigml.LocalDeepnet(
1630 'deepnet/51922d0b37203f2a8c001010',
1631 new bigml.BigML(myUser, myKey));
1632 localDeepnet.predict({'000000': 1, '000001': 1,
1633 '000002': 1, '000003': 1},
1634 function(error, prediction) {
1635 console.log(prediction)});
1636```
1637
When the first argument is a finished deepnet object,
the constructor immediately creates
a `LocalDeepnet` instance that is ready to predict. The
`LocalDeepnet.predict`
method can then be called synchronously.
1643
1644
1645```js
1646 var bigml = require('bigml');
1647 var deepnet = new bigml.Deepnet();
1648 deepnet.get('deepnet/51b3c45a37203f16230000b5', true,
1649 'only_model=true;limit=-1',
1650 function (error, resource) {
1651 if (!error && resource) {
1652 var localDeepnet = new bigml.LocalDeepnet(
1653 resource);
1654 var prediction = localDeepnet.predict(
1655 {'000000': 1, '000001': 1,
1656 '000002': 1, '000003': 1});
1657 console.log(prediction);
1658 }
1659 })
1660```
Note that the `get` method's second and third arguments ensure, respectively,
that the retrieval waits for the model to be finished and that all
the fields used in the deepnet are downloaded.
Beware of instantiating a local deepnet from a deepnet
retrieved with filtered fields: if an important field
is missing (because it has been excluded or
filtered), an exception will be raised. In this example, the connection to
BigML is used only in the `get` method call that retrieves the remote deepnet
information. The callback code, where the `localDeepnet`
and the predictions
are built, is strictly local.
1673
1674Operating point predictions are also available for local deepnets
1675and an example of it would be:
1676
1677```js
1678 var operatingPoint = {kind: 'probability',
1679 positiveClass: 'True',
1680 threshold: 0.8};
1681 localDeepnet.predictOperating(inputData,
1682 operatingPoint,
1683 cb);
1684```
1685
1686or using the `predict` method:
1687
1688```js
1689 localDeepnet.predict(inputData,
1690 false,
1691 operatingPoint,
1692 cb);
1693```
1694
1695You can check the
1696[Operating point's predictions](#operating-point's-predictions) section
1697to learn about
1698operating points. For deepnets, the only available kind is
1699`probability`, that sets the threshold of probability to be reached for the
1700prediction to be the positive class.
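
As a rough illustration of what a `probability` operating point means, the
sketch below applies one to a list of per-category probabilities. The
`applyOperatingPoint` helper is hypothetical and not part of the bindings,
and the fallback used when the threshold is not reached (choosing the most
probable of the remaining classes) is an assumption made for this example.

```js
    // Hypothetical helper, not part of the bindings.
    function applyOperatingPoint(distribution, operatingPoint) {
      var positive = null, best = null, i, item;
      for (i = 0; i < distribution.length; i++) {
        item = distribution[i];
        if (item.category === operatingPoint.positiveClass) {
          positive = item;
        } else if (best === null || item.probability > best.probability) {
          best = item;
        }
      }
      if (positive !== null &&
          positive.probability >= operatingPoint.threshold) {
        return positive.category;
      }
      // Assumption: fall back to the most probable of the other classes.
      return best === null ? null : best.category;
    }

    var samplePoint = {kind: 'probability',
                       positiveClass: 'True',
                       threshold: 0.8};
    var belowThreshold = applyOperatingPoint(
      [{category: 'True', probability: 0.7},
       {category: 'False', probability: 0.3}], samplePoint);
    var aboveThreshold = applyOperatingPoint(
      [{category: 'True', probability: 0.85},
       {category: 'False', probability: 0.15}], samplePoint);
    console.log(belowThreshold); // 'False': 0.7 does not reach 0.8
    console.log(aboveThreshold); // 'True'
```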
1701
The local deepnet constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting a storage directory
causes `LocalDeepnet` to check whether it can find a local deepnet
JSON file in this directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalDeepnet` instance. Instances that use the same
connection object will read the local file instead.
1712
1713```js
1714 var bigml = require('bigml');
1715 var myUser = 'myuser';
1716 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1717 var my_storage = './my_storage'
1718 var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
1719 var localDeepnet = new bigml.LocalDeepnet(
1720 'deepnet/51922d0b37203f2a8c000010', connection);
1721```
1722
1723Local Fusions
1724-------------
1725
1726A remote fusion model encloses all the information
1727required to predict the value of the objective field associated
1728to a given input data set. The fusion model is composed of a list of
1729supervised models whose predictions are aggregated to produce a final
1730fusion prediction. You can build a local version of
1731a fusion and predict the category (or numeric objective) locally using
1732the `LocalFusion` class.
1733
1734```js
1735 var bigml = require('bigml');
1736 var localFusion = new bigml.LocalFusion(
1737 'fusion/51922d0b37203f2a8c001013');
1738 localFusion.predict({'petal length': 1, 'petal width': 1,
1739 'sepal length': 1, 'sepal width': 1},
1740 function(error, prediction) {
1741 console.log(prediction)});
1742```
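
To give an intuition of how the predictions of several models can be
aggregated, the sketch below averages the per-category probabilities of
two component models. The unweighted average and the
`averageDistributions` helper are assumptions chosen for illustration; they
are not taken from the bindings, which may combine predictions differently.

```js
    // Hypothetical helper: unweighted average of probability distributions.
    function averageDistributions(distributions) {
      var totals = {}, result = [];
      distributions.forEach(function (distribution) {
        distribution.forEach(function (item) {
          totals[item.category] = (totals[item.category] || 0) +
            item.probability;
        });
      });
      Object.keys(totals).forEach(function (category) {
        result.push({category: category,
                     probability: totals[category] / distributions.length});
      });
      return result;
    }

    var combined = averageDistributions([
      [{category: 'Iris-setosa', probability: 0.5},
       {category: 'Iris-virginica', probability: 0.5}],
      [{category: 'Iris-setosa', probability: 1},
       {category: 'Iris-virginica', probability: 0}]]);
    console.log(combined);
    // [ { category: 'Iris-setosa', probability: 0.75 },
    //   { category: 'Iris-virginica', probability: 0.25 } ]
```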
1743
1744As you see, the first parameter to the `LocalFusion` constructor
1745is a fusion id (or object). The constructor allows a second
1746optional argument, a connection
1747object (as described in the [Authentication section](#authentication)).
1748
1749```js
1750 var bigml = require('bigml');
1751 var myUser = 'myuser';
1752 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1753 var localFusion = new bigml.LocalFusion(
1754 'fusion/51922d0b37203f2a8c001010',
1755 new bigml.BigML(myUser, myKey));
1756 localFusion.predict({'000000': 1, '000001': 1,
1757 '000002': 1, '000003': 1},
1758 function(error, prediction) {
1759 console.log(prediction)});
1760```
1761
1762When the first argument is a finished fusion object,
1763the constructor creates immediately
1764a `LocalFusion` instance ready to predict. Then,
1765the `LocalFusion.predict`
1766method can be immediately called in a synchronous way.
1767
1768
1769```js
1770 var bigml = require('bigml');
1771 var fusion = new bigml.Fusion();
1772 fusion.get('fusion/51b3c45a37203f16230000b5', true,
1773 'only_model=true;limit=-1',
1774 function (error, resource) {
1775 if (!error && resource) {
1776 var localFusion = new bigml.LocalFusion(
1777 resource);
1778 var prediction = localFusion.predict(
1779 {'000000': 1, '000001': 1,
1780 '000002': 1, '000003': 1});
1781 console.log(prediction);
1782 }
1783 })
1784```
1785Note that the `get` method's second and third arguments ensure that the
1786retrieval waits for the model to be finished before retrieving it and that all
1787the fields used in the fusion will be downloaded.
1788Beware of using
1789a fusion with filtered fields information to instantiate a local fusion
1790object. If an important field
1791is missing (because it has been excluded or
1792filtered), an exception will arise. In this example, the connection to BigML
1793is used only in the `get` method call to retrieve the remote fusion
1794information. The callback code, where the `localFusion`
1795and predictions
1796are built, is strictly local.
1797
1798Operating point predictions are also available for local fusions
1799and an example of it would be:
1800
```js
    var operatingPoint = {kind: 'probability',
                          positiveClass: 'True',
                          threshold: 0.8};
    localFusion.predictOperating(inputData,
                                 operatingPoint,
                                 cb);
```
1809
1810or using the `predict` method:
1811
1812```js
1813 localFusion.predict(inputData,
1814 operatingPoint,
1815 cb);
1816```
1817
1818You can check the
1819[Operating point's predictions](#operating-point's-predictions) section
1820to learn about
1821operating points. For fusions, the only available kind is
1822`probability`, that sets the threshold of probability to be reached for the
1823prediction to be the positive class.
1824
1825Local PCA
1826---------
1827
1828A remote PCA model describes the set of orthogonal features best adapted to
1829a particular dataset to describe the variance in its data. These features
1830are built by linearly combining the original set of features.
1831You can build a local version of
1832a PCA and use it to compute the projections of every instance from the
1833original feature set to the PCA one using
1834the `LocalPCA` class.
1835
1836```js
1837 var bigml = require('bigml');
1838 var localPCA = new bigml.LocalPCA(
1839 'pca/51922d0b37203f2a8c001017');
1840 localPCA.projection({'petal length': 1, 'petal width': 1,
1841 'sepal length': 1, 'sepal width': 1},
1842 function(error, prediction) {
1843 console.log(prediction)});
1844```
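
The "linear combination" mentioned above can be sketched in a few lines.
This is a simplified illustration that assumes numeric inputs which are only
centered on the training means. The `project` helper, the `means` and the
`components` values are made up for the example and do not come from a real
PCA resource.

```js
    // Hypothetical helper: each projected coordinate is the dot product of
    // one component with the centered input values.
    function project(values, means, components) {
      return components.map(function (component) {
        return component.reduce(function (sum, weight, i) {
          return sum + weight * (values[i] - means[i]);
        }, 0);
      });
    }

    var means = [2, 4];                          // made-up training means
    var components = [[0.6, 0.8], [-0.8, 0.6]];  // made-up orthonormal rows
    var projected = project([3, 4], means, components);
    console.log(projected); // [ 0.6, -0.8 ]
```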
1845
1846As you see, the first parameter to the `LocalPCA` constructor
1847is a PCA id (or object). The constructor allows a second
1848optional argument, a connection
1849object (as described in the [Authentication section](#authentication))
1850that can also include the user's credentials and
1851a storage directory. Setting that
1852will cause the `LocalPCA` to check whether it can find a local
1853PCA JSON file in this directory before trying to download
1854it from the server. This
1855means that your model information will only be downloaded the first time
1856you use it in a `LocalPCA` instance. Instances that use the same
1857connection
1858object will read the local file instead.
1859
```js
    var bigml = require('bigml');
    var myUser = 'myuser';
    var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
    var my_storage = './my_storage';
    var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
    // The local class is LocalPCA; bigml.PCA handles the remote resource.
    var localPCA = new bigml.LocalPCA(
      'pca/51922d0b37203f2a8c000010', connection);
    localPCA.projection({'000000': 1, '000001': 1,
                         '000002': 1, '000003': 1},
                        function(error, projection) {
                          console.log(projection)});
```
1873
When the first argument is a finished PCA object,
the constructor immediately creates
a `LocalPCA` instance that is ready to work. The
`LocalPCA.projection`
method can then be called synchronously.


```js
    var bigml = require('bigml');
    var pca = new bigml.PCA();
    pca.get('pca/51b3c45a37203f16230000b5', true,
            'only_model=true;limit=-1',
            function (error, resource) {
        if (!error && resource) {
          var localPCA = new bigml.LocalPCA(
            resource);
          var projection = localPCA.projection(
            {'000000': 1, '000001': 1,
             '000002': 1, '000003': 1});
          console.log(projection);
        }
      })
```
1897Note that the `get` method's second and third arguments ensure that the
1898retrieval waits for the model to be finished before retrieving it and that all
1899the fields used in the PCA will be downloaded.
1900Beware of using
1901a PCA with filtered fields information to instantiate a local PCA
1902object. If an important field
1903is missing (because it has been excluded or
1904filtered), an exception will arise. In this example, the connection to BigML
1905is used only in the `get` method call to retrieve the remote PCA
1906information. The callback code, where the `localPCA`
1907and projections
1908are built, is strictly local.
1909
1910
1911Local Linear Regressions
1912------------------------
1913
1914A remote linear regression model encloses all the information
1915required to predict the numeric value of the objective field associated
1916to a given input data set.
1917Thus, you can build a local version of
1918a linear regression model and predict the numeric objective locally using
1919the `LocalLinearRegression` class.
1920
1921```js
1922 var bigml = require('bigml');
1923 var localLinearRegression = new bigml.LocalLinearRegression(
1924 'linearregression/51922d0b37203f2a8c001010');
1925 localLinearRegression.predict({'petal length': 1, 'petal width': 1,
1926 'sepal length': 1, 'species': 'Iris-setosa'},
1927 function(error, prediction) {
1928 console.log(prediction)});
1929```
1930
Note that, to compute a prediction, input data cannot contain missing
values in fields that had no missing values in the training data.
The `predict` method also accepts input data keyed by
the corresponding field id.
1935
1936As you see, the first parameter to the `LocalLinearRegression` constructor
1937is a linear regression id (or object). The constructor allows a second
1938optional argument, a connection
1939object (as described in the [Authentication section](#authentication)).
1940
1941```js
1942 var bigml = require('bigml');
1943 var myUser = 'myuser';
1944 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
1945 var localLinearRegression = new bigml.LocalLinearRegression(
1946 'linearregression/51922d0b37203f2a8c001010',
1947 new bigml.BigML(myUser, myKey));
1948 localLinearRegression.predict({'000000': 1, '000001': 1,
1949 '000002': 1, '000004': 'Iris-setosa'},
1950 function(error, prediction) {
1951 console.log(prediction)});
1952```
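
The two examples above key the input data by field name and by field id.
As an illustration of how one form could be translated into the other, the
sketch below maps names to ids. The `toFieldIds` helper and the `fields`
object (including which id belongs to which name) are hypothetical and not
taken from a real resource. Dropping unknown keys is also an assumption.

```js
    // Hypothetical fields structure: field id -> field description.
    var fields = {
      '000000': {name: 'sepal length'},
      '000001': {name: 'sepal width'},
      '000002': {name: 'petal length'},
      '000004': {name: 'species'}
    };

    // Hypothetical helper: return a copy of inputData keyed by field id.
    function toFieldIds(inputData, fields) {
      var idsByName = {}, result = {};
      Object.keys(fields).forEach(function (id) {
        idsByName[fields[id].name] = id;
      });
      Object.keys(inputData).forEach(function (key) {
        // Keys that are already field ids are kept as they are.
        var id = Object.prototype.hasOwnProperty.call(fields, key) ?
          key : idsByName[key];
        if (id !== undefined) {
          result[id] = inputData[key];
        }
      });
      return result;
    }

    var byId = toFieldIds({'petal length': 1, 'species': 'Iris-setosa',
                           '000000': 5}, fields);
    console.log(byId);
```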
1953
1954When the first argument is a finished linear regression object,
1955the constructor creates immediately
1956a `LocalLinearRegression` instance ready to predict. Then,
1957the `LocalLinearRegression.predict`
1958method can be immediately called in a synchronous way.
1959
1960
1961```js
1962 var bigml = require('bigml');
1963 var linearRegression = new bigml.LinearRegression();
1964 linearRegression.get('linearregression/51b3c45a37203f16230000b5', true,
1965 'only_model=true;limit=-1',
1966 function (error, resource) {
1967 if (!error && resource) {
1968 var localLinearRegression = new bigml.LocalLinearRegression(
1969 resource);
1970 var prediction = localLinearRegression.predict(
1971 {'000000': 1, '000001': 1,
1972 '000002': 1, '000004': 'Iris-setosa'});
1973 console.log(prediction);
1974 }
1975 })
1976```
1977Note that the `get` method's second and third arguments ensure that the
1978retrieval waits for the model to be finished before retrieving it and that all
1979the fields used in the linear regression will be downloaded respectively.
1980Beware of using filtered fields linear regressions to
1981instantiate a local linear regression
1982object. If an important field is missing (because it has been excluded or
1983filtered), an exception will arise. In this example, the connection to BigML
1984is used only in the `get` method call to retrieve the remote linear
1985regression information. The callback code, where the `localLinearRegression`
1986and predictions are built, is strictly local.
1987
The local linear regression constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting a storage directory
causes `LocalLinearRegression` to check whether it can find a local linear
regression JSON file in this directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalLinearRegression` instance. Instances that use the same
connection object will read the local file instead.

```js
    var bigml = require('bigml');
    var myUser = 'myuser';
    var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
    var my_storage = './my_storage';
    var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
    var localLinearRegression = new bigml.LocalLinearRegression(
      'linearregression/51922d0b37203f2a8c000010', connection);
```
2008
2009
2010Local predictions' distribution
2011-------------------------------
2012
2013For classification models, the local model, ensemble, logistic regression,
2014deepnet, linear regression, or fusion objects offer a method that
2015produces the predicted distribution of probabilities
2016for each of the categories in the objective field.
2017
2018```js
2019 var probabilities = localModel.predictProbability(inputData,
2020 missingStrategy,
2021 cb);
2022```
The result of this call is a list of objects that contain the
category name and the probability predicted for that category, for instance:
2025
2026```js
2027 [{"category": "Iris-setosa", "probability": 0.53},
2028 {"category": "Iris-virginica", "probability": 0.20},
2029 {"category": "Iris-versicolor", "probability": 0.27}]
2030```
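
Given a list shaped like the one above, picking the most probable category
is a short reduction. The `topCategory` helper below is a hypothetical
convenience function written for this example, not a method of the local
model objects.

```js
    var distribution = [
      {"category": "Iris-setosa", "probability": 0.53},
      {"category": "Iris-virginica", "probability": 0.20},
      {"category": "Iris-versicolor", "probability": 0.27}];

    // Hypothetical helper: return the entry with the highest probability.
    function topCategory(distribution) {
      return distribution.reduce(function (best, item) {
        return item.probability > best.probability ? item : best;
      });
    }

    var top = topCategory(distribution);
    console.log(top.category); // Iris-setosa
```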
2031
2032Local Clusters
2033--------------
2034
2035A remote cluster encloses all the information required to predict the centroid
2036associated to a given input data set. Thus, you can build a local version of
2037a cluster and predict centroids locally using
2038the `LocalCluster` class.
2039
2040```js
2041 var bigml = require('bigml');
2042 var localCluster = new bigml.LocalCluster('cluster/51922d0b37203f2a8c000010');
2043 localCluster.centroid({'petal length': 1, 'petal width': 1,
2044 'sepal length': 1, 'sepal width': 1,
2045 'species': 'Iris-setosa'},
2046 function(error, centroid) {console.log(centroid)});
2047```
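
Conceptually, predicting a centroid means finding the cluster center that
is closest to the input data. The sketch below shows that idea under strong
simplifying assumptions: only numeric fields, plain Euclidean distance and
no field scaling. The `nearestCentroid` helper, the shape of the
`centroids` list and the returned attribute names are hypothetical and do
not mirror the bindings' internals.

```js
    // Hypothetical helper: return the closest center and its distance.
    function nearestCentroid(inputData, centroids) {
      var best = null;
      centroids.forEach(function (centroid) {
        var squared = 0;
        Object.keys(centroid.center).forEach(function (field) {
          var diff = inputData[field] - centroid.center[field];
          squared += diff * diff;
        });
        if (best === null || squared < best.squared) {
          best = {name: centroid.name, squared: squared};
        }
      });
      if (best === null) {
        return null; // no centroids to compare with
      }
      return {centroidName: best.name, distance: Math.sqrt(best.squared)};
    }

    var centroids = [
      {name: 'Cluster 0', center: {'petal length': 1.5, 'petal width': 0.5}},
      {name: 'Cluster 1', center: {'petal length': 5, 'petal width': 2}}];
    var nearest = nearestCentroid({'petal length': 1, 'petal width': 1},
                                  centroids);
    console.log(nearest.centroidName); // Cluster 0
```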
2048
Note that, to find the associated centroid, input data cannot contain missing
values in numeric fields. The `centroid` method also accepts
input data keyed by the corresponding field id.
2052
2053As you see, the first parameter to the `LocalCluster` constructor is a cluster
2054id (or object). The constructor allows a second optional argument, a connection
2055object (as described in the [Authentication section](#authentication)).
2056
2057```js
2058 var bigml = require('bigml');
2059 var myUser = 'myuser';
2060 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
2061 var localCluster = new bigml.LocalCluster('cluster/51922d0b37203f2a8c000010',
2062 new bigml.BigML(myUser, myKey));
2063 localCluster.centroid({'000000': 1, '000001': 1,
2064 '000002': 1, '000003': 1,
2065 '000004': 'Iris-setosa'},
2066 function(error, centroid) {console.log(centroid)});
2067```
2068
2069When the first argument is a finished cluster object, the constructor creates
2070immediately
2071a `LocalCluster` instance ready to predict. Then, the `LocalCluster.centroid`
2072method can be immediately called in a synchronous way.
2073
2074
2075```js
2076 var bigml = require('bigml');
2077 var cluster = new bigml.Cluster();
2078 cluster.get('cluster/51b3c45a37203f16230000b5', true,
2079 'only_model=true;limit=-1',
2080 function (error, resource) {
2081 if (!error && resource) {
2082 var localCluster = new bigml.LocalCluster(resource);
2083 var centroid = localCluster.centroid({'000000': 1, '000001': 1,
2084 '000002': 1, '000003': 1,
2085 '000004': 'Iris-setosa'});
2086 console.log(centroid);
2087 }
2088 })
2089```
2090Note that the `get` method's second and third arguments ensure that the
2091retrieval waits for the model to be finished before retrieving it and that all
2092the fields used in the cluster will be downloaded respectively. Beware of using
2093filtered fields clusters to instantiate a local cluster. If an important field
2094is missing (because it has been excluded or
2095filtered), an exception will arise. In this example, the connection to BigML
2096is used only in the `get` method call to retrieve the remote cluster
2097information. The callback code, where the `localCluster` and predictions
2098are built, is strictly local.
2099
The local cluster constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting a storage directory
causes `LocalCluster` to check whether it can find a local cluster
JSON file in this directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalCluster` instance. Instances that use the same
connection object will read the local file instead.
2110
2111```js
2112 var bigml = require('bigml');
2113 var myUser = 'myuser';
2114 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
2115 var my_storage = './my_storage'
2116 var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
2117 var localCluster = new bigml.LocalCluster(
2118 'cluster/51922d0b37203f2a8c000010', connection);
2119```
2120
2121Local Anomaly Detectors
2122-----------------------
2123
2124A remote anomaly detector encloses all the information required to predict the
2125anomaly score
2126associated to a given input data set. Thus, you can build a local version of
2127an anomaly detector and predict anomaly scores locally using
2128the `LocalAnomaly` class.
2129
2130```js
2131 var bigml = require('bigml');
2132 var localAnomaly = new bigml.LocalAnomaly('anomaly/51922d0b37203f2a8c003010');
2133 localAnomaly.anomalyScore({'srv_serror_rate': 0.0, 'src_bytes': 181.0,
2134 'srv_count': 8.0, 'serror_rate': 0.0},
2135 function(error, anomalyScore) {
2136 console.log(anomalyScore)});
2137```
2138
2139The anomaly score method can also be used labelling
2140input data with the corresponding field id.
2141
2142As you see, the first parameter to the `LocalAnomaly` constructor is an anomaly
2143detector id (or object). The constructor allows a second optional argument,
2144a connection
2145object (as described in the [Authentication section](#authentication)).
2146
2147```js
2148 var bigml = require('bigml');
2149 var myUser = 'myuser';
2150 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
2151 var localAnomaly = new bigml.LocalAnomaly('anomaly/51922d0b37203f2a8c003010',
2152 new bigml.BigML(myUser, myKey));
2153 localAnomaly.anomalyScore({'000020': 9.0, '000004': 181.0, '000016': 8.0,
2154 '000024': 0.0, '000025': 0.0},
2155 function(error, anomalyScore) {console.log(anomalyScore)});
2156```
2157
2158When the first argument is a finished anomaly detector object, the constructor creates
2159immediately
2160a `LocalAnomaly` instance ready to predict. Then, the `LocalAnomaly.anomalyScore`
2161method can be immediately called in a synchronous way.
2162
2163
2164```js
2165 var bigml = require('bigml');
2166 var anomaly = new bigml.Anomaly();
2167 anomaly.get('anomaly/51b3c45a37203f16230030b5', true,
2168 'only_model=true;limit=-1',
2169 function (error, resource) {
2170 if (!error && resource) {
2171 var localAnomaly = new bigml.LocalAnomaly(resource);
2172 var anomalyScore = localAnomaly.anomalyScore(
2173 {'000020': 9.0, '000004': 181.0, '000016': 8.0,
2174 '000024': 0.0, '000025': 0.0});
2175 console.log(anomalyScore);
2176 }
2177 })
2178```
2179Note that the `get` method's second and third arguments ensure that the
2180retrieval waits for the model to be finished before retrieving it and that all
2181the fields used in the anomaly detector will be downloaded respectively.
2182Beware of using
2183filtered fields anomaly detectors to instantiate a local anomaly detector.
2184If an important field
2185is missing (because it has been excluded or
2186filtered), an exception will arise. In this example, the connection to BigML
2187is used only in the `get` method call to retrieve the remote anomaly detector
2188information. The callback code, where the `localAnomaly` and scores
2189are built, is strictly local.
2190
2191The top anomalies in the `LocalAnomaly` can be extracted from the original
2192dataset by filtering the rows that have the highest score. The filter
2193expression that can single out these rows can be extracted using the
2194`anomaliesFilter` method:
2195
2196```js
2197 localAnomaly.anomaliesFilter(true,
2198 function(error, data) {console.log(data);});
2199```
2200
2201When the first argument is set to `true`, the filter corresponds to the top
2202anomalies. On the contrary, if set to `false` the filter will exclude
2203the top anomalies from the dataset.
2204
The local anomaly constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting a storage directory
causes `LocalAnomaly` to check whether it can find a local anomaly
JSON file in this directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalAnomaly` instance. Instances that use the same
connection object will read the local file instead.
2215
2216```js
2217 var bigml = require('bigml');
2218 var myUser = 'myuser';
2219 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
2220 var my_storage = './my_storage'
2221 var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
2222 var localAnomaly = new bigml.LocalAnomaly(
2223 'anomaly/51922d0b37203f2a8c000010', connection);
2224```
2225
2226Local Associations
2227------------------
2228
2229A remote association object encloses the information about which field values
2230in your dataset are related. The values are structured as items and
2231their relations are described as rules. The `LocalAssociation`
2232class allows you to build a local version of this remote object, and get
2233both the list of `Items` that can be related and the list of `AssociationRules`
2234that describe such relations. Creating a `LocalAssociation` object is as
2235simple as
2236
2237
2238```js
2239 var bigml = require('bigml');
2240 var localAssociation = new bigml.LocalAssociation('association/51922d0b37203f2a8c003010');
2241```
2242
2243and you can list the `AssociationRules` that it contains using the `getRules`
2244method
2245
2246```js
2247 var index = 0;
2248 associationRules = localAssociation.getRules();
2249 for (index = 0; index < associationRules.length; index++) {
2250 console.log(associationRules[index].describe());
2251 }
2252```
2253
2254As you can see in the previous code, the `AssociationRule` object has a
2255`describe` method that will generate a human-readable description of the
2256rule.
2257
The `getRules` method also accepts several arguments that allow you to
filter the rules by their leverage, strength, support, p-value, the list of
items they contain or a user-given filter function. They can all be added
as attributes of a filters object passed as first argument. The second
argument can be a callback function. The previous example was a synchronous
call to the method, which will only work once the `localAssociation` object
is ready. To use the method asynchronously you can use:
2265
2266```js
2267 var associationRules;
2268 localAssociation.getRules(
2269 {minLeverage: 0.3}, // filter by minimum Leverage
2270 function(error, data) {associationRules = data;}) // callback code
2271```
2272
2273See the method docstring for filter options details.
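
The calling convention described above (a filters object first, an optional
callback second) can be sketched as follows. The `filterRules` function and
the plain rule objects are hypothetical stand-ins written for this example.
Treating `minLeverage` as an inclusive lower bound is an assumption.

```js
    // Hypothetical stand-ins for rule objects; only `leverage` is used.
    var sampleRules = [{id: 'r0', leverage: 0.1},
                       {id: 'r1', leverage: 0.35},
                       {id: 'r2', leverage: 0.5}];

    // Hypothetical helper: returns the filtered rules when no callback is
    // given, otherwise hands them to the callback on the next tick.
    function filterRules(rules, filters, cb) {
      filters = filters || {};
      var selected = rules.filter(function (rule) {
        return filters.minLeverage === undefined ||
          rule.leverage >= filters.minLeverage;
      });
      if (typeof cb === 'function') {
        process.nextTick(function () { cb(null, selected); });
        return undefined;
      }
      return selected;
    }

    // Synchronous use
    var selectedRules = filterRules(sampleRules, {minLeverage: 0.3});
    console.log(selectedRules.length); // 2

    // Asynchronous use
    filterRules(sampleRules, {minLeverage: 0.3},
                function (error, data) { console.log(data.length); });
```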
2274
2275Similarly, you can obtain the list of `Items` involved in the association
2276rules
2277
2278```js
2279 var index = 0;
2280 items = localAssociation.getItems();
2281 for (index = 0; index < items.length; index++) {
2282 console.log(items[index].describe());
2283 }
2284```
2285
2286and they can be filtered by their field ID, name, an object containing
2287input data or a user-given function. See the method docstring for details.
2288
2289You can also save the rules to a CSV file using the `rulesCSV` method
2290
2291```js
2292 minimumLeverage = 0.3;
2293 localAssociation.rulesCSV(
2294 './my_csv.csv', // fileName
2295 {minLeverage: minimumLeverage}); // filters for the rules
2296```
2297
As you can see, the first argument is the path to the CSV file where the
rules will be stored and the second one is the filters object that selects
the rules to store. In this example,
we are only storing the rules whose leverage is over the 0.3 threshold.
2301
2302Both the `getItems` and the `rulesCSV` methods can also be called
2303asynchronously as we saw for the `getRules` method.
2304
2305
2306The `LocalAssociation` object can be used to retrieve the `association sets`
2307related to a certain input data.
2308
2309
2310```js
2311 var bigml = require('bigml');
2312 var localAssociation = new bigml.LocalAssociation(
2313 'association/55922d0b37203f2a8c003010');
2314 localAssociation.associationSet({product: 'cat food'},
2315 function(error, associationSet) {
2316 console.log(associationSet)});
2317```
2318
2319When the
2320`LocalAssociation` instance is ready, the `LocalAssociation.associationSet`
2321method can be immediately called to predict in a synchronous way.
2322
2323
2324```js
2325 var bigml = require('bigml');
2326 var association = new bigml.Association();
2327 association.get('association/51b3c45a37203f16230530b5', true,
2328 'only_model=true;limit=-1',
2329 function (error, resource) {
2330 if (!error && resource) {
2331 var localAssociation = new bigml.LocalAssociation(resource);
2332 var associationSet = localAssociation.associationSet(
2333 {'000020': 'cat food'});
2334 console.log(associationSet);
2335 }
2336 })
2337```
In this example, the connection to BigML
is used only in the `get` method call to retrieve the remote association
information. The callback code, where the `localAssociation` and the
association sets are built, is strictly local.
2342
2343
2344Local Topic Models
2345------------------
2346
2347A remote `topic model` object contains a list of topics extracted from the
2348terms in the collection of documents used in its training. The
2349`LocalTopicModel`
2350class allows you to build a local version of this remote object, and get
2351the list of `Topics` and the terms distribution for each of them. Using this
2352information, its `distribution` method computes the probabilities for a new
2353document to be classified under each of the `Topics`.
2354Creating a `LocalTopicModel` object is as
2355simple as
2356
2357
2358```js
2359 var bigml = require('bigml');
2360 var localTopicModel = new bigml.LocalTopicModel('topicmodel/51922d0b37203f4b8c003010');
2361```
2362
2363and obtaining the `TopicDistribution` for a new document:
2364
2365```js
2366 var newDocument = {"Message": "Where are you?when wil you reach here?"}
2367 localTopicModel.distribution(newDocument,
2368 function(error, topicDistribution) {
2369 console.log(topicDistribution)});
2370```
2371
2372Note that only text fields are considered to decide the `topic distribution`
2373of a document, and their contents will be concatenated.
2374
2375As you see, the first parameter to the `LocalTopicModel` constructor is a
2376`topic model`
2377id (or object). The constructor allows a second optional argument, a connection
2378object (as described in the [Authentication section](#authentication)).
2379
2380```js
2381 var bigml = require('bigml');
2382 var myUser = 'myuser';
2383 var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
2384 var localTopicModel = new bigml.LocalTopicModel('topicmodel/51522d0b37203f2a8c000010',
2385 new bigml.BigML(myUser, myKey));
2386 localTopicModel.distribution({'000001': "Where are you?when wil you reach here?"},
2387 function(error, topicDistribution) {console.log(topicDistribution)});
2388```
2389
2390When the first argument is a finished `topic model` object,
2391the constructor creates
2392immediately
2393a `LocalTopicModel` instance ready to be used. Then, the
2394`LocalTopicModel.distribution` method can be immediately called
2395in a synchronous way.
2396
2397
```js
    var bigml = require('bigml');
    var topicModel = new bigml.TopicModel();
    topicModel.get('topicmodel/51b3c45a47203f16230000b5', true,
                   'only_model=true;limit=-1',
                   function (error, resource) {
        if (!error && resource) {
          var localTopicModel = new bigml.LocalTopicModel(resource);
          var topicDistribution = localTopicModel.distribution(
            {'000001': "Where are you?when wil you reach here?"});
          console.log(topicDistribution);
        }
      })
```
2412Note that the `get` method's second and third arguments ensure that the
2413retrieval waits for the `topic model` to be finished before retrieving
2414it and that all
2415the fields used in the `topic model` will be downloaded.
2416Beware of using
2417filtered fields topic models to instantiate a local topic model.
2418If an important field
2419is missing (because it has been excluded or
2420filtered), an exception will arise. In this example, the connection to BigML
2421is used only in the `get` method call to retrieve the remote topic model
2422information. The callback code, where the `localTopicModel` and distributions
2423are built, is strictly local.

The local topic model constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting a storage directory
causes the `LocalTopicModel` to check whether it can find a local
topic model JSON file in that directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalTopicModel` instance. Instances that use the same
connection object will read the local file instead.

```js
    var bigml = require('bigml');
    var myUser = 'myuser';
    var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
    var my_storage = './my_storage';
    var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
    var localTopicModel = new bigml.LocalTopicModel(
      'topicmodel/51922d0b37203f2a8c000010', connection);
```

Local Time Series
-----------------

A remote time series resource encloses all the information
required to produce forecasts for all the numeric fields that have been
previously declared as its objective fields.
Thus, you can build a local version of
a time series and generate forecasts using the `LocalTimeSeries` class.

```js
    var bigml = require('bigml');
    var localTimeSeries = new bigml.LocalTimeSeries(
      'timeseries/51922d0b37203f2a8c001010');
    localTimeSeries.forecast({'Final': {'horizon': 10}},
                             function(error, forecast) {
                               console.log(forecast)});
```

The `forecast` method also accepts input data labelled with the
corresponding field ID instead of the field name. The result of the call
contains forecasts for the fields, horizons and ETS models given as input
data. In the example, the response will be an object like:

```js
    { '000005': [ { model: 'A,N,N',
                    pointForecast: [ 68.53181, 68.53181, 68.53181, 68.53181,
                                     68.53181, 68.53181, 68.53181, 68.53181,
                                     68.53181, 68.53181 ]}]}
```
that contains the ID of the forecasted field and the forecast for the best
performing model in the `time series` (according to `aic`). You can
read more about the available error metrics and the input data parameters in
the [time series API documentation](https://bigml.com/api/timeseries) and
the [forecast API documentation](https://bigml.com/api/forecasts).
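
Because the response maps each field ID to a list of per-model forecasts,
reading the numbers for one field takes a couple of lookups. The helper below
is a minimal sketch that assumes only the response structure shown above;
`pointForecastFor` is a hypothetical name and is not part of the bindings.

```js
    // Hypothetical helper (not part of the bigml module): returns the
    // point forecast of the first model listed for a field, or null
    // when the field is not present in the response.
    function pointForecastFor(forecast, fieldId) {
      var models = forecast && forecast[fieldId];
      if (!models || models.length === 0) {
        return null;
      }
      return models[0].pointForecast;
    }

    // Sample response with the same structure as the one above,
    // shortened to three points.
    var sample = {'000005': [{model: 'A,N,N',
                              pointForecast: [68.53181, 68.53181, 68.53181]}]};
    console.log(pointForecastFor(sample, '000005'));
```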

As you can see, the first parameter to the `LocalTimeSeries` constructor
is a time series ID (or object). The constructor accepts a second,
optional argument: a connection
object (as described in the [Authentication section](#authentication)).

```js
    var bigml = require('bigml');
    var myUser = 'myuser';
    var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
    var localTimeSeries = new bigml.LocalTimeSeries(
      'timeseries/51922d0b37203f2a8c001010',
      new bigml.BigML(myUser, myKey));
    localTimeSeries.forecast({'000005': {"horizon": 10,
                                         "ets_models": {"criterion": "aic",
                                                        "names": ["A,A,N"],
                                                        "indices": [3,7],
                                                        "limit": 2}}},
                             function(error, forecast) {
                               console.log(forecast)});
```
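
When the same horizon applies to several objective fields, the input object
can be assembled programmatically. The sketch below assumes only the input
structure used in the examples above (a field name or ID mapped to an object
with a `horizon` and an optional `ets_models` entry). `buildForecastInput` is
a hypothetical helper, not part of the bindings, and `'000006'` is a made-up
field ID.

```js
    // Hypothetical helper: builds forecast input data for several fields
    // that share the same horizon and, optionally, the same ETS filter.
    function buildForecastInput(fields, horizon, etsModels) {
      var input = {};
      fields.forEach(function (field) {
        input[field] = {horizon: horizon};
        if (etsModels) {
          input[field].ets_models = etsModels;
        }
      });
      return input;
    }

    var inputData = buildForecastInput(['000005', '000006'], 10,
                                       {criterion: 'aic', limit: 2});
    console.log(JSON.stringify(inputData));
```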

When the first argument is a finished time series object,
the constructor immediately creates
a `LocalTimeSeries` instance that is ready to predict, so
the `LocalTimeSeries.forecast`
method can be called synchronously.

```js
    var bigml = require('bigml');
    var timeSeries = new bigml.TimeSeries();
    timeSeries.get('timeseries/51b3c45a37203f16230000b5', true,
                   'only_model=true;limit=-1',
                   function (error, resource) {
        if (!error && resource) {
          var localTimeSeries = new bigml.LocalTimeSeries(
            resource);
          var prediction = localTimeSeries.forecast(
            {'Final': {'horizon': 10}});
          console.log(prediction);
        }
      });
```
Note that the `get` method's second and third arguments ensure, respectively,
that the retrieval waits for the time series to be finished and that all
the fields used in the time series models are downloaded.
Beware of instantiating a local time series object from a time series
with filtered fields.

The local time series constructor also accepts a connection object.
The connection object can include the user's credentials and
a storage directory. Setting a storage directory
causes the `LocalTimeSeries` to check whether it can find a local time
series JSON file in that directory before trying to download
it from the server. This
means that your model information will only be downloaded the first time
you use it in a `LocalTimeSeries` instance. Instances that use the same
connection object will read the local file instead.

```js
    var bigml = require('bigml');
    var myUser = 'myuser';
    var myKey = 'ae579e7e53fb9abd646a6ff8aa99d4afe83ac291';
    var my_storage = './my_storage';
    var connection = new bigml.BigML(myUser, myKey, {storage: my_storage});
    var localTimeSeries = new bigml.LocalTimeSeries(
      'timeseries/51922d0b37203f2a8c000010', connection);
```

External Connectors
-------------------

BigML offers an `externalconnector` resource that can be used to connect to
external data sources. The API requirements to create an
`ExternalConnector` are described in the
[API documentation](https://bigml.com/api/externalconnectors). The
information required to create an external connector includes parameters such
as the type of database manager, the host, user, password and table that need
to be accessed. This information must be provided as the first argument to
the `bigml.ExternalConnector` create method, or it can be set in environment
variables (leaving the first argument undefined):

```batch
export BIGML_EXTERNAL_CONN_HOST=db.host.com
export BIGML_EXTERNAL_CONN_PORT=4324
export BIGML_EXTERNAL_CONN_USER=my_user
export BIGML_EXTERNAL_CONN_PWD=my_password
export BIGML_EXTERNAL_CONN_DB=my_database
export BIGML_EXTERNAL_CONN_SOURCE="postgresql"
```

```js
    var bigml = require('bigml');
    var externalConnectorConn = new bigml.ExternalConnector(),
      externalConnectorId, connectionInfo,
      args = {"name": "my connector", "source": "postgresql"}, retry = false;
    externalConnectorConn.create(connectionInfo, args, retry,
      function(error, data) {
        if (error) {
          console.log(error);
        } else {
          externalConnectorId = data.resource;
        }
      });
    // As connectionInfo is undefined, the environment variables are retrieved
    // and the information used to create the external connector is
    // {"host": "db.host.com",
    //  "port": 4324,
    //  "user": "my_user",
    //  "password": "my_password",
    //  "database": "my_database"}
    // Alternatively, you can provide this information as first argument
    // for the ExternalConnector create method.
```
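
The correspondence between the environment variables and the connection
information can be pictured with the following sketch. It is not the code the
bindings run: `connectionInfoFromEnv` is a hypothetical helper, and the
mapping from variable names to keys is only inferred from the example above.

```js
    // Hypothetical helper, for illustration only: builds a connection
    // information object from BIGML_EXTERNAL_CONN_* variables. The key
    // names follow the comment in the example above.
    function connectionInfoFromEnv(env) {
      var mapping = {
        host: 'BIGML_EXTERNAL_CONN_HOST',
        port: 'BIGML_EXTERNAL_CONN_PORT',
        user: 'BIGML_EXTERNAL_CONN_USER',
        password: 'BIGML_EXTERNAL_CONN_PWD',
        database: 'BIGML_EXTERNAL_CONN_DB'
      };
      var info = {};
      Object.keys(mapping).forEach(function (key) {
        var value = env[mapping[key]];
        if (value !== undefined) {
          // The port is converted to a number; other values stay strings.
          info[key] = (key === 'port') ? parseInt(value, 10) : value;
        }
      });
      return info;
    }

    // Pass process.env in a real script; a literal is used here so the
    // snippet does not depend on the shell configuration.
    var connectionInfo = connectionInfoFromEnv({
      BIGML_EXTERNAL_CONN_HOST: 'db.host.com',
      BIGML_EXTERNAL_CONN_PORT: '4324',
      BIGML_EXTERNAL_CONN_USER: 'my_user',
      BIGML_EXTERNAL_CONN_PWD: 'my_password',
      BIGML_EXTERNAL_CONN_DB: 'my_database'
    });
    console.log(connectionInfo);
```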

Logging configuration and Exceptions
------------------------------------

Logging is configured at startup to use the
[winston](https://github.com/flatiron/winston) logging library. The environment
variables ``BIGML_LOG_EXCEPTIONS`` and ``BIGML_EXIT_ON_ERROR`` can be set to
``0`` or ``1`` to control whether BigML logs errors and whether it exits
when they occur. By default, BigML uses ``winston`` to handle errors and
exits when an uncaught exception is raised.

Logs are sent
both to the console and to a `bigml.log` file by default. You can change this
behaviour by using:

- BIGML_LOG_FILE: path to the log file.
- BIGML_LOG_LEVEL: log level (0 - no output at all, 1 - console and file log,
  2 - console log only, 3 - file log only,
  4 - console and log file with debug info)

For instance,

```batch
export BIGML_LOG_FILE=/tmp/my_log_file.log
export BIGML_LOG_LEVEL=3
```

would store log information only in the `/tmp/my_log_file.log` file.
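
The `BIGML_LOG_LEVEL` codes listed above can be summarized as a lookup table.
The sketch below is only a reading aid: `LOG_LEVELS` and `describeLogLevel`
are hypothetical names, not part of the bindings, and the table simply
restates the list.

```js
    // Hypothetical lookup table mirroring the BIGML_LOG_LEVEL list above.
    var LOG_LEVELS = {
      '0': {console: false, file: false, debug: false},
      '1': {console: true, file: true, debug: false},
      '2': {console: true, file: false, debug: false},
      '3': {console: false, file: true, debug: false},
      '4': {console: true, file: true, debug: true}
    };

    // Returns the destinations for a level, or null for unknown values.
    function describeLogLevel(level) {
      var key = String(level);
      return Object.prototype.hasOwnProperty.call(LOG_LEVELS, key) ?
        LOG_LEVELS[key] : null;
    }

    console.log(describeLogLevel(3));
```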

Additional Information
----------------------

For additional information about the API, see the
[BigML developer's documentation](https://bigml.com/api).