1 | # [SheetJS](https://sheetjs.com)
|
2 |
|
3 | The SheetJS Community Edition offers battle-tested open-source solutions for
|
4 | extracting useful data from almost any complex spreadsheet and generating new
|
5 | spreadsheets that will work with legacy and modern software alike.
|
6 |
|
7 | [SheetJS Pro](https://sheetjs.com/pro) offers solutions beyond data processing:
|
8 | Edit complex templates with ease; let out your inner Picasso with styling; make
|
9 | custom sheets with images/graphs/PivotTables; evaluate formula expressions and
|
10 | port calculations to web apps; automate common spreadsheet tasks, and much more!
|
11 |
|
12 | ![License](https://img.shields.io/github/license/SheetJS/sheetjs)
|
13 | [![Build Status](https://img.shields.io/github/workflow/status/sheetjs/sheetjs/Tests:%20node.js)](https://github.com/SheetJS/sheetjs/actions)
|
14 | [![Snyk Vulnerabilities](https://img.shields.io/snyk/vulnerabilities/github/SheetJS/sheetjs)](https://snyk.io/test/github/SheetJS/sheetjs)
|
15 | [![npm Downloads](https://img.shields.io/npm/dm/xlsx.svg)](https://npmjs.org/package/xlsx)
|
16 | [![Analytics](https://ga-beacon.appspot.com/UA-36810333-1/SheetJS/sheetjs?pixel)](https://github.com/SheetJS/sheetjs)
|
17 |
|
18 | [**Browser Test and Support Matrix**](https://oss.sheetjs.com/sheetjs/tests/)
|
19 |
|
20 | [![Build Status](https://saucelabs.com/browser-matrix/sheetjs.svg)](https://saucelabs.com/u/sheetjs)
|
21 |
|
22 | **Supported File Formats**
|
23 |
|
24 | ![circo graph of format support](formats.png)
|
25 |
|
26 | ![graph legend](legend.png)
|
27 |
|
28 | ## Table of Contents
|
29 |
|
30 | <details>
|
31 | <summary><b>Expand to show Table of Contents</b></summary>
|
32 |
|
33 |
|
34 |
|
35 | - [Getting Started](#getting-started)
|
36 | * [Installation](#installation)
|
37 | * [Usage](#usage)
|
38 | * [The Zen of SheetJS](#the-zen-of-sheetjs)
|
39 | * [JS Ecosystem Demos](#js-ecosystem-demos)
|
40 | - [Acquiring and Extracting Data](#acquiring-and-extracting-data)
|
41 | * [Parsing Workbooks](#parsing-workbooks)
|
42 | * [Processing JSON and JS Data](#processing-json-and-js-data)
|
43 | * [Processing HTML Tables](#processing-html-tables)
|
44 | - [Processing Data](#processing-data)
|
45 | * [Modifying Workbook Structure](#modifying-workbook-structure)
|
46 | * [Modifying Cell Values](#modifying-cell-values)
|
47 | * [Modifying Other Worksheet / Workbook / Cell Properties](#modifying-other-worksheet--workbook--cell-properties)
|
48 | - [Packaging and Releasing Data](#packaging-and-releasing-data)
|
49 | * [Writing Workbooks](#writing-workbooks)
|
50 | * [Writing Examples](#writing-examples)
|
51 | * [Streaming Write](#streaming-write)
|
52 | * [Generating JSON and JS Data](#generating-json-and-js-data)
|
53 | * [Generating HTML Tables](#generating-html-tables)
|
54 | * [Generating Single-Worksheet Snapshots](#generating-single-worksheet-snapshots)
|
55 | - [Interface](#interface)
|
56 | * [Parsing functions](#parsing-functions)
|
57 | * [Writing functions](#writing-functions)
|
58 | * [Utilities](#utilities)
|
59 | - [Common Spreadsheet Format](#common-spreadsheet-format)
|
60 | * [General Structures](#general-structures)
|
61 | * [Cell Object](#cell-object)
|
62 | + [Data Types](#data-types)
|
63 | + [Dates](#dates)
|
64 | * [Sheet Objects](#sheet-objects)
|
65 | + [Worksheet Object](#worksheet-object)
|
66 | + [Chartsheet Object](#chartsheet-object)
|
67 | + [Macrosheet Object](#macrosheet-object)
|
68 | + [Dialogsheet Object](#dialogsheet-object)
|
69 | * [Workbook Object](#workbook-object)
|
70 | + [Workbook File Properties](#workbook-file-properties)
|
71 | * [Workbook-Level Attributes](#workbook-level-attributes)
|
72 | + [Defined Names](#defined-names)
|
73 | + [Workbook Views](#workbook-views)
|
74 | + [Miscellaneous Workbook Properties](#miscellaneous-workbook-properties)
|
75 | * [Document Features](#document-features)
|
76 | + [Formulae](#formulae)
|
77 | + [Row and Column Properties](#row-and-column-properties)
|
78 | + [Number Formats](#number-formats)
|
79 | + [Hyperlinks](#hyperlinks)
|
80 | + [Cell Comments](#cell-comments)
|
81 | + [Sheet Visibility](#sheet-visibility)
|
82 | + [VBA and Macros](#vba-and-macros)
|
83 | - [Parsing Options](#parsing-options)
|
84 | * [Input Type](#input-type)
|
85 | * [Guessing File Type](#guessing-file-type)
|
86 | - [Writing Options](#writing-options)
|
87 | * [Supported Output Formats](#supported-output-formats)
|
88 | * [Output Type](#output-type)
|
89 | - [Utility Functions](#utility-functions)
|
90 | * [Array of Arrays Input](#array-of-arrays-input)
|
91 | * [Array of Objects Input](#array-of-objects-input)
|
92 | * [HTML Table Input](#html-table-input)
|
93 | * [Formulae Output](#formulae-output)
|
94 | * [Delimiter-Separated Output](#delimiter-separated-output)
|
95 | + [UTF-16 Unicode Text](#utf-16-unicode-text)
|
96 | * [HTML Output](#html-output)
|
97 | * [JSON](#json)
|
98 | - [File Formats](#file-formats)
|
99 | - [Testing](#testing)
|
100 | * [Node](#node)
|
101 | * [Browser](#browser)
|
102 | * [Tested Environments](#tested-environments)
|
103 | * [Test Files](#test-files)
|
104 | - [Contributing](#contributing)
|
105 | * [OSX/Linux](#osxlinux)
|
106 | * [Windows](#windows)
|
107 | * [Tests](#tests)
|
108 | - [License](#license)
|
109 | - [References](#references)
|
110 |
|
111 |
|
112 |
|
113 | </details>
|
114 |
|
115 | ## Getting Started
|
116 |
|
117 | ### Installation
|
118 |
|
119 | **Standalone Browser Scripts**
|
120 |
|
121 | The complete browser standalone build is saved to `dist/xlsx.full.min.js` and
|
122 | can be directly added to a page with a `script` tag:
|
123 |
|
124 | ```html
|
125 | <script lang="javascript" src="dist/xlsx.full.min.js"></script>
|
126 | ```
|
127 |
|
128 | <details>
|
129 | <summary><b>CDN Availability</b> (click to show)</summary>
|
130 |
|
131 | | CDN | URL |
|
132 | |-----------:|:-------------------------------------------|
|
133 | | `unpkg` | <https://unpkg.com/xlsx/> |
|
134 | | `jsDelivr` | <https://jsdelivr.com/package/npm/xlsx> |
|
135 | | `CDNjs` | <https://cdnjs.com/libraries/xlsx> |
|
136 |
|
137 | For example, `unpkg` makes the latest version available at:
|
138 |
|
139 | ```html
|
140 | <script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>
|
141 | ```
|
142 |
|
143 | </details>
|
144 |
|
145 | <details>
|
146 | <summary><b>Browser builds</b> (click to show)</summary>
|
147 |
|
148 | The complete single-file version is generated at `dist/xlsx.full.min.js`
|
149 |
|
150 | `dist/xlsx.core.min.js` omits codepage library (no support for XLS encodings)
|
151 |
|
152 | A slimmer build is generated at `dist/xlsx.mini.min.js`. Compared to full build:
|
153 | - codepage library skipped (no support for XLS encodings)
|
154 | - no support for XLSB / XLS / Lotus 1-2-3 / SpreadsheetML 2003 / Numbers
|
155 | - node stream utils removed
|
156 |
|
157 | </details>
|
158 |
|
159 |
|
160 | With [bower](https://bower.io/search/?q=js-xlsx):
|
161 |
|
162 | ```bash
|
163 | $ bower install js-xlsx
|
164 | ```
|
165 |
|
166 | **ECMAScript Modules**
|
167 |
|
168 | The ECMAScript Module build is saved to `xlsx.mjs` and can be directly added to
|
169 | a page with a `script` tag using `type=module`:
|
170 |
|
171 | ```html
|
172 | <script type="module">
|
173 | import { read, writeFileXLSX } from "./xlsx.mjs";
|
174 |
|
175 | /* load the codepage support library for extended support with older formats */
|
176 | import { set_cptable } from "./xlsx.mjs";
|
177 | import * as cptable from './dist/cpexcel.full.mjs';
|
178 | set_cptable(cptable);
|
179 | </script>
|
180 | ```
|
181 |
|
182 | The [npm package](https://www.npmjs.org/package/xlsx) also exposes the module
|
183 | with the `module` parameter, supported in Angular and other projects:
|
184 |
|
185 | ```ts
|
186 | import { read, writeFileXLSX } from "xlsx";
|
187 |
|
188 | /* load the codepage support library for extended support with older formats */
|
189 | import { set_cptable } from "xlsx";
|
190 | import * as cptable from 'xlsx/dist/cpexcel.full.mjs';
|
191 | set_cptable(cptable);
|
192 | ```
|
193 |
|
194 | **Deno**
|
195 |
|
196 | `xlsx.mjs` can be imported in Deno. It is available from `unpkg`:
|
197 |
|
198 | ```ts
|
199 | // @deno-types="https://unpkg.com/xlsx/types/index.d.ts"
|
200 | import * as XLSX from 'https://unpkg.com/xlsx/xlsx.mjs';
|
201 |
|
202 | /* load the codepage support library for extended support with older formats */
|
203 | import * as cptable from 'https://unpkg.com/xlsx/dist/cpexcel.full.mjs';
|
204 | XLSX.set_cptable(cptable);
|
205 | ```
|
206 |
|
207 | **NodeJS**
|
208 |
|
209 | With [npm](https://www.npmjs.org/package/xlsx):
|
210 |
|
211 | ```bash
|
212 | $ npm install xlsx
|
213 | ```
|
214 |
|
215 | By default, the module supports `require`:
|
216 |
|
217 | ```js
|
218 | var XLSX = require("xlsx");
|
219 | ```
|
220 |
|
221 | The module also ships with `xlsx.mjs` for use with `import`:
|
222 |
|
223 | ```js
|
224 | import * as XLSX from 'xlsx/xlsx.mjs';
|
225 |
|
226 | /* load 'fs' for readFile and writeFile support */
|
227 | import * as fs from 'fs';
|
228 | XLSX.set_fs(fs);
|
229 |
|
230 | /* load 'stream' for stream support */
|
231 | import { Readable } from 'stream';
|
232 | XLSX.stream.set_readable(Readable);
|
233 |
|
234 | /* load the codepage support library for extended support with older formats */
|
235 | import * as cpexcel from 'xlsx/dist/cpexcel.full.mjs';
|
236 | XLSX.set_cptable(cpexcel);
|
237 | ```
|
238 |
|
239 | **Photoshop and InDesign**
|
240 |
|
241 | `dist/xlsx.extendscript.js` is an ExtendScript build for Photoshop and InDesign
|
242 | that is included in the `npm` package. It can be directly referenced with a
|
243 | `#include` directive:
|
244 |
|
245 | ```extendscript
|
246 | #include "xlsx.extendscript.js"
|
247 | ```
|
248 |
|
249 |
|
250 | <details>
|
251 | <summary><b>Internet Explorer and ECMAScript 3 Compatibility</b> (click to show)</summary>
|
252 |
|
253 | For broad compatibility with JavaScript engines, the library is written using
|
254 | ECMAScript 3 language dialect as well as some ES5 features like `Array#forEach`.
|
255 | Older browsers require shims to provide missing functions.
|
256 |
|
257 | To use the shim, add the shim before the script tag that loads `xlsx.js`:
|
258 |
|
259 | ```html
|
260 | <!-- add the shim first -->
|
261 | <script type="text/javascript" src="shim.min.js"></script>
|
262 | <!-- after the shim is referenced, add the library -->
|
263 | <script type="text/javascript" src="xlsx.full.min.js"></script>
|
264 | ```
|
265 |
|
266 | The script also includes `IE_LoadFile` and `IE_SaveFile` for loading and saving
|
267 | files in Internet Explorer versions 6-9. The `xlsx.extendscript.js` script
|
268 | bundles the shim in a format suitable for Photoshop and other Adobe products.
|
269 |
|
270 | </details>
|
271 |
|
272 | ### Usage
|
273 |
|
274 | Most scenarios involving spreadsheets and data can be broken into 5 parts:
|
275 |
|
276 | 1) **Acquire Data**: Data may be stored anywhere: local or remote files,
|
277 | databases, HTML TABLE, or even generated programmatically in the web browser.
|
278 |
|
279 | 2) **Extract Data**: For spreadsheet files, this involves parsing raw bytes to
|
280 | read the cell data. For general JS data, this involves reshaping the data.
|
281 |
|
282 | 3) **Process Data**: From generating summary statistics to cleaning data
|
283 | records, this step is the heart of the problem.
|
284 |
|
285 | 4) **Package Data**: This can involve making a new spreadsheet or serializing
|
286 | with `JSON.stringify` or writing XML or simply flattening data for UI tools.
|
287 |
|
288 | 5) **Release Data**: Spreadsheet files can be uploaded to a server or written
|
289 | locally. Data can be presented to users in an HTML TABLE or data grid.
|
290 |
|
291 | A common problem involves generating a valid spreadsheet export from data stored
|
292 | in an HTML table. In this example, an HTML TABLE on the page will be scraped,
|
293 | a row will be added to the bottom with the date of the report, and a new file
|
294 | will be generated and downloaded locally. `XLSX.writeFile` takes care of
|
295 | packaging the data and attempting a local download:
|
296 |
|
297 | ```js
|
298 | // Acquire Data (reference to the HTML table)
|
299 | var table_elt = document.getElementById("my-table-id");
|
300 |
|
301 | // Extract Data (create a workbook object from the table)
|
302 | var workbook = XLSX.utils.table_to_book(table_elt);
|
303 |
|
304 | // Process Data (add a new row)
|
305 | var ws = workbook.Sheets["Sheet1"];
|
306 | XLSX.utils.sheet_add_aoa(ws, [["Created "+new Date().toISOString()]], {origin:-1});
|
307 |
|
308 | // Package and Release Data (`writeFile` tries to write and save an XLSB file)
|
309 | XLSX.writeFile(workbook, "Report.xlsb");
|
310 | ```
|
311 |
|
312 | This library tries to simplify steps 2 and 4 with functions to extract useful
|
313 | data from spreadsheet files (`read` / `readFile`) and generate new spreadsheet
|
314 | files from data (`write` / `writeFile`). Additional utility functions like
|
315 | `table_to_book` work with other common data sources like HTML tables.
|
316 |
|
317 | This documentation and various demo projects cover a number of common scenarios
|
318 | and approaches for steps 1 and 5.
|
319 |
|
320 | Utility functions help with step 3.
|
321 |
|
322 | ["Acquiring and Extracting Data"](#acquiring-and-extracting-data) describes
|
323 | solutions for common data import scenarios.
|
324 |
|
325 | ["Packaging and Releasing Data"](#packaging-and-releasing-data) describes
|
326 | solutions for common data export scenarios.
|
327 |
|
328 | ["Processing Data"](#packaging-and-releasing-data) describes solutions for
|
329 | common workbook processing and manipulation scenarios.
|
330 |
|
331 | ["Utility Functions"](#utility-functions) details utility functions for
|
332 | translating JSON Arrays and other common JS structures into worksheet objects.
|
333 |
|
334 | ### The Zen of SheetJS
|
335 |
|
336 | _Data processing should fit in any workflow_
|
337 |
|
338 | The library does not impose a separate lifecycle. It fits nicely in websites
|
339 | and apps built using any framework. The plain JS data objects play nice with
|
340 | Web Workers and future APIs.
|
341 |
|
342 | _JavaScript is a powerful language for data processing_
|
343 |
|
344 | The ["Common Spreadsheet Format"](#common-spreadsheet-format) is a simple object
|
345 | representation of the core concepts of a workbook. The various functions in the
|
346 | library provide low-level tools for working with the object.
|
347 |
|
348 | For friendly JS processing, there are utility functions for converting parts of
|
349 | a worksheet to/from an Array of Arrays. The following example combines powerful
|
350 | JS Array methods with a network request library to download data, select the
|
351 | information we want and create a workbook file:
|
352 |
|
353 | <details>
|
354 | <summary><b>Get Data from a JSON Endpoint and Generate a Workbook</b> (click to show)</summary>
|
355 |
|
356 | The goal is to generate a XLSB workbook of US President names and birthdays.
|
357 |
|
358 | **Acquire Data**
|
359 |
|
360 | _Raw Data_
|
361 |
|
362 | <https://theunitedstates.io/congress-legislators/executive.json> has the desired
|
363 | data. For example, John Adams:
|
364 |
|
365 | ```js
|
366 | {
|
367 | "id": { /* (data omitted) */ },
|
368 | "name": {
|
369 | "first": "John", // <-- first name
|
370 | "last": "Adams" // <-- last name
|
371 | },
|
372 | "bio": {
|
373 | "birthday": "1735-10-19", // <-- birthday
|
374 | "gender": "M"
|
375 | },
|
376 | "terms": [
|
377 | { "type": "viceprez", /* (other fields omitted) */ },
|
378 | { "type": "viceprez", /* (other fields omitted) */ },
|
379 | { "type": "prez", /* (other fields omitted) */ } // <-- look for "prez"
|
380 | ]
|
381 | }
|
382 | ```
|
383 |
|
384 | _Filtering for Presidents_
|
385 |
|
386 | The dataset includes Aaron Burr, a Vice President who was never President!
|
387 |
|
388 | `Array#filter` creates a new array with the desired rows. A President served
|
389 | at least one term with `type` set to `"prez"`. To test if a particular row has
|
390 | at least one `"prez"` term, `Array#some` is another native JS function. The
|
391 | complete filter would be:
|
392 |
|
393 | ```js
|
394 | const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));
|
395 | ```
|
396 |
|
397 | _Lining up the data_
|
398 |
|
399 | For this example, the name will be the first name combined with the last name
|
400 | (`row.name.first + " " + row.name.last`) and the birthday will be the subfield
|
401 | `row.bio.birthday`. Using `Array#map`, the dataset can be massaged in one call:
|
402 |
|
403 | ```js
|
404 | const rows = prez.map(row => ({
|
405 | name: row.name.first + " " + row.name.last,
|
406 | birthday: row.bio.birthday
|
407 | }));
|
408 | ```
|
409 |
|
410 | The result is an array of "simple" objects with no nesting:
|
411 |
|
412 | ```js
|
413 | [
|
414 | { name: "George Washington", birthday: "1732-02-22" },
|
415 | { name: "John Adams", birthday: "1735-10-19" },
|
416 | // ... one row per President
|
417 | ]
|
418 | ```
|
419 |
|
420 | **Extract Data**
|
421 |
|
422 | With the cleaned dataset, `XLSX.utils.json_to_sheet` generates a worksheet:
|
423 |
|
424 | ```js
|
425 | const worksheet = XLSX.utils.json_to_sheet(rows);
|
426 | ```
|
427 |
|
428 | `XLSX.utils.book_new` creates a new workbook and `XLSX.utils.book_append_sheet`
|
429 | appends a worksheet to the workbook. The new worksheet will be called "Dates":
|
430 |
|
431 | ```js
|
432 | const workbook = XLSX.utils.book_new();
|
433 | XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");
|
434 | ```
|
435 |
|
436 | **Process Data**
|
437 |
|
438 | _Fixing headers_
|
439 |
|
440 | By default, `json_to_sheet` creates a worksheet with a header row. In this case,
|
441 | the headers come from the JS object keys: "name" and "birthday".
|
442 |
|
443 | The headers are in cells A1 and B1. `XLSX.utils.sheet_add_aoa` can write text
|
444 | values to the existing worksheet starting at cell A1:
|
445 |
|
446 | ```js
|
447 | XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });
|
448 | ```
|
449 |
|
450 | _Fixing Column Widths_
|
451 |
|
452 | Some of the names are longer than the default column width. Column widths are
|
453 | set by [setting the `"!cols"` worksheet property](#row-and-column-properties).
|
454 |
|
455 | The following line sets the width of column A to approximately 10 characters:
|
456 |
|
457 | ```js
|
458 | worksheet["!cols"] = [ { wch: 10 } ]; // set column A width to 10 characters
|
459 | ```
|
460 |
|
461 | One `Array#reduce` call over `rows` can calculate the maximum width:
|
462 |
|
463 | ```js
|
464 | const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);
|
465 | worksheet["!cols"] = [ { wch: max_width } ];
|
466 | ```
|
467 |
|
468 | Note: If the starting point was a file or HTML table, `XLSX.utils.sheet_to_json`
|
469 | will generate an array of JS objects.
|
470 |
|
471 | **Package and Release Data**
|
472 |
|
473 | `XLSX.writeFile` creates a spreadsheet file and tries to write it to the system.
|
474 | In the browser, it will try to prompt the user to download the file. In NodeJS,
|
475 | it will write to the local directory.
|
476 |
|
477 | ```js
|
478 | XLSX.writeFile(workbook, "Presidents.xlsx");
|
479 | ```
|
480 |
|
481 | **Complete Example**
|
482 |
|
483 | ```js
|
484 | // Uncomment the next line for use in NodeJS:
|
485 | // const XLSX = require("xlsx"), axios = require("axios");
|
486 |
|
487 | (async() => {
|
488 | /* fetch JSON data and parse */
|
489 | const url = "https://theunitedstates.io/congress-legislators/executive.json";
|
490 | const raw_data = (await axios(url, {responseType: "json"})).data;
|
491 |
|
492 | /* filter for the Presidents */
|
493 | const prez = raw_data.filter(row => row.terms.some(term => term.type === "prez"));
|
494 |
|
495 | /* flatten objects */
|
496 | const rows = prez.map(row => ({
|
497 | name: row.name.first + " " + row.name.last,
|
498 | birthday: row.bio.birthday
|
499 | }));
|
500 |
|
501 | /* generate worksheet and workbook */
|
502 | const worksheet = XLSX.utils.json_to_sheet(rows);
|
503 | const workbook = XLSX.utils.book_new();
|
504 | XLSX.utils.book_append_sheet(workbook, worksheet, "Dates");
|
505 |
|
506 | /* fix headers */
|
507 | XLSX.utils.sheet_add_aoa(worksheet, [["Name", "Birthday"]], { origin: "A1" });
|
508 |
|
509 | /* calculate column width */
|
510 | const max_width = rows.reduce((w, r) => Math.max(w, r.name.length), 10);
|
511 | worksheet["!cols"] = [ { wch: max_width } ];
|
512 |
|
513 | /* create an XLSX file and try to save to Presidents.xlsx */
|
514 | XLSX.writeFile(workbook, "Presidents.xlsx");
|
515 | })();
|
516 | ```
|
517 |
|
518 | For use in the web browser, assuming the snippet is saved to `snippet.js`,
|
519 | script tags should be used to include the `axios` and `xlsx` standalone builds:
|
520 |
|
521 | ```html
|
522 | <script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>
|
523 | <script src="https://unpkg.com/axios/dist/axios.min.js"></script>
|
524 | <script src="snippet.js"></script>
|
525 | ```
|
526 |
|
527 |
|
528 | </details>
|
529 |
|
530 | _File formats are implementation details_
|
531 |
|
532 | The parser covers a wide gamut of common spreadsheet file formats to ensure that
|
533 | "HTML-saved-as-XLS" files work as well as actual XLS or XLSX files.
|
534 |
|
535 | The writer supports a number of common output formats for broad compatibility
|
536 | with the data ecosystem.
|
537 |
|
538 | To the greatest extent possible, data processing code should not have to worry
|
539 | about the specific file formats involved.
|
540 |
|
541 |
|
542 | ### JS Ecosystem Demos
|
543 |
|
544 | The [`demos` directory](demos/) includes sample projects for:
|
545 |
|
546 | **Frameworks and APIs**
|
547 | - [`angularjs`](demos/angular/)
|
548 | - [`angular and ionic`](demos/angular2/)
|
549 | - [`knockout`](demos/knockout/)
|
550 | - [`meteor`](demos/meteor/)
|
551 | - [`react and react-native`](demos/react/)
|
552 | - [`vue 2.x and weex`](demos/vue/)
|
553 | - [`XMLHttpRequest and fetch`](demos/xhr/)
|
554 | - [`nodejs server`](demos/server/)
|
555 | - [`databases and key/value stores`](demos/database/)
|
556 | - [`typed arrays and math`](demos/array/)
|
557 |
|
558 | **Bundlers and Tooling**
|
559 | - [`browserify`](demos/browserify/)
|
560 | - [`fusebox`](demos/fusebox/)
|
561 | - [`parcel`](demos/parcel/)
|
562 | - [`requirejs`](demos/requirejs/)
|
563 | - [`rollup`](demos/rollup/)
|
564 | - [`systemjs`](demos/systemjs/)
|
565 | - [`typescript`](demos/typescript/)
|
566 | - [`webpack 2.x`](demos/webpack/)
|
567 |
|
568 | **Platforms and Integrations**
|
569 | - [`deno`](demos/deno/)
|
570 | - [`electron application`](demos/electron/)
|
571 | - [`nw.js application`](demos/nwjs/)
|
572 | - [`Chrome / Chromium extensions`](demos/chrome/)
|
573 | - [`Download a Google Sheet locally`](demos/google-sheet/)
|
574 | - [`Adobe ExtendScript`](demos/extendscript/)
|
575 | - [`Headless Browsers`](demos/headless/)
|
576 | - [`canvas-datagrid`](demos/datagrid/)
|
577 | - [`x-spreadsheet`](demos/xspreadsheet/)
|
578 | - [`react-data-grid`](demos/react/modify/)
|
579 | - [`vue3-table-light`](demos/vue/modify/)
|
580 | - [`Swift JSC and other engines`](demos/altjs/)
|
581 | - [`"serverless" functions`](demos/function/)
|
582 | - [`internet explorer`](demos/oldie/)
|
583 |
|
584 | Other examples are included in the [showcase](demos/showcase/).
|
585 |
|
586 | <https://sheetjs.com/demos/modify.html> shows a complete example of reading,
|
587 | modifying, and writing files.
|
588 |
|
589 | <https://github.com/SheetJS/sheetjs/blob/HEAD/bin/xlsx.njs> is the command-line
|
590 | tool included with node installations, reading spreadsheet files and exporting
|
591 | the contents in various formats.
|
592 | ## Acquiring and Extracting Data
|
593 |
|
594 | ### Parsing Workbooks
|
595 |
|
596 | **API**
|
597 |
|
598 | _Extract data from spreadsheet bytes_
|
599 |
|
600 | ```js
|
601 | var workbook = XLSX.read(data, opts);
|
602 | ```
|
603 |
|
604 | The `read` method can extract data from spreadsheet bytes stored in a JS string,
|
605 | "binary string", NodeJS buffer or typed array (`Uint8Array` or `ArrayBuffer`).
|
606 |
|
607 |
|
608 | _Read spreadsheet bytes from a local file and extract data_
|
609 |
|
610 | ```js
|
611 | var workbook = XLSX.readFile(filename, opts);
|
612 | ```
|
613 |
|
614 | The `readFile` method attempts to read a spreadsheet file at the supplied path.
|
615 | Browsers generally do not allow reading files in this way (it is deemed a
|
616 | security risk), and attempts to read files in this way will throw an error.
|
617 |
|
618 | The second `opts` argument is optional. ["Parsing Options"](#parsing-options)
|
619 | covers the supported properties and behaviors.
|
620 |
|
621 | **Examples**
|
622 |
|
623 | Here are a few common scenarios (click on each subtitle to see the code):
|
624 |
|
625 | <details>
|
626 | <summary><b>Local file in a NodeJS server</b> (click to show)</summary>
|
627 |
|
628 | `readFile` uses `fs.readFileSync` under the hood:
|
629 |
|
630 | ```js
|
631 | var XLSX = require("xlsx");
|
632 |
|
633 | var workbook = XLSX.readFile("test.xlsx");
|
634 | ```
|
635 |
|
636 | For Node ESM, the `readFile` helper is not enabled. Instead, `fs.readFileSync`
|
637 | should be used to read the file data as a `Buffer` for use with `XLSX.read`:
|
638 |
|
639 | ```js
|
640 | import { readFileSync } from "fs";
|
641 | import { read } from "xlsx/xlsx.mjs";
|
642 |
|
643 | const buf = readFileSync("test.xlsx");
|
644 | /* buf is a Buffer */
|
645 | const workbook = read(buf);
|
646 | ```
|
647 |
|
648 | </details>
|
649 |
|
650 | <details>
|
651 | <summary><b>Local file in a Deno application</b> (click to show)</summary>
|
652 |
|
653 | `readFile` uses `Deno.readFileSync` under the hood:
|
654 |
|
655 | ```js
|
656 | // @deno-types="https://deno.land/x/sheetjs/types/index.d.ts"
|
657 | import * as XLSX from 'https://deno.land/x/sheetjs/xlsx.mjs'
|
658 |
|
659 | const workbook = XLSX.readFile("test.xlsx");
|
660 | ```
|
661 |
|
662 | Applications reading files must be invoked with the `--allow-read` flag. The
|
663 | [`deno` demo](demos/deno/) has more examples
|
664 |
|
665 | </details>
|
666 |
|
667 | <details>
|
668 | <summary><b>User-submitted file in a web page ("Drag-and-Drop")</b> (click to show)</summary>
|
669 |
|
670 | For modern websites targeting Chrome 76+, `File#arrayBuffer` is recommended:
|
671 |
|
672 | ```js
|
673 | // XLSX is a global from the standalone script
|
674 |
|
675 | async function handleDropAsync(e) {
|
676 | e.stopPropagation(); e.preventDefault();
|
677 | const f = e.dataTransfer.files[0];
|
678 | /* f is a File */
|
679 | const data = await f.arrayBuffer();
|
680 | /* data is an ArrayBuffer */
|
681 | const workbook = XLSX.read(data);
|
682 |
|
683 | /* DO SOMETHING WITH workbook HERE */
|
684 | }
|
685 | drop_dom_element.addEventListener("drop", handleDropAsync, false);
|
686 | ```
|
687 |
|
688 | For maximal compatibility, the `FileReader` API should be used:
|
689 |
|
690 | ```js
|
691 | function handleDrop(e) {
|
692 | e.stopPropagation(); e.preventDefault();
|
693 | var f = e.dataTransfer.files[0];
|
694 | /* f is a File */
|
695 | var reader = new FileReader();
|
696 | reader.onload = function(e) {
|
697 | var data = e.target.result;
|
698 | /* reader.readAsArrayBuffer(file) -> data will be an ArrayBuffer */
|
699 | var workbook = XLSX.read(data);
|
700 |
|
701 | /* DO SOMETHING WITH workbook HERE */
|
702 | };
|
703 | reader.readAsArrayBuffer(f);
|
704 | }
|
705 | drop_dom_element.addEventListener("drop", handleDrop, false);
|
706 | ```
|
707 |
|
708 | <https://oss.sheetjs.com/sheetjs/> demonstrates the FileReader technique.
|
709 |
|
710 | </details>
|
711 |
|
712 | <details>
|
713 | <summary><b>User-submitted file with an HTML INPUT element</b> (click to show)</summary>
|
714 |
|
715 | Starting with an HTML INPUT element with `type="file"`:
|
716 |
|
717 | ```html
|
718 | <input type="file" id="input_dom_element">
|
719 | ```
|
720 |
|
721 | For modern websites targeting Chrome 76+, `Blob#arrayBuffer` is recommended:
|
722 |
|
723 | ```js
|
724 | // XLSX is a global from the standalone script
|
725 |
|
726 | async function handleFileAsync(e) {
|
727 | const file = e.target.files[0];
|
728 | const data = await file.arrayBuffer();
|
729 | /* data is an ArrayBuffer */
|
730 | const workbook = XLSX.read(data);
|
731 |
|
732 | /* DO SOMETHING WITH workbook HERE */
|
733 | }
|
734 | input_dom_element.addEventListener("change", handleFileAsync, false);
|
735 | ```
|
736 |
|
737 | For broader support (including IE10+), the `FileReader` approach is recommended:
|
738 |
|
739 | ```js
|
740 | function handleFile(e) {
|
741 | var file = e.target.files[0];
|
742 | var reader = new FileReader();
|
743 | reader.onload = function(e) {
|
744 | var data = e.target.result;
|
745 | /* reader.readAsArrayBuffer(file) -> data will be an ArrayBuffer */
|
746 | var workbook = XLSX.read(e.target.result);
|
747 |
|
748 | /* DO SOMETHING WITH workbook HERE */
|
749 | };
|
750 | reader.readAsArrayBuffer(file);
|
751 | }
|
752 | input_dom_element.addEventListener("change", handleFile, false);
|
753 | ```
|
754 |
|
755 | The [`oldie` demo](demos/oldie/) shows an IE-compatible fallback scenario.
|
756 |
|
757 | </details>
|
758 |
|
759 | <details>
|
760 | <summary><b>Fetching a file in the web browser ("Ajax")</b> (click to show)</summary>
|
761 |
|
762 | For modern websites targeting Chrome 42+, `fetch` is recommended:
|
763 |
|
764 | ```js
|
765 | // XLSX is a global from the standalone script
|
766 |
|
767 | (async() => {
|
768 | const url = "http://oss.sheetjs.com/test_files/formula_stress_test.xlsx";
|
769 | const data = await (await fetch(url)).arrayBuffer();
|
770 | /* data is an ArrayBuffer */
|
771 | const workbook = XLSX.read(data);
|
772 |
|
773 | /* DO SOMETHING WITH workbook HERE */
|
774 | })();
|
775 | ```
|
776 |
|
777 | For broader support, the `XMLHttpRequest` approach is recommended:
|
778 |
|
779 | ```js
|
780 | var url = "http://oss.sheetjs.com/test_files/formula_stress_test.xlsx";
|
781 |
|
782 | /* set up async GET request */
|
783 | var req = new XMLHttpRequest();
|
784 | req.open("GET", url, true);
|
785 | req.responseType = "arraybuffer";
|
786 |
|
787 | req.onload = function(e) {
|
788 | var workbook = XLSX.read(req.response);
|
789 |
|
790 | /* DO SOMETHING WITH workbook HERE */
|
791 | };
|
792 |
|
793 | req.send();
|
794 | ```
|
795 |
|
796 | The [`xhr` demo](demos/xhr/) includes a longer discussion and more examples.
|
797 |
|
798 | <http://oss.sheetjs.com/sheetjs/ajax.html> shows fallback approaches for IE6+.
|
799 |
|
800 | </details>
|
801 |
|
802 | <details>
|
803 | <summary><b>Local file in a PhotoShop or InDesign plugin</b> (click to show)</summary>
|
804 |
|
805 | `readFile` wraps the `File` logic in Photoshop and other ExtendScript targets.
|
806 | The specified path should be an absolute path:
|
807 |
|
808 | ```js
|
809 | #include "xlsx.extendscript.js"
|
810 |
|
811 | /* Read test.xlsx from the Documents folder */
|
812 | var workbook = XLSX.readFile(Folder.myDocuments + "/test.xlsx");
|
813 | ```
|
814 |
|
815 | The [`extendscript` demo](demos/extendscript/) includes a more complex example.
|
816 |
|
817 | </details>
|
818 |
|
819 | <details>
|
820 | <summary><b>Local file in an Electron app</b> (click to show)</summary>
|
821 |
|
822 | `readFile` can be used in the renderer process:
|
823 |
|
824 | ```js
|
825 | /* From the renderer process */
|
826 | var XLSX = require("xlsx");
|
827 |
|
828 | var workbook = XLSX.readFile(path);
|
829 | ```
|
830 |
|
831 | Electron APIs have changed over time. The [`electron` demo](demos/electron/)
|
832 | shows a complete example and details the required version-specific settings.
|
833 |
|
834 | </details>
|
835 |
|
836 | <details>
|
837 | <summary><b>Local file in a mobile app with React Native</b> (click to show)</summary>
|
838 |
|
839 | The [`react` demo](demos/react) includes a sample React Native app.
|
840 |
|
841 | Since React Native does not provide a way to read files from the filesystem, a
|
842 | third-party library must be used. The following libraries have been tested:
|
843 |
|
844 | - [`react-native-file-access`](https://npm.im/react-native-file-access)
|
845 |
|
846 | The `base64` encoding returns strings compatible with the `base64` type:
|
847 |
|
848 | ```js
|
849 | import XLSX from "xlsx";
|
850 | import { FileSystem } from "react-native-file-access";
|
851 |
|
852 | const b64 = await FileSystem.readFile(path, "base64");
|
853 | /* b64 is a base64 string */
|
854 | const workbook = XLSX.read(b64, {type: "base64"});
|
855 | ```
|
856 |
|
857 | - [`react-native-fs`](https://npm.im/react-native-fs)
|
858 |
|
859 | The `ascii` encoding returns binary strings compatible with the `binary` type:
|
860 |
|
861 | ```js
|
862 | import XLSX from "xlsx";
|
863 | import { readFile } from "react-native-fs";
|
864 |
|
865 | const bstr = await readFile(path, "ascii");
|
866 | /* bstr is a binary string */
|
867 | const workbook = XLSX.read(bstr, {type: "binary"});
|
868 | ```
|
869 |
|
870 | </details>
|
871 |
|
872 | <details>
|
873 | <summary><b>NodeJS Server File Uploads</b> (click to show)</summary>
|
874 |
|
875 | `read` can accept a NodeJS buffer. `readFile` can read files generated by a
|
876 | HTTP POST request body parser like [`formidable`](https://npm.im/formidable):
|
877 |
|
878 | ```js
|
879 | const XLSX = require("xlsx");
|
880 | const http = require("http");
|
881 | const formidable = require("formidable");
|
882 |
|
883 | const server = http.createServer((req, res) => {
|
884 | const form = new formidable.IncomingForm();
|
885 | form.parse(req, (err, fields, files) => {
|
886 | /* grab the first file */
|
887 | const f = Object.entries(files)[0][1];
|
888 | const path = f.filepath;
|
889 | const workbook = XLSX.readFile(path);
|
890 |
|
891 | /* DO SOMETHING WITH workbook HERE */
|
892 | });
|
893 | }).listen(process.env.PORT || 7262);
|
894 | ```
|
895 |
|
896 | The [`server` demo](demos/server) has more advanced examples.
|
897 |
|
898 | </details>
|
899 |
|
900 | <details>
|
901 | <summary><b>Download files in a NodeJS process</b> (click to show)</summary>
|
902 |
|
903 | Node 17.5 and 18.0 have native support for fetch:
|
904 |
|
905 | ```js
|
906 | const XLSX = require("xlsx");
|
907 |
|
908 | const data = await (await fetch(url)).arrayBuffer();
|
909 | /* data is an ArrayBuffer */
|
910 | const workbook = XLSX.read(data);
|
911 | ```
|
912 |
|
913 | For broader compatibility, third-party modules are recommended.
|
914 |
|
915 | [`request`](https://npm.im/request) requires a `null` encoding to yield Buffers:
|
916 |
|
917 | ```js
|
918 | var XLSX = require("xlsx");
|
919 | var request = require("request");
|
920 |
|
921 | request({url: url, encoding: null}, function(err, resp, body) {
|
922 | var workbook = XLSX.read(body);
|
923 |
|
924 | /* DO SOMETHING WITH workbook HERE */
|
925 | });
|
926 | ```
|
927 |
|
928 | [`axios`](https://npm.im/axios) works the same way in browser and in NodeJS:
|
929 |
|
930 | ```js
|
931 | const XLSX = require("xlsx");
|
932 | const axios = require("axios");
|
933 |
|
934 | (async() => {
|
935 | const res = await axios.get(url, {responseType: "arraybuffer"});
|
936 | /* res.data is a Buffer */
|
937 | const workbook = XLSX.read(res.data);
|
938 |
|
939 | /* DO SOMETHING WITH workbook HERE */
|
940 | })();
|
941 | ```
|
942 |
|
943 | </details>
|
944 |
|
945 | <details>
|
946 | <summary><b>Download files in an Electron app</b> (click to show)</summary>
|
947 |
|
948 | The `net` module in the main process can make HTTP/HTTPS requests to external
|
949 | resources. Responses should be manually concatenated using `Buffer.concat`:
|
950 |
|
951 | ```js
|
952 | const XLSX = require("xlsx");
|
953 | const { net } = require("electron");
|
954 |
|
955 | const req = net.request(url);
|
956 | req.on("response", (res) => {
|
957 | const bufs = []; // this array will collect all of the buffers
|
958 | res.on("data", (chunk) => { bufs.push(chunk); });
|
959 | res.on("end", () => {
|
960 | const workbook = XLSX.read(Buffer.concat(bufs));
|
961 |
|
962 | /* DO SOMETHING WITH workbook HERE */
|
963 | });
|
964 | });
|
965 | req.end();
|
966 | ```
|
967 |
|
968 | </details>
|
969 |
|
970 | <details>
|
971 | <summary><b>Readable Streams in NodeJS</b> (click to show)</summary>
|
972 |
|
973 | When dealing with Readable Streams, the easiest approach is to buffer the stream
|
974 | and process the whole thing at the end:
|
975 |
|
976 | ```js
|
977 | var fs = require("fs");
|
978 | var XLSX = require("xlsx");
|
979 |
|
980 | function process_RS(stream, cb) {
|
981 | var buffers = [];
|
982 | stream.on("data", function(data) { buffers.push(data); });
|
983 | stream.on("end", function() {
|
984 | var buffer = Buffer.concat(buffers);
|
985 | var workbook = XLSX.read(buffer, {type:"buffer"});
|
986 |
|
987 | /* DO SOMETHING WITH workbook IN THE CALLBACK */
|
988 | cb(workbook);
|
989 | });
|
990 | }
|
991 | ```
|
992 |
|
993 | </details>
|
994 |
|
995 | <details>
|
996 | <summary><b>ReadableStream in the browser</b> (click to show)</summary>
|
997 |
|
998 | When dealing with `ReadableStream`, the easiest approach is to buffer the stream
|
999 | and process the whole thing at the end:
|
1000 |
|
1001 | ```js
|
1002 | // XLSX is a global from the standalone script
|
1003 |
|
1004 | async function process_RS(stream) {
|
1005 | /* collect data */
|
1006 | const buffers = [];
|
1007 | const reader = stream.getReader();
|
1008 | for(;;) {
|
1009 | const res = await reader.read();
|
1010 | if(res.value) buffers.push(res.value);
|
1011 | if(res.done) break;
|
1012 | }
|
1013 |
|
1014 | /* concat */
|
1015 | const out = new Uint8Array(buffers.reduce((acc, v) => acc + v.length, 0));
|
1016 |
|
1017 | let off = 0;
|
1018 | for(const u8 of arr) {
|
1019 | out.set(u8, off);
|
1020 | off += u8.length;
|
1021 | }
|
1022 |
|
1023 | return out;
|
1024 | }
|
1025 |
|
1026 | const data = await process_RS(stream);
|
1027 | /* data is Uint8Array */
|
1028 | const workbook = XLSX.read(data);
|
1029 | ```
|
1030 |
|
1031 | </details>
|
1032 |
|
1033 | More detailed examples are covered in the [included demos](demos/)
|
1034 |
|
1035 | ### Processing JSON and JS Data
|
1036 |
|
1037 | JSON and JS data tend to represent single worksheets. This section will use a
|
1038 | few utility functions to generate workbooks.
|
1039 |
|
1040 | _Create a new Workbook_
|
1041 |
|
1042 | ```js
|
1043 | var workbook = XLSX.utils.book_new();
|
1044 | ```
|
1045 |
|
1046 | The `book_new` utility function creates an empty workbook with no worksheets.
|
1047 |
|
1048 | Spreadsheet software generally require at least one worksheet and enforce the
|
1049 | requirement in the user interface. This library enforces the requirement at
|
1050 | write time, throwing errors if an empty workbook is passed to write functions.
|
1051 |
|
1052 |
|
1053 | **API**
|
1054 |
|
1055 | _Create a worksheet from an array of arrays of JS values_
|
1056 |
|
1057 | ```js
|
1058 | var worksheet = XLSX.utils.aoa_to_sheet(aoa, opts);
|
1059 | ```
|
1060 |
|
1061 | The `aoa_to_sheet` utility function walks an "array of arrays" in row-major
|
1062 | order, generating a worksheet object. The following snippet generates a sheet
|
1063 | with cell `A1` set to the string `A1`, cell `B1` set to `B1`, etc:
|
1064 |
|
1065 | ```js
|
1066 | var worksheet = XLSX.utils.aoa_to_sheet([
|
1067 | ["A1", "B1", "C1"],
|
1068 | ["A2", "B2", "C2"],
|
1069 | ["A3", "B3", "C3"]
|
1070 | ]);
|
1071 | ```
|
1072 |
|
1073 | ["Array of Arrays Input"](#array-of-arrays-input) describes the function and the
|
1074 | optional `opts` argument in more detail.
|
1075 |
|
1076 |
|
1077 | _Create a worksheet from an array of JS objects_
|
1078 |
|
1079 | ```js
|
1080 | var worksheet = XLSX.utils.json_to_sheet(jsa, opts);
|
1081 | ```
|
1082 |
|
1083 | The `json_to_sheet` utility function walks an array of JS objects in order,
|
1084 | generating a worksheet object. By default, it will generate a header row and
|
1085 | one row per object in the array. The optional `opts` argument has settings to
|
1086 | control the column order and header output.
|
1087 |
|
1088 | ["Array of Objects Input"](#array-of-arrays-input) describes the function and
|
1089 | the optional `opts` argument in more detail.
|
1090 |
|
1091 | **Examples**
|
1092 |
|
1093 | ["Zen of SheetJS"](#the-zen-of-sheetjs) contains a detailed example "Get Data
|
1094 | from a JSON Endpoint and Generate a Workbook"
|
1095 |
|
1096 |
|
1097 | [`x-spreadsheet`](https://github.com/myliang/x-spreadsheet) is an interactive
|
1098 | data grid for previewing and modifying structured data in the web browser. The
|
1099 | [`xspreadsheet` demo](/demos/xspreadsheet) includes a sample script with the
|
1100 | `xtos` function for converting from x-spreadsheet data object to a workbook.
|
1101 | <https://oss.sheetjs.com/sheetjs/x-spreadsheet> is a live demo.
|
1102 |
|
1103 | <details>
|
1104 | <summary><b>Records from a database query (SQL or no-SQL)</b> (click to show)</summary>
|
1105 |
|
1106 | The [`database` demo](/demos/database/) includes examples of working with
|
1107 | databases and query results.
|
1108 |
|
1109 | </details>
|
1110 |
|
1111 |
|
1112 | <details>
|
1113 | <summary><b>Numerical Computations with TensorFlow.js</b> (click to show)</summary>
|
1114 |
|
1115 | [`@tensorflow/tfjs`](@tensorflow/tfjs) and other libraries expect data in simple
|
1116 | arrays, well-suited for worksheets where each column is a data vector. That is
|
1117 | the transpose of how most people use spreadsheets, where each row is a vector.
|
1118 |
|
1119 | When recovering data from `tfjs`, the returned data points are stored in a typed
|
1120 | array. An array of arrays can be constructed with loops. `Array#unshift` can
|
1121 | prepend a title row before the conversion:
|
1122 |
|
1123 | ```js
|
1124 | const XLSX = require("xlsx");
|
1125 | const tf = require('@tensorflow/tfjs');
|
1126 |
|
1127 | /* suppose xs and ys are vectors (1D tensors) -> tfarr will be a typed array */
|
1128 | const tfdata = tf.stack([xs, ys]).transpose();
|
1129 | const shape = tfdata.shape;
|
1130 | const tfarr = tfdata.dataSync();
|
1131 |
|
1132 | /* construct the array of arrays */
|
1133 | const aoa = [];
|
1134 | for(let j = 0; j < shape[0]; ++j) {
|
1135 | aoa[j] = [];
|
1136 | for(let i = 0; i < shape[1]; ++i) aoa[j][i] = tfarr[j * shape[1] + i];
|
1137 | }
|
1138 | /* add headers to the top */
|
1139 | aoa.unshift(["x", "y"]);
|
1140 |
|
1141 | /* generate worksheet */
|
1142 | const worksheet = XLSX.utils.aoa_to_sheet(aoa);
|
1143 | ```
|
1144 |
|
1145 | The [`array` demo](demos/array/) shows a complete example.
|
1146 |
|
1147 | </details>
|
1148 |
|
1149 |
|
1150 | ### Processing HTML Tables
|
1151 |
|
1152 | **API**
|
1153 |
|
1154 | _Create a worksheet by scraping an HTML TABLE in the page_
|
1155 |
|
1156 | ```js
|
1157 | var worksheet = XLSX.utils.table_to_sheet(dom_element, opts);
|
1158 | ```
|
1159 |
|
1160 | The `table_to_sheet` utility function takes a DOM TABLE element and iterates
|
1161 | through the rows to generate a worksheet. The `opts` argument is optional.
|
1162 | ["HTML Table Input"](#html-table-input) describes the function in more detail.
|
1163 |
|
1164 |
|
1165 |
|
1166 | _Create a workbook by scraping an HTML TABLE in the page_
|
1167 |
|
1168 | ```js
|
1169 | var workbook = XLSX.utils.table_to_book(dom_element, opts);
|
1170 | ```
|
1171 |
|
1172 | The `table_to_book` utility function follows the same logic as `table_to_sheet`.
|
1173 | After generating a worksheet, it creates a blank workbook and appends the
|
1174 | spreadsheet.
|
1175 |
|
1176 | The options argument supports the same options as `table_to_sheet`, with the
|
1177 | addition of a `sheet` property to control the worksheet name. If the property
|
1178 | is missing or no options are specified, the default name `Sheet1` is used.
|
1179 |
|
1180 | **Examples**
|
1181 |
|
1182 | Here are a few common scenarios (click on each subtitle to see the code):
|
1183 |
|
1184 | <details>
|
1185 | <summary><b>HTML TABLE element in a webpage</b> (click to show)</summary>
|
1186 |
|
1187 | ```html
|
1188 | <!-- include the standalone script and shim. this uses the UNPKG CDN -->
|
1189 | <script src="https://unpkg.com/xlsx/dist/shim.min.js"></script>
|
1190 | <script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>
|
1191 |
|
1192 | <!-- example table with id attribute -->
|
1193 | <table id="tableau">
|
1194 | <tr><td>Sheet</td><td>JS</td></tr>
|
1195 | <tr><td>12345</td><td>67</td></tr>
|
1196 | </table>
|
1197 |
|
1198 | <!-- this block should appear after the table HTML and the standalone script -->
|
1199 | <script type="text/javascript">
|
1200 | var workbook = XLSX.utils.table_to_book(document.getElementById("tableau"));
|
1201 |
|
1202 | /* DO SOMETHING WITH workbook HERE */
|
1203 | </script>
|
1204 | ```
|
1205 |
|
1206 | Multiple tables on a web page can be converted to individual worksheets:
|
1207 |
|
1208 | ```js
|
1209 | /* create new workbook */
|
1210 | var workbook = XLSX.utils.book_new();
|
1211 |
|
1212 | /* convert table "table1" to worksheet named "Sheet1" */
|
1213 | var sheet1 = XLSX.utils.table_to_sheet(document.getElementById("table1"));
|
1214 | XLSX.utils.book_append_sheet(workbook, sheet1, "Sheet1");
|
1215 |
|
1216 | /* convert table "table2" to worksheet named "Sheet2" */
|
1217 | var sheet2 = XLSX.utils.table_to_sheet(document.getElementById("table2"));
|
1218 | XLSX.utils.book_append_sheet(workbook, sheet2, "Sheet2");
|
1219 |
|
1220 | /* workbook now has 2 worksheets */
|
1221 | ```
|
1222 |
|
1223 | Alternatively, the HTML code can be extracted and parsed:
|
1224 |
|
1225 | ```js
|
1226 | var htmlstr = document.getElementById("tableau").outerHTML;
|
1227 | var workbook = XLSX.read(htmlstr, {type:"string"});
|
1228 | ```
|
1229 |
|
1230 | </details>
|
1231 |
|
1232 | <details>
|
1233 | <summary><b>Chrome/Chromium Extension</b> (click to show)</summary>
|
1234 |
|
1235 | The [`chrome` demo](demos/chrome/) shows a complete example and details the
|
1236 | required permissions and other settings.
|
1237 |
|
1238 | In an extension, it is recommended to generate the workbook in a content script
|
1239 | and pass the object back to the extension:
|
1240 |
|
1241 | ```js
|
1242 | /* in the worker script */
|
1243 | chrome.runtime.onMessage.addListener(function(msg, sender, cb) {
|
1244 | /* pass a message like { sheetjs: true } from the extension to scrape */
|
1245 | if(!msg || !msg.sheetjs) return;
|
1246 | /* create a new workbook */
|
1247 | var workbook = XLSX.utils.book_new();
|
1248 | /* loop through each table element */
|
1249 | var tables = document.getElementsByTagName("table")
|
1250 | for(var i = 0; i < tables.length; ++i) {
|
1251 | var worksheet = XLSX.utils.table_to_sheet(tables[i]);
|
1252 | XLSX.utils.book_append_sheet(workbook, worksheet, "Table" + i);
|
1253 | }
|
1254 | /* pass back to the extension */
|
1255 | return cb(workbook);
|
1256 | });
|
1257 | ```
|
1258 |
|
1259 | </details>
|
1260 |
|
1261 | <details>
|
1262 | <summary><b>Server-Side HTML Tables with Headless Chrome</b> (click to show)</summary>
|
1263 |
|
1264 | The [`headless` demo](demos/headless/) includes a complete demo to convert HTML
|
1265 | files to XLSB workbooks. The core idea is to add the script to the page, parse
|
1266 | the table in the page context, generate a `base64` workbook and send it back
|
1267 | for further processing:
|
1268 |
|
1269 | ```js
|
1270 | const XLSX = require("xlsx");
|
1271 | const { readFileSync } = require("fs"), puppeteer = require("puppeteer");
|
1272 |
|
1273 | const url = `https://sheetjs.com/demos/table`;
|
1274 |
|
1275 | /* get the standalone build source (node_modules/xlsx/dist/xlsx.full.min.js) */
|
1276 | const lib = readFileSync(require.resolve("xlsx/dist/xlsx.full.min.js"), "utf8");
|
1277 |
|
1278 | (async() => {
|
1279 | /* start browser and go to web page */
|
1280 | const browser = await puppeteer.launch();
|
1281 | const page = await browser.newPage();
|
1282 | await page.goto(url, {waitUntil: "networkidle2"});
|
1283 |
|
1284 | /* inject library */
|
1285 | await page.addScriptTag({content: lib});
|
1286 |
|
1287 | /* this function `s5s` will be called by the script below, receiving the Base64-encoded file */
|
1288 | await page.exposeFunction("s5s", async(b64) => {
|
1289 | const workbook = XLSX.read(b64, {type: "base64" });
|
1290 |
|
1291 | /* DO SOMETHING WITH workbook HERE */
|
1292 | });
|
1293 |
|
1294 | /* generate XLSB file in webpage context and send back result */
|
1295 | await page.addScriptTag({content: `
|
1296 | /* call table_to_book on first table */
|
1297 | var workbook = XLSX.utils.table_to_book(document.querySelector("TABLE"));
|
1298 |
|
1299 | /* generate XLSX file */
|
1300 | var b64 = XLSX.write(workbook, {type: "base64", bookType: "xlsb"});
|
1301 |
|
1302 | /* call "s5s" hook exposed from the node process */
|
1303 | window.s5s(b64);
|
1304 | `});
|
1305 |
|
1306 | /* cleanup */
|
1307 | await browser.close();
|
1308 | })();
|
1309 | ```
|
1310 |
|
1311 | </details>
|
1312 |
|
1313 | <details>
|
1314 | <summary><b>Server-Side HTML Tables with Headless WebKit</b> (click to show)</summary>
|
1315 |
|
1316 | The [`headless` demo](demos/headless/) includes a complete demo to convert HTML
|
1317 | files to XLSB workbooks using [PhantomJS](https://phantomjs.org/). The core idea
|
1318 | is to add the script to the page, parse the table in the page context, generate
|
1319 | a `binary` workbook and send it back for further processing:
|
1320 |
|
1321 | ```js
|
1322 | var XLSX = require('xlsx');
|
1323 | var page = require('webpage').create();
|
1324 |
|
1325 | /* this code will be run in the page */
|
1326 | var code = [ "function(){",
|
1327 | /* call table_to_book on first table */
|
1328 | "var wb = XLSX.utils.table_to_book(document.body.getElementsByTagName('table')[0]);",
|
1329 |
|
1330 | /* generate XLSB file and return binary string */
|
1331 | "return XLSX.write(wb, {type: 'binary', bookType: 'xlsb'});",
|
1332 | "}" ].join("");
|
1333 |
|
1334 | page.open('https://sheetjs.com/demos/table', function() {
|
1335 | /* Load the browser script from the UNPKG CDN */
|
1336 | page.includeJs("https://unpkg.com/xlsx/dist/xlsx.full.min.js", function() {
|
1337 | /* The code will return an XLSB file encoded as binary string */
|
1338 | var bin = page.evaluateJavaScript(code);
|
1339 |
|
1340 | var workbook = XLSX.read(bin, {type: "binary"});
|
1341 | /* DO SOMETHING WITH workbook HERE */
|
1342 |
|
1343 | phantom.exit();
|
1344 | });
|
1345 | });
|
1346 | ```
|
1347 |
|
1348 | </details>
|
1349 |
|
1350 | <details>
|
1351 | <summary><b>NodeJS HTML Tables without a browser</b> (click to show)</summary>
|
1352 |
|
1353 | NodeJS does not include a DOM implementation and Puppeteer requires a hefty
|
1354 | Chromium build. [`jsdom`](https://npm.im/jsdom) is a lightweight alternative:
|
1355 |
|
1356 | ```js
|
1357 | const XLSX = require("xlsx");
|
1358 | const { readFileSync } = require("fs");
|
1359 | const { JSDOM } = require("jsdom");
|
1360 |
|
1361 | /* obtain HTML string. This example reads from test.html */
|
1362 | const html_str = fs.readFileSync("test.html", "utf8");
|
1363 | /* get first TABLE element */
|
1364 | const doc = new JSDOM(html_str).window.document.querySelector("table");
|
1365 | /* generate workbook */
|
1366 | const workbook = XLSX.utils.table_to_book(doc);
|
1367 | ```
|
1368 |
|
1369 | </details>
|
1370 |
|
1371 | ## Processing Data
|
1372 |
|
1373 | The ["Common Spreadsheet Format"](#common-spreadsheet-format) is a simple object
|
1374 | representation of the core concepts of a workbook. The utility functions work
|
1375 | with the object representation and are intended to handle common use cases.
|
1376 |
|
1377 | ### Modifying Workbook Structure
|
1378 |
|
1379 | **API**
|
1380 |
|
1381 | _Append a Worksheet to a Workbook_
|
1382 |
|
1383 | ```js
|
1384 | XLSX.utils.book_append_sheet(workbook, worksheet, sheet_name);
|
1385 | ```
|
1386 |
|
1387 | The `book_append_sheet` utility function appends a worksheet to the workbook.
|
1388 | The third argument specifies the desired worksheet name. Multiple worksheets can
|
1389 | be added to a workbook by calling the function multiple times. If the worksheet
|
1390 | name is already used in the workbook, it will throw an error.
|
1391 |
|
1392 | _Append a Worksheet to a Workbook and find a unique name_
|
1393 |
|
1394 | ```js
|
1395 | var new_name = XLSX.utils.book_append_sheet(workbook, worksheet, name, true);
|
1396 | ```
|
1397 |
|
1398 | If the fourth argument is `true`, the function will start with the specified
|
1399 | worksheet name. If the sheet name exists in the workbook, a new worksheet name
|
1400 | will be chosen by finding the name stem and incrementing the counter:
|
1401 |
|
1402 | ```js
|
1403 | XLSX.utils.book_append_sheet(workbook, sheetA, "Sheet2", true); // Sheet2
|
1404 | XLSX.utils.book_append_sheet(workbook, sheetB, "Sheet2", true); // Sheet3
|
1405 | XLSX.utils.book_append_sheet(workbook, sheetC, "Sheet2", true); // Sheet4
|
1406 | XLSX.utils.book_append_sheet(workbook, sheetD, "Sheet2", true); // Sheet5
|
1407 | ```
|
1408 |
|
1409 | _List the Worksheet names in tab order_
|
1410 |
|
1411 | ```js
|
1412 | var wsnames = workbook.SheetNames;
|
1413 | ```
|
1414 |
|
1415 | The `SheetNames` property of the workbook object is a list of the worksheet
|
1416 | names in "tab order". API functions will look at this array.
|
1417 |
|
1418 | _Replace a Worksheet in place_
|
1419 |
|
1420 | ```js
|
1421 | workbook.Sheets[sheet_name] = new_worksheet;
|
1422 | ```
|
1423 |
|
1424 | The `Sheets` property of the workbook object is an object whose keys are names
|
1425 | and whose values are worksheet objects. By reassigning to a property of the
|
1426 | `Sheets` object, the worksheet object can be changed without disrupting the
|
1427 | rest of the worksheet structure.
|
1428 |
|
1429 | **Examples**
|
1430 |
|
1431 | <details>
|
1432 | <summary><b>Add a new worksheet to a workbook</b> (click to show)</summary>
|
1433 |
|
1434 | This example uses [`XLSX.utils.aoa_to_sheet`](#array-of-arrays-input).
|
1435 |
|
1436 | ```js
|
1437 | var ws_name = "SheetJS";
|
1438 |
|
1439 | /* Create worksheet */
|
1440 | var ws_data = [
|
1441 | [ "S", "h", "e", "e", "t", "J", "S" ],
|
1442 | [ 1 , 2 , 3 , 4 , 5 ]
|
1443 | ];
|
1444 | var ws = XLSX.utils.aoa_to_sheet(ws_data);
|
1445 |
|
1446 | /* Add the worksheet to the workbook */
|
1447 | XLSX.utils.book_append_sheet(wb, ws, ws_name);
|
1448 | ```
|
1449 |
|
1450 | </details>
|
1451 |
|
1452 | ### Modifying Cell Values
|
1453 |
|
1454 | **API**
|
1455 |
|
1456 | _Modify a single cell value in a worksheet_
|
1457 |
|
1458 | ```js
|
1459 | XLSX.utils.sheet_add_aoa(worksheet, [[new_value]], { origin: address });
|
1460 | ```
|
1461 |
|
1462 | _Modify multiple cell values in a worksheet_
|
1463 |
|
1464 | ```js
|
1465 | XLSX.utils.sheet_add_aoa(worksheet, aoa, opts);
|
1466 | ```
|
1467 |
|
1468 | The `sheet_add_aoa` utility function modifies cell values in a worksheet. The
|
1469 | first argument is the worksheet object. The second argument is an array of
|
1470 | arrays of values. The `origin` key of the third argument controls where cells
|
1471 | will be written. The following snippet sets `B3=1` and `E5="abc"`:
|
1472 |
|
1473 | ```js
|
1474 | XLSX.utils.sheet_add_aoa(worksheet, [
|
1475 | [1], // <-- Write 1 to cell B3
|
1476 | , // <-- Do nothing in row 4
|
1477 | [/*B5*/, /*C5*/, /*D5*/, "abc"] // <-- Write "abc" to cell E5
|
1478 | ], { origin: "B3" });
|
1479 | ```
|
1480 |
|
1481 | ["Array of Arrays Input"](#array-of-arrays-input) describes the function and the
|
1482 | optional `opts` argument in more detail.
|
1483 |
|
1484 | **Examples**
|
1485 |
|
1486 | <details>
|
1487 | <summary><b>Appending rows to a worksheet</b> (click to show)</summary>
|
1488 |
|
1489 | The special origin value `-1` instructs `sheet_add_aoa` to start in column A of
|
1490 | the row after the last row in the range, appending the data:
|
1491 |
|
1492 | ```js
|
1493 | XLSX.utils.sheet_add_aoa(worksheet, [
|
1494 | ["first row after data", 1],
|
1495 | ["second row after data", 2]
|
1496 | ], { origin: -1 });
|
1497 | ```
|
1498 |
|
1499 | </details>
|
1500 |
|
1501 |
|
1502 | ### Modifying Other Worksheet / Workbook / Cell Properties
|
1503 |
|
1504 | The ["Common Spreadsheet Format"](#common-spreadsheet-format) section describes
|
1505 | the object structures in greater detail.
|
1506 |
|
1507 | ## Packaging and Releasing Data
|
1508 |
|
1509 | ### Writing Workbooks
|
1510 |
|
1511 | **API**
|
1512 |
|
1513 | _Generate spreadsheet bytes (file) from data_
|
1514 |
|
1515 | ```js
|
1516 | var data = XLSX.write(workbook, opts);
|
1517 | ```
|
1518 |
|
1519 | The `write` method attempts to package data from the workbook into a file in
|
1520 | memory. By default, XLSX files are generated, but that can be controlled with
|
1521 | the `bookType` property of the `opts` argument. Based on the `type` option,
|
1522 | the data can be stored as a "binary string", JS string, `Uint8Array` or Buffer.
|
1523 |
|
1524 | The second `opts` argument is required. ["Writing Options"](#writing-options)
|
1525 | covers the supported properties and behaviors.
|
1526 |
|
1527 | _Generate and attempt to save file_
|
1528 |
|
1529 | ```js
|
1530 | XLSX.writeFile(workbook, filename, opts);
|
1531 | ```
|
1532 |
|
1533 | The `writeFile` method packages the data and attempts to save the new file. The
|
1534 | export file format is determined by the extension of `filename` (`SheetJS.xlsx`
|
1535 | signals XLSX export, `SheetJS.xlsb` signals XLSB export, etc).
|
1536 |
|
1537 | The `writeFile` method uses platform-specific APIs to initiate the file save. In
|
1538 | NodeJS, `fs.readFileSync` can create a file. In the web browser, a download is
|
1539 | attempted using the HTML5 `download` attribute, with fallbacks for IE.
|
1540 |
|
1541 | _Generate and attempt to save an XLSX file_
|
1542 |
|
1543 | ```js
|
1544 | XLSX.writeFileXLSX(workbook, filename, opts);
|
1545 | ```
|
1546 |
|
1547 | The `writeFile` method embeds a number of different export functions. This is
|
1548 | great for developer experience but not amenable to tree shaking using the
|
1549 | current developer tools. When only XLSX exports are needed, this method avoids
|
1550 | referencing the other export functions.
|
1551 |
|
1552 | The second `opts` argument is optional. ["Writing Options"](#writing-options)
|
1553 | covers the supported properties and behaviors.
|
1554 |
|
1555 | **Examples**
|
1556 |
|
1557 | <details>
|
1558 | <summary><b>Local file in a NodeJS server</b> (click to show)</summary>
|
1559 |
|
1560 | `writeFile` uses `fs.writeFileSync` in server environments:
|
1561 |
|
1562 | ```js
|
1563 | var XLSX = require("xlsx");
|
1564 |
|
1565 | /* output format determined by filename */
|
1566 | XLSX.writeFile(workbook, "out.xlsb");
|
1567 | ```
|
1568 |
|
1569 | For Node ESM, the `writeFile` helper is not enabled. Instead, `fs.writeFileSync`
|
1570 | should be used to write the file data to a `Buffer` for use with `XLSX.write`:
|
1571 |
|
1572 | ```js
|
1573 | import { writeFileSync } from "fs";
|
1574 | import { write } from "xlsx/xlsx.mjs";
|
1575 |
|
1576 | const buf = write(workbook, {type: "buffer", bookType: "xlsb"});
|
1577 | /* buf is a Buffer */
|
1578 | const workbook = writeFileSync("out.xlsb", buf);
|
1579 | ```
|
1580 |
|
1581 | </details>
|
1582 |
|
1583 | <details>
|
1584 | <summary><b>Local file in a Deno application</b> (click to show)</summary>
|
1585 |
|
1586 | `writeFile` uses `Deno.writeFileSync` under the hood:
|
1587 |
|
1588 | ```js
|
1589 | // @deno-types="https://deno.land/x/sheetjs/types/index.d.ts"
|
1590 | import * as XLSX from 'https://deno.land/x/sheetjs/xlsx.mjs'
|
1591 |
|
1592 | XLSX.writeFile(workbook, "test.xlsx");
|
1593 | ```
|
1594 |
|
1595 | Applications writing files must be invoked with the `--allow-write` flag. The
|
1596 | [`deno` demo](demos/deno/) has more examples
|
1597 |
|
1598 | </details>
|
1599 |
|
1600 | <details>
|
1601 | <summary><b>Local file in a PhotoShop or InDesign plugin</b> (click to show)</summary>
|
1602 |
|
1603 | `writeFile` wraps the `File` logic in Photoshop and other ExtendScript targets.
|
1604 | The specified path should be an absolute path:
|
1605 |
|
1606 | ```js
|
1607 | #include "xlsx.extendscript.js"
|
1608 |
|
1609 | /* output format determined by filename */
|
1610 | XLSX.writeFile(workbook, "out.xlsx");
|
1611 | /* at this point, out.xlsx is a file that you can distribute */
|
1612 | ```
|
1613 |
|
1614 | The [`extendscript` demo](demos/extendscript/) includes a more complex example.
|
1615 |
|
1616 | </details>
|
1617 |
|
1618 | <details>
|
1619 | <summary><b>Download a file in the browser to the user machine</b> (click to show)</summary>
|
1620 |
|
1621 | `XLSX.writeFile` wraps a few techniques for triggering a file save:
|
1622 |
|
1623 | - `URL` browser API creates an object URL for the file, which the library uses
|
1624 | by creating a link and forcing a click. It is supported in modern browsers.
|
1625 | - `msSaveBlob` is an IE10+ API for triggering a file save.
|
1626 | - `IE_FileSave` uses VBScript and ActiveX to write a file in IE6+ for Windows
|
1627 | XP and Windows 7. The shim must be included in the containing HTML page.
|
1628 |
|
1629 | There is no standard way to determine if the actual file has been downloaded.
|
1630 |
|
1631 | ```js
|
1632 | /* output format determined by filename */
|
1633 | XLSX.writeFile(workbook, "out.xlsb");
|
1634 | /* at this point, out.xlsb will have been downloaded */
|
1635 | ```
|
1636 |
|
1637 | </details>
|
1638 |
|
1639 | <details>
|
1640 | <summary><b>Download a file in legacy browsers</b> (click to show)</summary>
|
1641 |
|
1642 | `XLSX.writeFile` techniques work for most modern browsers as well as older IE.
|
1643 | For much older browsers, there are workarounds implemented by wrapper libraries.
|
1644 |
|
1645 | [`FileSaver.js`](https://github.com/eligrey/FileSaver.js/) implements `saveAs`.
|
1646 | Note: `XLSX.writeFile` will automatically call `saveAs` if available.
|
1647 |
|
1648 | ```js
|
1649 | /* bookType can be any supported output type */
|
1650 | var wopts = { bookType:"xlsx", bookSST:false, type:"array" };
|
1651 |
|
1652 | var wbout = XLSX.write(workbook,wopts);
|
1653 |
|
1654 | /* the saveAs call downloads a file on the local machine */
|
1655 | saveAs(new Blob([wbout],{type:"application/octet-stream"}), "test.xlsx");
|
1656 | ```
|
1657 |
|
1658 | [`Downloadify`](https://github.com/dcneiner/downloadify) uses a Flash SWF button
|
1659 | to generate local files, suitable for environments where ActiveX is unavailable:
|
1660 |
|
1661 | ```js
|
1662 | Downloadify.create(id,{
|
1663 | /* other options are required! read the downloadify docs for more info */
|
1664 | filename: "test.xlsx",
|
1665 | data: function() { return XLSX.write(wb, {bookType:"xlsx", type:"base64"}); },
|
1666 | append: false,
|
1667 | dataType: "base64"
|
1668 | });
|
1669 | ```
|
1670 |
|
1671 | The [`oldie` demo](demos/oldie/) shows an IE-compatible fallback scenario.
|
1672 |
|
1673 | </details>
|
1674 |
|
1675 | <details>
|
1676 | <summary><b>Browser upload file (ajax)</b> (click to show)</summary>
|
1677 |
|
1678 | A complete example using XHR is [included in the XHR demo](demos/xhr/), along
|
1679 | with examples for fetch and wrapper libraries. This example assumes the server
|
1680 | can handle Base64-encoded files (see the demo for a basic nodejs server):
|
1681 |
|
1682 | ```js
|
1683 | /* in this example, send a base64 string to the server */
|
1684 | var wopts = { bookType:"xlsx", bookSST:false, type:"base64" };
|
1685 |
|
1686 | var wbout = XLSX.write(workbook,wopts);
|
1687 |
|
1688 | var req = new XMLHttpRequest();
|
1689 | req.open("POST", "/upload", true);
|
1690 | var formdata = new FormData();
|
1691 | formdata.append("file", "test.xlsx"); // <-- server expects `file` to hold name
|
1692 | formdata.append("data", wbout); // <-- `data` holds the base64-encoded data
|
1693 | req.send(formdata);
|
1694 | ```
|
1695 |
|
1696 | </details>
|
1697 |
|
1698 | <details>
|
1699 | <summary><b>PhantomJS (Headless Webkit) File Generation</b> (click to show)</summary>
|
1700 |
|
1701 | The [`headless` demo](demos/headless/) includes a complete demo to convert HTML
|
1702 | files to XLSB workbooks using [PhantomJS](https://phantomjs.org/). PhantomJS
|
1703 | `fs.write` supports writing files from the main process but has a different
|
1704 | interface from the NodeJS `fs` module:
|
1705 |
|
1706 | ```js
|
1707 | var XLSX = require('xlsx');
|
1708 | var fs = require('fs');
|
1709 |
|
1710 | /* generate a binary string */
|
1711 | var bin = XLSX.write(workbook, { type:"binary", bookType: "xlsx" });
|
1712 | /* write to file */
|
1713 | fs.write("test.xlsx", bin, "wb");
|
1714 | ```
|
1715 |
|
1716 | Note: The section ["Processing HTML Tables"](#processing-html-tables) shows how
|
1717 | to generate a workbook from HTML tables in a page in "Headless WebKit".
|
1718 |
|
1719 | </details>
|
1720 |
|
1721 |
|
1722 |
|
1723 | The [included demos](demos/) cover mobile apps and other special deployments.
|
1724 |
|
1725 | ### Writing Examples
|
1726 |
|
1727 | - <http://sheetjs.com/demos/table.html> exporting an HTML table
|
1728 | - <http://sheetjs.com/demos/writexlsx.html> generates a simple file
|
1729 |
|
1730 | ### Streaming Write
|
1731 |
|
1732 | The streaming write functions are available in the `XLSX.stream` object. They
|
1733 | take the same arguments as the normal write functions but return a NodeJS
|
1734 | Readable Stream.
|
1735 |
|
1736 | - `XLSX.stream.to_csv` is the streaming version of `XLSX.utils.sheet_to_csv`.
|
1737 | - `XLSX.stream.to_html` is the streaming version of `XLSX.utils.sheet_to_html`.
|
1738 | - `XLSX.stream.to_json` is the streaming version of `XLSX.utils.sheet_to_json`.
|
1739 |
|
1740 | <details>
|
1741 | <summary><b>nodejs convert to CSV and write file</b> (click to show)</summary>
|
1742 |
|
1743 | ```js
|
1744 | var output_file_name = "out.csv";
|
1745 | var stream = XLSX.stream.to_csv(worksheet);
|
1746 | stream.pipe(fs.createWriteStream(output_file_name));
|
1747 | ```
|
1748 |
|
1749 | </details>
|
1750 |
|
1751 | <details>
|
1752 | <summary><b>nodejs write JSON stream to screen</b> (click to show)</summary>
|
1753 |
|
1754 | ```js
|
1755 | /* to_json returns an object-mode stream */
|
1756 | var stream = XLSX.stream.to_json(worksheet, {raw:true});
|
1757 |
|
1758 | /* the following stream converts JS objects to text via JSON.stringify */
|
1759 | var conv = new Transform({writableObjectMode:true});
|
1760 | conv._transform = function(obj, e, cb){ cb(null, JSON.stringify(obj) + "\n"); };
|
1761 |
|
1762 | stream.pipe(conv); conv.pipe(process.stdout);
|
1763 | ```
|
1764 |
|
1765 | </details>
|
1766 |
|
1767 | <details>
|
1768 | <summary><b>Exporting NUMBERS files</b> (click to show)</summary>
|
1769 |
|
1770 | The NUMBERS writer requires a fairly large base. The supplementary `xlsx.zahl`
|
1771 | scripts provide support. `xlsx.zahl.js` is designed for standalone and NodeJS
|
1772 | use, while `xlsx.zahl.mjs` is suitable for ESM.
|
1773 |
|
1774 | _Browser_
|
1775 |
|
1776 | ```html
|
1777 | <meta charset="utf8">
|
1778 | <script src="xlsx.full.min.js"></script>
|
1779 | <script src="xlsx.zahl.js"></script>
|
1780 | <script>
|
1781 | var wb = XLSX.utils.book_new(); var ws = XLSX.utils.aoa_to_sheet([
|
1782 | ["SheetJS", "<3","விரிதாள்"],
|
1783 | [72,,"Arbeitsblätter"],
|
1784 | [,62,"数据"],
|
1785 | [true,false,],
|
1786 | ]); XLSX.utils.book_append_sheet(wb, ws, "Sheet1");
|
1787 | XLSX.writeFile(wb, "textport.numbers", {numbers: XLSX_ZAHL, compression: true});
|
1788 | </script>
|
1789 | ```
|
1790 |
|
1791 | _Node_
|
1792 |
|
1793 | ```js
|
1794 | var XLSX = require("./xlsx.flow");
|
1795 | var XLSX_ZAHL = require("./dist/xlsx.zahl");
|
1796 | var wb = XLSX.utils.book_new(); var ws = XLSX.utils.aoa_to_sheet([
|
1797 | ["SheetJS", "<3","விரிதாள்"],
|
1798 | [72,,"Arbeitsblätter"],
|
1799 | [,62,"数据"],
|
1800 | [true,false,],
|
1801 | ]); XLSX.utils.book_append_sheet(wb, ws, "Sheet1");
|
1802 | XLSX.writeFile(wb, "textport.numbers", {numbers: XLSX_ZAHL, compression: true});
|
1803 | ```
|
1804 |
|
1805 | _Deno_
|
1806 |
|
1807 | ```ts
|
1808 | import * as XLSX from './xlsx.mjs';
|
1809 | import XLSX_ZAHL from './dist/xlsx.zahl.mjs';
|
1810 |
|
1811 | var wb = XLSX.utils.book_new(); var ws = XLSX.utils.aoa_to_sheet([
|
1812 | ["SheetJS", "<3","விரிதாள்"],
|
1813 | [72,,"Arbeitsblätter"],
|
1814 | [,62,"数据"],
|
1815 | [true,false,],
|
1816 | ]); XLSX.utils.book_append_sheet(wb, ws, "Sheet1");
|
1817 | XLSX.writeFile(wb, "textports.numbers", {numbers: XLSX_ZAHL, compression: true});
|
1818 | ```
|
1819 |
|
1820 | </details>
|
1821 |
|
1822 | <https://github.com/sheetjs/sheetaki> pipes write streams to nodejs response.
|
1823 |
|
1824 | ### Generating JSON and JS Data
|
1825 |
|
1826 | JSON and JS data tend to represent single worksheets. The utility functions in
|
1827 | this section work with single worksheets.
|
1828 |
|
1829 | The ["Common Spreadsheet Format"](#common-spreadsheet-format) section describes
|
1830 | the object structure in more detail. `workbook.SheetNames` is an ordered list
|
1831 | of the worksheet names. `workbook.Sheets` is an object whose keys are sheet
|
1832 | names and whose values are worksheet objects.
|
1833 |
|
1834 | The "first worksheet" is stored at `workbook.Sheets[workbook.SheetNames[0]]`.
|
1835 |
|
1836 | **API**
|
1837 |
|
1838 | _Create an array of JS objects from a worksheet_
|
1839 |
|
1840 | ```js
|
1841 | var jsa = XLSX.utils.sheet_to_json(worksheet, opts);
|
1842 | ```
|
1843 |
|
1844 | _Create an array of arrays of JS values from a worksheet_
|
1845 |
|
1846 | ```js
|
1847 | var aoa = XLSX.utils.sheet_to_json(worksheet, {...opts, header: 1});
|
1848 | ```
|
1849 |
|
1850 | The `sheet_to_json` utility function walks a workbook in row-major order,
|
1851 | generating an array of objects. The second `opts` argument controls a number of
|
1852 | export decisions including the type of values (JS values or formatted text). The
|
1853 | ["JSON"](#json) section describes the argument in more detail.
|
1854 |
|
1855 | By default, `sheet_to_json` scans the first row and uses the values as headers.
|
1856 | With the `header: 1` option, the function exports an array of arrays of values.
|
1857 |
|
1858 | **Examples**
|
1859 |
|
1860 | [`x-spreadsheet`](https://github.com/myliang/x-spreadsheet) is an interactive
|
1861 | data grid for previewing and modifying structured data in the web browser. The
|
1862 | [`xspreadsheet` demo](/demos/xspreadsheet) includes a sample script with the
|
1863 | `stox` function for converting from a workbook to x-spreadsheet data object.
|
1864 | <https://oss.sheetjs.com/sheetjs/x-spreadsheet> is a live demo.
|
1865 |
|
1866 | <details>
|
1867 | <summary><b>Previewing data in a React data grid</b> (click to show)</summary>
|
1868 |
|
1869 | [`react-data-grid`](https://npm.im/react-data-grid) is a data grid tailored for
|
1870 | react. It expects two properties: `rows` of data objects and `columns` which
|
1871 | describe the columns. For the purposes of massaging the data to fit the react
|
1872 | data grid API it is easiest to start from an array of arrays.
|
1873 |
|
1874 | This demo starts by fetching a remote file and using `XLSX.read` to extract:
|
1875 |
|
1876 | ```js
|
1877 | import { useEffect, useState } from "react";
|
1878 | import DataGrid from "react-data-grid";
|
1879 | import { read, utils } from "xlsx";
|
1880 |
|
1881 | const url = "https://oss.sheetjs.com/test_files/RkNumber.xls";
|
1882 |
|
1883 | export default function App() {
|
1884 | const [columns, setColumns] = useState([]);
|
1885 | const [rows, setRows] = useState([]);
|
1886 | useEffect(() => {(async () => {
|
1887 | const wb = read(await (await fetch(url)).arrayBuffer(), { WTF: 1 });
|
1888 |
|
1889 | /* use sheet_to_json with header: 1 to generate an array of arrays */
|
1890 | const data = utils.sheet_to_json(wb.Sheets[wb.SheetNames[0]], { header: 1 });
|
1891 |
|
1892 | /* see react-data-grid docs to understand the shape of the expected data */
|
1893 | setColumns(data[0].map((r) => ({ key: r, name: r })));
|
1894 | setRows(data.slice(1).map((r) => r.reduce((acc, x, i) => {
|
1895 | acc[data[0][i]] = x;
|
1896 | return acc;
|
1897 | }, {})));
|
1898 | })(); });
|
1899 |
|
1900 | return <DataGrid columns={columns} rows={rows} />;
|
1901 | }
|
1902 | ```
|
1903 |
|
1904 | </details>
|
1905 |
|
1906 | <details>
|
1907 | <summary><b>Previewing data in a VueJS data grid</b> (click to show)</summary>
|
1908 |
|
1909 | [`vue3-table-lite`](https://github.com/linmasahiro/vue3-table-lite) is a simple
|
1910 | VueJS 3 data table. It is featured [in the VueJS demo](/demos/vue/modify/).
|
1911 |
|
1912 | </details>
|
1913 |
|
1914 | <details>
|
1915 | <summary><b>Populating a database (SQL or no-SQL)</b> (click to show)</summary>
|
1916 |
|
1917 | The [`database` demo](/demos/database/) includes examples of working with
|
1918 | databases and query results.
|
1919 |
|
1920 | </details>
|
1921 |
|
1922 | <details>
|
1923 | <summary><b>Numerical Computations with TensorFlow.js</b> (click to show)</summary>
|
1924 |
|
1925 | [`@tensorflow/tfjs`](@tensorflow/tfjs) and other libraries expect data in simple
|
1926 | arrays, well-suited for worksheets where each column is a data vector. That is
|
1927 | the transpose of how most people use spreadsheets, where each row is a vector.
|
1928 |
|
1929 | A single `Array#map` can pull individual named rows from `sheet_to_json` export:
|
1930 |
|
1931 | ```js
|
1932 | const XLSX = require("xlsx");
|
1933 | const tf = require('@tensorflow/tfjs');
|
1934 |
|
1935 | const key = "age"; // this is the field we want to pull
|
1936 | const ages = XLSX.utils.sheet_to_json(worksheet).map(r => r[key]);
|
1937 | const tf_data = tf.tensor1d(ages);
|
1938 | ```
|
1939 |
|
1940 | All fields can be processed at once using a transpose of the 2D tensor generated
|
1941 | with the `sheet_to_json` export with `header: 1`. The first row, if it contains
|
1942 | header labels, should be removed with a slice:
|
1943 |
|
1944 | ```js
|
1945 | const XLSX = require("xlsx");
|
1946 | const tf = require('@tensorflow/tfjs');
|
1947 |
|
1948 | /* array of arrays of the data starting on the second row */
|
1949 | const aoa = XLSX.utils.sheet_to_json(worksheet, {header: 1}).slice(1);
|
1950 | /* dataset in the "correct orientation" */
|
1951 | const tf_dataset = tf.tensor2d(aoa).transpose();
|
1952 | /* pull out each dataset with a slice */
|
1953 | const tf_field0 = tf_dataset.slice([0,0], [1,tensor.shape[1]]).flatten();
|
1954 | const tf_field1 = tf_dataset.slice([1,0], [1,tensor.shape[1]]).flatten();
|
1955 | ```
|
1956 |
|
1957 | The [`array` demo](demos/array/) shows a complete example.
|
1958 |
|
1959 | </details>
|
1960 |
|
1961 |
|
1962 | ### Generating HTML Tables
|
1963 |
|
1964 | **API**
|
1965 |
|
1966 | _Generate HTML Table from Worksheet_
|
1967 |
|
1968 | ```js
|
1969 | var html = XLSX.utils.sheet_to_html(worksheet);
|
1970 | ```
|
1971 |
|
1972 | The `sheet_to_html` utility function generates HTML code based on the worksheet
|
1973 | data. Each cell in the worksheet is mapped to a `<TD>` element. Merged cells
|
1974 | in the worksheet are serialized by setting `colspan` and `rowspan` attributes.
|
1975 |
|
1976 | **Examples**
|
1977 |
|
1978 | The `sheet_to_html` utility function generates HTML code that can be added to
|
1979 | any DOM element by setting the `innerHTML`:
|
1980 |
|
1981 | ```js
|
1982 | var container = document.getElementById("tavolo");
|
1983 | container.innerHTML = XLSX.utils.sheet_to_html(worksheet);
|
1984 | ```
|
1985 |
|
1986 | Combining with `fetch`, constructing a site from a workbook is straightforward:
|
1987 |
|
1988 | <details>
|
1989 | <summary><b>Vanilla JS + HTML fetch workbook and generate table previews</b> (click to show)</summary>
|
1990 |
|
1991 | ```html
|
1992 | <body>
|
1993 | <style>TABLE { border-collapse: collapse; } TD { border: 1px solid; }</style>
|
1994 | <div id="tavolo"></div>
|
1995 | <script src="https://unpkg.com/xlsx/dist/xlsx.full.min.js"></script>
|
1996 | <script type="text/javascript">
|
1997 | (async() => {
|
1998 | /* fetch and parse workbook -- see the fetch example for details */
|
1999 | const workbook = XLSX.read(await (await fetch("sheetjs.xlsx")).arrayBuffer());
|
2000 |
|
2001 | let output = [];
|
2002 | /* loop through the worksheet names in order */
|
2003 | workbook.SheetNames.forEach(name => {
|
2004 |
|
2005 | /* generate HTML from the corresponding worksheets */
|
2006 | const worksheet = workbook.Sheets[name];
|
2007 | const html = XLSX.utils.sheet_to_html(worksheet);
|
2008 |
|
2009 | /* add a header with the title name followed by the table */
|
2010 | output.push(`<H3>${name}</H3>${html}`);
|
2011 | });
|
2012 | /* write to the DOM at the end */
|
2013 | tavolo.innerHTML = output.join("\n");
|
2014 | })();
|
2015 | </script>
|
2016 | </body>
|
2017 | ```
|
2018 |
|
2019 | </details>
|
2020 |
|
2021 | <details>
|
2022 | <summary><b>React fetch workbook and generate HTML table previews</b> (click to show)</summary>
|
2023 |
|
2024 | It is generally recommended to use a React-friendly workflow, but it is possible
|
2025 | to generate HTML and use it in React with `dangerouslySetInnerHTML`:
|
2026 |
|
2027 | ```jsx
|
2028 | function Tabeller(props) {
|
2029 | /* the workbook object is the state */
|
2030 | const [workbook, setWorkbook] = React.useState(XLSX.utils.book_new());
|
2031 |
|
2032 | /* fetch and update the workbook with an effect */
|
2033 | React.useEffect(() => { (async() => {
|
2034 | /* fetch and parse workbook -- see the fetch example for details */
|
2035 | const wb = XLSX.read(await (await fetch("sheetjs.xlsx")).arrayBuffer());
|
2036 | setWorkbook(wb);
|
2037 | })(); });
|
2038 |
|
2039 | return workbook.SheetNames.map(name => (<>
|
2040 | <h3>name</h3>
|
2041 | <div dangerouslySetInnerHTML={{
|
2042 | /* this __html mantra is needed to set the inner HTML */
|
2043 | __html: XLSX.utils.sheet_to_html(workbook.Sheets[name])
|
2044 | }} />
|
2045 | </>));
|
2046 | }
|
2047 | ```
|
2048 |
|
2049 | The [`react` demo](demos/react) includes more React examples.
|
2050 |
|
2051 | </details>
|
2052 |
|
2053 | <details>
|
2054 | <summary><b>VueJS fetch workbook and generate HTML table previews</b> (click to show)</summary>
|
2055 |
|
2056 | It is generally recommended to use a VueJS-friendly workflow, but it is possible
|
2057 | to generate HTML and use it in VueJS with the `v-html` directive:
|
2058 |
|
2059 | ```jsx
|
2060 | import { read, utils } from 'xlsx';
|
2061 | import { reactive } from 'vue';
|
2062 |
|
2063 | const S5SComponent = {
|
2064 | mounted() { (async() => {
|
2065 | /* fetch and parse workbook -- see the fetch example for details */
|
2066 | const workbook = read(await (await fetch("sheetjs.xlsx")).arrayBuffer());
|
2067 | /* loop through the worksheet names in order */
|
2068 | workbook.SheetNames.forEach(name => {
|
2069 | /* generate HTML from the corresponding worksheets */
|
2070 | const html = utils.sheet_to_html(workbook.Sheets[name]);
|
2071 | /* add to state */
|
2072 | this.wb.wb.push({ name, html });
|
2073 | });
|
2074 | })(); },
|
2075 | /* this state mantra is required for array updates to work */
|
2076 | setup() { return { wb: reactive({ wb: [] }) }; },
|
2077 | template: `
|
2078 | <div v-for="ws in wb.wb" :key="ws.name">
|
2079 | <h3>{{ ws.name }}</h3>
|
2080 | <div v-html="ws.html"></div>
|
2081 | </div>`
|
2082 | };
|
2083 | ```
|
2084 |
|
2085 | The [`vuejs` demo](demos/vue) includes more React examples.
|
2086 |
|
2087 | </details>
|
2088 |
|
2089 | ### Generating Single-Worksheet Snapshots
|
2090 |
|
2091 | The `sheet_to_*` functions accept a worksheet object.
|
2092 |
|
2093 | **API**
|
2094 |
|
2095 | _Generate a CSV from a single worksheet_
|
2096 |
|
2097 | ```js
|
2098 | var csv = XLSX.utils.sheet_to_csv(worksheet, opts);
|
2099 | ```
|
2100 |
|
2101 | This snapshot is designed to replicate the "CSV UTF8 (`.csv`)" output type.
|
2102 | ["Delimiter-Separated Output"](#delimiter-separated-output) describes the
|
2103 | function and the optional `opts` argument in more detail.
|
2104 |
|
2105 | _Generate "Text" from a single worksheet_
|
2106 |
|
2107 | ```js
|
2108 | var txt = XLSX.utils.sheet_to_txt(worksheet, opts);
|
2109 | ```
|
2110 |
|
2111 | This snapshot is designed to replicate the "UTF16 Text (`.txt`)" output type.
|
2112 | ["Delimiter-Separated Output"](#delimiter-separated-output) describes the
|
2113 | function and the optional `opts` argument in more detail.
|
2114 |
|
2115 | _Generate a list of formulae from a single worksheet_
|
2116 |
|
2117 | ```js
|
2118 | var fmla = XLSX.utils.sheet_to_formulae(worksheet);
|
2119 | ```
|
2120 |
|
2121 | This snapshot generates an array of entries representing the embedded formulae.
|
2122 | Array formulae are rendered in the form `range=formula` while plain cells are
|
2123 | rendered in the form `cell=formula or value`. String literals are prefixed with
|
2124 | an apostrophe `'`, consistent with Excel's formula bar display.
|
2125 |
|
2126 | ["Formulae Output"](#formulae-output) describes the function in more detail.
|
2127 |
|
2128 | ## Interface
|
2129 |
|
2130 | `XLSX` is the exposed variable in the browser and the exported node variable
|
2131 |
|
2132 | `XLSX.version` is the version of the library (added by the build script).
|
2133 |
|
2134 | `XLSX.SSF` is an embedded version of the [format library](https://git.io/ssf).
|
2135 |
|
2136 | ### Parsing functions
|
2137 |
|
2138 | `XLSX.read(data, read_opts)` attempts to parse `data`.
|
2139 |
|
2140 | `XLSX.readFile(filename, read_opts)` attempts to read `filename` and parse.
|
2141 |
|
2142 | Parse options are described in the [Parsing Options](#parsing-options) section.
|
2143 |
|
2144 | ### Writing functions
|
2145 |
|
2146 | `XLSX.write(wb, write_opts)` attempts to write the workbook `wb`
|
2147 |
|
2148 | `XLSX.writeFile(wb, filename, write_opts)` attempts to write `wb` to `filename`.
|
2149 | In browser-based environments, it will attempt to force a client-side download.
|
2150 |
|
2151 | `XLSX.writeFileAsync(wb, filename, o, cb)` attempts to write `wb` to `filename`.
|
2152 | If `o` is omitted, the writer will use the third argument as the callback.
|
2153 |
|
2154 | `XLSX.stream` contains a set of streaming write functions.
|
2155 |
|
2156 | Write options are described in the [Writing Options](#writing-options) section.
|
2157 |
|
2158 | ### Utilities
|
2159 |
|
2160 | Utilities are available in the `XLSX.utils` object and are described in the
|
2161 | [Utility Functions](#utility-functions) section:
|
2162 |
|
2163 | **Constructing:**
|
2164 |
|
2165 | - `book_new` creates an empty workbook
|
2166 | - `book_append_sheet` adds a worksheet to a workbook
|
2167 |
|
2168 | **Importing:**
|
2169 |
|
2170 | - `aoa_to_sheet` converts an array of arrays of JS data to a worksheet.
|
2171 | - `json_to_sheet` converts an array of JS objects to a worksheet.
|
2172 | - `table_to_sheet` converts a DOM TABLE element to a worksheet.
|
2173 | - `sheet_add_aoa` adds an array of arrays of JS data to an existing worksheet.
|
2174 | - `sheet_add_json` adds an array of JS objects to an existing worksheet.
|
2175 |
|
2176 |
|
2177 | **Exporting:**
|
2178 |
|
2179 | - `sheet_to_json` converts a worksheet object to an array of JSON objects.
|
2180 | - `sheet_to_csv` generates delimiter-separated-values output.
|
2181 | - `sheet_to_txt` generates UTF16 formatted text.
|
2182 | - `sheet_to_html` generates HTML output.
|
2183 | - `sheet_to_formulae` generates a list of the formulae (with value fallbacks).
|
2184 |
|
2185 |
|
2186 | **Cell and cell address manipulation:**
|
2187 |
|
2188 | - `format_cell` generates the text value for a cell (using number formats).
|
2189 | - `encode_row / decode_row` converts between 0-indexed rows and 1-indexed rows.
|
2190 | - `encode_col / decode_col` converts between 0-indexed columns and column names.
|
2191 | - `encode_cell / decode_cell` converts cell addresses.
|
2192 | - `encode_range / decode_range` converts cell ranges.
|
2193 |
|
2194 | ## Common Spreadsheet Format
|
2195 |
|
2196 | SheetJS conforms to the Common Spreadsheet Format (CSF):
|
2197 |
|
2198 | ### General Structures
|
2199 |
|
2200 | Cell address objects are stored as `{c:C, r:R}` where `C` and `R` are 0-indexed
|
2201 | column and row numbers, respectively. For example, the cell address `B5` is
|
2202 | represented by the object `{c:1, r:4}`.
|
2203 |
|
2204 | Cell range objects are stored as `{s:S, e:E}` where `S` is the first cell and
|
2205 | `E` is the last cell in the range. The ranges are inclusive. For example, the
|
2206 | range `A3:B7` is represented by the object `{s:{c:0, r:2}, e:{c:1, r:6}}`.
|
2207 | Utility functions perform a row-major order walk traversal of a sheet range:
|
2208 |
|
2209 | ```js
|
2210 | for(var R = range.s.r; R <= range.e.r; ++R) {
|
2211 | for(var C = range.s.c; C <= range.e.c; ++C) {
|
2212 | var cell_address = {c:C, r:R};
|
2213 | /* if an A1-style address is needed, encode the address */
|
2214 | var cell_ref = XLSX.utils.encode_cell(cell_address);
|
2215 | }
|
2216 | }
|
2217 | ```
|
2218 |
|
2219 | ### Cell Object
|
2220 |
|
2221 | Cell objects are plain JS objects with keys and values following the convention:
|
2222 |
|
2223 | | Key | Description |
|
2224 | | --- | ---------------------------------------------------------------------- |
|
2225 | | `v` | raw value (see Data Types section for more info) |
|
2226 | | `w` | formatted text (if applicable) |
|
2227 | | `t` | type: `b` Boolean, `e` Error, `n` Number, `d` Date, `s` Text, `z` Stub |
|
2228 | | `f` | cell formula encoded as an A1-style string (if applicable) |
|
2229 | | `F` | range of enclosing array if formula is array formula (if applicable) |
|
2230 | | `D` | if true, array formula is dynamic (if applicable) |
|
2231 | | `r` | rich text encoding (if applicable) |
|
2232 | | `h` | HTML rendering of the rich text (if applicable) |
|
2233 | | `c` | comments associated with the cell |
|
2234 | | `z` | number format string associated with the cell (if requested) |
|
2235 | | `l` | cell hyperlink object (`.Target` holds link, `.Tooltip` is tooltip) |
|
2236 | | `s` | the style/theme of the cell (if applicable) |
|
2237 |
|
2238 | Built-in export utilities (such as the CSV exporter) will use the `w` text if it
|
2239 | is available. To change a value, be sure to delete `cell.w` (or set it to
|
2240 | `undefined`) before attempting to export. The utilities will regenerate the `w`
|
2241 | text from the number format (`cell.z`) and the raw value if possible.
|
2242 |
|
2243 | The actual array formula is stored in the `f` field of the first cell in the
|
2244 | array range. Other cells in the range will omit the `f` field.
|
2245 |
|
2246 | #### Data Types
|
2247 |
|
2248 | The raw value is stored in the `v` value property, interpreted based on the `t`
|
2249 | type property. This separation allows for representation of numbers as well as
|
2250 | numeric text. There are 6 valid cell types:
|
2251 |
|
2252 | | Type | Description |
|
2253 | | :--: | :-------------------------------------------------------------------- |
|
2254 | | `b` | Boolean: value interpreted as JS `boolean` |
|
2255 | | `e` | Error: value is a numeric code and `w` property stores common name ** |
|
2256 | | `n` | Number: value is a JS `number` ** |
|
2257 | | `d` | Date: value is a JS `Date` object or string to be parsed as Date ** |
|
2258 | | `s` | Text: value interpreted as JS `string` and written as text ** |
|
2259 | | `z` | Stub: blank stub cell that is ignored by data processing utilities ** |
|
2260 |
|
2261 | <details>
|
2262 | <summary><b>Error values and interpretation</b> (click to show)</summary>
|
2263 |
|
2264 | | Value | Error Meaning |
|
2265 | | -----: | :-------------- |
|
2266 | | `0x00` | `#NULL!` |
|
2267 | | `0x07` | `#DIV/0!` |
|
2268 | | `0x0F` | `#VALUE!` |
|
2269 | | `0x17` | `#REF!` |
|
2270 | | `0x1D` | `#NAME?` |
|
2271 | | `0x24` | `#NUM!` |
|
2272 | | `0x2A` | `#N/A` |
|
2273 | | `0x2B` | `#GETTING_DATA` |
|
2274 |
|
2275 | </details>
|
2276 |
|
2277 | Type `n` is the Number type. This includes all forms of data that Excel stores
|
2278 | as numbers, such as dates/times and Boolean fields. Excel exclusively uses data
|
2279 | that can be fit in an IEEE754 floating point number, just like JS Number, so the
|
2280 | `v` field holds the raw number. The `w` field holds formatted text. Dates are
|
2281 | stored as numbers by default and converted with `XLSX.SSF.parse_date_code`.
|
2282 |
|
2283 | Type `d` is the Date type, generated only when the option `cellDates` is passed.
|
2284 | Since JSON does not have a natural Date type, parsers are generally expected to
|
2285 | store ISO 8601 Date strings like you would get from `date.toISOString()`. On
|
2286 | the other hand, writers and exporters should be able to handle date strings and
|
2287 | JS Date objects. Note that Excel disregards timezone modifiers and treats all
|
2288 | dates in the local timezone. The library does not correct for this error.
|
2289 |
|
2290 | Type `s` is the String type. Values are explicitly stored as text. Excel will
|
2291 | interpret these cells as "number stored as text". Generated Excel files
|
2292 | automatically suppress that class of error, but other formats may elicit errors.
|
2293 |
|
2294 | Type `z` represents blank stub cells. They are generated in cases where cells
|
2295 | have no assigned value but hold comments or other metadata. They are ignored by
|
2296 | the core library data processing utility functions. By default these cells are
|
2297 | not generated; the parser `sheetStubs` option must be set to `true`.
|
2298 |
|
2299 |
|
2300 | #### Dates
|
2301 |
|
2302 | <details>
|
2303 | <summary><b>Excel Date Code details</b> (click to show)</summary>
|
2304 |
|
2305 | By default, Excel stores dates as numbers with a format code that specifies date
|
2306 | processing. For example, the date `19-Feb-17` is stored as the number `42785`
|
2307 | with a number format of `d-mmm-yy`. The `SSF` module understands number formats
|
2308 | and performs the appropriate conversion.
|
2309 |
|
2310 | XLSX also supports a special date type `d` where the data is an ISO 8601 date
|
2311 | string. The formatter converts the date back to a number.
|
2312 |
|
2313 | The default behavior for all parsers is to generate number cells. Setting
|
2314 | `cellDates` to true will force the generators to store dates.
|
2315 |
|
2316 | </details>
|
2317 |
|
2318 | <details>
|
2319 | <summary><b>Time Zones and Dates</b> (click to show)</summary>
|
2320 |
|
2321 | Excel has no native concept of universal time. All times are specified in the
|
2322 | local time zone. Excel limitations prevent specifying true absolute dates.
|
2323 |
|
2324 | Following Excel, this library treats all dates as relative to local time zone.
|
2325 |
|
2326 | </details>
|
2327 |
|
2328 | <details>
|
2329 | <summary><b>Epochs: 1900 and 1904</b> (click to show)</summary>
|
2330 |
|
2331 | Excel supports two epochs (January 1 1900 and January 1 1904).
|
2332 | The workbook's epoch can be determined by examining the workbook's
|
2333 | `wb.Workbook.WBProps.date1904` property:
|
2334 |
|
2335 | ```js
|
2336 | !!(((wb.Workbook||{}).WBProps||{}).date1904)
|
2337 | ```
|
2338 |
|
2339 | </details>
|
2340 |
|
2341 | ### Sheet Objects
|
2342 |
|
2343 | Each key that does not start with `!` maps to a cell (using `A-1` notation)
|
2344 |
|
2345 | `sheet[address]` returns the cell object for the specified address.
|
2346 |
|
2347 | **Special sheet keys (accessible as `sheet[key]`, each starting with `!`):**
|
2348 |
|
2349 | - `sheet['!ref']`: A-1 based range representing the sheet range. Functions that
|
2350 | work with sheets should use this parameter to determine the range. Cells that
|
2351 | are assigned outside of the range are not processed. In particular, when
|
2352 | writing a sheet by hand, cells outside of the range are not included
|
2353 |
|
2354 | Functions that handle sheets should test for the presence of `!ref` field.
|
2355 | If the `!ref` is omitted or is not a valid range, functions are free to treat
|
2356 | the sheet as empty or attempt to guess the range. The standard utilities that
|
2357 | ship with this library treat sheets as empty (for example, the CSV output is
|
2358 | empty string).
|
2359 |
|
2360 | When reading a worksheet with the `sheetRows` property set, the ref parameter
|
2361 | will use the restricted range. The original range is set at `ws['!fullref']`
|
2362 |
|
2363 | - `sheet['!margins']`: Object representing the page margins. The default values
|
2364 | follow Excel's "normal" preset. Excel also has a "wide" and a "narrow" preset
|
2365 | but they are stored as raw measurements. The main properties are listed below:
|
2366 |
|
2367 | <details>
|
2368 | <summary><b>Page margin details</b> (click to show)</summary>
|
2369 |
|
2370 | | key | description | "normal" | "wide" | "narrow" |
|
2371 | |----------|------------------------|:---------|:-------|:-------- |
|
2372 | | `left` | left margin (inches) | `0.7` | `1.0` | `0.25` |
|
2373 | | `right` | right margin (inches) | `0.7` | `1.0` | `0.25` |
|
2374 | | `top` | top margin (inches) | `0.75` | `1.0` | `0.75` |
|
2375 | | `bottom` | bottom margin (inches) | `0.75` | `1.0` | `0.75` |
|
2376 | | `header` | header margin (inches) | `0.3` | `0.5` | `0.3` |
|
2377 | | `footer` | footer margin (inches) | `0.3` | `0.5` | `0.3` |
|
2378 |
|
2379 | ```js
|
2380 | /* Set worksheet sheet to "normal" */
|
2381 | ws["!margins"]={left:0.7, right:0.7, top:0.75,bottom:0.75,header:0.3,footer:0.3}
|
2382 | /* Set worksheet sheet to "wide" */
|
2383 | ws["!margins"]={left:1.0, right:1.0, top:1.0, bottom:1.0, header:0.5,footer:0.5}
|
2384 | /* Set worksheet sheet to "narrow" */
|
2385 | ws["!margins"]={left:0.25,right:0.25,top:0.75,bottom:0.75,header:0.3,footer:0.3}
|
2386 | ```
|
2387 | </details>
|
2388 |
|
2389 | #### Worksheet Object
|
2390 |
|
2391 | In addition to the base sheet keys, worksheets also add:
|
2392 |
|
2393 | - `ws['!cols']`: array of column properties objects. Column widths are actually
|
2394 | stored in files in a normalized manner, measured in terms of the "Maximum
|
2395 | Digit Width" (the largest width of the rendered digits 0-9, in pixels). When
|
2396 | parsed, the column objects store the pixel width in the `wpx` field, character
|
2397 | width in the `wch` field, and the maximum digit width in the `MDW` field.
|
2398 |
|
2399 | - `ws['!rows']`: array of row properties objects as explained later in the docs.
|
2400 | Each row object encodes properties including row height and visibility.
|
2401 |
|
2402 | - `ws['!merges']`: array of range objects corresponding to the merged cells in
|
2403 | the worksheet. Plain text formats do not support merge cells. CSV export
|
2404 | will write all cells in the merge range if they exist, so be sure that only
|
2405 | the first cell (upper-left) in the range is set.
|
2406 |
|
2407 | - `ws['!outline']`: configure how outlines should behave. Options default to
|
2408 | the default settings in Excel 2019:
|
2409 |
|
2410 | | key | Excel feature | default |
|
2411 | |:----------|:----------------------------------------------|:--------|
|
2412 | | `above` | Uncheck "Summary rows below detail" | `false` |
|
2413 | | `left` | Uncheck "Summary rows to the right of detail" | `false` |
|
2414 |
|
2415 | - `ws['!protect']`: object of write sheet protection properties. The `password`
|
2416 | key specifies the password for formats that support password-protected sheets
|
2417 | (XLSX/XLSB/XLS). The writer uses the XOR obfuscation method. The following
|
2418 | keys control the sheet protection -- set to `false` to enable a feature when
|
2419 | sheet is locked or set to `true` to disable a feature:
|
2420 |
|
2421 | <details>
|
2422 | <summary><b>Worksheet Protection Details</b> (click to show)</summary>
|
2423 |
|
2424 | | key | feature (true=disabled / false=enabled) | default |
|
2425 | |:----------------------|:----------------------------------------|:-----------|
|
2426 | | `selectLockedCells` | Select locked cells | enabled |
|
2427 | | `selectUnlockedCells` | Select unlocked cells | enabled |
|
2428 | | `formatCells` | Format cells | disabled |
|
2429 | | `formatColumns` | Format columns | disabled |
|
2430 | | `formatRows` | Format rows | disabled |
|
2431 | | `insertColumns` | Insert columns | disabled |
|
2432 | | `insertRows` | Insert rows | disabled |
|
2433 | | `insertHyperlinks` | Insert hyperlinks | disabled |
|
2434 | | `deleteColumns` | Delete columns | disabled |
|
2435 | | `deleteRows` | Delete rows | disabled |
|
2436 | | `sort` | Sort | disabled |
|
2437 | | `autoFilter` | Filter | disabled |
|
2438 | | `pivotTables` | Use PivotTable reports | disabled |
|
2439 | | `objects` | Edit objects | enabled |
|
2440 | | `scenarios` | Edit scenarios | enabled |
|
2441 | </details>
|
2442 |
|
2443 | - `ws['!autofilter']`: AutoFilter object following the schema:
|
2444 |
|
2445 | ```typescript
|
2446 | type AutoFilter = {
|
2447 | ref:string; // A-1 based range representing the AutoFilter table range
|
2448 | }
|
2449 | ```
|
2450 |
|
2451 | #### Chartsheet Object
|
2452 |
|
2453 | Chartsheets are represented as standard sheets. They are distinguished with the
|
2454 | `!type` property set to `"chart"`.
|
2455 |
|
2456 | The underlying data and `!ref` refer to the cached data in the chartsheet. The
|
2457 | first row of the chartsheet is the underlying header.
|
2458 |
|
2459 | #### Macrosheet Object
|
2460 |
|
2461 | Macrosheets are represented as standard sheets. They are distinguished with the
|
2462 | `!type` property set to `"macro"`.
|
2463 |
|
2464 | #### Dialogsheet Object
|
2465 |
|
2466 | Dialogsheets are represented as standard sheets. They are distinguished with the
|
2467 | `!type` property set to `"dialog"`.
|
2468 |
|
2469 | ### Workbook Object
|
2470 |
|
2471 | `workbook.SheetNames` is an ordered list of the sheets in the workbook
|
2472 |
|
2473 | `wb.Sheets[sheetname]` returns an object representing the worksheet.
|
2474 |
|
2475 | `wb.Props` is an object storing the standard properties. `wb.Custprops` stores
|
2476 | custom properties. Since the XLS standard properties deviate from the XLSX
|
2477 | standard, XLS parsing stores core properties in both places.
|
2478 |
|
2479 | `wb.Workbook` stores [workbook-level attributes](#workbook-level-attributes).
|
2480 |
|
2481 | #### Workbook File Properties
|
2482 |
|
2483 | The various file formats use different internal names for file properties. The
|
2484 | workbook `Props` object normalizes the names:
|
2485 |
|
2486 | <details>
|
2487 | <summary><b>File Properties</b> (click to show)</summary>
|
2488 |
|
2489 | | JS Name | Excel Description |
|
2490 | |:--------------|:-------------------------------|
|
2491 | | `Title` | Summary tab "Title" |
|
2492 | | `Subject` | Summary tab "Subject" |
|
2493 | | `Author` | Summary tab "Author" |
|
2494 | | `Manager` | Summary tab "Manager" |
|
2495 | | `Company` | Summary tab "Company" |
|
2496 | | `Category` | Summary tab "Category" |
|
2497 | | `Keywords` | Summary tab "Keywords" |
|
2498 | | `Comments` | Summary tab "Comments" |
|
2499 | | `LastAuthor` | Statistics tab "Last saved by" |
|
2500 | | `CreatedDate` | Statistics tab "Created" |
|
2501 |
|
2502 | </details>
|
2503 |
|
2504 | For example, to set the workbook title property:
|
2505 |
|
2506 | ```js
|
2507 | if(!wb.Props) wb.Props = {};
|
2508 | wb.Props.Title = "Insert Title Here";
|
2509 | ```
|
2510 |
|
2511 | Custom properties are added in the workbook `Custprops` object:
|
2512 |
|
2513 | ```js
|
2514 | if(!wb.Custprops) wb.Custprops = {};
|
2515 | wb.Custprops["Custom Property"] = "Custom Value";
|
2516 | ```
|
2517 |
|
2518 | Writers will process the `Props` key of the options object:
|
2519 |
|
2520 | ```js
|
2521 | /* force the Author to be "SheetJS" */
|
2522 | XLSX.write(wb, {Props:{Author:"SheetJS"}});
|
2523 | ```
|
2524 |
|
2525 | ### Workbook-Level Attributes
|
2526 |
|
2527 | `wb.Workbook` stores workbook-level attributes.
|
2528 |
|
2529 | #### Defined Names
|
2530 |
|
2531 | `wb.Workbook.Names` is an array of defined name objects which have the keys:
|
2532 |
|
2533 | <details>
|
2534 | <summary><b>Defined Name Properties</b> (click to show)</summary>
|
2535 |
|
2536 | | Key | Description |
|
2537 | |:----------|:-----------------------------------------------------------------|
|
2538 | | `Sheet` | Name scope. Sheet Index (0 = first sheet) or `null` (Workbook) |
|
2539 | | `Name` | Case-sensitive name. Standard rules apply ** |
|
2540 | | `Ref` | A1-style Reference (`"Sheet1!$A$1:$D$20"`) |
|
2541 | | `Comment` | Comment (only applicable for XLS/XLSX/XLSB) |
|
2542 |
|
2543 | </details>
|
2544 |
|
2545 | Excel allows two sheet-scoped defined names to share the same name. However, a
|
2546 | sheet-scoped name cannot collide with a workbook-scope name. Workbook writers
|
2547 | may not enforce this constraint.
|
2548 |
|
2549 | #### Workbook Views
|
2550 |
|
2551 | `wb.Workbook.Views` is an array of workbook view objects which have the keys:
|
2552 |
|
2553 | | Key | Description |
|
2554 | |:----------------|:----------------------------------------------------|
|
2555 | | `RTL` | If true, display right-to-left |
|
2556 |
|
2557 | #### Miscellaneous Workbook Properties
|
2558 |
|
2559 | `wb.Workbook.WBProps` holds other workbook properties:
|
2560 |
|
2561 | | Key | Description |
|
2562 | |:----------------|:----------------------------------------------------|
|
2563 | | `CodeName` | [VBA Project Workbook Code Name](#vba-and-macros) |
|
2564 | | `date1904` | epoch: 0/false for 1900 system, 1/true for 1904 |
|
2565 | | `filterPrivacy` | Warn or strip personally identifying info on save |
|
2566 |
|
2567 | ### Document Features
|
2568 |
|
2569 | Even for basic features like date storage, the official Excel formats store the
|
2570 | same content in different ways. The parsers are expected to convert from the
|
2571 | underlying file format representation to the Common Spreadsheet Format. Writers
|
2572 | are expected to convert from CSF back to the underlying file format.
|
2573 |
|
2574 | #### Formulae
|
2575 |
|
2576 | The A1-style formula string is stored in the `f` field. Even though different
|
2577 | file formats store the formulae in different ways, the formats are translated.
|
2578 | Even though some formats store formulae with a leading equal sign, CSF formulae
|
2579 | do not start with `=`.
|
2580 |
|
2581 | <details>
|
2582 | <summary><b>Formulae File Format Support</b> (click to show)</summary>
|
2583 |
|
2584 | | Storage Representation | Formats | Read | Write |
|
2585 | |:-----------------------|:-------------------------|:-----:|:-----:|
|
2586 | | A1-style strings | XLSX | ✔ | ✔ |
|
2587 | | RC-style strings | XLML and plain text | ✔ | ✔ |
|
2588 | | BIFF Parsed formulae | XLSB and all XLS formats | ✔ | |
|
2589 | | OpenFormula formulae | ODS/FODS/UOS | ✔ | ✔ |
|
2590 | | Lotus Parsed formulae | All Lotus WK_ formats | ✔ | |
|
2591 |
|
2592 | Since Excel prohibits named cells from colliding with names of A1 or RC style
|
2593 | cell references, a (not-so-simple) regex conversion is possible. BIFF Parsed
|
2594 | formulae and Lotus Parsed formulae have to be explicitly unwound. OpenFormula
|
2595 | formulae can be converted with regular expressions.
|
2596 |
|
2597 | Shared formulae are decompressed and each cell has the formula corresponding to
|
2598 | its cell. Writers generally do not attempt to generate shared formulae.
|
2599 | </details>
|
2600 |
|
2601 | **Single-Cell Formulae**
|
2602 |
|
2603 | For simple formulae, the `f` key of the desired cell can be set to the actual
|
2604 | formula text. This worksheet represents `A1=1`, `A2=2`, and `A3=A1+A2`:
|
2605 |
|
2606 | ```js
|
2607 | var worksheet = {
|
2608 | "!ref": "A1:A3",
|
2609 | A1: { t:'n', v:1 },
|
2610 | A2: { t:'n', v:2 },
|
2611 | A3: { t:'n', v:3, f:'A1+A2' }
|
2612 | };
|
2613 | ```
|
2614 |
|
2615 | Utilities like `aoa_to_sheet` will accept cell objects in lieu of values:
|
2616 |
|
2617 | ```js
|
2618 | var worksheet = XLSX.utils.aoa_to_sheet([
|
2619 | [ 1 ], // A1
|
2620 | [ 2 ], // A2
|
2621 | [ {t: "n", v: 3, f: "A1+A2"} ] // A3
|
2622 | ]);
|
2623 | ```
|
2624 |
|
2625 | Cells with formula entries but no value will be serialized in a way that Excel
|
2626 | and other spreadsheet tools will recognize. This library will not automatically
|
2627 | compute formula results! For example, the following worksheet will include the
|
2628 | `BESSELJ` function but the result will not be available in JavaScript:
|
2629 |
|
2630 | ```js
|
2631 | var worksheet = XLSX.utils.aoa_to_sheet([
|
2632 | [ 3.14159, 2 ], // Row "1"
|
2633 | [ { t:'n', f:'BESSELJ(A1,B1)' } ] // Row "2" will be calculated on file open
|
2634 | }
|
2635 | ```
|
2636 |
|
2637 | If the actual results are needed in JS, [SheetJS Pro](https://sheetjs.com/pro)
|
2638 | offers a formula calculator component for evaluating expressions, updating
|
2639 | values and dependent cells, and refreshing entire workbooks.
|
2640 |
|
2641 |
|
2642 | **Array Formulae**
|
2643 |
|
2644 | _Assign an array formula_
|
2645 |
|
2646 | ```js
|
2647 | XLSX.utils.sheet_set_array_formula(worksheet, range, formula);
|
2648 | ```
|
2649 |
|
2650 | Array formulae are stored in the top-left cell of the array block. All cells
|
2651 | of an array formula have a `F` field corresponding to the range. A single-cell
|
2652 | formula can be distinguished from a plain formula by the presence of `F` field.
|
2653 |
|
2654 | For example, setting the cell `C1` to the array formula `{=SUM(A1:A3*B1:B3)}`:
|
2655 |
|
2656 | ```js
|
2657 | // API function
|
2658 | XLSX.utils.sheet_set_array_formula(worksheet, "C1", "SUM(A1:A3*B1:B3)");
|
2659 |
|
2660 | // ... OR raw operations
|
2661 | worksheet['C1'] = { t:'n', f: "SUM(A1:A3*B1:B3)", F:"C1:C1" };
|
2662 | ```
|
2663 |
|
2664 | For a multi-cell array formula, every cell has the same array range but only the
|
2665 | first cell specifies the formula. Consider `D1:D3=A1:A3*B1:B3`:
|
2666 |
|
2667 | ```js
|
2668 | // API function
|
2669 | XLSX.utils.sheet_set_array_formula(worksheet, "D1:D3", "A1:A3*B1:B3");
|
2670 |
|
2671 | // ... OR raw operations
|
2672 | worksheet['D1'] = { t:'n', F:"D1:D3", f:"A1:A3*B1:B3" };
|
2673 | worksheet['D2'] = { t:'n', F:"D1:D3" };
|
2674 | worksheet['D3'] = { t:'n', F:"D1:D3" };
|
2675 | ```
|
2676 |
|
2677 | Utilities and writers are expected to check for the presence of a `F` field and
|
2678 | ignore any possible formula element `f` in cells other than the starting cell.
|
2679 | They are not expected to perform validation of the formulae!
|
2680 |
|
2681 |
|
2682 | **Dynamic Array Formulae**
|
2683 |
|
2684 | _Assign a dynamic array formula_
|
2685 |
|
2686 | ```js
|
2687 | XLSX.utils.sheet_set_array_formula(worksheet, range, formula, true);
|
2688 | ```
|
2689 |
|
2690 | Released in 2020, Dynamic Array Formulae are supported in the XLSX/XLSM and XLSB
|
2691 | file formats. They are represented like normal array formulae but have special
|
2692 | cell metadata indicating that the formula should be allowed to adjust the range.
|
2693 |
|
2694 | An array formula can be marked as dynamic by setting the cell's `D` property to
|
2695 | true. The `F` range is expected but can be the set to the current cell:
|
2696 |
|
2697 | ```js
|
2698 | // API function
|
2699 | XLSX.utils.sheet_set_array_formula(worksheet, "C1", "_xlfn.UNIQUE(A1:A3)", 1);
|
2700 |
|
2701 | // ... OR raw operations
|
2702 | worksheet['C1'] = { t: "s", f: "_xlfn.UNIQUE(A1:A3)", F:"C1", D: 1 }; // dynamic
|
2703 | ```
|
2704 |
|
2705 | **Localization with Function Names**
|
2706 |
|
2707 | SheetJS operates at the file level. Excel stores formula expressions using the
|
2708 | English (United States) function names. For non-English users, Excel uses a
|
2709 | localized set of function names.
|
2710 |
|
2711 | For example, when the computer language and region is set to French (France),
|
2712 | Excel interprets `=SOMME(A1:C3)` as if `SOMME` is the `SUM` function. However,
|
2713 | in the actual file, Excel stores `SUM(A1:C3)`.
|
2714 |
|
2715 | **Prefixed "Future Functions"**
|
2716 |
|
2717 | Functions introduced in newer versions of Excel are prefixed with `_xlfn.` when
|
2718 | stored in files. When writing formula expressions using these functions, the
|
2719 | prefix is required for maximal compatibility:
|
2720 |
|
2721 | ```js
|
2722 | // Broadest compatibility
|
2723 | XLSX.utils.sheet_set_array_formula(worksheet, "C1", "_xlfn.UNIQUE(A1:A3)", 1);
|
2724 |
|
2725 | // Can cause errors in spreadsheet software
|
2726 | XLSX.utils.sheet_set_array_formula(worksheet, "C1", "UNIQUE(A1:A3)", 1);
|
2727 | ```
|
2728 |
|
2729 | When reading a file, the `xlfn` option preserves the prefixes.
|
2730 |
|
2731 | <details>
|
2732 | <summary><b> Functions requiring `_xlfn.` prefix</b> (click to show)</summary>
|
2733 |
|
2734 | This list is growing with each Excel release.
|
2735 |
|
2736 | ```
|
2737 | ACOT
|
2738 | ACOTH
|
2739 | AGGREGATE
|
2740 | ARABIC
|
2741 | BASE
|
2742 | BETA.DIST
|
2743 | BETA.INV
|
2744 | BINOM.DIST
|
2745 | BINOM.DIST.RANGE
|
2746 | BINOM.INV
|
2747 | BITAND
|
2748 | BITLSHIFT
|
2749 | BITOR
|
2750 | BITRSHIFT
|
2751 | BITXOR
|
2752 | BYCOL
|
2753 | BYROW
|
2754 | CEILING.MATH
|
2755 | CEILING.PRECISE
|
2756 | CHISQ.DIST
|
2757 | CHISQ.DIST.RT
|
2758 | CHISQ.INV
|
2759 | CHISQ.INV.RT
|
2760 | CHISQ.TEST
|
2761 | COMBINA
|
2762 | CONFIDENCE.NORM
|
2763 | CONFIDENCE.T
|
2764 | COT
|
2765 | COTH
|
2766 | COVARIANCE.P
|
2767 | COVARIANCE.S
|
2768 | CSC
|
2769 | CSCH
|
2770 | DAYS
|
2771 | DECIMAL
|
2772 | ERF.PRECISE
|
2773 | ERFC.PRECISE
|
2774 | EXPON.DIST
|
2775 | F.DIST
|
2776 | F.DIST.RT
|
2777 | F.INV
|
2778 | F.INV.RT
|
2779 | F.TEST
|
2780 | FIELDVALUE
|
2781 | FILTERXML
|
2782 | FLOOR.MATH
|
2783 | FLOOR.PRECISE
|
2784 | FORMULATEXT
|
2785 | GAMMA
|
2786 | GAMMA.DIST
|
2787 | GAMMA.INV
|
2788 | GAMMALN.PRECISE
|
2789 | GAUSS
|
2790 | HYPGEOM.DIST
|
2791 | IFNA
|
2792 | IMCOSH
|
2793 | IMCOT
|
2794 | IMCSC
|
2795 | IMCSCH
|
2796 | IMSEC
|
2797 | IMSECH
|
2798 | IMSINH
|
2799 | IMTAN
|
2800 | ISFORMULA
|
2801 | ISOMITTED
|
2802 | ISOWEEKNUM
|
2803 | LAMBDA
|
2804 | LET
|
2805 | LOGNORM.DIST
|
2806 | LOGNORM.INV
|
2807 | MAKEARRAY
|
2808 | MAP
|
2809 | MODE.MULT
|
2810 | MODE.SNGL
|
2811 | MUNIT
|
2812 | NEGBINOM.DIST
|
2813 | NORM.DIST
|
2814 | NORM.INV
|
2815 | NORM.S.DIST
|
2816 | NORM.S.INV
|
2817 | NUMBERVALUE
|
2818 | PDURATION
|
2819 | PERCENTILE.EXC
|
2820 | PERCENTILE.INC
|
2821 | PERCENTRANK.EXC
|
2822 | PERCENTRANK.INC
|
2823 | PERMUTATIONA
|
2824 | PHI
|
2825 | POISSON.DIST
|
2826 | QUARTILE.EXC
|
2827 | QUARTILE.INC
|
2828 | QUERYSTRING
|
2829 | RANDARRAY
|
2830 | RANK.AVG
|
2831 | RANK.EQ
|
2832 | REDUCE
|
2833 | RRI
|
2834 | SCAN
|
2835 | SEC
|
2836 | SECH
|
2837 | SEQUENCE
|
2838 | SHEET
|
2839 | SHEETS
|
2840 | SKEW.P
|
2841 | SORTBY
|
2842 | STDEV.P
|
2843 | STDEV.S
|
2844 | T.DIST
|
2845 | T.DIST.2T
|
2846 | T.DIST.RT
|
2847 | T.INV
|
2848 | T.INV.2T
|
2849 | T.TEST
|
2850 | UNICHAR
|
2851 | UNICODE
|
2852 | UNIQUE
|
2853 | VAR.P
|
2854 | VAR.S
|
2855 | WEBSERVICE
|
2856 | WEIBULL.DIST
|
2857 | XLOOKUP
|
2858 | XOR
|
2859 | Z.TEST
|
2860 | ```
|
2861 |
|
2862 | </details>
|
2863 |
|
2864 | #### Row and Column Properties
|
2865 |
|
2866 | <details>
|
2867 | <summary><b>Format Support</b> (click to show)</summary>
|
2868 |
|
2869 | **Row Properties**: XLSX/M, XLSB, BIFF8 XLS, XLML, SYLK, DOM, ODS
|
2870 |
|
2871 | **Column Properties**: XLSX/M, XLSB, BIFF8 XLS, XLML, SYLK, DOM
|
2872 |
|
2873 | </details>
|
2874 |
|
2875 |
|
2876 | Row and Column properties are not extracted by default when reading from a file
|
2877 | and are not persisted by default when writing to a file. The option
|
2878 | `cellStyles: true` must be passed to the relevant read or write function.
|
2879 |
|
2880 | _Column Properties_
|
2881 |
|
2882 | The `!cols` array in each worksheet, if present, is a collection of `ColInfo`
|
2883 | objects which have the following properties:
|
2884 |
|
2885 | ```typescript
|
2886 | type ColInfo = {
|
2887 | /* visibility */
|
2888 | hidden?: boolean; // if true, the column is hidden
|
2889 |
|
2890 | /* column width is specified in one of the following ways: */
|
2891 | wpx?: number; // width in screen pixels
|
2892 | width?: number; // width in Excel's "Max Digit Width", width*256 is integral
|
2893 | wch?: number; // width in characters
|
2894 |
|
2895 | /* other fields for preserving features from files */
|
2896 | level?: number; // 0-indexed outline / group level
|
2897 | MDW?: number; // Excel's "Max Digit Width" unit, always integral
|
2898 | };
|
2899 | ```
|
2900 |
|
2901 | _Row Properties_
|
2902 |
|
2903 | The `!rows` array in each worksheet, if present, is a collection of `RowInfo`
|
2904 | objects which have the following properties:
|
2905 |
|
2906 | ```typescript
|
2907 | type RowInfo = {
|
2908 | /* visibility */
|
2909 | hidden?: boolean; // if true, the row is hidden
|
2910 |
|
2911 | /* row height is specified in one of the following ways: */
|
2912 | hpx?: number; // height in screen pixels
|
2913 | hpt?: number; // height in points
|
2914 |
|
2915 | level?: number; // 0-indexed outline / group level
|
2916 | };
|
2917 | ```
|
2918 |
|
2919 | _Outline / Group Levels Convention_
|
2920 |
|
2921 | The Excel UI displays the base outline level as `1` and the max level as `8`.
|
2922 | Following JS conventions, SheetJS uses 0-indexed outline levels wherein the base
|
2923 | outline level is `0` and the max level is `7`.
|
2924 |
|
2925 | <details>
|
2926 | <summary><b>Why are there three width types?</b> (click to show)</summary>
|
2927 |
|
2928 | There are three different width types corresponding to the three different ways
|
2929 | spreadsheets store column widths:
|
2930 |
|
2931 | SYLK and other plain text formats use raw character count. Contemporaneous tools
|
2932 | like Visicalc and Multiplan were character based. Since the characters had the
|
2933 | same width, it sufficed to store a count. This tradition was continued into the
|
2934 | BIFF formats.
|
2935 |
|
2936 | SpreadsheetML (2003) tried to align with HTML by standardizing on screen pixel
|
2937 | count throughout the file. Column widths, row heights, and other measures use
|
2938 | pixels. When the pixel and character counts do not align, Excel rounds values.
|
2939 |
|
2940 | XLSX internally stores column widths in a nebulous "Max Digit Width" form. The
|
2941 | Max Digit Width is the width of the largest digit when rendered (generally the
|
2942 | "0" character is the widest). The internal width must be an integer multiple of
|
2943 | the the width divided by 256. ECMA-376 describes a formula for converting
|
2944 | between pixels and the internal width. This represents a hybrid approach.
|
2945 |
|
2946 | Read functions attempt to populate all three properties. Write functions will
|
2947 | try to cycle specified values to the desired type. In order to avoid potential
|
2948 | conflicts, manipulation should delete the other properties first. For example,
|
2949 | when changing the pixel width, delete the `wch` and `width` properties.
|
2950 | </details>
|
2951 |
|
2952 | <details>
|
2953 | <summary><b>Implementation details</b> (click to show)</summary>
|
2954 |
|
2955 | _Row Heights_
|
2956 |
|
2957 | Excel internally stores row heights in points. The default resolution is 72 DPI
|
2958 | or 96 PPI, so the pixel and point size should agree. For different resolutions
|
2959 | they may not agree, so the library separates the concepts.
|
2960 |
|
2961 | Even though all of the information is made available, writers are expected to
|
2962 | follow the priority order:
|
2963 |
|
2964 | 1) use `hpx` pixel height if available
|
2965 | 2) use `hpt` point height if available
|
2966 |
|
2967 | _Column Widths_
|
2968 |
|
2969 | Given the constraints, it is possible to determine the MDW without actually
|
2970 | inspecting the font! The parsers guess the pixel width by converting from width
|
2971 | to pixels and back, repeating for all possible MDW and selecting the MDW that
|
2972 | minimizes the error. XLML actually stores the pixel width, so the guess works
|
2973 | in the opposite direction.
|
2974 |
|
2975 | Even though all of the information is made available, writers are expected to
|
2976 | follow the priority order:
|
2977 |
|
2978 | 1) use `width` field if available
|
2979 | 2) use `wpx` pixel width if available
|
2980 | 3) use `wch` character count if available
|
2981 |
|
2982 | </details>
|
2983 |
|
2984 | #### Number Formats
|
2985 |
|
2986 | The `cell.w` formatted text for each cell is produced from `cell.v` and `cell.z`
|
2987 | format. If the format is not specified, the Excel `General` format is used.
|
2988 | The format can either be specified as a string or as an index into the format
|
2989 | table. Parsers are expected to populate `workbook.SSF` with the number format
|
2990 | table. Writers are expected to serialize the table.
|
2991 |
|
2992 | Custom tools should ensure that the local table has each used format string
|
2993 | somewhere in the table. Excel convention mandates that the custom formats start
|
2994 | at index 164. The following example creates a custom format from scratch:
|
2995 |
|
2996 | <details>
|
2997 | <summary><b>New worksheet with custom format</b> (click to show)</summary>
|
2998 |
|
2999 | ```js
|
3000 | var wb = {
|
3001 | SheetNames: ["Sheet1"],
|
3002 | Sheets: {
|
3003 | Sheet1: {
|
3004 | "!ref":"A1:C1",
|
3005 | A1: { t:"n", v:10000 }, // <-- General format
|
3006 | B1: { t:"n", v:10000, z: "0%" }, // <-- Builtin format
|
3007 | C1: { t:"n", v:10000, z: "\"T\"\ #0.00" } // <-- Custom format
|
3008 | }
|
3009 | }
|
3010 | }
|
3011 | ```
|
3012 | </details>
|
3013 |
|
3014 | The rules are slightly different from how Excel displays custom number formats.
|
3015 | In particular, literal characters must be wrapped in double quotes or preceded
|
3016 | by a backslash. For more info, see the Excel documentation article
|
3017 | `Create or delete a custom number format` or ECMA-376 18.8.31 (Number Formats)
|
3018 |
|
3019 |
|
3020 | <details>
|
3021 | <summary><b>Default Number Formats</b> (click to show)</summary>
|
3022 |
|
3023 | The default formats are listed in ECMA-376 18.8.30:
|
3024 |
|
3025 | | ID | Format |
|
3026 | |---:|:---------------------------|
|
3027 | | 0 | `General` |
|
3028 | | 1 | `0` |
|
3029 | | 2 | `0.00` |
|
3030 | | 3 | `#,##0` |
|
3031 | | 4 | `#,##0.00` |
|
3032 | | 9 | `0%` |
|
3033 | | 10 | `0.00%` |
|
3034 | | 11 | `0.00E+00` |
|
3035 | | 12 | `# ?/?` |
|
3036 | | 13 | `# ??/??` |
|
3037 | | 14 | `m/d/yy` (see below) |
|
3038 | | 15 | `d-mmm-yy` |
|
3039 | | 16 | `d-mmm` |
|
3040 | | 17 | `mmm-yy` |
|
3041 | | 18 | `h:mm AM/PM` |
|
3042 | | 19 | `h:mm:ss AM/PM` |
|
3043 | | 20 | `h:mm` |
|
3044 | | 21 | `h:mm:ss` |
|
3045 | | 22 | `m/d/yy h:mm` |
|
3046 | | 37 | `#,##0 ;(#,##0)` |
|
3047 | | 38 | `#,##0 ;[Red](#,##0)` |
|
3048 | | 39 | `#,##0.00;(#,##0.00)` |
|
3049 | | 40 | `#,##0.00;[Red](#,##0.00)` |
|
3050 | | 45 | `mm:ss` |
|
3051 | | 46 | `[h]:mm:ss` |
|
3052 | | 47 | `mmss.0` |
|
3053 | | 48 | `##0.0E+0` |
|
3054 | | 49 | `@` |
|
3055 |
|
3056 | </details>
|
3057 |
|
3058 | Format 14 (`m/d/yy`) is localized by Excel: even though the file specifies that
|
3059 | number format, it will be drawn differently based on system settings. It makes
|
3060 | sense when the producer and consumer of files are in the same locale, but that
|
3061 | is not always the case over the Internet. To get around this ambiguity, parse
|
3062 | functions accept the `dateNF` option to override the interpretation of that
|
3063 | specific format string.
|
3064 |
|
3065 | #### Hyperlinks
|
3066 |
|
3067 | <details>
|
3068 | <summary><b>Format Support</b> (click to show)</summary>
|
3069 |
|
3070 | **Cell Hyperlinks**: XLSX/M, XLSB, BIFF8 XLS, XLML, ODS
|
3071 |
|
3072 | **Tooltips**: XLSX/M, XLSB, BIFF8 XLS, XLML
|
3073 |
|
3074 | </details>
|
3075 |
|
3076 | Hyperlinks are stored in the `l` key of cell objects. The `Target` field of the
|
3077 | hyperlink object is the target of the link, including the URI fragment. Tooltips
|
3078 | are stored in the `Tooltip` field and are displayed when you move your mouse
|
3079 | over the text.
|
3080 |
|
3081 | For example, the following snippet creates a link from cell `A3` to
|
3082 | <https://sheetjs.com> with the tip `"Find us @ SheetJS.com!"`:
|
3083 |
|
3084 | ```js
|
3085 | ws['A1'].l = { Target:"https://sheetjs.com", Tooltip:"Find us @ SheetJS.com!" };
|
3086 | ```
|
3087 |
|
3088 | Note that Excel does not automatically style hyperlinks -- they will generally
|
3089 | be displayed as normal text.
|
3090 |
|
3091 | _Remote Links_
|
3092 |
|
3093 | HTTP / HTTPS links can be used directly:
|
3094 |
|
3095 | ```js
|
3096 | ws['A2'].l = { Target:"https://docs.sheetjs.com/#hyperlinks" };
|
3097 | ws['A3'].l = { Target:"http://localhost:7262/yes_localhost_works" };
|
3098 | ```
|
3099 |
|
3100 | Excel also supports `mailto` email links with subject line:
|
3101 |
|
3102 | ```js
|
3103 | ws['A4'].l = { Target:"mailto:ignored@dev.null" };
|
3104 | ws['A5'].l = { Target:"mailto:ignored@dev.null?subject=Test Subject" };
|
3105 | ```
|
3106 |
|
3107 | _Local Links_
|
3108 |
|
3109 | Links to absolute paths should use the `file://` URI scheme:
|
3110 |
|
3111 | ```js
|
3112 | ws['B1'].l = { Target:"file:///SheetJS/t.xlsx" }; /* Link to /SheetJS/t.xlsx */
|
3113 | ws['B2'].l = { Target:"file:///c:/SheetJS.xlsx" }; /* Link to c:\SheetJS.xlsx */
|
3114 | ```
|
3115 |
|
3116 | Links to relative paths can be specified without a scheme:
|
3117 |
|
3118 | ```js
|
3119 | ws['B3'].l = { Target:"SheetJS.xlsb" }; /* Link to SheetJS.xlsb */
|
3120 | ws['B4'].l = { Target:"../SheetJS.xlsm" }; /* Link to ../SheetJS.xlsm */
|
3121 | ```
|
3122 |
|
3123 | Relative Paths have undefined behavior in the SpreadsheetML 2003 format. Excel
|
3124 | 2019 will treat a `..\` parent mark as two levels up.
|
3125 |
|
3126 | _Internal Links_
|
3127 |
|
3128 | Links where the target is a cell or range or defined name in the same workbook
|
3129 | ("Internal Links") are marked with a leading hash character:
|
3130 |
|
3131 | ```js
|
3132 | ws['C1'].l = { Target:"#E2" }; /* Link to cell E2 */
|
3133 | ws['C2'].l = { Target:"#Sheet2!E2" }; /* Link to cell E2 in sheet Sheet2 */
|
3134 | ws['C3'].l = { Target:"#SomeDefinedName" }; /* Link to Defined Name */
|
3135 | ```
|
3136 |
|
3137 | #### Cell Comments
|
3138 |
|
3139 | Cell comments are objects stored in the `c` array of cell objects. The actual
|
3140 | contents of the comment are split into blocks based on the comment author. The
|
3141 | `a` field of each comment object is the author of the comment and the `t` field
|
3142 | is the plain text representation.
|
3143 |
|
3144 | For example, the following snippet appends a cell comment into cell `A1`:
|
3145 |
|
3146 | ```js
|
3147 | if(!ws.A1.c) ws.A1.c = [];
|
3148 | ws.A1.c.push({a:"SheetJS", t:"I'm a little comment, short and stout!"});
|
3149 | ```
|
3150 |
|
3151 | Note: XLSB enforces a 54 character limit on the Author name. Names longer than
|
3152 | 54 characters may cause issues with other formats.
|
3153 |
|
3154 | To mark a comment as normally hidden, set the `hidden` property:
|
3155 |
|
3156 | ```js
|
3157 | if(!ws.A1.c) ws.A1.c = [];
|
3158 | ws.A1.c.push({a:"SheetJS", t:"This comment is visible"});
|
3159 |
|
3160 | if(!ws.A2.c) ws.A2.c = [];
|
3161 | ws.A2.c.hidden = true;
|
3162 | ws.A2.c.push({a:"SheetJS", t:"This comment will be hidden"});
|
3163 | ```
|
3164 |
|
3165 |
|
3166 | _Threaded Comments_
|
3167 |
|
3168 | Introduced in Excel 365, threaded comments are plain text comment snippets with
|
3169 | author metadata and parent references. They are supported in XLSX and XLSB.
|
3170 |
|
3171 | To mark a comment as threaded, each comment part must have a true `T` property:
|
3172 |
|
3173 | ```js
|
3174 | if(!ws.A1.c) ws.A1.c = [];
|
3175 | ws.A1.c.push({a:"SheetJS", t:"This is not threaded"});
|
3176 |
|
3177 | if(!ws.A2.c) ws.A2.c = [];
|
3178 | ws.A2.c.hidden = true;
|
3179 | ws.A2.c.push({a:"SheetJS", t:"This is threaded", T: true});
|
3180 | ws.A2.c.push({a:"JSSheet", t:"This is also threaded", T: true});
|
3181 | ```
|
3182 |
|
3183 | There is no Active Directory or Office 365 metadata associated with authors in a thread.
|
3184 |
|
3185 | #### Sheet Visibility
|
3186 |
|
3187 | Excel enables hiding sheets in the lower tab bar. The sheet data is stored in
|
3188 | the file but the UI does not readily make it available. Standard hidden sheets
|
3189 | are revealed in the "Unhide" menu. Excel also has "very hidden" sheets which
|
3190 | cannot be revealed in the menu. It is only accessible in the VB Editor!
|
3191 |
|
3192 | The visibility setting is stored in the `Hidden` property of sheet props array.
|
3193 |
|
3194 | <details>
|
3195 | <summary><b>More details</b> (click to show)</summary>
|
3196 |
|
3197 | | Value | Definition |
|
3198 | |:-----:|:------------|
|
3199 | | 0 | Visible |
|
3200 | | 1 | Hidden |
|
3201 | | 2 | Very Hidden |
|
3202 |
|
3203 | With <https://rawgit.com/SheetJS/test_files/HEAD/sheet_visibility.xlsx>:
|
3204 |
|
3205 | ```js
|
3206 | > wb.Workbook.Sheets.map(function(x) { return [x.name, x.Hidden] })
|
3207 | [ [ 'Visible', 0 ], [ 'Hidden', 1 ], [ 'VeryHidden', 2 ] ]
|
3208 | ```
|
3209 |
|
3210 | Non-Excel formats do not support the Very Hidden state. The best way to test
|
3211 | if a sheet is visible is to check if the `Hidden` property is logical truth:
|
3212 |
|
3213 | ```js
|
3214 | > wb.Workbook.Sheets.map(function(x) { return [x.name, !x.Hidden] })
|
3215 | [ [ 'Visible', true ], [ 'Hidden', false ], [ 'VeryHidden', false ] ]
|
3216 | ```
|
3217 | </details>
|
3218 |
|
3219 | #### VBA and Macros
|
3220 |
|
3221 | VBA Macros are stored in a special data blob that is exposed in the `vbaraw`
|
3222 | property of the workbook object when the `bookVBA` option is `true`. They are
|
3223 | supported in `XLSM`, `XLSB`, and `BIFF8 XLS` formats. The supported format
|
3224 | writers automatically insert the data blobs if it is present in the workbook and
|
3225 | associate with the worksheet names.
|
3226 |
|
3227 | <details>
|
3228 | <summary><b>Custom Code Names</b> (click to show)</summary>
|
3229 |
|
3230 | The workbook code name is stored in `wb.Workbook.WBProps.CodeName`. By default,
|
3231 | Excel will write `ThisWorkbook` or a translated phrase like `DieseArbeitsmappe`.
|
3232 | Worksheet and Chartsheet code names are in the worksheet properties object at
|
3233 | `wb.Workbook.Sheets[i].CodeName`. Macrosheets and Dialogsheets are ignored.
|
3234 |
|
3235 | The readers and writers preserve the code names, but they have to be manually
|
3236 | set when adding a VBA blob to a different workbook.
|
3237 |
|
3238 | </details>
|
3239 |
|
3240 | <details>
|
3241 | <summary><b>Macrosheets</b> (click to show)</summary>
|
3242 |
|
3243 | Older versions of Excel also supported a non-VBA "macrosheet" sheet type that
|
3244 | stored automation commands. These are exposed in objects with the `!type`
|
3245 | property set to `"macro"`.
|
3246 |
|
3247 | </details>
|
3248 |
|
3249 | <details>
|
3250 | <summary><b>Detecting macros in workbooks</b> (click to show)</summary>
|
3251 |
|
3252 | The `vbaraw` field will only be set if macros are present, so testing is simple:
|
3253 |
|
3254 | ```js
|
3255 | function wb_has_macro(wb/*:workbook*/)/*:boolean*/ {
|
3256 | if(!!wb.vbaraw) return true;
|
3257 | const sheets = wb.SheetNames.map((n) => wb.Sheets[n]);
|
3258 | return sheets.some((ws) => !!ws && ws['!type']=='macro');
|
3259 | }
|
3260 | ```
|
3261 |
|
3262 | </details>
|
3263 |
|
3264 | ## Parsing Options
|
3265 |
|
3266 | The exported `read` and `readFile` functions accept an options argument:
|
3267 |
|
3268 | | Option Name | Default | Description |
|
3269 | | :---------- | ------: | :--------------------------------------------------- |
|
3270 | |`type` | | Input data encoding (see Input Type below) |
|
3271 | |`raw` | false | If true, plain text parsing will not parse values ** |
|
3272 | |`codepage` | | If specified, use code page when appropriate ** |
|
3273 | |`cellFormula`| true | Save formulae to the .f field |
|
3274 | |`cellHTML` | true | Parse rich text and save HTML to the `.h` field |
|
3275 | |`cellNF` | false | Save number format string to the `.z` field |
|
3276 | |`cellStyles` | false | Save style/theme info to the `.s` field |
|
3277 | |`cellText` | true | Generated formatted text to the `.w` field |
|
3278 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3279 | |`dateNF` | | If specified, use the string for date code 14 ** |
|
3280 | |`sheetStubs` | false | Create cell objects of type `z` for stub cells |
|
3281 | |`sheetRows` | 0 | If >0, read the first `sheetRows` rows ** |
|
3282 | |`bookDeps` | false | If true, parse calculation chains |
|
3283 | |`bookFiles` | false | If true, add raw files to book object ** |
|
3284 | |`bookProps` | false | If true, only parse enough to get book metadata ** |
|
3285 | |`bookSheets` | false | If true, only parse enough to get the sheet names |
|
3286 | |`bookVBA` | false | If true, copy VBA blob to `vbaraw` field ** |
|
3287 | |`password` | "" | If defined and file is encrypted, use password ** |
|
3288 | |`WTF` | false | If true, throw errors on unexpected file features ** |
|
3289 | |`sheets` | | If specified, only parse specified sheets ** |
|
3290 | |`PRN` | false | If true, allow parsing of PRN files ** |
|
3291 | |`xlfn` | false | If true, preserve `_xlfn.` prefixes in formulae ** |
|
3292 | |`FS` | | DSV Field Separator override |
|
3293 |
|
3294 | - Even if `cellNF` is false, formatted text will be generated and saved to `.w`
|
3295 | - In some cases, sheets may be parsed even if `bookSheets` is false.
|
3296 | - Excel aggressively tries to interpret values from CSV and other plain text.
|
3297 | This leads to surprising behavior! The `raw` option suppresses value parsing.
|
3298 | - `bookSheets` and `bookProps` combine to give both sets of information
|
3299 | - `Deps` will be an empty object if `bookDeps` is false
|
3300 | - `bookFiles` behavior depends on file type:
|
3301 | * `keys` array (paths in the ZIP) for ZIP-based formats
|
3302 | * `files` hash (mapping paths to objects representing the files) for ZIP
|
3303 | * `cfb` object for formats using CFB containers
|
3304 | - `sheetRows-1` rows will be generated when looking at the JSON object output
|
3305 | (since the header row is counted as a row when parsing the data)
|
3306 | - By default all worksheets are parsed. `sheets` restricts based on input type:
|
3307 | * number: zero-based index of worksheet to parse (`0` is first worksheet)
|
3308 | * string: name of worksheet to parse (case insensitive)
|
3309 | * array of numbers and strings to select multiple worksheets.
|
3310 | - `bookVBA` merely exposes the raw VBA CFB object. It does not parse the data.
|
3311 | XLSM and XLSB store the VBA CFB object in `xl/vbaProject.bin`. BIFF8 XLS mixes
|
3312 | the VBA entries alongside the core Workbook entry, so the library generates a
|
3313 | new XLSB-compatible blob from the XLS CFB container.
|
3314 | - `codepage` is applied to BIFF2 - BIFF5 files without `CodePage` records and to
|
3315 | CSV files without BOM in `type:"binary"`. BIFF8 XLS always defaults to 1200.
|
3316 | - `PRN` affects parsing of text files without a common delimiter character.
|
3317 | - Currently only XOR encryption is supported. Unsupported error will be thrown
|
3318 | for files employing other encryption methods.
|
3319 | - Newer Excel functions are serialized with the `_xlfn.` prefix, hidden from the
|
3320 | user. SheetJS will strip `_xlfn.` normally. The `xlfn` option preserves them.
|
3321 | - WTF is mainly for development. By default, the parser will suppress read
|
3322 | errors on single worksheets, allowing you to read from the worksheets that do
|
3323 | parse properly. Setting `WTF:true` forces those errors to be thrown.
|
3324 |
|
3325 | ### Input Type
|
3326 |
|
3327 | Strings can be interpreted in multiple ways. The `type` parameter for `read`
|
3328 | tells the library how to parse the data argument:
|
3329 |
|
3330 | | `type` | expected input |
|
3331 | |------------|-----------------------------------------------------------------|
|
3332 | | `"base64"` | string: Base64 encoding of the file |
|
3333 | | `"binary"` | string: binary string (byte `n` is `data.charCodeAt(n)`) |
|
3334 | | `"string"` | string: JS string (characters interpreted as UTF8) |
|
3335 | | `"buffer"` | nodejs Buffer |
|
3336 | | `"array"` | array: array of 8-bit unsigned int (byte `n` is `data[n]`) |
|
3337 | | `"file"` | string: path of file that will be read (nodejs only) |
|
3338 |
|
3339 | ### Guessing File Type
|
3340 |
|
3341 | <details>
|
3342 | <summary><b>Implementation Details</b> (click to show)</summary>
|
3343 |
|
3344 | Excel and other spreadsheet tools read the first few bytes and apply other
|
3345 | heuristics to determine a file type. This enables file type punning: renaming
|
3346 | files with the `.xls` extension will tell your computer to use Excel to open the
|
3347 | file but Excel will know how to handle it. This library applies similar logic:
|
3348 |
|
3349 | | Byte 0 | Raw File Type | Spreadsheet Types |
|
3350 | |:-------|:--------------|:----------------------------------------------------|
|
3351 | | `0xD0` | CFB Container | BIFF 5/8 or protected XLSX/XLSB or WQ3/QPW or XLR |
|
3352 | | `0x09` | BIFF Stream | BIFF 2/3/4/5 |
|
3353 | | `0x3C` | XML/HTML | SpreadsheetML / Flat ODS / UOS1 / HTML / plain text |
|
3354 | | `0x50` | ZIP Archive | XLSB or XLSX/M or ODS or UOS2 or NUMBERS or text |
|
3355 | | `0x49` | Plain Text | SYLK or plain text |
|
3356 | | `0x54` | Plain Text | DIF or plain text |
|
3357 | | `0xEF` | UTF8 Encoded | SpreadsheetML / Flat ODS / UOS1 / HTML / plain text |
|
3358 | | `0xFF` | UTF16 Encoded | SpreadsheetML / Flat ODS / UOS1 / HTML / plain text |
|
3359 | | `0x00` | Record Stream | Lotus WK\* or Quattro Pro or plain text |
|
3360 | | `0x7B` | Plain text | RTF or plain text |
|
3361 | | `0x0A` | Plain text | SpreadsheetML / Flat ODS / UOS1 / HTML / plain text |
|
3362 | | `0x0D` | Plain text | SpreadsheetML / Flat ODS / UOS1 / HTML / plain text |
|
3363 | | `0x20` | Plain text | SpreadsheetML / Flat ODS / UOS1 / HTML / plain text |
|
3364 |
|
3365 | DBF files are detected based on the first byte as well as the third and fourth
|
3366 | bytes (corresponding to month and day of the file date)
|
3367 |
|
3368 | Works for Windows files are detected based on the BOF record with type `0xFF`
|
3369 |
|
3370 | Plain text format guessing follows the priority order:
|
3371 |
|
3372 | | Format | Test |
|
3373 | |:-------|:--------------------------------------------------------------------|
|
3374 | | XML | `<?xml` appears in the first 1024 characters |
|
3375 | | HTML | starts with `<` and HTML tags appear in the first 1024 characters * |
|
3376 | | XML | starts with `<` and the first tag is valid |
|
3377 | | RTF | starts with `{\rt` |
|
3378 | | DSV | starts with `/sep=.$/`, separator is the specified character |
|
3379 | | DSV | more unquoted `|` chars than `;` `\t` `,` in the first 1024 |
|
3380 | | DSV | more unquoted `;` chars than `\t` or `,` in the first 1024 |
|
3381 | | TSV | more unquoted `\t` chars than `,` chars in the first 1024 |
|
3382 | | CSV | one of the first 1024 characters is a comma `","` |
|
3383 | | ETH | starts with `socialcalc:version:` |
|
3384 | | PRN | `PRN` option is set to true |
|
3385 | | CSV | (fallback) |
|
3386 |
|
3387 | - HTML tags include: `html`, `table`, `head`, `meta`, `script`, `style`, `div`
|
3388 |
|
3389 | </details>
|
3390 |
|
3391 | <details>
|
3392 | <summary><b>Why are random text files valid?</b> (click to show)</summary>
|
3393 |
|
3394 | Excel is extremely aggressive in reading files. Adding an XLS extension to any
|
3395 | display text file (where the only characters are ANSI display chars) tricks
|
3396 | Excel into thinking that the file is potentially a CSV or TSV file, even if it
|
3397 | is only one column! This library attempts to replicate that behavior.
|
3398 |
|
3399 | The best approach is to validate the desired worksheet and ensure it has the
|
3400 | expected number of rows or columns. Extracting the range is extremely simple:
|
3401 |
|
3402 | ```js
|
3403 | var range = XLSX.utils.decode_range(worksheet['!ref']);
|
3404 | var ncols = range.e.c - range.s.c + 1, nrows = range.e.r - range.s.r + 1;
|
3405 | ```
|
3406 |
|
3407 | </details>
|
3408 |
|
3409 | ## Writing Options
|
3410 |
|
3411 | The exported `write` and `writeFile` functions accept an options argument:
|
3412 |
|
3413 | | Option Name | Default | Description |
|
3414 | | :---------- | -------: | :-------------------------------------------------- |
|
3415 | |`type` | | Output data encoding (see Output Type below) |
|
3416 | |`cellDates` | `false` | Store dates as type `d` (default is `n`) |
|
3417 | |`bookSST` | `false` | Generate Shared String Table ** |
|
3418 | |`bookType` | `"xlsx"` | Type of Workbook (see below for supported formats) |
|
3419 | |`sheet` | `""` | Name of Worksheet for single-sheet formats ** |
|
3420 | |`compression`| `false` | Use ZIP compression for ZIP-based formats ** |
|
3421 | |`Props` | | Override workbook properties when writing ** |
|
3422 | |`themeXLSX` | | Override theme XML when writing XLSX/XLSB/XLSM ** |
|
3423 | |`ignoreEC` | `true` | Suppress "number as text" errors ** |
|
3424 | |`numbers` | | Payload for NUMBERS export ** |
|
3425 |
|
3426 | - `bookSST` is slower and more memory intensive, but has better compatibility
|
3427 | with older versions of iOS Numbers
|
3428 | - The raw data is the only thing guaranteed to be saved. Features not described
|
3429 | in this README may not be serialized.
|
3430 | - `cellDates` only applies to XLSX output and is not guaranteed to work with
|
3431 | third-party readers. Excel itself does not usually write cells with type `d`
|
3432 | so non-Excel tools may ignore the data or error in the presence of dates.
|
3433 | - `Props` is an object mirroring the workbook `Props` field. See the table from
|
3434 | the [Workbook File Properties](#workbook-file-properties) section.
|
3435 | - if specified, the string from `themeXLSX` will be saved as the primary theme
|
3436 | for XLSX/XLSB/XLSM files (to `xl/theme/theme1.xml` in the ZIP)
|
3437 | - Due to a bug in the program, some features like "Text to Columns" will crash
|
3438 | Excel on worksheets where error conditions are ignored. The writer will mark
|
3439 | files to ignore the error by default. Set `ignoreEC` to `false` to suppress.
|
3440 | - Due to the size of the data, the NUMBERS data is not included by default. The
|
3441 | included `xlsx.zahl.js` and `xlsx.zahl.mjs` scripts include the data.
|
3442 |
|
3443 | ### Supported Output Formats
|
3444 |
|
3445 | For broad compatibility with third-party tools, this library supports many
|
3446 | output formats. The specific file type is controlled with `bookType` option:
|
3447 |
|
3448 | | `bookType` | file ext | container | sheets | Description |
|
3449 | | :--------- | -------: | :-------: | :----- |:------------------------------- |
|
3450 | | `xlsx` | `.xlsx` | ZIP | multi | Excel 2007+ XML Format |
|
3451 | | `xlsm` | `.xlsm` | ZIP | multi | Excel 2007+ Macro XML Format |
|
3452 | | `xlsb` | `.xlsb` | ZIP | multi | Excel 2007+ Binary Format |
|
3453 | | `biff8` | `.xls` | CFB | multi | Excel 97-2004 Workbook Format |
|
3454 | | `biff5` | `.xls` | CFB | multi | Excel 5.0/95 Workbook Format |
|
3455 | | `biff4` | `.xls` | none | single | Excel 4.0 Worksheet Format |
|
3456 | | `biff3` | `.xls` | none | single | Excel 3.0 Worksheet Format |
|
3457 | | `biff2` | `.xls` | none | single | Excel 2.0 Worksheet Format |
|
3458 | | `xlml` | `.xls` | none | multi | Excel 2003-2004 (SpreadsheetML) |
|
3459 | | `numbers` |`.numbers`| ZIP | single | Numbers 3.0+ Spreadsheet |
|
3460 | | `ods` | `.ods` | ZIP | multi | OpenDocument Spreadsheet |
|
3461 | | `fods` | `.fods` | none | multi | Flat OpenDocument Spreadsheet |
|
3462 | | `wk3` | `.wk3` | none | multi | Lotus Workbook (WK3) |
|
3463 | | `csv` | `.csv` | none | single | Comma Separated Values |
|
3464 | | `txt` | `.txt` | none | single | UTF-16 Unicode Text (TXT) |
|
3465 | | `sylk` | `.sylk` | none | single | Symbolic Link (SYLK) |
|
3466 | | `html` | `.html` | none | single | HTML Document |
|
3467 | | `dif` | `.dif` | none | single | Data Interchange Format (DIF) |
|
3468 | | `dbf` | `.dbf` | none | single | dBASE II + VFP Extensions (DBF) |
|
3469 | | `wk1` | `.wk1` | none | single | Lotus Worksheet (WK1) |
|
3470 | | `rtf` | `.rtf` | none | single | Rich Text Format (RTF) |
|
3471 | | `prn` | `.prn` | none | single | Lotus Formatted Text |
|
3472 | | `eth` | `.eth` | none | single | Ethercalc Record Format (ETH) |
|
3473 |
|
3474 | - `compression` only applies to formats with ZIP containers.
|
3475 | - Formats that only support a single sheet require a `sheet` option specifying
|
3476 | the worksheet. If the string is empty, the first worksheet is used.
|
3477 | - `writeFile` will automatically guess the output file format based on the file
|
3478 | extension if `bookType` is not specified. It will choose the first format in
|
3479 | the aforementioned table that matches the extension.
|
3480 |
|
3481 | ### Output Type
|
3482 |
|
3483 | The `type` argument for `write` mirrors the `type` argument for `read`:
|
3484 |
|
3485 | | `type` | output |
|
3486 | |------------|-----------------------------------------------------------------|
|
3487 | | `"base64"` | string: Base64 encoding of the file |
|
3488 | | `"binary"` | string: binary string (byte `n` is `data.charCodeAt(n)`) |
|
3489 | | `"string"` | string: JS string (characters interpreted as UTF8) |
|
3490 | | `"buffer"` | nodejs Buffer |
|
3491 | | `"array"` | ArrayBuffer, fallback array of 8-bit unsigned int |
|
3492 | | `"file"` | string: path of file that will be created (nodejs only) |
|
3493 |
|
3494 | - For compatibility with Excel, `csv` output will always include the UTF-8 byte
|
3495 | order mark.
|
3496 |
|
3497 | ## Utility Functions
|
3498 |
|
3499 | The `sheet_to_*` functions accept a worksheet and an optional options object.
|
3500 |
|
3501 | The `*_to_sheet` functions accept a data object and an optional options object.
|
3502 |
|
3503 | The examples are based on the following worksheet:
|
3504 |
|
3505 | ```
|
3506 | XXX| A | B | C | D | E | F | G |
|
3507 | ---+---+---+---+---+---+---+---+
|
3508 | 1 | S | h | e | e | t | J | S |
|
3509 | 2 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|
3510 | 3 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|
3511 | ```
|
3512 |
|
3513 | ### Array of Arrays Input
|
3514 |
|
3515 | `XLSX.utils.aoa_to_sheet` takes an array of arrays of JS values and returns a
|
3516 | worksheet resembling the input data. Numbers, Booleans and Strings are stored
|
3517 | as the corresponding styles. Dates are stored as date or numbers. Array holes
|
3518 | and explicit `undefined` values are skipped. `null` values may be stubbed. All
|
3519 | other values are stored as strings. The function takes an options argument:
|
3520 |
|
3521 | | Option Name | Default | Description |
|
3522 | | :---------- | :-----: | :--------------------------------------------------- |
|
3523 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3524 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3525 | |`sheetStubs` | false | Create cell objects of type `z` for `null` values |
|
3526 | |`nullError` | false | If true, emit `#NULL!` error cells for `null` values |
|
3527 |
|
3528 | <details>
|
3529 | <summary><b>Examples</b> (click to show)</summary>
|
3530 |
|
3531 | To generate the example sheet:
|
3532 |
|
3533 | ```js
|
3534 | var ws = XLSX.utils.aoa_to_sheet([
|
3535 | "SheetJS".split(""),
|
3536 | [1,2,3,4,5,6,7],
|
3537 | [2,3,4,5,6,7,8]
|
3538 | ]);
|
3539 | ```
|
3540 | </details>
|
3541 |
|
3542 | `XLSX.utils.sheet_add_aoa` takes an array of arrays of JS values and updates an
|
3543 | existing worksheet object. It follows the same process as `aoa_to_sheet` and
|
3544 | accepts an options argument:
|
3545 |
|
3546 | | Option Name | Default | Description |
|
3547 | | :---------- | :-----: | :--------------------------------------------------- |
|
3548 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3549 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3550 | |`sheetStubs` | false | Create cell objects of type `z` for `null` values |
|
3551 | |`nullError` | false | If true, emit `#NULL!` error cells for `null` values |
|
3552 | |`origin` | | Use specified cell as starting point (see below) |
|
3553 |
|
3554 | `origin` is expected to be one of:
|
3555 |
|
3556 | | `origin` | Description |
|
3557 | | :--------------- | :-------------------------------------------------------- |
|
3558 | | (cell object) | Use specified cell (cell object) |
|
3559 | | (string) | Use specified cell (A1-style cell) |
|
3560 | | (number >= 0) | Start from the first column at specified row (0-indexed) |
|
3561 | | -1 | Append to bottom of worksheet starting on first column |
|
3562 | | (default) | Start from cell A1 |
|
3563 |
|
3564 |
|
3565 | <details>
|
3566 | <summary><b>Examples</b> (click to show)</summary>
|
3567 |
|
3568 | Consider the worksheet:
|
3569 |
|
3570 | ```
|
3571 | XXX| A | B | C | D | E | F | G |
|
3572 | ---+---+---+---+---+---+---+---+
|
3573 | 1 | S | h | e | e | t | J | S |
|
3574 | 2 | 1 | 2 | | | 5 | 6 | 7 |
|
3575 | 3 | 2 | 3 | | | 6 | 7 | 8 |
|
3576 | 4 | 3 | 4 | | | 7 | 8 | 9 |
|
3577 | 5 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
|
3578 | ```
|
3579 |
|
3580 | This worksheet can be built up in the order `A1:G1, A2:B4, E2:G4, A5:G5`:
|
3581 |
|
3582 | ```js
|
3583 | /* Initial row */
|
3584 | var ws = XLSX.utils.aoa_to_sheet([ "SheetJS".split("") ]);
|
3585 |
|
3586 | /* Write data starting at A2 */
|
3587 | XLSX.utils.sheet_add_aoa(ws, [[1,2], [2,3], [3,4]], {origin: "A2"});
|
3588 |
|
3589 | /* Write data starting at E2 */
|
3590 | XLSX.utils.sheet_add_aoa(ws, [[5,6,7], [6,7,8], [7,8,9]], {origin:{r:1, c:4}});
|
3591 |
|
3592 | /* Append row */
|
3593 | XLSX.utils.sheet_add_aoa(ws, [[4,5,6,7,8,9,0]], {origin: -1});
|
3594 | ```
|
3595 |
|
3596 | </details>
|
3597 |
|
3598 | ### Array of Objects Input
|
3599 |
|
3600 | `XLSX.utils.json_to_sheet` takes an array of objects and returns a worksheet
|
3601 | with automatically-generated "headers" based on the keys of the objects. The
|
3602 | default column order is determined by the first appearance of the field using
|
3603 | `Object.keys`. The function accepts an options argument:
|
3604 |
|
3605 | | Option Name | Default | Description |
|
3606 | | :---------- | :-----: | :--------------------------------------------------- |
|
3607 | |`header` | | Use specified field order (default `Object.keys`) ** |
|
3608 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3609 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3610 | |`skipHeader` | false | If true, do not include header row in output |
|
3611 | |`nullError` | false | If true, emit `#NULL!` error cells for `null` values |
|
3612 |
|
3613 | - All fields from each row will be written. If `header` is an array and it does
|
3614 | not contain a particular field, the key will be appended to the array.
|
3615 | - Cell types are deduced from the type of each value. For example, a `Date`
|
3616 | object will generate a Date cell, while a string will generate a Text cell.
|
3617 | - Null values will be skipped by default. If `nullError` is true, an error cell
|
3618 | corresponding to `#NULL!` will be written to the worksheet.
|
3619 |
|
3620 | <details>
|
3621 | <summary><b>Examples</b> (click to show)</summary>
|
3622 |
|
3623 | The original sheet cannot be reproduced using plain objects since JS object keys
|
3624 | must be unique. After replacing the second `e` and `S` with `e_1` and `S_1`:
|
3625 |
|
3626 | ```js
|
3627 | var ws = XLSX.utils.json_to_sheet([
|
3628 | { S:1, h:2, e:3, e_1:4, t:5, J:6, S_1:7 },
|
3629 | { S:2, h:3, e:4, e_1:5, t:6, J:7, S_1:8 }
|
3630 | ], {header:["S","h","e","e_1","t","J","S_1"]});
|
3631 | ```
|
3632 |
|
3633 | Alternatively, the header row can be skipped:
|
3634 |
|
3635 | ```js
|
3636 | var ws = XLSX.utils.json_to_sheet([
|
3637 | { A:"S", B:"h", C:"e", D:"e", E:"t", F:"J", G:"S" },
|
3638 | { A: 1, B: 2, C: 3, D: 4, E: 5, F: 6, G: 7 },
|
3639 | { A: 2, B: 3, C: 4, D: 5, E: 6, F: 7, G: 8 }
|
3640 | ], {header:["A","B","C","D","E","F","G"], skipHeader:true});
|
3641 | ```
|
3642 |
|
3643 | </details>
|
3644 |
|
3645 | `XLSX.utils.sheet_add_json` takes an array of objects and updates an existing
|
3646 | worksheet object. It follows the same process as `json_to_sheet` and accepts
|
3647 | an options argument:
|
3648 |
|
3649 | | Option Name | Default | Description |
|
3650 | | :---------- | :-----: | :--------------------------------------------------- |
|
3651 | |`header` | | Use specified column order (default `Object.keys`) |
|
3652 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3653 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3654 | |`skipHeader` | false | If true, do not include header row in output |
|
3655 | |`nullError` | false | If true, emit `#NULL!` error cells for `null` values |
|
3656 | |`origin` | | Use specified cell as starting point (see below) |
|
3657 |
|
3658 | `origin` is expected to be one of:
|
3659 |
|
3660 | | `origin` | Description |
|
3661 | | :--------------- | :-------------------------------------------------------- |
|
3662 | | (cell object) | Use specified cell (cell object) |
|
3663 | | (string) | Use specified cell (A1-style cell) |
|
3664 | | (number >= 0) | Start from the first column at specified row (0-indexed) |
|
3665 | | -1 | Append to bottom of worksheet starting on first column |
|
3666 | | (default) | Start from cell A1 |
|
3667 |
|
3668 |
|
3669 | <details>
|
3670 | <summary><b>Examples</b> (click to show)</summary>
|
3671 |
|
3672 | Consider the worksheet:
|
3673 |
|
3674 | ```
|
3675 | XXX| A | B | C | D | E | F | G |
|
3676 | ---+---+---+---+---+---+---+---+
|
3677 | 1 | S | h | e | e | t | J | S |
|
3678 | 2 | 1 | 2 | | | 5 | 6 | 7 |
|
3679 | 3 | 2 | 3 | | | 6 | 7 | 8 |
|
3680 | 4 | 3 | 4 | | | 7 | 8 | 9 |
|
3681 | 5 | 4 | 5 | 6 | 7 | 8 | 9 | 0 |
|
3682 | ```
|
3683 |
|
3684 | This worksheet can be built up in the order `A1:G1, A2:B4, E2:G4, A5:G5`:
|
3685 |
|
3686 | ```js
|
3687 | /* Initial row */
|
3688 | var ws = XLSX.utils.json_to_sheet([
|
3689 | { A: "S", B: "h", C: "e", D: "e", E: "t", F: "J", G: "S" }
|
3690 | ], {header: ["A", "B", "C", "D", "E", "F", "G"], skipHeader: true});
|
3691 |
|
3692 | /* Write data starting at A2 */
|
3693 | XLSX.utils.sheet_add_json(ws, [
|
3694 | { A: 1, B: 2 }, { A: 2, B: 3 }, { A: 3, B: 4 }
|
3695 | ], {skipHeader: true, origin: "A2"});
|
3696 |
|
3697 | /* Write data starting at E2 */
|
3698 | XLSX.utils.sheet_add_json(ws, [
|
3699 | { A: 5, B: 6, C: 7 }, { A: 6, B: 7, C: 8 }, { A: 7, B: 8, C: 9 }
|
3700 | ], {skipHeader: true, origin: { r: 1, c: 4 }, header: [ "A", "B", "C" ]});
|
3701 |
|
3702 | /* Append row */
|
3703 | XLSX.utils.sheet_add_json(ws, [
|
3704 | { A: 4, B: 5, C: 6, D: 7, E: 8, F: 9, G: 0 }
|
3705 | ], {header: ["A", "B", "C", "D", "E", "F", "G"], skipHeader: true, origin: -1});
|
3706 | ```
|
3707 |
|
3708 | </details>
|
3709 |
|
3710 | ### HTML Table Input
|
3711 |
|
3712 | `XLSX.utils.table_to_sheet` takes a table DOM element and returns a worksheet
|
3713 | resembling the input table. Numbers are parsed. All other data will be stored
|
3714 | as strings.
|
3715 |
|
3716 | `XLSX.utils.table_to_book` produces a minimal workbook based on the worksheet.
|
3717 |
|
3718 | Both functions accept options arguments:
|
3719 |
|
3720 | | Option Name | Default | Description |
|
3721 | | :---------- | :------: | :-------------------------------------------------- |
|
3722 | |`raw` | | If true, every cell will hold raw strings |
|
3723 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3724 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3725 | |`sheetRows` | 0 | If >0, read the first `sheetRows` rows of the table |
|
3726 | |`display` | false | If true, hidden rows and cells will not be parsed |
|
3727 |
|
3728 |
|
3729 | <details>
|
3730 | <summary><b>Examples</b> (click to show)</summary>
|
3731 |
|
3732 | To generate the example sheet, start with the HTML table:
|
3733 |
|
3734 | ```html
|
3735 | <table id="sheetjs">
|
3736 | <tr><td>S</td><td>h</td><td>e</td><td>e</td><td>t</td><td>J</td><td>S</td></tr>
|
3737 | <tr><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td></tr>
|
3738 | <tr><td>2</td><td>3</td><td>4</td><td>5</td><td>6</td><td>7</td><td>8</td></tr>
|
3739 | </table>
|
3740 | ```
|
3741 |
|
3742 | To process the table:
|
3743 |
|
3744 | ```js
|
3745 | var tbl = document.getElementById('sheetjs');
|
3746 | var wb = XLSX.utils.table_to_book(tbl);
|
3747 | ```
|
3748 | </details>
|
3749 |
|
3750 | Note: `XLSX.read` can handle HTML represented as strings.
|
3751 |
|
3752 |
|
3753 | `XLSX.utils.sheet_add_dom` takes a table DOM element and updates an existing
|
3754 | worksheet object. It follows the same process as `table_to_sheet` and accepts
|
3755 | an options argument:
|
3756 |
|
3757 | | Option Name | Default | Description |
|
3758 | | :---------- | :------: | :-------------------------------------------------- |
|
3759 | |`raw` | | If true, every cell will hold raw strings |
|
3760 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3761 | |`cellDates` | false | Store dates as type `d` (default is `n`) |
|
3762 | |`sheetRows` | 0 | If >0, read the first `sheetRows` rows of the table |
|
3763 | |`display` | false | If true, hidden rows and cells will not be parsed |
|
3764 |
|
3765 | `origin` is expected to be one of:
|
3766 |
|
3767 | | `origin` | Description |
|
3768 | | :--------------- | :-------------------------------------------------------- |
|
3769 | | (cell object) | Use specified cell (cell object) |
|
3770 | | (string) | Use specified cell (A1-style cell) |
|
3771 | | (number >= 0) | Start from the first column at specified row (0-indexed) |
|
3772 | | -1 | Append to bottom of worksheet starting on first column |
|
3773 | | (default) | Start from cell A1 |
|
3774 |
|
3775 |
|
3776 | <details>
|
3777 | <summary><b>Examples</b> (click to show)</summary>
|
3778 |
|
3779 | A small helper function can create gap rows between tables:
|
3780 |
|
3781 | ```js
|
3782 | function create_gap_rows(ws, nrows) {
|
3783 | var ref = XLSX.utils.decode_range(ws["!ref"]); // get original range
|
3784 | ref.e.r += nrows; // add to ending row
|
3785 | ws["!ref"] = XLSX.utils.encode_range(ref); // reassign row
|
3786 | }
|
3787 |
|
3788 | /* first table */
|
3789 | var ws = XLSX.utils.table_to_sheet(document.getElementById('table1'));
|
3790 | create_gap_rows(ws, 1); // one row gap after first table
|
3791 |
|
3792 | /* second table */
|
3793 | XLSX.utils.sheet_add_dom(ws, document.getElementById('table2'), {origin: -1});
|
3794 | create_gap_rows(ws, 3); // three rows gap after second table
|
3795 |
|
3796 | /* third table */
|
3797 | XLSX.utils.sheet_add_dom(ws, document.getElementById('table3'), {origin: -1});
|
3798 | ```
|
3799 |
|
3800 | </details>
|
3801 |
|
3802 | ### Formulae Output
|
3803 |
|
3804 | `XLSX.utils.sheet_to_formulae` generates an array of commands that represent
|
3805 | how a person would enter data into an application. Each entry is of the form
|
3806 | `A1-cell-address=formula-or-value`. String literals are prefixed with a `'` in
|
3807 | accordance with Excel.
|
3808 |
|
3809 | <details>
|
3810 | <summary><b>Examples</b> (click to show)</summary>
|
3811 |
|
3812 | For the example sheet:
|
3813 |
|
3814 | ```js
|
3815 | > var o = XLSX.utils.sheet_to_formulae(ws);
|
3816 | > [o[0], o[5], o[10], o[15], o[20]];
|
3817 | [ 'A1=\'S', 'F1=\'J', 'D2=4', 'B3=3', 'G3=8' ]
|
3818 | ```
|
3819 | </details>
|
3820 |
|
3821 | ### Delimiter-Separated Output
|
3822 |
|
3823 | As an alternative to the `writeFile` CSV type, `XLSX.utils.sheet_to_csv` also
|
3824 | produces CSV output. The function takes an options argument:
|
3825 |
|
3826 | | Option Name | Default | Description |
|
3827 | | :----------- | :------: | :------------------------------------------------- |
|
3828 | |`FS` | `","` | "Field Separator" delimiter between fields |
|
3829 | |`RS` | `"\n"` | "Record Separator" delimiter between rows |
|
3830 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3831 | |`strip` | false | Remove trailing field separators in each record ** |
|
3832 | |`blankrows` | true | Include blank lines in the CSV output |
|
3833 | |`skipHidden` | false | Skips hidden rows/columns in the CSV output |
|
3834 | |`forceQuotes` | false | Force quotes around fields |
|
3835 |
|
3836 | - `strip` will remove trailing commas from each line under default `FS/RS`
|
3837 | - `blankrows` must be set to `false` to skip blank lines.
|
3838 | - Fields containing the record or field separator will automatically be wrapped
|
3839 | in double quotes; `forceQuotes` forces all cells to be wrapped in quotes.
|
3840 | - `XLSX.write` with `csv` type will always prepend the UTF-8 byte-order mark for
|
3841 | Excel compatibility. `sheet_to_csv` returns a JS string and omits the mark.
|
3842 | Using `XLSX.write` with type `string` will also skip the mark.
|
3843 |
|
3844 |
|
3845 | <details>
|
3846 | <summary><b>Examples</b> (click to show)</summary>
|
3847 |
|
3848 | For the example sheet:
|
3849 |
|
3850 | ```js
|
3851 | > console.log(XLSX.utils.sheet_to_csv(ws));
|
3852 | S,h,e,e,t,J,S
|
3853 | 1,2,3,4,5,6,7
|
3854 | 2,3,4,5,6,7,8
|
3855 | > console.log(XLSX.utils.sheet_to_csv(ws, {FS:"\t"}));
|
3856 | S h e e t J S
|
3857 | 1 2 3 4 5 6 7
|
3858 | 2 3 4 5 6 7 8
|
3859 | > console.log(XLSX.utils.sheet_to_csv(ws,{FS:":",RS:"|"}));
|
3860 | S:h:e:e:t:J:S|1:2:3:4:5:6:7|2:3:4:5:6:7:8|
|
3861 | ```
|
3862 | </details>
|
3863 |
|
3864 | #### UTF-16 Unicode Text
|
3865 |
|
3866 | The `txt` output type uses the tab character as the field separator. If the
|
3867 | `codepage` library is available (included in full distribution but not core),
|
3868 | the output will be encoded in `CP1200` and the BOM will be prepended.
|
3869 |
|
3870 | `XLSX.utils.sheet_to_txt` takes the same arguments as `sheet_to_csv`.
|
3871 |
|
3872 | ### HTML Output
|
3873 |
|
3874 | As an alternative to the `writeFile` HTML type, `XLSX.utils.sheet_to_html` also
|
3875 | produces HTML output. The function takes an options argument:
|
3876 |
|
3877 | | Option Name | Default | Description |
|
3878 | | :---------- | :------: | :-------------------------------------------------- |
|
3879 | |`id` | | Specify the `id` attribute for the `TABLE` element |
|
3880 | |`editable` | false | If true, set `contenteditable="true"` for every TD |
|
3881 | |`header` | | Override header (default `html body`) |
|
3882 | |`footer` | | Override footer (default `/body /html`) |
|
3883 |
|
3884 | <details>
|
3885 | <summary><b>Examples</b> (click to show)</summary>
|
3886 |
|
3887 | For the example sheet:
|
3888 |
|
3889 | ```js
|
3890 | > console.log(XLSX.utils.sheet_to_html(ws));
|
3891 | // ...
|
3892 | ```
|
3893 | </details>
|
3894 |
|
3895 | ### JSON
|
3896 |
|
3897 | `XLSX.utils.sheet_to_json` generates different types of JS objects. The function
|
3898 | takes an options argument:
|
3899 |
|
3900 | | Option Name | Default | Description |
|
3901 | | :---------- | :------: | :-------------------------------------------------- |
|
3902 | |`raw` | `true` | Use raw values (true) or formatted strings (false) |
|
3903 | |`range` | from WS | Override Range (see table below) |
|
3904 | |`header` | | Control output format (see table below) |
|
3905 | |`dateNF` | FMT 14 | Use specified date format in string output |
|
3906 | |`defval` | | Use specified value in place of null or undefined |
|
3907 | |`blankrows` | ** | Include blank lines in the output ** |
|
3908 |
|
3909 | - `raw` only affects cells which have a format code (`.z`) field or a formatted
|
3910 | text (`.w`) field.
|
3911 | - If `header` is specified, the first row is considered a data row; if `header`
|
3912 | is not specified, the first row is the header row and not considered data.
|
3913 | - When `header` is not specified, the conversion will automatically disambiguate
|
3914 | header entries by affixing `_` and a count starting at `1`. For example, if
|
3915 | three columns have header `foo` the output fields are `foo`, `foo_1`, `foo_2`
|
3916 | - `null` values are returned when `raw` is true but are skipped when false.
|
3917 | - If `defval` is not specified, null and undefined values are skipped normally.
|
3918 | If specified, all null and undefined points will be filled with `defval`
|
3919 | - When `header` is `1`, the default is to generate blank rows. `blankrows` must
|
3920 | be set to `false` to skip blank rows.
|
3921 | - When `header` is not `1`, the default is to skip blank rows. `blankrows` must
|
3922 | be true to generate blank rows
|
3923 |
|
3924 | `range` is expected to be one of:
|
3925 |
|
3926 | | `range` | Description |
|
3927 | | :--------------- | :-------------------------------------------------------- |
|
3928 | | (number) | Use worksheet range but set starting row to the value |
|
3929 | | (string) | Use specified range (A1-style bounded range string) |
|
3930 | | (default) | Use worksheet range (`ws['!ref']`) |
|
3931 |
|
3932 | `header` is expected to be one of:
|
3933 |
|
3934 | | `header` | Description |
|
3935 | | :--------------- | :-------------------------------------------------------- |
|
3936 | | `1` | Generate an array of arrays ("2D Array") |
|
3937 | | `"A"` | Row object keys are literal column labels |
|
3938 | | array of strings | Use specified strings as keys in row objects |
|
3939 | | (default) | Read and disambiguate first row as keys |
|
3940 |
|
3941 | - If header is not `1`, the row object will contain the non-enumerable property
|
3942 | `__rowNum__` that represents the row of the sheet corresponding to the entry.
|
3943 | - If header is an array, the keys will not be disambiguated. This can lead to
|
3944 | unexpected results if the array values are not unique!
|
3945 |
|
3946 |
|
3947 | <details>
|
3948 | <summary><b>Examples</b> (click to show)</summary>
|
3949 |
|
3950 | For the example sheet:
|
3951 |
|
3952 | ```js
|
3953 | > XLSX.utils.sheet_to_json(ws);
|
3954 | [ { S: 1, h: 2, e: 3, e_1: 4, t: 5, J: 6, S_1: 7 },
|
3955 | { S: 2, h: 3, e: 4, e_1: 5, t: 6, J: 7, S_1: 8 } ]
|
3956 |
|
3957 | > XLSX.utils.sheet_to_json(ws, {header:"A"});
|
3958 | [ { A: 'S', B: 'h', C: 'e', D: 'e', E: 't', F: 'J', G: 'S' },
|
3959 | { A: '1', B: '2', C: '3', D: '4', E: '5', F: '6', G: '7' },
|
3960 | { A: '2', B: '3', C: '4', D: '5', E: '6', F: '7', G: '8' } ]
|
3961 |
|
3962 | > XLSX.utils.sheet_to_json(ws, {header:["A","E","I","O","U","6","9"]});
|
3963 | [ { '6': 'J', '9': 'S', A: 'S', E: 'h', I: 'e', O: 'e', U: 't' },
|
3964 | { '6': '6', '9': '7', A: '1', E: '2', I: '3', O: '4', U: '5' },
|
3965 | { '6': '7', '9': '8', A: '2', E: '3', I: '4', O: '5', U: '6' } ]
|
3966 |
|
3967 | > XLSX.utils.sheet_to_json(ws, {header:1});
|
3968 | [ [ 'S', 'h', 'e', 'e', 't', 'J', 'S' ],
|
3969 | [ '1', '2', '3', '4', '5', '6', '7' ],
|
3970 | [ '2', '3', '4', '5', '6', '7', '8' ] ]
|
3971 | ```
|
3972 |
|
3973 | Example showing the effect of `raw`:
|
3974 |
|
3975 | ```js
|
3976 | > ws['A2'].w = "3"; // set A2 formatted string value
|
3977 |
|
3978 | > XLSX.utils.sheet_to_json(ws, {header:1, raw:false});
|
3979 | [ [ 'S', 'h', 'e', 'e', 't', 'J', 'S' ],
|
3980 | [ '3', '2', '3', '4', '5', '6', '7' ], // <-- A2 uses the formatted string
|
3981 | [ '2', '3', '4', '5', '6', '7', '8' ] ]
|
3982 |
|
3983 | > XLSX.utils.sheet_to_json(ws, {header:1});
|
3984 | [ [ 'S', 'h', 'e', 'e', 't', 'J', 'S' ],
|
3985 | [ 1, 2, 3, 4, 5, 6, 7 ], // <-- A2 uses the raw value
|
3986 | [ 2, 3, 4, 5, 6, 7, 8 ] ]
|
3987 | ```
|
3988 | </details>
|
3989 |
|
3990 | ## File Formats
|
3991 |
|
3992 | Despite the library name `xlsx`, it supports numerous spreadsheet file formats:
|
3993 |
|
3994 | | Format | Read | Write |
|
3995 | |:-------------------------------------------------------------|:-----:|:-----:|
|
3996 | | **Excel Worksheet/Workbook Formats** |:-----:|:-----:|
|
3997 | | Excel 2007+ XML Formats (XLSX/XLSM) | ✔ | ✔ |
|
3998 | | Excel 2007+ Binary Format (XLSB BIFF12) | ✔ | ✔ |
|
3999 | | Excel 2003-2004 XML Format (XML "SpreadsheetML") | ✔ | ✔ |
|
4000 | | Excel 97-2004 (XLS BIFF8) | ✔ | ✔ |
|
4001 | | Excel 5.0/95 (XLS BIFF5) | ✔ | ✔ |
|
4002 | | Excel 4.0 (XLS/XLW BIFF4) | ✔ | ✔ |
|
4003 | | Excel 3.0 (XLS BIFF3) | ✔ | ✔ |
|
4004 | | Excel 2.0/2.1 (XLS BIFF2) | ✔ | ✔ |
|
4005 | | **Excel Supported Text Formats** |:-----:|:-----:|
|
4006 | | Delimiter-Separated Values (CSV/TXT) | ✔ | ✔ |
|
4007 | | Data Interchange Format (DIF) | ✔ | ✔ |
|
4008 | | Symbolic Link (SYLK/SLK) | ✔ | ✔ |
|
4009 | | Lotus Formatted Text (PRN) | ✔ | ✔ |
|
4010 | | UTF-16 Unicode Text (TXT) | ✔ | ✔ |
|
4011 | | **Other Workbook/Worksheet Formats** |:-----:|:-----:|
|
4012 | | Numbers 3.0+ / iWork 2013+ Spreadsheet (NUMBERS) | ✔ | ✔ |
|
4013 | | OpenDocument Spreadsheet (ODS) | ✔ | ✔ |
|
4014 | | Flat XML ODF Spreadsheet (FODS) | ✔ | ✔ |
|
4015 | | Uniform Office Format Spreadsheet (标文通 UOS1/UOS2) | ✔ | |
|
4016 | | dBASE II/III/IV / Visual FoxPro (DBF) | ✔ | ✔ |
|
4017 | | Lotus 1-2-3 (WK1/WK3) | ✔ | ✔ |
|
4018 | | Lotus 1-2-3 (WKS/WK2/WK4/123) | ✔ | |
|
4019 | | Quattro Pro Spreadsheet (WQ1/WQ2/WB1/WB2/WB3/QPW) | ✔ | |
|
4020 | | Works 1.x-3.x DOS / 2.x-5.x Windows Spreadsheet (WKS) | ✔ | |
|
4021 | | Works 6.x-9.x Spreadsheet (XLR) | ✔ | |
|
4022 | | **Other Common Spreadsheet Output Formats** |:-----:|:-----:|
|
4023 | | HTML Tables | ✔ | ✔ |
|
4024 | | Rich Text Format tables (RTF) | | ✔ |
|
4025 | | Ethercalc Record Format (ETH) | ✔ | ✔ |
|
4026 |
|
4027 | Features not supported by a given file format will not be written. Formats with
|
4028 | range limits will be silently truncated:
|
4029 |
|
4030 | | Format | Last Cell | Max Cols | Max Rows |
|
4031 | |:------------------------------------------|:-----------|---------:|---------:|
|
4032 | | Excel 2007+ XML Formats (XLSX/XLSM) | XFD1048576 | 16384 | 1048576 |
|
4033 | | Excel 2007+ Binary Format (XLSB BIFF12) | XFD1048576 | 16384 | 1048576 |
|
4034 | | Excel 97-2004 (XLS BIFF8) | IV65536 | 256 | 65536 |
|
4035 | | Excel 5.0/95 (XLS BIFF5) | IV16384 | 256 | 16384 |
|
4036 | | Excel 4.0 (XLS BIFF4) | IV16384 | 256 | 16384 |
|
4037 | | Excel 3.0 (XLS BIFF3) | IV16384 | 256 | 16384 |
|
4038 | | Excel 2.0/2.1 (XLS BIFF2) | IV16384 | 256 | 16384 |
|
4039 | | Lotus 1-2-3 R2 - R5 (WK1/WK3/WK4) | IV8192 | 256 | 8192 |
|
4040 | | Lotus 1-2-3 R1 (WKS) | IV2048 | 256 | 2048 |
|
4041 |
|
4042 | Excel 2003 SpreadsheetML range limits are governed by the version of Excel and
|
4043 | are not enforced by the writer.
|
4044 |
|
4045 | <details>
|
4046 | <summary><b>File Format Details</b> (click to show)</summary>
|
4047 |
|
4048 | **Core Spreadsheet Formats**
|
4049 |
|
4050 | - **Excel 2007+ XML (XLSX/XLSM)**
|
4051 |
|
4052 | XLSX and XLSM files are ZIP containers containing a series of XML files in
|
4053 | accordance with the Open Packaging Conventions (OPC). The XLSM format, almost
|
4054 | identical to XLSX, is used for files containing macros.
|
4055 |
|
4056 | The format is standardized in ECMA-376 and later in ISO/IEC 29500. Excel does
|
4057 | not follow the specification, and there are additional documents discussing how
|
4058 | Excel deviates from the specification.
|
4059 |
|
4060 | - **Excel 2.0-95 (BIFF2/BIFF3/BIFF4/BIFF5)**
|
4061 |
|
4062 | BIFF 2/3 XLS are single-sheet streams of binary records. Excel 4 introduced
|
4063 | the concept of a workbook (`XLW` files) but also had single-sheet `XLS` format.
|
4064 | The structure is largely similar to the Lotus 1-2-3 file formats. BIFF5/8/12
|
4065 | extended the format in various ways but largely stuck to the same record format.
|
4066 |
|
4067 | There is no official specification for any of these formats. Excel 95 can write
|
4068 | files in these formats, so record lengths and fields were determined by writing
|
4069 | in all of the supported formats and comparing files. Excel 2016 can generate
|
4070 | BIFF5 files, enabling a full suite of file tests starting from XLSX or BIFF2.
|
4071 |
|
4072 | - **Excel 97-2004 Binary (BIFF8)**
|
4073 |
|
4074 | BIFF8 exclusively uses the Compound File Binary container format, splitting some
|
4075 | content into streams within the file. At its core, it still uses an extended
|
4076 | version of the binary record format from older versions of BIFF.
|
4077 |
|
4078 | The `MS-XLS` specification covers the basics of the file format, and other
|
4079 | specifications expand on serialization of features like properties.
|
4080 |
|
4081 | - **Excel 2003-2004 (SpreadsheetML)**
|
4082 |
|
4083 | Predating XLSX, SpreadsheetML files are simple XML files. There is no official
|
4084 | and comprehensive specification, although MS has released documentation on the
|
4085 | format. Since Excel 2016 can generate SpreadsheetML files, mapping features is
|
4086 | pretty straightforward.
|
4087 |
|
4088 | - **Excel 2007+ Binary (XLSB, BIFF12)**
|
4089 |
|
4090 | Introduced in parallel with XLSX, the XLSB format combines the BIFF architecture
|
4091 | with the content separation and ZIP container of XLSX. For the most part nodes
|
4092 | in an XLSX sub-file can be mapped to XLSB records in a corresponding sub-file.
|
4093 |
|
4094 | The `MS-XLSB` specification covers the basics of the file format, and other
|
4095 | specifications expand on serialization of features like properties.
|
4096 |
|
4097 | - **Delimiter-Separated Values (CSV/TXT)**
|
4098 |
|
4099 | Excel CSV deviates from RFC4180 in a number of important ways. The generated
|
4100 | CSV files should generally work in Excel although they may not work in RFC4180
|
4101 | compatible readers. The parser should generally understand Excel CSV. The
|
4102 | writer proactively generates cells for formulae if values are unavailable.
|
4103 |
|
4104 | Excel TXT uses tab as the delimiter and code page 1200.
|
4105 |
|
4106 | Like in Excel, files starting with `0x49 0x44 ("ID")` are treated as Symbolic
|
4107 | Link files. Unlike Excel, if the file does not have a valid SYLK header, it
|
4108 | will be proactively reinterpreted as CSV. There are some files with semicolon
|
4109 | delimiter that align with a valid SYLK file. For the broadest compatibility,
|
4110 | all cells with the value of `ID` are automatically wrapped in double-quotes.
|
4111 |
|
4112 | **Miscellaneous Workbook Formats**
|
4113 |
|
4114 | Support for other formats is generally far behind XLS/XLSB/XLSX support, due in
|
4115 | part to a lack of publicly available documentation. Test files were produced in
|
4116 | the respective apps and compared to their XLS exports to determine structure.
|
4117 | The main focus is data extraction.
|
4118 |
|
4119 | - **Lotus 1-2-3 (WKS/WK1/WK2/WK3/WK4/123)**
|
4120 |
|
4121 | The Lotus formats consist of binary records similar to the BIFF structure. Lotus
|
4122 | did release a specification decades ago covering the original WK1 format. Other
|
4123 | features were deduced by producing files and comparing to Excel support.
|
4124 |
|
4125 | Generated WK1 worksheets are compatible with Lotus 1-2-3 R2 and Excel 5.0.
|
4126 |
|
4127 | Generated WK3 workbooks are compatible with Lotus 1-2-3 R9 and Excel 5.0.
|
4128 |
|
4129 | - **Quattro Pro (WQ1/WQ2/WB1/WB2/WB3/QPW)**
|
4130 |
|
4131 | The Quattro Pro formats use binary records in the same way as BIFF and Lotus.
|
4132 | Some of the newer formats (namely WB3 and QPW) use a CFB enclosure just like
|
4133 | BIFF8 XLS.
|
4134 |
|
4135 | - **Works for DOS / Windows Spreadsheet (WKS/XLR)**
|
4136 |
|
4137 | All versions of Works were limited to a single worksheet.
|
4138 |
|
4139 | Works for DOS 1.x - 3.x and Works for Windows 2.x extends the Lotus WKS format
|
4140 | with additional record types.
|
4141 |
|
4142 | Works for Windows 3.x - 5.x uses the same format and WKS extension. The BOF
|
4143 | record has type `FF`
|
4144 |
|
4145 | Works for Windows 6.x - 9.x use the XLR format. XLR is nearly identical to
|
4146 | BIFF8 XLS: it uses the CFB container with a Workbook stream. Works 9 saves the
|
4147 | exact Workbook stream for the XLR and the 97-2003 XLS export. Works 6 XLS
|
4148 | includes two empty worksheets but the main worksheet has an identical encoding.
|
4149 | XLR also includes a `WksSSWorkBook` stream similar to Lotus FM3/FMT files.
|
4150 |
|
4151 | - **Numbers 3.0+ / iWork 2013+ Spreadsheet (NUMBERS)**
|
4152 |
|
4153 | iWork 2013 (Numbers 3.0 / Pages 5.0 / Keynote 6.0) switched from a proprietary
|
4154 | XML-based format to the current file format based on the iWork Archive (IWA).
|
4155 | This format has been used up through the current release (Numbers 11.2).
|
4156 |
|
4157 | The parser focuses on extracting raw data from tables. Numbers technically
|
4158 | supports multiple tables in a logical worksheet, including custom titles. This
|
4159 | parser will generate one worksheet per Numbers table.
|
4160 |
|
4161 | The writer currently exports a small range from the first worksheet.
|
4162 |
|
4163 | - **OpenDocument Spreadsheet (ODS/FODS)**
|
4164 |
|
4165 | ODS is an XML-in-ZIP format akin to XLSX while FODS is an XML format akin to
|
4166 | SpreadsheetML. Both are detailed in the OASIS standard, but tools like LO/OO
|
4167 | add undocumented extensions. The parsers and writers do not implement the full
|
4168 | standard, instead focusing on parts necessary to extract and store raw data.
|
4169 |
|
4170 | - **Uniform Office Spreadsheet (UOS1/2)**
|
4171 |
|
4172 | UOS is a very similar format, and it comes in 2 varieties corresponding to ODS
|
4173 | and FODS respectively. For the most part, the difference between the formats
|
4174 | is in the names of tags and attributes.
|
4175 |
|
4176 | **Miscellaneous Worksheet Formats**
|
4177 |
|
4178 | Many older formats supported only one worksheet:
|
4179 |
|
4180 | - **dBASE and Visual FoxPro (DBF)**
|
4181 |
|
4182 | DBF is really a typed table format: each column can only hold one data type and
|
4183 | each record omits type information. The parser generates a header row and
|
4184 | inserts records starting at the second row of the worksheet. The writer makes
|
4185 | files compatible with Visual FoxPro extensions.
|
4186 |
|
4187 | Multi-file extensions like external memos and tables are currently unsupported,
|
4188 | limited by the general ability to read arbitrary files in the web browser. The
|
4189 | reader understands DBF Level 7 extensions like DATETIME.
|
4190 |
|
4191 | - **Symbolic Link (SYLK)**
|
4192 |
|
4193 | There is no real documentation. All knowledge was gathered by saving files in
|
4194 | various versions of Excel to deduce the meaning of fields. Notes:
|
4195 |
|
4196 | - Plain formulae are stored in the RC form.
|
4197 | - Column widths are rounded to integral characters.
|
4198 |
|
4199 | - **Lotus Formatted Text (PRN)**
|
4200 |
|
4201 | There is no real documentation, and in fact Excel treats PRN as an output-only
|
4202 | file format. Nevertheless we can guess the column widths and reverse-engineer
|
4203 | the original layout. Excel's 240 character width limitation is not enforced.
|
4204 |
|
4205 | - **Data Interchange Format (DIF)**
|
4206 |
|
4207 | There is no unified definition. Visicalc DIF differs from Lotus DIF, and both
|
4208 | differ from Excel DIF. Where ambiguous, the parser/writer follows the expected
|
4209 | behavior from Excel. In particular, Excel extends DIF in incompatible ways:
|
4210 |
|
4211 | - Since Excel automatically converts numbers-as-strings to numbers, numeric
|
4212 | string constants are converted to formulae: `"0.3" -> "=""0.3""`
|
4213 | - DIF technically expects numeric cells to hold the raw numeric data, but Excel
|
4214 | permits formatted numbers (including dates)
|
4215 | - DIF technically has no support for formulae, but Excel will automatically
|
4216 | convert plain formulae. Array formulae are not preserved.
|
4217 |
|
4218 | - **HTML**
|
4219 |
|
4220 | Excel HTML worksheets include special metadata encoded in styles. For example,
|
4221 | `mso-number-format` is a localized string containing the number format. Despite
|
4222 | the metadata the output is valid HTML, although it does accept bare `&` symbols.
|
4223 |
|
4224 | The writer adds type metadata to the TD elements via the `t` tag. The parser
|
4225 | looks for those tags and overrides the default interpretation. For example, text
|
4226 | like `<td>12345</td>` will be parsed as numbers but `<td t="s">12345</td>` will
|
4227 | be parsed as text.
|
4228 |
|
4229 | - **Rich Text Format (RTF)**
|
4230 |
|
4231 | Excel RTF worksheets are stored in clipboard when copying cells or ranges from a
|
4232 | worksheet. The supported codes are a subset of the Word RTF support.
|
4233 |
|
4234 | - **Ethercalc Record Format (ETH)**
|
4235 |
|
4236 | [Ethercalc](https://ethercalc.net/) is an open source web spreadsheet powered by
|
4237 | a record format reminiscent of SYLK wrapped in a MIME multi-part message.
|
4238 |
|
4239 | </details>
|
4240 |
|
4241 |
|
4242 | ## Testing
|
4243 |
|
4244 | ### Node
|
4245 |
|
4246 | <details>
|
4247 | <summary>(click to show)</summary>
|
4248 |
|
4249 | `make test` will run the node-based tests. By default it runs tests on files in
|
4250 | every supported format. To test a specific file type, set `FMTS` to the format
|
4251 | you want to test. Feature-specific tests are available with `make test_misc`
|
4252 |
|
4253 | ```bash
|
4254 | $ make test_misc # run core tests
|
4255 | $ make test # run full tests
|
4256 | $ make test_xls # only use the XLS test files
|
4257 | $ make test_xlsx # only use the XLSX test files
|
4258 | $ make test_xlsb # only use the XLSB test files
|
4259 | $ make test_xml # only use the XML test files
|
4260 | $ make test_ods # only use the ODS test files
|
4261 | ```
|
4262 |
|
4263 | To enable all errors, set the environment variable `WTF=1`:
|
4264 |
|
4265 | ```bash
|
4266 | $ make test # run full tests
|
4267 | $ WTF=1 make test # enable all error messages
|
4268 | ```
|
4269 |
|
4270 | `flow` and `eslint` checks are available:
|
4271 |
|
4272 | ```bash
|
4273 | $ make lint # eslint checks
|
4274 | $ make flow # make lint + Flow checking
|
4275 | $ make tslint # check TS definitions
|
4276 | ```
|
4277 |
|
4278 | </details>
|
4279 |
|
4280 | ### Browser
|
4281 |
|
4282 | <details>
|
4283 | <summary>(click to show)</summary>
|
4284 |
|
4285 | The core in-browser tests are available at `tests/index.html` within this repo.
|
4286 | Start a local server and navigate to that directory to run the tests.
|
4287 | `make ctestserv` will start a server on port 8000.
|
4288 |
|
4289 | `make ctest` will generate the browser fixtures. To add more files, edit the
|
4290 | `tests/fixtures.lst` file and add the paths.
|
4291 |
|
4292 | To run the full in-browser tests, clone the repo for
|
4293 | [`oss.sheetjs.com`](https://github.com/SheetJS/SheetJS.github.io) and replace
|
4294 | the `xlsx.js` file (then open a browser window and go to `stress.html`):
|
4295 |
|
4296 | ```bash
|
4297 | $ cp xlsx.js ../SheetJS.github.io
|
4298 | $ cd ../SheetJS.github.io
|
4299 | $ simplehttpserver # or "python -mSimpleHTTPServer" or "serve"
|
4300 | $ open -a Chromium.app http://localhost:8000/stress.html
|
4301 | ```
|
4302 | </details>
|
4303 |
|
4304 | ### Tested Environments
|
4305 |
|
4306 | <details>
|
4307 | <summary>(click to show)</summary>
|
4308 |
|
4309 | - NodeJS `0.8`, `0.10`, `0.12`, `4.x`, `5.x`, `6.x`, `7.x`, `8.x`
|
4310 | - IE 6/7/8/9/10/11 (IE 6-9 require shims)
|
4311 | - Chrome 24+ (including Android 4.0+)
|
4312 | - Safari 6+ (iOS and Desktop)
|
4313 | - Edge 13+, FF 18+, and Opera 12+
|
4314 |
|
4315 | Tests utilize the mocha testing framework.
|
4316 |
|
4317 | - <https://saucelabs.com/u/sheetjs> for XLS\* modules using Sauce Labs
|
4318 |
|
4319 | The test suite also includes tests for various time zones. To change
|
4320 | the timezone locally, set the TZ environment variable:
|
4321 |
|
4322 | ```bash
|
4323 | $ env TZ="Asia/Kolkata" WTF=1 make test_misc
|
4324 | ```
|
4325 |
|
4326 | </details>
|
4327 |
|
4328 | ### Test Files
|
4329 |
|
4330 | Test files are housed in [another repo](https://github.com/SheetJS/test_files).
|
4331 |
|
4332 | Running `make init` will refresh the `test_files` submodule and get the files.
|
4333 | Note that this requires `svn`, `git`, `hg` and other commands that may not be
|
4334 | available. If `make init` fails, please download the latest version of the test
|
4335 | files snapshot from [the repo](https://github.com/SheetJS/test_files/releases)
|
4336 |
|
4337 | <details>
|
4338 | <summary><b>Latest Snapshot</b> (click to show)</summary>
|
4339 |
|
4340 | Latest test files snapshot:
|
4341 | <http://github.com/SheetJS/test_files/releases/download/20170409/test_files.zip>
|
4342 |
|
4343 | (download and unzip to the `test_files` subdirectory)
|
4344 |
|
4345 | </details>
|
4346 |
|
4347 | ## Contributing
|
4348 |
|
4349 | Due to the precarious nature of the Open Specifications Promise, it is very
|
4350 | important to ensure code is cleanroom. [Contribution Notes](CONTRIBUTING.md)
|
4351 |
|
4352 | <details>
|
4353 | <summary><b>File organization</b> (click to show)</summary>
|
4354 |
|
4355 | At a high level, the final script is a concatenation of the individual files in
|
4356 | the `bits` folder. Running `make` should reproduce the final output on all
|
4357 | platforms. The README is similarly split into bits in the `docbits` folder.
|
4358 |
|
4359 | Folders:
|
4360 |
|
4361 | | folder | contents |
|
4362 | |:-------------|:--------------------------------------------------------------|
|
4363 | | `bits` | raw source files that make up the final script |
|
4364 | | `docbits` | raw markdown files that make up `README.md` |
|
4365 | | `bin` | server-side bin scripts (`xlsx.njs`) |
|
4366 | | `dist` | dist files for web browsers and nonstandard JS environments |
|
4367 | | `demos` | demo projects for platforms like ExtendScript and Webpack |
|
4368 | | `tests` | browser tests (run `make ctest` to rebuild) |
|
4369 | | `types` | typescript definitions and tests |
|
4370 | | `misc` | miscellaneous supporting scripts |
|
4371 | | `test_files` | test files (pulled from the test files repository) |
|
4372 |
|
4373 | </details>
|
4374 |
|
4375 | After cloning the repo, running `make help` will display a list of commands.
|
4376 |
|
4377 | ### OSX/Linux
|
4378 |
|
4379 | <details>
|
4380 | <summary>(click to show)</summary>
|
4381 |
|
4382 | The `xlsx.js` file is constructed from the files in the `bits` subdirectory. The
|
4383 | build script (run `make`) will concatenate the individual bits to produce the
|
4384 | script. Before submitting a contribution, ensure that running make will produce
|
4385 | the `xlsx.js` file exactly. The simplest way to test is to add the script:
|
4386 |
|
4387 | ```bash
|
4388 | $ git add xlsx.js
|
4389 | $ make clean
|
4390 | $ make
|
4391 | $ git diff xlsx.js
|
4392 | ```
|
4393 |
|
4394 | To produce the dist files, run `make dist`. The dist files are updated in each
|
4395 | version release and *should not be committed between versions*.
|
4396 | </details>
|
4397 |
|
4398 | ### Windows
|
4399 |
|
4400 | <details>
|
4401 | <summary>(click to show)</summary>
|
4402 |
|
4403 | The included `make.cmd` script will build `xlsx.js` from the `bits` directory.
|
4404 | Building is as simple as:
|
4405 |
|
4406 | ```cmd
|
4407 | > make
|
4408 | ```
|
4409 |
|
4410 | To prepare development environment:
|
4411 |
|
4412 | ```cmd
|
4413 | > make init
|
4414 | ```
|
4415 |
|
4416 | The full list of commands available in Windows are displayed in `make help`:
|
4417 |
|
4418 | ```
|
4419 | make init -- install deps and global modules
|
4420 | make lint -- run eslint linter
|
4421 | make test -- run mocha test suite
|
4422 | make misc -- run smaller test suite
|
4423 | make book -- rebuild README and summary
|
4424 | make help -- display this message
|
4425 | ```
|
4426 |
|
4427 | As explained in [Test Files](#test-files), on Windows the release ZIP file must
|
4428 | be downloaded and extracted. If Bash on Windows is available, it is possible
|
4429 | to run the OSX/Linux workflow. The following steps prepares the environment:
|
4430 |
|
4431 | ```bash
|
4432 | # Install support programs for the build and test commands
|
4433 | sudo apt-get install make git subversion mercurial
|
4434 |
|
4435 | # Install nodejs and NPM within the WSL
|
4436 | wget -qO- https://deb.nodesource.com/setup_8.x | sudo bash
|
4437 | sudo apt-get install nodejs
|
4438 |
|
4439 | # Install dev dependencies
|
4440 | sudo npm install -g mocha voc blanket xlsjs
|
4441 | ```
|
4442 |
|
4443 | </details>
|
4444 |
|
4445 | ### Tests
|
4446 |
|
4447 | <details>
|
4448 | <summary>(click to show)</summary>
|
4449 |
|
4450 | The `test_misc` target (`make test_misc` on Linux/OSX / `make misc` on Windows)
|
4451 | runs the targeted feature tests. It should take 5-10 seconds to perform feature
|
4452 | tests without testing against the entire test battery. New features should be
|
4453 | accompanied with tests for the relevant file formats and features.
|
4454 |
|
4455 | For tests involving the read side, an appropriate feature test would involve
|
4456 | reading an existing file and checking the resulting workbook object. If a
|
4457 | parameter is involved, files should be read with different values to verify that
|
4458 | the feature is working as expected.
|
4459 |
|
4460 | For tests involving a new write feature which can already be parsed, appropriate
|
4461 | feature tests would involve writing a workbook with the feature and then opening
|
4462 | and verifying that the feature is preserved.
|
4463 |
|
4464 | For tests involving a new write feature without an existing read ability, please
|
4465 | add a feature test to the kitchen sink `tests/write.js`.
|
4466 | </details>
|
4467 |
|
4468 | ## License
|
4469 |
|
4470 | Please consult the attached LICENSE file for details. All rights not explicitly
|
4471 | granted by the Apache 2.0 License are reserved by the Original Author.
|
4472 |
|
4473 |
|
4474 | ## References
|
4475 |
|
4476 | <details>
|
4477 | <summary><b>OSP-covered Specifications</b> (click to show)</summary>
|
4478 |
|
4479 | - `MS-CFB`: Compound File Binary File Format
|
4480 | - `MS-CTXLS`: Excel Custom Toolbar Binary File Format
|
4481 | - `MS-EXSPXML3`: Excel Calculation Version 2 Web Service XML Schema
|
4482 | - `MS-ODATA`: Open Data Protocol (OData)
|
4483 | - `MS-ODRAW`: Office Drawing Binary File Format
|
4484 | - `MS-ODRAWXML`: Office Drawing Extensions to Office Open XML Structure
|
4485 | - `MS-OE376`: Office Implementation Information for ECMA-376 Standards Support
|
4486 | - `MS-OFFCRYPTO`: Office Document Cryptography Structure
|
4487 | - `MS-OI29500`: Office Implementation Information for ISO/IEC 29500 Standards Support
|
4488 | - `MS-OLEDS`: Object Linking and Embedding (OLE) Data Structures
|
4489 | - `MS-OLEPS`: Object Linking and Embedding (OLE) Property Set Data Structures
|
4490 | - `MS-OODF3`: Office Implementation Information for ODF 1.2 Standards Support
|
4491 | - `MS-OSHARED`: Office Common Data Types and Objects Structures
|
4492 | - `MS-OVBA`: Office VBA File Format Structure
|
4493 | - `MS-XLDM`: Spreadsheet Data Model File Format
|
4494 | - `MS-XLS`: Excel Binary File Format (.xls) Structure Specification
|
4495 | - `MS-XLSB`: Excel (.xlsb) Binary File Format
|
4496 | - `MS-XLSX`: Excel (.xlsx) Extensions to the Office Open XML SpreadsheetML File Format
|
4497 | - `XLS`: Microsoft Office Excel 97-2007 Binary File Format Specification
|
4498 | - `RTF`: Rich Text Format
|
4499 |
|
4500 | </details>
|
4501 |
|
4502 | - ISO/IEC 29500:2012(E) "Information technology — Document description and processing languages — Office Open XML File Formats"
|
4503 | - Open Document Format for Office Applications Version 1.2 (29 September 2011)
|
4504 | - Worksheet File Format (From Lotus) December 1984
|