UNPKG

3.88 kBMarkdownView Raw
1<!-- START doctoc generated TOC please keep comment here to allow auto update -->
2<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
3**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*
4
5- [hollerith-codec](#hollerith-codec)
6 - [Benchmarks](#benchmarks)
7
8<!-- END doctoc generated TOC please keep comment here to allow auto update -->
9
10
11
12- [hollerith-codec](#hollerith-codec)
13
14> **Table of Contents** *generated with [DocToc](http://doctoc.herokuapp.com/)*
15
16
17# hollerith-codec
18
19Binary encoding for Hollerith that provides a total ordering for primitive
20datatypes and lists of those. Used by
21[Hollerith](https://github.com/loveencounterflow/hollerith2) and
22[KWIC](https://github.com/loveencounterflow/kwic).
23
24
25**Breaking Change** in v2.x: Prior to v2.x, when comparing two lists `a`, `b` where `a` is a prefix of `b`
26used to be sorted with `a` first, then `b`, i.o.w. a shorter word always precedes a longer word that starts
27out the same (as in an alphabetically sorted dictionary).
28
29Starting with v2, however, this has been changed such that when comparing two lists `a`, `b` where `a` is a
30prefix of `b`, `b` will be sorted *before* `a` if the first 'extra' element of `b` (i.e. `b[ a.length ]`) is
31a *negative number*. By way of example, sorting now looks like this (with hexadecimal byte values):
32
33```
34[ -10, ] ... 4b bfdbffffffffffff 4c
35[ -1, ] ... 4b c00fffffffffffff 4c
36[ ] ... 4c
37[ 0, ] ... 4d 0000000000000000 4c
38[ 1, -1, ] ... 4d 3ff0000000000000 4b c00fffffffffffff 4c
39[ 1, ] ... 4d 3ff0000000000000 4c
40[ 1, 0, ] ... 4d 3ff0000000000000 4d 0000000000000000 4c
41[ 1, 1, ] ... 4d 3ff0000000000000 4d 3ff0000000000000 4c
42[ 10, -2, ] ... 4d 4024000000000000 4b bfffffffffffffff 4c
43[ 10, -2, 3, ] ... 4d 4024000000000000 4b bfffffffffffffff 4d 4008000000000000 4c
44[ 10, -1, ] ... 4d 4024000000000000 4b c00fffffffffffff 4c
45[ 10, -1, 3, ] ... 4d 4024000000000000 4b c00fffffffffffff 4d 4008000000000000 4c
46[ 10, ] ... 4d 4024000000000000 4c
47[ 10, 0, ] ... 4d 4024000000000000 4d 0000000000000000 4c
48[ 10, 0, 3, ] ... 4d 4024000000000000 4d 0000000000000000 4d 4008000000000000 4c
49[ 10, 1, ] ... 4d 4024000000000000 4d 3ff0000000000000 4c
50[ 10, 1, 3, ] ... 4d 4024000000000000 4d 3ff0000000000000 4d 4008000000000000 4c
51[ 10, 2, ] ... 4d 4024000000000000 4d 4000000000000000 4c
52[ 10, 2, 3, ] ... 4d 4024000000000000 4d 4000000000000000 4d 4008000000000000 4c
53[ 10, 4, -2, ] ... 4d 4024000000000000 4d 4010000000000000 4b bfffffffffffffff 4c
54[ 10, 4, -1, ] ... 4d 4024000000000000 4d 4010000000000000 4b c00fffffffffffff 4c
55[ 10, 4, ] ... 4d 4024000000000000 4d 4010000000000000 4c
56[ 10, 4, 0, ] ... 4d 4024000000000000 4d 4010000000000000 4d 0000000000000000 4c
57[ 10, 4, 1, ] ... 4d 4024000000000000 4d 4010000000000000 4d 3ff0000000000000 4c
58[ 10, 4, 2, ] ... 4d 4024000000000000 4d 4010000000000000 4d 4000000000000000 4c
59```
60
61
62## Benchmarks
63
64```
65Sample run:
66hollerith_tng 0.235 s 300,000 items 1,274,500⏶Hz 785⏷nspc
67hollerith_bcd 2.543 s 300,000 items 117,963⏶Hz 8,477⏷nspc
68hollerith_classic 7.113 s 300,000 items 42,176⏶Hz 23,710⏷nspc
69
70Averages:
7100:56 HENGIST/BENCHMARKS ▶ hollerith_tng 1,248,369 Hz ≙ 1 ÷ 1.0 100.0 % │████████████▌│
7200:56 HENGIST/BENCHMARKS ▶ hollerith_bcd 118,161 Hz ≙ 1 ÷ 10.6 9.5 % │█▏ │
7300:56 HENGIST/BENCHMARKS ▶ hollerith_classic 42,056 Hz ≙ 1 ÷ 29.7 3.4 % │▍ │
74```
75
76
77