# Technical implementation of text / document selections and highlights

## Selections

### DOM selection

A good starting point is to listen for user "selection" events in the DOM tree, such as `selectstart`:

```javascript
win.document.addEventListener("selectstart", (evt) => {
    // ...
});
```

Note: there are instances where the `selectstart` event is not raised. The cause has not been identified.

Note: the `selectionchange` event may be problematic in cases where the DOM selection API is used programmatically as a post-processing step (for example, to normalize multiple selections to a single range), as this can cause duplicate events and infinite loops. Code example:

```javascript
win.document.addEventListener("selectionchange", (evt) => {
    const selection = window.getSelection();
    if (selection) {
        const range = NORMALIZE_TO_SINGLE_RANGE(...);
        selection.removeAllRanges(); // => triggers selectionchange again!
        selection.addRange(range); // => triggers selectionchange again!
    }
});
```
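
One way to avoid this feedback loop is to make the handler idempotent: the selection is rewritten only when it differs from its normalized form, so the follow-up `selectionchange` event becomes a no-op. The sketch below assumes this approach. `sameBoundaries()` and `needsNormalization()` are hypothetical helper names, not functions from the navigator code.

```javascript
// Hypothetical helper: true when two Range-like objects share the same boundary points.
function sameBoundaries(a, b) {
    return a.startContainer === b.startContainer &&
        a.startOffset === b.startOffset &&
        a.endContainer === b.endContainer &&
        a.endOffset === b.endOffset;
}

// Hypothetical helper: true when the selection is not yet in its normalized form,
// i.e. it holds several ranges, or a single range that differs from the normalized one.
function needsNormalization(selection, normalizedRange) {
    if (selection.rangeCount !== 1) {
        return true;
    }
    return !sameBoundaries(selection.getRangeAt(0), normalizedRange);
}
```

Inside the `selectionchange` listener, `removeAllRanges()` and `addRange()` would then be called only when `needsNormalization(selection, range)` returns `true`.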

As shown above, `selection.removeAllRanges()` can be used to clear existing user selections in the DOM. This may be useful in cases where the document viewport has shifted past visible user selections (this is particularly relevant in "structured" paginated mode, but this logic applies to a "looser" scroll view as well). From a UX perspective, such hidden selections can be confusing, so the application may decide to void selections that disappear out of view.

Note that a selection can exist in a "collapsed" state, effectively a "cursor" with no actual content (which may need to be ignored, depending on the consumer logic). Code example:

```javascript
const selection = window.getSelection();
if (selection) {
    if (selection.isCollapsed) {
        return;
    }
}
```

Getting the raw text from a DOM selection is easy, but it might be necessary to clean up the text (e.g. whitespace), and to filter out selections that are deemed "empty". Example:

```javascript
const selection = window.getSelection();
if (selection) {
    if (selection.isCollapsed) {
        return;
    }
    const rawText = selection.toString();
    const cleanText = rawText.trim().replace(/\n/g, " ").replace(/\s\s+/g, " ");
    if (cleanText.length === 0) {
        return;
    }
}
```
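
The same cleanup can be packaged as a small reusable function. This is a minimal sketch, and `cleanSelectionText()` is a hypothetical name. It is a slight variation on the chain above: every run of whitespace, including a lone tab, is collapsed into a single space.

```javascript
// Hypothetical helper: collapses each run of whitespace (spaces, tabs, line breaks)
// into a single space, trims both ends, and returns undefined for "empty" selections.
function cleanSelectionText(rawText) {
    const cleanText = rawText.replace(/\s+/g, " ").trim();
    return cleanText.length === 0 ? undefined : cleanText;
}

console.log(cleanSelectionText("  Hello\n   world \t again ")); // "Hello world again"
console.log(cleanSelectionText(" \n\t ")); // undefined
```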

DOM selections contain ranges, and should have `anchorNode` (+ `anchorOffset`) and `focusNode` (+ `focusOffset`):

```javascript
const selection = window.getSelection();
if (selection) {
    if (!selection.anchorNode || !selection.focusNode) {
        return;
    }
}
```

A DOM selection usually contains a single range. When it contains several, they can be normalized into one range (taking care to preserve document order):

```javascript
const selection = window.getSelection();
if (selection) {
    if (!selection.anchorNode || !selection.focusNode) {
        return;
    }

    const range = selection.rangeCount === 1 ? selection.getRangeAt(0) :
        createOrderedRange(selection.anchorNode, selection.anchorOffset, selection.focusNode, selection.focusOffset);
    if (!range || range.collapsed) {
        return;
    }
}
```

There are multiple ways to ensure the order of selection ranges. Here is an example `createOrderedRange()` function:

```javascript
function createOrderedRange(startNode, startOffset, endNode, endOffset) {

    // Try the boundary points in the given order first.
    const range = new Range(); // document.createRange()
    range.setStart(startNode, startOffset);
    range.setEnd(endNode, endOffset);
    if (!range.collapsed) {
        return range;
    }

    // A collapsed result means the end does not follow the start
    // (e.g. a backward selection), so try the reverse order.
    const rangeReverse = new Range(); // document.createRange()
    rangeReverse.setStart(endNode, endOffset);
    rangeReverse.setEnd(startNode, startOffset);
    if (!rangeReverse.collapsed) {
        return rangeReverse;
    }

    return undefined;
}
```

At that point, we have a DOM range object. We now want to serialize it into a JSON data structure that can be used for persistent storage.

### Range serialization

The `convertRange()` function:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/selection.ts#L229-L373

...returns an `IRangeInfo` object which is a direct "translation" of the DOM range (unlike CFI which has its own indexing rules and representation conventions):

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/common/selection.ts#L13-L40

In a nutshell: CSS Selectors are used to reliably encode references to DOM elements in a web-friendly manner (i.e. not CFI). In the case of DOM text nodes, the direct parent element is referenced, and the position of the text node among the parent's child nodes is stored (zero-based integer index).

A CFI reference is also created for good measure, but this is not actually critical to the inner workings of the selection/highlight mechanisms.

Note that the `convertRange()` function takes two additional parameters (external functions) which are used to encode CSS selectors and CFI representations.
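
The encoding described above can be sketched as follows. This is an illustration only, not the actual `convertRange()` code. The field names (`cssSelector`, `childTextNodeIndex`, `offset`) and the `-1` convention for element containers are assumptions made for this example and do not necessarily match `IRangeInfo`. As in `convertRange()`, the CSS selector generator is passed in as an external function.

```javascript
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
const TEXT_NODE = 3; // same value as Node.TEXT_NODE

// Hypothetical helper: describes one range boundary (container node + offset).
// getCssSelector is an external function supplied by the caller.
function serializeBoundary(container, offset, getCssSelector) {
    if (container.nodeType === ELEMENT_NODE) {
        // The boundary sits directly in an element: no child text node is involved.
        return { cssSelector: getCssSelector(container), childTextNodeIndex: -1, offset };
    }
    if (container.nodeType === TEXT_NODE && container.parentNode &&
        container.parentNode.nodeType === ELEMENT_NODE) {
        const parent = container.parentNode;
        // Zero-based position of the text node among all child nodes of its parent.
        const childTextNodeIndex = Array.prototype.indexOf.call(parent.childNodes, container);
        return { cssSelector: getCssSelector(parent), childTextNodeIndex, offset };
    }
    return undefined; // other node types (comments, etc.) are not handled in this sketch
}

// Produces a plain object that can be passed to JSON.stringify() for storage.
function serializeRange(range, getCssSelector) {
    const start = serializeBoundary(range.startContainer, range.startOffset, getCssSelector);
    const end = serializeBoundary(range.endContainer, range.endOffset, getCssSelector);
    return start && end ? { start, end } : undefined;
}
```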

The CFI "generator" implementation is currently simplistic (elements only). Code excerpt (blacklist handling removed, for brevity):

```javascript
export const computeCFI = (node) => {

    if (node.nodeType !== Node.ELEMENT_NODE) {
        return undefined;
    }

    let cfi = "";

    let currentElement = node;
    while (currentElement.parentNode && currentElement.parentNode.nodeType === Node.ELEMENT_NODE) {
        const currentElementParentChildren = currentElement.parentNode.children;
        let currentElementIndex = -1;
        for (let i = 0; i < currentElementParentChildren.length; i++) {
            if (currentElement === currentElementParentChildren[i]) {
                currentElementIndex = i;
                break;
            }
        }
        if (currentElementIndex >= 0) {
            const cfiIndex = (currentElementIndex + 1) * 2;
            cfi = cfiIndex +
                (currentElement.id ? ("[" + currentElement.id + "]") : "") +
                (cfi.length ? ("/" + cfi) : "");
        }
        currentElement = currentElement.parentNode;
    }

    return "/" + cfi;
};
```
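
To make the index arithmetic concrete: the element at zero-based position `i` among its parent's child elements becomes the even step `(i + 1) * 2`, optionally followed by `[id]`. For a `p` that is the second child element of `div#x`, itself the first child element of `body` (the second child of `html`), the excerpt above yields `/4/2[x]/4`. The sketch below walks such a path in the opposite direction. `resolveElementCFI()` is a hypothetical helper written for this illustration, and it only understands the element-only paths produced by the excerpt above.

```javascript
// Hypothetical helper: resolves an element-only CFI path such as "/4/2[x]/4",
// starting from the root element (the <html> element for a path made by computeCFI()).
function resolveElementCFI(rootElement, cfi) {
    let currentElement = rootElement;
    for (const step of cfi.split("/").filter((s) => s.length > 0)) {
        const match = /^(\d+)(?:\[(.*)\])?$/.exec(step);
        if (!match) {
            return undefined; // not an element step
        }
        const cfiIndex = parseInt(match[1], 10);
        if (cfiIndex < 2 || cfiIndex % 2 !== 0) {
            return undefined; // element steps are even numbers: 2, 4, 6, ...
        }
        // Inverse of (index + 1) * 2.
        currentElement = currentElement.children[cfiIndex / 2 - 1];
        if (!currentElement) {
            return undefined; // the path does not match this document
        }
        // When the step carries an id, use it as a consistency check.
        if (match[2] !== undefined && currentElement.id !== match[2]) {
            return undefined;
        }
    }
    return currentElement;
}
```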

...however, there is a prototype (low development priority) CFI generator for character-level CFI ranges:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/selection.ts#L283-L361

The pseudo-canonical unique CSS Selectors are generated using a TypeScript port of an external library called "finder":

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/common/cssselector2.ts#L8

The original CSS Selectors algorithm was borrowed from the Chromium code, but the "finder" lib turned out to be a better choice (uniqueness is VERY important, along with the blacklisting capabilities):

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/common/cssselector.ts#L9

Naturally, range serialization must be bidirectional. Hence the `convertRangeInfo()` function which performs the reverse transformation of `convertRange()`:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/selection.ts#L375-L416

As you can see, the conversion is straightforward and reliable, with no CFI edge-case juggling.
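
The reverse direction can be sketched in a few lines. This is not the actual `convertRangeInfo()` code. It assumes an illustrative serialized form `{ start, end }` where each boundary is `{ cssSelector, childTextNodeIndex, offset }` and a negative `childTextNodeIndex` means "the element itself". These names are hypothetical and may differ from `IRangeInfo`. The range factory is injected so the sketch does not depend on a global `document`.

```javascript
// Hypothetical helper: finds the DOM node referenced by a serialized boundary.
// root is anything with a querySelector() method (typically the document).
function resolveBoundaryNode(root, boundary) {
    const element = root.querySelector(boundary.cssSelector);
    if (!element) {
        return undefined; // the selector no longer matches this document
    }
    if (boundary.childTextNodeIndex < 0) {
        return element; // the boundary is the element itself
    }
    return element.childNodes[boundary.childTextNodeIndex] || undefined;
}

// Rebuilds a DOM range. createRange is injected, e.g. () => document.createRange().
function deserializeRange(root, rangeInfo, createRange) {
    const startNode = resolveBoundaryNode(root, rangeInfo.start);
    const endNode = resolveBoundaryNode(root, rangeInfo.end);
    if (!startNode || !endNode) {
        return undefined;
    }
    const range = createRange();
    range.setStart(startNode, rangeInfo.start.offset);
    range.setEnd(endNode, rangeInfo.end.offset);
    return range;
}
```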

## Highlights

### Client rectangles (aggregated atomic bounding boxes)

Once a DOM range is obtained (either directly from a user selection, or from a deserialized range object), the "client rectangles" (bounding boxes) can be normalized to prevent overlap (which would otherwise result in rendering artefacts due to combined opacity factors), and to minimize their number (as this would otherwise impact performance).

The `getClientRectsNoOverlap()` function implements the necessary logic:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/common/rect-utils.ts#L35-L39

The differences between `range.getClientRects()` and `getClientRectsNoOverlap(range)` are obvious: the former often generates many duplicates, overlaps, unnecessarily fragmented boxes, etc.
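
To give an idea of what such normalization involves, here is a deliberately simplified sketch. It is not the algorithm used by `getClientRectsNoOverlap()`: it only drops duplicates and fully contained rectangles, then merges rectangles that touch on the same line, and it does not split partially overlapping boxes. All function names and the one-pixel tolerance are assumptions made for this example. Rectangles are plain `{ left, top, right, bottom }` objects.

```javascript
// True when `inner` lies within `outer`, give or take `tolerance` pixels.
function rectContains(outer, inner, tolerance) {
    return inner.left >= outer.left - tolerance &&
        inner.top >= outer.top - tolerance &&
        inner.right <= outer.right + tolerance &&
        inner.bottom <= outer.bottom + tolerance;
}

// Drops exact duplicates and rectangles fully covered by another one.
function removeContainedRects(rects, tolerance = 1) {
    const kept = [];
    for (const rect of rects) {
        if (kept.some((k) => rectContains(k, rect, tolerance))) {
            continue; // already covered by a rectangle we keep
        }
        for (let i = kept.length - 1; i >= 0; i--) {
            if (rectContains(rect, kept[i], tolerance)) {
                kept.splice(i, 1); // the new rectangle covers this one
            }
        }
        kept.push(rect);
    }
    return kept;
}

// Merges rectangles that sit on the same line and touch or overlap horizontally.
function mergeSameLineRects(rects, tolerance = 1) {
    const sorted = [...rects].sort((a, b) => a.top - b.top || a.left - b.left);
    const merged = [];
    for (const rect of sorted) {
        const last = merged[merged.length - 1];
        const sameLine = last !== undefined &&
            Math.abs(last.top - rect.top) <= tolerance &&
            Math.abs(last.bottom - rect.bottom) <= tolerance;
        if (sameLine && rect.left <= last.right + tolerance) {
            last.right = Math.max(last.right, rect.right);
        } else {
            // Copy, so that the input rectangles are never mutated.
            merged.push({ left: rect.left, top: rect.top, right: rect.right, bottom: rect.bottom });
        }
    }
    return merged;
}

function normalizeRects(rects, tolerance = 1) {
    return mergeSameLineRects(removeContainedRects(rects, tolerance), tolerance);
}
```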

### Rendering, CSS coordinates in paginated and scroll views

The `createHighlightDom()` function implements a particular strategy for encapsulating rendered highlights inside a hierarchy of DOM elements, including the individual client rectangles that make up the entire shape, as well as the surrounding bounding box (single rectangular shape):

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L353-L487

Note how `pointer-events` is set to `none` on DOM elements used to render highlights, so that neither bounding boxes nor aggregated client rectangles interfere with document-level user interaction (publication HTML). Yet, rendered highlights must react to pointing device / mouse hover and click. This is done using event delegation on the document's body:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L242-L252

...see the `processMouseEvent()` function:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L113-L233
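
Because the highlight elements ignore pointer events, the delegated listener has to work out by itself which highlight (if any) lies under the pointer. The sketch below shows one simple way to do this hit test. It is not the `processMouseEvent()` implementation. The function names and the `{ id, rects }` shape are assumptions, and the point is assumed to be expressed in the same coordinate system as the stored rectangles.

```javascript
// Hypothetical helper: true when the point (x, y) lies inside the rectangle.
function rectContainsPoint(rect, x, y) {
    return x >= rect.left && x < rect.left + rect.width &&
        y >= rect.top && y < rect.top + rect.height;
}

// Hypothetical helper: returns the most recently created highlight under the point.
// highlights is an array of { id, rects: [{ left, top, width, height }, ...] }.
function findHighlightAt(highlights, x, y) {
    for (let i = highlights.length - 1; i >= 0; i--) {
        if (highlights[i].rects.some((rect) => rectContainsPoint(rect, x, y))) {
            return highlights[i];
        }
    }
    return undefined;
}
```

A `mousemove` or `click` listener registered on the body could call `findHighlightAt()` with the event coordinates, then toggle a hover style or notify the application.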

Also note how CSS `position` must be `relative` on the HTML `body` element. This is critical for the coordinate system to work:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L383

Furthermore, `position` must be `fixed` for CSS columns, and `absolute` for both reflowable scroll and fixed layout:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L439

In Electron/Chromium, depending on whether the document is rendered in scroll or column-paginated mode, there is an offset to take into account when computing coordinates for rendering:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L393-L406

There is also a scaling factor for fixed-layout documents:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L408
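
The general shape of this computation can be sketched as follows, under two assumptions: client rectangles are relative to the viewport, and a fixed-layout document is scaled as a whole by a single factor. `toContainerRect()`, `origin` and `viewportScale` are hypothetical names. The exact offsets and factor used by the navigator are in the lines linked above.

```javascript
// Hypothetical helper.
// clientRect: { left, top, width, height }, relative to the viewport.
// origin: viewport position { x, y } of the container that hosts the highlight elements.
// viewportScale: scale applied to the whole document (1 for reflowable content).
function toContainerRect(clientRect, origin, viewportScale = 1) {
    return {
        left: (clientRect.left - origin.x) / viewportScale,
        top: (clientRect.top - origin.y) / viewportScale,
        width: clientRect.width / viewportScale,
        height: clientRect.height / viewportScale,
    };
}
```

The returned values are what would be written (with a `px` suffix) to the `left`, `top`, `width` and `height` style properties of a highlight element.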

Note that the `DEBUG_VISUALS` condition is checked in the `highlight.ts` code to determine when to render special styles for debugging the highlights. In production mode, this code is not used.

Finally, notice how highlights are given unique identifiers so that they can be managed by the navigator instance:

https://github.com/readium/r2-navigator-js/blob/59c593511502eb460b8252f807a6e11dfebb952e/src/electron/renderer/webview/highlight.ts#L335-L336

This way, rendered highlights can be destroyed when the document formatting changes (e.g. font size), which triggers a complete text reflow and therefore requires recreating the character-level highlights using updated coordinates (newly-generated client rectangles).
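
The bookkeeping this implies can be sketched with a small registry that keeps highlight definitions separate from their rendering. This is an assumption-laden illustration, not the navigator's code. The class name, the counter-based identifier scheme and the two callbacks are all hypothetical.

```javascript
// Hypothetical registry: stores what each highlight is, not how it is drawn.
class HighlightRegistry {
    constructor() {
        this.nextId = 0;
        this.highlights = new Map(); // id => { rangeInfo, color }
    }

    // Registers a highlight and returns its unique identifier.
    create(rangeInfo, color) {
        const id = "highlight_" + (this.nextId++); // placeholder scheme, any unique string works
        this.highlights.set(id, { rangeInfo, color });
        return id;
    }

    // Forgets a highlight. Returns false when the identifier is unknown.
    destroy(id) {
        return this.highlights.delete(id);
    }

    // To be called after a reflow: destroyRendered(id) removes the stale DOM elements,
    // render(id, definition) draws the highlight again from fresh client rectangles.
    recreateAll(destroyRendered, render) {
        for (const [id, definition] of this.highlights) {
            destroyRendered(id);
            render(id, definition);
        }
    }
}
```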