==================================================================
Notes regarding fonts, code pages, and East Asian character widths
==================================================================


Registry settings
=================

 * There are console registry settings in `HKCU\Console`. That key has many
   default settings (e.g. the default font settings) and also per-app subkeys
   for app-specific overrides.

 * It is possible to override the code page with an app-specific setting.

 * There are registry settings in
   `HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console`. In particular,
   the `TrueTypeFont` subkey has a list of suitable font names associated with
   various CJK code pages, as well as default font names.

 * There are two values in `HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage`
   that specify the current code pages -- `OEMCP` and `ACP`. Setting the
   system locale via the Control Panel's "Region" or "Language" dialogs seems
   to change these code page values.
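
The `OEMCP` and `ACP` values can be inspected with a few lines of Python. This
is a minimal sketch, not part of the original notes: the function name
`read_system_code_pages` is made up for illustration, and the code assumes the
two values exist under the key named above. On a non-Windows system it simply
returns `None`.

```python
import sys

# Key path taken from the note above; it lives under HKEY_LOCAL_MACHINE.
CODEPAGE_KEY = r"SYSTEM\CurrentControlSet\Control\Nls\CodePage"


def read_system_code_pages():
    """Return {"OEMCP": ..., "ACP": ...} from the registry, or None off Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CODEPAGE_KEY) as key:
        for name in ("OEMCP", "ACP"):
            # QueryValueEx returns (value, registry_type); only the value is kept.
            value, _value_type = winreg.QueryValueEx(key, name)
            result[name] = value
    return result


if __name__ == "__main__":
    print(read_system_code_pages())
```

Reading the key does not require elevation, so this is a cheap way to check
whether a change of system locale has taken effect.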


Console fonts
=============

 * The `FontFamily` field of `CONSOLE_FONT_INFOEX` has two parts:
    - The high four bits can be exactly one of the `FF_xxxx` font families:
          FF_DONTCARE(0x00)
          FF_ROMAN(0x10)
          FF_SWISS(0x20)
          FF_MODERN(0x30)
          FF_SCRIPT(0x40)
          FF_DECORATIVE(0x50)
    - The low four bits are a bitmask:
          TMPF_FIXED_PITCH(1) -- actually means variable pitch
          TMPF_VECTOR(2)
          TMPF_TRUETYPE(4)
          TMPF_DEVICE(8)

 * Each console has its own independent console font table. The current font
   is identified with an index into this table. The size of the table is
   returned by the undocumented `GetNumberOfConsoleFonts` API. It is apparently
   possible to get the table size without this API, by instead calling
   `GetConsoleFontSize` on each nonnegative index starting with 0 until the API
   fails by returning (0, 0).

 * The font table grows dynamically. Each time the console is configured with
   a previously-unused (FaceName, Size) combination, two entries are added to
   the font table -- one with normal weight and one with bold weight. Fonts
   added this way are always TrueType fonts.

 * Initially, the font table appears to contain only raster fonts. For
   example, on an English Windows 8 installation, here is the initial font
   table:
       font 0: 4x6
       font 1: 6x8
       font 2: 8x8
       font 3: 16x8
       font 4: 5x12
       font 5: 7x12
       font 6: 8x12 -- the current font
       font 7: 16x12
       font 8: 12x16
       font 9: 10x18
   `GetNumberOfConsoleFonts` returns 10, and this table matches the raster font
   sizes according to the console properties dialog.

 * With a Japanese or Chinese locale, the initial font table appears to contain
   both the sizes applicable to the East Asian raster font and the sizes for
   the CP437/CP1252 raster font.

 * The index passed to `SetCurrentConsoleFontEx` apparently has no effect.
   The undocumented `SetConsoleFont` API, however, accepts *only* a font index,
   and on Windows 8 English, it switches between all 10 fonts, even font index
   #0.

 * If the index passed to `SetConsoleFont` identifies a Raster Font
   incompatible with the current code page, then another Raster Font is
   activated.

 * Passing "Terminal" to `SetCurrentConsoleFontEx` seems to have no effect.
   Perhaps relatedly, `SetCurrentConsoleFontEx` does not fail if it is given a
   bogus `FaceName`. Some font is still chosen and activated. Passing a face
   name and height seems to work reliably, modulo the CP936 issue described
   below.
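
Two of the notes in this section are mechanical enough to sketch in code: the
split of `FontFamily` into a family nibble and a flag nibble, and the trick of
probing font indices until (0, 0) comes back. The sketch below is illustrative
only. The names `decode_font_family` and `count_font_table_entries` are
invented here, and `get_font_size` is a stand-in callable; on Windows it would
wrap `GetConsoleFontSize`, which is not done here so that the logic can be
tried anywhere.

```python
# High nibble: exactly one FF_xxxx family.
FF_NAMES = {
    0x00: "FF_DONTCARE",
    0x10: "FF_ROMAN",
    0x20: "FF_SWISS",
    0x30: "FF_MODERN",
    0x40: "FF_SCRIPT",
    0x50: "FF_DECORATIVE",
}

# Low nibble: a bitmask of TMPF_xxxx flags.
TMPF_FLAGS = (
    (0x1, "TMPF_FIXED_PITCH"),  # set means *variable* pitch, despite the name
    (0x2, "TMPF_VECTOR"),
    (0x4, "TMPF_TRUETYPE"),
    (0x8, "TMPF_DEVICE"),
)


def decode_font_family(value):
    """Split a FontFamily value into (family_name, [flag_names])."""
    family_bits = value & 0xF0
    family = FF_NAMES.get(family_bits, "unknown(0x%02X)" % family_bits)
    flags = [name for bit, name in TMPF_FLAGS if value & bit]
    return family, flags


def count_font_table_entries(get_font_size, limit=1000):
    """Probe indices 0, 1, 2, ... until get_font_size(index) returns (0, 0).

    `limit` is only a safety cap for this sketch, not something the API defines.
    """
    index = 0
    while index < limit and get_font_size(index) != (0, 0):
        index += 1
    return index


if __name__ == "__main__":
    print(decode_font_family(0x30))  # ('FF_MODERN', [])
    print(decode_font_family(0x36))  # ('FF_MODERN', ['TMPF_VECTOR', 'TMPF_TRUETYPE'])

    # The initial English Windows 8 font table from the notes, as a fake backend.
    table = [(4, 6), (6, 8), (8, 8), (16, 8), (5, 12),
             (7, 12), (8, 12), (16, 12), (12, 16), (10, 18)]
    fake = lambda i: table[i] if i < len(table) else (0, 0)
    print(count_font_table_entries(fake))  # 10
```

Because the font table grows as new (FaceName, Size) pairs are used, a count
obtained this way is only valid until the next font change.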


Console fonts and code pages
============================

 * On an English Windows installation, the default code page is 437, and it
   cannot be set to 932 (Shift-JIS). (The API call fails.) Changing the
   system locale to "Japanese (Japan)" using the Region/Language dialog
   changes the default CP to 932 and permits changing the console CP between
   437 and 932.

 * A console has both an input code page and an output code page
   (`{Get,Set}ConsoleCP` and `{Get,Set}ConsoleOutputCP`). I'm not going to
   distinguish between the two for this document; presumably only the output
   CP matters. The code page can change while the console is open, e.g.
   by running `mode con: cp select={932,437,1252}` or by calling
   `SetConsoleOutputCP`.

 * The current code page restricts which TrueType fonts and which Raster Font
   sizes are available in the console properties dialog. This can change
   while the console is open.

 * Changing the code page almost(?) always changes the current console font.
   So far, I don't know how the new font is chosen.

 * With a CP of 932, the only TrueType font available in the console properties
   dialog is "MS Gothic", displayed as "MS ゴシック". It is still possible to
   use the English-default TrueType console fonts, Lucida Console and Consolas,
   via `SetCurrentConsoleFontEx`.

 * When using a Raster Font and CP437 or CP1252, writing a UTF-16 codepoint not
   representable in the code page instead writes a question mark ('?') to the
   console. This conversion does not apply with a TrueType font, nor with the
   Raster Font for CP932 or CP936.
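
The question-mark substitution in the last note resembles what a lossy
code-page encode does. The sketch below uses Python's own codec tables as a
stand-in, which is an assumption: they are not the console's conversion
routine and may differ from the Windows tables for individual characters. The
function name `encode_like_raster_console` is made up for illustration.

```python
def encode_like_raster_console(text, code_page):
    """Encode text for a code page, replacing unrepresentable characters with '?'."""
    return text.encode("cp%d" % code_page, errors="replace")


if __name__ == "__main__":
    sample = "A\u3044\u2014"  # 'A', HIRAGANA LETTER I, EM DASH
    # Neither U+3044 nor U+2014 exists in CP437, so both become '?'.
    print(encode_like_raster_console(sample, 437))  # b'A??'
    # CP1252 has an em dash; CP932 has the hiragana letter.
    print(encode_like_raster_console(sample, 1252))
    print(encode_like_raster_console(sample, 932))
```

A tool that reads the screen buffer back cannot tell such a substituted '?'
apart from a question mark that the program actually wrote.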


ReadConsoleOutput and double-width characters
=============================================

 * With a Raster Font active, when `ReadConsoleOutputW` reads two cells of a
   double-width character, it fills only a single `CHAR_INFO` structure. The
   unused trailing `CHAR_INFO` structures are zero-filled. With a TrueType
   font active, `ReadConsoleOutputW` instead fills two `CHAR_INFO` structures,
   the first marked with `COMMON_LVB_LEADING_BYTE` and the second marked with
   `COMMON_LVB_TRAILING_BYTE`. The flag names are misnomers--there aren't two
   *bytes*, but two cells, and they have equal `CHAR_INFO.Char.UnicodeChar`
   values.

 * `ReadConsoleOutputA`, on the other hand, reads two `CHAR_INFO` cells, and
   if the UTF-16 value can be represented as two bytes in the ANSI/OEM CP, then
   the two bytes are placed in the two `CHAR_INFO.Char.AsciiChar` values, and
   the `COMMON_LVB_{LEADING,TRAILING}_BYTE` values are also used. If the
   codepoint isn't representable, I don't remember what happens -- I think the
   `AsciiChar` values take on an invalid marker.

 * Reading only one cell of a double-width character reads a space (U+0020)
   instead. Raster-vs-TrueType and wide-vs-ANSI do not matter.
    - XXX: what about attributes? Can a double-width character have mismatched
      color attributes?
    - XXX: what happens when writing to just one cell of a double-width
      character?
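
A reader of `ReadConsoleOutputW` results has to cope with both layouts
described in the first note. The sketch below works on plain
`(unicode_char, attributes)` tuples rather than real `CHAR_INFO` structures,
and the function name `cells_to_text` is invented for illustration. The two
flag values are the ones defined in the Windows headers. Skipping NUL cells
assumes the zero-filled padding described for Raster Fonts.

```python
COMMON_LVB_LEADING_BYTE = 0x0100   # first cell of a double-width character
COMMON_LVB_TRAILING_BYTE = 0x0200  # second cell of a double-width character


def cells_to_text(cells):
    """Collapse a row of (unicode_char, attributes) cells into a string."""
    chars = []
    for char, attributes in cells:
        if attributes & COMMON_LVB_TRAILING_BYTE:
            continue  # TrueType case: duplicate of the preceding leading cell
        if char == "\0":
            continue  # Raster Font case: zero-filled padding cell
        chars.append(char)
    return "".join(chars)


if __name__ == "__main__":
    white = 0x07  # an arbitrary color attribute for the example
    truetype_row = [
        ("a", white),
        ("\u3044", white | COMMON_LVB_LEADING_BYTE),
        ("\u3044", white | COMMON_LVB_TRAILING_BYTE),
        ("b", white),
    ]
    raster_row = [("a", white), ("\u3044", white), ("b", white), ("\0", 0)]
    # Both rows describe the same four screen cells.
    print(cells_to_text(truetype_row) == cells_to_text(raster_row))  # True
```

The sketch does not address the open XXX questions above; it ignores color
attributes entirely.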


Default Windows fonts for East Asian languages
==============================================
CP932 / Japanese: "MS ゴシック" (MS Gothic)
CP936 / Chinese Simplified: "新宋体" (NSimSun)


Unreliable character width (half-width vs full-width)
=====================================================

The half-width vs full-width status of a codepoint depends on at least these
variables:
 * OS version (Win10 legacy and new modes are different versions)
 * system locale (English vs Japanese vs Chinese Simplified vs Chinese
   Traditional, etc)
 * code page (437 vs 932 vs 936, etc)
 * raster vs TrueType (Terminal vs MS Gothic vs SimSun, etc)
 * font size
 * rendered-vs-model (rendered width can be larger or smaller than model width)
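
For comparison, the Unicode East_Asian_Width property quoted in each example
heading can be looked up with Python's `unicodedata` module. The sketch below
derives a nominal width from that property alone. The function name
`nominal_cell_width` and its `ambiguous_is_wide` switch are invented for
illustration, and the examples in this section show that the console does not
consistently follow such a rule.

```python
import unicodedata


def nominal_cell_width(char, ambiguous_is_wide=False):
    """Guess a cell width (1 or 2) from East_Asian_Width alone."""
    category = unicodedata.east_asian_width(char)
    if category in ("W", "F"):  # Wide, Fullwidth
        return 2
    if category == "A" and ambiguous_is_wide:  # Ambiguous: depends on context
        return 2
    return 1


if __name__ == "__main__":
    # The four codepoints used in the examples of this section.
    for char in "\u2014\u3044\u30fc\u4000":
        print("U+%04X" % ord(char),
              unicodedata.east_asian_width(char),
              nominal_cell_width(char),
              nominal_cell_width(char, ambiguous_is_wide=True))
```

The property is a fixed function of the codepoint, so it cannot capture the
dependence on OS version, code page, font, or font size listed above.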

Example 1: U+2014 (EM DASH): East_Asian_Width: Ambiguous
--------------------------------------------------------
                                                rendered        modeled
CP932: Win7/8 Raster Fonts                      half            half
CP932: Win7/8 Gothic 14/15px                    half            full
CP932: Win7/8 Consolas 14/15px                  half            full
CP932: Win7/8 Lucida Console 14px               half            full
CP932: Win7/8 Lucida Console 15px               half            half
CP932: Win10New Raster Fonts                    half            half
CP932: Win10New Gothic 14/15px                  half            half
CP932: Win10New Consolas 14/15px                half            half
CP932: Win10New Lucida Console 14/15px          half            half

CP936: Win7/8 Raster Fonts                      full            full
CP936: Win7/8 SimSun 14px                       full            full
CP936: Win7/8 SimSun 15px                       full            half
CP936: Win7/8 Consolas 14/15px                  half            full
CP936: Win10New Raster Fonts                    full            full
CP936: Win10New SimSun 14/15px                  full            full
CP936: Win10New Consolas 14/15px                half            half

Example 2: U+3044 (HIRAGANA LETTER I): East_Asian_Width: Wide
-------------------------------------------------------------
                                                rendered        modeled
CP932: Win7/8/10N Raster Fonts                  full            full
CP932: Win7/8/10N Gothic 14/15px                full            full
CP932: Win7/8/10N Consolas 14/15px              half(*2)        full
CP932: Win7/8/10N Lucida Console 14/15px        half(*3)        full

CP936: Win7/8/10N Raster Fonts                  full            full
CP936: Win7/8/10N SimSun 14/15px                full            full
CP936: Win7/8/10N Consolas 14/15px              full            full

Example 3: U+30FC (KATAKANA-HIRAGANA PROLONGED SOUND MARK): East_Asian_Width: Wide
----------------------------------------------------------------------------------
                                                rendered        modeled
CP932: Win7 Raster Fonts                        full            full
CP932: Win7 Gothic 14/15px                      full            full
CP932: Win7 Consolas 14/15px                    half(*2)        full
CP932: Win7 Lucida Console 14px                 half(*3)        full
CP932: Win7 Lucida Console 15px                 half(*3)        half
CP932: Win8 Raster Fonts                        full            full
CP932: Win8 Gothic 14px                         full            half
CP932: Win8 Gothic 15px                         full            full
CP932: Win8 Consolas 14/15px                    half(*2)        full
CP932: Win8 Lucida Console 14px                 half(*3)        full
CP932: Win8 Lucida Console 15px                 half(*3)        half
CP932: Win10New Raster Fonts                    full            full
CP932: Win10New Gothic 14/15px                  full            full
CP932: Win10New Consolas 14/15px                half(*2)        half
CP932: Win10New Lucida Console 14/15px          half(*2)        half

CP936: Win7/8 Raster Fonts                      full            full
CP936: Win7/8 SimSun 14px                       full            full
CP936: Win7/8 SimSun 15px                       full            half
CP936: Win7/8 Consolas 14px                     full            full
CP936: Win7/8 Consolas 15px                     full            half
CP936: Win10New Raster Fonts                    full            full
CP936: Win10New SimSun 14/15px                  full            full
CP936: Win10New Consolas 14/15px                full            full

Example 4: U+4000 (CJK UNIFIED IDEOGRAPH-4000): East_Asian_Width: Wide
----------------------------------------------------------------------
                                                rendered        modeled
CP932: Win7 Raster Fonts                        half(*1)        half
CP932: Win7 Gothic 14/15px                      full            full
CP932: Win7 Consolas 14/15px                    half(*2)        full
CP932: Win7 Lucida Console 14px                 half(*3)        full
CP932: Win7 Lucida Console 15px                 half(*3)        half
CP932: Win8 Raster Fonts                        half(*1)        half
CP932: Win8 Gothic 14px                         full            half
CP932: Win8 Gothic 15px                         full            full
CP932: Win8 Consolas 14/15px                    half(*2)        full
CP932: Win8 Lucida Console 14px                 half(*3)        full
CP932: Win8 Lucida Console 15px                 half(*3)        half
CP932: Win10New Raster Fonts                    half(*1)        half
CP932: Win10New Gothic 14/15px                  full            full
CP932: Win10New Consolas 14/15px                half(*2)        half
CP932: Win10New Lucida Console 14/15px          half(*2)        half

CP936: Win7/8 Raster Fonts                      full            full
CP936: Win7/8 SimSun 14px                       full            full
CP936: Win7/8 SimSun 15px                       full            half
CP936: Win7/8 Consolas 14px                     full            full
CP936: Win7/8 Consolas 15px                     full            half
CP936: Win10New Raster Fonts                    full            full
CP936: Win10New SimSun 14/15px                  full            full
CP936: Win10New Consolas 14/15px                full            full

(*1) Rendered as a half-width filled white box
(*2) Rendered as a half-width box with a question mark inside
(*3) Rendered as a half-width empty box
(!!) One of the few places in Win10New where rendered and modeled width disagree


Windows quirk: unreliable font heights with CP936 / Chinese Simplified
======================================================================

When I set the font to 新宋体 17px, using either the properties dialog or
`SetCurrentConsoleFontEx`, the height reported by `GetCurrentConsoleFontEx` is
not 17, but is instead 19. The same problem does not affect Raster Fonts,
nor have I seen the problem in the English or Japanese locales. I observed
this with Windows 7 and Windows 10 new mode.

If I set the font using the facename, width, *and* height, then the
`SetCurrentConsoleFontEx` and `GetCurrentConsoleFontEx` values agree. If I
set the font using *only* the facename and height, then the two values
disagree.
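
Setting the facename, width, and height together means filling in three fields
of `CONSOLE_FONT_INFOEX` before calling `SetCurrentConsoleFontEx`. The sketch
below shows one way to do that with `ctypes`. It is an illustration under
stated assumptions: the helper names `make_font_info` and `set_console_font`
are invented, the width of 8 used in the example is a placeholder (these notes
do not record the real width for 17px), and the struct is declared with
fixed-size integer types so that its layout can be checked on any platform.
Only `set_console_font` needs Windows.

```python
import ctypes
import sys


class COORD(ctypes.Structure):
    _fields_ = [("X", ctypes.c_int16), ("Y", ctypes.c_int16)]


class CONSOLE_FONT_INFOEX(ctypes.Structure):
    _fields_ = [
        ("cbSize", ctypes.c_uint32),
        ("nFont", ctypes.c_uint32),
        ("dwFontSize", COORD),               # X = width, Y = height
        ("FontFamily", ctypes.c_uint32),
        ("FontWeight", ctypes.c_uint32),
        ("FaceName", ctypes.c_uint16 * 32),  # WCHAR[LF_FACESIZE], UTF-16 units
    ]


def make_font_info(face_name, width, height, weight=400):
    """Build a CONSOLE_FONT_INFOEX with facename, width, and height all set."""
    units = face_name.encode("utf-16-le")
    count = len(units) // 2
    if count >= 32:
        raise ValueError("face name must leave room for a terminating NUL")
    info = CONSOLE_FONT_INFOEX()  # ctypes zero-initializes the structure
    info.cbSize = ctypes.sizeof(CONSOLE_FONT_INFOEX)
    info.dwFontSize = COORD(width, height)
    info.FontWeight = weight
    for i in range(count):
        info.FaceName[i] = int.from_bytes(units[2 * i:2 * i + 2], "little")
    return info


def set_console_font(info):
    """Windows only: apply `info` to the console behind standard output."""
    if sys.platform != "win32":
        raise OSError("requires a Windows console")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [ctypes.c_uint32]
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    kernel32.SetCurrentConsoleFontEx.argtypes = [
        ctypes.c_void_p, ctypes.c_int, ctypes.POINTER(CONSOLE_FONT_INFOEX)]
    kernel32.SetCurrentConsoleFontEx.restype = ctypes.c_int
    handle = kernel32.GetStdHandle(0xFFFFFFF5)  # STD_OUTPUT_HANDLE, (DWORD)-11
    if not kernel32.SetCurrentConsoleFontEx(handle, 0, ctypes.byref(info)):
        raise ctypes.WinError(ctypes.get_last_error())


if __name__ == "__main__":
    info = make_font_info("\u65b0\u5b8b\u4f53", 8, 17)  # width 8 is a placeholder
    print(info.cbSize, info.dwFontSize.X, info.dwFontSize.Y)
```

Passing a width of 0 would correspond to the facename-and-height-only case in
which the reported height was observed to disagree.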


Windows bug: GetCurrentConsoleFontEx is initially invalid
=========================================================

 - Assume there is no configured console font name in the registry. In this
   case, the console defaults to a raster font.
 - Open a new console and call the `GetCurrentConsoleFontEx` API.
 - The `FaceName` field of the returned `CONSOLE_FONT_INFOEX` data
   structure is incorrect. On Windows 7, 8, and 10, I observed that the
   field was blank. On Windows 8, occasionally, it instead contained:
       U+AE72 U+75BE U+0001
   The other fields of the structure all appeared correct:
       nFont=6 dwFontSize=(8,12) FontFamily=0x30 FontWeight=400
 - The `FaceName` field becomes initialized easily:
    - Open the console properties dialog and click OK. (Cancel is not
      sufficient.)
    - Call the undocumented `SetConsoleFont` with the current font table
      index, which is 6 in the example above.
 - It seems that the console uncritically accepts whatever string is
   stored in the registry, including a blank string, and passes it on to
   the `GetCurrentConsoleFontEx` caller. It is possible to get the console
   to *write* a blank setting into the registry -- simply open the console
   (default or app-specific) properties and click OK.