1 | ==================================================================
|
2 | Notes regarding fonts, code pages, and East Asian character widths
|
3 | ==================================================================
|
4 |
|
5 |
|
6 | Registry settings
|
7 | =================
|
8 |
|
9 | * There are console registry settings in `HKCU\Console`. That key has many
|
10 | default settings (e.g. the default font settings) and also per-app subkeys
|
11 | for app-specific overrides.
|
12 |
|
13 | * It is possible to override the code page with an app-specific setting.
|
14 |
|
15 | * There are registry settings in
|
16 | `HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console`. In particular,
|
17 | the `TrueTypeFont` subkey has a list of suitable font names associated with
|
18 | various CJK code pages, as well as default font names.
|
19 |
|
20 | * There are two values in `HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage`
|
21 | that specify the current code pages -- `OEMCP` and `ACP`. Setting the
|
22 | system locale via the Control Panel's "Region" or "Language" dialogs seems
|
23 | to change these code page values.
|
24 |
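As a rough illustration of the last point, the two values can be read back with Python's standard `winreg` module. This is a minimal sketch: the helper name `read_system_code_pages` is made up for this example, and it returns `None` when not running on Windows.

```python
import sys

# Registry path quoted in the notes above (under HKLM).
NLS_CODEPAGE_KEY = r"SYSTEM\CurrentControlSet\Control\Nls\CodePage"


def read_system_code_pages():
    """Return {'OEMCP': ..., 'ACP': ...} from the registry, or None off Windows."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg  # Windows-only standard library module

    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, NLS_CODEPAGE_KEY) as key:
        for name in ("OEMCP", "ACP"):
            value, _value_type = winreg.QueryValueEx(key, name)
            result[name] = value
    return result


print(read_system_code_pages())
```

Running this before and after changing the system locale should show whether the two values moved, which is the behavior the note above only describes as "seems to".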
|
25 |
|
26 | Console fonts
|
27 | =============
|
28 |
|
29 | * The `FontFamily` field of `CONSOLE_FONT_INFOEX` has two parts:
|
30 | - The high four bits can be exactly one of the `FF_xxxx` font families:
|
31 | FF_DONTCARE(0x00)
|
32 | FF_ROMAN(0x10)
|
33 | FF_SWISS(0x20)
|
34 | FF_MODERN(0x30)
|
35 | FF_SCRIPT(0x40)
|
36 | FF_DECORATIVE(0x50)
|
37 | - The low four bits are a bitmask:
|
38 | TMPF_FIXED_PITCH(1) -- actually means variable pitch
|
39 | TMPF_VECTOR(2)
|
40 | TMPF_TRUETYPE(4)
|
41 | TMPF_DEVICE(8)
|
42 |
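The split described above can be sketched as a small decoder. The constants are the ones listed in the bullet; the function name `decode_font_family` is an assumption made for this example.

```python
# Font family values occupy the high four bits of FontFamily.
FF_NAMES = {
    0x00: "FF_DONTCARE",
    0x10: "FF_ROMAN",
    0x20: "FF_SWISS",
    0x30: "FF_MODERN",
    0x40: "FF_SCRIPT",
    0x50: "FF_DECORATIVE",
}

# The low four bits are independent flags.
TMPF_FLAGS = {
    0x01: "TMPF_FIXED_PITCH",  # set means *variable* pitch, despite the name
    0x02: "TMPF_VECTOR",
    0x04: "TMPF_TRUETYPE",
    0x08: "TMPF_DEVICE",
}


def decode_font_family(value):
    """Split a FontFamily value into (family name, list of flag names)."""
    high = value & 0xF0
    family = FF_NAMES.get(high, "unknown(0x%02X)" % high)
    flags = [name for bit, name in sorted(TMPF_FLAGS.items()) if value & bit]
    return family, flags


print(decode_font_family(0x30))  # ('FF_MODERN', [])
print(decode_font_family(0x36))  # ('FF_MODERN', ['TMPF_VECTOR', 'TMPF_TRUETYPE'])
```

The value 0x30 is the one reported for the default raster font later in these notes; 0x36 is shown only as an example of a value with the vector and TrueType bits set.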
|
43 | * Each console has its own independent console font table. The current font
|
44 | is identified with an index into this table. The size of the table is
|
45 | returned by the undocumented `GetNumberOfConsoleFonts` API. It is apparently
|
46 | possible to get the table size without this API, by instead calling
|
47 | `GetConsoleFontSize` on each nonnegative index starting with 0 until the API
|
48 | fails by returning (0, 0).
|
49 |
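The probing loop described above might look like the following sketch. To keep it runnable anywhere, the real `GetConsoleFontSize` call is replaced by a hypothetical callable `get_font_size(index)` that returns a `(width, height)` tuple; the sample sizes are copied from the Windows 8 table in this section.

```python
def count_console_fonts(get_font_size, limit=1024):
    """Count font table entries by probing indexes until (0, 0) comes back.

    get_font_size is a stand-in for GetConsoleFontSize; limit is a safety
    cap chosen for this sketch, not something the API defines.
    """
    count = 0
    while count < limit and get_font_size(count) != (0, 0):
        count += 1
    return count


# Fake font table (width, height), taken from the Windows 8 English example.
SAMPLE_TABLE = [(4, 6), (6, 8), (8, 8), (16, 8), (5, 12),
                (7, 12), (8, 12), (16, 12), (12, 16), (10, 18)]


def fake_get_font_size(index):
    # Mimic the reported failure mode: out-of-range indexes yield (0, 0).
    return SAMPLE_TABLE[index] if index < len(SAMPLE_TABLE) else (0, 0)


print(count_console_fonts(fake_get_font_size))  # 10
```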
|
50 | * The font table grows dynamically. Each time the console is configured with
|
51 | a previously-unused (FaceName, Size) combination, two entries are added to
|
52 | the font table -- one with normal weight and one with bold weight. Fonts
|
53 | added this way are always TrueType fonts.
|
54 |
|
55 | * Initially, the font table appears to contain only raster fonts. For
|
56 | example, on an English Windows 8 installation, here is the initial font
|
57 | table:
|
58 | font 0: 4x6
|
59 | font 1: 6x8
|
60 | font 2: 8x8
|
61 | font 3: 16x8
|
62 | font 4: 5x12
|
63 | font 5: 7x12
|
64 | font 6: 8x12 -- the current font
|
65 | font 7: 16x12
|
66 | font 8: 12x16
|
67 | font 9: 10x18
|
68 | `GetNumberOfConsoleFonts` returns 10, and this table matches the raster font
|
69 | sizes according to the console properties dialog.
|
70 |
|
71 | * With a Japanese or Chinese locale, the initial font table appears to contain
|
72 | the sizes applicable to the East Asian raster font as well as the sizes for
|
73 | the CP437/CP1252 raster font.
|
74 |
|
75 | * The index passed to `SetCurrentConsoleFontEx` apparently has no effect.
|
76 | The undocumented `SetConsoleFont` API, however, accepts *only* a font index,
|
77 | and on Windows 8 English, it switches between all 10 fonts, even font index
|
78 | #0.
|
79 |
|
80 | * If the index passed to `SetConsoleFont` identifies a Raster Font
|
81 | incompatible with the current code page, then another Raster Font is
|
82 | activated.
|
83 |
|
84 | * Passing "Terminal" to `SetCurrentConsoleFontEx` seems to have no effect.
|
85 | Perhaps relatedly, `SetCurrentConsoleFontEx` does not fail if it is given a
|
86 | bogus `FaceName`. Some font is still chosen and activated. Passing a face
|
87 | name and height seems to work reliably, modulo the CP936 issue described
|
88 | below.
|
89 |
|
90 |
|
91 | Console fonts and code pages
|
92 | ============================
|
93 |
|
94 | * On an English Windows installation, the default code page is 437, and it
|
95 | cannot be set to 932 (Shift-JIS). (The API call fails.) Changing the
|
96 | system locale to "Japanese (Japan)" using the Region/Language dialog
|
97 | changes the default CP to 932 and permits changing the console CP between
|
98 | 437 and 932.
|
99 |
|
100 | * A console has both an input code page and an output code page
|
101 | (`{Get,Set}ConsoleCP` and `{Get,Set}ConsoleOutputCP`). I'm not going to
|
102 | distinguish between the two for this document; presumably only the output
|
103 | CP matters. The code page can change while the console is open, e.g.
|
104 | by running `mode con: cp select={932,437,1252}` or by calling
|
105 | `SetConsoleOutputCP`.
|
106 |
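For reference, the four code page calls named above can be reached from Python through `ctypes`. This is a sketch only: the wrapper names are made up for this example, and both wrappers return `None` when not running on Windows.

```python
import ctypes
import sys


def get_console_code_pages():
    """Return (input_cp, output_cp) for the attached console, or None off Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetConsoleCP.restype = ctypes.c_uint
    kernel32.GetConsoleOutputCP.restype = ctypes.c_uint
    return kernel32.GetConsoleCP(), kernel32.GetConsoleOutputCP()


def set_console_output_code_page(code_page):
    """Try to switch the output code page; True/False on Windows, None elsewhere."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.SetConsoleOutputCP.argtypes = [ctypes.c_uint]
    kernel32.SetConsoleOutputCP.restype = ctypes.c_int
    return bool(kernel32.SetConsoleOutputCP(code_page))


print(get_console_code_pages())
```

On an English installation, `set_console_output_code_page(932)` would be expected to return `False`, matching the failure noted in the first bullet of this section.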
|
107 | * The current code page restricts which TrueType fonts and which Raster Font
|
108 | sizes are available in the console properties dialog. This can change
|
109 | while the console is open.
|
110 |
|
111 | * Changing the code page almost(?) always changes the current console font.
|
112 | So far, I don't know how the new font is chosen.
|
113 |
|
114 | * With a CP of 932, the only TrueType font available in the console properties
|
115 | dialog is "MS Gothic", displayed as "MS ゴシック". It is still possible to
|
116 | use the English-default TrueType console fonts, Lucida Console and Consolas,
|
117 | via `SetCurrentConsoleFontEx`.
|
118 |
|
119 | * When using a Raster Font and CP437 or CP1252, writing a UTF-16 codepoint not
|
120 | representable in the code page instead writes a question mark ('?') to the
|
121 | console. This conversion does not apply with a TrueType font, nor with the
|
122 | Raster Font for CP932 or CP936.
|
123 |
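The question-mark substitution in the last bullet resembles a lossy encode into the code page. The sketch below uses Python's own `cp437` and `cp932` codecs to show the effect; it is an analogy, not the console's actual conversion path, and the helper name `to_code_page` is made up for this example.

```python
def to_code_page(text, code_page):
    """Encode text into a code page, replacing unrepresentable characters with '?'."""
    return text.encode(code_page, errors="replace")


# U+2014 and U+3044 have no CP437 encoding, so each becomes a question mark.
print(to_code_page("\u2014", "cp437"))  # b'?'
print(to_code_page("\u3044", "cp437"))  # b'?'

# U+3044 is representable in CP932 as a two-byte sequence.
print(to_code_page("\u3044", "cp932"))  # b'\x82\xa2'
```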
|
124 |
|
125 | ReadConsoleOutput and double-width characters
|
126 | ==============================================
|
127 |
|
128 | * With a Raster Font active, when `ReadConsoleOutputW` reads two cells of a
|
129 | double-width character, it fills only a single `CHAR_INFO` structure. The
|
130 | unused trailing `CHAR_INFO` structures are zero-filled. With a TrueType
|
131 | font active, `ReadConsoleOutputW` instead fills two `CHAR_INFO` structures,
|
132 | the first marked with `COMMON_LVB_LEADING_BYTE` and the second marked with
|
133 | `COMMON_LVB_TRAILING_BYTE`. The flag is a misnomer--there aren't two
|
134 | *bytes*, but two cells, and they have equal `CHAR_INFO.Char.UnicodeChar`
|
135 | values.
|
136 |
|
137 | * `ReadConsoleOutputA`, on the other hand, reads two `CHAR_INFO` cells, and
|
138 | if the UTF-16 value can be represented as two bytes in the ANSI/OEM CP, then
|
139 | the two bytes are placed in the two `CHAR_INFO.Char.AsciiChar` values, and
|
140 | the `COMMON_LVB_{LEADING,TRAILING}_BYTE` values are also used. If the
|
141 | codepoint isn't representable, I don't remember what happens -- I think the
|
142 | `AsciiChar` values take on an invalid marker.
|
143 |
|
144 | * Reading only one cell of a double-width character reads a space (U+0020)
|
145 | instead. Raster-vs-TrueType and wide-vs-ANSI do not matter.
|
146 | - XXX: what about attributes? Can a double-width character have mismatched
|
147 | color attributes?
|
148 | - XXX: what happens when writing to just one cell of a double-width
|
149 | character?
|
150 |
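Given the two `ReadConsoleOutputW` layouts described in this section, a caller that wants plain text has to handle both. A minimal sketch follows, assuming each cell has already been unpacked into a `(UnicodeChar, Attributes)` tuple; the function name `cells_to_text` is an assumption, and treating every NUL character as unused filler is a simplification.

```python
COMMON_LVB_LEADING_BYTE = 0x0100
COMMON_LVB_TRAILING_BYTE = 0x0200


def cells_to_text(cells):
    """Rebuild one row of text from (unicode_char, attributes) cell tuples."""
    out = []
    for char, attributes in cells:
        if attributes & COMMON_LVB_TRAILING_BYTE:
            continue  # second cell of a double-width character (TrueType case)
        if char == "\0":
            continue  # zero-filled, unused trailing entry (raster case)
        out.append(char)
    return "".join(out)


# TrueType layout: the wide character occupies two flagged cells.
truetype_row = [("a", 0x07),
                ("\u3044", 0x07 | COMMON_LVB_LEADING_BYTE),
                ("\u3044", 0x07 | COMMON_LVB_TRAILING_BYTE),
                ("b", 0x07)]

# Raster layout: one cell per wide character, zero-filled tail.
raster_row = [("a", 0x07), ("\u3044", 0x07), ("b", 0x07), ("\0", 0)]

print(cells_to_text(truetype_row) == cells_to_text(raster_row))  # True
```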
|
151 |
|
152 | Default Windows fonts for East Asian languages
|
153 | ==============================================
|
154 | CP932 / Japanese: "MS ゴシック" (MS Gothic)
|
155 | CP936 / Chinese Simplified: "新宋体" (SimSun)
|
156 |
|
157 |
|
158 | Unreliable character width (half-width vs full-width)
|
159 | =====================================================
|
160 |
|
161 | The half-width vs full-width status of a codepoint depends on at least these variables:
|
162 | * OS version (the Win10 legacy and new console modes count as different versions)
|
163 | * system locale (English vs Japanese vs Chinese Simplified vs Chinese Traditional, etc)
|
164 | * code page (437 vs 932 vs 936, etc)
|
165 | * raster vs TrueType (Terminal vs MS Gothic vs SimSun, etc)
|
166 | * font size
|
167 | * rendered-vs-model (rendered width can be larger or smaller than model width)
|
168 |
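The East_Asian_Width property quoted in the example headings below can be looked up with Python's `unicodedata` module. This only reports the Unicode property; as the tables show, the console's rendered and modeled widths often disagree with it. The function name `east_asian_widths` is made up for this example.

```python
import unicodedata


def east_asian_widths(codepoints):
    """Map each codepoint to its East_Asian_Width class ('A', 'W', 'Na', ...)."""
    return {cp: unicodedata.east_asian_width(chr(cp)) for cp in codepoints}


for cp, width in east_asian_widths([0x2014, 0x3044, 0x30FC, 0x4000]).items():
    print("U+%04X %s" % (cp, width))
```

Here `A` means Ambiguous and `W` means Wide.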
|
169 | Example 1: U+2014 (EM DASH): East_Asian_Width: Ambiguous
|
170 | --------------------------------------------------------
|
171 |                                           rendered  modeled
|
172 | CP932: Win7/8 Raster Fonts                half      half
|
173 | CP932: Win7/8 Gothic 14/15px              half      full
|
174 | CP932: Win7/8 Consolas 14/15px            half      full
|
175 | CP932: Win7/8 Lucida Console 14px         half      full
|
176 | CP932: Win7/8 Lucida Console 15px         half      half
|
177 | CP932: Win10New Raster Fonts              half      half
|
178 | CP932: Win10New Gothic 14/15px            half      half
|
179 | CP932: Win10New Consolas 14/15px          half      half
|
180 | CP932: Win10New Lucida Console 14/15px    half      half
|
181 |
|
182 | CP936: Win7/8 Raster Fonts                full      full
|
183 | CP936: Win7/8 SimSun 14px                 full      full
|
184 | CP936: Win7/8 SimSun 15px                 full      half
|
185 | CP936: Win7/8 Consolas 14/15px            half      full
|
186 | CP936: Win10New Raster Fonts              full      full
|
187 | CP936: Win10New SimSun 14/15px            full      full
|
188 | CP936: Win10New Consolas 14/15px          half      half
|
189 |
|
190 | Example 2: U+3044 (HIRAGANA LETTER I): East_Asian_Width: Wide
|
191 | -------------------------------------------------------------
|
192 |                                           rendered  modeled
|
193 | CP932: Win7/8/10N Raster Fonts            full      full
|
194 | CP932: Win7/8/10N Gothic 14/15px          full      full
|
195 | CP932: Win7/8/10N Consolas 14/15px        half(*2)  full
|
196 | CP932: Win7/8/10N Lucida Console 14/15px  half(*3)  full
|
197 |
|
198 | CP936: Win7/8/10N Raster Fonts            full      full
|
199 | CP936: Win7/8/10N SimSun 14/15px          full      full
|
200 | CP936: Win7/8/10N Consolas 14/15px        full      full
|
201 |
|
202 | Example 3: U+30FC (KATAKANA-HIRAGANA PROLONGED SOUND MARK): East_Asian_Width: Wide
|
203 | ----------------------------------------------------------------------------------
|
204 |                                           rendered  modeled
|
205 | CP932: Win7 Raster Fonts                  full      full
|
206 | CP932: Win7 Gothic 14/15px                full      full
|
207 | CP932: Win7 Consolas 14/15px              half(*2)  full
|
208 | CP932: Win7 Lucida Console 14px           half(*3)  full
|
209 | CP932: Win7 Lucida Console 15px           half(*3)  half
|
210 | CP932: Win8 Raster Fonts                  full      full
|
211 | CP932: Win8 Gothic 14px                   full      half
|
212 | CP932: Win8 Gothic 15px                   full      full
|
213 | CP932: Win8 Consolas 14/15px              half(*2)  full
|
214 | CP932: Win8 Lucida Console 14px           half(*3)  full
|
215 | CP932: Win8 Lucida Console 15px           half(*3)  half
|
216 | CP932: Win10New Raster Fonts              full      full
|
217 | CP932: Win10New Gothic 14/15px            full      full
|
218 | CP932: Win10New Consolas 14/15px          half(*2)  half
|
219 | CP932: Win10New Lucida Console 14/15px    half(*2)  half
|
220 |
|
221 | CP936: Win7/8 Raster Fonts                full      full
|
222 | CP936: Win7/8 SimSun 14px                 full      full
|
223 | CP936: Win7/8 SimSun 15px                 full      half
|
224 | CP936: Win7/8 Consolas 14px               full      full
|
225 | CP936: Win7/8 Consolas 15px               full      half
|
226 | CP936: Win10New Raster Fonts              full      full
|
227 | CP936: Win10New SimSun 14/15px            full      full
|
228 | CP936: Win10New Consolas 14/15px          full      full
|
229 |
|
230 | Example 4: U+4000 (CJK UNIFIED IDEOGRAPH-4000): East_Asian_Width: Wide
|
231 | ----------------------------------------------------------------------
|
232 |                                           rendered  modeled
|
233 | CP932: Win7 Raster Fonts                  half(*1)  half
|
234 | CP932: Win7 Gothic 14/15px                full      full
|
235 | CP932: Win7 Consolas 14/15px              half(*2)  full
|
236 | CP932: Win7 Lucida Console 14px           half(*3)  full
|
237 | CP932: Win7 Lucida Console 15px           half(*3)  half
|
238 | CP932: Win8 Raster Fonts                  half(*1)  half
|
239 | CP932: Win8 Gothic 14px                   full      half
|
240 | CP932: Win8 Gothic 15px                   full      full
|
241 | CP932: Win8 Consolas 14/15px              half(*2)  full
|
242 | CP932: Win8 Lucida Console 14px           half(*3)  full
|
243 | CP932: Win8 Lucida Console 15px           half(*3)  half
|
244 | CP932: Win10New Raster Fonts              half(*1)  half
|
245 | CP932: Win10New Gothic 14/15px            full      full
|
246 | CP932: Win10New Consolas 14/15px          half(*2)  half
|
247 | CP932: Win10New Lucida Console 14/15px    half(*2)  half
|
248 |
|
249 | CP936: Win7/8 Raster Fonts                full      full
|
250 | CP936: Win7/8 SimSun 14px                 full      full
|
251 | CP936: Win7/8 SimSun 15px                 full      half
|
252 | CP936: Win7/8 Consolas 14px               full      full
|
253 | CP936: Win7/8 Consolas 15px               full      half
|
254 | CP936: Win10New Raster Fonts              full      full
|
255 | CP936: Win10New SimSun 14/15px            full      full
|
256 | CP936: Win10New Consolas 14/15px          full      full
|
257 |
|
258 | (*1) Rendered as a half-width filled white box
|
259 | (*2) Rendered as a half-width box with a question mark inside
|
260 | (*3) Rendered as a half-width empty box
|
261 | (!!) One of the only places in Win10New where rendered and modeled width disagree
|
262 |
|
263 |
|
264 | Windows quirk: unreliable font heights with CP936 / Chinese Simplified
|
265 | ======================================================================
|
266 |
|
267 | When I set the font to 新宋体 17px, using either the properties dialog or
|
268 | `SetCurrentConsoleFontEx`, the height reported by `GetCurrentConsoleFontEx` is
|
269 | not 17, but is instead 19. The same problem does not affect Raster Fonts,
|
270 | nor have I seen the problem in the English or Japanese locales. I observed
|
271 | this with Windows 7 and Windows 10 new mode.
|
272 |
|
273 | If I set the font using the facename, width, *and* height, then the
|
274 | `SetCurrentConsoleFontEx` and `GetCurrentConsoleFontEx` values agree. If I
|
275 | set the font using *only* the facename and height, then the two values
|
276 | disagree.
|
277 |
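The set-then-read-back comparison described above can be sketched with `ctypes`. The structure layout follows the documented `CONSOLE_FONT_INFOEX` definition; the helper names `make_font_info` and `set_font_and_read_back` are assumptions made for this example, and the second helper does nothing (returns `None`) when not running on Windows.

```python
import ctypes
import sys

LF_FACESIZE = 32
STD_OUTPUT_HANDLE = 0xFFFFFFF5  # (DWORD)-11


class COORD(ctypes.Structure):
    _fields_ = [("X", ctypes.c_short), ("Y", ctypes.c_short)]


class CONSOLE_FONT_INFOEX(ctypes.Structure):
    _fields_ = [
        ("cbSize", ctypes.c_uint32),
        ("nFont", ctypes.c_uint32),
        ("dwFontSize", COORD),
        ("FontFamily", ctypes.c_uint32),
        ("FontWeight", ctypes.c_uint32),
        ("FaceName", ctypes.c_wchar * LF_FACESIZE),
    ]


def make_font_info(face_name, height, width=0):
    """Build a request; width=0 means 'face name and height only'."""
    info = CONSOLE_FONT_INFOEX()
    info.cbSize = ctypes.sizeof(CONSOLE_FONT_INFOEX)
    info.dwFontSize = COORD(width, height)
    info.FontWeight = 400  # FW_NORMAL
    info.FaceName = face_name
    return info


def set_font_and_read_back(face_name, height, width=0):
    """Return (requested_height, reported_height), or None off Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [ctypes.c_uint32]
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    font_api_args = [ctypes.c_void_p, ctypes.c_int,
                     ctypes.POINTER(CONSOLE_FONT_INFOEX)]
    for api in (kernel32.SetCurrentConsoleFontEx,
                kernel32.GetCurrentConsoleFontEx):
        api.argtypes = font_api_args
        api.restype = ctypes.c_int
    console = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    requested = make_font_info(face_name, height, width)
    if not kernel32.SetCurrentConsoleFontEx(console, False,
                                            ctypes.byref(requested)):
        raise ctypes.WinError(ctypes.get_last_error())
    reported = CONSOLE_FONT_INFOEX()
    reported.cbSize = ctypes.sizeof(CONSOLE_FONT_INFOEX)
    if not kernel32.GetCurrentConsoleFontEx(console, False,
                                            ctypes.byref(reported)):
        raise ctypes.WinError(ctypes.get_last_error())
    return height, reported.dwFontSize.Y
```

Based on the observation above, calling `set_font_and_read_back("新宋体", 17)` under CP936 would be expected to return mismatched heights, while also passing the width observed for that size should make them agree. The width to pass is not stated in these notes, so it is left to the caller.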
|
278 |
|
279 | Windows bug: GetCurrentConsoleFontEx is initially invalid
|
280 | =========================================================
|
281 |
|
282 | - Assume there is no configured console font name in the registry. In this
|
283 | case, the console defaults to a raster font.
|
284 | - Open a new console and call the `GetCurrentConsoleFontEx` API.
|
285 | - The `FaceName` field of the returned `CONSOLE_FONT_INFOEX` data
|
286 | structure is incorrect. On Windows 7, 8, and 10, I observed that the
|
287 | field was blank. On Windows 8, occasionally, it instead contained:
|
288 | U+AE72 U+75BE U+0001
|
289 | The other fields of the structure all appeared correct:
|
290 | nFont=6 dwFontSize=(8,12) FontFamily=0x30 FontWeight=400
|
291 | - The `FaceName` field becomes initialized after either of these actions:
|
292 | - Open the console properties dialog and click OK. (Cancel is not
|
293 | sufficient.)
|
294 | - Call the undocumented `SetConsoleFont` with the current font table
|
295 | index, which is 6 in the example above.
|
296 | - It seems that the console uncritically accepts whatever string is
|
297 | stored in the registry, including a blank string, and passes it on to
|
298 | the `GetCurrentConsoleFontEx` caller. It is possible to get the console
|
299 | to *write* a blank setting into the registry -- simply open the console
|
300 | (default or app-specific) properties and click OK.
|