0x5C is not a Yen sign in CP932 (or CP51932)
author Alex Dowad <alexinbeijing@gmail.com>
Sat, 14 Nov 2020 19:15:11 +0000 (21:15 +0200)
committer Alex Dowad <alexinbeijing@gmail.com>
Wed, 25 Nov 2020 18:51:45 +0000 (20:51 +0200)
commit e4ee97911132c6ad4dee372369472316a33b4eee
tree 0a220096aaf7b90475d553ceb084b9fbcf279752
parent 315d48b4340f79731882b7f87422801a065475b8

When Microsoft created CP932 (their version of Shift-JIS), they explicitly
used bytes 0x00-0x7F to represent ASCII characters rather than JIS X 0201
characters. JIS X 0201 puts a Yen sign at 0x5C, but in ASCII that byte is a
backslash.

So when converting Unicode to CP932, it is not correct to convert U+00A5
to CP932 0x5C. Fortunately, CP932 does have a multi-byte FULLWIDTH YEN SIGN
character which we can use instead.
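
A minimal sketch of the mapping described above, not the actual libmbfl
filter code. The function name `cp932_encode_sketch` is hypothetical, and the
sketch covers only ASCII and the two Yen signs. It assumes the CP932
double-byte sequence 0x81 0x8F for FULLWIDTH YEN SIGN (U+FFE5).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only (not part of libmbfl).
 * Encodes one Unicode code point to CP932, handling only ASCII and the
 * two Yen signs. Returns the number of bytes written to `out` (1 or 2),
 * or 0 when this sketch has no mapping for the code point. */
size_t cp932_encode_sketch(uint32_t cp, unsigned char out[2])
{
    assert(out != NULL);

    if (cp <= 0x7F) {
        /* CP932 bytes 0x00-0x7F are ASCII, so 0x5C stays a backslash. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp == 0x00A5 || cp == 0xFFE5) {
        /* YEN SIGN and FULLWIDTH YEN SIGN both become the double-byte
         * FULLWIDTH YEN SIGN, never the single byte 0x5C. */
        out[0] = 0x81;
        out[1] = 0x8F;
        return 2;
    }
    return 0; /* everything else is outside the scope of this sketch */
}
```

With this sketch, U+005C encodes to the single byte 0x5C, while U+00A5
encodes to the two bytes 0x81 0x8F.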

CP51932 uses the same extended character set as CP932. While CP932 is
Microsoft's extended version of Shift-JIS, CP51932 is their extended version
of EUC-JP. So the same reasoning applies to CP51932.
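
To illustrate why the reasoning carries over, the sketch below derives both
byte sequences from one JIS X 0208 code. The helper names are hypothetical
and this is not the libmbfl code. It assumes FULLWIDTH YEN SIGN sits at
JIS X 0208 code 0x21 0x6F and uses the standard Shift-JIS and EUC-JP byte
transforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only (not part of libmbfl).
 * j1 and j2 are the two bytes of a JIS X 0208 code, each in 0x21..0x7E. */

/* Shift-JIS style packing, as used for double-byte characters in CP932. */
void jis_to_sjis_sketch(unsigned char j1, unsigned char j2,
                        unsigned char out[2])
{
    assert(out != NULL);

    /* Two JIS rows share one lead byte. */
    unsigned int s1 = ((j1 - 0x21u) >> 1) + 0x81u;
    if (s1 > 0x9F)
        s1 += 0x40; /* skip the gap 0xA0-0xDF in the lead byte range */

    unsigned int s2;
    if (j1 & 1) {
        /* odd row: trail bytes start at 0x40 and skip 0x7F */
        s2 = j2 + 0x1Fu;
        if (s2 >= 0x7F)
            s2++;
    } else {
        /* even row: trail bytes start at 0x9F */
        s2 = j2 + 0x7Eu;
    }
    out[0] = (unsigned char)s1;
    out[1] = (unsigned char)s2;
}

/* EUC-JP style packing, as used in CP51932: set the high bit of each byte. */
void jis_to_euc_sketch(unsigned char j1, unsigned char j2,
                       unsigned char out[2])
{
    assert(out != NULL);
    out[0] = (unsigned char)(j1 | 0x80);
    out[1] = (unsigned char)(j2 | 0x80);
}
```

For the code 0x21 0x6F, the first helper yields 0x81 0x8F and the second
yields 0xA1 0xEF. Both are double-byte sequences, so neither collides with
the single byte 0x5C.
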
ext/mbstring/libmbfl/filters/mbfilter_cp51932.c
ext/mbstring/libmbfl/filters/mbfilter_cp932.c
ext/mbstring/tests/cp51932_encoding.phpt
ext/mbstring/tests/cp932_encoding.phpt