According to the Unicode specification (at least as of 5.1), a CRLF sequence
is a single grapheme. We cater to that special case by letting
grapheme_ascii_check() fail whenever the input contains CRLF. While it would
be trivial to handle CRLF in grapheme_ascii_check() for grapheme_strlen(),
grapheme_substr() and grapheme_strrpos() would be much harder to handle, so
we accept the slight performance penalty if CRLF is involved.
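
For illustration, the check described above can be sketched in isolation.
This is a minimal standalone sketch, not the code in the patch: the name
ascii_fast_path_len() and its signature are hypothetical, and unlike the
patched loop it bounds-checks the look-ahead explicitly rather than relying
on the string being NUL-terminated.

```c
#include <stddef.h>

/* Hypothetical stand-in for grapheme_ascii_check().
 * Returns the byte length when every byte is ASCII and no CRLF pair occurs,
 * so that byte offsets equal grapheme offsets. Returns -1 otherwise, which
 * tells the caller to fall back to the slower, fully Unicode-aware path. */
static long ascii_fast_path_len(const unsigned char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (s[i] > 0x7f) {
            return -1; /* non-ASCII byte */
        }
        if (s[i] == '\r' && i + 1 < len && s[i + 1] == '\n') {
            return -1; /* CRLF: two bytes, but a single grapheme */
        }
    }
    return (long)len;
}
```

With this sketch, "abc" (length 3) yields 3, "a\r\nb" (length 4) yields -1,
and "a\rb\n" (length 4) yields 4, because a lone CR or LF is still one
grapheme per byte.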
- IMAP:
. Fixed bug #72852 (imap_mail null dereference). (Anatol)
+- Intl:
+ . Fixed bug #65732 (grapheme_*() is not Unicode compliant on CR LF
+ sequence). (cmb)
+
- JSON:
. Fixed bug #72787 (json_decode reads out of bounds). (Jakub Zelenka)
{
int ret_len = len;
while ( len-- ) {
- if ( *day++ > 0x7f )
+ if ( *day++ > 0x7f || (*day == '\n' && *(day - 1) == '\r') )
return -1;
}
--- /dev/null
+--TEST--
+Bug #65732 (grapheme_*() is not Unicode compliant on CR LF sequence)
+--SKIPIF--
+<?php
+if (!extension_loaded('intl')) die('skip intl extension not available');
+?>
+--FILE--
+<?php
+var_dump(grapheme_strlen("\r\n"));
+var_dump(grapheme_substr(implode("\r\n", ['abc', 'def', 'ghi']), 5));
+var_dump(grapheme_strrpos("a\r\nb", 'b'));
+?>
+==DONE==
+--EXPECT--
+int(1)
+string(7) "ef
+ghi"
+int(2)
+==DONE==