From: Yasuo Ohgaki Date: Wed, 10 Aug 2016 00:47:27 +0000 (+0900) Subject: pull-request/1100 X-Git-Tag: php-7.2.0alpha1~1557 X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=087dcd9381c33057901dbe1ef89847d6fa87316d;p=php pull-request/1100 Request #65081 mb_chr() and mb_ord() Added test cases and little optimization. --- 087dcd9381c33057901dbe1ef89847d6fa87316d diff --cc NEWS index e4bef697d8,d7ac6bfab2..5f0f9f45fa --- a/NEWS +++ b/NEWS @@@ -1,26 -1,313 +1,29 @@@ -PHP NEWS +PHP NEWS ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| -?? ??? 20??, PHP 7.0.0 +?? ??? 2016, PHP 7.2.0alpha1 -<<<<<<< HEAD -- CLI server: - . Refactor MIME type handling to use a hash table instead of linear search. - (Adam) - . Update the MIME type list from the one shipped by Apache HTTPD. (Adam) -======= - Core: - . Fixed bug #69139 (Crash in gc_zval_possible_root on unserialize). - (Laruence) - . Fixed bug #69121 (Segfault in get_current_user when script owner is not - in passwd with ZTS build). (dan at syneto dot net) - . Fixed bug #65593 (Segfault when calling ob_start from output buffering - callback). (Mike) - . Fixed bug #68986 (pointer returned by php_stream_fopen_temporary_file - not validated in memory.c). (nayana at ddproperty dot com) - . Fixed bug #68166 (Exception with invalid character causes segv). (Rasmus) - . Fixed bug #69141 (Missing arguments in reflection info for some builtin - functions). (kostyantyn dot lysyy at oracle dot com) - . Fixed bug #68917 (parse_url fails on some partial urls). (Wei Dai) - -- cURL: - . Fixed bug #69088 (PHP_MINIT_FUNCTION does not fully initialize cURL on - Win32). (Grant Pannell) - . Add CURLPROXY_SOCKS4A and CURLPROXY_SOCKS5_HOSTNAME constants if supported - by libcurl. (Linus Unneback) - -- ODBC: - . Fixed bug #68964 (Allowed memory size exhausted with odbc_exec). (Anatol) - -- Opcache: - . Fixed bug #69125 (Array numeric string as key). (Laruence) - . 
Fixed bug #69038 (switch(SOMECONSTANT) misbehaves). (Laruence) - -- OpenSSL: - . Fixed bug #68912 (Segmentation fault at openssl_spki_new). (Laruence) - . Fixed bug #61285, #68329, #68046, #41631 (encrypted streams don't observe - socket timeouts). (Brad Broerman) - . Fixed bug #68920 (use strict peer_fingerprint input checks) - (Daniel Lowrey) - . Added "alpn_protocols" SSL context option allowing encrypted client/server - streams to negotiate alternative protocols using the ALPN TLS extension when - built against OpenSSL 1.0.2 or newer. Negotiated protocol information is - accessible by passing streams to the new stream_socket_crypto_info(). - (Daniel Lowrey) - -- pgsql: - . Fixed bug #68638 (pg_update() fails to store infinite values). - (william dot welter at 4linux dot com dot br, Laruence) - -- Readline: - . Fixed bug #69054 (Null dereference in readline_(read|write)_history() without - parameters). (Laruence) - -- SOAP: - . Fixed bug #69085 (SoapClient's __call() type confusion through - unserialize()). (andrea dot palazzo at truel dot it, Laruence) - -- SPL: - . Fixed bug #69108 ("Segmentation fault" when (de)serializing - SplObjectStorage). (Laruence) - . Fixed bug #68557 (RecursiveDirectoryIterator::seek(0) broken after - calling getChildren()). (Julien) - -- Stream: - . Added stream_socket_crypto_info() allowing inspection of negotiated TLS values - -- CGI: - . Fixed bug #69015 (php-cgi's getopt does not see $argv). (Laruence) - -- CLI: - . Fixed bug #67741 (auto_prepend_file messes up __LINE__). (Reeze Xia) - -- FPM: - . Fixed bug #68822 (request time is reset too early). (honghu069 at 163 dot com) - -19 Feb 2015, PHP 5.6.6 - -- Core: - . Removed support for multi-line headers, as the are deprecated by RFC 7230. - (Stas) - . Fixed bug #67068 (getClosure returns somethings that's not a closure). - (Danack at basereality dot com) - . Fixed bug #68942 (Use after free vulnerability in unserialize() with - DateTimeZone). (CVE-2015-0273) (Stas) - . 
Fixed bug #68925 (Mitigation for CVE-2015-0235 – GHOST: glibc gethostbyname - buffer overflow). (Stas) - . Fixed Bug #67988 (htmlspecialchars() does not respect default_charset - specified by ini_set) (Yasuo) - . Added NULL byte protection to exec, system and passthru. (Yasuo) - -- Dba: - . Fixed bug #68711 (useless comparisons). (bugreports at internot dot info) - -- Enchant: - . Fixed bug #68552 (heap buffer overflow in enchant_broker_request_dict()). - (Antony) - -- Fileinfo: - . Fixed bug #68827 (Double free with disabled ZMM). (Joshua Rogers) - . Fixed bug #67647 (Bundled libmagic 5.17 does not detect quicktime files - correctly). (Anatol) - . Fixed bug #68731 (finfo_buffer doesn't extract the correct mime with some - gifs). (Anatol) - -- FPM: - . Fixed bug #66479 (Wrong response to FCGI_GET_VALUES). (Frank Stolle) - . Fixed bug #68571 (core dump when webserver close the socket). - (redfoxli069 at gmail dot com, Laruence) - -- JSON: - . Fixed bug #50224 (json_encode() does not always encode a float as a float) - by adding JSON_PRESERVE_ZERO_FRACTION. (Juan Basso) - -- LIBXML: - . Fixed bug #64938 (libxml_disable_entity_loader setting is shared - between threads). (Martin Jansen) - -- Mysqli: - . Fixed bug #68114 (linker error on some OS X machines with fixed - width decimal support) (Keyur Govande) - . Fixed bug #68657 (Reading 4 byte floats with Mysqli and libmysqlclient - has rounding errors) (Keyur Govande) - -- Opcache: - . Fixed bug with try blocks being removed when extended_info opcode - generation is turned on. (Laruence) - -- PDO_mysql: - . Fixed bug #68750 (PDOMysql with mysqlnd does not allow the usage of - named pipes). (steffenb198 at aol dot com) - -- Phar: - . Fixed bug #68901 (use after free). (bugreports at internot dot info) - -- Pgsql: - . Fixed Bug #65199 (pg_copy_from() modifies input array variable) (Yasuo) - -- Session: - . Fixed bug #68941 (mod_files.sh is a bash-script) (bugzilla at ii.nl, Yasuo) - . 
Fixed Bug #66623 (no EINTR check on flock) (Yasuo) - . Fixed bug #68063 (Empty session IDs do still start sessions) (Yasuo) - -- Sqlite3: - . Fixed bug #68260 (SQLite3Result::fetchArray declares wrong - required_num_args). (Julien) - -- Standard: - . Fixed bug #65272 (flock() out parameter not set correctly in windows). - (Daniel Lowrey) - . Fixed bug #69033 (Request may get env. variables from previous requests - if PHP works as FastCGI). (Anatol) - -- Streams: - . Fixed bug which caused call after final close on streams filter. (Bob) - -22 Jan 2015, PHP 5.6.5 ->>>>>>> PHP-5.6 - -- Core: - . Fixed bug #68933 (Invalid read of size 8 in zend_std_read_property). - (Laruence, arjen at react dot com) - . Fixed bug #68868 (Segfault in clean_non_persistent_constants() in SugarCRM - 6.5.20). (Laruence) - . Fixed bug #68104 (Segfault while pre-evaluating a disabled function). - (Laruence) - . Fixed bug #68252 (segfault in Zend/zend_hash.c in function - _zend_hash_del_el). (Laruence) - . Added PHP_INT_MIN constant. (Andrea) - . Added Closure::call() method. (Andrea) - . Implemented FR #38409 (parse_ini_file() looses the type of booleans). (Tjerk) - . Fixed bug #67959 (Segfault when calling phpversion('spl')). (Florian) - . Implemented the RFC `Catchable "Call to a member function bar() on a - non-object"`. (Timm) - . Added options parameter for unserialize allowing to specify acceptable - classes (https://wiki.php.net/rfc/secure_unserialize). (Stas) - . Fixed bug #68185 ("Inconsistent insteadof definition."- incorrectly triggered). (Julien) - . Fixed bug #65419 (Inside trait, self::class != __CLASS__). (Julien) - . Fixed bug #65576 (Constructor from trait conflicts with inherited - constructor). (dunglas at gmail dot com) - . Removed ZEND_ACC_FINAL_CLASS, promoting ZEND_ACC_FINAL as final class - modifier. (Guilherme Blanco) - . is_long() & is_integer() is now an alias of is_int(). (Kalle) - . Implemented FR #55467 (phpinfo: PHP Variables with $ and single quotes). 
(Kalle) - . Fixed bug #55415 (php_info produces invalid anchor names). (Kalle, Johannes) - . Added ?? operator. (Andrea) - . Added <=> operator. (Andrea) - . Added \u{xxxxx} Unicode Codepoint Escape Syntax. (Andrea) - . Fixed oversight where define() did not support arrays yet const syntax did. (Andrea, Dmitry) - . Use "integer" and "float" instead of "long" and "double" in ZPP, type hint and conversion error messages. (Andrea) - . Implemented FR #55428 (E_RECOVERABLE_ERROR when output buffering in output buffering handler). (Kalle) - . Removed scoped calls of non-static methods from an incompatible $this - context. (Nikita) - . Removed support for #-style comments in ini files. (Nikita) - . Removed support for assigning the result of new by reference. (Nikita) - . Invalid octal literals in source code now produce compile errors, fixes PHPSadness #31. (Andrea) - . Removed dl() function on fpm-fcgi. (Nikita) - . Removed support for hexadecimal numeric strings. (Nikita) - . Removed obsolete extensions and SAPIs. See the full list in UPGRADING. (Anatol) - . Added NULL byte protection to exec, system and passthru. (Yasuo) - -- Curl: - . Fixed bug #68937 (Segfault in curl_multi_exec). (Laruence) - -- Date: - . Fixed day_of_week function as it could sometimes return negative values - internally. (Derick) - . Removed $is_dst parameter from mktime() and gmmktime(). (Nikita) - . Removed date.timezone warning (https://wiki.php.net/rfc/date.timezone_warning_removal). (Bob) - -- DBA: - . Fixed bug #62490 (dba_delete returns true on missing item (inifile)). (Mike) - . Fixed bug #68711 (useless comparisons). (bugreports at internot dot info) - -- DOM: - . Made DOMNode::textContent writeable. (Tjerk) - -- GD: - . Made fontFetch's path parser thread-safe. (Sara) - -- Fileinfo: - . Fixed bug #66242 (libmagic: don't assume char is signed). (ArdB) - -- Filter: - . New FILTER_VALIDATE_DOMAIN and better RFC conformance for FILTER_VALIDATE_URL. (Kevin Dunglas) - -- FPM: - . 
Fixed bug #68945 (Unknown admin values segfault pools). (Laruence) - . Fixed bug #65933 (Cannot specify config lines longer than 1024 bytes). (Chris Wright) - . Implement request #67106 (Split main fpm config). (Elan Ruusamäe, Remi) - -- JSON - . Replace non-free JSON parser with a parser from Jsond extension, fixes #63520 - (JSON extension includes a problematic license statement). (Jakub Zelenka) - . Fixed bug #68938 (json_decode() decodes empty string without error). - (jeremy at bat-country dot us) - -- LiteSpeed: - . Updated LiteSpeed SAPI code from V5.5 to V6.6. (George Wang) - -- Mcrypt: - . Fixed possible read after end of buffer and use after free. (Dmitry) - -- Opcache: - . Fixed bug with try blocks being removed when extended_info opcode - generation is turned on. (Laruence) - . Fixed bug #68644 (strlen incorrect : mbstring + func_overload=2 +UTF-8 - + Opcache). (Laruence) - -- OpenSSL: - . Fixed bug #61285, #68329, #68046, #41631 (encrypted streams don't observe - socket timeouts). (Brad Broerman) - -- pcntl: - . Fixed bug #60509 (pcntl_signal doesn't decrease ref-count of old handler - when setting SIG_DFL). (Julien) - -- PCRE: - . Removed support for the /e (PREG_REPLACE_EVAL) modifier. (Nikita) - -- PDO: - . Fixed bug #59450 (./configure fails with "Cannot find php_pdo_driver.h"). - (maxime dot besson at smile dot fr) - -- PDO_mysql: - . Fixed bug #68424 (Add new PDO mysql connection attr to control multi - statements option). (peter dot wolanin at acquia dot com) - -- Reflection - . Fixed inheritance chain of Reflector interface. (Tjerk) - -- Session: - . Fixed bug #67694 (Regression in session_regenerate_id()). (Tjerk) - . Fixed bug #68941 (mod_files.sh is a bash-script). (bugzilla at ii.nl, Yasuo) - -- SOAP: - . Fixed bug #68361 (Segmentation fault on SoapClient::__getTypes). (Laruence) - -- SPL: - . Implemented #67886 (SplPriorityQueue/SplHeap doesn't expose extractFlags - nor curruption state). (Julien) - . 
Fixed bug #66405 (RecursiveDirectoryIterator::CURRENT_AS_PATHNAME - breaks the RecursiveIterator). (Paul Garvin) - . Fixed bug #68479 (Added escape parameter to SplFileObject::fputcsv). (Salathe) - -- Sqlite3: - . Fixed bug #68260 (SQLite3Result::fetchArray declares wrong - required_num_args). (Julien) - -- Standard: - . Removed call_user_method() and call_user_method_array() functions. (Kalle) - . Fixed user session handlers (See rfc:session.user.return-value). (Sara) - . Added intdiv() function. (Andrea) - . Improved precision of log() function for base 2 and 10. (Marc Bennewitz) - . Remove string category support in setlocale(). (Nikita) - . Remove set_magic_quotes_runtime() and its alias magic_quotes_runtime(). - (Nikita) - . Fixed bug #65272 (flock() out parameter not set correctly in windows). - (Daniel Lowrey) - -- Streams: - . Fixed bug #68532 (convert.base64-encode omits padding bytes). - (blaesius at krumedia dot de) - . Removed set_socket_blocking() in favor of its alias stream_set_blocking(). - (Nikita) - -- XSL: - . Fixed bug #64776 (The XSLT extension is not thread safe). (Mike) + . Fixed bug #54535 (WSA cleanup executes before MSHUTDOWN). (Kalle) + +- EXIF: + . Added support for vendor specific tags for the following formats: + Samsung, DJI, Panasonic, Sony, Pentax, Minolta & Sigma/Foveon. (Kalle) + . Fixed bug #72682 (exif_read_data() fails to read all data for some + images). (Kalle) + . Fixed bug #71534 (Type confusion in exif_read_data() leading to heap + overflow in debug mode). (hlt99 at blinkenshell dot org, Kalle) + . Fixed bug #68547 (Exif Header component value check error). + (sjh21a at gmail dot com, Kalle) + . Fixed bug #66443 (Corrupt EXIF header: maximum directory nesting level + reached for some cameras). (Kalle) + . Fixed Redhat bug #1362571 (PHP not returning full results for + exif_read_data function). (Kalle) + +- GMP: + . Fixed bug #70896 (gmp_fact() silently ignores non-integer input). (Sara) + ++- Mbstring: ++ . 
Implemented reuqest #66024 (chr() and mb_ord()) (Masakielastic, Yasuo) + <<< NOTE: Insert NEWS from last stable release here prior to actual release! >>> + diff --cc UPGRADING index 9f59a49757,d4b1d0afa3..12a5ec84ef --- a/UPGRADING +++ b/UPGRADING @@@ -157,17 -441,15 +157,20 @@@ PHP 7.1 UPGRADE NOTE ======================================== 6. New Functions ======================================== -- GMP - . Added gmp_random_seed(). +- Core: + . Added sapi_windows_cp_set(), sapi_windows_cp_get(), sapi_windows_cp_is_utf8(), + sapi_windows_cp_conv() for codepage handling. -- Standard - . Added intdiv() function for integer division. +- pcntl: + . Added pcntl_signal_get_handler() that returns the current signal handler + for a particular signal. -- Stream: - . Added stream_socket_crypto_info() allowing inspection of negotiated TLS - connection properties ++- Mbstring: ++ . Added mb_chr() and mb_ord(). ++ +- Standard: + . Added is_iterable() that determines if a value will be accepted by the new + iterable pseudo-type. ======================================== 7. 
New Classes and Interfaces
diff --cc ext/mbstring/mbstring.c
index a44a9dade1,6cf91c094b..afa54ac98c
--- a/ext/mbstring/mbstring.c
+++ b/ext/mbstring/mbstring.c
@@@ -4745,6 -4591,392 +4757,318 @@@ PHP_FUNCTION(mb_check_encoding
  }
  /* }}} */
 -static const enum mbfl_no_encoding php_mb_unsupported_no_encoding_list[] = {
 -	mbfl_no_encoding_pass,
 -	mbfl_no_encoding_auto,
 -	mbfl_no_encoding_wchar,
 -	mbfl_no_encoding_byte2be,
 -	mbfl_no_encoding_byte2le,
 -	mbfl_no_encoding_byte4be,
 -	mbfl_no_encoding_byte4le,
 -	mbfl_no_encoding_base64,
 -	mbfl_no_encoding_uuencode,
 -	mbfl_no_encoding_html_ent,
 -	mbfl_no_encoding_qprint,
 -	mbfl_no_encoding_utf7,
 -	mbfl_no_encoding_utf7imap,
 -	mbfl_no_encoding_2022kr,
 -	mbfl_no_encoding_jis,
 -	mbfl_no_encoding_2022jp,
 -	mbfl_no_encoding_2022jpms,
 -	mbfl_no_encoding_jis_ms,
 -	mbfl_no_encoding_2022jp_2004,
 -	mbfl_no_encoding_2022jp_kddi,
 -	mbfl_no_encoding_cp50220,
 -	mbfl_no_encoding_cp50220raw,
 -	mbfl_no_encoding_cp50221,
 -	mbfl_no_encoding_cp50222
 -};
+
 -static inline int php_mb_is_unsupported_no_encoding(enum mbfl_no_encoding no_enc)
++/* See mbfl_no_encoding definition for list of unsupported encodings */
++static inline zend_bool php_mb_is_unsupported_no_encoding(enum mbfl_no_encoding no_enc)
+ {
 -	int i;
 -	int size = sizeof(php_mb_unsupported_no_encoding_list)/sizeof(php_mb_unsupported_no_encoding_list[0]);
 -
 -	for (i = 0; i < size; i++) {
 -
 -		if (no_enc == php_mb_unsupported_no_encoding_list[i]) {
 -			return 1;
 -		}
 -
 -	}
 -
 -	return 0;
++	return ((no_enc >= mbfl_no_encoding_invalid && no_enc <= mbfl_no_encoding_qprint)
++			|| (no_enc >= mbfl_no_encoding_utf7 && no_enc <= mbfl_no_encoding_utf7imap)
++			|| (no_enc >= mbfl_no_encoding_jis && no_enc <= mbfl_no_encoding_2022jpms)
++			|| (no_enc >= mbfl_no_encoding_cp50220 && no_enc <= mbfl_no_encoding_cp50222));
+ }
+
 -static const enum mbfl_no_encoding php_mb_no_encoding_unicode_list[] = {
 -	mbfl_no_encoding_utf8,
 -	mbfl_no_encoding_utf8_docomo,
 -	mbfl_no_encoding_utf8_kddi_a,
 -	mbfl_no_encoding_utf8_kddi_b,
 -	mbfl_no_encoding_utf8_sb,
 -	mbfl_no_encoding_ucs4,
 -	mbfl_no_encoding_ucs4be,
 -	mbfl_no_encoding_ucs4le,
 -	mbfl_no_encoding_utf32,
 -	mbfl_no_encoding_utf32be,
 -	mbfl_no_encoding_utf32le,
 -	mbfl_no_encoding_ucs2,
 -	mbfl_no_encoding_ucs2be,
 -	mbfl_no_encoding_ucs2le,
 -	mbfl_no_encoding_utf16,
 -	mbfl_no_encoding_utf16be,
 -	mbfl_no_encoding_utf16le
 -};
+
 -static inline int php_mb_is_no_encoding_unicode(enum mbfl_no_encoding no_enc)
++/* See mbfl_no_encoding definition for list of unicode encodings */
++static inline zend_bool php_mb_is_no_encoding_unicode(enum mbfl_no_encoding no_enc)
+ {
 -	int i;
 -	int size = sizeof(php_mb_no_encoding_unicode_list)/sizeof(php_mb_no_encoding_unicode_list[0]);
 -
 -	for (i = 0; i < size; i++) {
 -
 -		if (no_enc == php_mb_no_encoding_unicode_list[i]) {
 -			return 1;
 -		}
 -
 -	}
 -
 -	return 0;
++	return (no_enc >= mbfl_no_encoding_ucs4 && no_enc <= mbfl_no_encoding_utf8_sb);
+ }
+
 -static const enum mbfl_no_encoding php_mb_no_encoding_utf8_list[] = {
 -	mbfl_no_encoding_utf8,
 -	mbfl_no_encoding_utf8_docomo,
 -	mbfl_no_encoding_utf8_kddi_a,
 -	mbfl_no_encoding_utf8_kddi_b,
 -	mbfl_no_encoding_utf8_sb
 -};
+
 -static inline int php_mb_is_no_encoding_utf8(enum mbfl_no_encoding no_enc)
++/* See mbfl_no_encoding definition for list of UTF-8 encodings */
++static inline zend_bool php_mb_is_no_encoding_utf8(enum mbfl_no_encoding no_enc)
+ {
 -	int i;
 -	int size = sizeof(php_mb_no_encoding_utf8_list)/sizeof(php_mb_no_encoding_utf8_list[0]);
 -
 -	for (i = 0; i < size; i++) {
 -
 -		if (no_enc == php_mb_no_encoding_utf8_list[i]) {
 -			return 1;
 -		}
 -
 -	}
 -
 -	return 0;
++	return (no_enc >= mbfl_no_encoding_utf8 && no_enc <= mbfl_no_encoding_utf8_sb);
+ }
+
++
+ static inline zend_long php_mb_ord(const char* str, size_t str_len, const char* enc)
+ {
+ 	enum mbfl_no_encoding no_enc;
+ 	char* ret;
+ 	size_t ret_len;
+ 	const mbfl_encoding *encoding;
+ 	unsigned char char_len;
+ 	zend_long cp;
+
+ 	if (enc == NULL) {
+ 		no_enc = MBSTRG(current_internal_encoding)->no_encoding;
+ 	} else {
+ 		no_enc = mbfl_name2no_encoding(enc);
+
+ 		if (no_enc == mbfl_no_encoding_invalid) {
 -			php_error_docref(NULL, E_WARNING, "Unknown encoding \"%s\"", enc);
++			php_error_docref(NULL, E_WARNING, "Unknown encoding \"%s\"", enc);
+ 			return -1;
+ 		}
+ 	}
+
+ 	if (php_mb_is_no_encoding_unicode(no_enc)) {
+
+ 		ret = php_mb_convert_encoding(str, str_len, "UCS-4BE", enc, &ret_len);
+
+ 		if (ret == NULL) {
+ 			return -1;
+ 		}
+
+ 		cp = (unsigned char) ret[0] << 24 | \
+ 			 (unsigned char) ret[1] << 16 | \
+ 			 (unsigned char) ret[2] << 8 | \
+ 			 (unsigned char) ret[3];
+
+ 		efree(ret);
+
+ 		return cp;
+
+ 	} else if (php_mb_is_unsupported_no_encoding(no_enc)) {
+ 		php_error_docref(NULL, E_WARNING, "Unsupported encoding \"%s\"", enc);
+ 		return -1;
+ 	}
+
+ 	ret = php_mb_convert_encoding(str, str_len, enc, enc, &ret_len);
+
+ 	if (ret == NULL) {
+ 		return -1;
+ 	}
+
+ 	encoding = mbfl_no2encoding(no_enc);
+ 	char_len = php_mb_mbchar_bytes_ex(ret, encoding);
+
+ 	if (char_len == 1) {
+ 		cp = (unsigned char) ret[0];
+ 	} else if (char_len == 2) {
+ 		cp = ((unsigned char) ret[0] << 8) | \
+ 			 (unsigned char) ret[1];
+ 	} else if (char_len == 3) {
+ 		cp = ((unsigned char) ret[0] << 16) | \
+ 			 ((unsigned char) ret[1] << 8) | \
+ 			 (unsigned char) ret[2];
+ 	} else {
+ 		cp = ((unsigned char) ret[0] << 24) | \
+ 			 ((unsigned char) ret[1] << 16) | \
+ 			 ((unsigned char) ret[2] << 8) | \
+ 			 (unsigned char) ret[3];
+ 	}
+
+ 	efree(ret);
+
+ 	return cp;
+ }
+
++
+ /* {{{ proto bool mb_ord([string str[, string encoding]]) */
+ PHP_FUNCTION(mb_ord)
+ {
+ 	char* str;
+ 	size_t str_len;
+ 	char* enc = NULL;
+ 	size_t enc_len;
+ 	zend_long cp;
+
+ #ifndef FAST_ZPP
+ 	if (zend_parse_parameters(ZEND_NUM_ARGS(), "s|s", &str, &str_len, &enc, &enc_len) == FAILURE) {
+ 		return;
+ 	}
+ #else
+ 	ZEND_PARSE_PARAMETERS_START(1, 2)
+ 		Z_PARAM_STRING(str, str_len)
+ 		Z_PARAM_OPTIONAL
+ 		Z_PARAM_STRING(enc, enc_len)
+ 	ZEND_PARSE_PARAMETERS_END();
+ #endif
+
+ 	cp = php_mb_ord(str, str_len, enc);
+
+ 	if (0 > cp) {
+ 		RETURN_FALSE;
+ 	}
+
+ 	RETURN_LONG(cp);
+ }
+ /* }}} */
+
++
+ static inline char* php_mb_chr(zend_long cp, const char* enc, size_t *output_len)
+ {
+ 	enum mbfl_no_encoding no_enc;
+ 	char* buf;
+ 	size_t buf_len;
+ 	char* ret;
+ 	size_t ret_len;
+
+ 	if (enc == NULL) {
+ 		no_enc = MBSTRG(current_internal_encoding)->no_encoding;
+ 	} else {
+ 		no_enc = mbfl_name2no_encoding(enc);
+ 		if (no_enc == mbfl_no_encoding_invalid) {
+ 			php_error_docref(NULL, E_WARNING, "Unknown encoding \"%s\"", enc);
+ 			return NULL;
+ 		}
+ 	}
+
 -
+ 	if (php_mb_is_no_encoding_utf8(no_enc)) {
+
+ 		if (0 > cp || cp > 0x10ffff || (cp > 0xd7ff && 0xe000 > cp)) {
+ 			if (php_mb_is_no_encoding_utf8(MBSTRG(current_internal_encoding)->no_encoding)) {
+ 				cp = MBSTRG(current_filter_illegal_substchar);
+ 			} else if (php_mb_is_no_encoding_unicode(MBSTRG(current_internal_encoding)->no_encoding)) {
+ 				if (0xd800 > MBSTRG(current_filter_illegal_substchar) || MBSTRG(current_filter_illegal_substchar) > 0xdfff) {
+ 					cp = MBSTRG(current_filter_illegal_substchar);
+ 				} else {
+ 					cp = 0x3f;
+ 				}
+ 			} else {
+ 				cp = 0x3f;
+ 			}
+ 		}
+
+ 		if (cp < 0x80) {
+ 			ret_len = 1;
+ 			ret = (char *) safe_emalloc(ret_len, 1, 1);
+ 			ret[0] = cp;
+ 			ret[1] = 0;
+ 		} else if (cp < 0x800) {
+ 			ret_len = 2;
+ 			ret = (char *) safe_emalloc(ret_len, 1, 1);
+ 			ret[0] = 0xc0 | (cp >> 6);
+ 			ret[1] = 0x80 | (cp & 0x3f);
+ 			ret[2] = 0;
+ 		} else if (cp < 0x10000) {
+ 			ret_len = 3;
+ 			ret = (char *) safe_emalloc(ret_len, 1, 1);
+ 			ret[0] = 0xe0 | (cp >> 12);
+ 			ret[1] = 0x80 | ((cp >> 6) & 0x3f);
+ 			ret[2] = 0x80 | (cp & 0x3f);
+ 			ret[3] = 0;
+ 		} else {
+ 			ret_len = 4;
+ 			ret = (char *) safe_emalloc(ret_len, 1, 1);
+ 			ret[0] = 0xf0 | (cp >> 18);
+ 			ret[1] = 0x80 | ((cp >> 12) & 0x3f);
+ 			ret[2] = 0x80 | ((cp >> 6) & 0x3f);
+ 			ret[3] = 0x80 | (cp & 0x3f);
+ 			ret[4] = 0;
+ 		}
+
+ 		if (output_len) {
+ 			*output_len = ret_len;
+ 		}
+
+ 		return ret;
+
+ 	} else if (php_mb_is_no_encoding_unicode(no_enc)) {
+
+ 		if (0 > cp || 0x10ffff < cp) {
+
+ 			if (php_mb_is_no_encoding_unicode(MBSTRG(current_internal_encoding)->no_encoding)) {
+ 				cp = MBSTRG(current_filter_illegal_substchar);
+ 			} else {
+ 				cp = 0x3f;
+ 			}
+
+ 		}
+
+ 		buf_len = 4;
+ 		buf = (char *) safe_emalloc(buf_len, 1, 1);
+ 		buf[0] = (cp >> 24) & 0xff;
+ 		buf[1] = (cp >> 16) & 0xff;
+ 		buf[2] = (cp >> 8) & 0xff;
+ 		buf[3] = cp & 0xff;
+ 		buf[4] = 0;
+
+ 		ret = php_mb_convert_encoding(buf, buf_len, enc, "UCS-4BE", &ret_len);
+ 		efree(buf);
+
+ 		if (output_len) {
+ 			*output_len = ret_len;
+ 		}
+
+ 		return ret;
+
+ 	} else if (php_mb_is_unsupported_no_encoding(no_enc)) {
+ 		php_error_docref(NULL, E_WARNING, "Unsupported encoding \"%s\"", enc);
+ 		return NULL;
+ 	}
+
+ 	if (0 > cp || cp > 0x100000000) {
+ 		if (no_enc == MBSTRG(current_internal_encoding)->no_encoding) {
+ 			cp = MBSTRG(current_filter_illegal_substchar);
+ 		} else {
+ 			cp = 0x3f;
+ 		}
+ 	}
+
+ 	if (cp < 0x100) {
+ 		buf_len = 1;
+ 		buf = (char *) safe_emalloc(buf_len, 1, 1);
+ 		buf[0] = cp;
+ 		buf[1] = 0;
+ 	} else if (cp < 0x10000) {
+ 		buf_len = 2;
+ 		buf = (char *) safe_emalloc(buf_len, 1, 1);
+ 		buf[0] = cp >> 8;
+ 		buf[1] = cp & 0xff;
+ 		buf[2] = 0;
+ 	} else if (cp < 0x1000000) {
+ 		buf_len = 3;
+ 		buf = (char *) safe_emalloc(buf_len, 1, 1);
+ 		buf[0] = cp >> 16;
+ 		buf[1] = (cp >> 8) & 0xff;
+ 		buf[2] = cp & 0xff;
+ 		buf[3] = 0;
+ 	} else {
+ 		buf_len = 4;
+ 		buf = (char *) safe_emalloc(buf_len, 1, 1);
 -		buf[0] = cp >> 24;
++		buf[0] = cp >> 24;
+ 		buf[1] = (cp >> 16) & 0xff;
+ 		buf[2] = (cp >> 8) & 0xff;
+ 		buf[3] = cp & 0xff;
+ 		buf[4] = 0;
+ 	}
+
+ 	ret = php_mb_convert_encoding(buf, buf_len, enc, enc, &ret_len);
+ 	efree(buf);
+
+ 	if (output_len) {
+ 		*output_len = ret_len;
+ 	}
+
+ 	return ret;
 -}
++}
++
++
+ /* {{{ proto bool mb_ord([int cp[, string encoding]]) */
+ PHP_FUNCTION(mb_chr)
+ {
+ 	zend_long cp;
+ 	char* enc = NULL;
+ 	size_t enc_len;
+ 	char* ret;
+ 	size_t ret_len;
+
+ #ifndef FAST_ZPP
+ 	if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "l|s", &cp, &enc, &enc_len) == FAILURE) {
 -		return;
++		return;
+ 	}
+ #else
+ 	ZEND_PARSE_PARAMETERS_START(1, 2)
+ 		Z_PARAM_LONG(cp)
+ 		Z_PARAM_OPTIONAL
+ 		Z_PARAM_STRING(enc, enc_len)
+ 	ZEND_PARSE_PARAMETERS_END();
+ #endif
+
+ 	ret = php_mb_chr(cp, enc, &ret_len);
+
+ 	if (ret == NULL) {
+ 		RETURN_FALSE;
+ 	}
+
+ 	RETVAL_STRING(ret);
+ 	efree(ret);
+ }
+ /* }}} */
+
++
  /* {{{ php_mb_populate_current_detect_order_list */
  static void php_mb_populate_current_detect_order_list(void)
  {
diff --cc ext/mbstring/tests/mb_chr.phpt
index 0000000000,7047d1c2de..095ce90ae5
mode 000000,100644..100644
--- a/ext/mbstring/tests/mb_chr.phpt
+++ b/ext/mbstring/tests/mb_chr.phpt
@@@ -1,0 -1,35 +1,59 @@@
+ --TEST--
+ mb_chr()
+ --SKIPIF--
+
+ --FILE--
+
 ---EXPECT--
++--EXPECTF--
+ bool(true)
+ bool(true)
+ bool(true)
+ bool(true)
+ bool(true)
 -bool(true)
++bool(true)
++
++Warning: mb_chr(): Unknown encoding "typo" in %s on line 26
++
++Warning: mb_chr(): Unsupported encoding "pass" in %s on line 27
++
++Warning: mb_chr(): Unsupported encoding "jis" in %s on line 28
++
++Warning: mb_chr(): Unsupported encoding "cp50222" in %s on line 29
++
++Warning: mb_chr(): Unsupported encoding "utf-7" in %s on line 30
++bool(false)
++bool(false)
++bool(false)
++bool(false)
++bool(false)
diff --cc ext/mbstring/tests/mb_chr_ord.phpt
index 0000000000,0000000000..613f2e7f42
new file mode 100644
--- /dev/null
+++ b/ext/mbstring/tests/mb_chr_ord.phpt
@@@ -1,0 -1,0 +1,2062 @@@
++--TEST--
++mb_chr() and mb_ord()
++--SKIPIF--
++
++--FILE--
++
+ --FILE--
+
 ---EXPECT--
++--EXPECTF--
++bool(true)
+ bool(true)
+ bool(true)
 -bool(true)
++
++Warning: mb_ord(): Unknown encoding "typo" %s 10
++
++Warning: mb_ord(): Unsupported encoding "pass" %s 11
++
++Warning: mb_ord(): Unsupported encoding "jis" %s 12
++
++Warning: mb_ord(): Unsupported encoding "cp50222" %s 13
++
++Warning: mb_ord(): Unsupported encoding "utf-7" %s 14
++bool(false)
++bool(false)
++bool(false)
++bool(false)
++bool(false)
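
The UTF-8 branch of php_mb_chr() in this patch picks a 1- to 4-byte form from the size of the code point, and php_mb_ord() reads the code point back out of a big-endian UCS-4 buffer. The following is only a minimal standalone sketch of those two byte layouts, assuming plain C99 with no PHP or libmbfl headers. The names `utf8_encode` and `ucs4be_to_cp` are hypothetical and do not appear in the patch, and, unlike the patch, invalid input is reported by returning 0 rather than being replaced with a substitute character.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): encode one Unicode scalar value
 * as UTF-8. out must have room for 5 bytes (4 data bytes plus a NUL).
 * Returns the number of data bytes written, or 0 for a surrogate or a
 * value above 0x10ffff. */
size_t utf8_encode(uint32_t cp, unsigned char out[5])
{
	size_t len;

	if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
		out[0] = 0;
		return 0;
	}

	if (cp < 0x80) {
		/* 0xxxxxxx */
		out[0] = (unsigned char) cp;
		len = 1;
	} else if (cp < 0x800) {
		/* 110xxxxx 10xxxxxx */
		out[0] = (unsigned char) (0xc0 | (cp >> 6));
		out[1] = (unsigned char) (0x80 | (cp & 0x3f));
		len = 2;
	} else if (cp < 0x10000) {
		/* 1110xxxx 10xxxxxx 10xxxxxx */
		out[0] = (unsigned char) (0xe0 | (cp >> 12));
		out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
		out[2] = (unsigned char) (0x80 | (cp & 0x3f));
		len = 3;
	} else {
		/* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
		out[0] = (unsigned char) (0xf0 | (cp >> 18));
		out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3f));
		out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
		out[3] = (unsigned char) (0x80 | (cp & 0x3f));
		len = 4;
	}

	out[len] = 0;
	return len;
}

/* Hypothetical helper (not in the patch): assemble a code point from four
 * big-endian bytes, the same shift-and-or that php_mb_ord() applies to the
 * result of converting its input to "UCS-4BE". */
uint32_t ucs4be_to_cp(const unsigned char b[4])
{
	return ((uint32_t) b[0] << 24) | ((uint32_t) b[1] << 16) |
	       ((uint32_t) b[2] << 8) | (uint32_t) b[3];
}
```

For instance, `utf8_encode(0x3042, buf)` returns 3 and fills `buf` with the bytes e3 81 82, and `ucs4be_to_cp` applied to the bytes 00 01 f6 00 gives 0x1f600.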