Nikita Popov [Tue, 13 Oct 2020 14:17:40 +0000 (16:17 +0200)]
Normalize mb_ereg() return value
mb_ereg()/mb_eregi() currently have an inconsistent return value
based on whether the $matches parameter is passed or not:
> Returns the byte length of the matched string if a match for
> pattern was found in string, or FALSE if no matches were found
> or an error occurred.
>
> If the optional parameter regs was not passed or the length of
> the matched string is 0, this function returns 1.
Coupling this behavior to the $matches parameter doesn't make sense
-- we know the match length either way, there is no technical
reason to distinguish them. However, returning the match length
is not particularly useful either, especially due to the need to
convert 0-length into 1-length to satisfy "truthy" checks. We
could always return 1, which would roughly match the behavior of
preg_match(). However, preg_match() actually returns the number
of matches (0 or 1), while false signals an error, whereas
mb_ereg() returns false both for no match and for an error.
This would result in an odd 1|false return value.
The patch canonicalizes mb_ereg() to always return a boolean,
where true indicates a match and false indicates no match or error.
This also matches the behavior of the mb_ereg_match() and
mb_ereg_search() functions.
This fixes the default value integrity violation in PHP 8.
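The difference between the old and new return values can be modelled roughly as follows. This is only an illustrative sketch in C, not the actual mbstring source: the function names are hypothetical, a negative `match_len` stands in for "no match or error", and an `int` result of 0 stands in for PHP `false`.

```c
#include <stdbool.h>

/* Old behaviour, as described by the quoted documentation.
 * match_len < 0 models "no match or error" (hypothetical convention). */
int old_mb_ereg_result(long match_len, bool regs_passed)
{
    if (match_len < 0) {
        return 0;               /* stands in for PHP false */
    }
    if (!regs_passed || match_len == 0) {
        return 1;               /* forced to 1 so the result stays truthy */
    }
    return (int)match_len;      /* byte length of the match */
}

/* New behaviour: a plain boolean, independent of the regs argument. */
bool new_mb_ereg_result(long match_len)
{
    return match_len >= 0;
}
```

In this model the old result depends on both the match length and whether the output array was passed, while the new result depends only on whether a match was found.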
Alex Dowad [Sun, 6 Sep 2020 08:32:58 +0000 (10:32 +0200)]
Add identify filter for UTF-16, UTF-16LE, UTF-16BE
There was one faulty test in the suite which only passed before because UTF-16 had no
identify filter. After this was fixed, it exposed the problem with the test.
Fix #64076: imap_sort() does not return FALSE on failure
If unsupported `$search_criteria` are passed to `imap_sort()`, the
function returns an empty array, but there is also an error on the
libc-client error stack ("Unknown search criterion: UNSUPPORTED
(errflg=2)"). If, on the other hand, unsupported `$criteria` or
unsupported `$flags` are passed, the function returns `false`. We
solve this inconsistency by returning `false` for unsupported
`$search_criteria` as well.
Nikita Popov [Tue, 13 Oct 2020 13:36:09 +0000 (15:36 +0200)]
Don't accept null in pg_unescape_bytea()
This is an error that slipped in via 8d37c37bcdbf6fa99cd275413342457eeb2c664e.
pg_unescape_bytea() did not accept null in PHP 7.4, and it is not
meaningful for it to accept null now -- it will always fail, and now
with a misleading OOM message.
Nikita Popov [Tue, 13 Oct 2020 10:38:39 +0000 (12:38 +0200)]
Use $statement in mysqli
As we went with $statement rather than $stmts in other places,
let's also use it in mysqli. The discrepancy with mysqli_stmt
is a bit unfortunate, but we can't be consistent with *both*.
Ignore memory leaks reported for some libc-client functions
At least on Windows, some static variables are lazily initialized
during `mail_open()` and `mail_lsub()`, which are reported as memory
leaks. We suppress these false positives.
Nikita Popov [Tue, 13 Oct 2020 09:38:30 +0000 (11:38 +0200)]
Fix handling of throwing undef var in verify return
If we have an undefined variable and null is not accepted by the
return type, we want to throw just the undef var error.
Previously this case led to an infinite loop, because we overwrite
the exception opline in SAVE_OPLINE and it does not get reset
when chaining into a previous exception. Add an assertion to
catch this case earlier.
- Make everything less gratuitously verbose
- Don't litter the code with lots of unneeded NULL checks (for things which
will never be NULL)
- Don't return success/failure code from functions which can never fail
- For encoding structs, don't use pointers to pointers to pointers for the
list of alias strings. A pointer to pointers (2 levels of indirection)
is what actually makes sense. This gets rid of some extraneous
dereference operations.
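As a rough illustration of the last point, a list of alias strings needs only two levels of indirection: a pointer to a NULL-terminated array of string pointers. The struct and function below are hypothetical stand-ins, not the actual libmbfl definitions, and the exact-match `strcmp` lookup is an assumption made to keep the sketch short.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical encoding descriptor; field names are illustrative only. */
struct encoding {
    const char *name;
    const char **aliases;   /* NULL-terminated array of strings, or NULL */
};

/* Sample data for the sketch. */
static const char *utf16_aliases[] = { "utf16", NULL };
const struct encoding sample_utf16 = { "UTF-16", utf16_aliases };

/* Returns 1 if alias appears in the encoding's alias list, else 0. */
int encoding_has_alias(const struct encoding *enc, const char *alias)
{
    if (enc->aliases == NULL) {
        return 0;
    }
    for (const char **p = enc->aliases; *p != NULL; p++) {
        if (strcmp(*p, alias) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With a third level of indirection, every access in the loop would need one more dereference without gaining anything.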
Alex Dowad [Sun, 6 Sep 2020 10:09:02 +0000 (12:09 +0200)]
Remove useless validity check when converting UTF-16LE -> wchar
The check ensures that the decoded codepoint is between 0x10000-0x10FFFF,
which is the valid range which can be encoded in a UTF-16 surrogate pair.
However, just looking at the code, it's obvious that this will be true.
First of all, 0x10000 is added to the decoded codepoint on the previous
line, so how could it be less than 0x10000?
Further, even if the 20 data bits already decoded were 0xFFFFF (all ones),
when you add 0x10000, it comes to 0x10FFFF, which is the very top of the
valid range. So how could the decoded codepoint be more than 0x10FFFF?
It can't.
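The arithmetic above can be checked with a small sketch. This is not the mbstring filter code; the function name is illustrative, and it assumes both inputs have already been identified as a high and a low surrogate.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a UTF-16 high surrogate (0xD800-0xDBFF) and low surrogate
 * (0xDC00-0xDFFF) into a single code point. */
uint32_t decode_surrogate_pair(uint16_t hi, uint16_t lo)
{
    assert(hi >= 0xD800 && hi <= 0xDBFF);
    assert(lo >= 0xDC00 && lo <= 0xDFFF);

    /* Each surrogate carries 10 data bits, giving 20 bits in total. */
    uint32_t bits = ((uint32_t)(hi & 0x3FF) << 10) | (uint32_t)(lo & 0x3FF);

    /* bits lies in 0x00000..0xFFFFF, so the sum lies in 0x10000..0x10FFFF. */
    return bits + 0x10000;
}
```

The smallest pair (0xD800, 0xDC00) gives 0x10000 and the largest pair (0xDBFF, 0xDFFF) gives 0x10FFFF, so a range check on the result can never fail.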
These are typical boolean parameters, so we shouldn't advertise them as
integers. For the `$reverse` parameter this even fixes expectations,
because the `reverse` member is a 1-bit bitfield, so assigning any
even integer would not set it.
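The bitfield pitfall can be reproduced with a minimal sketch. The struct below is a hypothetical stand-in for the libc-client structure, and it assumes the member is declared as an unsigned 1-bit field, so only the lowest bit of an assigned integer survives.

```c
/* Hypothetical stand-in for a struct with a 1-bit flag. */
struct sort_flags {
    unsigned int reverse : 1;
};

/* Assigning an integer directly keeps only its lowest bit. */
unsigned int set_reverse_raw(long value)
{
    struct sort_flags f = {0};
    f.reverse = value;
    return f.reverse;
}

/* Normalizing to a boolean first sets the flag for any non-zero value. */
unsigned int set_reverse_bool(long value)
{
    struct sort_flags f = {0};
    f.reverse = (value != 0);
    return f.reverse;
}
```

Here `set_reverse_raw(2)` leaves the flag cleared, while `set_reverse_bool(2)` sets it, which is what a caller passing a truthy value would expect.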
intl: report more information about message pattern parse errors
The message patterns can be pretty complex, so reporting a generic
U_PARSE_ERROR without any additional information makes it needlessly
hard to fix erroneous patterns.
This commit makes use of the additional UParseError* parameter to
umsg_open to retrieve more details about the parse error and report
them to the user via intl_get_error_message().
Additionally, improve error reporting from the IntlMessage constructor.
Previously, all possible failures when calling IntlMessage::__construct()
would be masked away with a generic "Constructor failed" message.
This would include invalid patterns.
This commit makes sure that the underlying error that caused the
constructor failure is reported as part of the IntlException error
message.
Unless `topbod` is of `TYPEMULTIPART`, `mail_free_body()` does not free
the `nested.part`; while we could do this ourselves, instead we just
ignore additional bodies in this case, i.e. we don't attach them in the
first place.