Fix #51903: simplexml_load_file() doesn't use HTTP headers
The `encoding` attribute of the XML declaration is optional; it is good
practice to use external encoding information where available if it is
missing. Thus, we check for `charset` info of `Content-Type` headers,
and see whether the encoding is supported.
We cater to trailing parameters and quoted-strings, but not to escaped
backslashes and quotes in quoted-strings, since no known character
encoding contains these anyway.
Co-authored-by: Michael Wallner <mike@php.net>
Closes GH-6747.
Fix #78719: http wrapper silently ignores long Location headers
When opening HTTP streams, and reading the headers, we currently
discard header lines longer than `HTTP_HEADER_BLOCK_SIZE` (1024 bytes).
While this is not generally forbidden by RFC 7230, section 3.2.5, it
is not generally allowed either, since that may change the "message
framing or response semantics".
We thus fix this by allowing arbitrarily long header lines.
Fix #80751: Comma in recipient name breaks email delivery
So far, `SendText()` simply separates potential email address lists at
any comma, disregarding that commas inside a quoted-string do not
delimit addresses. We fix that by introducing an own variant of
`strtok_r()` which caters to quoted-strings.
We also make `FormatEmailAddress()` aware of quoted strings.
We do not cater to email address comments, and potentially other quirks
of RFC 5322 email addresses, but catering to quoted-strings is supposed
to solve almost all practical use cases.
The lack of such a check leads to false-passes of tests on Windows
which expect no output, but produce a segfault or similar issue. I
discovered this a while ago due to bad tests in an extension I maintain.
Dylan K. Taylor [Mon, 22 Feb 2021 23:56:11 +0000 (23:56 +0000)]
run-tests: fixed exit code not being set on BORKED tests
When no test paths are specified this shows up when 'make test' is used on a PECL extension without specifying tests to run (or in php-src too, I guess...)
Fix #75776: Flushing streams with compression filter is broken
First, the `bzip2.compress` filter has the same issue as `zlib.deflate`
so we port the respective fix[1] to ext/bz2.
Second, there is still an issue, if a stream with an attached
compression filter is flushed before it is closed, without any writes
in between. In that case, the compression is never finalized. We fix
this by enforcing a `_php_stream_flush()` with the `closing` flag set
in `_php_stream_free()`, whenever a write filter is attached. This
call is superfluous for most write filters, but does not hurt, even
when it is unnecessary.
Since we do no longer URL decode cookie names[1], we must not URL
encode the session name. We need to prevent broken Set-Cookie headers,
by rejecting names which contain invalid characters.
Nikita Popov [Mon, 22 Feb 2021 08:33:23 +0000 (09:33 +0100)]
Fixed bug #80781
zend_find_array_dim_slow() may throw, make sure to handle this.
This backports the code we already use for this on PHP-8.0,
and also backports an exception check that makes this easier to
catch.
Nikita Popov [Tue, 16 Feb 2021 14:26:31 +0000 (15:26 +0100)]
Handle incomplete result set metadata more gracefully
Rather than segfaulting because sname is missing lateron, report
a FAIL here. As this indicates a server bug, the errors is reported
as an out of band warning, rather than a client error.
Nikita Popov [Mon, 15 Feb 2021 14:54:49 +0000 (15:54 +0100)]
Suppress OpenSSL error on missing optional config
openssl_pkey_new() fetches various options from the config file --
most of these are optional, and not specifying them is not an error
condition from the perspective of the user. Unfortunately, the
CONF_get_string() API pushes an error when accessing a key that
doesn't exist (_CONF_get_string does not, but that is presumably a
private API). This commit adds a helper php_openssl_conf_get_string()
that automatically clears the error in this case. I've found that
OpenSSL occasionally does the same thing internally:
https://github.com/openssl/openssl/blob/22040fb790c854cefb04bed98ed38ea6357daf83/apps/req.c#L515-L517
Nikita Popov [Mon, 15 Feb 2021 13:52:38 +0000 (14:52 +0100)]
Fix symtable cache being used while cleaning symtable
We need to first clean the symtable and then check whether a cache
slot is available for it. Otherwise, it may happen that a destructor
runs while cleaning the table and uses up all the remaining slots
in the cache.
This is particularly insidious because once we overflow the cache,
the first pointer we modify is symtable_cache_ptr, making it hard
to understand what happened after the fact.
Fix #80706: mail(): Headers after Bcc headers may be ignored
We need to handle the case where a CRLF after a Bcc header is not the
beginning of a folding marker, because in that case the Bcc header was
not the last "thing".
Fix #74779: x() and y() truncating floats to integers
We must not use the locale dependent `atof()`, but instead use the
(hopefully) locale independent `zend_strtod()`, when converting string
representations of floating point numbers which are sent by the server.
When Phars are flushed, a new temporary file is created for each entry
which should be compressed, and the `compressed_filesize` is retrieved.
Afterwards, the Phar manifest is written, and only after that the files
are copied to the actual Phar. So for each such entry there is an open
temp file, what easily exceeds the limit.
Therefore, we use a single temporary file for all entries, and store
the start offset in the otherwise unused `header_offset` member. We
ensure that the `cfp` members are properly set to NULL even if flushing
fails, to avoid use after free scenarios.
This solution is based on a suggestion by @lserni[1].
Nikita Popov [Tue, 2 Feb 2021 09:05:35 +0000 (10:05 +0100)]
Fix persistent leak on load_wsdl_ex failure
Move the load_wsdl_ex call into the zend_try that destroys the
docs hash table. The wsdl will be inserted into docs early on,
and will thus be released on subsequent bailout.
We remove the arbitrary restriction to `INT_MAX`; it is superfluous on
32bit systems where `ZEND_LONG_MAX == INT_MAX` anyway, and not useful
on 64bit systems, where larger files should be readable, if the
`memory_limit` is large enough.
That bug report originally was about `parse_url()` misbehaving, but the
security aspect was actually only regarding `FILTER_VALIDATE_URL`.
Since the changes to `parse_url_ex()` apparently affect userland code
which is relying on the sloppy URL parsing[1], this alternative
restores the old parsing behavior, but ensures that the userinfo is
checked for correctness for `FILTER_VALIDATE_URL`.
Fix #70091: Phar does not mark UTF-8 filenames in ZIP archives
The default encoding of filenames in a ZIP archive is IBM Code Page
437. Phar, however, only supports UTF-8 filenames. Therefore we have
to mark filenames as being stored in UTF-8 by setting the general
purpose bit 11 (the language encoding flag).
The effect of not setting this bit for non ASCII filenames can be seen
in popular tools like 7-Zip and UnZip, but not when extracting the
archives via ext/phar (which is agnostic to the filename encoding), or
via ext/zip (which guesses the encoding). Thus we add a somewhat
brittle low-level test case.
sj-i [Sun, 20 Dec 2020 06:57:54 +0000 (15:57 +0900)]
Fixed bug #42560
Check open_basedir after the fallback to the system's temporary
directory in tempnam().
In order to preserve the current behavior of upload_tmp_dir
(do not check explicitly specified dir, but check fallback),
new flags are added to check open_basedir for explicit dir
and for fallback.
Fix #80595: Resetting POSTFIELDS to empty array breaks request
This is mainly to work around https://github.com/curl/curl/issues/6455,
but not building the mime structure for empty hashtables is a general
performance optimization, so we do not restrict it to affected cURL
versions (7.56.0 to 7.75.0).
The minor change to bug79033.phpt is unexpected, but should not matter
in practice.
Avoid modifying the return value of readline_completion_function()
The internal function `_readline_command_generator()` modifies the
internal array pointer of `readline_completion_function()`'s return
value. We therefore separate the array, what also avoids failing
assertions regarding the array refcount.
Fix #77565: Incorrect locator detection in ZIP-based phars
We must not assume that the first end of central dir signature in a ZIP
archive actually designates the end of central directory record, since
the data in the archive may contain arbitrary byte patterns. Thus, we
better search from the end of the data, what is also slightly more
efficient.
There is, however, no way to detect the end of central directory
signature by searching from the end of the ZIP archive with absolute
certainty, since the signature could be part of the trailing comment.
To mitigate, we check that the comment length fits to the found
position, but that might still not be the correct position in rare
cases.
Dylan K. Taylor [Mon, 4 Jan 2021 23:13:00 +0000 (23:13 +0000)]
gdbinit: use ____print_str to print htable keys
I noticed this problem while dumping the contents of EG(function_table),
where keys for closures start with a null byte. printf interprets this
as a zero-length string and emits nothing. This allows the key to be
rendered properly in readable form.
Fix #77423: parse_url() will deliver a wrong host to user
To avoid that `parse_url()` returns an erroneous host, which would be
valid for `FILTER_VALIDATE_URL`, we make sure that only userinfo which
is valid according to RFC 3986 is treated as such.
For consistency with the existing url parsing code, we use ctype
functions, although that is not necessarily correct.
Adam Seitz [Tue, 1 Dec 2020 23:40:16 +0000 (00:40 +0100)]
Fix #80384: limit read buffer size
In the case of a stream with no filters, php_stream_fill_read_buffer
only reads stream->chunk_size into the read buffer. If the stream has
filters attached, it could unnecessarily buffer a large amount of data.
With this change, php_stream_fill_read_buffer only proceeds until either
the requested size or stream->chunk_size is available in the read buffer.
Co-authored-by: Christoph M. Becker <cmbecker69@gmx.de>
Closes GH-6444.
Nikita Popov [Wed, 16 Dec 2020 11:12:06 +0000 (12:12 +0100)]
MySQLnd: Support cursors in store/get result
This fixes two related issues:
1. When a PS with cursor is used in store_result/get_result,
perform a COM_FETCH with maximum number of rows rather than
silently switching to an unbuffered result set (in the case of
store_result) or erroring (in the case of get_result).
In the future, we might want to make get_result unbuffered for
PS with cursors, as using cursors with buffered result sets
doesn't really make sense. Unlike store_result, get_result
isn't very explicit about what kind of result set is desired.
2. If the client did not request a cursor, but the server reports
that a cursor exists, ignore this and treat the PS as if it
has no cursor (i.e. to not use COM_FETCH). It appears to be a
server side bug that a cursor used inside an SP will be reported
to the client, even though the client cannot use the cursor.
Nikita Popov [Wed, 16 Dec 2020 09:16:50 +0000 (10:16 +0100)]
Fix bug #80523
Don't truncate the file length to unsigned int...
I have no idea whether that fully fixes the problem because the
process gets OOM killed before finishing, but at least the
immediate parse error is gone now.