Nikita Popov [Fri, 15 Mar 2019 11:36:49 +0000 (12:36 +0100)]
Switch to bison location tracking
Locations for AST nodes are now tracked with the help of bison
location tracking. This is more accurate than what we currently do
and easier to extend with more information.
A zend_ast_loc structure is introduced, which is used for the location
stack. Currently it only holds the start lineno, but can be extended
to also hold end lineno and offset/column information in the future.
All AST constructors now accept a zend_ast_loc* as first argument, and
will use it to determine their lineno. Previously this used either the
CG(zend_lineno), or the smallest AST lineno of child nodes.
On the parser side, the location structure for a whole rule can be
obtained using the &@$ character salad.
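
A minimal sketch of the constructor convention described above, not the actual Zend code: every name prefixed with sketch_ is hypothetical, and the real zend_ast_loc and zend_ast layouts may differ. It only shows the idea that the location is passed in explicitly as the first argument and is the sole source of a node's lineno, with no fallback to a global or to child nodes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for zend_ast_loc: only the start line for now,
 * leaving room for end line and offset/column fields later. */
typedef struct {
    uint32_t start_line;
} sketch_ast_loc;

/* Hypothetical, heavily simplified AST node with up to two children. */
typedef struct sketch_ast {
    uint32_t lineno;
    struct sketch_ast *child[2];
} sketch_ast;

/* The location comes first and alone decides the node's lineno. */
sketch_ast *sketch_ast_create(const sketch_ast_loc *loc,
                              sketch_ast *lhs, sketch_ast *rhs)
{
    assert(loc != NULL);
    sketch_ast *ast = malloc(sizeof(*ast));
    if (ast == NULL) {
        return NULL;
    }
    ast->lineno = loc->start_line;
    ast->child[0] = lhs;
    ast->child[1] = rhs;
    return ast;
}

/* Recursively free a node and its children. */
void sketch_ast_destroy(sketch_ast *ast)
{
    if (ast == NULL) {
        return;
    }
    sketch_ast_destroy(ast->child[0]);
    sketch_ast_destroy(ast->child[1]);
    free(ast);
}
```

In a grammar action, the pointer passed as the first argument would be the rule's own location (the &@$ mentioned above) rather than a hand-built struct.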
Nikita Popov [Wed, 20 Mar 2019 11:03:45 +0000 (12:03 +0100)]
Fixed bug #74345
Export zend_release_fcall_info_cache(). It is only necessary to
call it if the fcc may not have been used -- if it is passed to
zend_call_function() and friends, then they will take care of
freeing trampolines.
Peter Kokot [Mon, 18 Mar 2019 23:33:53 +0000 (00:33 +0100)]
Upgrade deprecated directives and use non-posix bison
With Bison 3.0 some directives are deprecated:
- %name-prefix "x" should be %define api.prefix {x}
- %error-verbose should be %define parse.error verbose
Bison 3.3 also started emitting more warnings, and since the PHP source
parsers are not POSIX compliant, this patch fixes this, as pointed out via 495a46aa1dc564656bf919cb49aae48a31ae15f4.
Nikita Popov [Tue, 19 Mar 2019 14:35:15 +0000 (15:35 +0100)]
Respect OFFSET_CAPTURE when padding preg_match_all() results
This issue was mentioned in bug #73948. The PREG_PATTERN_ORDER
padding was performed without respecting the PREG_OFFSET_CAPTURE
flag, which resulted in unmatched subpatterns being either null or
[null, -1] depending on where they occur. Now they will always be
[null, -1], consistent with other usages.
Nikita Popov [Tue, 19 Mar 2019 12:06:21 +0000 (13:06 +0100)]
Don't create a new array for empty/null match every time
If PREG_OFFSET_CAPTURE is used, unmatched subpatterns will be either
[null, -1] or ['', -1] depending on PREG_UNMATCHED_AS_NULL mode.
Instead of creating a new array like this every time, cache it inside
a global (per-request -- could make it immutable though).
Additionally check whether the subpattern is an empty string or
single character string and use an existing interned string in that
case. Empty / single-char subpatterns are common, so let's avoid
allocating strings for them.
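
The string half of this idea can be sketched roughly as follows. This is an illustration under assumptions, not the code from the patch: the static table stands in for the engine's interned strings, and sketch_subpattern_str and its allocated out-parameter are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Shared storage standing in for interned strings: one empty string and
 * one NUL-terminated string per possible byte value. */
static const char sketch_empty[1] = "";
static char sketch_one_char[256][2];

/* Return a string for the matched span subject[0..len). Empty and
 * single-byte spans reuse shared storage; only longer spans allocate.
 * *allocated tells the caller whether it has to free() the result. */
const char *sketch_subpattern_str(const char *subject, size_t len,
                                  int *allocated)
{
    assert(allocated != NULL);
    *allocated = 0;
    if (len == 0) {
        return sketch_empty;
    }
    if (len == 1) {
        unsigned char c = (unsigned char) subject[0];
        /* Lazy, idempotent initialisation of the shared entry. */
        sketch_one_char[c][0] = (char) c;
        sketch_one_char[c][1] = '\0';
        return sketch_one_char[c];
    }
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, subject, len);
    copy[len] = '\0';
    *allocated = 1;
    return copy;
}
```

Two single-character matches of the same byte return the same pointer, so a pattern that captures many one-character groups performs no string allocation for them.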
Nikita Popov [Tue, 19 Mar 2019 10:55:40 +0000 (11:55 +0100)]
Use zend_string for subpat_names table
When used with preg_match_all or preg_replace_callback(_array),
subpattern names can be used in the matches array many times.
Switch the subpat_names table to use zend_string, so we don't have
to allocate a new string every time. Also don't bother creating the
table if no $matches were passed.
This might be a regression for the case where preg_match() is used
with many trailing named subpatterns that are skipped in the result
array, but that seems rather contrived.
Miriam Lauter [Mon, 18 Mar 2019 16:47:18 +0000 (12:47 -0400)]
Fix #77767: phpdbg break command help message shows incorrect aliases
Previously the aliases for at and del were listed as A and d
in the help message for break. This patch corrects the aliases
to be @ and ~ respectively.
Nikita Popov [Mon, 18 Mar 2019 15:55:25 +0000 (16:55 +0100)]
Don't use random mode in mysqli_query test
MYSQLI_ASYNC is also valid here, at least with mysqlnd. Rather than
using a random mode that is prone to failing once in a blue moon,
use a fixed invalid value.
Nikita Popov [Mon, 18 Mar 2019 11:57:43 +0000 (12:57 +0100)]
Fixed bug #72685
We currently have a large performance problem when implementing lexers
working on UTF-8 strings in PHP. This kind of code tends to perform a
large number of matches at different offsets on a single string. This
is generally fast. However, if /u mode is used, the full string will
be UTF-8 validated on each match. This results in quadratic runtime.
This patch fixes the issue by adding an IS_STR_VALID_UTF8 flag, which
is set once we have determined that the string is valid UTF-8, so that
further validation is skipped.
A limitation of this approach is that we can't set the flag for interned
strings. I think this is not a problem for this use-case which will
generally work on dynamic data. If we want to use this flag for other
purposes as well (mbstring?) then it might be worthwhile to UTF-8 validate
strings during interning. But right now this doesn't seem useful.
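
A rough sketch of the flag-caching idea, under stated assumptions: the sketch_ names and the string layout are hypothetical, and the validator is a simplified structural check (it does not reject every overlong or surrogate encoding), not the one the regex library uses. The point is only that the full scan runs once per string instead of once per match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit, playing the role of IS_STR_VALID_UTF8. */
#define SKETCH_STR_VALID_UTF8 (1u << 0)

typedef struct {
    unsigned flags;
    size_t len;
    const unsigned char *val;
} sketch_string;

/* Counts full scans so the effect of the cache can be observed. */
static unsigned long sketch_full_scans = 0;

/* Simplified structural UTF-8 scan over the whole buffer: O(len). */
static int sketch_utf8_scan(const unsigned char *s, size_t len)
{
    size_t i = 0;
    sketch_full_scans++;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;
        if (c < 0x80) need = 0;
        else if (c >= 0xC2 && c <= 0xDF) need = 1;
        else if (c >= 0xE0 && c <= 0xEF) need = 2;
        else if (c >= 0xF0 && c <= 0xF4) need = 3;
        else return 0;
        if (need > len - i - 1) return 0;  /* truncated sequence */
        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;
        }
        i += need + 1;
    }
    return 1;
}

/* Called before every match: validate only if the flag is not yet set. */
int sketch_match_prepare(sketch_string *str)
{
    if (str->flags & SKETCH_STR_VALID_UTF8) {
        return 1;  /* already known to be valid, skip the scan */
    }
    if (!sketch_utf8_scan(str->val, str->len)) {
        return 0;
    }
    str->flags |= SKETCH_STR_VALID_UTF8;
    return 1;
}

unsigned long sketch_scan_count(void)
{
    return sketch_full_scans;
}
```

With the flag in place, a lexer performing n matches against one subject string pays for a single O(len) scan rather than n of them, which is what removes the quadratic behaviour.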
twosee [Sat, 16 Mar 2019 05:21:48 +0000 (13:21 +0800)]
Don't disable object slot reuse while running shutdown functions
We only need to do this once we're running destructors. The current
approach interferes with some event loop code that runs everything
inside a shutdown function.
Peter Kokot [Thu, 14 Mar 2019 22:21:17 +0000 (23:21 +0100)]
Sync AC_CHECK_SIZEOF m4 macro calls
- AC_CHECK_SIZEOF is now mostly called only in the PHP_CHECK_STDINT_TYPES()
macro, except for some parts that check for a 32-bit or 64-bit architecture.
- SIZEOF_CHAR removed since it is always 1
- ZEND_BIN_ID now follows a more logical pattern: `BIN_48888` on 64-bit
architectures and `BIN_44444` on 32-bit ones, instead of the literal strings
`BIN_SIZEOF_CHAR48888` on 64-bit and `BIN_SIZEOF_CHAR44444` on 32-bit.
The unneeded SIZEOF_CHAR part has been removed.
- XMLRPC_TYPE_CHECKS removed
- `long long int` is the same as `long long`, so the redundant checks
have been removed accordingly.
- Removed the redundant PHP_CHECK_64BIT m4 macro. Whether the current
platform is 64-bit can be determined simply by checking the size of the
long type in place.
Peter Kokot [Sun, 17 Mar 2019 19:10:26 +0000 (20:10 +0100)]
Remove outdated README for ext/json
The PHP manual already includes an introduction to the JSON extension. The
re2c and bison versions required to build the parser and lexer files have
changed, so this removes the README to keep that information in one central place.