Rainer Orth [Fri, 28 Jun 2019 18:29:18 +0000 (18:29 +0000)]
[unittests][Support] Fix LLVM-Unit :: Support/./SupportTests/FileSystemTest.permissions on Solaris
LLVM-Unit :: Support/./SupportTests/FileSystemTest.permissions currently
FAILs on Solaris:
FAIL: LLVM-Unit :: Support/./SupportTests/FileSystemTest.permissions (2940 of 51555)
******************** TEST 'LLVM-Unit :: Support/./SupportTests/FileSystemTest.permissions' FAILED ********************
Note: Google Test filter = FileSystemTest.permissions
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from FileSystemTest
[ RUN ] FileSystemTest.permissions
/opt/llvm-buildbot/obj/llvm/llvm.src/unittests/Support/Path.cpp:1705: Failure
Value of: CheckPermissions(fs::sticky_bit)
Actual: false
Expected: true
/opt/llvm-buildbot/obj/llvm/llvm.src/unittests/Support/Path.cpp:1712: Failure
Value of: CheckPermissions(fs::set_uid_on_exe | fs::set_gid_on_exe | fs::sticky_bit)
Actual: false
Expected: true
/opt/llvm-buildbot/obj/llvm/llvm.src/unittests/Support/Path.cpp:1719: Failure
Value of: CheckPermissions(fs::all_read | fs::set_uid_on_exe | fs::set_gid_on_exe | fs::sticky_bit)
Actual: false
Expected: true
/opt/llvm-buildbot/obj/llvm/llvm.src/unittests/Support/Path.cpp:1722: Failure
Value of: CheckPermissions(fs::all_perms)
Actual: false
Expected: true
[ FAILED ] FileSystemTest.permissions (0 ms)
[----------] 1 test from FileSystemTest (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (1 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] FileSystemTest.permissions
1 FAILED TEST
Checking with truss reveals that this is the same issue as on AIX and
documented in chmod(2):
If the process is not a privileged process and the file is not a direc-
tory, mode bit 01000 (S_ISVTX, the sticky bit) is cleared.
The following patch fixes this in the same way. Tested on amd64-pc-solaris2.11.
Sam Tebbs [Fri, 28 Jun 2019 15:43:31 +0000 (15:43 +0000)]
[ARM] Add support for the MVE long shift instructions
MVE adds the lsll, lsrl and asrl instructions, which perform a shift on a 64 bit value separated into two 32 bit registers.
The Expand64BitShift function is modified to accept ISD::SHL, ISD::SRL and ISD::SRA and convert it into the appropriate opcode in ARMISD. An SHL is converted into an lsll, an SRL is converted into an lsrl for the immediate form and a negation and lsll for the register form, and SRA is converted into an asrl.
test/CodeGen/ARM/shift_parts.ll is added to test the logic of emitting these instructions.
Roman Lebedev [Fri, 28 Jun 2019 15:32:52 +0000 (15:32 +0000)]
[NFC][InstCombine] Shift amount reassociation: add flag preservation test
As discussed in https://reviews.llvm.org/D63812#inline-569870
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg
So basically if the same flag is set on both original shifts -> set it on new shift.
Don't think we can do anything with non-matching flags on shl.
Fangrui Song [Fri, 28 Jun 2019 10:06:11 +0000 (10:06 +0000)]
[DebugInfo] Simplify GSYM::AddressRange and GSYM::AddressRanges
Delete unnecessary getters of AddressRange.
Simplify AddressRange::size(): Start <= End check should be checked in an upper layer.
Delete isContiguousWith() that doesn't make sense.
Simplify AddressRanges::insert. Delete commented code. Fix it when more than 1 ranges are to be deleted.
Delete trailing newline.
David Green [Fri, 28 Jun 2019 09:47:55 +0000 (09:47 +0000)]
[ARM] Widening loads and narrowing stores
MVE has instructions to widen as it loads, and narrow as it stores. This adds
the required patterns and legalisation to make them work including specifying
that they are legal, patterns to select them and test changes.
David Green [Fri, 28 Jun 2019 08:41:40 +0000 (08:41 +0000)]
[ARM] MVE loads and stores
This fills in the gaps for basic MVE loads and stores, allowing unaligned
access and adding far too many tests. These will become important as
narrowing/expanding and pre/post inc are added. Big endian might still not be
handled very well, because we have not yet added bitcasts (and I'm not sure how
we want it to work yet). I've included the alignment code anyway which maps
with our current patterns. We plan to return to that later.
Code written by Simon Tatham, with additional tests from Me and Mikhail Maltsev.
David Green [Fri, 28 Jun 2019 07:08:42 +0000 (07:08 +0000)]
[ARM] MVE vector shuffles
This patch adds necessary shuffle vector and buildvector support for ARM MVE.
It essentially adds support for VDUP, VREVs and some VMOVs, which are often
required by other code (like upcoming patches).
This mostly uses the same code from Neon that already generated
NEONvdup/NEONvduplane/NEONvrev's. These have been renamed to ARMvdup/etc and
moved to ARMInstrInfo as they are common to both architectures. Most of the
selection code seems to be applicable to both, but NEON does have some more
instructions making some parts specific.
Mikael Holmen [Fri, 28 Jun 2019 06:45:20 +0000 (06:45 +0000)]
Silence gcc warning in testcase [NFC]
Without the fix gcc (7.4.0) complains with
../unittests/ADT/APIntTest.cpp: In member function 'virtual void {anonymous}::APIntTest_MultiplicativeInverseExaustive_Test::TestBody()':
../unittests/ADT/APIntTest.cpp:2510:36: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
for (unsigned Value = 0; Value < (1 << BitWidth); ++Value) {
~~~~~~^~~~~~~~~~~~~~~~~
Alex Brachet [Fri, 28 Jun 2019 03:21:00 +0000 (03:21 +0000)]
[Support] Add fs::getUmask() function and change fs::setPermissions
Summary: This patch changes fs::setPermissions to optionally set permissions while respecting the umask. It also adds the function fs::getUmask() which returns the current umask.
Amara Emerson [Thu, 27 Jun 2019 23:56:34 +0000 (23:56 +0000)]
[GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when optimizations are used.
The new switch lowering code that tries to generate jump tables and range checks
were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to
town on trying to generate optimized lowerings, e.g. multiple jump tables, range
checks etc. This exposed bugs in the way PHI nodes are handled because the CFG
looks even stranger after all of this is done.
Fedor Sergeev [Thu, 27 Jun 2019 23:41:03 +0000 (23:41 +0000)]
[InlineCost] make InlineCost assignable
Summary:
Current InlineCost is not assignable because of const members Cost and Threshold.
I dont see practical benefits from having them const (access to these members is
private and internal interactions are rather simple). On other hand that makes
it hard to use as a member in some other data structure where assignability is necessary.
I'm going to use InlineCost in a downstream inliner that maintains a complex queue
of candidate call-sites and thus keeping and recalculating InlineCost is necessary.
This patch just removes 'const' from both members, making InlineCost assignable.
Roman Lebedev [Thu, 27 Jun 2019 21:52:10 +0000 (21:52 +0000)]
[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 3)
Summary:
I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222.
There is no such action in "Add Action" menu...
This implements an optimization described in Hacker's Delight 10-17: when `C` is constant,
the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.
This is a recommit, the original commit rL364563 was reverted in rL364568
because test-suite detected miscompile - the new comparison constant 'Q'
was being computed incorrectly (we divided by `D0` instead of `D`).
Original patch D50222 by @hermord (Dmytro Shynkevych)
Notes:
- In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla.
This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the `REM` can be reduced to just its LHS is included:
the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise.
I hadn't managed to find a better way to not generate worse output in this case.
- The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.
These lit configuration files are really Python source code. Using the
.py file extension helps editors and tools use the correct language
mode. LLVM and Clang already use this convention for lit configuration,
this change simply applies it to all of compiler-rt.
FeatureFusion bits was first introduced in
https://reviews.llvm.org/rL253724. for add/load integer fusion for P8.
The only use of `hasFusion` was https://reviews.llvm.org/rL255319.
However, this was removed later in https://reviews.llvm.org/rL280440.
So, there is NO any reference to fusion in code now.
Leaving it there is misleading and confusing, so remove it for now.
We can alwasy add back if we ever support fusion in the future.
Use "willreturn" in isGuaranteedToTransferExecutionToSuccessor
The `willreturn` function attribute guarantees that a function call will
come back to the call site if the call is also known not to throw.
Therefore, this attribute can be used in
`isGuaranteedToTransferExecutionToSuccessor`.
The previous output was next to useless if *any* exit was not computable. If we have more than one exit, show the exit count for each so that it's easier to see what's going from with SCEV analysis when debugging.
Yuanfang Chen [Thu, 27 Jun 2019 18:39:34 +0000 (18:39 +0000)]
[llvm-objdump] Update the doc for --disassemble-functions.
Update the doc after llvm-svn: 364121 is landed.
With two more trivial fixes that are not related to
--disassemble-functions but still about llvm-objdump.
Summary:
- Match the syntax output by InstPrinter.
- Fix it always emitting 0 for align. Had to work around fact that
opcode is not available for GetDefaultP2Align while parsing.
- Updated tests that were erroneously happy with a p2align=0
Simon Pilgrim [Thu, 27 Jun 2019 17:30:51 +0000 (17:30 +0000)]
[X86] combineX86ShufflesRecursively - merge shuffles with more than 2 inputs
We already had the infrastructure for this, but were waiting for the fix for a number of regressions which were handled by the recent shuffle(extract_subvector(),extract_subvector()) -> extract_subvector(shuffle()) shuffle combines
Roman Lebedev [Thu, 27 Jun 2019 16:45:42 +0000 (16:45 +0000)]
[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)
Summary:
I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222.
There is no such action in "Add Action" menu...
Original patch D50222 by @hermord (Dmytro Shynkevych)
This implements an optimization described in Hacker's Delight 10-17: when `C` is constant,
the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.
Original patch author: @hermord (Dmytro Shynkevych)!
Notes:
- In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla.
This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the `REM` can be reduced to just its LHS is included:
the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise.
I hadn't managed to find a better way to not generate worse output in this case.
- The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.
This patch introduces a new function attribute, willreturn, to indicate
that a call of this function will either exhibit undefined behavior or
comes back and continues execution at a point in the existing call stack
that includes the current invocation.
This attribute guarantees that the function does not have any endless
loops, endless recursion, or terminating functions like abort or exit.
Emit replacements for clobbered parameters location if the parameter
has unmodified value throughout the funciton. This is basic scenario
where we can use the debug entry values.
([12/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D58042
James Henderson [Thu, 27 Jun 2019 15:18:15 +0000 (15:18 +0000)]
[docs][llvm-nm][llvm-objdump] Improve "See Also" section
The "See Also" section for llvm-nm didn't actually contain any links,
and the tools referred to didn't make much sense (referring to non-LLVM
tools, when we have equivalents, or tools that aren't really to do with
symbol dumping). llvm-objdump's didn't refer to llvm-readelf.
Tim Northover [Thu, 27 Jun 2019 14:46:51 +0000 (14:46 +0000)]
Bitcode: derive all types used from records instead of Values.
There is existing bitcode that we need to support where the structured nature
of pointer types is used to derive the result type of some operation. For
example a GEP's operation and result will be based on its input Type.
When pointers become opaque, the BitcodeReader will still have access to this
information because it's explicitly told how to construct the more complex
types used, but this information will not be attached to any Value that gets
looked up. This changes BitcodeReader so that in all places which use type
information in this manner, it's derived from a side-table rather than from the
Value in question.
Jinsong Ji [Thu, 27 Jun 2019 14:11:31 +0000 (14:11 +0000)]
[PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and others
This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751
llvm-mc aborted when disassembling tabortdc.
This patch try to clean up TM related DAGs.
* Fixes the problem by remove explicit output of cr0, and put it as implicit def.
* Update int_ppc_tbegin pattern to accommodate the implicit def of cr0.
* Update the TCHECK operand and int_ppc_tcheck accordingly.
* Add some builtin test and disassembly tests.
* Remove unused CRRC0/crrc0
Hans Wennborg [Thu, 27 Jun 2019 13:55:02 +0000 (13:55 +0000)]
Revert r363658 "[SVE][IR] Scalable Vector IR Type with pr42210 fix"
We saw a 70% ThinLTO link time increase in Chromium for Android, see
crbug.com/978817. Sounds like more of PR42210.
> Recommit of D32530 with a few small changes:
> - Stopped recursively walking through aggregates in
> the verifier, so that we don't impose too much
> overhead on large modules under LTO (see PR42210).
> - Changed tests to match; the errors are slightly
> different since they only report the array or
> struct that actually contains a scalable vector,
> rather than all aggregates which contain one in
> a nested member.
> - Corrected an older comment
>
> Reviewers: thakis, rengolin, sdesmalen
>
> Reviewed By: sdesmalen
>
> Differential Revision: https://reviews.llvm.org/D63321