Commit r360221 ("[Support] Add error handling to
sys::Process::getPageSize().", 2019-05-08) seems to have missed these
uses of getPageSize(). Update them to getPageSizeEstimate().
Hans Wennborg [Thu, 9 May 2019 09:22:56 +0000 (09:22 +0000)]
X86WinAllocaExpander: Drop code looking through register copies (PR41786)
This code was never covered by tests, in PR41786 it was pointed out that
the deletion part doesn't work, and in a full Chrome build I was never
able to hit the code path that looks through copies. It seems the
situation it's supposed to handle doesn't actually come up in practice.
Markus Lavin [Thu, 9 May 2019 08:29:04 +0000 (08:29 +0000)]
Make sub-registers index names case sensitive in the MIRParser
Prior to this change sub-register index names are assumed to be lower
case (but they are printed with original casing). This means that if a
target has some upper case characters in its sub-register names then
mir-export directly followed by mir-import is not possible. This also
means that sub-register indices currently are (and will continue to be)
slightly inconsistent with register names which are printed and assumed
to be lower case.
As the current textual representation of mir has a few inconsistencies
in this area it is a bit arbitrary how to address the matter. This
change is towards the direction that we feel is most correct (i.e. case
sensitivity).
Pengfei Wang [Thu, 9 May 2019 08:09:21 +0000 (08:09 +0000)]
Bugfix for nullptr check by klocwork
Klocwork static check:
Pointer from call to function `DebugLoc::operator DILocation *() const `
may be NULL and will be dereference in function `printExtendedName```
Patch by Shengchen Kan (skan)
Differential Revision: https://reviews.llvm.org/D61715
Petr Hosek [Thu, 9 May 2019 06:09:35 +0000 (06:09 +0000)]
[NewPM] Setup Passes for KASan and KMSan
While ASan and MSan passes were already ported to new PM, the kernel
variants weren't setup in the pipeline which makes the KASan and KMSan
tests in Clang fail.
Sanjay Patel [Wed, 8 May 2019 22:19:52 +0000 (22:19 +0000)]
[SelectionDAG] fold 'fneg undef' to undef
This is extracted from the original draft of D61419 with some additional tests.
We don't currently get this in IR (it's conservatively turned into a NaN),
but presumably that'll get updated as we add real IR support for 'fneg'
rather than 'fsub -0.0, x'.
The x86-32 run shows the following, and I haven't looked further to see why,
but that seems to be independent:
Legalizing: t1: f32 = undef
Trying to expand node
Creating fp constant: t4: f32 = ConstantFP<0.000000e+00>
Matt Arsenault [Wed, 8 May 2019 22:09:57 +0000 (22:09 +0000)]
AMDGPU: Select VOP3 form of add
The VOP3 form should always be the preferred selection, to be shrunk
later. This should only be an optimization issue, but this partially
works around a problem from clobbering VCC when SIFixSGPRCopies
rewrites an SCC defining operation directly to VCC.
3 of the testcases are regressions from failing to fold the immediate
in cases it should. These can be avoided by improving the VCC liveness
handling in SIFoldOperands. Simply increasing the threshold to
computeRegisterLiveness works, although this is common enough that VCC
liveness should probably be tracked throughout the pass. The hack of
leaving behind an implicit_def instruction to avoid breaking iterator
wastes instruction count, which inhibits finding the VCC def in long
chains of adds. Doing this however exposes different, worse looking
regressions from poor scheduling behavior. This could probably be
avoided around by forcing the shrink of the addc here, but the
scheduler should probably be fixed.
The r600 add test needs to be split out because it asserts on the
arguments in the new test during the calling convention lowering.
Summary:
Split defines.txt into diagnostics test and functionality test. Also add
comments, remove the semicolon prefix and group RUN lines with their
CHECK directives.
Summary:
Fix various issues in code style of method comments:
1) Move all heading comments to all non-static methods near their
declaration in the FileCheck.h header file.
2) Harmonize the action verb in doxygen comments for methods to always
be in third person
3) Use \returns instead of free text "return" and "returns".
4) Document a couple more parameters while at it.
Craig Topper [Wed, 8 May 2019 20:59:21 +0000 (20:59 +0000)]
[InstCombine] When turning sext into zext due to known bits, return the new ZExt instead of calling replaceinstuseswith
The worklist loop that we're returning back to should be able to do the repacement itself. This is how we normally do replacements.
My main motivation was that I observed that we weren't preserving the name of the result when we do this transform. The replacement code in the worklist loop will call takeName as part of the replacement.
Warren Ristow [Wed, 8 May 2019 18:50:07 +0000 (18:50 +0000)]
[SCEV] Suppress hoisting insertion point of binops when unsafe
InsertBinop tries to move insertion-points out of loops for expressions
that are loop-invariant. This patch adds a new parameter, IsSafeToHost,
to guard that hoisting. This allows callers to suppress that hoisting
for unsafe situations, such as divisions that may have a zero
denominator.
[RegAllocFast] Scan physcial reg definitions before assigning virtual reg definitions
When assigning the definitions of an instruction we were updating
the available registers while walking the definitions. Some of
those definitions may be from physical registers and thus, they are
not available for other definitions to take, but by the time we see
that we may have already assign these registers to another
virtual register.
Fix that by walking through all the definitions and mark as unavailable
the physical register definitions, then do the virtual register assignments.
Alina Sbirlea [Wed, 8 May 2019 17:05:36 +0000 (17:05 +0000)]
[MemorySSA] Teach LoopSimplify to preserve MemorySSA.
Summary:
Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available.
Do not preserve it in the new pass manager.
Update tests.
Simon Pilgrim [Wed, 8 May 2019 15:49:10 +0000 (15:49 +0000)]
[AMDGPU] Reapplied BFE canonicalization from D60462
This was committed in rL358887 but reverted in rL360066 due to a x86 regression, really it should be have been pre-committed instead of being part of the SimplifyDemandedBits bitcast patch.
David Greene [Wed, 8 May 2019 15:44:24 +0000 (15:44 +0000)]
[Reassociation] Place moved instructions after landing pads
Reassociation's NegateValue moved instructions to the beginning of
blocks (after PHIs) without checking for exception handling pads.
It's possible for reassociation to move something into an exception
handling block so we need to make sure we don't move things too early
in the block. This change advances the insertion point past any
exception handling pads.
If the block we want to move into contains a catchswitch, we cannot
move into it. In that case just create a new neg as if we had not
found an existing neg to move.
Petar Jovanovic [Wed, 8 May 2019 14:42:13 +0000 (14:42 +0000)]
[Support] Fix unit test for fs::is_local
Close the temporary file after the test is done using it.
If it is not closed and the file was created on NFS, it will cause the test
to fail. The problem happens in the cleanup process afterwards. It first
tries to delete the file but it is not really deleted. Afterwards, the
program fails to delete the directory containing the file, causing the whole
test to fail.
James Henderson [Wed, 8 May 2019 13:28:58 +0000 (13:28 +0000)]
[llvm-objcopy] Improve error message for unrecognised archive member
Prior to this patch, llvm-objcopy's error messages for archives with
unsupported members only mentioned the archive name, not the member
name, making them unhelpful. This change improves it by approximately
following GNU objcopy's error message syntax of
"<archive name>(<member name>): <problem>".
Tim Northover [Wed, 8 May 2019 10:59:08 +0000 (10:59 +0000)]
ARM: disallow SP as Rn for Thumb2 TST & TEQ instructions
Using SP in this position is unpredictable in ARMv7. CMP and CMN are not
affected, and of course v8 relaxes this requirement, but that's handled
elsewhere.
Simon Pilgrim [Wed, 8 May 2019 10:24:22 +0000 (10:24 +0000)]
[SIMode] Fix typo in Status constructor
As noted in https://www.viva64.com/en/b/0629/ (Snippet No. 36) and the scan-build CI reports (https://llvm.org/reports/scan-build/report-SIModeRegister.cpp-Status-1-1.html#EndPath), rL348754 introduced a typo in the Status constructor due to argument variable names shadowing the member variable names.
Florian Hahn [Wed, 8 May 2019 09:09:54 +0000 (09:09 +0000)]
[SCCP] Fix crash when trying to constant-fold terminators multiple times.
If we fold a branch/switch to an unconditional branch to another dead block we
replace the branch with unreachable, to avoid attempting to fold the
unconditional branch.
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=41775,
Problem is in the final line:
Size += this->EntrySize;
I checked that we do not actually need it in this place,
since we always call removeSectionReferences which
calls removeSymbols which updates the Size.
But it worth to keep it, that allows to relax the dependencies.
QingShan Zhang [Wed, 8 May 2019 07:21:37 +0000 (07:21 +0000)]
[NFC] Add a static function to do the endian check
Add a new function to do the endian check, as I will commit another patch later, which will also need the endian check.
Lang Hames [Wed, 8 May 2019 02:11:07 +0000 (02:11 +0000)]
[Support] Add error handling to sys::Process::getPageSize().
This patch changes the return type of sys::Process::getPageSize to
Expected<unsigned> to account for the fact that the underlying syscalls used to
obtain the page size may fail (see below).
For clients who use the page size as an optimization only this patch adds a new
method, getPageSizeEstimate, which calls through to getPageSize but discards
any error returned and substitues a "reasonable" page size estimate estimate
instead. All existing LLVM clients are updated to call getPageSizeEstimate
rather than getPageSize.
On Unix, sys::Process::getPageSize is implemented in terms of getpagesize or
sysconf, depending on which macros are set. The sysconf call is documented to
return -1 on failure. On Darwin getpagesize is implemented in terms of sysconf
and may also fail (though the manpage documentation does not mention this).
These failures have been observed in practice when highly restrictive sandbox
permissions have been applied. Without this patch, the result is that
getPageSize returns -1, which wreaks havoc on any subsequent code that was
assuming a sane page size value.
Reid Kleckner [Tue, 7 May 2019 23:06:21 +0000 (23:06 +0000)]
[COFF] Use COFF stubs for extern_weak functions
Summary:
A COFF stub indirects the reference to a symbol through memory. A
.refptr.$sym global variable pointer is created to refer to $sym.
Typically mingw uses these for external global variable declarations,
but we can use them for weak function declarations as well.
Updates the dso_local classification to add a special case for
extern_weak symbols on COFF in both clang and LLVM.
Lang Hames [Tue, 7 May 2019 22:56:40 +0000 (22:56 +0000)]
Reapply r360194 "[JITLink] Add support for MachO .alt_entry atoms." with fixes.
This patch modifies MachOAtomGraphBuilder to use setLayoutNext rather than
addEdge, and fixes a bug in the section layout algorithm that could result in
atoms appearing more than once in the section ordering (which resulted in those
atoms being assigned invalid addresses during layout).
Austin Kerbow [Tue, 7 May 2019 22:12:15 +0000 (22:12 +0000)]
[AMDGPU] Check MI bundles for hazards
Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions in both scheduler and hazard recognizer mode. We ignore “bundledness” for the purpose of detecting hazards and examine the instructions individually.
Lang Hames [Tue, 7 May 2019 21:35:14 +0000 (21:35 +0000)]
[JITLink] Add support for MachO .alt_entry atoms.
The MachO .alt_entry directive is applied to a symbol to indicate that it is
locked (in terms of address layout and liveness) to its predecessor atom. I.e.
it is an alternate entry point, at a fixed offset, for the previous atom.
This patch updates MachOAtomGraphBuilder to check for the .alt_entry flag on
symbols and add a corresponding LayoutNext edge to the atom-graph. It also
updates MachOAtomGraphBuilder_x86_64 to generalize handling of the
X86_64_RELOC_SUBTRACTOR relocation: previously either the minuend or
subtrahend of the subtraction had to be the same as the atom being fixed up,
now it is only necessary for the minuend or subtrahend to be locked (via any
chain of alt_entry directives) to the atom being fixed up.
Revert "[OpenMP][Clang] Support for target math functions"
This commit appears to be breaking stage-2 builds on GreenDragon. The
OpenMP wrappers for cmath and math.h are copied into the root of the
resource directory and cause a cyclic dependency in module 'Darwin':
Darwin -> std -> Darwin. This blows up when CMake is testing for modules
support and breaks all stage 2 module builds, including the ThinLTO bot
and all LLDB bots.
CMake Error at cmake/modules/HandleLLVMOptions.cmake:497 (message):
LLVM_ENABLE_MODULES is not supported by this compiler
Compute results in more direct ways, avoid subset intersect
operations. Extract the core code for computing mul nowrap ranges
into separate static functions, so they can be reused.
Make sure that the DAG combiner doesn't merge stores that we explicitly
asked not be greater than preferred vector width for the vectorizer.
Test for both 128 and 256 with a skylake architecture.
Sanjay Patel [Tue, 7 May 2019 18:58:07 +0000 (18:58 +0000)]
[InstCombine] allow sinking fneg operands through an FP min/max
Fundamentally/generally, we should not have to rely on bailouts/crippling of
folds. In this particular case, I think we always recognize the inverted
predicate min/max pattern, so there should not be any loss of optimization.
Codegen looks better because we are eliminating an fneg.
Don Hinton [Tue, 7 May 2019 18:57:01 +0000 (18:57 +0000)]
[CommandLine] Allow Options to specify multiple OptionCategory's.
Summary:
It's not uncommon for separate components to share common
Options, e.g., it's common for related Passes to share Options in
addition to the Pass specific ones.
With this change, components can use OptionCategory's to simply help
output even if some of the options are shared.
Adrian Prantl [Tue, 7 May 2019 17:42:38 +0000 (17:42 +0000)]
Debug Info: Support address space attributes on rvalue references.
DWARF5, 2.12 20ff says that
Any debugging information entry representing a pointer or reference
type [may have a DW_AT_address_class attribute].
The existing code (https://reviews.llvm.org/D29670) seems to take a
quite literal interpretation of that wording. I don't see a reason why
an rvalue reference isn't a reference type in the spirit of that
paragraph. This patch allows rvalue references to also have address
spaces.
Jinsong Ji [Tue, 7 May 2019 17:29:44 +0000 (17:29 +0000)]
[PowerPC][NFC] Update build-vector-tests.ll using utils/update_llc_test_checks.py
build-vector-tests.ll is a huge testcase, it is hard to maintain: eg:
any fundamental changes might need to update hundreds of lines. We should
leverage the script to maintain it.
This patch simply run utils/update_llc_test_checks.py on it. There
should be no missing test points.
Florian Hahn [Tue, 7 May 2019 16:47:27 +0000 (16:47 +0000)]
[DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor
When simplifying TokenFactors, we potentially iterate over all
operands of a large number of TokenFactors. This causes quadratic
compile times in some cases and the large token factors cause additional
scalability problems elsewhere.
This patch adds some limits to the number of nodes explored for the
cases mentioned above.
The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.
In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
Keno Fischer [Tue, 7 May 2019 15:28:47 +0000 (15:28 +0000)]
[SCEV] Add explicit representations of umin/smin
Summary:
Currently we express umin as `~umax(~x, ~y)`. However, this becomes
a problem for operands in non-integral pointer spaces, because `~x`
is not something we can compute for `x` non-integral. However, since
comparisons are generally still allowed, we are actually able to
express `umin(x, y)` directly as long as we don't try to express is
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions were
largely identical.
[PowerPC] Use the two-constant NR algorithm for refining estimates
The single-constant algorithm produces infinities on a lot of denormal values.
The precision of the two-constant algorithm is actually sufficient across the
range of denormals. We will switch to that algorithm for now to avoid the
infinities on denormals. In the future, we will re-evaluate the algorithm to
find the optimal one for PowerPC.
George Rimar [Tue, 7 May 2019 13:14:18 +0000 (13:14 +0000)]
[llvm-objdump] - Print relocation record in a GNU format.
This fixes the https://bugs.llvm.org/show_bug.cgi?id=41355.
Previously with -r we printed relocation section name instead of the target section name.
It was like this: "RELOCATION RECORDS FOR [.rel.text]"
Now it is: "RELOCATION RECORDS FOR [.text]"
Also when relocation target section has more than one relocation section,
we did not combine the output. Now we do.