[ConstantRange] Shl considers full-set shifting to last bit position.
if we do SHL of two 16-bit ranges like [0, 30000) with [1,2) we get
"full-set" instead of what I would have expected [0, 60000) which is
still in the 16-bit unsigned range.
This patch changes the SHL algorithm to allow getting a usable range
even in this case.
Fangrui Song [Sun, 7 Apr 2019 05:32:16 +0000 (05:32 +0000)]
[llvm-objdump] Simplify disassembleObject
* Use std::binary_search to replace some std::lower_bound
* Use llvm::upper_bound to replace some std::upper_bound
* Use format_hex and support::endian::read{16,32}
Petr Hosek [Sat, 6 Apr 2019 23:05:56 +0000 (23:05 +0000)]
[gn] Support for per-target runtime directory layout
This change also introduces the clang_enable_per_target_runtime_dir
to enable the use of per-target runtime directory layout which is the
equivalent of LLVM_ENABLE_PER_TARGET_RUNTIME_DIR CMake option.
[X86] Use a signed mask in foldMaskedShiftToScaledMask to enable a shorter immediate encoding.
This function reorders AND and SHL to enable the SHL to fold into an LEA. The
upper bits of the AND will be shifted out by the SHL so it doesn't matter what
mask value we use for these bits. By using sign bits from the original mask in
these upper bits we might enable a shorter immediate encoding to be used.
[X86] Add test cases to show missed opportunities to use a sign extended 8 or 32 bit immediate AND when reversing SHL+AND to form an LEA.
When we shift the AND mask over we should shift in sign bits instead of zero bits. The scale in the LEA will shift these bits out so it doesn't matter whether we mask the bits off or not. Using sign bits will potentially allow a sign extended immediate to be used.
Also add some other test cases for cases that are currently optimal.
Summary:
D60041 / D60138 refactoring changed how CMOV/SETcc opcodes
are handled. concode is now an immediate, with it's own operand type.
This at least allows to not crash on the opcode.
However, this still won't generate all the snippets
with all the condcode enumerators. D60066 does that.
Fangrui Song [Sat, 6 Apr 2019 09:12:53 +0000 (09:12 +0000)]
[DWARF] Simplify DWARFDebugAranges::findAddress
The current lower_bound approach has to check two iterators pos and pos-1.
Changing it to upper_bound allows us to check one iterator (similar to
DWARFUnitVector::getUnitFor*).
Robert Widmann [Fri, 5 Apr 2019 21:36:50 +0000 (21:36 +0000)]
[LLVM-C] Begin to Expose A More General Binary Interface
Summary:
Provides a new type, `LLVMBinaryRef`, and a binding to `llvm::object::createBinary` for more general interoperation with binary files than `LLVMObjectFileRef`. It also provides the proper non-consuming API for input buffers and populates an out parameter for error handling if necessary - two things the previous API did not do.
In a follow-up, I'll define section and symbol iterators and begin to build upon the existing test infrastructure.
This patch is a first step towards deprecating that API and replacing it with something more robust.
Petr Hosek [Fri, 5 Apr 2019 21:30:40 +0000 (21:30 +0000)]
[gn] Support for building compiler-rt builtins
This is support for building compiler-rt builtins, The library build
should be complete for a subset of supported platforms, but not all
CMake options have been replicated in GN.
We always use the just built compiler to build all the runtimes, which
is equivalent to the CMake runtimes build. This simplifies the build
configuration because we don't need to support arbitrary host compiler
and can always assume the latest Clang. With GN's toolchain support,
this is significantly more efficient than the CMake runtimes build.
[AMDGPU] Add MachineDCE pass after RenameIndependentSubregs
Detect dead lanes can create some dead defs. Then RenameIndependentSubregs
will break a REG_SEQUENCE which may use these dead defs. At this point
a dead instruction can be removed but we do not run a DCE anymore.
MachineDCE was only running before live variable analysis. The patch
adds a mean to preserve LiveIntervals and SlotIndexes in case it works
past this.
[X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand.
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.
Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
[X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand.
Summary:
This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes.
Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.
[X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate.
Summary:
Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models.
This avoids needing an isel pattern for each condition code. And it removes
translation switches for converting between CMOV instructions and condition
codes.
Now the printer, encoder and disassembler take care of converting the immediate.
We use InstAliases to handle the assembly matching. But we print using the
asm string in the instruction definition. The instruction itself is marked
IsCodeGenOnly=1 to hide it from the assembly parser.
This does complicate the scheduler models a little since we can't assign the
A and BE instructions to a separate class now.
Petr Hosek [Fri, 5 Apr 2019 19:13:54 +0000 (19:13 +0000)]
[gn] Rebase paths in symlink_or_copy against root_build_dir
We should be always rebasing paths against root_build_dir which is
the directory where scripts are run from, not root_out_dir which is
the current toolchain directory. The latter can result in invalid
paths when the action is being used from a non-default toolchain.
Current LCG doesn't check aliased functions. So if an internal function has a public alias it will not be added to CG SCC, but it is still reachable from outside through the alias.
So this patch adds aliased functions to SCC.
[PDB Docs] Add info about the hash adjustment buffer.
This necessitates adding a document describing the serialized
hash table format. This document is currently empty, although
it will be filled out in followup patches.
Use ctypes to call into SHFileOperationW with the extended NT path to allow us
to remove paths which exceed 261 characters on Windows. This functionality is
exercised by swift's test suite.
This patch changes the error message when the section specified by
--string-dump cannot be found by including the name of the section in
the error message and changing the prefix text to not imply that the
file itself was invalid. As part of this change some uses of
std::error_code have been replaced with the llvm Error class to better
encapsulate the error info (rather than passing File strings around),
and the WithColor class replaces string literal error prefixes.
Simon Pilgrim [Fri, 5 Apr 2019 14:56:21 +0000 (14:56 +0000)]
[SelectionDAG] Add fcmp UNDEF handling to SelectionDAG::FoldSetCC
Second half of PR40800, this patch adds DAG undef handling to fcmp instructions to match the behavior in llvm::ConstantFoldCompareInstruction, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.).
This involves a lot of tweaking to reduced tests as bugpoint loves to reduce fcmp arguments to undef........
Don Hinton [Fri, 5 Apr 2019 13:59:24 +0000 (13:59 +0000)]
[llvm] Add isa_and_nonnull
Summary:
Add new ``isa_and_nonnull<>`` operator that works just like
the ``isa<>`` operator, except that it allows for a null pointer as an
argument (which it then returns false).
There are a variety of vector patterns that may be profitably reduced to a
scalar op when scalar ops are performed using a subset (typically, the
first lane) of the vector register file.
For x86, this is true for float/double ops and element 0 because
insert/extract is just a sub-register rename.
Other targets should likely enable the hook in a similar way.
Petar Jovanovic [Fri, 5 Apr 2019 12:58:15 +0000 (12:58 +0000)]
[TextAPI] Fix off-by-one error in the bit index extraction loop
The loop in findNextSetBit() runs one pass more than it should.
On 64-bit architectures this does not cause a problem, but 32-bit
architectures mask the shift count to 5 bits which limits the number of
shifts inside a range of 0 to 31. Shifting by 32 has the same effect as
shifting by 0, so if the first bit in the set is 1, the function will return
with Index different from EndIndexVal. Because of that, range-based for
loops iterating thorough architectures will continue until hitting a 0 in
the set, resulting in n additional iterations, where n is equal to the
number of consecutive 1 bits at the start the set.
Ultimately TBDv1.WriteFile and TBDv2.WriteFile will output additional
architectures causing a failure in the unit tests.
Pavel Labath [Fri, 5 Apr 2019 08:43:54 +0000 (08:43 +0000)]
Fix r357749 for big-endian architectures
We need to read the strings from the minidump files as little-endian,
regardless of the host byte order.
I definitely remember thinking about this case while writing the patch
(and in fact, I have implemented that for the "write" case), but somehow
I have ended up not implementing the byte swapping when reading the
data. This adds the necessary byte-swapping and should hopefully fix
test failures on big-endian bots.
Pavel Labath [Fri, 5 Apr 2019 08:26:58 +0000 (08:26 +0000)]
Fix MSVC build for r357749
MSVC found the bare "make_unique" invocation ambiguous (between std::
and llvm:: versions). Explicitly qualifying the call with llvm:: should
hopefully fix it.
Pavel Labath [Fri, 5 Apr 2019 08:06:26 +0000 (08:06 +0000)]
Minidump: Add support for reading/writing strings
Summary:
Strings in minidump files are stored as a 32-bit length field, giving
the length of the string in *bytes*, which is followed by the
appropriate number of UTF16 code units. The string is also supposed to
be null-terminated, and the null-terminator is not a part of the length
field. This patch:
- adds support for reading these strings out of the minidump file (this
implementation does not depend on proper null-termination)
- adds support for writing them to a minidump file
- using the previous two pieces implements proper (de)serialization of
the CSDVersion field of the SystemInfo stream. Previously, this was
only read/written as hex, and no attempt was made to access the
referenced string -- now this string is read and written correctly.
The changes are tested via yaml2obj|obj2yaml round-trip as well as a
unit test which checks the corner cases of the string deserialization
logic.
Piotr Sobczak [Fri, 5 Apr 2019 07:44:09 +0000 (07:44 +0000)]
[SelectionDAG] Compute known bits of CopyFromReg
Summary:
Teach SelectionDAG how to compute known bits of ISD::CopyFromReg if
the virtual reg used has one def only.
This can be particularly useful when calling isBaseWithConstantOffset()
with the ISD::CopyFromReg argument, as more optimizations may get enabled
in the result.
Also add a missing truncation on X86, found by testing of this patch.
This will introduce sign extends sometimes which might be harder to deal with than the zero we use for promoting SRL. I ran this through some of our internal benchmark lists and didn't see any major regressions.
I think there might be some DAG combine improvement opportunities in the test changes here.
Nick Lewycky [Thu, 4 Apr 2019 23:09:40 +0000 (23:09 +0000)]
An unreachable block may have a route to a reachable block, don't fast-path return that it can't.
A block reachable from the entry block can't have any route to a block that's not reachable from the entry block (if it did, that route would make it reachable from the entry block). That is the intended performance optimization for isPotentiallyReachable. For the case where we ask whether an unreachable from entry block has a route to a reachable from entry block, we can't conclude one way or the other. Fix a bug where we claimed there could be no such route.
The fix in rL357425 ironically reintroduced the very bug it was fixing but only when a DominatorTree is provided. This fixes the remaining bug.
[TextAPI] Prefix all architecture enums to fix the build on i386.
Summary: This changes the Architecture enum to use a prefix (AK_) to prevent the
preprocessor from replacing i386 with 1 when building llvm/clang for i386.
Make SourceManager::createFileID(UnownedTag, ...) take a const llvm::MemoryBuffer*
Requires making the llvm::MemoryBuffer* stored by SourceManager const,
which in turn requires making the accessors for that return const
llvm::MemoryBuffer*s and updating all call sites.
The original motivation for this was to use it and fix the TODO in
CodeGenAction.cpp's ConvertBackendLocation() by using the UnownedTag
version of createFileID, and since llvm::SourceMgr* hands out a const
llvm::MemoryBuffer* this is required. I'm not sure if fixing the TODO
this way actually works, but this seems like a good change on its own
anyways.
Appease STLs where std::atomic<void*> lacks a constexpr default ctor
MSVC 2019 casts the pointer to a pointer-sized integer, which is a
reinterpret_cast, which is invalid in a constexpr context, so I have to
remove the LLVM_REQUIRES_CONSTANT_INITIALIZATION annotation for now.
Data section relocations in wasm shared libraries are applied by the
library itself at static constructor time. This change adds a new
synthetic function that applies relocations to relevant memory locations
on startup.
Ensure that ManagedStatic is constant initialized in MSVC 2017 & 2019
Fixes PR41367.
This effectively relands r357655 with a workaround for MSVC 2017.
I tried various approaches with unions, but I ended up going with this
ifdef approach because it lets us write the proper C++11 code that we
want to write, with a separate workaround that we can delete when we
drop MSVC 2017 support.
This also adds LLVM_REQUIRE_CONSTANT_INITIALIZATION, which wraps
[[clang::require_constant_initialization]]. This actually detected a
minor issue when using clang-cl where clang wasn't able to use the
constexpr constructor in MSVC's STL, so I switched back to using the
default ctor of std::atomic<void*>.
Michal Gorny [Thu, 4 Apr 2019 14:21:38 +0000 (14:21 +0000)]
[llvm] [cmake] Add additional headers only if they exist
Modify the add_header_files_for_glob() function to only add files
that do exist, rather than all matches of the glob. This fixes CMake
error when one of the include directories (which happen to include
/usr/include) contain broken symlinks.
Lewis Revill [Thu, 4 Apr 2019 14:13:37 +0000 (14:13 +0000)]
[RISCV] Support assembling TLS add and associated modifiers
This patch adds support in the MC layer for parsing and assembling the
4-operand add instruction needed for TLS addressing. This also involves
parsing the %tprel_hi, %tprel_lo and %tprel_add operand modifiers.
Jonas Paulsson [Thu, 4 Apr 2019 12:12:35 +0000 (12:12 +0000)]
[SystemZ] Bugfix in isFusableLoadOpStorePattern()
This function is responsible for checking the legality of fusing an instance
of load -> op -> store into a single operation. In the SystemZ backend the
check was incomplete and a test case emerged with a cycle in the instruction
selection DAG as a result.
Instead of using the NodeIds to determine node relationships,
hasPredecessorHelper() now is used just like in the X86 backend. This handled
the failing tests and as well gave a few additional transformations on
benchmarks.
The SystemZ isFusableLoadOpStorePattern() is now a very near copy of the X86
function, and it seems this could be made a utility function in common code
instead.
Simon Pilgrim [Thu, 4 Apr 2019 11:12:30 +0000 (11:12 +0000)]
Revert rL357655 and rL357656 from llvm/trunk:
Fix minor innaccuracy in previous comment on ManagedStaticBase
........
Make ManagedStatic constexpr constructible
Apparently it needs member initializers so that it can be constructed in
a constexpr context. I explained my investigation of this in PR41367.
........
Causes vs2017 debug llvm-tblgen to fail with "Unknown command line argument" errors - similar to the vs2019 error discussed on PR41367 without the patch....
This patch fixes .arch_extension directive parsing to handle a wider
range of architecture extension options. The existing parser was parsing
extensions as an identifier which breaks for extensions containing a
"-", such as the "tlb-rmi" extension.
The extension is now parsed as a string. This is consistent with the
extension parsing in the .arch and .cpu directive parsing.