James Molloy [Thu, 15 Sep 2016 12:30:27 +0000 (12:30 +0000)]
[ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:
This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).
This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.
Tim Northover [Thu, 15 Sep 2016 10:09:59 +0000 (10:09 +0000)]
GlobalISel: remove "unsized" LLT
It was only really there as a sentinel when instructions had to have precisely
one type. Now that registers are typed, each register really has to have a type
that is sized.
[llvm-cov] Hide instantiation views for unexecuted functions
Copying in the full text of the function doesn't help at all when we
already know that it's never executed. Just say that it's unexecuted --
the relevant source text has already been printed.
Wei Mi [Thu, 15 Sep 2016 06:28:34 +0000 (06:28 +0000)]
Add some shortcuts in LazyValueInfo to reduce compile time of Correlated Value Propagation.
The patch is to partially fix PR10584. Correlated Value Propagation queries LVI
to check non-null for pointer params of each callsite. If we know the def of
param is an alloca instruction, we know it is non-null and can return early from
LVI. Similarly, CVP queries LVI to check whether pointer for each mem access is
constant. If the def of the pointer is an alloca instruction, we know it is not
a constant pointer. These shortcuts can reduce the cost of CVP significantly.
Jonas Hahnfeld [Thu, 15 Sep 2016 06:14:13 +0000 (06:14 +0000)]
[CMake] Fixing lit for runtimes directory
Copy variable LLVM_BUILD_MAIN_SRC_DIR from LLVMConfig.cmake to
LLVM_MAIN_SRC_DIR as it is named for in-tree builds. This ensures that
add_lit_target() can reliably find llvm-lit which is not necessarily
in the PATH.
[libFuzzer] implement print_pcs with trace-pc-guard. Change the trace-pc-guard heuristic for 8-bit counters to look more like in AFL (not that it's provable better, but the existin test preferes this heuristic)
Wei Mi [Thu, 15 Sep 2016 04:06:44 +0000 (04:06 +0000)]
Add a C++ unittest to test the fix for PR30213.
The test exercises the branch in scev expansion when the value in ValueOffsetPair
is a ptr and the offset is not divisible by the elem type size of value.
The `CVType` had two redundant fields which were confusing and
error-prone to fill out. By treating member records as a distinct
type from leaf records, we are able to simplify this quite a bit.
Summary:
This fixes a dumpbin warning on objects produced by the MC assembler
when starting from text. All .debug$S sections are supposed to be marked
IMAGE_SCN_MEM_DISCARDABLE. The main, non-COMDAT .debug$S section had
this set, but any comdat ones were not being marked discardable because
there was no .section flag for it.
This change does two things:
- If the section name starts with .debug, implicitly mark the section as
discardable. This means we do the same thing as gas on .s files with
DWARF from GCC, which is important.
- Adds the 'D' flag to the .section directive on COFF to explicitly say
a section is discardable. We only emit this flag if the section name
does not start with .debug. This allows the user to explicitly tweak
their section flags without worrying about magic section names.
The only thing you can't do in this scheme is to create a
non-discardable section with a name starting with ".debug", but
hopefully users don't need that.
Mehdi Amini [Wed, 14 Sep 2016 22:29:59 +0000 (22:29 +0000)]
Fix auto-upgrade of TBAA tags in Bitcode Reader
If TBAA is on an intrinsic and it gets upgraded, it'll delete the call
instruction that we collected in a vector. Even if we were to use
WeakVH, it'll drop the TBAA and we'll hit the assert on the upgrade
path.
r263673 gave a shot to make sure the TBAA upgrade happens before
intrinsics upgrade, but failed to account for all cases.
Instead of collecting instructions in a vector, this patch makes it
just upgrade the TBAA on the fly, because metadata are always
already loaded at this point.
Mehdi Amini [Wed, 14 Sep 2016 21:05:04 +0000 (21:05 +0000)]
[LTO] Fix commons handling
Previously the prevailing information was not honored, and commons
symbols could override a strong definition. This patch fixes it and
propose the following semantic for commons: the client should mark
as prevailing the commons that it expects the LTO implementation to
merge (i.e. take the maximum size and alignment).
It implies that commons are allowed to have multiple prevailing
definitions.
Sanjoy Das [Wed, 14 Sep 2016 20:22:03 +0000 (20:22 +0000)]
[Stackmap] Added callsite counts to emitted function information.
Summary:
It was previously not possible for tools to use solely the stackmap
information emitted to reconstruct the return addresses of callsites in
the map, which is necessary to use the information to walk a stack. This
patch adds per-function callsite counts when emitting the stackmap
section in order to resolve the problem. Note that this slightly alters
the stackmap format, so external tools parsing these maps will need to
be updated.
**Problem Details:**
Records only store their offset from the beginning of the function they
belong to. While these records and the functions are output in program
order, it is not possible to determine where the end of one function's
records are without the callsite count when processing the records to
compute return addresses.
Adrian Prantl [Wed, 14 Sep 2016 17:30:37 +0000 (17:30 +0000)]
Verifier: Mark orphaned DICompileUnits as a debug info failure.
This is a follow-up to r268778 that adds a couple of missing cases,
most notably orphaned compile units.
Summary:
Function __asan_default_options is called by __asan_init before the
shadow memory got initialized. Instrumenting that function may lead
to flaky execution.
As the __asan_default_options is provided by users, we cannot expect
them to add the appropriate function atttributes to avoid
instrumentation.
Matthew Simpson [Wed, 14 Sep 2016 14:47:40 +0000 (14:47 +0000)]
[LV] Process pointer IVs with PHINodes in collectLoopUniforms
This patch moves the processing of pointer induction variables in
collectLoopUniforms from the consecutive pointer phase of the analysis to the
phi node phase. Previously, if a pointer induction variable was used by both a
scalarized non-memory instruction as well as a vectorized memory instruction,
we would incorrectly identify the pointer as uniform. Pointer induction
variables should be treated the same as other phi nodes. That is, they are
uniform if all users of the induction variable and induction variable update
are uniform.
James Molloy [Wed, 14 Sep 2016 14:47:27 +0000 (14:47 +0000)]
[ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:
MCInstrDesc: this fixes an issue setting/getting member Flags, which
is an uint64_t. However, getter function getFlags returned an unsigned,
and in function hasProperty (1 << MCFlag) was used instead of (1ULL << MCFlag).
Fix code-gen crash on Power9 for insert_vector_elt with variable index (PR30189)
This patch corresponds to review:
https://reviews.llvm.org/D24021
In the initial implementation of this instruction, I forgot to account for
variable indices. This patch fixes PR30189 and should probably be merged into
3.9.1 (I'll open a bug according to the new instructions).
Adding missing directive for Power9.
There is currently no codegen for Power9 that depends on the directive
so this is NFC for now but will be important in the future. This was
missed in r268950 so I'm adding it now.
Kuba Brecka [Wed, 14 Sep 2016 14:06:33 +0000 (14:06 +0000)]
[asan] Enable -asan-use-private-alias on Darwin/Mach-O, add test for ODR false positive with LTO (llvm part)
The '-asan-use-private-alias’ option (disabled by default) option is currently only enabled for Linux and ELF, but it also works on Darwin and Mach-O. This option also fixes a known problem with LTO on Darwin (https://github.com/google/sanitizers/issues/647). This patch enables the support for Darwin (but still keeps it off by default) and adds the LTO test case.
Wei Mi [Wed, 14 Sep 2016 04:39:50 +0000 (04:39 +0000)]
Create a getelementptr instead of sub expr for ValueOffsetPair if the
value is a pointer.
This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair,
if the value is of pointer type, we can only create a getelementptr instead
of sub expr.
Ensure Polly linking works without BUILD_SHARED_LIBS
This change ensures all necessary symbols are resolved correctly. Before this
change on some systems, the linker may have eliminated some symbols not directly
used in bugpoint, but used in Polly.
Suggested-by: Michael Kruse <lvm@meinersbur.de>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281438 91177308-0d34-0410-b5e6-96231b3b80d8
[sanitizer-coverage] add yet another flavour of coverage instrumentation: trace-pc-guard. The intent is to eventually replace all of {bool coverage, 8bit-counters, trace-pc} with just this one. LLVM part
[ObjCARC] Traverse chain downwards to replace uses of argument passed to
ObjC library call with call return.
ARC contraction tries to replace uses of an argument passed to an
objective-c library call with the call return value. For example, in the
following IR, it replaces uses of argument %9 and uses of the values
discovered traversing the chain upwards (%7 and %8) with the call return
%10, if they are dominated by the call to @objc_autoreleaseReturnValue.
This transformation enables code-gen to tail-call the call to
@objc_autoreleaseReturnValue, which is necessary to enable auto release
return value optimization.
ARC contraction no longer does the optimization described above since it
only traverses the chain upwards and fails to recognize that the
function return can be replaced by the call return. This commit changes
ARC contraction to traverse the chain downwards too and replace uses of
bitcasts with the call return.
Summary: When expanding mul in type legalization make sure the type for shift amount can actually fit the value. This fixes PR30354 https://llvm.org/bugs/show_bug.cgi?id=30354.
[DAG] Allow build-to-shuffle combine to combine builds from two wide vectors.
This allows us to, in some cases, create a vector_shuffle out of a build_vector, when
the inputs to the build are extract_elements from two different vectors, at least one
of which is wider than the output. (E.g. a <8 x i16> being constructed out of
elements from a <16 x i16> and a <8 x i16>).
Kevin Enderby [Tue, 13 Sep 2016 21:42:28 +0000 (21:42 +0000)]
Next set of additional error checks for invalid Mach-O files for bad load commands
that use the Mach::dyld_info_command type for the load commands that are
currently use in the MachOObjectFile constructor.
This contains the missing checks for LC_DYLD_INFO and
LC_DYLD_INFO_ONLY load commands and the fields for the
Mach::dyld_info_command type.
AArch64: Cleanup tailcall CC check, enable swiftcc.
Cleanup/change the code that checks for possible tailcall conventions to
look the same as the one in the X86 target. This makes the distinction
between calling conventions that can guarnatee tailcalls and the ones
that may tailcall more obvious.
- Add Swift to the mayTailCall list
- PreserveMost seemed to be incorrectly part of the guarnteed tail call
list, move it to the mayTailCall list.