Rui Ueyama [Fri, 15 Nov 2013 01:25:34 +0000 (01:25 +0000)]
Include raw_ostream.h.
Including only Debug.h did not cause a compilation error, but you couldn't
do anything (like writing something with <<) to raw_ostreams returned by
llvm::dbgs() or llvm::errs() without including raw_ostream.h. So including
it from Debug.h should make sense.
Tom Stellard [Fri, 15 Nov 2013 00:12:45 +0000 (00:12 +0000)]
R600: Fix scheduling of instructions that use the LDS output queue
The LDS output queue is accessed via the OQAP register. The OQAP
register cannot be live across clauses, so if value is written to the
output queue, it must be retrieved before the end of the clause.
With the machine scheduler, we cannot statisfy this constraint, because
it lacks proper alias analysis and it will mark some LDS accesses as
having a chain dependency on vertex fetches. Since vertex fetches
require a new clauses, the dependency may end up spiltting OQAP uses and
defs so the end up in different clauses. See the lds-output-queue.ll
test for a more detailed explanation.
To work around this issue, we now combine the LDS read and the OQAP
copy into one instruction and expand it after register allocation.
This patch also adds some checks to the EmitClauseMarker pass, so that
it doesn't end a clause with a value still in the output queue and
removes AR.X and OQAP handling from the scheduler (AR.X uses and defs
were already being expanded post-RA, so the scheduler will never see
them).
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194755 91177308-0d34-0410-b5e6-96231b3b80d8
Chandler Carruth [Thu, 14 Nov 2013 23:51:29 +0000 (23:51 +0000)]
Teach the Makefile build system how to handle SOURCES which include
subdirectories. The only thing needed here is to create the appropriate
object file directories and add those as dependencies for the
compilation rules.
As a consequence, factor the non-source-file-specific dependencies for
compilation rules into a helper variable. This fixes an issue where the
project makefile wasn't actually a dependency of a bunch of compilation
make rules for no apparant reason.
This should have no observable effect for current makefile usage, but
will simplify how we build other libraries and is something CMake
already supports.
Andrew Trick [Thu, 14 Nov 2013 23:45:04 +0000 (23:45 +0000)]
When folding memory operands, preserve existing MachineMemOperands.
This comes into play with patchpoint, which can fold multiple
operands. Since the patchpoint is already treated as a call, the
machine mem operands won't affect anything, and there's nothing to
test. But we still want to do the right thing here to be sure that our
MIs obey the rules.
Rui Ueyama [Thu, 14 Nov 2013 22:09:08 +0000 (22:09 +0000)]
Recognize 0x0000 as a COFF file magic.
Summary:
Some machine-type-neutral object files containing only undefined symbols
actually do exist in the Windows standard library. Need to recognize them
as COFF files.
Rafael Espindola [Thu, 14 Nov 2013 13:58:06 +0000 (13:58 +0000)]
Error if we see an alias to a declaration.
In ELF and COFF an alias is just another offset in a section. There is no way
to represent an alias to something in another file.
In MachO, the spec has the N_INDR type which should allow for exactly that, but
is not currently implemented. Given that it is specified but not implemented,
we error in codegen to avoid miscompiling but don't reject aliases to
declarations in the verifier to leave the option open of implementing it.
In the past we have used alias to declarations as a way of implementing
weakref, which is why it exists in some old tests which this patch updates.
Evgeniy Stepanov [Thu, 14 Nov 2013 12:29:04 +0000 (12:29 +0000)]
[msan] Fast path optimization for wrap-indirect-calls feature of MemorySanitizer.
Indirect call wrapping helps MSanDR (dynamic instrumentation companion tool
for MSan) to catch all cases where execution leaves a compiler-instrumented
module by allowing the tool to rewrite targets of indirect calls.
This change is an optimization that skips wrapping for calls when target is
inside the current module. This relies on the linker providing symbols at the
begin and end of the module code (or code + data, does not really matter).
Gold linker provides such symbols by default. GNU (BFD) linker needs a link
flag: -Wl,--defsym=__executable_start=0.
More info:
https://code.google.com/p/memory-sanitizer/wiki/MSanDR#Native_exec
Andrew Trick [Thu, 14 Nov 2013 06:54:10 +0000 (06:54 +0000)]
Minor extension to llvm.experimental.patchpoint: don't require a call.
If a null call target is provided, don't emit a dummy call. This
allows the runtime to reserve as little nop space as it needs without
the requirement of emitting a call.
Peter Zotov [Thu, 14 Nov 2013 06:34:21 +0000 (06:34 +0000)]
[OCaml] Build stub OCaml libraries for all configured targets
This allows to only link in the needed targets, reducing binary
size and more importantly link time.
Note that this is an incomplete implementation: currently,
LLVM does not have the plumbing which would allow to conditionally
link in AsmPrinter, AsmParser and Disassembler for the targets
which support them. This should be improved in the future.
Added BlockFrequencyInfo::view for displaying the block frequency propagation graph via graphviz.
This is useful for debugging issues in the BlockFrequency implementation since
one can easily visualize where probability mass and other errors occur in the
propagation.
Yunzhong Gao [Thu, 14 Nov 2013 01:10:52 +0000 (01:10 +0000)]
Fixing a heisenbug where the memory dependence analysis behaves differently
with and without -g.
Adding a test case to make sure that the threshold used in the memory
dependence analysis is respected. The test case also checks that debug
intrinsics are not counted towards this threshold.
Yuchen Wu [Thu, 14 Nov 2013 00:38:41 +0000 (00:38 +0000)]
llvm-cov: Slightly improved error checking.
- readInt() should check all 4 bytes can be read, not just 1.
- In the event of false data in the gcno file, it was possible to index
into a non-existent index of SmallVector, causing assertion error.
Yuchen Wu [Thu, 14 Nov 2013 00:32:00 +0000 (00:32 +0000)]
llvm-cov: Removed StringMap holding GCOVLines.
According to the hazy gcov documentation, it appeared to be technically
possible for lines within a block to belong to different source files.
However, upon further investigation, gcov does not actually support
multiple source files for a single block.
This change removes a level of separation between blocks and lines by
replacing the StringMap of GCOVLines with a SmallVector of ints
representing line numbers. This also means that the GCOVLines class is
no longer needed.
This paves the way for supporting the "-a" option, which will output
block information.
Yuchen Wu [Thu, 14 Nov 2013 00:07:15 +0000 (00:07 +0000)]
llvm-cov: Replaced asserts with proper error handling.
Unified the interface for read functions. They all return a boolean
indicating if the read from file succeeded. Functions that previously
returned the read value now store it into a variable that is passed in
by reference instead. Callers will need to check the return value to
detect if an error occurred.
Also added a new test which ensures that no assertions occur when file
contains invalid data. llvm-cov should return with error code 1 upon
failure.
Tom Stellard [Wed, 13 Nov 2013 23:36:37 +0000 (23:36 +0000)]
R600/SI: Prefer SALU instructions for bit shift operations
All shift operations will be selected as SALU instructions and then
if necessary lowered to VALU instructions in the SIFixSGPRCopies pass.
This allows us to do more operations on the SALU which will improve
performance and is also required for implementing private memory
using indirect addressing, since the private memory pointers must stay
in the scalar registers.
This patch includes some fixes from Matt Arsenault.
Yuchen Wu [Wed, 13 Nov 2013 22:50:15 +0000 (22:50 +0000)]
Added basic unit test for llvm-cov.
This test compares the output of llvm-cov against a coverage file
generated by gcov. Currently, llvm-cov does not work on certain
platforms (namely big-endian architectures such as PowerPC, among
others). These platforms are marked as XFAIL for now, but will be fixed
later.
Chad Rosier [Wed, 13 Nov 2013 20:05:37 +0000 (20:05 +0000)]
[AArch64] Add support for legacy AArch32 NEON scalar shift by immediate
instructions. This patch does not include the shift right and accumulate
instructions. A number of non-overloaded intrinsics have been remove in favor
of their overloaded counterparts.
Hans Wennborg [Wed, 13 Nov 2013 19:12:02 +0000 (19:12 +0000)]
CMake: make building with /MT an option instead of always forcing it
for release builds.
This is a follow-up to r194589. Aaron pointed out that building
libraries with /MT and using them in an application that uses a
different run-time library can be a bad idea.
Move the option to build with /MT behind a CMake option so it can be
turned on selectively, such as when building the toolchain installer.
Weiming Zhao [Wed, 13 Nov 2013 18:29:49 +0000 (18:29 +0000)]
Enable generating legacy IT block for AArch32
By default, the behavior of IT block generation will be determinated
dynamically base on the arch (armv8 vs armv7). This patch adds backend
options: -arm-restrict-it and -arm-no-restrict-it. The former one
restricts the generation of IT blocks (the same behavior as thumbv8) for
both arches. The later one allows the generation of legacy IT block (the
same behavior as ARMv7 Thumb2) for both arches.
Clang will support -mrestrict-it and -mno-restrict-it, which is
compatible with GCC.
Rafael Espindola [Wed, 13 Nov 2013 14:01:59 +0000 (14:01 +0000)]
Remove AllowQuotesInName and friends from MCAsmInfo.
Accepting quotes is a property of an assembler, not of an object file. For
example, ELF can support any names for sections and symbols, but the gnu
assembler only accepts quotes in some contexts and llvm-mc in a few more.
LLVM should not produce different symbols based on a guess about which assembler
will be reading the code it is printing.
Rafael Espindola [Wed, 13 Nov 2013 13:44:11 +0000 (13:44 +0000)]
Don't call doFinalization from verifyFunction.
verifyFunction needs to call doInitialization to collect metadata and avoid
crashing when verifying debug info in a function.
But it should not call doFinalization since that is where the verifier will
check declarations, variables and aliases, which is not desirable when one
only wants to verify a function.
A possible cleanup would be to split the class into a ModuleVerifier and
FunctionVerifier.
Issue reported by Ilia Filippov. Patch by Michael Kruse.
Diego Novillo [Wed, 13 Nov 2013 12:22:21 +0000 (12:22 +0000)]
SampleProfileLoader pass. Initial setup.
This adds a new scalar pass that reads a file with samples generated
by 'perf' during runtime. The samples read from the profile are
incorporated and emmited as IR metadata reflecting that profile.
The profile file is assumed to have been generated by an external
profile source. The profile information is converted into IR metadata,
which is later used by the analysis routines to estimate block
frequencies, edge weights and other related data.
External profile information files have no fixed format, each profiler
is free to define its own. This includes both the on-disk representation
of the profile and the kind of profile information stored in the file.
A common kind of profile is based on sampling (e.g., perf), which
essentially counts how many times each line of the program has been
executed during the run.
The SampleProfileLoader pass is organized as a scalar transformation.
On startup, it reads the file given in -sample-profile-file to
determine what kind of profile it contains. This file is assumed to
contain profile information for the whole application. The profile
data in the file is read and incorporated into the internal state of
the corresponding profiler.
To facilitate testing, I've organized the profilers to support two file
formats: text and native. The native format is whatever on-disk
representation the profiler wants to support, I think this will mostly
be bitcode files, but it could be anything the profiler wants to
support. To do this, every profiler must implement the
SampleProfile::loadNative() function.
The text format is mostly meant for debugging. Records are separated by
newlines, but each profiler is free to interpret records as it sees fit.
Profilers must implement the SampleProfile::loadText() function.
Finally, the pass will call SampleProfile::emitAnnotations() for each
function in the current translation unit. This function needs to
translate the loaded profile into IR metadata, which the analyzer will
later be able to use.
This patch implements the first steps towards the above design. I've
implemented a sample-based flat profiler. The format of the profile is
fairly simplistic. Each sampled function contains a list of relative
line locations (from the start of the function) together with a count
representing how many samples were collected at that line during
execution. I generate this profile using perf and a separate converter
tool.
Currently, I have only implemented a text format for these profiles. I
am interested in initial feedback to the whole approach before I send
the other parts of the implementation for review.
This patch implements:
- The SampleProfileLoader pass.
- The base ExternalProfile class with the core interface.
- A SampleProfile sub-class using the above interface. The profiler
generates branch weight metadata on every branch instructions that
matches the profiles.
- A text loader class to assist the implementation of
SampleProfile::loadText().
- Basic unit tests for the pass.
Additionally, the patch uses profile information to compute branch
weights based on instruction samples.
This patch converts instruction samples into branch weights. It
does a fairly simplistic conversion:
Given a multi-way branch instruction, it calculates the weight of
each branch based on the maximum sample count gathered from each
target basic block.
Note that this assignment of branch weights is somewhat lossy and can be
misleading. If a basic block has more than one incoming branch, all the
incoming branches will get the same weight. In reality, it may be that
only one of them is the most heavily taken branch.
I will adjust this assignment in subsequent patches.
Alexey Samsonov [Wed, 13 Nov 2013 11:56:22 +0000 (11:56 +0000)]
FileCheck: fix a bug with multiple --check-prefix options.
Summary:
This fixes a subtle bug in new FileCheck feature added
in r194343. When we search for the first satisfying check-prefix,
we should actually return the first encounter of some check-prefix as a
substring, even if it's not a part of valid check-line. Otherwise
"FileCheck --check-prefix=FOO --check-prefix=BAR" with check file:
FOO not a vaild check-line
FOO: foo
BAR: bar
incorrectly accepted file:
fog
bar
as it skipped the first two encounters of FOO, matching only BAR: line.
Reed Kotler [Wed, 13 Nov 2013 04:37:52 +0000 (04:37 +0000)]
Allow the code which returns the length for inline assembler to know
specifically about the .space directive. This allows us to force large
blocks of code to appear in test cases for things like constant islands
without having to make giant test cases to force things like long
branches to take effect.
Chandler Carruth [Wed, 13 Nov 2013 02:48:20 +0000 (02:48 +0000)]
Fix a null pointer dereference when copying a null polymorphic pointer.
This bug only bit the C++98 build bots because all of the actual uses
really do move. ;] But not *quite* ready to do the whole C++11 switch
yet, so clean it up. Also add a unit test that catches this immediately.
Juergen Ributzka [Wed, 13 Nov 2013 01:57:54 +0000 (01:57 +0000)]
SelectionDAG: Teach the legalizer to split SETCC if VSELECT needs splitting too.
This patch reapplies r193676 with an additional fix for the Hexagon backend. The
SystemZ backend has already been fixed by r194148.
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask type for the given target. Now the type
legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.