Merge LLVM upstream r91428.

Squashed commit of the following:

commit 08c733e79dd6b65be6eab3060b47fe4d231098b9
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 09:05:13 2009 +0000

add some other xforms that should be done as part of PR5783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91428 91177308-0d34-0410-b5e6-96231b3b80d8

commit 39a7fa146ef728a10fce157d2efcecd806bf276b
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 08:34:01 2009 +0000

A few improvements:
1. Use std::equal instead of reinventing it.
2. Don't run dtors in destroy_range if the element is pod-like.
3. Use isPodLike to decide between memcpy/uninitialized_copy
instead of is_class. isPodLike is more generous in some cases
(a sketch of this dispatch follows this entry).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91427 91177308-0d34-0410-b5e6-96231b3b80d8
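
As an illustration of item 3, here is a minimal standalone sketch of the
grow-time copy dispatch (simplified from the SmallVector diff at the end of
this log; 'relocate' is a hypothetical name, and isPodLike is the trait
sketched under r91421 below):

    #include <cstring>   // memcpy
    #include <memory>    // std::uninitialized_copy

    template <typename T>
    void relocate(T *OldBegin, T *OldEnd, T *NewElts) {
      if (isPodLike<T>::value)
        // Bitwise copy is safe for pod-like element types.
        memcpy(NewElts, OldBegin, (OldEnd - OldBegin) * sizeof(T));
      else
        // Otherwise copy-construct each element into the raw memory.
        std::uninitialized_copy(OldBegin, OldEnd, NewElts);
    }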

commit 3a95c15ce022ba6cdeea981f9b7b0a7d4724e11a
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 08:29:22 2009 +0000

hoist the begin/end/capacity members and a few trivial methods
up into the non-templated SmallVectorBase class.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91426 91177308-0d34-0410-b5e6-96231b3b80d8

commit 142f4f4c9d8ab4a1d1eb5c2fde61a6383fed25c4
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 07:40:44 2009 +0000

improve isPodLike to know that all non-class types are pod.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91425 91177308-0d34-0410-b5e6-96231b3b80d8

commit bc6f37b22aeb8f1ec5c7eb650ecbdea67f34a3de
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 07:27:58 2009 +0000

Lang verified that SlotIndex is "pod like" even though it isn't a pod.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91423 91177308-0d34-0410-b5e6-96231b3b80d8

commit 169f3a233e90dcdd01e42829b396c823d016fe30
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 07:26:43 2009 +0000

Remove isPod() from DenseMapInfo, splitting it out to its own
isPodLike type trait. This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91421 91177308-0d34-0410-b5e6-96231b3b80d8
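
A minimal sketch of what such a trait looks like (illustrative only; the real
definition lives in llvm/Support/type_traits.h and is refined by the commits
above, e.g. r91425 makes all non-class types pod-like):

    // Conservative default: not known to be pod-like.
    template <typename T>
    struct isPodLike {
      static const bool value = false;
    };

    // A type that is not strictly a POD can still opt in, the way SlotIndex
    // does in r91423 (MySlotIndex is a stand-in type for illustration):
    struct MySlotIndex { unsigned Index; MySlotIndex() : Index(0) {} };
    template <> struct isPodLike<MySlotIndex> {
      static const bool value = true;
    };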

commit e1d483dade6f1675d9c2279fb9ae503858b89844
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Tue Dec 15 07:21:14 2009 +0000

Convert llvmc tests to FileCheck.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91420 91177308-0d34-0410-b5e6-96231b3b80d8

commit d016c18182f165c7a967f1c5a6a343971bcd2465
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Tue Dec 15 07:20:50 2009 +0000

Support hook invocation from 'append_cmd'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91419 91177308-0d34-0410-b5e6-96231b3b80d8

commit 4136d8daf27d7f04dea28a578b39e5a614fca81e
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 06:49:02 2009 +0000

Fix an encoding bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91417 91177308-0d34-0410-b5e6-96231b3b80d8

commit 54fec492a4c81ee84265ad953f4212eda9aff5c1
Author: Chris Lattner <sabre@nondot.org>
Date: Tue Dec 15 06:14:33 2009 +0000

add an ALWAYS_INLINE macro, which does the obvious thing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91416 91177308-0d34-0410-b5e6-96231b3b80d8
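
The log does not show the definition; a plausible sketch, assuming the usual
GCC attribute spelling (the non-GCC branch here is a guess):

    #ifdef __GNUC__
    // Ask the compiler to inline regardless of its -O heuristics.
    #define ALWAYS_INLINE __attribute__((always_inline))
    #else
    // Could map to __forceinline for MSVC; expands to nothing here.
    #define ALWAYS_INLINE
    #endif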

commit 428c804a753234ecaf6a6177107361a1312508f8
Author: Kenneth Uildriks <kennethuil@gmail.com>
Date: Tue Dec 15 03:27:52 2009 +0000

For fastcc on x86, let ECX be used as a return register after EAX and EDX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91410 91177308-0d34-0410-b5e6-96231b3b80d8

commit 90468b7e484723a7ecfe7b4bf7a3264d2c6c6d06
Author: John McCall <rjmccall@apple.com>
Date: Tue Dec 15 03:10:26 2009 +0000

Names from dependent base classes are not found by unqualified lookup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91407 91177308-0d34-0410-b5e6-96231b3b80d8

commit 87c0a2dffc5590dc2604754dbe12c9430a54b27b
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 03:07:11 2009 +0000

Disable 91381 for now. It's miscompiling ARMISelDAGToDAG.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91405 91177308-0d34-0410-b5e6-96231b3b80d8

commit 5ca004428400555d08d43eebe7e91c7035793afb
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Tue Dec 15 03:04:52 2009 +0000

Validate the generated C++ code in llvmc tests.

Checks that the code generated by 'tblgen --emit-llvmc' can actually be
compiled. Also fixes two bugs found this way:

- forward_transformed_value didn't work with non-list arguments
- cl::ZeroOrOne is now called cl::Optional

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91404 91177308-0d34-0410-b5e6-96231b3b80d8

commit 0e4f60395f69857730808200642874b0ecd44896
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Tue Dec 15 03:04:14 2009 +0000

Pipe 'grep' output to 'count'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91403 91177308-0d34-0410-b5e6-96231b3b80d8

commit bc4f5408a6a0881c31e2a3165d022d74a6b2b9e5
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Tue Dec 15 03:04:02 2009 +0000

Allow $CALL(Hook, '$INFILE') for non-join tools.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91402 91177308-0d34-0410-b5e6-96231b3b80d8

commit ff7c2e17fda5f570afff5eaf75f88460019d3f74
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Tue Dec 15 03:03:37 2009 +0000

Small documentation update.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91401 91177308-0d34-0410-b5e6-96231b3b80d8

commit d629d80a80bfd6094563000bc82ed37b42acfffa
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 03:00:32 2009 +0000

Make 91378 more conservative.
1. Only perform (zext (shl (zext x), y)) -> (shl (zext x), y) when y is a constant. This makes sure it removes at least one zext.
2. If the shift is a left shift, make sure the original shift cannot shift out bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91399 91177308-0d34-0410-b5e6-96231b3b80d8

commit 7caa1423082e873b0685e8d1fb4f7351bdabb103
Author: John McCall <rjmccall@apple.com>
Date: Tue Dec 15 02:35:24 2009 +0000

You can't use typedefs to declare template member specializations, and
clang enforces it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91397 91177308-0d34-0410-b5e6-96231b3b80d8

commit e8aa0b417ca5d0fc33b6079aa11b81cf86667956
Author: Bill Wendling <isanbard@gmail.com>
Date: Tue Dec 15 01:54:51 2009 +0000

Initial work on disabling the scheduler. This is a work in progress, and this
stuff isn't used just yet.

We want to model the GCC `-fno-schedule-insns' and `-fno-schedule-insns2'
flags. The hypothesis is that the people who use these flags know what they are
doing, and have hand-optimized the C code to reduce latencies and other
conflicts.

The idea behind our scheme to turn off scheduling is to create a map "on the
side" during DAG generation. It will order the nodes by how they appeared in the
code. This map is then used during scheduling to get the ordering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91392 91177308-0d34-0410-b5e6-96231b3b80d8
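
A rough sketch of that side map, with all names hypothetical (the log does
not show the actual data structure):

    #include "llvm/ADT/DenseMap.h"
    namespace llvm { class SDNode; }

    // Filled while the SelectionDAG is built: each node is tagged with the
    // position of its originating instruction in the source.
    static llvm::DenseMap<const llvm::SDNode *, unsigned> SourceOrder;
    static unsigned NextOrdinal = 0;

    static void noteNodeCreated(const llvm::SDNode *N) {
      SourceOrder[N] = NextOrdinal++;   // remember program order
    }

A source-order scheduler would then emit nodes in increasing SourceOrder
instead of applying its usual latency heuristics.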

commit 3f0f8885c7079d20930ca0336bb879adde51aaaf
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 01:44:10 2009 +0000

Tail duplication should zap a copy it inserted for SSA update if the copy is the only use of its source.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91390 91177308-0d34-0410-b5e6-96231b3b80d8

commit 834ae6b04f4c3650b92182662aa8bb5b0fcf419f
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 00:53:42 2009 +0000

Use sbb x, x to materialize the carry bit in a GPR. The result is all ones or all zeros.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91381 91177308-0d34-0410-b5e6-96231b3b80d8

commit 5b6187226b44f590ce7f614b128480b9c2d823ef
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 00:52:11 2009 +0000

Fold (zext (and x, cst)) -> (and (zext x), cst).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91380 91177308-0d34-0410-b5e6-96231b3b80d8
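
In C terms the two sides of the fold compute the same value whenever the
constant fits in the narrow type (a quick sanity check, not the actual
DAG-combiner code):

    #include <cstdint>

    // zext(and x, cst): mask in the narrow type, then widen.
    uint64_t foldBefore(uint32_t x) { return (uint64_t)(x & 0xFFu); }

    // and(zext x, cst): widen first, then mask in the wide type.
    uint64_t foldAfter(uint32_t x)  { return (uint64_t)x & 0xFFu; }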

commit d08dad66572d86df1826c3547cb824b43ae8e8be
Author: Daniel Dunbar <daniel@zuster.org>
Date: Tue Dec 15 00:41:47 2009 +0000

NNT: Make sure stderr for build commands goes to log file, as intended but misdirected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91379 91177308-0d34-0410-b5e6-96231b3b80d8

commit 3ff63ae679cf08e69db6770e7965e4f3d04637b9
Author: Evan Cheng <evan.cheng@apple.com>
Date: Tue Dec 15 00:41:36 2009 +0000

Propagate zext through logical shift.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91378 91177308-0d34-0410-b5e6-96231b3b80d8

commit 9f669d99f66d2ca120c85c4c379f2571d6dd947a
Author: Eric Christopher <echristo@apple.com>
Date: Tue Dec 15 00:40:55 2009 +0000

Formatting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91377 91177308-0d34-0410-b5e6-96231b3b80d8

commit 87426f8cd21507e13f0256a6727a0c27f60705c3
Author: Bill Wendling <isanbard@gmail.com>
Date: Tue Dec 15 00:39:24 2009 +0000

Revert these. They may have been causing 483_xalancbmk to fail:

$ svn merge -c -91161 https://llvm.org/svn/llvm-project/llvm/trunk
--- Reverse-merging r91161 into '.':
U lib/CodeGen/BranchFolding.cpp
U lib/CodeGen/MachineBasicBlock.cpp
$ svn merge -c -91113 https://llvm.org/svn/llvm-project/llvm/trunk
--- Reverse-merging r91113 into '.':
G lib/CodeGen/MachineBasicBlock.cpp
$ svn merge -c -91101 https://llvm.org/svn/llvm-project/llvm/trunk
--- Reverse-merging r91101 into '.':
U include/llvm/CodeGen/MachineBasicBlock.h
G lib/CodeGen/MachineBasicBlock.cpp
$ svn merge -c -91092 https://llvm.org/svn/llvm-project/llvm/trunk
--- Reverse-merging r91092 into '.':
G include/llvm/CodeGen/MachineBasicBlock.h
G lib/CodeGen/MachineBasicBlock.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91376 91177308-0d34-0410-b5e6-96231b3b80d8

commit e6e14f2cfc4e8bd346bf3fa7a5ac87b6ebf422ff
Author: Jim Grosbach <grosbach@apple.com>
Date: Tue Dec 15 00:12:35 2009 +0000

NAND atomic requires opposite operand ordering

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91371 91177308-0d34-0410-b5e6-96231b3b80d8

commit c6cfdd3f717bfa1b43351c354e39c066dbd167cd
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 23:40:38 2009 +0000

Fix integer cast code to handle vector types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91362 91177308-0d34-0410-b5e6-96231b3b80d8

commit 8b0d8db13172ed290285f0832e021b4ce3ef9aea
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 23:36:03 2009 +0000

Move Flag and isVoid after the vector types, since bit arithmetic with
those enum values is less common.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91361 91177308-0d34-0410-b5e6-96231b3b80d8

commit 81c5562ec2e59741258fc67824bbb64b91ece71e
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 23:34:36 2009 +0000

Fix these asserts to check the invariant that the code actually
depends on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91360 91177308-0d34-0410-b5e6-96231b3b80d8

commit 612ae24984fcce968041a2f3f379505d8e007a83
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 23:13:31 2009 +0000

Update this comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91356 91177308-0d34-0410-b5e6-96231b3b80d8

commit a34782d2b71b5fd6b3b32fa4943de1fc89d47115
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 23:08:09 2009 +0000

Fix this to properly clear the FastISel debug location. Thanks to
Bill for spotting this!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91355 91177308-0d34-0410-b5e6-96231b3b80d8

commit becf334c8d194b1f6c21db915ba3c22c451ab42a
Author: Bob Wilson <bob.wilson@apple.com>
Date: Mon Dec 14 22:44:22 2009 +0000

Rearrange rules to add missing dependency and allow parallel makes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91352 91177308-0d34-0410-b5e6-96231b3b80d8

commit 6fdbe657ebacf9e1bdb1e2cebfe82a9549d86d3e
Author: Johnny Chen <johnny.chen@apple.com>
Date: Mon Dec 14 21:51:34 2009 +0000

Add encoding bits "let Inst{11-4} = 0b00000000;" to BR_JTr to disambiguate
between BR_JTr and STREXD.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91339 91177308-0d34-0410-b5e6-96231b3b80d8

commit 32b48e95922e730032f188c313cdd2e50c63cbc9
Author: Bill Wendling <isanbard@gmail.com>
Date: Mon Dec 14 21:49:44 2009 +0000

The CIE says that the LSDA pointer in the FDE section is an "sdata4". That's
fine, but we need it to actually be 4 bytes in the FDE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91337 91177308-0d34-0410-b5e6-96231b3b80d8

commit ba16e07fc539e23bb604defb021187e64c04702a
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 21:33:32 2009 +0000

v6 sync insn copy/paste error

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91333 91177308-0d34-0410-b5e6-96231b3b80d8

commit 6eee903ab286e1a0093c5091bb30fc35e00cd86b
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 21:24:16 2009 +0000

Add ARMv6 memory and sync barrier instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91329 91177308-0d34-0410-b5e6-96231b3b80d8

commit 8ac1d378d72ad45806ab86d316bd50ca5e7f861c
Author: Johnny Chen <johnny.chen@apple.com>
Date: Mon Dec 14 21:01:46 2009 +0000

Fixed encoding bits typo of ldrexd/strexd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91327 91177308-0d34-0410-b5e6-96231b3b80d8

commit 5b595cd6311d7b9670268b69096b55dd1a384d35
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 20:14:59 2009 +0000

Thumb2 atomic operations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91321 91177308-0d34-0410-b5e6-96231b3b80d8

commit 95218a2eb2163a644a1ff5d419ef0b581a60eb39
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 19:55:22 2009 +0000

Add svn:ignore entries for the Disassembler files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91320 91177308-0d34-0410-b5e6-96231b3b80d8

commit b6e3c7b1e4283ee072c8115244515c145ef6d072
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 19:43:09 2009 +0000

Move several function bodies which are rarely inlined out of line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91319 91177308-0d34-0410-b5e6-96231b3b80d8

commit a850594e8be4f3a3cb7c4d404b8434dfb3844ec8
Author: Chris Lattner <sabre@nondot.org>
Date: Mon Dec 14 19:34:32 2009 +0000

fix an obvious bug found by clang++ and collapse a redundant if.

Here's the diagnostic from clang:

/Volumes/Data/dgregor/Projects/llvm/lib/Target/CppBackend/CPPBackend.cpp:989:23: warning: 'gv' is always NULL in this context
        printConstant(gv);
                      ^
1 diagnostic generated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91318 91177308-0d34-0410-b5e6-96231b3b80d8

commit 5dbe26aa8326068823cb9481972426dca151c3cc
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 19:32:31 2009 +0000

Micro-optimize these functions in the case where they are not inlined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91316 91177308-0d34-0410-b5e6-96231b3b80d8

commit 2d6e24935ebc8902bd9b22f73ba02fa31d60f8bb
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 19:24:11 2009 +0000

correct selection requirements for thumb2 vs. arm versions of the barrier intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91313 91177308-0d34-0410-b5e6-96231b3b80d8

commit 4fc99a1a87b457f994c5e8e0d12206b9b2e02bb4
Author: Eric Christopher <echristo@apple.com>
Date: Mon Dec 14 19:07:25 2009 +0000

Add radar fixed in comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91312 91177308-0d34-0410-b5e6-96231b3b80d8

commit efbc1f057fd24bd540ab94dfcac298d6762aa3bd
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 18:56:47 2009 +0000

add Thumb2 atomic and memory barrier instruction definitions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91310 91177308-0d34-0410-b5e6-96231b3b80d8

commit 31b2740914c8fec8580a8dc1000e3b5295309dfb
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 18:36:32 2009 +0000

whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91307 91177308-0d34-0410-b5e6-96231b3b80d8

commit 63437d96828f86ca3833c58964c4a5d4b142aa07
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 18:31:20 2009 +0000

ARM memory barrier instructions are not predicable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91305 91177308-0d34-0410-b5e6-96231b3b80d8

commit 49da09d50ecde9dcaacb4bc57807b9fe0fd31005
Author: Daniel Dunbar <daniel@zuster.org>
Date: Mon Dec 14 17:58:33 2009 +0000

NNT: Use [e]grep -a when scanning logs; it's possible they will have non-text
characters in them, in which case grep will just return 'Binary file
matches' and the whole thing falls over.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91302 91177308-0d34-0410-b5e6-96231b3b80d8

commit 2cef3baf8abe8446367182510bb5410247c99a8e
Author: Daniel Dunbar <daniel@zuster.org>
Date: Mon Dec 14 17:58:27 2009 +0000

NNT: Always create the -sentdata.txt file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91301 91177308-0d34-0410-b5e6-96231b3b80d8

commit 5d6c29ba56ae19b4d81f8a8f7abf04aa356403fb
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:35:17 2009 +0000

Clear the Processed set when it is no longer used, and clear the
IVUses list in releaseMemory().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91296 91177308-0d34-0410-b5e6-96231b3b80d8

commit 6f4122b67d3bd31a6d3544f319527949f2d1cf4e
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:31:01 2009 +0000

Fix a thinko; isNotAlreadyContainedIn had a built-in negative, so the
condition was inverted when the code was converted to contains().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91295 91177308-0d34-0410-b5e6-96231b3b80d8
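
Schematically (the affected call sites are not shown in this log):

    // Old predicate, with the negation built into its name:
    //   if (isNotAlreadyContainedIn(SubLoop, ParentLoop)) ...
    // The mechanical conversion dropped that negation:
    //   if (ParentLoop->contains(SubLoop)) ...      // thinko: inverted
    // The fix reintroduces it explicitly:
    //   if (!ParentLoop->contains(SubLoop)) ...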

commit b937cf58556a1fea130dae4d42e49489b308edc5
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:19:06 2009 +0000

Remove unnecessary #includes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91293 91177308-0d34-0410-b5e6-96231b3b80d8

commit 34b9035f2aa2e072afa2da175e47a86de9c723ce
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:14:32 2009 +0000

Make the IVUses member private.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91291 91177308-0d34-0410-b5e6-96231b3b80d8

commit 8c5b238c82b464d9993971757204b347a18ed86e
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:12:51 2009 +0000

Instead of having a ScalarEvolution pointer member in BasedUser, just pass
the ScalarEvolution pointer into the functions which need it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91289 91177308-0d34-0410-b5e6-96231b3b80d8

commit e48b5a49457a7976192930b8503e889383e7c0e7
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:10:44 2009 +0000

Don't bother cleaning up if there's nothing to clean up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91288 91177308-0d34-0410-b5e6-96231b3b80d8

commit a7366d992d0c8f3d840085a41c04be370a3cfe95
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:08:09 2009 +0000

Delete an unused variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91287 91177308-0d34-0410-b5e6-96231b3b80d8

commit f1e30e458078b78f37abe8ca738d50df8b3cfae8
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:06:50 2009 +0000

Drop Loop::isNotAlreadyContainedIn in favor of Loop::contains. The
former was just exposing a LoopInfoBase implementation detail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91286 91177308-0d34-0410-b5e6-96231b3b80d8

commit c83030d61279ac68b9532896fea512ae408387de
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 17:02:55 2009 +0000

add ldrexd/strexd instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91284 91177308-0d34-0410-b5e6-96231b3b80d8

commit d1d6f3708a558575396f8c066b9d9575889f8642
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 17:02:34 2009 +0000

LSR itself doesn't need LoopInfo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91283 91177308-0d34-0410-b5e6-96231b3b80d8

commit 01c63bf35c8b7ff7775bc83a02a39fc2efcfe3f8
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 16:57:08 2009 +0000

LSR itself doesn't need DominatorTree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91282 91177308-0d34-0410-b5e6-96231b3b80d8

commit 7ad7ae23378a83d55d836338cf33935a4a6829b9
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 16:52:55 2009 +0000

Remove the code in LSR that manually hoists expansions out of loops;
SCEVExpander does this automatically.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91281 91177308-0d34-0410-b5e6-96231b3b80d8

commit c476702d130b84050a146b9e8a602709bbdc3e2e
Author: Dan Gohman <gohman@apple.com>
Date: Mon Dec 14 16:37:29 2009 +0000

Minor code cleanups.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91280 91177308-0d34-0410-b5e6-96231b3b80d8

commit 814a12c5353afed59395f62dc082aca10b93c3dd
Author: Devang Patel <dpatel@apple.com>
Date: Mon Dec 14 16:18:45 2009 +0000

Use DW_AT_specification to point to DIE describing function declaration.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91278 91177308-0d34-0410-b5e6-96231b3b80d8

commit 99e265ce64bf952da29d01da65438a96984819fe
Author: Shantonu Sen <ssen@apple.com>
Date: Mon Dec 14 14:15:15 2009 +0000

Remove empty file completely

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91277 91177308-0d34-0410-b5e6-96231b3b80d8

commit c8da34e01191c9d3819aa1b52fdbc6fe1d544095
Author: Edwin Török <edwintorok@gmail.com>
Date: Mon Dec 14 12:38:18 2009 +0000

Add "generic" fallback.

GCC warned that the function may not have a return value; indeed, for
non-Intel and non-AMD x86 CPUs (VIA, etc.) it is right.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91276 91177308-0d34-0410-b5e6-96231b3b80d8
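
A minimal sketch of the shape of the fix (vendor checks compressed into a toy
enum; the real detection reads CPUID):

    #include <string>

    enum Vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

    std::string hostCPUNameSketch(Vendor V) {
      if (V == VENDOR_INTEL) return "core2";   // placeholder detection
      if (V == VENDOR_AMD)   return "athlon";  // placeholder detection
      return "generic";  // new fallback: every path now returns a value
    }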

commit 76d8399d2044c4af7ef6b723f2905e4ad6cbbbf3
Author: Lang Hames <lhames@gmail.com>
Date: Mon Dec 14 07:43:25 2009 +0000

Added CalcSpillWeights to CMakeLists.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91275 91177308-0d34-0410-b5e6-96231b3b80d8

commit 8233240e96cc3df533f37641d17df9ae2d15af12
Author: Bill Wendling <isanbard@gmail.com>
Date: Mon Dec 14 06:51:19 2009 +0000

Whitespace changes, comment clarification. No functional changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91274 91177308-0d34-0410-b5e6-96231b3b80d8

commit 4f49e0f7a619ff4a98eae831896636e8fa9051a4
Author: Lang Hames <lhames@gmail.com>
Date: Mon Dec 14 06:49:42 2009 +0000

Moved spill weight calculation out of SimpleRegisterCoalescing and into its own pass: CalculateSpillWeights.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91273 91177308-0d34-0410-b5e6-96231b3b80d8

commit edf3f1eff2ea650086482a564fe3a649801a17fe
Author: Chris Lattner <sabre@nondot.org>
Date: Mon Dec 14 05:11:02 2009 +0000

revert r91184, because it causes a crash on a .bc file I just
sent to Bob.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91268 91177308-0d34-0410-b5e6-96231b3b80d8

commit c7e4ddcbcad547e5513dfd7eefe8c1ae97e84485
Author: Jim Grosbach <grosbach@apple.com>
Date: Mon Dec 14 04:22:04 2009 +0000

Atomic binary operations up to 32 bits wide.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91260 91177308-0d34-0410-b5e6-96231b3b80d8

commit 15874c1fc1d911bfe2ff73e4a66d500d2c07e6f6
Author: Mikhail Glushenkov <foldr@codedgers.com>
Date: Mon Dec 14 04:06:38 2009 +0000

Add a test for the 'init' option property.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91259 91177308-0d34-0410-b5e6-96231b3b80d8

commit 7fe6f87162f412660039e41bc96d1ac96d107176
Author: Jeffrey Yasskin <jyasskin@google.com>
Date: Sun Dec 13 20:30:32 2009 +0000

Reinstate r91208 to fix available_externally linkage for globals, with
nlewycky's fix to add -rdynamic so the JIT can look symbols up in Linux builds
of the JITTests binary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91250 91177308-0d34-0410-b5e6-96231b3b80d8

commit eae8d0465c69874badfcb83312d374d1ba668962
Author: Edwin Török <edwintorok@gmail.com>
Date: Sun Dec 13 08:59:40 2009 +0000

Using _MSC_VER there was wrong; better to just reuse the already existing
ifdefs for x86 CPU detection in the X86 getHostCPUName too, and create a
simple getHostCPUName that returns "generic" for everything else.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91240 91177308-0d34-0410-b5e6-96231b3b80d8

commit 87a4e6c0cb18756f3d55ec0f1b5cb86c4c88e068
Author: Chandler Carruth <chandlerc@gmail.com>
Date: Sun Dec 13 07:04:45 2009 +0000

Don't leave pointers uninitialized in the default constructor. GCC complains
about the potential use of these uninitialized members under certain conditions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91239 91177308-0d34-0410-b5e6-96231b3b80d8

commit 7c29ae320a827facfbcc32b91d6d98c6b06e44ea
Author: Anton Korobeynikov <asl@math.spbu.ru>
Date: Sun Dec 13 01:00:59 2009 +0000

Fix a weird typo which leads to unallocated memory access for nodes with 4 results.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91233 91177308-0d34-0410-b5e6-96231b3b80d8

commit efb9350360fe13284f9162fec884d16590da206a
Author: Anton Korobeynikov <asl@math.spbu.ru>
Date: Sun Dec 13 01:00:32 2009 +0000

Do not allow uninitialized access during debug printing

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91232 91177308-0d34-0410-b5e6-96231b3b80d8

commit 993eb83df375e2fa7d3fb2c1519690402c27b460
Author: Eli Friedman <eli.friedman@gmail.com>
Date: Sat Dec 12 23:23:43 2009 +0000

More info on this transformation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91230 91177308-0d34-0410-b5e6-96231b3b80d8

commit a3a131a8c63dc9768694c87d74109afefb021cfb
Author: Eli Friedman <eli.friedman@gmail.com>
Date: Sat Dec 12 21:41:48 2009 +0000

Remove some stuff that's already implemented. Also, remove the note about
merging x >u 5 and x <s 20 because it's impossible to implement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91228 91177308-0d34-0410-b5e6-96231b3b80d8

commit ee30369f778aaece9f0f70dc482331c6ed8cb326
Author: Daniel Dunbar <daniel@zuster.org>
Date: Sat Dec 12 21:17:54 2009 +0000

Update install-clang target for clang-cc removal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91226 91177308-0d34-0410-b5e6-96231b3b80d8

commit 8597901e12c0caa1cf841472e12df422c1d2c02b
Author: Evan Cheng <evan.cheng@apple.com>
Date: Sat Dec 12 20:03:14 2009 +0000

Disable r91104 for x86. It causes partial register stalls which pessimize code in 32-bit mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91223 91177308-0d34-0410-b5e6-96231b3b80d8

commit 36d987ea7542f0face7c2a3e98cfa4d8f31ab5e9
Author: Anton Korobeynikov <asl@math.spbu.ru>
Date: Sat Dec 12 18:55:37 2009 +0000

Implement variable-width shifts.
No testcase yet - it seems we're exposing generic codegen bugs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91221 91177308-0d34-0410-b5e6-96231b3b80d8

commit 9357ab4ddb97c2a6606ba0ee9f859b9c93b364b7
Author: Evan Cheng <evan.cheng@apple.com>
Date: Sat Dec 12 18:55:26 2009 +0000

Add comment about potential partial register stall.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91220 91177308-0d34-0410-b5e6-96231b3b80d8

commit ca348204499380bc590165f8467f8dccdc3f414a
Author: Evan Cheng <evan.cheng@apple.com>
Date: Sat Dec 12 18:51:56 2009 +0000

Fix an obvious bug. No test case since LEA16r is not being used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91219 91177308-0d34-0410-b5e6-96231b3b80d8

commit d50b1dd026feed23280c98d75ec3465627424725
Author: Edwin Török <edwintorok@gmail.com>
Date: Sat Dec 12 12:42:31 2009 +0000

Enable CPU detection when using MS VS 2k8 too.
MSVS2k8 doesn't define __i386__, hence all the CPU detection code was disabled.
Enable it by looking for _MSC_VER.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91217 91177308-0d34-0410-b5e6-96231b3b80d8
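
Schematically, the guard around the detection code is widened (simplified;
the real condition presumably also covers __x86_64__ and friends):

    #if defined(__i386__) || defined(__x86_64__) || defined(_MSC_VER)
      // ... CPUID-based x86 host CPU detection ...
    #endif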

Török Edvin authored on 2009/12/15 21:28:14
Showing 87 changed files
... ...
@@ -66,8 +66,7 @@ ifeq ($(MAKECMDGOALS),tools-only)
66 66
 endif
67 67
 
68 68
 ifeq ($(MAKECMDGOALS),install-clang)
69
-  DIRS := tools/clang/tools/driver tools/clang/tools/clang-cc \
70
-	tools/clang/lib/Headers tools/clang/docs
69
+  DIRS := tools/clang/tools/driver tools/clang/lib/Headers tools/clang/docs
71 70
   OPTIONAL_DIRS :=
72 71
   NO_INSTALL = 1
73 72
 endif
... ...
@@ -100,7 +100,12 @@ install-ocamldoc: ocamldoc
100 100
 	  $(FIND) . -type f -exec \
101 101
 	    $(DataInstall) {} $(PROJ_docsdir)/ocamldoc/html \;
102 102
 
103
-ocamldoc: regen-ocamldoc $(PROJ_OBJ_DIR)/ocamldoc.tar.gz
103
+ocamldoc: regen-ocamldoc
104
+	$(Echo) Packaging ocamldoc documentation
105
+	$(Verb) $(RM) -rf $(PROJ_OBJ_DIR)/ocamldoc.tar*
106
+	$(Verb) $(TAR) cf $(PROJ_OBJ_DIR)/ocamldoc.tar ocamldoc
107
+	$(Verb) $(GZIP) $(PROJ_OBJ_DIR)/ocamldoc.tar
108
+	$(Verb) $(CP) $(PROJ_OBJ_DIR)/ocamldoc.tar.gz $(PROJ_OBJ_DIR)/ocamldoc/html/
104 109
 
105 110
 regen-ocamldoc:
106 111
 	$(Echo) Building ocamldoc documentation
... ...
@@ -113,13 +118,6 @@ regen-ocamldoc:
113 113
 		$(OCAMLDOC) -d $(PROJ_OBJ_DIR)/ocamldoc/html -sort -colorize-code -html \
114 114
 		`$(FIND) $(LEVEL)/bindings/ocaml -name "*.odoc" -exec echo -load '{}' ';'`
115 115
 
116
-$(PROJ_OBJ_DIR)/ocamldoc.tar.gz:
117
-	$(Echo) Packaging ocamldoc documentation
118
-	$(Verb) $(RM) -rf $@ $(PROJ_OBJ_DIR)/ocamldoc.tar
119
-	$(Verb) $(TAR) cf $(PROJ_OBJ_DIR)/ocamldoc.tar ocamldoc
120
-	$(Verb) $(GZIP) $(PROJ_OBJ_DIR)/ocamldoc.tar
121
-	$(Verb) $(CP) $(PROJ_OBJ_DIR)/ocamldoc.tar.gz $(PROJ_OBJ_DIR)/ocamldoc/html/
122
-
123 116
 uninstall-local::
124 117
 	$(Echo) Uninstalling Documentation
125 118
 	$(Verb) $(RM) -rf $(PROJ_docsdir)
... ...
@@ -217,7 +217,8 @@ public:
217 217
 
218 218
 private:
219 219
   void CopyFrom(const DenseMap& other) {
220
-    if (NumBuckets != 0 && (!KeyInfoT::isPod() || !ValueInfoT::isPod())) {
220
+    if (NumBuckets != 0 &&
221
+        (!isPodLike<KeyInfoT>::value || !isPodLike<ValueInfoT>::value)) {
221 222
       const KeyT EmptyKey = getEmptyKey(), TombstoneKey = getTombstoneKey();
222 223
       for (BucketT *P = Buckets, *E = Buckets+NumBuckets; P != E; ++P) {
223 224
         if (!KeyInfoT::isEqual(P->first, EmptyKey) &&
... ...
@@ -239,7 +240,7 @@ private:
239 239
     Buckets = static_cast<BucketT*>(operator new(sizeof(BucketT) *
240 240
                                                  other.NumBuckets));
241 241
 
242
-    if (KeyInfoT::isPod() && ValueInfoT::isPod())
242
+    if (isPodLike<KeyInfoT>::value && isPodLike<ValueInfoT>::value)
243 243
       memcpy(Buckets, other.Buckets, other.NumBuckets * sizeof(BucketT));
244 244
     else
245 245
       for (size_t i = 0; i < other.NumBuckets; ++i) {
... ...
@@ -15,7 +15,7 @@
15 15
 #define LLVM_ADT_DENSEMAPINFO_H
16 16
 
17 17
 #include "llvm/Support/PointerLikeTypeTraits.h"
18
-#include <utility>
18
+#include "llvm/Support/type_traits.h"
19 19
 
20 20
 namespace llvm {
21 21
 
... ...
@@ -25,7 +25,6 @@ struct DenseMapInfo {
25 25
   //static inline T getTombstoneKey();
26 26
   //static unsigned getHashValue(const T &Val);
27 27
   //static bool isEqual(const T &LHS, const T &RHS);
28
-  //static bool isPod()
29 28
 };
30 29
 
31 30
 // Provide DenseMapInfo for all pointers.
... ...
@@ -46,7 +45,6 @@ struct DenseMapInfo<T*> {
46 46
            (unsigned((uintptr_t)PtrVal) >> 9);
47 47
   }
48 48
   static bool isEqual(const T *LHS, const T *RHS) { return LHS == RHS; }
49
-  static bool isPod() { return true; }
50 49
 };
51 50
 
52 51
 // Provide DenseMapInfo for chars.
... ...
@@ -54,7 +52,6 @@ template<> struct DenseMapInfo<char> {
54 54
   static inline char getEmptyKey() { return ~0; }
55 55
   static inline char getTombstoneKey() { return ~0 - 1; }
56 56
   static unsigned getHashValue(const char& Val) { return Val * 37; }
57
-  static bool isPod() { return true; }
58 57
   static bool isEqual(const char &LHS, const char &RHS) {
59 58
     return LHS == RHS;
60 59
   }
... ...
@@ -65,7 +62,6 @@ template<> struct DenseMapInfo<unsigned> {
65 65
   static inline unsigned getEmptyKey() { return ~0; }
66 66
   static inline unsigned getTombstoneKey() { return ~0U - 1; }
67 67
   static unsigned getHashValue(const unsigned& Val) { return Val * 37; }
68
-  static bool isPod() { return true; }
69 68
   static bool isEqual(const unsigned& LHS, const unsigned& RHS) {
70 69
     return LHS == RHS;
71 70
   }
... ...
@@ -78,7 +74,6 @@ template<> struct DenseMapInfo<unsigned long> {
78 78
   static unsigned getHashValue(const unsigned long& Val) {
79 79
     return (unsigned)(Val * 37UL);
80 80
   }
81
-  static bool isPod() { return true; }
82 81
   static bool isEqual(const unsigned long& LHS, const unsigned long& RHS) {
83 82
     return LHS == RHS;
84 83
   }
... ...
@@ -91,7 +86,6 @@ template<> struct DenseMapInfo<unsigned long long> {
91 91
   static unsigned getHashValue(const unsigned long long& Val) {
92 92
     return (unsigned)(Val * 37ULL);
93 93
   }
94
-  static bool isPod() { return true; }
95 94
   static bool isEqual(const unsigned long long& LHS,
96 95
                       const unsigned long long& RHS) {
97 96
     return LHS == RHS;
... ...
@@ -127,7 +121,6 @@ struct DenseMapInfo<std::pair<T, U> > {
127 127
     return (unsigned)key;
128 128
   }
129 129
   static bool isEqual(const Pair& LHS, const Pair& RHS) { return LHS == RHS; }
130
-  static bool isPod() { return FirstInfo::isPod() && SecondInfo::isPod(); }
131 130
 };
132 131
 
133 132
 } // end namespace llvm
... ...
@@ -211,9 +211,12 @@ template<typename T> struct DenseMapInfo<ImmutableList<T> > {
211 211
   static bool isEqual(ImmutableList<T> X1, ImmutableList<T> X2) {
212 212
     return X1 == X2;
213 213
   }
214
-  static bool isPod() { return true; }
215 214
 };
216 215
 
216
+template <typename T> struct isPodLike;
217
+template <typename T>
218
+struct isPodLike<ImmutableList<T> > { static const bool value = true; };
219
+
217 220
 } // end llvm namespace
218 221
 
219 222
 #endif
... ...
@@ -106,6 +106,12 @@ public:
106 106
   bool operator>=(const PointerIntPair &RHS) const {return Value >= RHS.Value;}
107 107
 };
108 108
 
109
+template <typename T> struct isPodLike;
110
+template<typename PointerTy, unsigned IntBits, typename IntType>
111
+struct isPodLike<PointerIntPair<PointerTy, IntBits, IntType> > {
112
+   static const bool value = true;
113
+};
114
+  
109 115
 // Provide specialization of DenseMapInfo for PointerIntPair.
110 116
 template<typename PointerTy, unsigned IntBits, typename IntType>
111 117
 struct DenseMapInfo<PointerIntPair<PointerTy, IntBits, IntType> > {
... ...
@@ -125,7 +131,6 @@ struct DenseMapInfo<PointerIntPair<PointerTy, IntBits, IntType> > {
125 125
     return unsigned(IV) ^ unsigned(IV >> 9);
126 126
   }
127 127
   static bool isEqual(const Ty &LHS, const Ty &RHS) { return LHS == RHS; }
128
-  static bool isPod() { return true; }
129 128
 };
130 129
 
131 130
 // Teach SmallPtrSet that PointerIntPair is "basically a pointer".
... ...
@@ -46,20 +46,17 @@ namespace std {
46 46
 
47 47
 namespace llvm {
48 48
 
49
-/// SmallVectorImpl - This class consists of common code factored out of the
50
-/// SmallVector class to reduce code duplication based on the SmallVector 'N'
51
-/// template parameter.
52
-template <typename T>
53
-class SmallVectorImpl {
49
+/// SmallVectorBase - This is all the non-templated stuff common to all
50
+/// SmallVectors.
51
+class SmallVectorBase {
54 52
 protected:
55
-  T *Begin, *End, *Capacity;
53
+  void *BeginX, *EndX, *CapacityX;
56 54
 
57 55
   // Allocate raw space for N elements of type T.  If T has a ctor or dtor, we
58 56
   // don't want it to be automatically run, so we need to represent the space as
59 57
   // something else.  An array of char would work great, but might not be
60 58
   // aligned sufficiently.  Instead, we either use GCC extensions, or some
61 59
   // number of union instances for the space, which guarantee maximal alignment.
62
-protected:
63 60
 #ifdef __GNUC__
64 61
   typedef char U;
65 62
   U FirstEl __attribute__((aligned));
... ...
@@ -72,46 +69,65 @@ protected:
72 72
   } FirstEl;
73 73
 #endif
74 74
   // Space after 'FirstEl' is clobbered, do not add any instance vars after it.
75
+  
76
+protected:
77
+  SmallVectorBase(size_t Size)
78
+    : BeginX(&FirstEl), EndX(&FirstEl), CapacityX((char*)&FirstEl+Size) {}
79
+  
80
+  /// isSmall - Return true if this is a smallvector which has not had dynamic
81
+  /// memory allocated for it.
82
+  bool isSmall() const {
83
+    return BeginX == static_cast<const void*>(&FirstEl);
84
+  }
85
+  
86
+  
87
+public:
88
+  bool empty() const { return BeginX == EndX; }
89
+};
90
+  
91
+/// SmallVectorImpl - This class consists of common code factored out of the
92
+/// SmallVector class to reduce code duplication based on the SmallVector 'N'
93
+/// template parameter.
94
+template <typename T>
95
+class SmallVectorImpl : public SmallVectorBase {
96
+  void setEnd(T *P) { EndX = P; }
75 97
 public:
76 98
   // Default ctor - Initialize to empty.
77
-  explicit SmallVectorImpl(unsigned N)
78
-    : Begin(reinterpret_cast<T*>(&FirstEl)),
79
-      End(reinterpret_cast<T*>(&FirstEl)),
80
-      Capacity(reinterpret_cast<T*>(&FirstEl)+N) {
99
+  explicit SmallVectorImpl(unsigned N) : SmallVectorBase(N*sizeof(T)) {
81 100
   }
82 101
 
83 102
   ~SmallVectorImpl() {
84 103
     // Destroy the constructed elements in the vector.
85
-    destroy_range(Begin, End);
104
+    destroy_range(begin(), end());
86 105
 
87 106
     // If this wasn't grown from the inline copy, deallocate the old space.
88 107
     if (!isSmall())
89
-      operator delete(Begin);
108
+      operator delete(begin());
90 109
   }
91 110
 
92 111
   typedef size_t size_type;
93 112
   typedef ptrdiff_t difference_type;
94 113
   typedef T value_type;
95
-  typedef T* iterator;
96
-  typedef const T* const_iterator;
114
+  typedef T *iterator;
115
+  typedef const T *const_iterator;
97 116
 
98
-  typedef std::reverse_iterator<const_iterator>  const_reverse_iterator;
99
-  typedef std::reverse_iterator<iterator>  reverse_iterator;
117
+  typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
118
+  typedef std::reverse_iterator<iterator> reverse_iterator;
100 119
 
101
-  typedef T& reference;
102
-  typedef const T& const_reference;
103
-  typedef T* pointer;
104
-  typedef const T* const_pointer;
105
-
106
-  bool empty() const { return Begin == End; }
107
-  size_type size() const { return End-Begin; }
108
-  size_type max_size() const { return size_type(-1) / sizeof(T); }
120
+  typedef T &reference;
121
+  typedef const T &const_reference;
122
+  typedef T *pointer;
123
+  typedef const T *const_pointer;
109 124
 
110 125
   // forward iterator creation methods.
111
-  iterator begin() { return Begin; }
112
-  const_iterator begin() const { return Begin; }
113
-  iterator end() { return End; }
114
-  const_iterator end() const { return End; }
126
+  iterator begin() { return (iterator)BeginX; }
127
+  const_iterator begin() const { return (const_iterator)BeginX; }
128
+  iterator end() { return (iterator)EndX; }
129
+  const_iterator end() const { return (const_iterator)EndX; }
130
+private:
131
+  iterator capacity_ptr() { return (iterator)CapacityX; }
132
+  const_iterator capacity_ptr() const { return (const_iterator)CapacityX; }
133
+public:
115 134
 
116 135
   // reverse iterator creation methods.
117 136
   reverse_iterator rbegin()            { return reverse_iterator(end()); }
... ...
@@ -119,14 +135,25 @@ public:
119 119
   reverse_iterator rend()              { return reverse_iterator(begin()); }
120 120
   const_reverse_iterator rend() const { return const_reverse_iterator(begin());}
121 121
 
122
-
122
+  size_type size() const { return end()-begin(); }
123
+  size_type max_size() const { return size_type(-1) / sizeof(T); }
124
+  
125
+  /// capacity - Return the total number of elements in the currently allocated
126
+  /// buffer.
127
+  size_t capacity() const { return capacity_ptr() - begin(); }
128
+  
129
+  /// data - Return a pointer to the vector's buffer, even if empty().
130
+  pointer data() { return pointer(begin()); }
131
+  /// data - Return a pointer to the vector's buffer, even if empty().
132
+  const_pointer data() const { return const_pointer(begin()); }
133
+  
123 134
   reference operator[](unsigned idx) {
124
-    assert(Begin + idx < End);
125
-    return Begin[idx];
135
+    assert(begin() + idx < end());
136
+    return begin()[idx];
126 137
   }
127 138
   const_reference operator[](unsigned idx) const {
128
-    assert(Begin + idx < End);
129
-    return Begin[idx];
139
+    assert(begin() + idx < end());
140
+    return begin()[idx];
130 141
   }
131 142
 
132 143
   reference front() {
... ...
@@ -144,10 +171,10 @@ public:
144 144
   }
145 145
 
146 146
   void push_back(const_reference Elt) {
147
-    if (End < Capacity) {
147
+    if (EndX < CapacityX) {
148 148
   Retry:
149
-      new (End) T(Elt);
150
-      ++End;
149
+      new (end()) T(Elt);
150
+      setEnd(end()+1);
151 151
       return;
152 152
     }
153 153
     grow();
... ...
@@ -155,8 +182,8 @@ public:
155 155
   }
156 156
 
157 157
   void pop_back() {
158
-    --End;
159
-    End->~T();
158
+    setEnd(end()-1);
159
+    end()->~T();
160 160
   }
161 161
 
162 162
   T pop_back_val() {
... ...
@@ -166,36 +193,36 @@ public:
166 166
   }
167 167
 
168 168
   void clear() {
169
-    destroy_range(Begin, End);
170
-    End = Begin;
169
+    destroy_range(begin(), end());
170
+    EndX = BeginX;
171 171
   }
172 172
 
173 173
   void resize(unsigned N) {
174 174
     if (N < size()) {
175
-      destroy_range(Begin+N, End);
176
-      End = Begin+N;
175
+      destroy_range(begin()+N, end());
176
+      setEnd(begin()+N);
177 177
     } else if (N > size()) {
178
-      if (unsigned(Capacity-Begin) < N)
178
+      if (capacity() < N)
179 179
         grow(N);
180
-      construct_range(End, Begin+N, T());
181
-      End = Begin+N;
180
+      construct_range(end(), begin()+N, T());
181
+      setEnd(begin()+N);
182 182
     }
183 183
   }
184 184
 
185 185
   void resize(unsigned N, const T &NV) {
186 186
     if (N < size()) {
187
-      destroy_range(Begin+N, End);
188
-      End = Begin+N;
187
+      destroy_range(begin()+N, end());
188
+      setEnd(begin()+N);
189 189
     } else if (N > size()) {
190
-      if (unsigned(Capacity-Begin) < N)
190
+      if (capacity() < N)
191 191
         grow(N);
192
-      construct_range(End, Begin+N, NV);
193
-      End = Begin+N;
192
+      construct_range(end(), begin()+N, NV);
193
+      setEnd(begin()+N);
194 194
     }
195 195
   }
196 196
 
197 197
   void reserve(unsigned N) {
198
-    if (unsigned(Capacity-Begin) < N)
198
+    if (capacity() < N)
199 199
       grow(N);
200 200
   }
201 201
 
... ...
@@ -207,38 +234,38 @@ public:
207 207
   void append(in_iter in_start, in_iter in_end) {
208 208
     size_type NumInputs = std::distance(in_start, in_end);
209 209
     // Grow allocated space if needed.
210
-    if (NumInputs > size_type(Capacity-End))
210
+    if (NumInputs > size_type(capacity_ptr()-end()))
211 211
       grow(size()+NumInputs);
212 212
 
213 213
     // Copy the new elements over.
214
-    std::uninitialized_copy(in_start, in_end, End);
215
-    End += NumInputs;
214
+    std::uninitialized_copy(in_start, in_end, end());
215
+    setEnd(end() + NumInputs);
216 216
   }
217 217
 
218 218
   /// append - Add the specified range to the end of the SmallVector.
219 219
   ///
220 220
   void append(size_type NumInputs, const T &Elt) {
221 221
     // Grow allocated space if needed.
222
-    if (NumInputs > size_type(Capacity-End))
222
+    if (NumInputs > size_type(capacity_ptr()-end()))
223 223
       grow(size()+NumInputs);
224 224
 
225 225
     // Copy the new elements over.
226
-    std::uninitialized_fill_n(End, NumInputs, Elt);
227
-    End += NumInputs;
226
+    std::uninitialized_fill_n(end(), NumInputs, Elt);
227
+    setEnd(end() + NumInputs);
228 228
   }
229 229
 
230 230
   void assign(unsigned NumElts, const T &Elt) {
231 231
     clear();
232
-    if (unsigned(Capacity-Begin) < NumElts)
232
+    if (capacity() < NumElts)
233 233
       grow(NumElts);
234
-    End = Begin+NumElts;
235
-    construct_range(Begin, End, Elt);
234
+    setEnd(begin()+NumElts);
235
+    construct_range(begin(), end(), Elt);
236 236
   }
237 237
 
238 238
   iterator erase(iterator I) {
239 239
     iterator N = I;
240 240
     // Shift all elts down one.
241
-    std::copy(I+1, End, I);
241
+    std::copy(I+1, end(), I);
242 242
     // Drop the last elt.
243 243
     pop_back();
244 244
     return(N);
... ...
@@ -247,36 +274,36 @@ public:
247 247
   iterator erase(iterator S, iterator E) {
248 248
     iterator N = S;
249 249
     // Shift all elts down.
250
-    iterator I = std::copy(E, End, S);
250
+    iterator I = std::copy(E, end(), S);
251 251
     // Drop the last elts.
252
-    destroy_range(I, End);
253
-    End = I;
252
+    destroy_range(I, end());
253
+    setEnd(I);
254 254
     return(N);
255 255
   }
256 256
 
257 257
   iterator insert(iterator I, const T &Elt) {
258
-    if (I == End) {  // Important special case for empty vector.
258
+    if (I == end()) {  // Important special case for empty vector.
259 259
       push_back(Elt);
260 260
       return end()-1;
261 261
     }
262 262
 
263
-    if (End < Capacity) {
263
+    if (EndX < CapacityX) {
264 264
   Retry:
265
-      new (End) T(back());
266
-      ++End;
265
+      new (end()) T(back());
266
+      setEnd(end()+1);
267 267
       // Push everything else over.
268
-      std::copy_backward(I, End-1, End);
268
+      std::copy_backward(I, end()-1, end());
269 269
       *I = Elt;
270 270
       return I;
271 271
     }
272
-    size_t EltNo = I-Begin;
272
+    size_t EltNo = I-begin();
273 273
     grow();
274
-    I = Begin+EltNo;
274
+    I = begin()+EltNo;
275 275
     goto Retry;
276 276
   }
277 277
 
278 278
   iterator insert(iterator I, size_type NumToInsert, const T &Elt) {
279
-    if (I == End) {  // Important special case for empty vector.
279
+    if (I == end()) {  // Important special case for empty vector.
280 280
       append(NumToInsert, Elt);
281 281
       return end()-1;
282 282
     }
... ...
@@ -295,8 +322,8 @@ public:
295 295
     // insertion.  Since we already reserved space, we know that this won't
296 296
     // reallocate the vector.
297 297
     if (size_t(end()-I) >= NumToInsert) {
298
-      T *OldEnd = End;
299
-      append(End-NumToInsert, End);
298
+      T *OldEnd = end();
299
+      append(end()-NumToInsert, end());
300 300
 
301 301
       // Copy the existing elements that get replaced.
302 302
       std::copy_backward(I, OldEnd-NumToInsert, OldEnd);
... ...
@@ -309,10 +336,10 @@ public:
309 309
     // not inserting at the end.
310 310
 
311 311
     // Copy over the elements that we're about to overwrite.
312
-    T *OldEnd = End;
313
-    End += NumToInsert;
312
+    T *OldEnd = end();
313
+    setEnd(end() + NumToInsert);
314 314
     size_t NumOverwritten = OldEnd-I;
315
-    std::uninitialized_copy(I, OldEnd, End-NumOverwritten);
315
+    std::uninitialized_copy(I, OldEnd, end()-NumOverwritten);
316 316
 
317 317
     // Replace the overwritten part.
318 318
     std::fill_n(I, NumOverwritten, Elt);
... ...
@@ -324,7 +351,7 @@ public:
324 324
 
325 325
   template<typename ItTy>
326 326
   iterator insert(iterator I, ItTy From, ItTy To) {
327
-    if (I == End) {  // Important special case for empty vector.
327
+    if (I == end()) {  // Important special case for empty vector.
328 328
       append(From, To);
329 329
       return end()-1;
330 330
     }
... ...
@@ -344,8 +371,8 @@ public:
344 344
     // insertion.  Since we already reserved space, we know that this won't
345 345
     // reallocate the vector.
346 346
     if (size_t(end()-I) >= NumToInsert) {
347
-      T *OldEnd = End;
348
-      append(End-NumToInsert, End);
347
+      T *OldEnd = end();
348
+      append(end()-NumToInsert, end());
349 349
 
350 350
       // Copy the existing elements that get replaced.
351 351
       std::copy_backward(I, OldEnd-NumToInsert, OldEnd);
... ...
@@ -358,10 +385,10 @@ public:
358 358
     // not inserting at the end.
359 359
 
360 360
     // Copy over the elements that we're about to overwrite.
361
-    T *OldEnd = End;
362
-    End += NumToInsert;
361
+    T *OldEnd = end();
362
+    setEnd(end() + NumToInsert);
363 363
     size_t NumOverwritten = OldEnd-I;
364
-    std::uninitialized_copy(I, OldEnd, End-NumOverwritten);
364
+    std::uninitialized_copy(I, OldEnd, end()-NumOverwritten);
365 365
 
366 366
     // Replace the overwritten part.
367 367
     std::copy(From, From+NumOverwritten, I);
... ...
@@ -371,25 +398,11 @@ public:
371 371
     return I;
372 372
   }
373 373
 
374
-  /// data - Return a pointer to the vector's buffer, even if empty().
375
-  pointer data() {
376
-    return pointer(Begin);
377
-  }
378
-
379
-  /// data - Return a pointer to the vector's buffer, even if empty().
380
-  const_pointer data() const {
381
-    return const_pointer(Begin);
382
-  }
383
-
384 374
   const SmallVectorImpl &operator=(const SmallVectorImpl &RHS);
385 375
 
386 376
   bool operator==(const SmallVectorImpl &RHS) const {
387 377
     if (size() != RHS.size()) return false;
388
-    for (T *This = Begin, *That = RHS.Begin, *E = Begin+size();
389
-         This != E; ++This, ++That)
390
-      if (*This != *That)
391
-        return false;
392
-    return true;
378
+    return std::equal(begin(), end(), RHS.begin());
393 379
   }
394 380
   bool operator!=(const SmallVectorImpl &RHS) const { return !(*this == RHS); }
395 381
 
... ...
@@ -398,10 +411,6 @@ public:
398 398
                                         RHS.begin(), RHS.end());
399 399
   }
400 400
 
401
-  /// capacity - Return the total number of elements in the currently allocated
402
-  /// buffer.
403
-  size_t capacity() const { return Capacity - Begin; }
404
-
405 401
   /// set_size - Set the array size to \arg N, which the current array must have
406 402
   /// enough capacity for.
407 403
   ///
... ...
@@ -413,17 +422,10 @@ public:
413 413
   /// which will only be overwritten.
414 414
   void set_size(unsigned N) {
415 415
     assert(N <= capacity());
416
-    End = Begin + N;
416
+    setEnd(begin() + N);
417 417
   }
418 418
 
419 419
 private:
420
-  /// isSmall - Return true if this is a smallvector which has not had dynamic
421
-  /// memory allocated for it.
422
-  bool isSmall() const {
423
-    return static_cast<const void*>(Begin) ==
424
-           static_cast<const void*>(&FirstEl);
425
-  }
426
-
427 420
   /// grow - double the size of the allocated memory, guaranteeing space for at
428 421
   /// least one more element or MinSize if specified.
429 422
   void grow(size_type MinSize = 0);
... ...
@@ -434,6 +436,9 @@ private:
434 434
   }
435 435
 
436 436
   void destroy_range(T *S, T *E) {
437
+    // No need to do a destroy loop for POD's.
438
+    if (isPodLike<T>::value) return;
439
+    
437 440
     while (S != E) {
438 441
       --E;
439 442
       E->~T();
... ...
@@ -444,7 +449,7 @@ private:
444 444
 // Define this out-of-line to dissuade the C++ compiler from inlining it.
445 445
 template <typename T>
446 446
 void SmallVectorImpl<T>::grow(size_t MinSize) {
447
-  size_t CurCapacity = Capacity-Begin;
447
+  size_t CurCapacity = capacity();
448 448
   size_t CurSize = size();
449 449
   size_t NewCapacity = 2*CurCapacity;
450 450
   if (NewCapacity < MinSize)
... ...
@@ -452,22 +457,22 @@ void SmallVectorImpl<T>::grow(size_t MinSize) {
452 452
   T *NewElts = static_cast<T*>(operator new(NewCapacity*sizeof(T)));
453 453
 
454 454
   // Copy the elements over.
455
-  if (is_class<T>::value)
456
-    std::uninitialized_copy(Begin, End, NewElts);
455
+  if (isPodLike<T>::value)
456
+    // Use memcpy for PODs: std::uninitialized_copy optimizes to memmove.
457
+    memcpy(NewElts, begin(), CurSize * sizeof(T));
457 458
   else
458
-    // Use memcpy for PODs (std::uninitialized_copy optimizes to memmove).
459
-    memcpy(NewElts, Begin, CurSize * sizeof(T));
459
+    std::uninitialized_copy(begin(), end(), NewElts);
460 460
 
461 461
   // Destroy the original elements.
462
-  destroy_range(Begin, End);
462
+  destroy_range(begin(), end());
463 463
 
464 464
   // If this wasn't grown from the inline copy, deallocate the old space.
465 465
   if (!isSmall())
466
-    operator delete(Begin);
466
+    operator delete(begin());
467 467
 
468
-  Begin = NewElts;
469
-  End = NewElts+CurSize;
470
-  Capacity = Begin+NewCapacity;
468
+  setEnd(NewElts+CurSize);
469
+  BeginX = NewElts;
470
+  CapacityX = begin()+NewCapacity;
471 471
 }
472 472
 
473 473
 template <typename T>
... ...
@@ -476,35 +481,35 @@ void SmallVectorImpl<T>::swap(SmallVectorImpl<T> &RHS) {
476 476
 
477 477
   // We can only avoid copying elements if neither vector is small.
478 478
   if (!isSmall() && !RHS.isSmall()) {
479
-    std::swap(Begin, RHS.Begin);
480
-    std::swap(End, RHS.End);
481
-    std::swap(Capacity, RHS.Capacity);
479
+    std::swap(BeginX, RHS.BeginX);
480
+    std::swap(EndX, RHS.EndX);
481
+    std::swap(CapacityX, RHS.CapacityX);
482 482
     return;
483 483
   }
484
-  if (RHS.size() > size_type(Capacity-Begin))
484
+  if (RHS.size() > capacity())
485 485
     grow(RHS.size());
486
-  if (size() > size_type(RHS.Capacity-RHS.begin()))
486
+  if (size() > RHS.capacity())
487 487
     RHS.grow(size());
488 488
 
489 489
   // Swap the shared elements.
490 490
   size_t NumShared = size();
491 491
   if (NumShared > RHS.size()) NumShared = RHS.size();
492 492
   for (unsigned i = 0; i != static_cast<unsigned>(NumShared); ++i)
493
-    std::swap(Begin[i], RHS[i]);
493
+    std::swap((*this)[i], RHS[i]);
494 494
 
495 495
   // Copy over the extra elts.
496 496
   if (size() > RHS.size()) {
497 497
     size_t EltDiff = size() - RHS.size();
498
-    std::uninitialized_copy(Begin+NumShared, End, RHS.End);
499
-    RHS.End += EltDiff;
500
-    destroy_range(Begin+NumShared, End);
501
-    End = Begin+NumShared;
498
+    std::uninitialized_copy(begin()+NumShared, end(), RHS.end());
499
+    RHS.setEnd(RHS.end()+EltDiff);
500
+    destroy_range(begin()+NumShared, end());
501
+    setEnd(begin()+NumShared);
502 502
   } else if (RHS.size() > size()) {
503 503
     size_t EltDiff = RHS.size() - size();
504
-    std::uninitialized_copy(RHS.Begin+NumShared, RHS.End, End);
505
-    End += EltDiff;
506
-    destroy_range(RHS.Begin+NumShared, RHS.End);
507
-    RHS.End = RHS.Begin+NumShared;
504
+    std::uninitialized_copy(RHS.begin()+NumShared, RHS.end(), end());
505
+    setEnd(end() + EltDiff);
506
+    destroy_range(RHS.begin()+NumShared, RHS.end());
507
+    RHS.setEnd(RHS.begin()+NumShared);
508 508
   }
509 509
 }
510 510
 
... ...
@@ -516,42 +521,42 @@ SmallVectorImpl<T>::operator=(const SmallVectorImpl<T> &RHS) {

   // If we already have sufficient space, assign the common elements, then
   // destroy any excess.
-  unsigned RHSSize = unsigned(RHS.size());
-  unsigned CurSize = unsigned(size());
+  size_t RHSSize = RHS.size();
+  size_t CurSize = size();
   if (CurSize >= RHSSize) {
     // Assign common elements.
     iterator NewEnd;
     if (RHSSize)
-      NewEnd = std::copy(RHS.Begin, RHS.Begin+RHSSize, Begin);
+      NewEnd = std::copy(RHS.begin(), RHS.begin()+RHSSize, begin());
     else
-      NewEnd = Begin;
+      NewEnd = begin();

     // Destroy excess elements.
-    destroy_range(NewEnd, End);
+    destroy_range(NewEnd, end());

     // Trim.
-    End = NewEnd;
+    setEnd(NewEnd);
     return *this;
   }

   // If we have to grow to have enough elements, destroy the current elements.
   // This allows us to avoid copying them during the grow.
-  if (unsigned(Capacity-Begin) < RHSSize) {
+  if (capacity() < RHSSize) {
     // Destroy current elements.
-    destroy_range(Begin, End);
-    End = Begin;
+    destroy_range(begin(), end());
+    setEnd(begin());
     CurSize = 0;
     grow(RHSSize);
   } else if (CurSize) {
     // Otherwise, use assignment for the already-constructed elements.
-    std::copy(RHS.Begin, RHS.Begin+CurSize, Begin);
+    std::copy(RHS.begin(), RHS.begin()+CurSize, begin());
   }

   // Copy construct the new elements in place.
-  std::uninitialized_copy(RHS.Begin+CurSize, RHS.End, Begin+CurSize);
+  std::uninitialized_copy(RHS.begin()+CurSize, RHS.end(), begin()+CurSize);

   // Set end.
-  End = Begin+RHSSize;
+  setEnd(begin()+RHSSize);
   return *this;
 }
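The switch to begin()/end() also feeds the isPodLike changes elsewhere in this merge: copies can be turned into memcpy, and destructor runs skipped, when the element merely acts like a POD. A hedged sketch of that dispatch (the helper below is illustrative, not the actual SmallVector internals):

    #include <cstring>
    #include <memory>

    // Copy [I, E) into raw memory at Dest; memcpy when the trait allows it.
    template <typename T, bool IsPod> struct UninitCopyImpl {
      static void run(const T *I, const T *E, T *Dest) {
        std::uninitialized_copy(I, E, Dest);          // run copy-ctors
      }
    };
    template <typename T> struct UninitCopyImpl<T, true> {
      static void run(const T *I, const T *E, T *Dest) {
        std::memcpy(Dest, I, (E - I) * sizeof(T));    // bitwise copy suffices
      }
    };

    template <typename T>
    void uninit_copy(const T *I, const T *E, T *Dest) {
      UninitCopyImpl<T, isPodLike<T>::value>::run(I, E, Dest);
    }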
 
... ...
@@ -250,6 +250,12 @@ public:
   }
 };

+
+template<typename KeyT, typename ValueT, typename Config, typename ValueInfoT>
+struct isPodLike<ValueMapCallbackVH<KeyT, ValueT, Config, ValueInfoT> > {
+  static const bool value = true;
+};
+
 template<typename KeyT, typename ValueT, typename Config, typename ValueInfoT>
 struct DenseMapInfo<ValueMapCallbackVH<KeyT, ValueT, Config, ValueInfoT> > {
   typedef ValueMapCallbackVH<KeyT, ValueT, Config, ValueInfoT> VH;
... ...
@@ -267,7 +273,6 @@ struct DenseMapInfo<ValueMapCallbackVH<KeyT, ValueT, Config, ValueInfoT> > {
   static bool isEqual(const VH &LHS, const VH &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return false; }
 };

 
... ...
@@ -643,7 +643,7 @@ struct ilist : public iplist<NodeTy> {

   // Main implementation here - Insert for a node passed by value...
   iterator insert(iterator where, const NodeTy &val) {
-    return insert(where, createNode(val));
+    return insert(where, this->createNode(val));
   }

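The added this-> qualification is a two-phase name lookup fix: createNode comes from a dependent base class, so an unqualified call is not found at template definition time by conforming compilers. An illustrative reduction (names hypothetical):

    template <typename T> struct Base {
      T *createNode(const T &) { return 0; }
    };

    template <typename T> struct Derived : Base<T> {
      T *insert(const T &V) {
        // return createNode(V);     // ill-formed: member of a dependent base
        return this->createNode(V);  // OK: lookup deferred to instantiation
      }
    };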
 
... ...
@@ -259,11 +259,9 @@ class AliasSetTracker {
     ASTCallbackVH(Value *V, AliasSetTracker *AST = 0);
     ASTCallbackVH &operator=(Value *V);
   };
-  /// ASTCallbackVHDenseMapInfo - Traits to tell DenseMap that ASTCallbackVH
-  /// is not a POD (it needs its destructor called).
-  struct ASTCallbackVHDenseMapInfo : public DenseMapInfo<Value *> {
-    static bool isPod() { return false; }
-  };
+  /// ASTCallbackVHDenseMapInfo - Traits that tell DenseMap how to compare
+  /// and hash the value handle.
+  struct ASTCallbackVHDenseMapInfo : public DenseMapInfo<Value *> {};

   AliasAnalysis &AA;
   ilist<AliasSet> AliasSets;
... ...
@@ -175,11 +175,11 @@ class IVUsers : public LoopPass {
   ScalarEvolution *SE;
   SmallPtrSet<Instruction*,16> Processed;

-public:
   /// IVUses - A list of all tracked IV uses of induction variable expressions
   /// we are interested in.
   ilist<IVUsersOfOneStride> IVUses;

+public:
   /// IVUsesByStride - A mapping from the strides in StrideOrder to the
   /// uses in IVUses.
   std::map<const SCEV *, IVUsersOfOneStride*> IVUsesByStride;
... ...
@@ -976,13 +976,6 @@ public:
   void removeBlock(BasicBlock *BB) {
     LI.removeBlock(BB);
   }
-
-  static bool isNotAlreadyContainedIn(const Loop *SubLoop,
-                                      const Loop *ParentLoop) {
-    return
-      LoopInfoBase<BasicBlock, Loop>::isNotAlreadyContainedIn(SubLoop,
-                                                              ParentLoop);
-  }
 };

 
... ...
@@ -25,53 +25,52 @@

 namespace llvm {

+struct BPNode {
+  BPNode* Next;
+  uintptr_t& PtrRef;
+
+  BPNode(BPNode* n, uintptr_t& pref)
+  : Next(n), PtrRef(pref) {
+    PtrRef = 0;
+  }
+};
+
+struct BPEntry {
+  union { BPNode* Head; void* Ptr; };
+  BPEntry() : Head(NULL) {}
+  void SetPtr(BPNode*& FreeList, void* P);
+};
+
+class BPKey {
+  unsigned Raw;
+public:
+  BPKey(SerializedPtrID PtrId) : Raw(PtrId << 1) { assert (PtrId > 0); }
+  BPKey(unsigned code, unsigned) : Raw(code) {}
+
+  void MarkFinal() { Raw |= 0x1; }
+  bool hasFinalPtr() const { return Raw & 0x1 ? true : false; }
+  SerializedPtrID getID() const { return Raw >> 1; }
+
+  static inline BPKey getEmptyKey() { return BPKey(0,0); }
+  static inline BPKey getTombstoneKey() { return BPKey(1,0); }
+  static inline unsigned getHashValue(const BPKey& K) { return K.Raw & ~0x1; }
+
+  static bool isEqual(const BPKey& K1, const BPKey& K2) {
+    return (K1.Raw ^ K2.Raw) & ~0x1 ? false : true;
+  }
+};
+
+template <>
+struct isPodLike<BPKey> { static const bool value = true; };
+template <>
+struct isPodLike<BPEntry> { static const bool value = true; };
+
 class Deserializer {

   //===----------------------------------------------------------===//
   // Internal type definitions.
   //===----------------------------------------------------------===//

-  struct BPNode {
-    BPNode* Next;
-    uintptr_t& PtrRef;
-
-    BPNode(BPNode* n, uintptr_t& pref)
-      : Next(n), PtrRef(pref) {
-        PtrRef = 0;
-      }
-  };
-
-  struct BPEntry {
-    union { BPNode* Head; void* Ptr; };
-
-    BPEntry() : Head(NULL) {}
-
-    static inline bool isPod() { return true; }
-
-    void SetPtr(BPNode*& FreeList, void* P);
-  };
-
-  class BPKey {
-    unsigned Raw;
-
-  public:
-    BPKey(SerializedPtrID PtrId) : Raw(PtrId << 1) { assert (PtrId > 0); }
-    BPKey(unsigned code, unsigned) : Raw(code) {}
-
-    void MarkFinal() { Raw |= 0x1; }
-    bool hasFinalPtr() const { return Raw & 0x1 ? true : false; }
-    SerializedPtrID getID() const { return Raw >> 1; }
-
-    static inline BPKey getEmptyKey() { return BPKey(0,0); }
-    static inline BPKey getTombstoneKey() { return BPKey(1,0); }
-    static inline unsigned getHashValue(const BPKey& K) { return K.Raw & ~0x1; }
-
-    static bool isEqual(const BPKey& K1, const BPKey& K2) {
-      return (K1.Raw ^ K2.Raw) & ~0x1 ? false : true;
-    }
-
-    static bool isPod() { return true; }
-  };

   typedef llvm::DenseMap<BPKey,BPEntry,BPKey,BPEntry> MapTy;

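BPKey packs a pointer ID and a "final" flag into one word: Raw = ID << 1, with bit 0 marking whether the final pointer is known; hashing and equality mask that bit away. A worked example, assuming only the definitions above:

    #include <cassert>

    BPKey K(5);                        // Raw = 5 << 1 = 10 (0b1010)
    assert(K.getID() == 5 && !K.hasFinalPtr());
    K.MarkFinal();                     // Raw = 11 (0b1011), bit 0 set
    assert(K.getID() == 5 && K.hasFinalPtr());
    // Bit 0 is masked out of hashing and equality, so marking a key final
    // never moves it to a different DenseMap bucket:
    assert(BPKey::getHashValue(K) == 10);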
new file mode 100644
... ...
@@ -0,0 +1,39 @@
+//===---------------- lib/CodeGen/CalcSpillWeights.h ------------*- C++ -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+
+#ifndef LLVM_CODEGEN_CALCSPILLWEIGHTS_H
+#define LLVM_CODEGEN_CALCSPILLWEIGHTS_H
+
+#include "llvm/CodeGen/MachineFunctionPass.h"
+
+namespace llvm {
+
+  class LiveInterval;
+
+  /// CalculateSpillWeights - Compute spill weights for all virtual register
+  /// live intervals.
+  class CalculateSpillWeights : public MachineFunctionPass {
+  public:
+    static char ID;
+
+    CalculateSpillWeights() : MachineFunctionPass(&ID) {}
+
+    virtual void getAnalysisUsage(AnalysisUsage &au) const;
+
+    virtual bool runOnMachineFunction(MachineFunction &fn);
+
+  private:
+    /// Returns true if the given live interval is zero length.
+    bool isZeroLengthInterval(LiveInterval *li) const;
+  };
+
+}
+
+#endif // LLVM_CODEGEN_CALCSPILLWEIGHTS_H
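A hedged usage sketch for the new pass (the pipeline plumbing shown is generic PassManager usage, not taken from this patch): it is scheduled like any other MachineFunctionPass, and getAnalysisUsage pulls in the analyses it needs.

    #include "llvm/CodeGen/CalcSpillWeights.h"
    #include "llvm/PassManager.h"

    // Sketch: add the pass to a codegen pipeline; LiveIntervals and
    // MachineLoopInfo are resolved through getAnalysisUsage (see the .cpp).
    void addSpillWeightsPass(llvm::PassManagerBase &PM) {
      PM.add(new llvm::CalculateSpillWeights());
    }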
... ...
@@ -110,8 +110,7 @@ void SelectRoot(SelectionDAG &DAG) {
     DAG.setSubgraphColor(Node, "red");
 #endif
     SDNode *ResNode = Select(SDValue(Node, 0));
-    // If node should not be replaced, 
-    // continue with the next one.
+    // If node should not be replaced, continue with the next one.
     if (ResNode == Node)
       continue;
     // Replace node.
... ...
@@ -327,11 +327,6 @@ public:
   /// 'Old', change the code and CFG so that it branches to 'New' instead.
   void ReplaceUsesOfBlockWith(MachineBasicBlock *Old, MachineBasicBlock *New);

-  /// BranchesToLandingPad - The basic block is a landing pad or branches only
-  /// to a landing pad. No other instructions are present other than the
-  /// unconditional branch.
-  bool BranchesToLandingPad(const MachineBasicBlock *MBB) const;
-
   /// CorrectExtraCFGEdges - Various pieces of code can cause excess edges in
   /// the CFG to be inserted.  If we have proven that MBB can only branch to
   /// DestA and DestB, remove any other MBB successors from the CFG. DestA and
... ...
@@ -110,6 +110,46 @@ class SelectionDAG {
   /// SelectionDAG.
   BumpPtrAllocator Allocator;

+  /// NodeOrdering - Assigns a "line number" value to each SDNode that
+  /// corresponds to the "line number" of the original LLVM instruction. This
+  /// is used for turning off scheduling, because we'll forgo the normal
+  /// scheduling algorithm and output the instructions according to this
+  /// ordering.
+  class NodeOrdering {
+    /// LineNo - The line of the instruction the node corresponds to. A value of
+    /// `0' means it's not assigned.
+    unsigned LineNo;
+    std::map<const SDNode*, unsigned> Order;
+
+    void operator=(const NodeOrdering&); // Do not implement.
+    NodeOrdering(const NodeOrdering&);   // Do not implement.
+  public:
+    NodeOrdering() : LineNo(0) {}
+
+    void add(const SDNode *Node) {
+      assert(LineNo && "Invalid line number!");
+      Order[Node] = LineNo;
+    }
+    void remove(const SDNode *Node) {
+      std::map<const SDNode*, unsigned>::iterator Itr = Order.find(Node);
+      if (Itr != Order.end())
+        Order.erase(Itr);
+    }
+    void clear() {
+      Order.clear();
+      LineNo = 1;
+    }
+    unsigned getLineNo(const SDNode *Node) {
+      unsigned LN = Order[Node];
+      assert(LN && "Node isn't in ordering map!");
+      return LN;
+    }
+    void newInst() {
+      ++LineNo;
+    }
+
+    void dump() const;
+  } *Ordering;
+
   /// VerifyNode - Sanity check the given node.  Aborts if it is invalid.
   void VerifyNode(SDNode *N);
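A hedged sketch of the intended driving pattern for the ordering map (the loop below is illustrative; only clear(), newInst()/NewInst(), add(), and getLineNo() come from this patch):

    // Per function: reset, then bump the "line number" once per LLVM
    // instruction and tag every SDNode created while lowering it.
    //
    //   Ordering->clear();            // empties the map, LineNo = 1
    //   for each Instruction I:
    //     DAG.NewInst();              // ++LineNo via Ordering->newInst()
    //     ... lower I, and for each SDNode N produced:
    //     Ordering->add(N);           // Order[N] = current LineNo
    //
    // Emission can then sort nodes by Ordering->getLineNo(N) instead of
    // running the normal scheduler.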
 
... ...
@@ -120,6 +160,9 @@ class SelectionDAG {
                               DenseSet<SDNode *> &visited,
                               int level, bool &printed);

+  void operator=(const SelectionDAG&); // Do not implement.
+  SelectionDAG(const SelectionDAG&);   // Do not implement.
+
 public:
   SelectionDAG(TargetLowering &tli, FunctionLoweringInfo &fli);
   ~SelectionDAG();
... ...
@@ -199,6 +242,13 @@ public:
     return Root = N;
   }

+  /// NewInst - Tell the ordering object that we're processing a new
+  /// instruction.
+  void NewInst() {
+    if (Ordering)
+      Ordering->newInst();
+  }
+
   /// Combine - This iterates over the nodes in the SelectionDAG, folding
   /// certain types of nodes together, or eliminating superfluous nodes.  The
   /// Level argument controls whether Combine is allowed to produce nodes and
... ...
@@ -891,8 +891,9 @@ template<> struct DenseMapInfo<SDValue> {
   static bool isEqual(const SDValue &LHS, const SDValue &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return true; }
 };
+template <> struct isPodLike<SDValue> { static const bool value = true; };
+

 /// simplify_type specializations - Allow casting operators to work directly on
 /// SDValues as if they were SDNode*'s.
... ...
@@ -343,8 +343,10 @@ namespace llvm {
     static inline bool isEqual(const SlotIndex &LHS, const SlotIndex &RHS) {
       return (LHS == RHS);
     }
-    static inline bool isPod() { return false; }
   };
+
+  template <> struct isPodLike<SlotIndex> { static const bool value = true; };
+

   inline raw_ostream& operator<<(raw_ostream &os, SlotIndex li) {
     li.print(os);
... ...
@@ -47,35 +47,36 @@ namespace llvm {
       f80            =   9,   // This is a 80 bit floating point value
       f128           =  10,   // This is a 128 bit floating point value
       ppcf128        =  11,   // This is a PPC 128-bit floating point value
-      Flag           =  12,   // This is a condition code or machine flag.
-
-      isVoid         =  13,   // This has no value
-
-      v2i8           =  14,   //  2 x i8
-      v4i8           =  15,   //  4 x i8
-      v8i8           =  16,   //  8 x i8
-      v16i8          =  17,   // 16 x i8
-      v32i8          =  18,   // 32 x i8
-      v2i16          =  19,   //  2 x i16
-      v4i16          =  20,   //  4 x i16
-      v8i16          =  21,   //  8 x i16
-      v16i16         =  22,   // 16 x i16
-      v2i32          =  23,   //  2 x i32
-      v4i32          =  24,   //  4 x i32
-      v8i32          =  25,   //  8 x i32
-      v1i64          =  26,   //  1 x i64
-      v2i64          =  27,   //  2 x i64
-      v4i64          =  28,   //  4 x i64
-
-      v2f32          =  29,   //  2 x f32
-      v4f32          =  30,   //  4 x f32
-      v8f32          =  31,   //  8 x f32
-      v2f64          =  32,   //  2 x f64
-      v4f64          =  33,   //  4 x f64
+
+      v2i8           =  12,   //  2 x i8
+      v4i8           =  13,   //  4 x i8
+      v8i8           =  14,   //  8 x i8
+      v16i8          =  15,   // 16 x i8
+      v32i8          =  16,   // 32 x i8
+      v2i16          =  17,   //  2 x i16
+      v4i16          =  18,   //  4 x i16
+      v8i16          =  19,   //  8 x i16
+      v16i16         =  20,   // 16 x i16
+      v2i32          =  21,   //  2 x i32
+      v4i32          =  22,   //  4 x i32
+      v8i32          =  23,   //  8 x i32
+      v1i64          =  24,   //  1 x i64
+      v2i64          =  25,   //  2 x i64
+      v4i64          =  26,   //  4 x i64
+
+      v2f32          =  27,   //  2 x f32
+      v4f32          =  28,   //  4 x f32
+      v8f32          =  29,   //  8 x f32
+      v2f64          =  30,   //  2 x f64
+      v4f64          =  31,   //  4 x f64

       FIRST_VECTOR_VALUETYPE = v2i8,
       LAST_VECTOR_VALUETYPE  = v4f64,

+      Flag           =  32,   // This glues nodes together during pre-RA sched
+
+      isVoid         =  33,   // This has no value
+
       LAST_VALUETYPE =  34,   // This always remains at the end of the list.

       // This is the current maximum for LAST_VALUETYPE.
       // This is the current maximum for LAST_VALUETYPE.
... ...
@@ -31,30 +31,31 @@ def f64    : ValueType<64 ,  8>;   // 64-bit floating point value
 def f80    : ValueType<80 ,  9>;   // 80-bit floating point value
 def f128   : ValueType<128, 10>;   // 128-bit floating point value
 def ppcf128: ValueType<128, 11>;   // PPC 128-bit floating point value
-def FlagVT : ValueType<0  , 12>;   // Condition code or machine flag
-def isVoid : ValueType<0  , 13>;   // Produces no value

-def v2i8   : ValueType<16 , 14>;   //  2 x i8  vector value
-def v4i8   : ValueType<32 , 15>;   //  4 x i8  vector value
-def v8i8   : ValueType<64 , 16>;   //  8 x i8  vector value
-def v16i8  : ValueType<128, 17>;   // 16 x i8  vector value
-def v32i8  : ValueType<256, 18>;   // 32 x i8 vector value
-def v2i16  : ValueType<32 , 19>;   //  2 x i16 vector value
-def v4i16  : ValueType<64 , 20>;   //  4 x i16 vector value
-def v8i16  : ValueType<128, 21>;   //  8 x i16 vector value
-def v16i16 : ValueType<256, 22>;   // 16 x i16 vector value
-def v2i32  : ValueType<64 , 23>;   //  2 x i32 vector value
-def v4i32  : ValueType<128, 24>;   //  4 x i32 vector value
-def v8i32  : ValueType<256, 25>;   //  8 x i32 vector value
-def v1i64  : ValueType<64 , 26>;   //  1 x i64 vector value
-def v2i64  : ValueType<128, 27>;   //  2 x i64 vector value
-def v4i64  : ValueType<256, 28>;   //  4 x f64 vector value
+def v2i8   : ValueType<16 , 12>;   //  2 x i8  vector value
+def v4i8   : ValueType<32 , 13>;   //  4 x i8  vector value
+def v8i8   : ValueType<64 , 14>;   //  8 x i8  vector value
+def v16i8  : ValueType<128, 15>;   // 16 x i8  vector value
+def v32i8  : ValueType<256, 16>;   // 32 x i8 vector value
+def v2i16  : ValueType<32 , 17>;   //  2 x i16 vector value
+def v4i16  : ValueType<64 , 18>;   //  4 x i16 vector value
+def v8i16  : ValueType<128, 19>;   //  8 x i16 vector value
+def v16i16 : ValueType<256, 20>;   // 16 x i16 vector value
+def v2i32  : ValueType<64 , 21>;   //  2 x i32 vector value
+def v4i32  : ValueType<128, 22>;   //  4 x i32 vector value
+def v8i32  : ValueType<256, 23>;   //  8 x i32 vector value
+def v1i64  : ValueType<64 , 24>;   //  1 x i64 vector value
+def v2i64  : ValueType<128, 25>;   //  2 x i64 vector value
+def v4i64  : ValueType<256, 26>;   //  4 x i64 vector value

-def v2f32  : ValueType<64,  29>;   //  2 x f32 vector value
-def v4f32  : ValueType<128, 30>;   //  4 x f32 vector value
-def v8f32  : ValueType<256, 31>;   //  8 x f32 vector value
-def v2f64  : ValueType<128, 32>;   //  2 x f64 vector value
-def v4f64  : ValueType<256, 33>;   //  4 x f64 vector value
+def v2f32  : ValueType<64,  27>;   //  2 x f32 vector value
+def v4f32  : ValueType<128, 28>;   //  4 x f32 vector value
+def v8f32  : ValueType<256, 29>;   //  8 x f32 vector value
+def v2f64  : ValueType<128, 30>;   //  2 x f64 vector value
+def v4f64  : ValueType<256, 31>;   //  4 x f64 vector value
+
+def FlagVT : ValueType<0  , 32>;   // Pre-RA sched glue
+def isVoid : ValueType<0  , 33>;   // Produces no value

 def MetadataVT: ValueType<0, 250>; // Metadata
 
... ...
@@ -42,9 +42,9 @@ def hidden;
 def init;
 def multi_val;
 def one_or_more;
+def optional;
 def really_hidden;
 def required;
-def zero_or_one;
 def comma_separated;

 // The 'case' construct.
... ...
@@ -111,12 +111,10 @@ public:
   virtual void assignPassManager(PMStack &,
                                  PassManagerType = PMT_Unknown) {}
   /// Check if available pass managers are suitable for this pass or not.
-  virtual void preparePassManager(PMStack &) {}
+  virtual void preparePassManager(PMStack &);

   ///  Return what kind of Pass Manager can manage this pass.
-  virtual PassManagerType getPotentialPassManagerType() const {
-    return PMT_Unknown; 
-  }
+  virtual PassManagerType getPotentialPassManagerType() const;

   // Access AnalysisResolver
   inline void setResolver(AnalysisResolver *AR) { 
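This and the following Pass.h hunks follow one pattern: trivial virtual bodies move out of the header so each class has at least one out-of-line virtual method, which gives the compiler a single home for the vtable instead of emitting it in every translation unit. A sketch of the out-of-line side, assuming the definitions land in Pass.cpp with the same defaults they had inline:

    // Pass.cpp (sketch): formerly-inline defaults, now defined once.
    void Pass::preparePassManager(PMStack &) {
      // By default, no pass-manager preparation is needed.
    }
    PassManagerType Pass::getPotentialPassManagerType() const {
      return PMT_Unknown;
    }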
... ...
@@ -132,9 +130,7 @@ public:
   /// particular analysis result to this function, it can then use the
   /// getAnalysis<AnalysisType>() function, below.
   ///
-  virtual void getAnalysisUsage(AnalysisUsage &) const {
-    // By default, no analysis results are used, all are invalidated.
-  }
+  virtual void getAnalysisUsage(AnalysisUsage &) const;

   /// releaseMemory() - This member can be implemented by a pass if it wants to
   /// be able to release its memory when it is no longer needed.  The default
... ...
@@ -147,11 +143,11 @@ public:
   /// Optionally implement this function to release pass memory when it is no
   /// longer used.
   ///
-  virtual void releaseMemory() {}
+  virtual void releaseMemory();

   /// verifyAnalysis() - This member can be implemented by a analysis pass to
   /// check state of analysis information.
-  virtual void verifyAnalysis() const {}
+  virtual void verifyAnalysis() const;

   // dumpPassStructure - Implement the -debug-passes=PassStructure option
   virtual void dumpPassStructure(unsigned Offset = 0);
... ...
@@ -221,9 +217,7 @@ public:
                                  PassManagerType T = PMT_ModulePassManager);

   ///  Return what kind of Pass Manager can manage this pass.
-  virtual PassManagerType getPotentialPassManagerType() const {
-    return PMT_ModulePassManager;
-  }
+  virtual PassManagerType getPotentialPassManagerType() const;

   explicit ModulePass(intptr_t pid) : Pass(pid) {}
   explicit ModulePass(const void *pid) : Pass(pid) {}
... ...
@@ -245,7 +239,7 @@ public:
   /// and if it does, the overloaded version of initializePass may get access to
   /// these passes with getAnalysis<>.
   ///
-  virtual void initializePass() {}
+  virtual void initializePass();

   /// ImmutablePasses are never run.
   ///
... ...
@@ -276,7 +270,7 @@ public:
   /// doInitialization - Virtual method overridden by subclasses to do
   /// any necessary per-module initialization.
   ///
-  virtual bool doInitialization(Module &) { return false; }
+  virtual bool doInitialization(Module &);

   /// runOnFunction - Virtual method overriden by subclasses to do the
   /// per-function processing of the pass.
... ...
@@ -286,7 +280,7 @@ public:
   /// doFinalization - Virtual method overriden by subclasses to do any post
   /// processing needed after all passes have run.
   ///
-  virtual bool doFinalization(Module &) { return false; }
+  virtual bool doFinalization(Module &);

   /// runOnModule - On a module, we run this pass by initializing,
   /// ronOnFunction'ing once for every function in the module, then by
... ...
@@ -303,9 +297,7 @@ public:
                                  PassManagerType T = PMT_FunctionPassManager);

   ///  Return what kind of Pass Manager can manage this pass.
-  virtual PassManagerType getPotentialPassManagerType() const {
-    return PMT_FunctionPassManager;
-  }
+  virtual PassManagerType getPotentialPassManagerType() const;
 };

 
... ...
@@ -328,12 +320,12 @@ public:
   /// doInitialization - Virtual method overridden by subclasses to do
   /// any necessary per-module initialization.
   ///
-  virtual bool doInitialization(Module &) { return false; }
+  virtual bool doInitialization(Module &);

   /// doInitialization - Virtual method overridden by BasicBlockPass subclasses
   /// to do any necessary per-function initialization.
   ///
-  virtual bool doInitialization(Function &) { return false; }
+  virtual bool doInitialization(Function &);

   /// runOnBasicBlock - Virtual method overriden by subclasses to do the
   /// per-basicblock processing of the pass.
... ...
@@ -343,12 +335,12 @@ public:
   /// doFinalization - Virtual method overriden by BasicBlockPass subclasses to
   /// do any post processing needed after all passes have run.
   ///
-  virtual bool doFinalization(Function &) { return false; }
+  virtual bool doFinalization(Function &);

   /// doFinalization - Virtual method overriden by subclasses to do any post
   /// processing needed after all passes have run.
   ///
-  virtual bool doFinalization(Module &) { return false; }
+  virtual bool doFinalization(Module &);

   // To run this pass on a function, we simply call runOnBasicBlock once for
... ...
@@ -360,9 +352,7 @@ public:
                                  PassManagerType T = PMT_BasicBlockPassManager);

   ///  Return what kind of Pass Manager can manage this pass.
-  virtual PassManagerType getPotentialPassManagerType() const {
-    return PMT_BasicBlockPassManager; 
-  }
+  virtual PassManagerType getPotentialPassManagerType() const;
 };

 /// If the user specifies the -time-passes argument on an LLVM tool command line
... ...
@@ -70,6 +70,16 @@
 #define DISABLE_INLINE
 #endif

+// ALWAYS_INLINE - On compilers where we have a directive to do so, mark a
+// method "always inline" because it is performance sensitive.
+#if (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 4))
+#define ALWAYS_INLINE __attribute__((always_inline))
+#else
+// TODO: No idea how to do this with MSVC.
+#define ALWAYS_INLINE
+#endif
+
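A usage sketch for the new macro (the function is hypothetical): it marks small hot helpers that must be inlined even when the compiler would not do so on its own; on compilers other than GCC >= 3.4 it currently expands to nothing.

    // Hypothetical hot-path helper; GCC honors the always_inline attribute.
    static ALWAYS_INLINE unsigned low3Bits(unsigned V) {
      return V & 0x7;
    }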
+
 #ifdef __GNUC__
 #define NORETURN __attribute__((noreturn))
 #elif defined(_MSC_VER)
... ...
@@ -66,7 +66,7 @@ namespace llvm {
   };

   // Specialize DenseMapInfo for DebugLocTuple.
-  template<>  struct DenseMapInfo<DebugLocTuple> {
+  template<> struct DenseMapInfo<DebugLocTuple> {
     static inline DebugLocTuple getEmptyKey() {
       return DebugLocTuple(0, 0, ~0U, ~0U);
     }
... ...
@@ -85,9 +85,9 @@ namespace llvm {
              LHS.Line         == RHS.Line &&
              LHS.Col          == RHS.Col;
     }
-
-    static bool isPod() { return true; }
   };
+  template <> struct isPodLike<DebugLocTuple> {static const bool value = true;};
+

   /// DebugLocTracker - This class tracks debug location information.
   ///
... ...
@@ -254,15 +254,18 @@ struct DenseMapInfo<AssertingVH<T> > {
   static bool isEqual(const AssertingVH<T> &LHS, const AssertingVH<T> &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() {
+};
+
+template <typename T>
+struct isPodLike<AssertingVH<T> > {
 #ifdef NDEBUG
-    return true;
+  static const bool value = true;
 #else
-    return false;
+  static const bool value = false;
 #endif
-  }
 };

+
 /// TrackingVH - This is a value handle that tracks a Value (or Value subclass),
 /// even across RAUW operations.
 ///
... ...
@@ -186,14 +186,12 @@ public:
     // Inline fast path, particulary for constant strings where a sufficiently
     // smart compiler will simplify strlen.

-    this->operator<<(StringRef(Str));
-    return *this;
+    return this->operator<<(StringRef(Str));
   }

   raw_ostream &operator<<(const std::string &Str) {
     // Avoid the fast path, it would only increase code size for a marginal win.
-    write(Str.data(), Str.length());
-    return *this;
+    return write(Str.data(), Str.length());
   }

   raw_ostream &operator<<(unsigned long N);
... ...
@@ -202,13 +200,11 @@ public:
   raw_ostream &operator<<(long long N);
   raw_ostream &operator<<(const void *P);
   raw_ostream &operator<<(unsigned int N) {
-    this->operator<<(static_cast<unsigned long>(N));
-    return *this;
+    return this->operator<<(static_cast<unsigned long>(N));
   }

   raw_ostream &operator<<(int N) {
-    this->operator<<(static_cast<long>(N));
-    return *this;
+    return this->operator<<(static_cast<long>(N));
   }

   raw_ostream &operator<<(double N);
... ...
@@ -17,13 +17,15 @@
 #ifndef LLVM_SUPPORT_TYPE_TRAITS_H
 #define LLVM_SUPPORT_TYPE_TRAITS_H

+#include <utility>
+
 // This is actually the conforming implementation which works with abstract
 // classes.  However, enough compilers have trouble with it that most will use
 // the one in boost/type_traits/object_traits.hpp. This implementation actually
 // works with VC7.0, but other interactions seem to fail when we use it.

 namespace llvm {
-
+
 namespace dont_use
 {
     // These two functions should never be used. They are helpers to
... ...
@@ -48,6 +50,23 @@ struct is_class
  public:
     enum { value = sizeof(char) == sizeof(dont_use::is_class_helper<T>(0)) };
 };
+
+
+/// isPodLike - This is a type trait that is used to determine whether a given
+/// type can be copied around with memcpy instead of running ctors etc.
+template <typename T>
+struct isPodLike {
+  // If we don't know anything else, we can (at least) assume that all non-class
+  // types are PODs.
+  static const bool value = !is_class<T>::value;
+};
+
+// std::pair's are pod-like if their elements are.
+template<typename T, typename U>
+struct isPodLike<std::pair<T, U> > {
+  static const bool value = isPodLike<T>::value && isPodLike<U>::value;
+};
+

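A hedged example of opting a type into the new trait, mirroring the SlotIndex and SDValue specializations elsewhere in this merge (the type below is hypothetical):

    namespace llvm {
      struct PackedLoc { void *Ptr; };  // hypothetical: pointer wrapper, no dtor
      // Verified to act like a POD: containers may memcpy it and skip dtors.
      template <> struct isPodLike<PackedLoc> { static const bool value = true; };
    }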
 /// \brief Metafunction that determines whether the two given types are 
 /// equivalent.
... ...
@@ -286,11 +286,10 @@ public:
   ///    just return false, leaving TBB/FBB null.
   /// 2. If this block ends with only an unconditional branch, it sets TBB to be
   ///    the destination block.
-  /// 3. If this block ends with an conditional branch and it falls through to
-  ///    a successor block, it sets TBB to be the branch destination block and
-  ///    a list of operands that evaluate the condition. These
-  ///    operands can be passed to other TargetInstrInfo methods to create new
-  ///    branches.
+  /// 3. If this block ends with a conditional branch and it falls through to a
+  ///    successor block, it sets TBB to be the branch destination block and a
+  ///    list of operands that evaluate the condition. These operands can be
+  ///    passed to other TargetInstrInfo methods to create new branches.
   /// 4. If this block ends with a conditional branch followed by an
   ///    unconditional branch, it returns the 'true' destination in TBB, the
   ///    'false' destination in FBB, and a list of operands that evaluate the
... ...
@@ -972,7 +972,7 @@ protected:
   /// not work with the with specified type and indicate what to do about it.
   void setLoadExtAction(unsigned ExtType, MVT VT,
                       LegalizeAction Action) {
-    assert((unsigned)VT.SimpleTy < MVT::LAST_VALUETYPE &&
+    assert((unsigned)VT.SimpleTy*2 < 63 &&
           ExtType < array_lengthof(LoadExtActions) &&
           "Table isn't big enough!");
     LoadExtActions[ExtType] &= ~(uint64_t(3UL) << VT.SimpleTy*2);
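The relaxed asserts spell out the real constraint of these tables: each is a single uint64_t holding a 2-bit LegalizeAction per value type, so a type's slot must fit inside the word. With the renumbering earlier in this merge, all load/store-relevant types sit below 32, while Flag (32) and isVoid (33) need no action slots:

    // Two bits per MVT in one 64-bit word (shape of the existing tables):
    //   slot for VT = bits [VT*2, VT*2+1]
    //   v4f64 = 31 -> bits 62..63, the last slot that fits (31*2 = 62 < 63)
    //   Flag  = 32 -> bits 64..65, out of range, rejected by the assert
    uint64_t Actions = 0;
    unsigned VT = 31;
    Actions &= ~(uint64_t(3UL) << VT*2);    // clear the 2-bit slot
    Actions |= uint64_t(1) << VT*2;         // store an action value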
... ...
@@ -984,7 +984,7 @@ protected:
   void setTruncStoreAction(MVT ValVT, MVT MemVT,
                            LegalizeAction Action) {
     assert((unsigned)ValVT.SimpleTy < array_lengthof(TruncStoreActions) &&
-           (unsigned)MemVT.SimpleTy < MVT::LAST_VALUETYPE &&
+           (unsigned)MemVT.SimpleTy*2 < 63 &&
           "Table isn't big enough!");
     TruncStoreActions[ValVT.SimpleTy] &= ~(uint64_t(3UL)  << MemVT.SimpleTy*2);
     TruncStoreActions[ValVT.SimpleTy] |= (uint64_t)Action << MemVT.SimpleTy*2;
... ...
@@ -121,8 +121,6 @@ namespace {

       return *LHS == *RHS;
     }
-
-    static bool isPod() { return true; }
   };

   class Andersens : public ModulePass, public AliasAnalysis,
... ...
@@ -53,7 +53,7 @@ static bool containsAddRecFromDifferentLoop(const SCEV *S, Loop *L) {
       if (newLoop == L)
         return false;
       // if newLoop is an outer loop of L, this is OK.
-      if (!LoopInfo::isNotAlreadyContainedIn(L, newLoop))
+      if (newLoop->contains(L->getHeader()))
         return false;
     }
     return true;
... ...
@@ -307,6 +307,7 @@ bool IVUsers::runOnLoop(Loop *l, LPPassManager &LPM) {
   for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ++I)
     AddUsersIfInteresting(I);

+  Processed.clear();
   return false;
 }
 
... ...
@@ -369,7 +370,7 @@ void IVUsers::dump() const {
 void IVUsers::releaseMemory() {
   IVUsesByStride.clear();
   StrideOrder.clear();
-  Processed.clear();
+  IVUses.clear();
 }

 void IVStrideUse::deleted() {
... ...
@@ -44,19 +44,19 @@ ProfileInfoT<Function, BasicBlock>::~ProfileInfoT() {
 }

 template<>
-char ProfileInfo::ID = 0;
+char ProfileInfoT<Function,BasicBlock>::ID = 0;

 template<>
-char MachineProfileInfo::ID = 0;
+char ProfileInfoT<MachineFunction, MachineBasicBlock>::ID = 0;

 template<>
-const double ProfileInfo::MissingValue = -1;
+const double ProfileInfoT<Function,BasicBlock>::MissingValue = -1;

-template<>
-const double MachineProfileInfo::MissingValue = -1;
+template<> const
+double ProfileInfoT<MachineFunction, MachineBasicBlock>::MissingValue = -1;

-template<>
-double ProfileInfo::getExecutionCount(const BasicBlock *BB) {
+template<> double
+ProfileInfoT<Function,BasicBlock>::getExecutionCount(const BasicBlock *BB) {
   std::map<const Function*, BlockCounts>::iterator J =
     BlockInformation.find(BB->getParent());
   if (J != BlockInformation.end()) {
... ...
@@ -118,7 +118,8 @@ double ProfileInfo::getExecutionCount(const BasicBlock *BB) {
 }

 template<>
-double MachineProfileInfo::getExecutionCount(const MachineBasicBlock *MBB) {
+double ProfileInfoT<MachineFunction, MachineBasicBlock>::
+        getExecutionCount(const MachineBasicBlock *MBB) {
   std::map<const MachineFunction*, BlockCounts>::iterator J =
     BlockInformation.find(MBB->getParent());
   if (J != BlockInformation.end()) {
... ...
@@ -131,7 +132,7 @@ double MachineProfileInfo::getExecutionCount(const MachineBasicBlock *MBB) {
 }

 template<>
-double ProfileInfo::getExecutionCount(const Function *F) {
+double ProfileInfoT<Function,BasicBlock>::getExecutionCount(const Function *F) {
   std::map<const Function*, double>::iterator J =
     FunctionInformation.find(F);
   if (J != FunctionInformation.end())
... ...
@@ -147,7 +148,8 @@ double ProfileInfo::getExecutionCount(const Function *F) {
 }

 template<>
-double MachineProfileInfo::getExecutionCount(const MachineFunction *MF) {
+double ProfileInfoT<MachineFunction, MachineBasicBlock>::
+        getExecutionCount(const MachineFunction *MF) {
   std::map<const MachineFunction*, double>::iterator J =
     FunctionInformation.find(MF);
   if (J != FunctionInformation.end())
... ...
@@ -159,21 +161,23 @@ double MachineProfileInfo::getExecutionCount(const MachineFunction *MF) {
 }

 template<>
-void ProfileInfo::setExecutionCount(const BasicBlock *BB, double w) {
+void ProfileInfoT<Function,BasicBlock>::
+        setExecutionCount(const BasicBlock *BB, double w) {
   DEBUG(errs() << "Creating Block " << BB->getName()
                << " (weight: " << format("%.20g",w) << ")\n");
   BlockInformation[BB->getParent()][BB] = w;
 }

 template<>
-void MachineProfileInfo::setExecutionCount(const MachineBasicBlock *MBB, double w) {
+void ProfileInfoT<MachineFunction, MachineBasicBlock>::
+        setExecutionCount(const MachineBasicBlock *MBB, double w) {
   DEBUG(errs() << "Creating Block " << MBB->getBasicBlock()->getName()
                << " (weight: " << format("%.20g",w) << ")\n");
   BlockInformation[MBB->getParent()][MBB] = w;
 }

 template<>
-void ProfileInfo::addEdgeWeight(Edge e, double w) {
+void ProfileInfoT<Function,BasicBlock>::addEdgeWeight(Edge e, double w) {
   double oldw = getEdgeWeight(e);
   assert (oldw != MissingValue && "Adding weight to Edge with no previous weight");
   DEBUG(errs() << "Adding to Edge " << e
... ...
@@ -182,7 +186,8 @@ void ProfileInfo::addEdgeWeight(Edge e, double w) {
 }

 template<>
-void ProfileInfo::addExecutionCount(const BasicBlock *BB, double w) {
+void ProfileInfoT<Function,BasicBlock>::
+        addExecutionCount(const BasicBlock *BB, double w) {
   double oldw = getExecutionCount(BB);
   assert (oldw != MissingValue && "Adding weight to Block with no previous weight");
   DEBUG(errs() << "Adding to Block " << BB->getName()
... ...
@@ -191,7 +196,7 @@ void ProfileInfo::addExecutionCount(const BasicBlock *BB, double w) {
 }

 template<>
-void ProfileInfo::removeBlock(const BasicBlock *BB) {
+void ProfileInfoT<Function,BasicBlock>::removeBlock(const BasicBlock *BB) {
   std::map<const Function*, BlockCounts>::iterator J =
     BlockInformation.find(BB->getParent());
   if (J == BlockInformation.end()) return;
... ...
@@ -201,7 +206,7 @@ void ProfileInfo::removeBlock(const BasicBlock *BB) {
 }

 template<>
-void ProfileInfo::removeEdge(Edge e) {
+void ProfileInfoT<Function,BasicBlock>::removeEdge(Edge e) {
   std::map<const Function*, EdgeWeights>::iterator J =
     EdgeInformation.find(getFunction(e));
   if (J == EdgeInformation.end()) return;
... ...
@@ -211,7 +216,8 @@ void ProfileInfo::removeEdge(Edge e) {
 }

 template<>
-void ProfileInfo::replaceEdge(const Edge &oldedge, const Edge &newedge) {
+void ProfileInfoT<Function,BasicBlock>::
+        replaceEdge(const Edge &oldedge, const Edge &newedge) {
   double w;
   if ((w = getEdgeWeight(newedge)) == MissingValue) {
     w = getEdgeWeight(oldedge);
... ...
@@ -225,8 +231,9 @@ void ProfileInfo::replaceEdge(const Edge &oldedge, const Edge &newedge) {
 }

 template<>
-const BasicBlock *ProfileInfo::GetPath(const BasicBlock *Src, const BasicBlock *Dest,
-                                       Path &P, unsigned Mode) {
+const BasicBlock *ProfileInfoT<Function,BasicBlock>::
+        GetPath(const BasicBlock *Src, const BasicBlock *Dest,
+                Path &P, unsigned Mode) {
   const BasicBlock *BB = 0;
   bool hasFoundPath = false;
 
... ...
@@ -268,7 +275,8 @@ const BasicBlock *ProfileInfo::GetPath(const BasicBlock *Src, const BasicBlock *
 }

 template<>
-void ProfileInfo::divertFlow(const Edge &oldedge, const Edge &newedge) {
+void ProfileInfoT<Function,BasicBlock>::
+        divertFlow(const Edge &oldedge, const Edge &newedge) {
   DEBUG(errs() << "Diverting " << oldedge << " via " << newedge );

   // First check if the old edge was taken, if not, just delete it...
... ...
@@ -302,8 +310,8 @@ void ProfileInfo::divertFlow(const Edge &oldedge, const Edge &newedge) {
 /// This checks all edges of the function the blocks reside in and replaces the
 /// occurences of RmBB with DestBB.
 template<>
-void ProfileInfo::replaceAllUses(const BasicBlock *RmBB, 
-                                 const BasicBlock *DestBB) {
+void ProfileInfoT<Function,BasicBlock>::
+        replaceAllUses(const BasicBlock *RmBB, const BasicBlock *DestBB) {
   DEBUG(errs() << "Replacing " << RmBB->getName()
                << " with " << DestBB->getName() << "\n");
   const Function *F = DestBB->getParent();
... ...
@@ -352,10 +360,10 @@ void ProfileInfo::replaceAllUses(const BasicBlock *RmBB,
 /// Since its possible that there is more than one edge in the CFG from FristBB
 /// to SecondBB its necessary to redirect the flow proporionally.
 template<>
-void ProfileInfo::splitEdge(const BasicBlock *FirstBB,
-                            const BasicBlock *SecondBB,
-                            const BasicBlock *NewBB,
-                            bool MergeIdenticalEdges) {
+void ProfileInfoT<Function,BasicBlock>::splitEdge(const BasicBlock *FirstBB,
+                                                  const BasicBlock *SecondBB,
+                                                  const BasicBlock *NewBB,
+                                                  bool MergeIdenticalEdges) {
   const Function *F = FirstBB->getParent();
   std::map<const Function*, EdgeWeights>::iterator J =
     EdgeInformation.find(F);
... ...
@@ -398,7 +406,8 @@ void ProfileInfo::splitEdge(const BasicBlock *FirstBB,
 }

 template<>
-void ProfileInfo::splitBlock(const BasicBlock *Old, const BasicBlock* New) {
+void ProfileInfoT<Function,BasicBlock>::splitBlock(const BasicBlock *Old,
+                                                   const BasicBlock* New) {
   const Function *F = Old->getParent();
   std::map<const Function*, EdgeWeights>::iterator J =
     EdgeInformation.find(F);
... ...
@@ -426,8 +435,10 @@ void ProfileInfo::splitBlock(const BasicBlock *Old, const BasicBlock* New) {
 }

 template<>
-void ProfileInfo::splitBlock(const BasicBlock *BB, const BasicBlock* NewBB,
-                            BasicBlock *const *Preds, unsigned NumPreds) {
+void ProfileInfoT<Function,BasicBlock>::splitBlock(const BasicBlock *BB,
+                                                   const BasicBlock* NewBB,
+                                                   BasicBlock *const *Preds,
+                                                   unsigned NumPreds) {
   const Function *F = BB->getParent();
   std::map<const Function*, EdgeWeights>::iterator J =
     EdgeInformation.find(F);
... ...
@@ -461,7 +472,8 @@ void ProfileInfo::splitBlock(const BasicBlock *BB, const BasicBlock* NewBB,
 }

 template<>
-void ProfileInfo::transfer(const Function *Old, const Function *New) {
+void ProfileInfoT<Function,BasicBlock>::transfer(const Function *Old,
+                                                 const Function *New) {
   DEBUG(errs() << "Replacing Function " << Old->getName() << " with "
                << New->getName() << "\n");
   std::map<const Function*, EdgeWeights>::iterator J =
... ...
@@ -474,8 +486,8 @@ void ProfileInfo::transfer(const Function *Old, const Function *New) {
   FunctionInformation.erase(Old);
 }

-static double readEdgeOrRemember(ProfileInfo::Edge edge, double w, ProfileInfo::Edge &tocalc,
-                                 unsigned &uncalc) {
+static double readEdgeOrRemember(ProfileInfo::Edge edge, double w,
+                                 ProfileInfo::Edge &tocalc, unsigned &uncalc) {
   if (w == ProfileInfo::MissingValue) {
     tocalc = edge;
     uncalc++;
... ...
@@ -486,7 +498,9 @@ static double readEdgeOrRemember(ProfileInfo::Edge edge, double w, ProfileInfo::
 }

 template<>
-bool ProfileInfo::CalculateMissingEdge(const BasicBlock *BB, Edge &removed, bool assumeEmptySelf) {
+bool ProfileInfoT<Function,BasicBlock>::
+        CalculateMissingEdge(const BasicBlock *BB, Edge &removed,
+                             bool assumeEmptySelf) {
   Edge edgetocalc;
   unsigned uncalculated = 0;
 
... ...
@@ -562,7 +576,7 @@ static void readEdge(ProfileInfo *PI, ProfileInfo::Edge e, double &calcw, std::s
 }

 template<>
-bool ProfileInfo::EstimateMissingEdges(const BasicBlock *BB) {
+bool ProfileInfoT<Function,BasicBlock>::EstimateMissingEdges(const BasicBlock *BB) {
   bool hasNoSuccessors = false;

   double inWeight = 0;
... ...
@@ -619,7 +633,7 @@ bool ProfileInfo::EstimateMissingEdges(const BasicBlock *BB) {
 }

 template<>
-void ProfileInfo::repair(const Function *F) {
+void ProfileInfoT<Function,BasicBlock>::repair(const Function *F) {
 //  if (getExecutionCount(&(F->getEntryBlock())) == 0) {
 //    for (Function::const_iterator FI = F->begin(), FE = F->end();
 //         FI != FE; ++FI) {
... ...
@@ -413,7 +413,7 @@ uintptr_t Deserializer::ReadInternalRefPtr() {
   return GetFinalPtr(E);
 }

-void Deserializer::BPEntry::SetPtr(BPNode*& FreeList, void* P) {
+void BPEntry::SetPtr(BPNode*& FreeList, void* P) {
   BPNode* Last = NULL;

   for (BPNode* N = Head; N != NULL; N=N->Next) {
... ...
@@ -906,7 +906,7 @@ void DwarfDebug::constructTypeDIE(DIE &Buffer, DICompositeType CTy) {
         continue;
       DIE *ElemDie = NULL;
       if (Element.getTag() == dwarf::DW_TAG_subprogram)
-        ElemDie = createMemberSubprogramDIE(DISubprogram(Element.getNode()));
+        ElemDie = createSubprogramDIE(DISubprogram(Element.getNode()));
       else
         ElemDie = createMemberDIE(DIDerivedType(Element.getNode()));
       Buffer.addChild(ElemDie);
... ...
@@ -1098,11 +1098,13 @@ DIE *DwarfDebug::createMemberDIE(const DIDerivedType &DT) {
   return MemberDie;
 }

-/// createRawSubprogramDIE - Create new partially incomplete DIE. This is
-/// a helper routine used by createMemberSubprogramDIE and 
-/// createSubprogramDIE.
-DIE *DwarfDebug::createRawSubprogramDIE(const DISubprogram &SP) {
-  DIE *SPDie = new DIE(dwarf::DW_TAG_subprogram);
+/// createSubprogramDIE - Create new DIE using SP.
+DIE *DwarfDebug::createSubprogramDIE(const DISubprogram &SP, bool MakeDecl) {
+  DIE *SPDie = ModuleCU->getDIE(SP.getNode());
+  if (SPDie)
+    return SPDie;
+
+  SPDie = new DIE(dwarf::DW_TAG_subprogram);
   addString(SPDie, dwarf::DW_AT_name, dwarf::DW_FORM_string, SP.getName());

   StringRef LinkageName = SP.getLinkageName();
... ...
@@ -1144,52 +1146,7 @@ DIE *DwarfDebug::createRawSubprogramDIE(const DISubprogram &SP) {
     ContainingTypeMap.insert(std::make_pair(SPDie, WeakVH(SP.getContainingType().getNode())));
   }

-  return SPDie;
-}
-
-/// createMemberSubprogramDIE - Create new member DIE using SP. This routine
-/// always returns a die with DW_AT_declaration attribute.
-DIE *DwarfDebug::createMemberSubprogramDIE(const DISubprogram &SP) {
-  DIE *SPDie = ModuleCU->getDIE(SP.getNode());
-  if (!SPDie)
-    SPDie = createSubprogramDIE(SP);
-
-  // If SPDie has DW_AT_declaration then reuse it.
-  if (!SP.isDefinition())
-    return SPDie;
-
-  // Otherwise create new DIE for the declaration. First push definition
-  // DIE at the top level.
-  if (TopLevelDIEs.insert(SPDie))
-    TopLevelDIEsVector.push_back(SPDie);
-
-  SPDie = createRawSubprogramDIE(SP);
-
-  // Add arguments. 
-  DICompositeType SPTy = SP.getType();
-  DIArray Args = SPTy.getTypeArray();
-  unsigned SPTag = SPTy.getTag();
-  if (SPTag == dwarf::DW_TAG_subroutine_type)
-    for (unsigned i = 1, N =  Args.getNumElements(); i < N; ++i) {
-      DIE *Arg = new DIE(dwarf::DW_TAG_formal_parameter);
-      addType(Arg, DIType(Args.getElement(i).getNode()));
-      addUInt(Arg, dwarf::DW_AT_artificial, dwarf::DW_FORM_flag, 1); // ??
-      SPDie->addChild(Arg);
-    }
-
-  addUInt(SPDie, dwarf::DW_AT_declaration, dwarf::DW_FORM_flag, 1);
-  return SPDie;
-}
-
-/// createSubprogramDIE - Create new DIE using SP.
-DIE *DwarfDebug::createSubprogramDIE(const DISubprogram &SP) {
-  DIE *SPDie = ModuleCU->getDIE(SP.getNode());
-  if (SPDie)
-    return SPDie;
-
-  SPDie = createRawSubprogramDIE(SP);
-
-  if (!SP.isDefinition()) {
+  if (MakeDecl || !SP.isDefinition()) {
     addUInt(SPDie, dwarf::DW_AT_declaration, dwarf::DW_FORM_flag, 1);

     // Add arguments. Do not add arguments for subprogram definition. They will
... ...
@@ -1310,6 +1267,28 @@ DIE *DwarfDebug::updateSubprogramScopeDIE(MDNode *SPNode) {

 DIE *SPDie = ModuleCU->getDIE(SPNode);
 assert (SPDie && "Unable to find subprogram DIE!");
+ DISubprogram SP(SPNode);
+ if (SP.isDefinition() && !SP.getContext().isCompileUnit()) {
+   addUInt(SPDie, dwarf::DW_AT_declaration, dwarf::DW_FORM_flag, 1);
+  // Add arguments.
+   DICompositeType SPTy = SP.getType();
+   DIArray Args = SPTy.getTypeArray();
+   unsigned SPTag = SPTy.getTag();
+   if (SPTag == dwarf::DW_TAG_subroutine_type)
+     for (unsigned i = 1, N =  Args.getNumElements(); i < N; ++i) {
+       DIE *Arg = new DIE(dwarf::DW_TAG_formal_parameter);
+       addType(Arg, DIType(Args.getElement(i).getNode()));
+       addUInt(Arg, dwarf::DW_AT_artificial, dwarf::DW_FORM_flag, 1); // ??
+       SPDie->addChild(Arg);
+     }
+   DIE *SPDeclDie = SPDie;
+   SPDie = new DIE(dwarf::DW_TAG_subprogram);
+   addDIEEntry(SPDie, dwarf::DW_AT_specification, dwarf::DW_FORM_ref4,
+               SPDeclDie);
+
+   ModuleCU->addDie(SPDie);
+ }
+
 addLabel(SPDie, dwarf::DW_AT_low_pc, dwarf::DW_FORM_addr,
          DWLabel("func_begin", SubprogramCount));
  addLabel(SPDie, dwarf::DW_AT_high_pc, dwarf::DW_FORM_addr,
... ...
@@ -350,17 +350,7 @@ class DwarfDebug : public Dwarf {
   DIE *createMemberDIE(const DIDerivedType &DT);

   /// createSubprogramDIE - Create new DIE using SP.
-  DIE *createSubprogramDIE(const DISubprogram &SP);
-
-  /// createMemberSubprogramDIE - Create new member DIE using SP. This
-  /// routine always returns a die with DW_AT_declaration attribute.
-
-  DIE *createMemberSubprogramDIE(const DISubprogram &SP);
-
-  /// createRawSubprogramDIE - Create new partially incomplete DIE. This is
-  /// a helper routine used by createMemberSubprogramDIE and 
-  /// createSubprogramDIE.
-  DIE *createRawSubprogramDIE(const DISubprogram &SP);
+  DIE *createSubprogramDIE(const DISubprogram &SP, bool MakeDecl = false);

   /// findCompileUnit - Get the compile unit for the given descriptor.
   ///
... ...
@@ -292,14 +292,13 @@ void DwarfException::EmitFDE(const FunctionEHFrameInfo &EHFrameInfo) {
       Asm->EmitULEB128Bytes(is4Byte ? 4 : 8);
       Asm->EOL("Augmentation size");

+      // We force 32-bits here because we've encoded our LSDA in the CIE with
+      // `dwarf::DW_EH_PE_sdata4'. And the CIE and FDE should agree.
       if (EHFrameInfo.hasLandingPads)
-        EmitReference("exception", EHFrameInfo.Number, true, false);
-      else {
-        if (is4Byte)
-          Asm->EmitInt32((int)0);
-        else
-          Asm->EmitInt64((int)0);
-      }
+        EmitReference("exception", EHFrameInfo.Number, true, true);
+      else
+        Asm->EmitInt32((int)0);
+
       Asm->EOL("Language Specific Data Area");
     } else {
       Asm->EmitULEB128Bytes(0);
... ...
@@ -119,7 +119,6 @@ class DwarfException : public Dwarf {
     static inline unsigned getTombstoneKey() { return -2U; }
     static unsigned getHashValue(const unsigned &Key) { return Key; }
     static bool isEqual(unsigned LHS, unsigned RHS) { return LHS == RHS; }
-    static bool isPod() { return true; }
   };

   /// PadRange - Structure holding a try-range and the associated landing pad.
... ...
@@ -1205,11 +1205,11 @@ ReoptimizeBlock:
     }
   }

-  // If the prior block doesn't fall through into this block and if this block
-  // doesn't fall through into some other block and it's not branching only to a
-  // landing pad, then see if we can find a place to move this block where a
-  // fall-through will happen.
-  if (!PrevBB.canFallThrough() && !MBB->BranchesToLandingPad(MBB)) {
+  // If the prior block doesn't fall through into this block, and if this
+  // block doesn't fall through into some other block, see if we can find a
+  // place to move this block where a fall-through will happen.
+  if (!PrevBB.canFallThrough()) {
+
     // Now we know that there was no fall-through into this block, check to
     // see if it has a fall-through into its successor.
     bool CurFallsThru = MBB->canFallThrough();
... ...
@@ -1221,32 +1221,28 @@ ReoptimizeBlock:
            E = MBB->pred_end(); PI != E; ++PI) {
         // Analyze the branch at the end of the pred.
         MachineBasicBlock *PredBB = *PI;
-        MachineFunction::iterator PredNextBB = PredBB; ++PredNextBB;
+        MachineFunction::iterator PredFallthrough = PredBB; ++PredFallthrough;
         MachineBasicBlock *PredTBB, *PredFBB;
         SmallVector<MachineOperand, 4> PredCond;
-        if (PredBB != MBB && !PredBB->canFallThrough()
-            && !TII->AnalyzeBranch(*PredBB, PredTBB, PredFBB, PredCond, true)
+        if (PredBB != MBB && !PredBB->canFallThrough() &&
+            !TII->AnalyzeBranch(*PredBB, PredTBB, PredFBB, PredCond, true)
            && (!CurFallsThru || !CurTBB || !CurFBB)
            && (!CurFallsThru || MBB->getNumber() >= PredBB->getNumber())) {
-          // If the current block doesn't fall through, just move it.  If the
-          // current block can fall through and does not end with a conditional
-          // branch, we need to append an unconditional jump to the (current)
-          // next block.  To avoid a possible compile-time infinite loop, move
-          // blocks only backward in this case.
-          // 
-          // Also, if there are already 2 branches here, we cannot add a third.
-          // I.e. we have the case:
-          // 
-          //     Bcc next
-          //     B elsewhere
-          //   next:
+          // If the current block doesn't fall through, just move it.
+          // If the current block can fall through and does not end with a
+          // conditional branch, we need to append an unconditional jump to
+          // the (current) next block.  To avoid a possible compile-time
+          // infinite loop, move blocks only backward in this case.
+          // Also, if there are already 2 branches here, we cannot add a third;
+          // this means we have the case
+          // Bcc next
+          // B elsewhere
+          // next:
           if (CurFallsThru) {
-            MachineBasicBlock *NextBB =
-              llvm::next(MachineFunction::iterator(MBB));
+            MachineBasicBlock *NextBB = llvm::next(MachineFunction::iterator(MBB));
             CurCond.clear();
             TII->InsertBranch(*MBB, NextBB, 0, CurCond);
           }
-
           MBB->moveAfter(PredBB);
           MadeChange = true;
           goto ReoptimizeBlock;
... ...
@@ -1,6 +1,7 @@
 add_llvm_library(LLVMCodeGen
   AggressiveAntiDepBreaker.cpp
   BranchFolding.cpp
+  CalcSpillWeights.cpp
   CodePlacementOpt.cpp
   CriticalAntiDepBreaker.cpp
   DeadMachineInstructionElim.cpp
new file mode 100644
... ...
@@ -0,0 +1,154 @@
+//===------------------------ CalcSpillWeights.cpp ------------------------===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+
+#define DEBUG_TYPE "calcspillweights"
+
+#include "llvm/Function.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/CodeGen/CalcSpillWeights.h"
+#include "llvm/CodeGen/LiveIntervalAnalysis.h"
+#include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/CodeGen/MachineLoopInfo.h"
+#include "llvm/CodeGen/MachineRegisterInfo.h"
+#include "llvm/CodeGen/SlotIndexes.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Target/TargetInstrInfo.h"
+#include "llvm/Target/TargetRegisterInfo.h"
+
+using namespace llvm;
+
+char CalculateSpillWeights::ID = 0;
+static RegisterPass<CalculateSpillWeights> X("calcspillweights",
+                                             "Calculate spill weights");
+
+void CalculateSpillWeights::getAnalysisUsage(AnalysisUsage &au) const {
+  au.addRequired<LiveIntervals>();
+  au.addRequired<MachineLoopInfo>();
+  au.setPreservesAll();
+  MachineFunctionPass::getAnalysisUsage(au);
+}
+
+bool CalculateSpillWeights::runOnMachineFunction(MachineFunction &fn) {
+
+  DEBUG(errs() << "********** Compute Spill Weights **********\n"
+               << "********** Function: "
+               << fn.getFunction()->getName() << '\n');
+
+  LiveIntervals *lis = &getAnalysis<LiveIntervals>();
+  MachineLoopInfo *loopInfo = &getAnalysis<MachineLoopInfo>();
+  const TargetInstrInfo *tii = fn.getTarget().getInstrInfo();
+  MachineRegisterInfo *mri = &fn.getRegInfo();
+
+  SmallSet<unsigned, 4> processed;
+  for (MachineFunction::iterator mbbi = fn.begin(), mbbe = fn.end();
+       mbbi != mbbe; ++mbbi) {
+    MachineBasicBlock* mbb = mbbi;
+    SlotIndex mbbEnd = lis->getMBBEndIdx(mbb);
+    MachineLoop* loop = loopInfo->getLoopFor(mbb);
+    unsigned loopDepth = loop ? loop->getLoopDepth() : 0;
+    bool isExiting = loop ? loop->isLoopExiting(mbb) : false;
+
+    for (MachineBasicBlock::const_iterator mii = mbb->begin(), mie = mbb->end();
+         mii != mie; ++mii) {
+      const MachineInstr *mi = mii;
+      if (tii->isIdentityCopy(*mi))
+        continue;
+
+      if (mi->getOpcode() == TargetInstrInfo::IMPLICIT_DEF)
+        continue;
+
+      for (unsigned i = 0, e = mi->getNumOperands(); i != e; ++i) {
+        const MachineOperand &mopi = mi->getOperand(i);
+        if (!mopi.isReg() || mopi.getReg() == 0)
+          continue;
+        unsigned reg = mopi.getReg();
+        if (!TargetRegisterInfo::isVirtualRegister(mopi.getReg()))
+          continue;
+        // Multiple uses of reg by the same instruction. It should not
+        // contribute to spill weight again.
+        if (!processed.insert(reg))
+          continue;
+
+        bool hasDef = mopi.isDef();
+        bool hasUse = !hasDef;
+        for (unsigned j = i+1; j != e; ++j) {
+          const MachineOperand &mopj = mi->getOperand(j);
+          if (!mopj.isReg() || mopj.getReg() != reg)
+            continue;
+          hasDef |= mopj.isDef();
+          hasUse |= mopj.isUse();
+          if (hasDef && hasUse)
+            break;
+        }
+
+        LiveInterval &regInt = lis->getInterval(reg);
+        float weight = lis->getSpillWeight(hasDef, hasUse, loopDepth);
+        if (hasDef && isExiting) {
+          // Looks like this is a loop count variable update.
+          SlotIndex defIdx = lis->getInstructionIndex(mi).getDefIndex();
+          const LiveRange *dlr =
+            lis->getInterval(reg).getLiveRangeContaining(defIdx);
+          if (dlr->end > mbbEnd)
+            weight *= 3.0F;
+        }
+        regInt.weight += weight;
101
+      }
102
+      processed.clear();
103
+    }
104
+  }
105
+
106
+  for (LiveIntervals::iterator I = lis->begin(), E = lis->end(); I != E; ++I) {
107
+    LiveInterval &li = *I->second;
108
+    if (TargetRegisterInfo::isVirtualRegister(li.reg)) {
109
+      // If the live interval length is essentially zero, i.e. in every live
110
+      // range the use follows def immediately, it doesn't make sense to spill
111
+      // it and hope it will be easier to allocate for this li.
112
+      if (isZeroLengthInterval(&li)) {
113
+        li.weight = HUGE_VALF;
114
+        continue;
115
+      }
116
+
117
+      bool isLoad = false;
118
+      SmallVector<LiveInterval*, 4> spillIs;
119
+      if (lis->isReMaterializable(li, spillIs, isLoad)) {
120
+        // If all of the definitions of the interval are re-materializable,
121
+        // it is a preferred candidate for spilling. If non of the defs are
122
+        // loads, then it's potentially very cheap to re-materialize.
123
+        // FIXME: this gets much more complicated once we support non-trivial
124
+        // re-materialization.
125
+        if (isLoad)
126
+          li.weight *= 0.9F;
127
+        else
128
+          li.weight *= 0.5F;
129
+      }
130
+
131
+      // Slightly prefer live interval that has been assigned a preferred reg.
132
+      std::pair<unsigned, unsigned> Hint = mri->getRegAllocationHint(li.reg);
133
+      if (Hint.first || Hint.second)
134
+        li.weight *= 1.01F;
135
+
136
+      // Divide the weight of the interval by its size.  This encourages
137
+      // spilling of intervals that are large and have few uses, and
138
+      // discourages spilling of small intervals with many uses.
139
+      li.weight /= lis->getApproximateInstructionCount(li) * SlotIndex::NUM;
140
+    }
141
+  }
142
+  
143
+  return false;
144
+}
145
+
146
+/// Returns true if the given live interval is zero length.
147
+bool CalculateSpillWeights::isZeroLengthInterval(LiveInterval *li) const {
148
+  for (LiveInterval::Ranges::const_iterator
149
+       i = li->ranges.begin(), e = li->ranges.end(); i != e; ++i)
150
+    if (i->end.getPrevIndex() > i->start)
151
+      return false;
152
+  return true;
153
+}
... ...
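A worked example of the weighting arithmetic in the new pass above may help. The body of LiveIntervals::getSpillWeight is not part of this patch; the (hasDef + hasUse) * 10^loopDepth scaling in this hedged plain-C++ sketch is an assumption used only to make the numbers concrete, as is slotsPerInstr standing in for SlotIndex::NUM.

#include <cassert>
#include <cmath>

// Assumed shape of the per-operand contribution; not the LLVM API.
static float spillWeight(bool hasDef, bool hasUse, unsigned loopDepth) {
  return (float(hasDef) + float(hasUse)) * std::pow(10.0F, float(loopDepth));
}

int main() {
  // An operand both defined and used at loop depth 2 contributes 200; if the
  // def looks like a loop-count update that lives past the block, the pass
  // triples it, and finally the total is divided by the interval's size.
  float w = spillWeight(true, true, 2) * 3.0F;           // 600
  unsigned approxInstrCount = 30, slotsPerInstr = 4;     // illustrative
  float normalized = w / float(approxInstrCount * slotsPerInstr);
  assert(normalized > 4.99F && normalized < 5.01F);      // ~5.0
  return 0;
}

The final division is what makes large, rarely-used intervals cheap to spill relative to short, hot ones.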
@@ -13,16 +13,15 @@
 
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/BasicBlock.h"
-#include "llvm/ADT/SmallSet.h"
-#include "llvm/Assembly/Writer.h"
 #include "llvm/CodeGen/MachineFunction.h"
+#include "llvm/Target/TargetRegisterInfo.h"
 #include "llvm/Target/TargetData.h"
 #include "llvm/Target/TargetInstrDesc.h"
 #include "llvm/Target/TargetInstrInfo.h"
 #include "llvm/Target/TargetMachine.h"
-#include "llvm/Target/TargetRegisterInfo.h"
 #include "llvm/Support/LeakDetector.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Assembly/Writer.h"
 #include <algorithm>
 using namespace llvm;
 
... ...
@@ -449,28 +448,10 @@ void MachineBasicBlock::ReplaceUsesOfBlockWith(MachineBasicBlock *Old,
   addSuccessor(New);
 }
 
-/// BranchesToLandingPad - The basic block is a landing pad or branches only to
-/// a landing pad. No other instructions are present other than the
-/// unconditional branch.
-bool
-MachineBasicBlock::BranchesToLandingPad(const MachineBasicBlock *MBB) const {
-  SmallSet<const MachineBasicBlock*, 32> Visited;
-  const MachineBasicBlock *CurMBB = MBB;
-
-  while (!CurMBB->isLandingPad()) {
-    if (CurMBB->succ_size() != 1) break;
-    if (!Visited.insert(CurMBB)) break;
-    CurMBB = *CurMBB->succ_begin();
-  }
-
-  return CurMBB->isLandingPad();
-}
-
 /// CorrectExtraCFGEdges - Various pieces of code can cause excess edges in the
 /// CFG to be inserted.  If we have proven that MBB can only branch to DestA and
 /// DestB, remove any other MBB successors from the CFG.  DestA and DestB can
 /// be null.
-///
 /// Besides DestA and DestB, retain other edges leading to LandingPads
 /// (currently there can be only one; we don't check or require that here).
 /// Note it is possible that DestA and/or DestB are LandingPads.
... ...
@@ -483,16 +464,16 @@ bool MachineBasicBlock::CorrectExtraCFGEdges(MachineBasicBlock *DestA,
   MachineFunction::iterator FallThru =
     llvm::next(MachineFunction::iterator(this));
 
-  // If this block ends with a conditional branch that falls through to its
-  // successor, set DestB as the successor.
   if (isCond) {
+    // If this block ends with a conditional branch that falls through to its
+    // successor, set DestB as the successor.
     if (DestB == 0 && FallThru != getParent()->end()) {
       DestB = FallThru;
       AddedFallThrough = true;
     }
   } else {
     // If this is an unconditional branch with no explicit dest, it must just be
-    // a fallthrough into DestB.
+    // a fallthrough into DestA.
     if (DestA == 0 && FallThru != getParent()->end()) {
       DestA = FallThru;
       AddedFallThrough = true;
... ...
@@ -500,17 +481,16 @@ bool MachineBasicBlock::CorrectExtraCFGEdges(MachineBasicBlock *DestA,
   }
 
   MachineBasicBlock::succ_iterator SI = succ_begin();
-  const MachineBasicBlock *OrigDestA = DestA, *OrigDestB = DestB;
+  MachineBasicBlock *OrigDestA = DestA, *OrigDestB = DestB;
   while (SI != succ_end()) {
-    const MachineBasicBlock *MBB = *SI;
-    if (MBB == DestA) {
+    if (*SI == DestA) {
       DestA = 0;
       ++SI;
-    } else if (MBB == DestB) {
+    } else if (*SI == DestB) {
       DestB = 0;
       ++SI;
-    } else if (MBB != OrigDestA && MBB != OrigDestB &&
-               BranchesToLandingPad(MBB)) {
+    } else if ((*SI)->isLandingPad() &&
+               *SI != OrigDestA && *SI != OrigDestB) {
       ++SI;
     } else {
       // Otherwise, this is a superfluous edge, remove it.
... ...
@@ -518,14 +498,12 @@ bool MachineBasicBlock::CorrectExtraCFGEdges(MachineBasicBlock *DestA,
       MadeChange = true;
     }
   }
-
   if (!AddedFallThrough) {
     assert(DestA == 0 && DestB == 0 &&
            "MachineCFG is missing edges!");
   } else if (isCond) {
     assert(DestA == 0 && "MachineCFG is missing edges!");
   }
-
   return MadeChange;
 }
 
... ...
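A reduced model of the pruning loop in CorrectExtraCFGEdges above, in plain C++ (Block and pruneExtraEdges are illustrative, not the MachineBasicBlock API, and the DestA/DestB zeroing bookkeeping of the real code is omitted): edges to DestA, DestB, and landing pads survive; every other successor edge is superfluous and removed.

#include <cassert>
#include <cstddef>
#include <vector>

struct Block { bool landingPad = false; };

static bool pruneExtraEdges(std::vector<Block*> &succs,
                            Block *destA, Block *destB) {
  bool changed = false;
  for (std::size_t i = 0; i != succs.size();) {
    Block *s = succs[i];
    if (s == destA || s == destB || s->landingPad) {
      ++i;                                  // retained edge
    } else {
      succs.erase(succs.begin() + i);       // superfluous edge, remove it
      changed = true;
    }
  }
  return changed;
}

int main() {
  Block a, b, pad, stray;
  pad.landingPad = true;
  std::vector<Block*> succs = { &a, &pad, &stray, &b };
  assert(pruneExtraEdges(succs, &a, &b));
  assert(succs.size() == 3);                // only stray was dropped
  return 0;
}

The change in the hunk also replaces the old transitive BranchesToLandingPad walk with a direct isLandingPad() test on the successor itself, which is both cheaper and simpler.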
@@ -16,6 +16,7 @@
 
 #define DEBUG_TYPE "pre-alloc-split"
 #include "VirtRegMap.h"
+#include "llvm/CodeGen/CalcSpillWeights.h"
 #include "llvm/CodeGen/LiveIntervalAnalysis.h"
 #include "llvm/CodeGen/LiveStackAnalysis.h"
 #include "llvm/CodeGen/MachineDominators.h"
... ...
@@ -104,6 +105,7 @@ namespace {
       AU.addRequired<LiveStacks>();
       AU.addPreserved<LiveStacks>();
       AU.addPreserved<RegisterCoalescer>();
+      AU.addPreserved<CalculateSpillWeights>();
       if (StrongPHIElim)
         AU.addPreservedID(StrongPHIEliminationID);
       else
... ...
@@ -16,6 +16,7 @@
 #include "VirtRegRewriter.h"
 #include "Spiller.h"
 #include "llvm/Function.h"
+#include "llvm/CodeGen/CalcSpillWeights.h"
 #include "llvm/CodeGen/LiveIntervalAnalysis.h"
 #include "llvm/CodeGen/LiveStackAnalysis.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
... ...
@@ -187,6 +188,7 @@ namespace {
       // Make sure PassManager knows which analyses to make available
       // to coalescing and which analyses coalescing invalidates.
       AU.addRequiredTransitive<RegisterCoalescer>();
+      AU.addRequired<CalculateSpillWeights>();
       if (PreSplitIntervals)
         AU.addRequiredID(PreAllocSplittingID);
       AU.addRequired<LiveStacks>();
... ...
@@ -36,6 +36,7 @@
 #include "PBQP/Heuristics/Briggs.h"
 #include "VirtRegMap.h"
 #include "VirtRegRewriter.h"
+#include "llvm/CodeGen/CalcSpillWeights.h"
 #include "llvm/CodeGen/LiveIntervalAnalysis.h"
 #include "llvm/CodeGen/LiveStackAnalysis.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
... ...
@@ -90,6 +91,7 @@ namespace {
       au.addRequired<LiveIntervals>();
       //au.addRequiredID(SplitCriticalEdgesID);
       au.addRequired<RegisterCoalescer>();
+      au.addRequired<CalculateSpillWeights>();
       au.addRequired<LiveStacks>();
       au.addPreserved<LiveStacks>();
       au.addRequired<MachineLoopInfo>();
... ...
@@ -3202,6 +3202,19 @@ SDValue DAGCombiner::visitZERO_EXTEND(SDNode *N) {
                        X, DAG.getConstant(Mask, VT));
   }
 
+  // Fold (zext (and x, cst)) -> (and (zext x), cst)
+  if (N0.getOpcode() == ISD::AND &&
+      N0.getOperand(1).getOpcode() == ISD::Constant &&
+      N0.getOperand(0).getOpcode() != ISD::TRUNCATE &&
+      N0.getOperand(0).hasOneUse()) {
+    APInt Mask = cast<ConstantSDNode>(N0.getOperand(1))->getAPIntValue();
+    Mask.zext(VT.getSizeInBits());
+    return DAG.getNode(ISD::AND, N->getDebugLoc(), VT,
+                       DAG.getNode(ISD::ZERO_EXTEND, N->getDebugLoc(), VT,
+                                   N0.getOperand(0)),
+                       DAG.getConstant(Mask, VT));
+  }
+
   // fold (zext (load x)) -> (zext (truncate (zextload x)))
   if (ISD::isNON_EXTLoad(N0.getNode()) &&
       ((!LegalOperations && !cast<LoadSDNode>(N0)->isVolatile()) ||
... ...
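The fold added above rests on a simple algebraic identity: masking then widening equals widening then masking with the zero-extended mask constant. A standalone check in plain C++ (u8 -> u32 stands in for the DAG value types):

#include <cassert>
#include <cstdint>

int main() {
  for (unsigned v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    // zext (and x, cst)
    uint32_t lhs = static_cast<uint32_t>(static_cast<uint8_t>(x & 0xF0));
    // and (zext x), cst  -- with cst itself zero-extended
    uint32_t rhs = static_cast<uint32_t>(x) & 0xF0u;
    assert(lhs == rhs);
  }
  return 0;
}

The TRUNCATE and hasOneUse guards in the hunk appear to be profitability and termination checks rather than correctness requirements of the identity itself.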
@@ -3278,6 +3291,26 @@ SDValue DAGCombiner::visitZERO_EXTEND(SDNode *N) {
     if (SCC.getNode()) return SCC;
   }
 
+  // (zext (shl/srl (zext x), cst)) -> (shl/srl (zext x), cst)
+  if ((N0.getOpcode() == ISD::SHL || N0.getOpcode() == ISD::SRL) &&
+      isa<ConstantSDNode>(N0.getOperand(1)) &&
+      N0.getOperand(0).getOpcode() == ISD::ZERO_EXTEND &&
+      N0.hasOneUse()) {
+    if (N0.getOpcode() == ISD::SHL) {
+      // If the original shl may be shifting out bits, do not perform this
+      // transformation.
+      unsigned ShAmt = cast<ConstantSDNode>(N0.getOperand(1))->getZExtValue();
+      unsigned KnownZeroBits = N0.getOperand(0).getValueType().getSizeInBits() -
+        N0.getOperand(0).getOperand(0).getValueType().getSizeInBits();
+      if (ShAmt > KnownZeroBits)
+        return SDValue();
+    }
+    DebugLoc dl = N->getDebugLoc();
+    return DAG.getNode(N0.getOpcode(), dl, VT,
+                       DAG.getNode(ISD::ZERO_EXTEND, dl, VT, N0.getOperand(0)),
+                       DAG.getNode(ISD::ZERO_EXTEND, dl, VT, N0.getOperand(1)));
+  }
+
   return SDValue();
 }
 
... ...
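Why the ShAmt > KnownZeroBits bail-out above is needed, demonstrated in plain C++ with u8 -> u16 -> u32 standing in for the DAG types. Because x came from a zero extension, it occupies only the low 8 of 16 bits, so left shifts up to 8 cannot lose bits in the narrow type:

#include <cassert>
#include <cstdint>

int main() {
  for (unsigned v = 0; v < 256; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    for (unsigned sh = 0; sh <= 8; ++sh) {     // sh <= KnownZeroBits
      uint16_t narrow = static_cast<uint16_t>(uint32_t(x) << sh);
      uint32_t lhs = narrow;                   // zext (shl (zext x), sh)
      uint32_t rhs = uint32_t(x) << sh;        // shl (zext x to u32), sh
      assert(lhs == rhs);
    }
  }
  // With sh = 9 the identity breaks: for x = 0x80, (u16)(0x80 << 9) == 0,
  // while (u32)0x80 << 9 == 0x10000 -- bits were shifted out of the narrow
  // type, which is exactly the case the code refuses to transform.
  return 0;
}

For srl no such guard is needed: a right shift of a zero-extended value can never move meaningful bits past the narrow type's boundary.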
@@ -5196,7 +5229,7 @@ SDValue DAGCombiner::visitSTORE(SDNode *N) {
     // SimplifyDemandedBits, which only works if the value has a single use.
     if (SimplifyDemandedBits(Value,
                              APInt::getLowBitsSet(
-                               Value.getValueSizeInBits(),
+                               Value.getValueType().getScalarType().getSizeInBits(),
                                ST->getMemoryVT().getSizeInBits())))
       return SDValue(N, 0);
   }
... ...
@@ -20,10 +20,16 @@
 #include "llvm/Target/TargetInstrInfo.h"
 #include "llvm/Target/TargetRegisterInfo.h"
 #include "llvm/Target/TargetSubtarget.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/raw_ostream.h"
 using namespace llvm;
 
+cl::opt<bool>
+DisableInstScheduling("disable-inst-scheduling",
+                      cl::init(false),
+                      cl::desc("Disable instruction scheduling"));
+
 ScheduleDAGSDNodes::ScheduleDAGSDNodes(MachineFunction &mf)
   : ScheduleDAG(mf) {
 }
... ...
@@ -48,6 +48,8 @@
 #include <cmath>
 using namespace llvm;
 
+extern cl::opt<bool> DisableInstScheduling;
+
 /// makeVTList - Return an instance of the SDVTList struct initialized with the
 /// specified members.
 static SDVTList makeVTList(const EVT *VTs, unsigned NumVTs) {
... ...
@@ -552,6 +554,9 @@ void SelectionDAG::RemoveDeadNodes(SmallVectorImpl<SDNode *> &DeadNodes,
     }
 
     DeallocateNode(N);
+
+    // Remove the ordering of this node.
+    if (Ordering) Ordering->remove(N);
   }
 }
 
... ...
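The Ordering calls threaded through SelectionDAG in the hunks that follow talk to a NodeOrdering object that is only allocated when the -disable-inst-scheduling flag (added earlier in this patch) is set. NodeOrdering's definition is not part of this excerpt; the sketch below is an assumed minimal shape, not the actual interface: a side table that stamps each node with a creation/reuse sequence number so a later phase can emit instructions in source order instead of scheduled order.

#include <map>

class SDNode;  // opaque for this sketch

class NodeOrdering {
  std::map<const SDNode *, unsigned> OrderMap;
  unsigned NextOrder;
public:
  NodeOrdering() : NextOrder(0) {}
  // Stamp a node when it is created or handed back from the CSE map.
  void add(const SDNode *N)    { if (N) OrderMap[N] = NextOrder++; }
  // Drop a node when it is deallocated; tolerate the null pointer that
  // FindModifiedNodeSlot can pass when no existing node was found.
  void remove(const SDNode *N) { if (N) OrderMap.erase(N); }
  unsigned getOrder(const SDNode *N) const {
    std::map<const SDNode *, unsigned>::const_iterator I = OrderMap.find(N);
    return I == OrderMap.end() ? 0 : I->second;
  }
};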
@@ -577,6 +582,9 @@ void SelectionDAG::DeleteNodeNotInCSEMaps(SDNode *N) {
   N->DropOperands();
 
   DeallocateNode(N);
+
+  // Remove the ordering of this node.
+  if (Ordering) Ordering->remove(N);
 }
 
 void SelectionDAG::DeallocateNode(SDNode *N) {
... ...
@@ -588,6 +596,9 @@ void SelectionDAG::DeallocateNode(SDNode *N) {
   N->NodeType = ISD::DELETED_NODE;
 
   NodeAllocator.Deallocate(AllNodes.remove(N));
+
+  // Remove the ordering of this node.
+  if (Ordering) Ordering->remove(N);
 }
 
 /// RemoveNodeFromCSEMaps - Take the specified node out of the CSE map that
... ...
@@ -691,7 +702,9 @@ SDNode *SelectionDAG::FindModifiedNodeSlot(SDNode *N, SDValue Op,
   FoldingSetNodeID ID;
   AddNodeIDNode(ID, N->getOpcode(), N->getVTList(), Ops, 1);
   AddNodeIDCustom(ID, N);
-  return CSEMap.FindNodeOrInsertPos(ID, InsertPos);
+  SDNode *Node = CSEMap.FindNodeOrInsertPos(ID, InsertPos);
+  if (Ordering) Ordering->remove(Node);
+  return Node;
 }
 
 /// FindModifiedNodeSlot - Find a slot for the specified node if its operands
... ...
@@ -708,7 +721,9 @@ SDNode *SelectionDAG::FindModifiedNodeSlot(SDNode *N,
   FoldingSetNodeID ID;
   AddNodeIDNode(ID, N->getOpcode(), N->getVTList(), Ops, 2);
   AddNodeIDCustom(ID, N);
-  return CSEMap.FindNodeOrInsertPos(ID, InsertPos);
+  SDNode *Node = CSEMap.FindNodeOrInsertPos(ID, InsertPos);
+  if (Ordering) Ordering->remove(Node);
+  return Node;
 }
 
 
... ...
@@ -725,7 +740,9 @@ SDNode *SelectionDAG::FindModifiedNodeSlot(SDNode *N,
   FoldingSetNodeID ID;
   AddNodeIDNode(ID, N->getOpcode(), N->getVTList(), Ops, NumOps);
   AddNodeIDCustom(ID, N);
-  return CSEMap.FindNodeOrInsertPos(ID, InsertPos);
+  SDNode *Node = CSEMap.FindNodeOrInsertPos(ID, InsertPos);
+  if (Ordering) Ordering->remove(Node);
+  return Node;
 }
 
 /// VerifyNode - Sanity check the given node.  Aborts if it is invalid.
... ...
@@ -778,8 +795,13 @@ unsigned SelectionDAG::getEVTAlignment(EVT VT) const {
 SelectionDAG::SelectionDAG(TargetLowering &tli, FunctionLoweringInfo &fli)
   : TLI(tli), FLI(fli), DW(0),
     EntryNode(ISD::EntryToken, DebugLoc::getUnknownLoc(),
-    getVTList(MVT::Other)), Root(getEntryNode()) {
+              getVTList(MVT::Other)),
+    Root(getEntryNode()), Ordering(0) {
   AllNodes.push_back(&EntryNode);
+  if (DisableInstScheduling) {
+    Ordering = new NodeOrdering();
+    Ordering->add(&EntryNode);
+  }
 }
 
 void SelectionDAG::init(MachineFunction &mf, MachineModuleInfo *mmi,
... ...
@@ -792,6 +814,7 @@ void SelectionDAG::init(MachineFunction &mf, MachineModuleInfo *mmi,
 
 SelectionDAG::~SelectionDAG() {
   allnodes_clear();
+  delete Ordering;
 }
 
 void SelectionDAG::allnodes_clear() {
... ...
@@ -817,6 +840,10 @@ void SelectionDAG::clear() {
   EntryNode.UseList = 0;
   AllNodes.push_back(&EntryNode);
   Root = getEntryNode();
+  if (DisableInstScheduling) {
+    Ordering = new NodeOrdering();
+    Ordering->add(&EntryNode);
+  }
 }
 
 SDValue SelectionDAG::getSExtOrTrunc(SDValue Op, DebugLoc DL, EVT VT) {
... ...
@@ -877,14 +904,17 @@ SDValue SelectionDAG::getConstant(const ConstantInt &Val, EVT VT, bool isT) {
   ID.AddPointer(&Val);
   void *IP = 0;
   SDNode *N = NULL;
-  if ((N = CSEMap.FindNodeOrInsertPos(ID, IP)))
+  if ((N = CSEMap.FindNodeOrInsertPos(ID, IP))) {
+    if (Ordering) Ordering->add(N);
     if (!VT.isVector())
       return SDValue(N, 0);
+  }
   if (!N) {
     N = NodeAllocator.Allocate<ConstantSDNode>();
     new (N) ConstantSDNode(isT, &Val, EltVT);
     CSEMap.InsertNode(N, IP);
     AllNodes.push_back(N);
+    if (Ordering) Ordering->add(N);
   }
 
   SDValue Result(N, 0);
... ...
@@ -921,14 +951,17 @@ SDValue SelectionDAG::getConstantFP(const ConstantFP& V, EVT VT, bool isTarget){
   ID.AddPointer(&V);
   void *IP = 0;
   SDNode *N = NULL;
-  if ((N = CSEMap.FindNodeOrInsertPos(ID, IP)))
+  if ((N = CSEMap.FindNodeOrInsertPos(ID, IP))) {
+    if (Ordering) Ordering->add(N);
     if (!VT.isVector())
       return SDValue(N, 0);
+  }
   if (!N) {
     N = NodeAllocator.Allocate<ConstantFPSDNode>();
     new (N) ConstantFPSDNode(isTarget, &V, EltVT);
     CSEMap.InsertNode(N, IP);
     AllNodes.push_back(N);
+    if (Ordering) Ordering->add(N);
   }
 
   SDValue Result(N, 0);
... ...
@@ -983,12 +1016,15 @@ SDValue SelectionDAG::getGlobalAddress(const GlobalValue *GV,
   ID.AddInteger(Offset);
   ID.AddInteger(TargetFlags);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<GlobalAddressSDNode>();
   new (N) GlobalAddressSDNode(Opc, GV, VT, Offset, TargetFlags);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -998,12 +1034,15 @@ SDValue SelectionDAG::getFrameIndex(int FI, EVT VT, bool isTarget) {
   AddNodeIDNode(ID, Opc, getVTList(VT), 0, 0);
   ID.AddInteger(FI);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<FrameIndexSDNode>();
   new (N) FrameIndexSDNode(FI, VT, isTarget);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1017,12 +1056,15 @@ SDValue SelectionDAG::getJumpTable(int JTI, EVT VT, bool isTarget,
   ID.AddInteger(JTI);
   ID.AddInteger(TargetFlags);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<JumpTableSDNode>();
   new (N) JumpTableSDNode(JTI, VT, isTarget, TargetFlags);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1042,12 +1084,15 @@ SDValue SelectionDAG::getConstantPool(Constant *C, EVT VT,
   ID.AddPointer(C);
   ID.AddInteger(TargetFlags);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<ConstantPoolSDNode>();
   new (N) ConstantPoolSDNode(isTarget, C, VT, Offset, Alignment, TargetFlags);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1068,12 +1113,15 @@ SDValue SelectionDAG::getConstantPool(MachineConstantPoolValue *C, EVT VT,
   C->AddSelectionDAGCSEId(ID);
   ID.AddInteger(TargetFlags);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<ConstantPoolSDNode>();
   new (N) ConstantPoolSDNode(isTarget, C, VT, Offset, Alignment, TargetFlags);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1082,12 +1130,15 @@ SDValue SelectionDAG::getBasicBlock(MachineBasicBlock *MBB) {
   AddNodeIDNode(ID, ISD::BasicBlock, getVTList(MVT::Other), 0, 0);
   ID.AddPointer(MBB);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<BasicBlockSDNode>();
   new (N) BasicBlockSDNode(MBB);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1103,6 +1154,7 @@ SDValue SelectionDAG::getValueType(EVT VT) {
   N = NodeAllocator.Allocate<VTSDNode>();
   new (N) VTSDNode(VT);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1112,6 +1164,7 @@ SDValue SelectionDAG::getExternalSymbol(const char *Sym, EVT VT) {
   N = NodeAllocator.Allocate<ExternalSymbolSDNode>();
   new (N) ExternalSymbolSDNode(false, Sym, 0, VT);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1124,6 +1177,7 @@ SDValue SelectionDAG::getTargetExternalSymbol(const char *Sym, EVT VT,
   N = NodeAllocator.Allocate<ExternalSymbolSDNode>();
   new (N) ExternalSymbolSDNode(true, Sym, TargetFlags, VT);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1136,6 +1190,7 @@ SDValue SelectionDAG::getCondCode(ISD::CondCode Cond) {
     new (N) CondCodeSDNode(Cond);
     CondCodeNodes[Cond] = N;
     AllNodes.push_back(N);
+    if (Ordering) Ordering->add(N);
   }
   return SDValue(CondCodeNodes[Cond], 0);
 }
... ...
@@ -1228,8 +1283,10 @@ SDValue SelectionDAG::getVectorShuffle(EVT VT, DebugLoc dl, SDValue N1,
     ID.AddInteger(MaskVec[i]);
 
   void* IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
 
   // Allocate the mask array for the node out of the BumpPtrAllocator, since
   // SDNode doesn't have access to it.  This memory will be "leaked" when
... ...
@@ -1241,6 +1298,7 @@ SDValue SelectionDAG::getVectorShuffle(EVT VT, DebugLoc dl, SDValue N1,
   new (N) ShuffleVectorSDNode(VT, dl, N1, N2, MaskAlloc);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1258,12 +1316,15 @@ SDValue SelectionDAG::getConvertRndSat(EVT VT, DebugLoc dl,
   SDValue Ops[] = { Val, DTy, STy, Rnd, Sat };
   AddNodeIDNode(ID, ISD::CONVERT_RNDSAT, getVTList(VT), &Ops[0], 5);
   void* IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   CvtRndSatSDNode *N = NodeAllocator.Allocate<CvtRndSatSDNode>();
   new (N) CvtRndSatSDNode(VT, dl, Ops, 5, Code);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1272,12 +1333,15 @@ SDValue SelectionDAG::getRegister(unsigned RegNo, EVT VT) {
   AddNodeIDNode(ID, ISD::Register, getVTList(VT), 0, 0);
   ID.AddInteger(RegNo);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<RegisterSDNode>();
   new (N) RegisterSDNode(RegNo, VT);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1289,12 +1353,15 @@ SDValue SelectionDAG::getLabel(unsigned Opcode, DebugLoc dl,
   AddNodeIDNode(ID, Opcode, getVTList(MVT::Other), &Ops[0], 1);
   ID.AddInteger(LabelID);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<LabelSDNode>();
   new (N) LabelSDNode(Opcode, dl, Root, LabelID);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1308,12 +1375,15 @@ SDValue SelectionDAG::getBlockAddress(BlockAddress *BA, EVT VT,
   ID.AddPointer(BA);
   ID.AddInteger(TargetFlags);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<BlockAddressSDNode>();
   new (N) BlockAddressSDNode(Opc, VT, BA, TargetFlags);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -1326,13 +1396,16 @@ SDValue SelectionDAG::getSrcValue(const Value *V) {
   ID.AddPointer(V);
 
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
 
   SDNode *N = NodeAllocator.Allocate<SrcValueSDNode>();
   new (N) SrcValueSDNode(V);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -2243,13 +2316,16 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT) {
   FoldingSetNodeID ID;
   AddNodeIDNode(ID, Opcode, getVTList(VT), 0, 0);
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<SDNode>();
   new (N) SDNode(Opcode, DL, getVTList(VT));
   CSEMap.InsertNode(N, IP);
 
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -2354,6 +2430,10 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
     assert(VT.isFloatingPoint() &&
            Operand.getValueType().isFloatingPoint() && "Invalid FP cast!");
     if (Operand.getValueType() == VT) return Operand;  // noop conversion.
+    assert((!VT.isVector() ||
+            VT.getVectorNumElements() ==
+            Operand.getValueType().getVectorNumElements()) &&
+           "Vector element count mismatch!");
     if (Operand.getOpcode() == ISD::UNDEF)
       return getUNDEF(VT);
     break;
... ...
@@ -2361,8 +2441,12 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
     assert(VT.isInteger() && Operand.getValueType().isInteger() &&
            "Invalid SIGN_EXTEND!");
     if (Operand.getValueType() == VT) return Operand;   // noop extension
-    assert(Operand.getValueType().bitsLT(VT)
-           && "Invalid sext node, dst < src!");
+    assert(Operand.getValueType().getScalarType().bitsLT(VT.getScalarType()) &&
+           "Invalid sext node, dst < src!");
+    assert((!VT.isVector() ||
+            VT.getVectorNumElements() ==
+            Operand.getValueType().getVectorNumElements()) &&
+           "Vector element count mismatch!");
     if (OpOpcode == ISD::SIGN_EXTEND || OpOpcode == ISD::ZERO_EXTEND)
       return getNode(OpOpcode, DL, VT, Operand.getNode()->getOperand(0));
     break;
... ...
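What the per-scalar comparison plus the new "Vector element count mismatch!" asserts (here and in the ZERO_EXTEND/ANY_EXTEND/TRUNCATE cases below) rule out, modeled in plain C++ (VecTy is an illustrative stand-in for EVT): an extension must widen each element and preserve the element count; comparing whole-vector bit widths, as the old asserts did, would accept lane-count changes.

#include <cassert>

struct VecTy { unsigned elemBits; unsigned numElems; };

static bool validExtend(VecTy src, VecTy dst) {
  // per-element widening with an unchanged lane count
  return src.elemBits < dst.elemBits && src.numElems == dst.numElems;
}

int main() {
  assert(validExtend(VecTy{16, 4}, VecTy{32, 4}));   // <4 x i16> -> <4 x i32>
  // 32 total bits -> 128 total bits would pass a whole-width bitsLT test,
  // but the element count changes, so the new asserts reject it:
  assert(!validExtend(VecTy{8, 4}, VecTy{64, 2}));   // <4 x i8> -> <2 x i64>
  return 0;
}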
@@ -2370,8 +2454,12 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
     assert(VT.isInteger() && Operand.getValueType().isInteger() &&
            "Invalid ZERO_EXTEND!");
     if (Operand.getValueType() == VT) return Operand;   // noop extension
-    assert(Operand.getValueType().bitsLT(VT)
-           && "Invalid zext node, dst < src!");
+    assert(Operand.getValueType().getScalarType().bitsLT(VT.getScalarType()) &&
+           "Invalid zext node, dst < src!");
+    assert((!VT.isVector() ||
+            VT.getVectorNumElements() ==
+            Operand.getValueType().getVectorNumElements()) &&
+           "Vector element count mismatch!");
     if (OpOpcode == ISD::ZERO_EXTEND)   // (zext (zext x)) -> (zext x)
       return getNode(ISD::ZERO_EXTEND, DL, VT,
                      Operand.getNode()->getOperand(0));
... ...
@@ -2380,8 +2468,12 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
     assert(VT.isInteger() && Operand.getValueType().isInteger() &&
            "Invalid ANY_EXTEND!");
     if (Operand.getValueType() == VT) return Operand;   // noop extension
-    assert(Operand.getValueType().bitsLT(VT)
-           && "Invalid anyext node, dst < src!");
+    assert(Operand.getValueType().getScalarType().bitsLT(VT.getScalarType()) &&
+           "Invalid anyext node, dst < src!");
+    assert((!VT.isVector() ||
+            VT.getVectorNumElements() ==
+            Operand.getValueType().getVectorNumElements()) &&
+           "Vector element count mismatch!");
     if (OpOpcode == ISD::ZERO_EXTEND || OpOpcode == ISD::SIGN_EXTEND)
       // (ext (zext x)) -> (zext x)  and  (ext (sext x)) -> (sext x)
       return getNode(OpOpcode, DL, VT, Operand.getNode()->getOperand(0));
... ...
@@ -2390,14 +2482,19 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
     assert(VT.isInteger() && Operand.getValueType().isInteger() &&
            "Invalid TRUNCATE!");
     if (Operand.getValueType() == VT) return Operand;   // noop truncate
-    assert(Operand.getValueType().bitsGT(VT)
-           && "Invalid truncate node, src < dst!");
+    assert(Operand.getValueType().getScalarType().bitsGT(VT.getScalarType()) &&
+           "Invalid truncate node, src < dst!");
+    assert((!VT.isVector() ||
+            VT.getVectorNumElements() ==
+            Operand.getValueType().getVectorNumElements()) &&
+           "Vector element count mismatch!");
     if (OpOpcode == ISD::TRUNCATE)
       return getNode(ISD::TRUNCATE, DL, VT, Operand.getNode()->getOperand(0));
     else if (OpOpcode == ISD::ZERO_EXTEND || OpOpcode == ISD::SIGN_EXTEND ||
             OpOpcode == ISD::ANY_EXTEND) {
       // If the source is smaller than the dest, we still need an extend.
-      if (Operand.getNode()->getOperand(0).getValueType().bitsLT(VT))
+      if (Operand.getNode()->getOperand(0).getValueType().getScalarType()
+            .bitsLT(VT.getScalarType()))
        return getNode(OpOpcode, DL, VT, Operand.getNode()->getOperand(0));
       else if (Operand.getNode()->getOperand(0).getValueType().bitsGT(VT))
        return getNode(ISD::TRUNCATE, DL, VT, Operand.getNode()->getOperand(0));
... ...
@@ -2452,8 +2549,10 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
     SDValue Ops[1] = { Operand };
     AddNodeIDNode(ID, Opcode, VTs, Ops, 1);
     void *IP = 0;
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return SDValue(E, 0);
+    }
     N = NodeAllocator.Allocate<UnarySDNode>();
     new (N) UnarySDNode(Opcode, DL, VTs, Operand);
     CSEMap.InsertNode(N, IP);
... ...
@@ -2463,6 +2562,7 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL,
   }
 
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -2870,8 +2970,10 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT,
     FoldingSetNodeID ID;
     AddNodeIDNode(ID, Opcode, VTs, Ops, 2);
     void *IP = 0;
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return SDValue(E, 0);
+    }
     N = NodeAllocator.Allocate<BinarySDNode>();
     new (N) BinarySDNode(Opcode, DL, VTs, N1, N2);
     CSEMap.InsertNode(N, IP);
... ...
@@ -2881,6 +2983,7 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT,
   }
 
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -2947,8 +3050,10 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT,
     FoldingSetNodeID ID;
     AddNodeIDNode(ID, Opcode, VTs, Ops, 3);
     void *IP = 0;
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return SDValue(E, 0);
+    }
     N = NodeAllocator.Allocate<TernarySDNode>();
     new (N) TernarySDNode(Opcode, DL, VTs, N1, N2, N3);
     CSEMap.InsertNode(N, IP);
... ...
@@ -2956,7 +3061,9 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT,
     N = NodeAllocator.Allocate<TernarySDNode>();
     new (N) TernarySDNode(Opcode, DL, VTs, N1, N2, N3);
   }
+
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -3552,12 +3659,14 @@ SDValue SelectionDAG::getAtomic(unsigned Opcode, DebugLoc dl, EVT MemVT,
   void* IP = 0;
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
     cast<AtomicSDNode>(E)->refineAlignment(MMO);
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
   }
   SDNode* N = NodeAllocator.Allocate<AtomicSDNode>();
   new (N) AtomicSDNode(Opcode, dl, VTs, MemVT, Chain, Ptr, Cmp, Swp, MMO);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3615,12 +3724,14 @@ SDValue SelectionDAG::getAtomic(unsigned Opcode, DebugLoc dl, EVT MemVT,
   void* IP = 0;
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
     cast<AtomicSDNode>(E)->refineAlignment(MMO);
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
   }
   SDNode* N = NodeAllocator.Allocate<AtomicSDNode>();
   new (N) AtomicSDNode(Opcode, dl, VTs, MemVT, Chain, Ptr, Val, MMO);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3693,6 +3804,7 @@ SelectionDAG::getMemIntrinsicNode(unsigned Opcode, DebugLoc dl, SDVTList VTList,
     void *IP = 0;
     if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
       cast<MemIntrinsicSDNode>(E)->refineAlignment(MMO);
+      if (Ordering) Ordering->add(E);
       return SDValue(E, 0);
     }
 
... ...
@@ -3704,6 +3816,7 @@ SelectionDAG::getMemIntrinsicNode(unsigned Opcode, DebugLoc dl, SDVTList VTList,
     new (N) MemIntrinsicSDNode(Opcode, dl, VTList, Ops, NumOps, MemVT, MMO);
   }
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3743,16 +3856,15 @@ SelectionDAG::getLoad(ISD::MemIndexedMode AM, DebugLoc dl,
     assert(VT == MemVT && "Non-extending load from different memory type!");
   } else {
     // Extending load.
-    if (VT.isVector())
-      assert(MemVT.getVectorNumElements() == VT.getVectorNumElements() &&
-             "Invalid vector extload!");
-    else
-      assert(MemVT.bitsLT(VT) &&
-             "Should only be an extending load, not truncating!");
-    assert((ExtType == ISD::EXTLOAD || VT.isInteger()) &&
-           "Cannot sign/zero extend a FP/Vector load!");
+    assert(MemVT.getScalarType().bitsLT(VT.getScalarType()) &&
+           "Should only be an extending load, not truncating!");
     assert(VT.isInteger() == MemVT.isInteger() &&
            "Cannot convert from FP to Int or Int -> FP!");
+    assert(VT.isVector() == MemVT.isVector() &&
+           "Cannot use an ext load to convert to or from a vector!");
+    assert((!VT.isVector() ||
+            VT.getVectorNumElements() == MemVT.getVectorNumElements()) &&
+           "Cannot use an ext load to change the number of vector elements!");
   }
 
   bool Indexed = AM != ISD::UNINDEXED;
... ...
@@ -3769,12 +3881,14 @@ SelectionDAG::getLoad(ISD::MemIndexedMode AM, DebugLoc dl,
   void *IP = 0;
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
     cast<LoadSDNode>(E)->refineAlignment(MMO);
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
   }
   SDNode *N = NodeAllocator.Allocate<LoadSDNode>();
   new (N) LoadSDNode(Ops, dl, VTs, AM, ExtType, MemVT, MMO);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3845,12 +3959,14 @@ SDValue SelectionDAG::getStore(SDValue Chain, DebugLoc dl, SDValue Val,
   void *IP = 0;
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
     cast<StoreSDNode>(E)->refineAlignment(MMO);
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
   }
   SDNode *N = NodeAllocator.Allocate<StoreSDNode>();
   new (N) StoreSDNode(Ops, dl, VTs, ISD::UNINDEXED, false, VT, MMO);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3885,10 +4001,15 @@ SDValue SelectionDAG::getTruncStore(SDValue Chain, DebugLoc dl, SDValue Val,
   if (VT == SVT)
     return getStore(Chain, dl, Val, Ptr, MMO);
 
-  assert(VT.bitsGT(SVT) && "Not a truncation?");
+  assert(SVT.getScalarType().bitsLT(VT.getScalarType()) &&
+         "Should only be a truncating store, not extending!");
   assert(VT.isInteger() == SVT.isInteger() &&
          "Can't do FP-INT conversion!");
-
+  assert(VT.isVector() == SVT.isVector() &&
+         "Cannot use trunc store to convert to or from a vector!");
+  assert((!VT.isVector() ||
+          VT.getVectorNumElements() == SVT.getVectorNumElements()) &&
+         "Cannot use trunc store to change the number of vector elements!");
 
   SDVTList VTs = getVTList(MVT::Other);
   SDValue Undef = getUNDEF(Ptr.getValueType());
... ...
@@ -3900,12 +4021,14 @@ SDValue SelectionDAG::getTruncStore(SDValue Chain, DebugLoc dl, SDValue Val,
   void *IP = 0;
   if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
     cast<StoreSDNode>(E)->refineAlignment(MMO);
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
   }
   SDNode *N = NodeAllocator.Allocate<StoreSDNode>();
   new (N) StoreSDNode(Ops, dl, VTs, ISD::UNINDEXED, true, SVT, MMO);
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3922,14 +4045,17 @@ SelectionDAG::getIndexedStore(SDValue OrigStore, DebugLoc dl, SDValue Base,
   ID.AddInteger(ST->getMemoryVT().getRawBits());
   ID.AddInteger(ST->getRawSubclassData());
   void *IP = 0;
-  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+  if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+    if (Ordering) Ordering->add(E);
     return SDValue(E, 0);
+  }
   SDNode *N = NodeAllocator.Allocate<StoreSDNode>();
   new (N) StoreSDNode(Ops, dl, VTs, AM,
                       ST->isTruncatingStore(), ST->getMemoryVT(),
                       ST->getMemOperand());
   CSEMap.InsertNode(N, IP);
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
   return SDValue(N, 0);
 }
 
... ...
@@ -3995,8 +4121,10 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT,
     AddNodeIDNode(ID, Opcode, VTs, Ops, NumOps);
     void *IP = 0;
 
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return SDValue(E, 0);
+    }
 
     N = NodeAllocator.Allocate<SDNode>();
     new (N) SDNode(Opcode, DL, VTs, Ops, NumOps);
... ...
@@ -4007,6 +4135,7 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, EVT VT,
   }
 
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -4062,8 +4191,10 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, SDVTList VTList,
     FoldingSetNodeID ID;
     AddNodeIDNode(ID, Opcode, VTList, Ops, NumOps);
     void *IP = 0;
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return SDValue(E, 0);
+    }
     if (NumOps == 1) {
       N = NodeAllocator.Allocate<UnarySDNode>();
       new (N) UnarySDNode(Opcode, DL, VTList, Ops[0]);
... ...
@@ -4094,6 +4225,7 @@ SDValue SelectionDAG::getNode(unsigned Opcode, DebugLoc DL, SDVTList VTList,
     }
   }
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -4177,7 +4309,7 @@ SDVTList SelectionDAG::getVTList(EVT VT1, EVT VT2, EVT VT3, EVT VT4) {
                           I->VTs[2] == VT3 && I->VTs[3] == VT4)
       return *I;
 
-  EVT *Array = Allocator.Allocate<EVT>(3);
+  EVT *Array = Allocator.Allocate<EVT>(4);
   Array[0] = VT1;
   Array[1] = VT2;
   Array[2] = VT3;
... ...
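The one-character fix above closes an out-of-bounds write: the four-VT getVTList overload stores Array[0] through Array[3], so an allocation of three elements is one short. A reduced illustration of the bug class in plain C++ (std::vector stands in for the bump-pointer allocator):

#include <vector>

int main() {
  std::vector<int> Array(4);   // with Array(3), the last store below would
  Array[0] = 1;                // write past the end of the allocation
  Array[1] = 2;
  Array[2] = 3;
  Array[3] = 4;
  return 0;
}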
@@ -4556,8 +4688,10 @@ SDNode *SelectionDAG::MorphNodeTo(SDNode *N, unsigned Opc,
   if (VTs.VTs[VTs.NumVTs-1] != MVT::Flag) {
     FoldingSetNodeID ID;
     AddNodeIDNode(ID, Opc, VTs, Ops, NumOps);
-    if (SDNode *ON = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *ON = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(ON);
       return ON;
+    }
   }
 
   if (!RemoveNodeFromCSEMaps(N))
... ...
@@ -4621,6 +4755,7 @@ SDNode *SelectionDAG::MorphNodeTo(SDNode *N, unsigned Opc,
 
   if (IP)
     CSEMap.InsertNode(N, IP);   // Memoize the new node.
+  if (Ordering) Ordering->add(N);
   return N;
 }
 
... ...
@@ -4759,8 +4894,10 @@ SelectionDAG::getMachineNode(unsigned Opcode, DebugLoc DL, SDVTList VTs,
     FoldingSetNodeID ID;
     AddNodeIDNode(ID, ~Opcode, VTs, Ops, NumOps);
     IP = 0;
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return cast<MachineSDNode>(E);
+    }
   }
 
   // Allocate a new MachineSDNode.
... ...
@@ -4782,6 +4919,7 @@ SelectionDAG::getMachineNode(unsigned Opcode, DebugLoc DL, SDVTList VTs,
     CSEMap.InsertNode(N, IP);
 
   AllNodes.push_back(N);
+  if (Ordering) Ordering->add(N);
 #ifndef NDEBUG
   VerifyNode(N);
 #endif
... ...
@@ -4818,8 +4956,10 @@ SDNode *SelectionDAG::getNodeIfExists(unsigned Opcode, SDVTList VTList,
     FoldingSetNodeID ID;
     AddNodeIDNode(ID, Opcode, VTList, Ops, NumOps);
     void *IP = 0;
-    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP))
+    if (SDNode *E = CSEMap.FindNodeOrInsertPos(ID, IP)) {
+      if (Ordering) Ordering->add(E);
       return E;
+    }
   }
   return NULL;
 }
... ...
@@ -5986,6 +6126,9 @@ void SelectionDAG::dump() const {
   errs() << "\n\n";
 }
 
+void SelectionDAG::NodeOrdering::dump() const {
+}
+
 void SDNode::printr(raw_ostream &OS, const SelectionDAG *G) const {
   print_types(OS, G);
   print_details(OS, G);
... ...
@@ -6126,4 +6269,3 @@ bool ShuffleVectorSDNode::isSplatMask(const int *Mask, EVT VT) {
       return false;
   return true;
 }
-
... ...
@@ -583,6 +583,9 @@ void SelectionDAGBuilder::visit(Instruction &I) {
 }
 
 void SelectionDAGBuilder::visit(unsigned Opcode, User &I) {
+  // Tell the DAG that we're processing a new instruction.
+  DAG.NewInst();
+
   // Note: this doesn't use InstVisitor, because it has to work with
   // ConstantExpr's in addition to instructions.
   switch (Opcode) {
... ...
@@ -390,7 +390,7 @@ static void ResetDebugLoc(SelectionDAGBuilder *SDB,
                           FastISel *FastIS) {
   SDB->setCurDebugLoc(DebugLoc::getUnknownLoc());
   if (FastIS)
-    SDB->setCurDebugLoc(DebugLoc::getUnknownLoc());
+    FastIS->setCurDebugLoc(DebugLoc::getUnknownLoc());
 }
 
 void SelectionDAGISel::SelectBasicBlock(BasicBlock *LLVMBB,
... ...
@@ -2622,114 +2622,6 @@ void SimpleRegisterCoalescing::releaseMemory() {
   ReMatDefs.clear();
 }
 
-/// Returns true if the given live interval is zero length.
-static bool isZeroLengthInterval(LiveInterval *li, LiveIntervals *li_) {
-  for (LiveInterval::Ranges::const_iterator
-         i = li->ranges.begin(), e = li->ranges.end(); i != e; ++i)
-    if (i->end.getPrevIndex() > i->start)
-      return false;
-  return true;
-}
-
-
-void SimpleRegisterCoalescing::CalculateSpillWeights() {
-  SmallSet<unsigned, 4> Processed;
-  for (MachineFunction::iterator mbbi = mf_->begin(), mbbe = mf_->end();
-       mbbi != mbbe; ++mbbi) {
-    MachineBasicBlock* MBB = mbbi;
-    SlotIndex MBBEnd = li_->getMBBEndIdx(MBB);
-    MachineLoop* loop = loopInfo->getLoopFor(MBB);
-    unsigned loopDepth = loop ? loop->getLoopDepth() : 0;
-    bool isExiting = loop ? loop->isLoopExiting(MBB) : false;
-
-    for (MachineBasicBlock::const_iterator mii = MBB->begin(), mie = MBB->end();
-         mii != mie; ++mii) {
-      const MachineInstr *MI = mii;
-      if (tii_->isIdentityCopy(*MI))
-        continue;
-
-      if (MI->getOpcode() == TargetInstrInfo::IMPLICIT_DEF)
-        continue;
-
-      for (unsigned i = 0, e = MI->getNumOperands(); i != e; ++i) {
-        const MachineOperand &mopi = MI->getOperand(i);
-        if (!mopi.isReg() || mopi.getReg() == 0)
-          continue;
-        unsigned Reg = mopi.getReg();
-        if (!TargetRegisterInfo::isVirtualRegister(mopi.getReg()))
-          continue;
-        // Multiple uses of reg by the same instruction. It should not
-        // contribute to spill weight again.
-        if (!Processed.insert(Reg))
-          continue;
-
-        bool HasDef = mopi.isDef();
-        bool HasUse = !HasDef;
-        for (unsigned j = i+1; j != e; ++j) {
-          const MachineOperand &mopj = MI->getOperand(j);
-          if (!mopj.isReg() || mopj.getReg() != Reg)
-            continue;
-          HasDef |= mopj.isDef();
-          HasUse |= mopj.isUse();
-          if (HasDef && HasUse)
-            break;
-        }
-
-        LiveInterval &RegInt = li_->getInterval(Reg);
-        float Weight = li_->getSpillWeight(HasDef, HasUse, loopDepth);
-        if (HasDef && isExiting) {
-          // Looks like this is a loop count variable update.
-          SlotIndex DefIdx = li_->getInstructionIndex(MI).getDefIndex();
-          const LiveRange *DLR =
-            li_->getInterval(Reg).getLiveRangeContaining(DefIdx);
-          if (DLR->end > MBBEnd)
-            Weight *= 3.0F;
-        }
-        RegInt.weight += Weight;
-      }
-      Processed.clear();
-    }
-  }
-
-  for (LiveIntervals::iterator I = li_->begin(), E = li_->end(); I != E; ++I) {
-    LiveInterval &LI = *I->second;
-    if (TargetRegisterInfo::isVirtualRegister(LI.reg)) {
-      // If the live interval length is essentially zero, i.e. in every live
-      // range the use follows def immediately, it doesn't make sense to spill
-      // it and hope it will be easier to allocate for this li.
-      if (isZeroLengthInterval(&LI, li_)) {
-        LI.weight = HUGE_VALF;
-        continue;
-      }
-
-      bool isLoad = false;
-      SmallVector<LiveInterval*, 4> SpillIs;
-      if (li_->isReMaterializable(LI, SpillIs, isLoad)) {
-        // If all of the definitions of the interval are re-materializable,
-        // it is a preferred candidate for spilling. If non of the defs are
-        // loads, then it's potentially very cheap to re-materialize.
-        // FIXME: this gets much more complicated once we support non-trivial
-        // re-materialization.
-        if (isLoad)
-          LI.weight *= 0.9F;
-        else
-          LI.weight *= 0.5F;
-      }
-
-      // Slightly prefer live interval that has been assigned a preferred reg.
-      std::pair<unsigned, unsigned> Hint = mri_->getRegAllocationHint(LI.reg);
-      if (Hint.first || Hint.second)
-        LI.weight *= 1.01F;
-
-      // Divide the weight of the interval by its size.  This encourages
-      // spilling of intervals that are large and have few uses, and
-      // discourages spilling of small intervals with many uses.
-      LI.weight /= li_->getApproximateInstructionCount(LI) * InstrSlots::NUM;
-    }
-  }
-}
-
-
 bool SimpleRegisterCoalescing::runOnMachineFunction(MachineFunction &fn) {
   mf_ = &fn;
   mri_ = &fn.getRegInfo();
... ...
@@ -2860,8 +2752,6 @@ bool SimpleRegisterCoalescing::runOnMachineFunction(MachineFunction &fn) {
2860 2860
     }
2861 2861
   }
2862 2862
 
2863
-  CalculateSpillWeights();
2864
-
2865 2863
   DEBUG(dump());
2866 2864
   return true;
2867 2865
 }
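
A note on the heuristic being removed above: each def or use of a virtual
register adds li_->getSpillWeight(HasDef, HasUse, loopDepth) to its interval,
defs that live across a loop exit get a 3x boost, and the total is divided by
the interval's size. A minimal sketch of that shape, assuming getSpillWeight
grows roughly like 10^loopDepth (the names and the exact scaling here are
illustrative, not LLVM's API):

#include <cmath>

// Hypothetical distillation of the CalculateSpillWeights() arithmetic.
float spillWeight(unsigned NumDefsUses, unsigned LoopDepth,
                  bool LiveAcrossLoopExit, unsigned IntervalSize) {
  // Occurrences count more the deeper the loop they sit in.
  float W = NumDefsUses * std::pow(10.0f, float(LoopDepth));
  if (LiveAcrossLoopExit)
    W *= 3.0f;              // likely a loop-counter update; discourage spilling
  return W / IntervalSize;  // large, rarely-touched intervals spill first
}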
... ...
@@ -244,10 +244,6 @@ namespace llvm {
     MachineOperand *lastRegisterUse(SlotIndex Start, SlotIndex End,
                                     unsigned Reg, SlotIndex &LastUseIdx) const;
 
-    /// CalculateSpillWeights - Compute spill weights for all virtual register
-    /// live intervals.
-    void CalculateSpillWeights();
-
     void printRegName(unsigned reg) const;
   };
 
... ...
@@ -90,7 +90,8 @@ namespace {
                              SmallSetVector<MachineBasicBlock*, 8> &Succs);
     bool TailDuplicateBlocks(MachineFunction &MF);
     bool TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
-                       SmallVector<MachineBasicBlock*, 8> &TDBBs);
+                       SmallVector<MachineBasicBlock*, 8> &TDBBs,
+                       SmallVector<MachineInstr*, 16> &Copies);
     void RemoveDeadBlock(MachineBasicBlock *MBB);
   };
 
... ...
@@ -194,7 +195,8 @@ bool TailDuplicatePass::TailDuplicateBlocks(MachineFunction &MF) {
                                                 MBB->succ_end());
 
     SmallVector<MachineBasicBlock*, 8> TDBBs;
-    if (TailDuplicate(MBB, MF, TDBBs)) {
+    SmallVector<MachineInstr*, 16> Copies;
+    if (TailDuplicate(MBB, MF, TDBBs, Copies)) {
       ++NumTails;
 
       // TailBB's immediate successors are now successors of those predecessors
... ...
@@ -251,6 +253,21 @@ bool TailDuplicatePass::TailDuplicateBlocks(MachineFunction &MF) {
         SSAUpdateVals.clear();
       }
 
+      // Eliminate some of the copies inserted by tail duplication to maintain
+      // SSA form.
+      for (unsigned i = 0, e = Copies.size(); i != e; ++i) {
+        MachineInstr *Copy = Copies[i];
+        unsigned Src, Dst, SrcSR, DstSR;
+        if (TII->isMoveInstr(*Copy, Src, Dst, SrcSR, DstSR)) {
+          MachineRegisterInfo::use_iterator UI = MRI->use_begin(Src);
+          if (++UI == MRI->use_end()) {
+            // Copy is the only use. Do trivial copy propagation here.
+            MRI->replaceRegWith(Dst, Src);
+            Copy->eraseFromParent();
+          }
+        }
+      }
+
       if (PreRegAlloc && TailDupVerify)
         VerifyPHIs(MF, false);
       MadeChange = true;
... ...
@@ -418,7 +435,8 @@ TailDuplicatePass::UpdateSuccessorsPHIs(MachineBasicBlock *FromBB, bool isDead,
 /// of its predecessors.
 bool
 TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
-                                 SmallVector<MachineBasicBlock*, 8> &TDBBs) {
+                                 SmallVector<MachineBasicBlock*, 8> &TDBBs,
+                                 SmallVector<MachineInstr*, 16> &Copies) {
   // Don't try to tail-duplicate single-block loops.
   if (TailBB->isSuccessor(TailBB))
     return false;
... ...
@@ -502,7 +520,7 @@ TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
 
     // Clone the contents of TailBB into PredBB.
     DenseMap<unsigned, unsigned> LocalVRMap;
-    SmallVector<std::pair<unsigned,unsigned>, 4> Copies;
+    SmallVector<std::pair<unsigned,unsigned>, 4> CopyInfos;
     MachineBasicBlock::iterator I = TailBB->begin();
     while (I != TailBB->end()) {
       MachineInstr *MI = &*I;
... ...
@@ -510,7 +528,7 @@ TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
       if (MI->getOpcode() == TargetInstrInfo::PHI) {
         // Replace the uses of the def of the PHI with the register coming
         // from PredBB.
-        ProcessPHI(MI, TailBB, PredBB, LocalVRMap, Copies);
+        ProcessPHI(MI, TailBB, PredBB, LocalVRMap, CopyInfos);
       } else {
         // Replace def of virtual registers with new registers, and update
         // uses with PHI source register or the new registers.
... ...
@@ -518,9 +536,12 @@ TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
       }
     }
     MachineBasicBlock::iterator Loc = PredBB->getFirstTerminator();
-    for (unsigned i = 0, e = Copies.size(); i != e; ++i) {
-      const TargetRegisterClass *RC = MRI->getRegClass(Copies[i].first);
-      TII->copyRegToReg(*PredBB, Loc, Copies[i].first, Copies[i].second, RC, RC);
+    for (unsigned i = 0, e = CopyInfos.size(); i != e; ++i) {
+      const TargetRegisterClass *RC = MRI->getRegClass(CopyInfos[i].first);
+      TII->copyRegToReg(*PredBB, Loc, CopyInfos[i].first,
+                        CopyInfos[i].second, RC, RC);
+      MachineInstr *CopyMI = prior(Loc);
+      Copies.push_back(CopyMI);
     }
     NumInstrDups += TailBB->size() - 1; // subtract one for removed branch
 
... ...
@@ -553,14 +574,14 @@ TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
           << "From MBB: " << *TailBB);
     if (PreRegAlloc) {
       DenseMap<unsigned, unsigned> LocalVRMap;
-      SmallVector<std::pair<unsigned,unsigned>, 4> Copies;
+      SmallVector<std::pair<unsigned,unsigned>, 4> CopyInfos;
       MachineBasicBlock::iterator I = TailBB->begin();
       // Process PHI instructions first.
       while (I != TailBB->end() && I->getOpcode() == TargetInstrInfo::PHI) {
         // Replace the uses of the def of the PHI with the register coming
         // from PredBB.
         MachineInstr *MI = &*I++;
-        ProcessPHI(MI, TailBB, PrevBB, LocalVRMap, Copies);
+        ProcessPHI(MI, TailBB, PrevBB, LocalVRMap, CopyInfos);
         if (MI->getParent())
           MI->eraseFromParent();
       }
... ...
@@ -574,9 +595,12 @@ TailDuplicatePass::TailDuplicate(MachineBasicBlock *TailBB, MachineFunction &MF,
         MI->eraseFromParent();
       }
       MachineBasicBlock::iterator Loc = PrevBB->getFirstTerminator();
-      for (unsigned i = 0, e = Copies.size(); i != e; ++i) {
-        const TargetRegisterClass *RC = MRI->getRegClass(Copies[i].first);
-        TII->copyRegToReg(*PrevBB, Loc, Copies[i].first, Copies[i].second, RC, RC);
+      for (unsigned i = 0, e = CopyInfos.size(); i != e; ++i) {
+        const TargetRegisterClass *RC = MRI->getRegClass(CopyInfos[i].first);
+        TII->copyRegToReg(*PrevBB, Loc, CopyInfos[i].first,
+                          CopyInfos[i].second, RC, RC);
+        MachineInstr *CopyMI = prior(Loc);
+        Copies.push_back(CopyMI);
       }
     } else {
       // No PHIs to worry about, just splice the instructions over.
... ...
@@ -208,7 +208,7 @@ ExecutionEngine *JIT::createJIT(ModuleProvider *MP,
                                 JITMemoryManager *JMM,
                                 CodeGenOpt::Level OptLevel,
                                 bool GVsWithCode,
-				CodeModel::Model CMM) {
+                                CodeModel::Model CMM) {
   // Make sure we can resolve symbols in the program as well. The zero arg
   // to the function tells DynamicLibrary to load the program, not a library.
   if (sys::DynamicLibrary::LoadLibraryPermanently(0, ErrorStr))
... ...
@@ -681,7 +681,7 @@ void *JIT::getOrEmitGlobalVariable(const GlobalVariable *GV) {
   if (Ptr) return Ptr;
 
   // If the global is external, just remember the address.
-  if (GV->isDeclaration()) {
+  if (GV->isDeclaration() || GV->hasAvailableExternallyLinkage()) {
 #if HAVE___DSO_HANDLE
     if (GV->getName() == "__dso_handle")
       return (void*)&__dso_handle;
... ...
@@ -175,7 +175,6 @@ struct KeyInfo {
   static inline unsigned getTombstoneKey() { return -2U; }
   static unsigned getHashValue(const unsigned &Key) { return Key; }
   static bool isEqual(unsigned LHS, unsigned RHS) { return LHS == RHS; }
-  static bool isPod() { return true; }
};
 
 /// ActionEntry - Structure describing an entry in the actions table.
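
The isPod() member removed here is superseded by the standalone isPodLike type
trait introduced elsewhere in this merge. A minimal sketch of how such a trait
is declared and specialized (the exact spelling in the tree may differ; treat
this as an assumption):

// Default: conservatively not pod-like. The real trait is more generous and
// also treats every non-class type as pod-like.
template <typename T> struct isPodLike {
  static const bool value = false;
};

// A class that merely acts like a pod opts in explicitly:
struct PackedIndex { unsigned Value; };  // hypothetical example type
template <> struct isPodLike<PackedIndex> {
  static const bool value = true;
};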
... ...
@@ -209,8 +209,7 @@ raw_ostream &raw_ostream::operator<<(const void *P) {
 }
 
 raw_ostream &raw_ostream::operator<<(double N) {
-  this->operator<<(ftostr(N));
-  return *this;
+  return this->operator<<(ftostr(N));
 }
 
 
... ...
@@ -103,11 +103,8 @@ static void DetectX86FamilyModel(unsigned EAX, unsigned &Family, unsigned &Model
     Model += ((EAX >> 16) & 0xf) << 4; // Bits 16 - 19
   }
 }
-#endif
-
 
 std::string sys::getHostCPUName() {
-#if defined(__x86_64__) || defined(__i386__) || defined (_MSC_VER)
   unsigned EAX = 0, EBX = 0, ECX = 0, EDX = 0;
   if (GetX86CpuIDAndInfo(0x1, &EAX, &EBX, &ECX, &EDX))
     return "generic";
... ...
@@ -295,7 +292,10 @@ std::string sys::getHostCPUName() {
       return "generic";
     }
   }
-#endif
-
   return "generic";
 }
+#else
+std::string sys::getHostCPUName() {
+  return "generic";
+}
+#endif
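
For readers unfamiliar with the family/model decoding this hunk reshuffles,
here is a hedged sketch using GCC/Clang's <cpuid.h> (an assumption; the tree
uses its own GetX86CpuIDAndInfo wrapper rather than this header):

#include <cpuid.h>
#include <cstdio>

int main() {
  unsigned EAX, EBX, ECX, EDX;
  if (!__get_cpuid(1, &EAX, &EBX, &ECX, &EDX))
    return 1;                             // CPUID leaf 1 unavailable
  unsigned Family = (EAX >> 8) & 0xf;     // bits 8 - 11
  unsigned Model  = (EAX >> 4) & 0xf;     // bits 4 - 7
  if (Family == 6 || Family == 0xf) {
    if (Family == 0xf)
      Family += (EAX >> 20) & 0xff;       // extended family, bits 20 - 27
    Model += ((EAX >> 16) & 0xf) << 4;    // extended model, bits 16 - 19
  }
  std::printf("family %u model %u\n", Family, Model);
  return 0;
}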
... ...
@@ -1474,17 +1474,24 @@ ARMTargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op, SelectionDAG &DAG) {
   }
 }
 
-static SDValue LowerMEMBARRIER(SDValue Op, SelectionDAG &DAG) {
+static SDValue LowerMEMBARRIER(SDValue Op, SelectionDAG &DAG,
+                               const ARMSubtarget *Subtarget) {
   DebugLoc dl = Op.getDebugLoc();
   SDValue Op5 = Op.getOperand(5);
   SDValue Res;
   unsigned isDeviceBarrier = cast<ConstantSDNode>(Op5)->getZExtValue();
   if (isDeviceBarrier) {
-    Res = DAG.getNode(ARMISD::SYNCBARRIER, dl, MVT::Other,
-                              Op.getOperand(0));
+    if (Subtarget->hasV7Ops())
+      Res = DAG.getNode(ARMISD::SYNCBARRIER, dl, MVT::Other, Op.getOperand(0));
+    else
+      Res = DAG.getNode(ARMISD::SYNCBARRIER, dl, MVT::Other, Op.getOperand(0),
+                        DAG.getConstant(0, MVT::i32));
   } else {
-    Res = DAG.getNode(ARMISD::MEMBARRIER, dl, MVT::Other,
-                              Op.getOperand(0));
+    if (Subtarget->hasV7Ops())
+      Res = DAG.getNode(ARMISD::MEMBARRIER, dl, MVT::Other, Op.getOperand(0));
+    else
+      Res = DAG.getNode(ARMISD::MEMBARRIER, dl, MVT::Other, Op.getOperand(0),
+                        DAG.getConstant(0, MVT::i32));
   }
   return Res;
 }
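
The two code paths above correspond to how the barrier is actually spelled on
each architecture revision: ARMv7 has dedicated dmb/dsb instructions, while
ARMv6 expresses the same full-system barrier as a CP15 write of zero (see the
Int_MemBarrierV6/Int_SyncBarrierV6 patterns later in this commit). Sketched as
GCC-style inline assembly (illustrative only):

inline void memory_barrier_v7() {
  __asm__ __volatile__("dmb" ::: "memory");
}

inline void memory_barrier_v6() {
  // "mcr p15, 0, <reg>, c7, c10, 5" with a zero register is the v6 DMB.
  __asm__ __volatile__("mcr p15, 0, %0, c7, c10, 5" :: "r"(0) : "memory");
}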
... ...
@@ -2991,7 +2998,7 @@ SDValue ARMTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
   case ISD::BR_JT:         return LowerBR_JT(Op, DAG);
   case ISD::DYNAMIC_STACKALLOC: return LowerDYNAMIC_STACKALLOC(Op, DAG);
   case ISD::VASTART:       return LowerVASTART(Op, DAG, VarArgsFrameIndex);
-  case ISD::MEMBARRIER:    return LowerMEMBARRIER(Op, DAG);
+  case ISD::MEMBARRIER:    return LowerMEMBARRIER(Op, DAG, Subtarget);
   case ISD::SINT_TO_FP:
   case ISD::UINT_TO_FP:    return LowerINT_TO_FP(Op, DAG);
   case ISD::FP_TO_SINT:
... ...
@@ -3055,13 +3062,23 @@ ARMTargetLowering::EmitAtomicCmpSwap(MachineInstr *MI,
     .createVirtualRegister(ARM::GPRRegisterClass);
   const TargetInstrInfo *TII = getTargetMachine().getInstrInfo();
   DebugLoc dl = MI->getDebugLoc();
+  bool isThumb2 = Subtarget->isThumb2();
 
   unsigned ldrOpc, strOpc;
   switch (Size) {
   default: llvm_unreachable("unsupported size for AtomicCmpSwap!");
-  case 1: ldrOpc = ARM::LDREXB; strOpc = ARM::STREXB; break;
-  case 2: ldrOpc = ARM::LDREXH; strOpc = ARM::STREXH; break;
-  case 4: ldrOpc = ARM::LDREX;  strOpc = ARM::STREX;  break;
+  case 1:
+    ldrOpc = isThumb2 ? ARM::t2LDREXB : ARM::LDREXB;
+    strOpc = isThumb2 ? ARM::t2STREXB : ARM::STREXB;
+    break;
+  case 2:
+    ldrOpc = isThumb2 ? ARM::t2LDREXH : ARM::LDREXH;
+    strOpc = isThumb2 ? ARM::t2STREXH : ARM::STREXH;
+    break;
+  case 4:
+    ldrOpc = isThumb2 ? ARM::t2LDREX : ARM::LDREX;
+    strOpc = isThumb2 ? ARM::t2STREX : ARM::STREX;
+    break;
   }
 
   MachineFunction *MF = BB->getParent();
... ...
@@ -3088,10 +3105,10 @@ ARMTargetLowering::EmitAtomicCmpSwap(MachineInstr *MI,
   //   bne exitMBB
   BB = loop1MBB;
   AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), dest).addReg(ptr));
-  AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::CMPrr))
+  AddDefaultPred(BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2CMPrr : ARM::CMPrr))
                  .addReg(dest).addReg(oldval));
-  BuildMI(BB, dl, TII->get(ARM::Bcc)).addMBB(exitMBB).addImm(ARMCC::NE)
-    .addReg(ARM::CPSR);
+  BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2Bcc : ARM::Bcc))
+    .addMBB(exitMBB).addImm(ARMCC::NE).addReg(ARM::CPSR);
   BB->addSuccessor(loop2MBB);
   BB->addSuccessor(exitMBB);
 
... ...
@@ -3102,10 +3119,10 @@ ARMTargetLowering::EmitAtomicCmpSwap(MachineInstr *MI,
   BB = loop2MBB;
   AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), scratch).addReg(newval)
                  .addReg(ptr));
-  AddDefaultPred(BuildMI(BB, dl, TII->get(ARM::CMPri))
+  AddDefaultPred(BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2CMPri : ARM::CMPri))
                  .addReg(scratch).addImm(0));
-  BuildMI(BB, dl, TII->get(ARM::Bcc)).addMBB(loop1MBB).addImm(ARMCC::NE)
-    .addReg(ARM::CPSR);
+  BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2Bcc : ARM::Bcc))
+    .addMBB(loop1MBB).addImm(ARMCC::NE).addReg(ARM::CPSR);
   BB->addSuccessor(loop1MBB);
   BB->addSuccessor(exitMBB);
 
... ...
@@ -3118,11 +3135,85 @@ ARMTargetLowering::EmitAtomicCmpSwap(MachineInstr *MI,
 MachineBasicBlock *
 ARMTargetLowering::EmitAtomicBinary(MachineInstr *MI, MachineBasicBlock *BB,
                                     unsigned Size, unsigned BinOpcode) const {
-  std::string msg;
-  raw_string_ostream Msg(msg);
-  Msg << "Cannot yet emit: ";
-  MI->print(Msg);
-  llvm_report_error(Msg.str());
+  // This also handles ATOMIC_SWAP, indicated by BinOpcode==0.
+  const TargetInstrInfo *TII = getTargetMachine().getInstrInfo();
+
+  const BasicBlock *LLVM_BB = BB->getBasicBlock();
+  MachineFunction *F = BB->getParent();
+  MachineFunction::iterator It = BB;
+  ++It;
+
+  unsigned dest = MI->getOperand(0).getReg();
+  unsigned ptr = MI->getOperand(1).getReg();
+  unsigned incr = MI->getOperand(2).getReg();
+  DebugLoc dl = MI->getDebugLoc();
+  bool isThumb2 = Subtarget->isThumb2();
+  unsigned ldrOpc, strOpc;
+  switch (Size) {
+  default: llvm_unreachable("unsupported size for AtomicBinary!");
+  case 1:
+    ldrOpc = isThumb2 ? ARM::t2LDREXB : ARM::LDREXB;
+    strOpc = isThumb2 ? ARM::t2STREXB : ARM::STREXB;
+    break;
+  case 2:
+    ldrOpc = isThumb2 ? ARM::t2LDREXH : ARM::LDREXH;
+    strOpc = isThumb2 ? ARM::t2STREXH : ARM::STREXH;
+    break;
+  case 4:
+    ldrOpc = isThumb2 ? ARM::t2LDREX : ARM::LDREX;
+    strOpc = isThumb2 ? ARM::t2STREX : ARM::STREX;
+    break;
+  }
+
+  MachineBasicBlock *loopMBB = F->CreateMachineBasicBlock(LLVM_BB);
+  MachineBasicBlock *exitMBB = F->CreateMachineBasicBlock(LLVM_BB);
+  F->insert(It, loopMBB);
+  F->insert(It, exitMBB);
+  exitMBB->transferSuccessors(BB);
+
+  MachineRegisterInfo &RegInfo = F->getRegInfo();
+  unsigned scratch = RegInfo.createVirtualRegister(ARM::GPRRegisterClass);
+  unsigned scratch2 = (!BinOpcode) ? incr :
+    RegInfo.createVirtualRegister(ARM::GPRRegisterClass);
+
+  //  thisMBB:
+  //   ...
+  //   fallthrough --> loopMBB
+  BB->addSuccessor(loopMBB);
+
+  //  loopMBB:
+  //   ldrex dest, ptr
+  //   <binop> scratch2, dest, incr
+  //   strex scratch, scratch2, ptr
+  //   cmp scratch, #0
+  //   bne- loopMBB
+  //   fallthrough --> exitMBB
+  BB = loopMBB;
+  AddDefaultPred(BuildMI(BB, dl, TII->get(ldrOpc), dest).addReg(ptr));
+  if (BinOpcode) {
+    // operand order needs to go the other way for NAND
+    if (BinOpcode == ARM::BICrr || BinOpcode == ARM::t2BICrr)
+      AddDefaultPred(BuildMI(BB, dl, TII->get(BinOpcode), scratch2).
+                     addReg(incr).addReg(dest)).addReg(0);
+    else
+      AddDefaultPred(BuildMI(BB, dl, TII->get(BinOpcode), scratch2).
+                     addReg(dest).addReg(incr)).addReg(0);
+  }
+
+  AddDefaultPred(BuildMI(BB, dl, TII->get(strOpc), scratch).addReg(scratch2)
+                 .addReg(ptr));
+  AddDefaultPred(BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2CMPri : ARM::CMPri))
+                 .addReg(scratch).addImm(0));
+  BuildMI(BB, dl, TII->get(isThumb2 ? ARM::t2Bcc : ARM::Bcc))
+    .addMBB(loopMBB).addImm(ARMCC::NE).addReg(ARM::CPSR);
+
+  BB->addSuccessor(loopMBB);
+  BB->addSuccessor(exitMBB);
+
+  //  exitMBB:
+  //   ...
+  BB = exitMBB;
+  return BB;
 }
 
 MachineBasicBlock *
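
The comment block inside the new EmitAtomicBinary lays out the load-exclusive/
store-exclusive retry loop. A rough model of its semantics (not the emitted
code), written with std::atomic for an add:

#include <atomic>

// Returns the pre-op value, which is what "dest" holds in the pseudocode.
unsigned atomic_fetch_add_model(std::atomic<unsigned> &Mem, unsigned Incr) {
  unsigned Old = Mem.load();                       // ldrex dest, ptr
  unsigned New;
  do {
    New = Old + Incr;                              // <binop> scratch2, dest, incr
  } while (!Mem.compare_exchange_weak(Old, New));  // strex failed; loop back
  return Old;
}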
... ...
@@ -3131,38 +3222,57 @@ ARMTargetLowering::EmitInstrWithCustomInserter(MachineInstr *MI,
                    DenseMap<MachineBasicBlock*, MachineBasicBlock*> *EM) const {
   const TargetInstrInfo *TII = getTargetMachine().getInstrInfo();
   DebugLoc dl = MI->getDebugLoc();
+  bool isThumb2 = Subtarget->isThumb2();
   switch (MI->getOpcode()) {
   default:
     MI->dump();
     llvm_unreachable("Unexpected instr type to insert");
 
-  case ARM::ATOMIC_LOAD_ADD_I8:  return EmitAtomicBinary(MI, BB, 1, ARM::ADDrr);
-  case ARM::ATOMIC_LOAD_ADD_I16: return EmitAtomicBinary(MI, BB, 2, ARM::ADDrr);
-  case ARM::ATOMIC_LOAD_ADD_I32: return EmitAtomicBinary(MI, BB, 4, ARM::ADDrr);
-
-  case ARM::ATOMIC_LOAD_AND_I8:  return EmitAtomicBinary(MI, BB, 1, ARM::ANDrr);
-  case ARM::ATOMIC_LOAD_AND_I16: return EmitAtomicBinary(MI, BB, 2, ARM::ANDrr);
-  case ARM::ATOMIC_LOAD_AND_I32: return EmitAtomicBinary(MI, BB, 4, ARM::ANDrr);
-
-  case ARM::ATOMIC_LOAD_OR_I8:   return EmitAtomicBinary(MI, BB, 1, ARM::ORRrr);
-  case ARM::ATOMIC_LOAD_OR_I16:  return EmitAtomicBinary(MI, BB, 2, ARM::ORRrr);
-  case ARM::ATOMIC_LOAD_OR_I32:  return EmitAtomicBinary(MI, BB, 4, ARM::ORRrr);
-
-  case ARM::ATOMIC_LOAD_XOR_I8:  return EmitAtomicBinary(MI, BB, 1, ARM::EORrr);
-  case ARM::ATOMIC_LOAD_XOR_I16: return EmitAtomicBinary(MI, BB, 2, ARM::EORrr);
-  case ARM::ATOMIC_LOAD_XOR_I32: return EmitAtomicBinary(MI, BB, 4, ARM::EORrr);
-
-  case ARM::ATOMIC_LOAD_NAND_I8: return EmitAtomicBinary(MI, BB, 1, ARM::BICrr);
-  case ARM::ATOMIC_LOAD_NAND_I16:return EmitAtomicBinary(MI, BB, 2, ARM::BICrr);
-  case ARM::ATOMIC_LOAD_NAND_I32:return EmitAtomicBinary(MI, BB, 4, ARM::BICrr);
-
-  case ARM::ATOMIC_LOAD_SUB_I8:  return EmitAtomicBinary(MI, BB, 1, ARM::SUBrr);
-  case ARM::ATOMIC_LOAD_SUB_I16: return EmitAtomicBinary(MI, BB, 2, ARM::SUBrr);
-  case ARM::ATOMIC_LOAD_SUB_I32: return EmitAtomicBinary(MI, BB, 4, ARM::SUBrr);
-
-  case ARM::ATOMIC_SWAP_I8:      return EmitAtomicBinary(MI, BB, 1, 0);
-  case ARM::ATOMIC_SWAP_I16:     return EmitAtomicBinary(MI, BB, 2, 0);
-  case ARM::ATOMIC_SWAP_I32:     return EmitAtomicBinary(MI, BB, 4, 0);
+  case ARM::ATOMIC_LOAD_ADD_I8:
+     return EmitAtomicBinary(MI, BB, 1, isThumb2 ? ARM::t2ADDrr : ARM::ADDrr);
+  case ARM::ATOMIC_LOAD_ADD_I16:
+     return EmitAtomicBinary(MI, BB, 2, isThumb2 ? ARM::t2ADDrr : ARM::ADDrr);
+  case ARM::ATOMIC_LOAD_ADD_I32:
+     return EmitAtomicBinary(MI, BB, 4, isThumb2 ? ARM::t2ADDrr : ARM::ADDrr);
+
+  case ARM::ATOMIC_LOAD_AND_I8:
+     return EmitAtomicBinary(MI, BB, 1, isThumb2 ? ARM::t2ANDrr : ARM::ANDrr);
+  case ARM::ATOMIC_LOAD_AND_I16:
+     return EmitAtomicBinary(MI, BB, 2, isThumb2 ? ARM::t2ANDrr : ARM::ANDrr);
+  case ARM::ATOMIC_LOAD_AND_I32:
+     return EmitAtomicBinary(MI, BB, 4, isThumb2 ? ARM::t2ANDrr : ARM::ANDrr);
+
+  case ARM::ATOMIC_LOAD_OR_I8:
+     return EmitAtomicBinary(MI, BB, 1, isThumb2 ? ARM::t2ORRrr : ARM::ORRrr);
+  case ARM::ATOMIC_LOAD_OR_I16:
+     return EmitAtomicBinary(MI, BB, 2, isThumb2 ? ARM::t2ORRrr : ARM::ORRrr);
+  case ARM::ATOMIC_LOAD_OR_I32:
+     return EmitAtomicBinary(MI, BB, 4, isThumb2 ? ARM::t2ORRrr : ARM::ORRrr);
+
+  case ARM::ATOMIC_LOAD_XOR_I8:
+     return EmitAtomicBinary(MI, BB, 1, isThumb2 ? ARM::t2EORrr : ARM::EORrr);
+  case ARM::ATOMIC_LOAD_XOR_I16:
+     return EmitAtomicBinary(MI, BB, 2, isThumb2 ? ARM::t2EORrr : ARM::EORrr);
+  case ARM::ATOMIC_LOAD_XOR_I32:
+     return EmitAtomicBinary(MI, BB, 4, isThumb2 ? ARM::t2EORrr : ARM::EORrr);
+
+  case ARM::ATOMIC_LOAD_NAND_I8:
+     return EmitAtomicBinary(MI, BB, 1, isThumb2 ? ARM::t2BICrr : ARM::BICrr);
+  case ARM::ATOMIC_LOAD_NAND_I16:
+     return EmitAtomicBinary(MI, BB, 2, isThumb2 ? ARM::t2BICrr : ARM::BICrr);
+  case ARM::ATOMIC_LOAD_NAND_I32:
+     return EmitAtomicBinary(MI, BB, 4, isThumb2 ? ARM::t2BICrr : ARM::BICrr);
+
+  case ARM::ATOMIC_LOAD_SUB_I8:
+     return EmitAtomicBinary(MI, BB, 1, isThumb2 ? ARM::t2SUBrr : ARM::SUBrr);
+  case ARM::ATOMIC_LOAD_SUB_I16:
+     return EmitAtomicBinary(MI, BB, 2, isThumb2 ? ARM::t2SUBrr : ARM::SUBrr);
+  case ARM::ATOMIC_LOAD_SUB_I32:
+     return EmitAtomicBinary(MI, BB, 4, isThumb2 ? ARM::t2SUBrr : ARM::SUBrr);
+
+  case ARM::ATOMIC_SWAP_I8:  return EmitAtomicBinary(MI, BB, 1, 0);
+  case ARM::ATOMIC_SWAP_I16: return EmitAtomicBinary(MI, BB, 2, 0);
+  case ARM::ATOMIC_SWAP_I32: return EmitAtomicBinary(MI, BB, 4, 0);
 
   case ARM::ATOMIC_CMP_SWAP_I8:  return EmitAtomicCmpSwap(MI, BB, 1);
   case ARM::ATOMIC_CMP_SWAP_I16: return EmitAtomicCmpSwap(MI, BB, 2);
... ...
@@ -201,6 +201,19 @@ class I<dag oops, dag iops, AddrMode am, SizeFlagVal sz,
   let Pattern = pattern;
   list<Predicate> Predicates = [IsARM];
 }
+// A few instructions are not predicable.
+class InoP<dag oops, dag iops, AddrMode am, SizeFlagVal sz,
+        IndexMode im, Format f, InstrItinClass itin,
+        string opc, string asm, string cstr,
+        list<dag> pattern>
+  : InstARM<am, sz, im, f, GenericDomain, cstr, itin> {
+  let OutOperandList = oops;
+  let InOperandList = iops;
+  let AsmString   = !strconcat(opc, asm);
+  let Pattern = pattern;
+  let isPredicable = 0;
+  list<Predicate> Predicates = [IsARM];
+}
 
 // Same as I except it can optionally modify CPSR. Note it's modeled as
 // an input operand since by default it's a zero register. It will
... ...
@@ -241,6 +254,10 @@ class AXI<dag oops, dag iops, Format f, InstrItinClass itin,
           string asm, list<dag> pattern>
   : XI<oops, iops, AddrModeNone, Size4Bytes, IndexModeNone, f, itin,
        asm, "", pattern>;
+class AInoP<dag oops, dag iops, Format f, InstrItinClass itin,
+         string opc, string asm, list<dag> pattern>
+  : InoP<oops, iops, AddrModeNone, Size4Bytes, IndexModeNone, f, itin,
+      opc, asm, "", pattern>;
 
 // Ctrl flow instructions
 class ABI<bits<4> opcod, dag oops, dag iops, InstrItinClass itin,
... ...
@@ -46,8 +46,10 @@ def SDT_ARMPICAdd  : SDTypeProfile<1, 2, [SDTCisSameAs<0, 1>,
 def SDT_ARMThreadPointer : SDTypeProfile<1, 0, [SDTCisPtrTy<0>]>;
 def SDT_ARMEH_SJLJ_Setjmp : SDTypeProfile<1, 1, [SDTCisInt<0>, SDTCisPtrTy<1>]>;
 
-def SDT_ARMMEMBARRIER  : SDTypeProfile<0, 0, []>;
-def SDT_ARMSYNCBARRIER : SDTypeProfile<0, 0, []>;
+def SDT_ARMMEMBARRIERV7  : SDTypeProfile<0, 0, []>;
+def SDT_ARMSYNCBARRIERV7 : SDTypeProfile<0, 0, []>;
+def SDT_ARMMEMBARRIERV6  : SDTypeProfile<0, 1, [SDTCisInt<0>]>;
+def SDT_ARMSYNCBARRIERV6 : SDTypeProfile<0, 1, [SDTCisInt<0>]>;
 
 // Node definitions.
 def ARMWrapper       : SDNode<"ARMISD::Wrapper",     SDTIntUnaryOp>;
... ...
@@ -96,9 +98,13 @@ def ARMrrx           : SDNode<"ARMISD::RRX"     , SDTIntUnaryOp, [SDNPInFlag ]>;
 def ARMthread_pointer: SDNode<"ARMISD::THREAD_POINTER", SDT_ARMThreadPointer>;
 def ARMeh_sjlj_setjmp: SDNode<"ARMISD::EH_SJLJ_SETJMP", SDT_ARMEH_SJLJ_Setjmp>;
 
-def ARMMemBarrier    : SDNode<"ARMISD::MEMBARRIER", SDT_ARMMEMBARRIER,
+def ARMMemBarrierV7  : SDNode<"ARMISD::MEMBARRIER", SDT_ARMMEMBARRIERV7,
                               [SDNPHasChain]>;
-def ARMSyncBarrier   : SDNode<"ARMISD::SYNCBARRIER", SDT_ARMMEMBARRIER,
+def ARMSyncBarrierV7 : SDNode<"ARMISD::SYNCBARRIER", SDT_ARMMEMBARRIERV7,
+                              [SDNPHasChain]>;
+def ARMMemBarrierV6  : SDNode<"ARMISD::MEMBARRIER", SDT_ARMMEMBARRIERV6,
+                              [SDNPHasChain]>;
+def ARMSyncBarrierV6 : SDNode<"ARMISD::SYNCBARRIER", SDT_ARMMEMBARRIERV6,
                               [SDNPHasChain]>;
 
 //===----------------------------------------------------------------------===//
... ...
@@ -780,6 +786,7 @@ let isBranch = 1, isTerminator = 1 in {
   def BR_JTr : JTI<(outs), (ins GPR:$target, jtblock_operand:$jt, i32imm:$id),
                     IIC_Br, "mov\tpc, $target \n$jt",
                     [(ARMbrjt GPR:$target, tjumptable:$jt, imm:$id)]> {
+    let Inst{11-4}  = 0b00000000;
     let Inst{15-12} = 0b1111;
     let Inst{20}    = 0; // S Bit
     let Inst{24-21} = 0b1101;
... ...
@@ -1574,26 +1581,44 @@ def MOVCCi : AI1<0b1101, (outs GPR:$dst),
 //
 
 // memory barriers protect the atomic sequences
-let isPredicable = 0, hasSideEffects = 1 in {
-def Int_MemBarrierV7 : AI<(outs), (ins),
+let hasSideEffects = 1 in {
+def Int_MemBarrierV7 : AInoP<(outs), (ins),
                         Pseudo, NoItinerary,
                         "dmb", "",
-                        [(ARMMemBarrier)]>,
-                        Requires<[HasV7]> {
+                        [(ARMMemBarrierV7)]>,
+                        Requires<[IsARM, HasV7]> {
   let Inst{31-4} = 0xf57ff05;
   // FIXME: add support for options other than a full system DMB
   let Inst{3-0} = 0b1111;
 }
 
-def Int_SyncBarrierV7 : AI<(outs), (ins),
+def Int_SyncBarrierV7 : AInoP<(outs), (ins),
                         Pseudo, NoItinerary,
                         "dsb", "",
-                        [(ARMSyncBarrier)]>,
-                        Requires<[HasV7]> {
+                        [(ARMSyncBarrierV7)]>,
+                        Requires<[IsARM, HasV7]> {
   let Inst{31-4} = 0xf57ff04;
   // FIXME: add support for options other than a full system DSB
   let Inst{3-0} = 0b1111;
 }
+
+def Int_MemBarrierV6 : AInoP<(outs), (ins GPR:$zero),
+                       Pseudo, NoItinerary,
+                       "mcr", "\tp15, 0, $zero, c7, c10, 5",
+                       [(ARMMemBarrierV6 GPR:$zero)]>,
+                       Requires<[IsARM, HasV6]> {
+  // FIXME: add support for options other than a full system DMB
+  // FIXME: add encoding
+}
+
+def Int_SyncBarrierV6 : AInoP<(outs), (ins GPR:$zero),
+                        Pseudo, NoItinerary,
+                        "mcr", "\tp15, 0, $zero, c7, c10, 4",
+                        [(ARMSyncBarrierV6 GPR:$zero)]>,
+                        Requires<[IsARM, HasV6]> {
+  // FIXME: add support for options other than a full system DSB
+  // FIXME: add encoding
+}
 }
 
 let usesCustomInserter = 1 in {
... ...
@@ -1684,7 +1709,6 @@ let usesCustomInserter = 1 in {
       "${:comment} ATOMIC_SWAP_I32 PSEUDO!",
       [(set GPR:$dst, (atomic_swap_32 GPR:$ptr, GPR:$new))]>;
 
-
     def ATOMIC_CMP_SWAP_I8 : PseudoInst<
       (outs GPR:$dst), (ins GPR:$ptr, GPR:$old, GPR:$new), NoItinerary,
       "${:comment} ATOMIC_CMP_SWAP_I8 PSEUDO!",
... ...
@@ -1710,11 +1734,15 @@ def LDREXH : AIldrex<0b11, (outs GPR:$dest), (ins GPR:$ptr), NoItinerary,
 def LDREX  : AIldrex<0b00, (outs GPR:$dest), (ins GPR:$ptr), NoItinerary,
                     "ldrex", "\t$dest, [$ptr]",
                     []>;
+def LDREXD : AIldrex<0b01, (outs GPR:$dest, GPR:$dest2), (ins GPR:$ptr),
+                    NoItinerary,
+                    "ldrexd", "\t$dest, $dest2, [$ptr]",
+                    []>;
 }
 
 let mayStore = 1 in {
 def STREXB : AIstrex<0b10, (outs GPR:$success), (ins GPR:$src, GPR:$ptr),
-                     NoItinerary,
+                    NoItinerary,
                     "strexb", "\t$success, $src, [$ptr]",
                     []>;
 def STREXH : AIstrex<0b11, (outs GPR:$success), (ins GPR:$src, GPR:$ptr),
... ...
@@ -1722,9 +1750,14 @@ def STREXH : AIstrex<0b11, (outs GPR:$success), (ins GPR:$src, GPR:$ptr),
                     "strexh", "\t$success, $src, [$ptr]",
                     []>;
 def STREX  : AIstrex<0b00, (outs GPR:$success), (ins GPR:$src, GPR:$ptr),
-                     NoItinerary,
+                    NoItinerary,
                     "strex", "\t$success, $src, [$ptr]",
                     []>;
+def STREXD : AIstrex<0b01, (outs GPR:$success),
+                    (ins GPR:$src, GPR:$src2, GPR:$ptr),
+                    NoItinerary,
+                    "strexd", "\t$success, $src, $src2, [$ptr]",
+                    []>;
 }
 
 //===----------------------------------------------------------------------===//
... ...
@@ -1065,6 +1065,68 @@ def t2MOVCCror : T2I<(outs GPR:$dst), (ins GPR:$false, GPR:$true, i32imm:$rhs),
                    RegConstraint<"$false = $dst">;
 
 //===----------------------------------------------------------------------===//
+// Atomic operations intrinsics
+//
+
+// memory barriers protect the atomic sequences
+let hasSideEffects = 1 in {
+def t2Int_MemBarrierV7 : AInoP<(outs), (ins),
+                        Pseudo, NoItinerary,
+                        "dmb", "",
+                        [(ARMMemBarrierV7)]>,
+                        Requires<[IsThumb2]> {
+  // FIXME: add support for options other than a full system DMB
+}
+
+def t2Int_SyncBarrierV7 : AInoP<(outs), (ins),
+                        Pseudo, NoItinerary,
+                        "dsb", "",
+                        [(ARMSyncBarrierV7)]>,
+                        Requires<[IsThumb2]> {
+  // FIXME: add support for options other than a full system DSB
+}
+}
+
+let mayLoad = 1 in {
+def t2LDREXB : Thumb2I<(outs GPR:$dest), (ins GPR:$ptr), AddrModeNone,
+                      Size4Bytes, NoItinerary,
+                      "ldrexb", "\t$dest, [$ptr]", "",
+                      []>;
+def t2LDREXH : Thumb2I<(outs GPR:$dest), (ins GPR:$ptr), AddrModeNone,
+                      Size4Bytes, NoItinerary,
+                      "ldrexh", "\t$dest, [$ptr]", "",
+                      []>;
+def t2LDREX  : Thumb2I<(outs GPR:$dest), (ins GPR:$ptr), AddrModeNone,
+                      Size4Bytes, NoItinerary,
+                      "ldrex", "\t$dest, [$ptr]", "",
+                      []>;
+def t2LDREXD : Thumb2I<(outs GPR:$dest, GPR:$dest2), (ins GPR:$ptr),
+                      AddrModeNone, Size4Bytes, NoItinerary,
+                      "ldrexd", "\t$dest, $dest2, [$ptr]", "",
+                      []>;
+}
+
+let mayStore = 1 in {
+def t2STREXB : Thumb2I<(outs GPR:$success), (ins GPR:$src, GPR:$ptr),
+                      AddrModeNone, Size4Bytes, NoItinerary,
+                      "strexb", "\t$success, $src, [$ptr]", "",
+                      []>;
+def t2STREXH : Thumb2I<(outs GPR:$success), (ins GPR:$src, GPR:$ptr),
+                      AddrModeNone, Size4Bytes, NoItinerary,
+                      "strexh", "\t$success, $src, [$ptr]", "",
+                      []>;
+def t2STREX  : Thumb2I<(outs GPR:$success), (ins GPR:$src, GPR:$ptr),
+                      AddrModeNone, Size4Bytes, NoItinerary,
+                      "strex", "\t$success, $src, [$ptr]", "",
+                      []>;
+def t2STREXD : Thumb2I<(outs GPR:$success),
+                      (ins GPR:$src, GPR:$src2, GPR:$ptr),
+                      AddrModeNone, Size4Bytes, NoItinerary,
+                      "strexd", "\t$success, $src, $src2, [$ptr]", "",
+                      []>;
+}
+
+//===----------------------------------------------------------------------===//
 // TLS Instructions
 //
 
... ...
@@ -801,8 +801,21 @@ void bar(unsigned n) {
     true();
 }
 
-I think this basically amounts to a dag combine to simplify comparisons against
-multiply hi's into a comparison against the mullo.
+This is equivalent to the following, where 2863311531 is the multiplicative
+inverse of 3 modulo 2^32, and 1431655766 is ((2^32)-1)/3+1:
+void bar(unsigned n) {
+  if (n * 2863311531U < 1431655766U)
+    true();
+}
+
+The same transformation can be made to work with an even modulus by adding a
+rotate: rotate the result of the multiply to the right by the number of bits
+which need to be zero for the condition to be true, and shrink the compare RHS
+by the same amount.  Unless the target supports rotates, though, that
+transformation probably isn't worthwhile.
+
+The transformation can also easily be made to work with non-zero equality
+comparisons: just transform, for example, "n % 3 == 1" to "(n-1) % 3 == 0".
 
 //===---------------------------------------------------------------------===//
 
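The new README entry is easy to sanity-check. A quick exhaustive test of the
inverse-multiply trick over a 16-bit slice of the input space (illustrative):

#include <cassert>
#include <cstdint>

int main() {
  const uint32_t Inv3  = 2863311531u;   // 3 * Inv3 == 1 (mod 2^32)
  const uint32_t Bound = 1431655766u;   // ((2^32) - 1) / 3 + 1
  for (uint32_t n = 0; n < (1u << 16); ++n)
    assert((n % 3 == 0) == (n * Inv3 < Bound));
  return 0;
}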
... ...
@@ -823,20 +836,6 @@ int main() {
 
 //===---------------------------------------------------------------------===//
 
-Instcombine will merge comparisons like (x >= 10) && (x < 20) by producing (x -
-10) u< 10, but only when the comparisons have matching sign.
-
-This could be converted with a similiar technique. (PR1941)
-
-define i1 @test(i8 %x) {
-  %A = icmp uge i8 %x, 5
-  %B = icmp slt i8 %x, 20
-  %C = and i1 %A, %B
-  ret i1 %C
-}
-
-//===---------------------------------------------------------------------===//
-
 These functions perform the same computation, but produce different assembly.
 
 define i8 @select(i8 %x) readnone nounwind {
... ...
@@ -884,18 +883,6 @@ The expression should optimize to something like
 
 //===---------------------------------------------------------------------===//
 
-From GCC Bug 15241:
-unsigned int
-foo (unsigned int a, unsigned int b)
-{
- if (a <= 7 && b <= 7)
-   baz ();
-}
-Should combine to "(a|b) <= 7".  Currently not optimized with "clang
--emit-llvm-bc | opt -std-compile-opts".
-
-//===---------------------------------------------------------------------===//
-
 From GCC Bug 3756:
 int
 pn (int n)
... ...
@@ -907,19 +894,6 @@ Should combine to (n >> 31) | 1.  Currently not optimized with "clang
 
 //===---------------------------------------------------------------------===//
 
-From GCC Bug 28685:
-int test(int a, int b)
-{
- int lt = a < b;
- int eq = a == b;
-
- return (lt || eq);
-}
-Should combine to "a <= b".  Currently not optimized with "clang
--emit-llvm-bc | opt -std-compile-opts | llc".
-
-//===---------------------------------------------------------------------===//
-
 void a(int variable)
 {
  if (variable == 4 || variable == 6)
... ...
@@ -993,12 +967,6 @@ Should combine to 0.  Currently not optimized with "clang
 
 //===---------------------------------------------------------------------===//
 
-int a(unsigned char* b) {return *b > 99;}
-There's an unnecessary zext in the generated code with "clang
--emit-llvm-bc | opt -std-compile-opts".
-
-//===---------------------------------------------------------------------===//
-
 int a(unsigned b) {return ((b << 31) | (b << 30)) >> 31;}
 Should be combined to  "((b >> 1) | b) & 1".  Currently not optimized
 with "clang -emit-llvm-bc | opt -std-compile-opts".
... ...
@@ -1011,12 +979,6 @@ Should combine to "x | (y & 3)".  Currently not optimized with "clang
 
 //===---------------------------------------------------------------------===//
 
-unsigned a(unsigned a) {return ((a | 1) & 3) | (a & -4);}
-Should combine to "a | 1".  Currently not optimized with "clang
--emit-llvm-bc | opt -std-compile-opts".
-
-//===---------------------------------------------------------------------===//
-
 int a(int a, int b, int c) {return (~a & c) | ((c|a) & b);}
 Should fold to "(~a & c) | (a & b)".  Currently not optimized with
 "clang -emit-llvm-bc | opt -std-compile-opts".
... ...
@@ -64,11 +64,18 @@ def RetCC_X86_32_C : CallingConv<[
 // X86-32 FastCC return-value convention.
 def RetCC_X86_32_Fast : CallingConv<[
   // The X86-32 fastcc returns 1, 2, or 3 FP values in XMM0-2 if the target has
-  // SSE2, otherwise it is the the C calling conventions.
+  // SSE2.
   // This can happen when a float, 2 x float, or 3 x float vector is split by
   // target lowering, and is returned in 1-3 sse regs.
   CCIfType<[f32], CCIfSubtarget<"hasSSE2()", CCAssignToReg<[XMM0,XMM1,XMM2]>>>,
   CCIfType<[f64], CCIfSubtarget<"hasSSE2()", CCAssignToReg<[XMM0,XMM1,XMM2]>>>,
+
+  // For integers, ECX can be used as an extra return register
+  CCIfType<[i8],  CCAssignToReg<[AL, DL, CL]>>,
+  CCIfType<[i16], CCAssignToReg<[AX, DX, CX]>>,
+  CCIfType<[i32], CCAssignToReg<[EAX, EDX, ECX]>>,
+
+  // Otherwise, it is the same as the common X86 calling convention.
   CCDelegateTo<RetCC_X86Common>
 ]>;
 
... ...
@@ -596,6 +596,17 @@ X86TargetLowering::X86TargetLowering(X86TargetMachine &TM)
     setOperationAction(ISD::UINT_TO_FP, (MVT::SimpleValueType)VT, Expand);
     setOperationAction(ISD::SINT_TO_FP, (MVT::SimpleValueType)VT, Expand);
     setOperationAction(ISD::SIGN_EXTEND_INREG, (MVT::SimpleValueType)VT,Expand);
+    setOperationAction(ISD::TRUNCATE,  (MVT::SimpleValueType)VT, Expand);
+    setOperationAction(ISD::SIGN_EXTEND,  (MVT::SimpleValueType)VT, Expand);
+    setOperationAction(ISD::ZERO_EXTEND,  (MVT::SimpleValueType)VT, Expand);
+    setOperationAction(ISD::ANY_EXTEND,  (MVT::SimpleValueType)VT, Expand);
+    for (unsigned InnerVT = (unsigned)MVT::FIRST_VECTOR_VALUETYPE;
+         InnerVT <= (unsigned)MVT::LAST_VECTOR_VALUETYPE; ++InnerVT)
+      setTruncStoreAction((MVT::SimpleValueType)VT,
+                          (MVT::SimpleValueType)InnerVT, Expand);
+    setLoadExtAction(ISD::SEXTLOAD, (MVT::SimpleValueType)VT, Expand);
+    setLoadExtAction(ISD::ZEXTLOAD, (MVT::SimpleValueType)VT, Expand);
+    setLoadExtAction(ISD::EXTLOAD, (MVT::SimpleValueType)VT, Expand);
   }
 
   // FIXME: In order to prevent SSE instructions being expanded to MMX ones
... ...
@@ -672,8 +683,6 @@ X86TargetLowering::X86TargetLowering(X86TargetMachine &TM)
 
     setOperationAction(ISD::INSERT_VECTOR_ELT,  MVT::v4i16, Custom);
 
-    setTruncStoreAction(MVT::v8i16,             MVT::v8i8, Expand);
-    setOperationAction(ISD::TRUNCATE,           MVT::v8i8, Expand);
     setOperationAction(ISD::SELECT,             MVT::v8i8, Promote);
     setOperationAction(ISD::SELECT,             MVT::v4i16, Promote);
     setOperationAction(ISD::SELECT,             MVT::v2i32, Promote);
... ...
@@ -5741,6 +5750,17 @@ SDValue X86TargetLowering::LowerSETCC(SDValue Op, SelectionDAG &DAG) {
     return SDValue();
 
   SDValue Cond = EmitCmp(Op0, Op1, X86CC, DAG);
+
+  // Use sbb x, x to materialize carry bit into a GPR.
+  // FIXME: Temporarily disabled since it breaks self-hosting. It's apparently
+  // miscompiling ARMISelDAGToDAG.cpp.
+  if (0 && !isFP && X86CC == X86::COND_B) {
+    return DAG.getNode(ISD::AND, dl, MVT::i8,
+                       DAG.getNode(X86ISD::SETCC_CARRY, dl, MVT::i8,
+                                   DAG.getConstant(X86CC, MVT::i8), Cond),
+                       DAG.getConstant(1, MVT::i8));
+  }
+
   return DAG.getNode(X86ISD::SETCC, dl, MVT::i8,
                      DAG.getConstant(X86CC, MVT::i8), Cond);
 }
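
The identity behind "sbb x, x": subtracting a register from itself with borrow
yields 0 - CF, i.e. all ones exactly when the carry flag is set. Restated
portably (illustrative helper names, not the emitted instruction):

#include <cstdint>

inline uint32_t carry_mask(bool Carry) {
  return 0u - uint32_t(Carry);    // 0xFFFFFFFF if Carry, else 0; what sbb makes
}

inline uint32_t carry_bit(bool Carry) {
  return carry_mask(Carry) & 1u;  // the (and (setcc_carry ...), 1) form above
}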
... ...
@@ -5893,9 +5913,18 @@ SDValue X86TargetLowering::LowerSELECT(SDValue Op, SelectionDAG &DAG) {
       Cond = NewCond;
   }
 
+  // Look past (and (setcc_carry (cmp ...)), 1).
+  if (Cond.getOpcode() == ISD::AND &&
+      Cond.getOperand(0).getOpcode() == X86ISD::SETCC_CARRY) {
+    ConstantSDNode *C = dyn_cast<ConstantSDNode>(Cond.getOperand(1));
+    if (C && C->getAPIntValue() == 1)
+      Cond = Cond.getOperand(0);
+  }
+
   // If condition flag is set by a X86ISD::CMP, then use it as the condition
   // setting operand in place of the X86ISD::SETCC.
-  if (Cond.getOpcode() == X86ISD::SETCC) {
+  if (Cond.getOpcode() == X86ISD::SETCC ||
+      Cond.getOpcode() == X86ISD::SETCC_CARRY) {
     CC = Cond.getOperand(0);
 
     SDValue Cmp = Cond.getOperand(1);
... ...
@@ -5978,9 +6007,18 @@ SDValue X86TargetLowering::LowerBRCOND(SDValue Op, SelectionDAG &DAG) {
     Cond = LowerXALUO(Cond, DAG);
 #endif
 
+  // Look past (and (setcc_carry (cmp ...)), 1).
+  if (Cond.getOpcode() == ISD::AND &&
+      Cond.getOperand(0).getOpcode() == X86ISD::SETCC_CARRY) {
+    ConstantSDNode *C = dyn_cast<ConstantSDNode>(Cond.getOperand(1));
+    if (C && C->getAPIntValue() == 1)
+      Cond = Cond.getOperand(0);
+  }
+
   // If condition flag is set by a X86ISD::CMP, then use it as the condition
   // setting operand in place of the X86ISD::SETCC.
-  if (Cond.getOpcode() == X86ISD::SETCC) {
+  if (Cond.getOpcode() == X86ISD::SETCC ||
+      Cond.getOpcode() == X86ISD::SETCC_CARRY) {
     CC = Cond.getOperand(0);
 
     SDValue Cmp = Cond.getOperand(1);
... ...
@@ -7367,6 +7405,7 @@ const char *X86TargetLowering::getTargetNodeName(unsigned Opcode) const {
   case X86ISD::COMI:               return "X86ISD::COMI";
   case X86ISD::UCOMI:              return "X86ISD::UCOMI";
   case X86ISD::SETCC:              return "X86ISD::SETCC";
+  case X86ISD::SETCC_CARRY:        return "X86ISD::SETCC_CARRY";
   case X86ISD::CMOV:               return "X86ISD::CMOV";
   case X86ISD::BRCOND:             return "X86ISD::BRCOND";
   case X86ISD::RET_FLAG:           return "X86ISD::RET_FLAG";
... ...
@@ -8941,11 +8980,42 @@ static SDValue PerformMulCombine(SDNode *N, SelectionDAG &DAG,
   return SDValue();
 }
 
+static SDValue PerformSHLCombine(SDNode *N, SelectionDAG &DAG) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  ConstantSDNode *N1C = dyn_cast<ConstantSDNode>(N1);
+  EVT VT = N0.getValueType();
+
+  // fold (shl (and (setcc_c), c1), c2) -> (and setcc_c, (c1 << c2))
+  // since the result of setcc_c is all zeros or all ones.
+  if (N1C && N0.getOpcode() == ISD::AND &&
+      N0.getOperand(1).getOpcode() == ISD::Constant) {
+    SDValue N00 = N0.getOperand(0);
+    if (N00.getOpcode() == X86ISD::SETCC_CARRY ||
+        ((N00.getOpcode() == ISD::ANY_EXTEND ||
+          N00.getOpcode() == ISD::ZERO_EXTEND) &&
+         N00.getOperand(0).getOpcode() == X86ISD::SETCC_CARRY)) {
+      APInt Mask = cast<ConstantSDNode>(N0.getOperand(1))->getAPIntValue();
+      APInt ShAmt = N1C->getAPIntValue();
+      Mask = Mask.shl(ShAmt);
+      if (Mask != 0)
+        return DAG.getNode(ISD::AND, N->getDebugLoc(), VT,
+                           N00, DAG.getConstant(Mask, VT));
+    }
+  }
+
+  return SDValue();
+}
 
 /// PerformShiftCombine - Transforms vector shift nodes to use vector shifts
 ///                       when possible.
 static SDValue PerformShiftCombine(SDNode* N, SelectionDAG &DAG,
                                    const X86Subtarget *Subtarget) {
+  EVT VT = N->getValueType(0);
+  if (!VT.isVector() && VT.isInteger() &&
+      N->getOpcode() == ISD::SHL)
+    return PerformSHLCombine(N, DAG);
+
   // On X86 with SSE2 support, we can transform this to a vector shift if
   // all elements are shifted by the same amount.  We can't do this in legalize
   // because the a constant vector is typically transformed to a constant pool
... ...
@@ -8953,7 +9023,6 @@ static SDValue PerformShiftCombine(SDNode* N, SelectionDAG &DAG,
   if (!Subtarget->hasSSE2())
     return SDValue();
 
-  EVT VT = N->getValueType(0);
   if (VT != MVT::v2i64 && VT != MVT::v4i32 && VT != MVT::v8i16)
     return SDValue();
 
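PerformSHLCombine above is sound because setcc_carry only ever produces all
zeros or all ones, and for those two values the shift commutes into the mask.
A small check of that identity (illustrative):

#include <cassert>
#include <cstdint>

int main() {
  const uint32_t C1 = 0x0000ff00u;
  const uint32_t C2 = 5;
  for (uint32_t x : {0u, ~0u})    // the only values setcc_carry can take
    assert(((x & C1) << C2) == (x & (C1 << C2)));
  return 0;
}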
... ...
@@ -118,6 +118,10 @@ namespace llvm {
       /// operand produced by a CMP instruction.
       SETCC,
 
+      // Same as SETCC except it's materialized with an sbb and the value is
+      // all ones or all zeros.
+      SETCC_CARRY,
+
       /// X86 conditional moves. Operand 0 and operand 1 are the two values
       /// to select from. Operand 2 is the condition code, and operand 3 is the
       /// flag operand produced by a CMP or TEST instruction. It also writes a
... ...
@@ -1333,6 +1333,15 @@ def CMOVNO64rm : RI<0x41, MRMSrcMem,       // if !overflow, GR64 = [mem64]
                                      X86_COND_NO, EFLAGS))]>, TB;
 } // isTwoAddress
 
+// Use sbb to materialize carry flag into a GPR.
+let Defs = [EFLAGS], Uses = [EFLAGS], isCodeGenOnly = 1 in
+def SETB_C64r : RI<0x19, MRMInitReg, (outs GR64:$dst), (ins),
+                   "sbb{q}\t$dst, $dst",
+                   [(set GR64:$dst, (zext (X86setcc_c X86_COND_B, EFLAGS)))]>;
+
+def : Pat<(i64 (anyext (X86setcc_c X86_COND_B, EFLAGS))),
+          (SETB_C64r)>;
+
 //===----------------------------------------------------------------------===//
 //  Conversion Instructions...
 //
... ...
@@ -1058,7 +1058,7 @@ static bool hasLiveCondCodeDef(MachineInstr *MI) {
   return false;
 }
 
-/// convertToThreeAddressWithLEA - Helper for convertToThreeAddress when 16-bit
+/// convertToThreeAddressWithLEA - Helper for convertToThreeAddress when
 /// 16-bit LEA is disabled, use 32-bit LEA to form 3-address code by promoting
 /// to a 32-bit superregister and then truncating back down to a 16-bit
 /// subregister.
... ...
@@ -1081,6 +1081,11 @@ X86InstrInfo::convertToThreeAddressWithLEA(unsigned MIOpc,
 
   // Build and insert into an implicit UNDEF value. This is OK because
   // well be shifting and then extracting the lower 16-bits.
+  // This has the potential to cause a partial register stall, e.g.:
+  //   movw    (%rbp,%rcx,2), %dx
+  //   leal    -65(%rdx), %esi
+  // But testing has shown this *does* help performance in 64-bit mode (at
+  // least on modern x86 machines).
   BuildMI(*MFI, MBBI, MI->getDebugLoc(), get(X86::IMPLICIT_DEF), leaInReg);
   MachineInstr *InsMI =
     BuildMI(*MFI, MBBI, MI->getDebugLoc(), get(X86::INSERT_SUBREG),leaInReg)
... ...
@@ -1184,7 +1189,9 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
   MachineInstr *NewMI = NULL;
   // FIXME: 16-bit LEA's are really slow on Athlons, but not bad on P4's.  When
   // we have better subtarget support, enable the 16-bit LEA generation here.
+  // 16-bit LEA is also slow on Core2.
   bool DisableLEA16 = true;
+  bool is64Bit = TM.getSubtarget<X86Subtarget>().is64Bit();
 
   unsigned MIOpc = MI->getOpcode();
   switch (MIOpc) {
... ...
@@ -1223,8 +1230,7 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     unsigned ShAmt = MI->getOperand(2).getImm();
     if (ShAmt == 0 || ShAmt >= 4) return 0;
 
-    unsigned Opc = TM.getSubtarget<X86Subtarget>().is64Bit() ?
-      X86::LEA64_32r : X86::LEA32r;
+    unsigned Opc = is64Bit ? X86::LEA64_32r : X86::LEA32r;
     NewMI = BuildMI(MF, MI->getDebugLoc(), get(Opc))
       .addReg(Dest, RegState::Define | getDeadRegState(isDead))
       .addReg(0).addImm(1 << ShAmt)
... ...
@@ -1239,7 +1245,7 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     if (ShAmt == 0 || ShAmt >= 4) return 0;
 
     if (DisableLEA16)
-      return convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV);
+      return is64Bit ? convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV) : 0;
     NewMI = BuildMI(MF, MI->getDebugLoc(), get(X86::LEA16r))
       .addReg(Dest, RegState::Define | getDeadRegState(isDead))
      .addReg(0).addImm(1 << ShAmt)
... ...
@@ -1254,7 +1260,6 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     if (hasLiveCondCodeDef(MI))
       return 0;
 
-    bool is64Bit = TM.getSubtarget<X86Subtarget>().is64Bit();
     switch (MIOpc) {
     default: return 0;
     case X86::INC64r:
... ...
@@ -1272,7 +1277,7 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     case X86::INC16r:
     case X86::INC64_16r:
       if (DisableLEA16)
-        return convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV);
+        return is64Bit ? convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV) : 0;
       assert(MI->getNumOperands() >= 2 && "Unknown inc instruction!");
       NewMI = addRegOffset(BuildMI(MF, MI->getDebugLoc(), get(X86::LEA16r))
                            .addReg(Dest, RegState::Define |
... ...
@@ -1294,7 +1299,7 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     case X86::DEC16r:
     case X86::DEC64_16r:
       if (DisableLEA16)
-        return convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV);
+        return is64Bit ? convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV) : 0;
       assert(MI->getNumOperands() >= 2 && "Unknown dec instruction!");
       NewMI = addRegOffset(BuildMI(MF, MI->getDebugLoc(), get(X86::LEA16r))
                            .addReg(Dest, RegState::Define |
... ...
@@ -1318,7 +1323,7 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     }
     case X86::ADD16rr: {
       if (DisableLEA16)
-        return convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV);
+        return is64Bit ? convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV) : 0;
       assert(MI->getNumOperands() >= 3 && "Unknown add instruction!");
       unsigned Src2 = MI->getOperand(2).getReg();
       bool isKill2 = MI->getOperand(2).isKill();
... ...
@@ -1351,7 +1356,7 @@ X86InstrInfo::convertToThreeAddress(MachineFunction::iterator &MFI,
     case X86::ADD16ri:
     case X86::ADD16ri8:
       if (DisableLEA16)
-        return convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV);
+        return is64Bit ? convertToThreeAddressWithLEA(MIOpc, MFI, MBBI, LV) : 0;
       assert(MI->getNumOperands() >= 3 && "Unknown add instruction!");
       NewMI = addLeaRegOffset(BuildMI(MF, MI->getDebugLoc(), get(X86::LEA16r))
                               .addReg(Dest, RegState::Define |
... ...
@@ -1619,14 +1624,17 @@ bool X86InstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
   MachineBasicBlock::iterator I = MBB.end();
   while (I != MBB.begin()) {
     --I;
+
-    // Working from the bottom, when we see a non-terminator
-    // instruction, we're done.
+    // Working from the bottom, when we see a non-terminator instruction, we're
+    // done.
     if (!isBrAnalysisUnpredicatedTerminator(I, *this))
       break;
-    // A terminator that isn't a branch can't easily be handled
-    // by this analysis.
+
+    // A terminator that isn't a branch can't easily be handled by this
+    // analysis.
     if (!I->getDesc().isBranch())
       return true;
+
     // Handle unconditional branches.
     if (I->getOpcode() == X86::JMP) {
       if (!AllowModify) {
... ...
@@ -1637,8 +1645,10 @@ bool X86InstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
1637 1637
       // If the block has any instructions after a JMP, delete them.
1638 1638
       while (llvm::next(I) != MBB.end())
1639 1639
         llvm::next(I)->eraseFromParent();
1640
+
1640 1641
       Cond.clear();
1641 1642
       FBB = 0;
1643
+
1642 1644
       // Delete the JMP if it's equivalent to a fall-through.
1643 1645
       if (MBB.isLayoutSuccessor(I->getOperand(0).getMBB())) {
1644 1646
         TBB = 0;
... ...
@@ -1646,14 +1656,17 @@ bool X86InstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
1646 1646
         I = MBB.end();
1647 1647
         continue;
1648 1648
       }
1649
+
1649 1650
       // TBB is used to indicate the unconditinal destination.
1650 1651
       TBB = I->getOperand(0).getMBB();
1651 1652
       continue;
1652 1653
     }
1654
+
1653 1655
     // Handle conditional branches.
1654 1656
     X86::CondCode BranchCode = GetCondFromBranchOpc(I->getOpcode());
1655 1657
     if (BranchCode == X86::COND_INVALID)
1656 1658
       return true;  // Can't handle indirect branch.
1659
+
1657 1660
     // Working from the bottom, handle the first conditional branch.
1658 1661
     if (Cond.empty()) {
1659 1662
       FBB = TBB;
... ...
@@ -1661,24 +1674,26 @@ bool X86InstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
1661 1661
       Cond.push_back(MachineOperand::CreateImm(BranchCode));
1662 1662
       continue;
1663 1663
     }
1664
-    // Handle subsequent conditional branches. Only handle the case
1665
-    // where all conditional branches branch to the same destination
1666
-    // and their condition opcodes fit one of the special
1667
-    // multi-branch idioms.
1664
+
1665
+    // Handle subsequent conditional branches. Only handle the case where all
1666
+    // conditional branches branch to the same destination and their condition
1667
+    // opcodes fit one of the special multi-branch idioms.
1668 1668
     assert(Cond.size() == 1);
1669 1669
     assert(TBB);
1670
-    // Only handle the case where all conditional branches branch to
1671
-    // the same destination.
1670
+
1671
+    // Only handle the case where all conditional branches branch to the same
1672
+    // destination.
1672 1673
     if (TBB != I->getOperand(0).getMBB())
1673 1674
       return true;
1674
-    X86::CondCode OldBranchCode = (X86::CondCode)Cond[0].getImm();
1675
+
1675 1676
     // If the conditions are the same, we can leave them alone.
1677
+    X86::CondCode OldBranchCode = (X86::CondCode)Cond[0].getImm();
1676 1678
     if (OldBranchCode == BranchCode)
1677 1679
       continue;
1678
-    // If they differ, see if they fit one of the known patterns.
1679
-    // Theoretically we could handle more patterns here, but
1680
-    // we shouldn't expect to see them if instruction selection
1681
-    // has done a reasonable job.
1680
+
1681
+    // If they differ, see if they fit one of the known patterns. Theoretically,
1682
+    // we could handle more patterns here, but we shouldn't expect to see them
1683
+    // if instruction selection has done a reasonable job.
1682 1684
     if ((OldBranchCode == X86::COND_NP &&
1683 1685
          BranchCode == X86::COND_E) ||
1684 1686
         (OldBranchCode == X86::COND_E &&
... ...
@@ -1691,6 +1706,7 @@ bool X86InstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
1691 1691
       BranchCode = X86::COND_NE_OR_P;
1692 1692
     else
1693 1693
       return true;
1694
+
1694 1695
     // Update the MachineOperand.
1695 1696
     Cond[0].setImm(BranchCode);
1696 1697
   }
... ...
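For context on the condition-code pairs AnalyzeBranch folds above: the usual producer of two conditional branches to one target is a floating-point inequality, where NaN results are reported through the parity flag. A hedged source-level illustration (function name is mine):

    // After a ucomiss-style compare, operator!= typically lowers to a JP and
    // a JNE aimed at the same block; that pair is what gets collapsed into
    // the pseudo condition COND_NE_OR_P.
    bool fp_not_equal(float a, float b) {
      return a != b;  // true when unequal or unordered (a NaN is involved)
    }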
@@ -87,6 +87,7 @@ def X86cmov    : SDNode<"X86ISD::CMOV",     SDTX86Cmov>;
 def X86brcond  : SDNode<"X86ISD::BRCOND",   SDTX86BrCond,
                         [SDNPHasChain]>;
 def X86setcc   : SDNode<"X86ISD::SETCC",    SDTX86SetCC>;
+def X86setcc_c : SDNode<"X86ISD::SETCC_CARRY", SDTX86SetCC>;

 def X86cas : SDNode<"X86ISD::LCMPXCHG_DAG", SDTX86cas,
                         [SDNPHasChain, SDNPInFlag, SDNPOutFlag, SDNPMayStore,
@@ -816,7 +817,7 @@ def BSR32rm  : I<0xBD, MRMSrcMem, (outs GR32:$dst), (ins i32mem:$src),

 let neverHasSideEffects = 1 in
 def LEA16r   : I<0x8D, MRMSrcMem,
-                 (outs GR16:$dst), (ins i32mem:$src),
+                 (outs GR16:$dst), (ins lea32mem:$src),
                  "lea{w}\t{$src|$dst}, {$dst|$src}", []>, OpSize;
 let isReMaterializable = 1 in
 def LEA32r   : I<0x8D, MRMSrcMem,
@@ -3059,6 +3060,21 @@ let Defs = [AH], Uses = [EFLAGS], neverHasSideEffects = 1 in
 def LAHF     : I<0x9F, RawFrm, (outs),  (ins), "lahf", []>;  // AH = flags

 let Uses = [EFLAGS] in {
+// Use sbb to materialize carry bit.
+
+let Defs = [EFLAGS], isCodeGenOnly = 1 in {
+def SETB_C8r : I<0x18, MRMInitReg, (outs GR8:$dst), (ins),
+                 "sbb{b}\t$dst, $dst",
+                 [(set GR8:$dst, (X86setcc_c X86_COND_B, EFLAGS))]>;
+def SETB_C16r : I<0x19, MRMInitReg, (outs GR16:$dst), (ins),
+                  "sbb{w}\t$dst, $dst",
+                 [(set GR16:$dst, (zext (X86setcc_c X86_COND_B, EFLAGS)))]>,
+                OpSize;
+def SETB_C32r : I<0x19, MRMInitReg, (outs GR32:$dst), (ins),
+                  "sbb{l}\t$dst, $dst",
+                 [(set GR32:$dst, (zext (X86setcc_c X86_COND_B, EFLAGS)))]>;
+} // isCodeGenOnly
+
 def SETEr    : I<0x94, MRM0r,
                  (outs GR8   :$dst), (ins),
                  "sete\t$dst",
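The trick behind these pseudo-instructions: sbb of a register with itself computes reg - reg - CF, which is 0 when the carry is clear and -1 (all ones) when it is set. A sketch of the value SETB_C32r produces, in plain C++ (my illustration, not LLVM code):

    #include <cstdint>

    // Assuming a preceding compare left CF = (a < b), "sbb %eax, %eax"
    // materializes an all-ones mask exactly when the carry was set.
    uint32_t setb_c32(uint32_t a, uint32_t b) {
      return (a < b) ? 0xFFFFFFFFu : 0u;
    }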
@@ -4169,6 +4185,12 @@ def : Pat<(store (shld (loadi16 addr:$dst), (i8 imm:$amt1),
                        GR16:$src2, (i8 imm:$amt2)), addr:$dst),
           (SHLD16mri8 addr:$dst, GR16:$src2, (i8 imm:$amt1))>;

+// (anyext (setcc_carry)) -> (zext (setcc_carry))
+def : Pat<(i16 (anyext (X86setcc_c X86_COND_B, EFLAGS))),
+          (SETB_C16r)>;
+def : Pat<(i32 (anyext (X86setcc_c X86_COND_B, EFLAGS))),
+          (SETB_C32r)>;
+
 //===----------------------------------------------------------------------===//
 // EFLAGS-defining Patterns
 //===----------------------------------------------------------------------===//
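A note on why these patterns are sound (my reading, not a comment from the tree):

    // anyext makes no promise about the widened bits, so any instruction
    // that already satisfies the zext pattern can satisfy the anyext
    // pattern too; selecting SETB_C16r/SETB_C32r directly avoids emitting
    // a separate extend of the materialized carry.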
@@ -190,8 +190,11 @@ template <> struct DenseMapInfo<Expression> {
   static bool isEqual(const Expression &LHS, const Expression &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return true; }
 };
+
+template <>
+struct isPodLike<Expression> { static const bool value = true; };
+
 }

 //===----------------------------------------------------------------------===//
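For readers following the isPodLike migration here and below: the trait is a simple boolean metafunction that clients specialize to opt a type into pod-style handling. A minimal sketch of the shape involved (simplified from memory, so treat the default definition as an assumption):

    // Default: only non-class types are assumed pod-like.
    template <typename T> struct isPodLike {
      static const bool value = !is_class<T>::value;
    };

    // A type that is not formally a POD but is safe to memcpy opts in:
    template <> struct isPodLike<Expression> { static const bool value = true; };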
@@ -11200,8 +11200,9 @@ namespace llvm {
       return LHS.PN == RHS.PN && LHS.Shift == RHS.Shift &&
              LHS.Width == RHS.Width;
     }
-    static bool isPod() { return true; }
   };
+  template <>
+  struct isPodLike<LoweredPHIRecord> { static const bool value = true; };
 }


@@ -24,18 +24,14 @@
 #include "llvm/Constants.h"
 #include "llvm/Instructions.h"
 #include "llvm/IntrinsicInst.h"
-#include "llvm/Type.h"
 #include "llvm/DerivedTypes.h"
-#include "llvm/Analysis/Dominators.h"
 #include "llvm/Analysis/IVUsers.h"
-#include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/LoopPass.h"
 #include "llvm/Analysis/ScalarEvolutionExpander.h"
 #include "llvm/Transforms/Utils/AddrModeMatcher.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/ADT/Statistic.h"
-#include "llvm/Support/CFG.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/ValueHandle.h"
@@ -85,8 +81,6 @@ namespace {

   class LoopStrengthReduce : public LoopPass {
     IVUsers *IU;
-    LoopInfo *LI;
-    DominatorTree *DT;
     ScalarEvolution *SE;
     bool Changed;

@@ -94,10 +88,6 @@ namespace {
     /// particular stride.
     std::map<const SCEV *, IVsOfOneStride> IVsByStride;

-    /// StrideNoReuse - Keep track of all the strides whose ivs cannot be
-    /// reused (nor should they be rewritten to reuse other strides).
-    SmallSet<const SCEV *, 4> StrideNoReuse;
-
     /// DeadInsts - Keep track of instructions we may have made dead, so that
     /// we can remove them after we are done working.
     SmallVector<WeakVH, 16> DeadInsts;
@@ -109,8 +99,7 @@ namespace {
   public:
     static char ID; // Pass ID, replacement for typeid
     explicit LoopStrengthReduce(const TargetLowering *tli = NULL) :
-      LoopPass(&ID), TLI(tli) {
-    }
+      LoopPass(&ID), TLI(tli) {}

     bool runOnLoop(Loop *L, LPPassManager &LPM);

@@ -118,13 +107,11 @@ namespace {
       // We split critical edges, so we change the CFG.  However, we do update
       // many analyses if they are around.
       AU.addPreservedID(LoopSimplifyID);
-      AU.addPreserved<LoopInfo>();
-      AU.addPreserved<DominanceFrontier>();
-      AU.addPreserved<DominatorTree>();
+      AU.addPreserved("loops");
+      AU.addPreserved("domfrontier");
+      AU.addPreserved("domtree");

       AU.addRequiredID(LoopSimplifyID);
-      AU.addRequired<LoopInfo>();
-      AU.addRequired<DominatorTree>();
       AU.addRequired<ScalarEvolution>();
       AU.addPreserved<ScalarEvolution>();
       AU.addRequired<IVUsers>();
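The switch to the string overload of addPreserved is what lets the pass headers above be dropped: the string form names the preserved pass by its registered command-line argument instead of by type. A hedged before/after sketch:

    // Template form needs LoopInfo's declaration in scope:
    AU.addPreserved<LoopInfo>();
    // String form resolves the pass by its argument ("loops") at runtime,
    // with no #include dependency:
    AU.addPreserved("loops");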
@@ -228,19 +215,17 @@ void LoopStrengthReduce::DeleteTriviallyDeadInstructions() {
   if (DeadInsts.empty()) return;

   while (!DeadInsts.empty()) {
-    Instruction *I = dyn_cast_or_null<Instruction>(DeadInsts.back());
-    DeadInsts.pop_back();
+    Instruction *I = dyn_cast_or_null<Instruction>(DeadInsts.pop_back_val());

     if (I == 0 || !isInstructionTriviallyDead(I))
       continue;

-    for (User::op_iterator OI = I->op_begin(), E = I->op_end(); OI != E; ++OI) {
+    for (User::op_iterator OI = I->op_begin(), E = I->op_end(); OI != E; ++OI)
       if (Instruction *U = dyn_cast<Instruction>(*OI)) {
         *OI = 0;
         if (U->use_empty())
           DeadInsts.push_back(U);
       }
-    }

     I->eraseFromParent();
     Changed = true;
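pop_back_val() is the SmallVector convenience used above: it returns the last element and shrinks the vector in one call, replacing the manual back()/pop_back() pair. A minimal illustration:

    #include "llvm/ADT/SmallVector.h"

    llvm::SmallVector<int, 4> V;
    V.push_back(1);
    V.push_back(2);
    int Last = V.pop_back_val();  // Last == 2; V now holds just {1}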
@@ -265,7 +250,7 @@ static bool containsAddRecFromDifferentLoop(const SCEV *S, Loop *L) {
       if (newLoop == L)
         return false;
       // if newLoop is an outer loop of L, this is OK.
-      if (!LoopInfo::isNotAlreadyContainedIn(L, newLoop))
+      if (newLoop->contains(L->getHeader()))
         return false;
     }
     return true;
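The replacement test leans on a LoopInfo property worth spelling out: an outer loop contains every block of its nested loops, so containment of L's header is equivalent to "newLoop is L or an ancestor of L". A sketch restating the new condition (helper name is mine):

    static bool isSameOrOuterLoop(const llvm::Loop *NewLoop,
                                  const llvm::Loop *L) {
      return NewLoop->contains(L->getHeader());
    }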
@@ -338,9 +323,6 @@ namespace {
   /// BasedUser - For a particular base value, keep information about how we've
   /// partitioned the expression so far.
   struct BasedUser {
-    /// SE - The current ScalarEvolution object.
-    ScalarEvolution *SE;
-
     /// Base - The Base value for the PHI node that needs to be inserted for
     /// this use.  As the use is processed, information gets moved from this
     /// field to the Imm field (below).  BasedUser values are sorted by this
@@ -372,9 +354,9 @@ namespace {
     bool isUseOfPostIncrementedValue;

     BasedUser(IVStrideUse &IVSU, ScalarEvolution *se)
-      : SE(se), Base(IVSU.getOffset()), Inst(IVSU.getUser()),
+      : Base(IVSU.getOffset()), Inst(IVSU.getUser()),
         OperandValToReplace(IVSU.getOperandValToReplace()),
-        Imm(SE->getIntegerSCEV(0, Base->getType())),
+        Imm(se->getIntegerSCEV(0, Base->getType())),
         isUseOfPostIncrementedValue(IVSU.isUseOfPostIncrementedValue()) {}

     // Once we rewrite the code to insert the new IVs we want, update the
@@ -383,14 +365,14 @@ namespace {
     void RewriteInstructionToUseNewBase(const SCEV *const &NewBase,
                                         Instruction *InsertPt,
                                        SCEVExpander &Rewriter, Loop *L, Pass *P,
-                                        LoopInfo &LI,
-                                        SmallVectorImpl<WeakVH> &DeadInsts);
+                                        SmallVectorImpl<WeakVH> &DeadInsts,
+                                        ScalarEvolution *SE);

     Value *InsertCodeForBaseAtPosition(const SCEV *const &NewBase,
                                        const Type *Ty,
                                        SCEVExpander &Rewriter,
-                                       Instruction *IP, Loop *L,
-                                       LoopInfo &LI);
+                                       Instruction *IP,
+                                       ScalarEvolution *SE);
     void dump() const;
   };
 }
@@ -404,27 +386,12 @@ void BasedUser::dump() const {
 Value *BasedUser::InsertCodeForBaseAtPosition(const SCEV *const &NewBase,
                                               const Type *Ty,
                                               SCEVExpander &Rewriter,
-                                              Instruction *IP, Loop *L,
-                                              LoopInfo &LI) {
-  // Figure out where we *really* want to insert this code.  In particular, if
-  // the user is inside of a loop that is nested inside of L, we really don't
-  // want to insert this expression before the user, we'd rather pull it out as
-  // many loops as possible.
-  Instruction *BaseInsertPt = IP;
-
-  // Figure out the most-nested loop that IP is in.
-  Loop *InsertLoop = LI.getLoopFor(IP->getParent());
-
-  // If InsertLoop is not L, and InsertLoop is nested inside of L, figure out
-  // the preheader of the outer-most loop where NewBase is not loop invariant.
-  if (L->contains(IP->getParent()))
-    while (InsertLoop && NewBase->isLoopInvariant(InsertLoop)) {
-      BaseInsertPt = InsertLoop->getLoopPreheader()->getTerminator();
-      InsertLoop = InsertLoop->getParentLoop();
-    }
-
-  Value *Base = Rewriter.expandCodeFor(NewBase, 0, BaseInsertPt);
+                                              Instruction *IP,
+                                              ScalarEvolution *SE) {
+  Value *Base = Rewriter.expandCodeFor(NewBase, 0, IP);

+  // Wrap the base in a SCEVUnknown so that ScalarEvolution doesn't try to
+  // re-analyze it.
   const SCEV *NewValSCEV = SE->getUnknown(Base);

   // Always emit the immediate into the same block as the user.
@@ -443,8 +410,8 @@ Value *BasedUser::InsertCodeForBaseAtPosition(const SCEV *const &NewBase,
 void BasedUser::RewriteInstructionToUseNewBase(const SCEV *const &NewBase,
                                                Instruction *NewBasePt,
                                       SCEVExpander &Rewriter, Loop *L, Pass *P,
-                                      LoopInfo &LI,
-                                      SmallVectorImpl<WeakVH> &DeadInsts) {
+                                      SmallVectorImpl<WeakVH> &DeadInsts,
+                                      ScalarEvolution *SE) {
   if (!isa<PHINode>(Inst)) {
     // By default, insert code at the user instruction.
     BasicBlock::iterator InsertPt = Inst;
@@ -473,7 +440,7 @@ void BasedUser::RewriteInstructionToUseNewBase(const SCEV *const &NewBase,
     }
     Value *NewVal = InsertCodeForBaseAtPosition(NewBase,
                                                 OperandValToReplace->getType(),
-                                                Rewriter, InsertPt, L, LI);
+                                                Rewriter, InsertPt, SE);
     // Replace the use of the operand Value with the new Phi we just created.
     Inst->replaceUsesOfWith(OperandValToReplace, NewVal);

@@ -535,7 +502,7 @@ void BasedUser::RewriteInstructionToUseNewBase(const SCEV *const &NewBase,
                                 PHIPred->getTerminator() :
                                 OldLoc->getParent()->getTerminator();
         Code = InsertCodeForBaseAtPosition(NewBase, PN->getType(),
-                                           Rewriter, InsertPt, L, LI);
+                                           Rewriter, InsertPt, SE);

         DEBUG(errs() << "      Changing PHI use to ");
         DEBUG(WriteAsOperand(errs(), Code, /*PrintType=*/false));
@@ -1011,17 +978,13 @@ const SCEV *LoopStrengthReduce::CheckForIVReuse(bool HasBaseReg,
                                 const SCEV *const &Stride,
                                 IVExpr &IV, const Type *Ty,
                                 const std::vector<BasedUser>& UsersToProcess) {
-  if (StrideNoReuse.count(Stride))
-    return SE->getIntegerSCEV(0, Stride->getType());
-
   if (const SCEVConstant *SC = dyn_cast<SCEVConstant>(Stride)) {
     int64_t SInt = SC->getValue()->getSExtValue();
     for (unsigned NewStride = 0, e = IU->StrideOrder.size();
          NewStride != e; ++NewStride) {
       std::map<const SCEV *, IVsOfOneStride>::iterator SI =
                 IVsByStride.find(IU->StrideOrder[NewStride]);
-      if (SI == IVsByStride.end() || !isa<SCEVConstant>(SI->first) ||
-          StrideNoReuse.count(SI->first))
+      if (SI == IVsByStride.end() || !isa<SCEVConstant>(SI->first))
         continue;
       // The other stride has no uses, don't reuse it.
       std::map<const SCEV *, IVUsersOfOneStride *>::iterator UI =
@@ -1780,8 +1743,8 @@ LoopStrengthReduce::StrengthReduceIVUsersOfStride(const SCEV *const &Stride,
         RewriteExpr = SE->getAddExpr(RewriteExpr, SE->getUnknown(BaseV));

       User.RewriteInstructionToUseNewBase(RewriteExpr, NewBasePt,
-                                          Rewriter, L, this, *LI,
-                                          DeadInsts);
+                                          Rewriter, L, this,
+                                          DeadInsts, SE);

       // Mark old value we replaced as possibly dead, so that it is eliminated
       // if we just replaced the last use of that value.
@@ -2745,8 +2708,6 @@ bool LoopStrengthReduce::OptimizeLoopCountIV(Loop *L) {

 bool LoopStrengthReduce::runOnLoop(Loop *L, LPPassManager &LPM) {
   IU = &getAnalysis<IVUsers>();
-  LI = &getAnalysis<LoopInfo>();
-  DT = &getAnalysis<DominatorTree>();
   SE = &getAnalysis<ScalarEvolution>();
   Changed = false;

@@ -2792,15 +2753,14 @@ bool LoopStrengthReduce::runOnLoop(Loop *L, LPPassManager &LPM) {
     // After all sharing is done, see if we can adjust the loop to test against
     // zero instead of counting up to a maximum.  This is usually faster.
     OptimizeLoopCountIV(L);
-  }

-  // We're done analyzing this loop; release all the state we built up for it.
-  IVsByStride.clear();
-  StrideNoReuse.clear();
+    // We're done analyzing this loop; release all the state we built up for it.
+    IVsByStride.clear();

-  // Clean up after ourselves
-  if (!DeadInsts.empty())
-    DeleteTriviallyDeadInstructions();
+    // Clean up after ourselves
+    if (!DeadInsts.empty())
+      DeleteTriviallyDeadInstructions();
+  }

   // At this point, it is worth checking to see if any recurrence PHIs are also
   // dead, so that we can remove them as well.
@@ -154,8 +154,10 @@ template <> struct DenseMapInfo<Expression> {
   static bool isEqual(const Expression &LHS, const Expression &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return true; }
 };
+template <>
+struct isPodLike<Expression> { static const bool value = true; };
+
 }

 //===----------------------------------------------------------------------===//
@@ -102,27 +102,25 @@ namespace {

     int isSafeAllocaToScalarRepl(AllocaInst *AI);

-    void isSafeForScalarRepl(Instruction *I, AllocaInst *AI, uint64_t Offset,
-                             uint64_t ArrayOffset, AllocaInfo &Info);
-    void isSafeGEP(GetElementPtrInst *GEPI, AllocaInst *AI, uint64_t &Offset,
-                   uint64_t &ArrayOffset, AllocaInfo &Info);
-    void isSafeMemAccess(AllocaInst *AI, uint64_t Offset, uint64_t ArrayOffset,
-                         uint64_t MemSize, const Type *MemOpType, bool isStore,
-                         AllocaInfo &Info);
-    bool TypeHasComponent(const Type *T, uint64_t Offset, uint64_t Size);
-    unsigned FindElementAndOffset(const Type *&T, uint64_t &Offset);
+    void isSafeUseOfAllocation(Instruction *User, AllocaInst *AI,
+                               AllocaInfo &Info);
+    void isSafeElementUse(Value *Ptr, bool isFirstElt, AllocaInst *AI,
+                          AllocaInfo &Info);
+    void isSafeMemIntrinsicOnAllocation(MemIntrinsic *MI, AllocaInst *AI,
+                                        unsigned OpNo, AllocaInfo &Info);
+    void isSafeUseOfBitCastedAllocation(BitCastInst *User, AllocaInst *AI,
+                                        AllocaInfo &Info);

     void DoScalarReplacement(AllocaInst *AI,
                              std::vector<AllocaInst*> &WorkList);
     void CleanupGEP(GetElementPtrInst *GEP);
-    void CleanupAllocaUsers(Value *V);
+    void CleanupAllocaUsers(AllocaInst *AI);
     AllocaInst *AddNewAlloca(Function &F, const Type *Ty, AllocaInst *Base);

-    void RewriteForScalarRepl(Instruction *I, AllocaInst *AI, uint64_t Offset,
-                              SmallVector<AllocaInst*, 32> &NewElts);
-    void RewriteGEP(GetElementPtrInst *GEPI, AllocaInst *AI, uint64_t Offset,
-                    SmallVector<AllocaInst*, 32> &NewElts);
-    void RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *Inst,
+    void RewriteBitCastUserOfAlloca(Instruction *BCInst, AllocaInst *AI,
+                                    SmallVector<AllocaInst*, 32> &NewElts);
+
+    void RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *BCInst,
                                       AllocaInst *AI,
                                       SmallVector<AllocaInst*, 32> &NewElts);
     void RewriteStoreUserOfWholeAlloca(StoreInst *SI, AllocaInst *AI,
@@ -362,12 +360,176 @@ void SROA::DoScalarReplacement(AllocaInst *AI,
     }
   }

-  // Now that we have created the new alloca instructions, rewrite all the
-  // uses of the old alloca.
-  RewriteForScalarRepl(AI, AI, 0, ElementAllocas);
+  // Now that we have created the alloca instructions that we want to use,
+  // expand the getelementptr instructions to use them.
+  while (!AI->use_empty()) {
+    Instruction *User = cast<Instruction>(AI->use_back());
+    if (BitCastInst *BCInst = dyn_cast<BitCastInst>(User)) {
+      RewriteBitCastUserOfAlloca(BCInst, AI, ElementAllocas);
+      BCInst->eraseFromParent();
+      continue;
+    }
+
+    // Replace:
+    //   %res = load { i32, i32 }* %alloc
+    // with:
+    //   %load.0 = load i32* %alloc.0
+    //   %insert.0 insertvalue { i32, i32 } zeroinitializer, i32 %load.0, 0
+    //   %load.1 = load i32* %alloc.1
+    //   %insert = insertvalue { i32, i32 } %insert.0, i32 %load.1, 1
+    // (Also works for arrays instead of structs)
+    if (LoadInst *LI = dyn_cast<LoadInst>(User)) {
+      Value *Insert = UndefValue::get(LI->getType());
+      for (unsigned i = 0, e = ElementAllocas.size(); i != e; ++i) {
+        Value *Load = new LoadInst(ElementAllocas[i], "load", LI);
+        Insert = InsertValueInst::Create(Insert, Load, i, "insert", LI);
+      }
+      LI->replaceAllUsesWith(Insert);
+      LI->eraseFromParent();
+      continue;
+    }
+
+    // Replace:
+    //   store { i32, i32 } %val, { i32, i32 }* %alloc
+    // with:
+    //   %val.0 = extractvalue { i32, i32 } %val, 0
+    //   store i32 %val.0, i32* %alloc.0
+    //   %val.1 = extractvalue { i32, i32 } %val, 1
+    //   store i32 %val.1, i32* %alloc.1
+    // (Also works for arrays instead of structs)
+    if (StoreInst *SI = dyn_cast<StoreInst>(User)) {
+      Value *Val = SI->getOperand(0);
+      for (unsigned i = 0, e = ElementAllocas.size(); i != e; ++i) {
+        Value *Extract = ExtractValueInst::Create(Val, i, Val->getName(), SI);
+        new StoreInst(Extract, ElementAllocas[i], SI);
+      }
+      SI->eraseFromParent();
+      continue;
+    }
+
+    GetElementPtrInst *GEPI = cast<GetElementPtrInst>(User);
+    // We now know that the GEP is of the form: GEP <ptr>, 0, <cst>
+    unsigned Idx =
+       (unsigned)cast<ConstantInt>(GEPI->getOperand(2))->getZExtValue();
+
+    assert(Idx < ElementAllocas.size() && "Index out of range?");
+    AllocaInst *AllocaToUse = ElementAllocas[Idx];
+
+    Value *RepValue;
+    if (GEPI->getNumOperands() == 3) {
+      // Do not insert a new getelementptr instruction with zero indices, only
+      // to have it optimized out later.
+      RepValue = AllocaToUse;
+    } else {
+      // We are indexing deeply into the structure, so we still need a
+      // getelement ptr instruction to finish the indexing.  This may be
+      // expanded itself once the worklist is rerun.
+      //
+      SmallVector<Value*, 8> NewArgs;
+      NewArgs.push_back(Constant::getNullValue(
+                                           Type::getInt32Ty(AI->getContext())));
+      NewArgs.append(GEPI->op_begin()+3, GEPI->op_end());
+      RepValue = GetElementPtrInst::Create(AllocaToUse, NewArgs.begin(),
+                                           NewArgs.end(), "", GEPI);
+      RepValue->takeName(GEPI);
+    }
+
+    // If this GEP is to the start of the aggregate, check for memcpys.
+    if (Idx == 0 && GEPI->hasAllZeroIndices())
+      RewriteBitCastUserOfAlloca(GEPI, AI, ElementAllocas);
+
+    // Move all of the users over to the new GEP.
+    GEPI->replaceAllUsesWith(RepValue);
+    // Delete the old GEP
+    GEPI->eraseFromParent();
+  }
+
+  // Finally, delete the Alloca instruction
+  AI->eraseFromParent();
   NumReplaced++;
 }
-
+
+/// isSafeElementUse - Check to see if this use is an allowed use for a
+/// getelementptr instruction of an array aggregate allocation.  isFirstElt
+/// indicates whether Ptr is known to the start of the aggregate.
+void SROA::isSafeElementUse(Value *Ptr, bool isFirstElt, AllocaInst *AI,
+                            AllocaInfo &Info) {
+  for (Value::use_iterator I = Ptr->use_begin(), E = Ptr->use_end();
+       I != E; ++I) {
+    Instruction *User = cast<Instruction>(*I);
+    switch (User->getOpcode()) {
+    case Instruction::Load:  break;
+    case Instruction::Store:
+      // Store is ok if storing INTO the pointer, not storing the pointer
+      if (User->getOperand(0) == Ptr) return MarkUnsafe(Info);
+      break;
+    case Instruction::GetElementPtr: {
+      GetElementPtrInst *GEP = cast<GetElementPtrInst>(User);
+      bool AreAllZeroIndices = isFirstElt;
+      if (GEP->getNumOperands() > 1 &&
+          (!isa<ConstantInt>(GEP->getOperand(1)) ||
+           !cast<ConstantInt>(GEP->getOperand(1))->isZero()))
+        // Using pointer arithmetic to navigate the array.
+        return MarkUnsafe(Info);
+
+      // Verify that any array subscripts are in range.
+      for (gep_type_iterator GEPIt = gep_type_begin(GEP),
+           E = gep_type_end(GEP); GEPIt != E; ++GEPIt) {
+        // Ignore struct elements, no extra checking needed for these.
+        if (isa<StructType>(*GEPIt))
+          continue;
+
+        // This GEP indexes an array.  Verify that this is an in-range
+        // constant integer. Specifically, consider A[0][i]. We cannot know that
+        // the user isn't doing invalid things like allowing i to index an
+        // out-of-range subscript that accesses A[1].  Because of this, we have
+        // to reject SROA of any accesses into structs where any of the
+        // components are variables.
+        ConstantInt *IdxVal = dyn_cast<ConstantInt>(GEPIt.getOperand());
+        if (!IdxVal) return MarkUnsafe(Info);
+
+        // Are all indices still zero?
+        AreAllZeroIndices &= IdxVal->isZero();
+
+        if (const ArrayType *AT = dyn_cast<ArrayType>(*GEPIt)) {
+          if (IdxVal->getZExtValue() >= AT->getNumElements())
+            return MarkUnsafe(Info);
+        } else if (const VectorType *VT = dyn_cast<VectorType>(*GEPIt)) {
+          if (IdxVal->getZExtValue() >= VT->getNumElements())
+            return MarkUnsafe(Info);
+        }
+      }
+
+      isSafeElementUse(GEP, AreAllZeroIndices, AI, Info);
+      if (Info.isUnsafe) return;
+      break;
+    }
+    case Instruction::BitCast:
+      if (isFirstElt) {
+        isSafeUseOfBitCastedAllocation(cast<BitCastInst>(User), AI, Info);
+        if (Info.isUnsafe) return;
+        break;
+      }
+      DEBUG(errs() << "  Transformation preventing inst: " << *User << '\n');
+      return MarkUnsafe(Info);
+    case Instruction::Call:
+      if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(User)) {
+        if (isFirstElt) {
+          isSafeMemIntrinsicOnAllocation(MI, AI, I.getOperandNo(), Info);
+          if (Info.isUnsafe) return;
+          break;
+        }
+      }
+      DEBUG(errs() << "  Transformation preventing inst: " << *User << '\n');
+      return MarkUnsafe(Info);
+    default:
+      DEBUG(errs() << "  Transformation preventing inst: " << *User << '\n');
+      return MarkUnsafe(Info);
+    }
+  }
+  return;  // All users look ok :)
+}
+
 /// AllUsersAreLoads - Return true if all users of this value are loads.
 static bool AllUsersAreLoads(Value *Ptr) {
   for (Value::use_iterator I = Ptr->use_begin(), E = Ptr->use_end();
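For orientation in the SROA hunks here and below: at the source level, scalar replacement splits an aggregate alloca into one alloca per element, which later passes can promote to registers. A hedged before/after sketch (names are illustrative):

    struct Pair { int a, b; };

    // Before SROA: one aggregate on the stack, accessed through GEPs.
    int before() { Pair p; p.a = 1; p.b = 2; return p.a + p.b; }

    // After SROA (conceptually): two independent scalars, each promotable
    // to an SSA value by mem2reg.
    int after() { int p_a = 1, p_b = 2; return p_a + p_b; }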
@@ -377,116 +539,72 @@ static bool AllUsersAreLoads(Value *Ptr) {
   return true;
 }

-/// isSafeForScalarRepl - Check if instruction I is a safe use with regard to
-/// performing scalar replacement of alloca AI.  The results are flagged in
-/// the Info parameter.  Offset and ArrayOffset indicate the position within
-/// AI that is referenced by this instruction.
-void SROA::isSafeForScalarRepl(Instruction *I, AllocaInst *AI, uint64_t Offset,
-                               uint64_t ArrayOffset, AllocaInfo &Info) {
-  for (Value::use_iterator UI = I->use_begin(), E = I->use_end(); UI!=E; ++UI) {
-    Instruction *User = cast<Instruction>(*UI);
-
-    if (BitCastInst *BC = dyn_cast<BitCastInst>(User)) {
-      isSafeForScalarRepl(BC, AI, Offset, ArrayOffset, Info);
-    } else if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(User)) {
-      uint64_t GEPArrayOffset = ArrayOffset;
-      uint64_t GEPOffset = Offset;
-      isSafeGEP(GEPI, AI, GEPOffset, GEPArrayOffset, Info);
-      if (!Info.isUnsafe)
-        isSafeForScalarRepl(GEPI, AI, GEPOffset, GEPArrayOffset, Info);
-    } else if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(UI)) {
-      ConstantInt *Length = dyn_cast<ConstantInt>(MI->getLength());
-      if (Length)
-        isSafeMemAccess(AI, Offset, ArrayOffset, Length->getZExtValue(), 0,
-                        UI.getOperandNo() == 1, Info);
-      else
-        MarkUnsafe(Info);
-    } else if (LoadInst *LI = dyn_cast<LoadInst>(User)) {
-      if (!LI->isVolatile()) {
-        const Type *LIType = LI->getType();
-        isSafeMemAccess(AI, Offset, ArrayOffset, TD->getTypeAllocSize(LIType),
-                        LIType, false, Info);
-      } else
-        MarkUnsafe(Info);
-    } else if (StoreInst *SI = dyn_cast<StoreInst>(User)) {
-      // Store is ok if storing INTO the pointer, not storing the pointer
-      if (!SI->isVolatile() && SI->getOperand(0) != I) {
-        const Type *SIType = SI->getOperand(0)->getType();
-        isSafeMemAccess(AI, Offset, ArrayOffset, TD->getTypeAllocSize(SIType),
-                        SIType, true, Info);
-      } else
-        MarkUnsafe(Info);
-    } else if (isa<DbgInfoIntrinsic>(UI)) {
-      // If one user is DbgInfoIntrinsic then check if all users are
-      // DbgInfoIntrinsics.
-      if (OnlyUsedByDbgInfoIntrinsics(I)) {
-        Info.needsCleanup = true;
-        return;
-      }
-      MarkUnsafe(Info);
-    } else {
-      DEBUG(errs() << "  Transformation preventing inst: " << *User << '\n');
-      MarkUnsafe(Info);
-    }
-    if (Info.isUnsafe) return;
-  }
-}
+/// isSafeUseOfAllocation - Check if this user is an allowed use for an
+/// aggregate allocation.
+void SROA::isSafeUseOfAllocation(Instruction *User, AllocaInst *AI,
+                                 AllocaInfo &Info) {
+  if (BitCastInst *C = dyn_cast<BitCastInst>(User))
+    return isSafeUseOfBitCastedAllocation(C, AI, Info);
+
+  if (LoadInst *LI = dyn_cast<LoadInst>(User))
+    if (!LI->isVolatile())
+      return;// Loads (returning a first class aggregrate) are always rewritable
+
+  if (StoreInst *SI = dyn_cast<StoreInst>(User))
+    if (!SI->isVolatile() && SI->getOperand(0) != AI)
+      return;// Store is ok if storing INTO the pointer, not storing the pointer
+
+  GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(User);
+  if (GEPI == 0)
+    return MarkUnsafe(Info);

-/// isSafeGEP - Check if a GEP instruction can be handled for scalar
-/// replacement.  It is safe when all the indices are constant, in-bounds
-/// references, and when the resulting offset corresponds to an element within
-/// the alloca type.  The results are flagged in the Info parameter.  Upon
-/// return, Offset is adjusted as specified by the GEP indices.  For the
-/// special case of a variable index to a 2-element array, ArrayOffset is set
-/// to the array element size.
-void SROA::isSafeGEP(GetElementPtrInst *GEPI, AllocaInst *AI,
-                     uint64_t &Offset, uint64_t &ArrayOffset,
-                     AllocaInfo &Info) {
-  gep_type_iterator GEPIt = gep_type_begin(GEPI), E = gep_type_end(GEPI);
-  if (GEPIt == E)
-    return;
+  gep_type_iterator I = gep_type_begin(GEPI), E = gep_type_end(GEPI);

-  // The first GEP index must be zero.
-  if (!isa<ConstantInt>(GEPIt.getOperand()) ||
-      !cast<ConstantInt>(GEPIt.getOperand())->isZero())
+  // The GEP is not safe to transform if not of the form "GEP <ptr>, 0, <cst>".
+  if (I == E ||
+      I.getOperand() != Constant::getNullValue(I.getOperand()->getType())) {
     return MarkUnsafe(Info);
-  if (++GEPIt == E)
-    return;
+  }
+
+  ++I;
+  if (I == E) return MarkUnsafe(Info);  // ran out of GEP indices??

+  bool IsAllZeroIndices = true;
+
   // If the first index is a non-constant index into an array, see if we can
   // handle it as a special case.
-  const Type *ArrayEltTy = 0;
-  if (ArrayOffset == 0 && Offset == 0) {
-    if (const ArrayType *AT = dyn_cast<ArrayType>(*GEPIt)) {
-      if (!isa<ConstantInt>(GEPIt.getOperand())) {
-        uint64_t NumElements = AT->getNumElements();
-
-        // If this is an array index and the index is not constant, we cannot
-        // promote... that is unless the array has exactly one or two elements
-        // in it, in which case we CAN promote it, but we have to canonicalize
-        // this out if this is the only problem.
-        if ((NumElements != 1 && NumElements != 2) || !AllUsersAreLoads(GEPI))
-          return MarkUnsafe(Info);
+  if (const ArrayType *AT = dyn_cast<ArrayType>(*I)) {
+    if (!isa<ConstantInt>(I.getOperand())) {
+      IsAllZeroIndices = 0;
+      uint64_t NumElements = AT->getNumElements();
+
+      // If this is an array index and the index is not constant, we cannot
+      // promote... that is unless the array has exactly one or two elements in
+      // it, in which case we CAN promote it, but we have to canonicalize this
+      // out if this is the only problem.
+      if ((NumElements == 1 || NumElements == 2) &&
+          AllUsersAreLoads(GEPI)) {
         Info.needsCleanup = true;
-        ArrayOffset = TD->getTypeAllocSizeInBits(AT->getElementType());
-        ArrayEltTy = AT->getElementType();
-        ++GEPIt;
+        return;  // Canonicalization required!
       }
+      return MarkUnsafe(Info);
     }
   }
-
+
   // Walk through the GEP type indices, checking the types that this indexes
   // into.
-  for (; GEPIt != E; ++GEPIt) {
+  for (; I != E; ++I) {
     // Ignore struct elements, no extra checking needed for these.
-    if (isa<StructType>(*GEPIt))
+    if (isa<StructType>(*I))
       continue;
+
+    ConstantInt *IdxVal = dyn_cast<ConstantInt>(I.getOperand());
+    if (!IdxVal) return MarkUnsafe(Info);

-    ConstantInt *IdxVal = dyn_cast<ConstantInt>(GEPIt.getOperand());
-    if (!IdxVal)
-      return MarkUnsafe(Info);
-
-    if (const ArrayType *AT = dyn_cast<ArrayType>(*GEPIt)) {
+    // Are all indices still zero?
+    IsAllZeroIndices &= IdxVal->isZero();
+
+    if (const ArrayType *AT = dyn_cast<ArrayType>(*I)) {
       // This GEP indexes an array.  Verify that this is an in-range constant
       // integer. Specifically, consider A[0][i]. We cannot know that the user
       // isn't doing invalid things like allowing i to index an out-of-range
@@ -494,254 +612,144 @@ void SROA::isSafeGEP(GetElementPtrInst *GEPI, AllocaInst *AI,
494 494
       // of any accesses into structs where any of the components are variables.
495 495
       if (IdxVal->getZExtValue() >= AT->getNumElements())
496 496
         return MarkUnsafe(Info);
497
-    } else {
498
-      const VectorType *VT = dyn_cast<VectorType>(*GEPIt);
499
-      assert(VT && "unexpected type in GEP type iterator");
497
+    } else if (const VectorType *VT = dyn_cast<VectorType>(*I)) {
500 498
       if (IdxVal->getZExtValue() >= VT->getNumElements())
501 499
         return MarkUnsafe(Info);
502 500
     }
503 501
   }
504
-
505
-  // All the indices are safe.  Now compute the offset due to this GEP and
506
-  // check if the alloca has a component element at that offset.
507
-  if (ArrayOffset == 0) {
508
-    SmallVector<Value*, 8> Indices(GEPI->op_begin() + 1, GEPI->op_end());
509
-    Offset += TD->getIndexedOffset(GEPI->getPointerOperandType(),
510
-                                   &Indices[0], Indices.size());
511
-  } else {
512
-    // Both array elements have the same type, so it suffices to check one of
513
-    // them.  Copy the GEP indices starting from the array index, but replace
514
-    // that variable index with a constant zero.
515
-    SmallVector<Value*, 8> Indices(GEPI->op_begin() + 2, GEPI->op_end());
516
-    Indices[0] = Constant::getNullValue(Type::getInt32Ty(GEPI->getContext()));
517
-    const Type *ArrayEltPtr = PointerType::getUnqual(ArrayEltTy);
518
-    Offset += TD->getIndexedOffset(ArrayEltPtr, &Indices[0], Indices.size());
519
-  }
520
-  if (!TypeHasComponent(AI->getAllocatedType(), Offset, 0))
521
-    MarkUnsafe(Info);
522
-}
523
-
524
-/// isSafeMemAccess - Check if a load/store/memcpy operates on the entire AI
525
-/// alloca or has an offset and size that corresponds to a component element
526
-/// within it.  The offset checked here may have been formed from a GEP with a
527
-/// pointer bitcasted to a different type.
528
-void SROA::isSafeMemAccess(AllocaInst *AI, uint64_t Offset,
529
-                           uint64_t ArrayOffset, uint64_t MemSize,
530
-                           const Type *MemOpType, bool isStore,
531
-                           AllocaInfo &Info) {
532
-  // Check if this is a load/store of the entire alloca.
533
-  if (Offset == 0 && ArrayOffset == 0 &&
534
-      MemSize == TD->getTypeAllocSize(AI->getAllocatedType())) {
535
-    bool UsesAggregateType = (MemOpType == AI->getAllocatedType());
536
-    // This is safe for MemIntrinsics (where MemOpType is 0), integer types
537
-    // (which are essentially the same as the MemIntrinsics, especially with
538
-    // regard to copying padding between elements), or references using the
539
-    // aggregate type of the alloca.
540
-    if (!MemOpType || isa<IntegerType>(MemOpType) || UsesAggregateType) {
541
-      if (!UsesAggregateType) {
542
-        if (isStore)
543
-          Info.isMemCpyDst = true;
544
-        else
545
-          Info.isMemCpySrc = true;
546
-      }
547
-      return;
548
-    }
549
-  }
550
-  // Check if the offset/size correspond to a component within the alloca type.
551
-  const Type *T = AI->getAllocatedType();
552
-  if (TypeHasComponent(T, Offset, MemSize) &&
553
-      (ArrayOffset == 0 || TypeHasComponent(T, Offset + ArrayOffset, MemSize)))
554
-    return;
555
-
556
-  return MarkUnsafe(Info);
502
+  
503
+  // If there are any non-simple uses of this getelementptr, make sure to reject
504
+  // them.
505
+  return isSafeElementUse(GEPI, IsAllZeroIndices, AI, Info);
557 506
 }
558 507
 
559
-/// TypeHasComponent - Return true if T has a component type with the
560
-/// specified offset and size.  If Size is zero, do not check the size.
561
-bool SROA::TypeHasComponent(const Type *T, uint64_t Offset, uint64_t Size) {
562
-  const Type *EltTy;
563
-  uint64_t EltSize;
564
-  if (const StructType *ST = dyn_cast<StructType>(T)) {
565
-    const StructLayout *Layout = TD->getStructLayout(ST);
566
-    unsigned EltIdx = Layout->getElementContainingOffset(Offset);
567
-    EltTy = ST->getContainedType(EltIdx);
568
-    EltSize = TD->getTypeAllocSize(EltTy);
569
-    Offset -= Layout->getElementOffset(EltIdx);
570
-  } else if (const ArrayType *AT = dyn_cast<ArrayType>(T)) {
571
-    EltTy = AT->getElementType();
572
-    EltSize = TD->getTypeAllocSize(EltTy);
573
-    Offset %= EltSize;
574
-  } else {
575
-    return false;
508
+/// isSafeMemIntrinsicOnAllocation - Check if the specified memory
509
+/// intrinsic can be promoted by SROA.  At this point, we know that the operand
510
+/// of the memintrinsic is a pointer to the beginning of the allocation.
511
+void SROA::isSafeMemIntrinsicOnAllocation(MemIntrinsic *MI, AllocaInst *AI,
512
+                                          unsigned OpNo, AllocaInfo &Info) {
513
+  // If not constant length, give up.
514
+  ConstantInt *Length = dyn_cast<ConstantInt>(MI->getLength());
515
+  if (!Length) return MarkUnsafe(Info);
516
+  
517
+  // If not the whole aggregate, give up.
518
+  if (Length->getZExtValue() !=
519
+      TD->getTypeAllocSize(AI->getType()->getElementType()))
520
+    return MarkUnsafe(Info);
521
+  
522
+  // We only know about memcpy/memset/memmove.
523
+  if (!isa<MemIntrinsic>(MI))
524
+    return MarkUnsafe(Info);
525
+  
526
+  // Otherwise, we can transform it.  Determine whether this is a memcpy/set
527
+  // into or out of the aggregate.
528
+  if (OpNo == 1)
529
+    Info.isMemCpyDst = true;
530
+  else {
531
+    assert(OpNo == 2);
532
+    Info.isMemCpySrc = true;
576 533
   }
577
-  if (Offset == 0 && (Size == 0 || EltSize == Size))
578
-    return true;
579
-  // Check if the component spans multiple elements.
580
-  if (Offset + Size > EltSize)
581
-    return false;
582
-  return TypeHasComponent(EltTy, Offset, Size);
583 534
 }
584 535
 
585
-/// RewriteForScalarRepl - Alloca AI is being split into NewElts, so rewrite
586
-/// the instruction I, which references it, to use the separate elements.
587
-/// Offset indicates the position within AI that is referenced by this
588
-/// instruction.
589
-void SROA::RewriteForScalarRepl(Instruction *I, AllocaInst *AI, uint64_t Offset,
590
-                                SmallVector<AllocaInst*, 32> &NewElts) {
591
-  for (Value::use_iterator UI = I->use_begin(), E = I->use_end(); UI != E; ) {
592
-    Instruction *User = cast<Instruction>(*UI++);
536
+/// isSafeUseOfBitCastedAllocation - Check if all users of this bitcast
537
+/// from an alloca are safe for SROA of that alloca.
538
+void SROA::isSafeUseOfBitCastedAllocation(BitCastInst *BC, AllocaInst *AI,
539
+                                          AllocaInfo &Info) {
540
+  for (Value::use_iterator UI = BC->use_begin(), E = BC->use_end();
541
+       UI != E; ++UI) {
542
+    if (BitCastInst *BCU = dyn_cast<BitCastInst>(UI)) {
543
+      isSafeUseOfBitCastedAllocation(BCU, AI, Info);
544
+    } else if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(UI)) {
545
+      isSafeMemIntrinsicOnAllocation(MI, AI, UI.getOperandNo(), Info);
546
+    } else if (StoreInst *SI = dyn_cast<StoreInst>(UI)) {
547
+      if (SI->isVolatile())
548
+        return MarkUnsafe(Info);
549
+      
550
+      // If storing the entire alloca in one chunk through a bitcasted pointer
551
+      // to integer, we can transform it.  This happens (for example) when you
552
+      // cast a {i32,i32}* to i64* and store through it.  This is similar to the
553
+      // memcpy case and occurs in various "byval" cases and emulated memcpys.
554
+      if (isa<IntegerType>(SI->getOperand(0)->getType()) &&
555
+          TD->getTypeAllocSize(SI->getOperand(0)->getType()) ==
556
+          TD->getTypeAllocSize(AI->getType()->getElementType())) {
557
+        Info.isMemCpyDst = true;
558
+        continue;
559
+      }
560
+      return MarkUnsafe(Info);
561
+    } else if (LoadInst *LI = dyn_cast<LoadInst>(UI)) {
562
+      if (LI->isVolatile())
563
+        return MarkUnsafe(Info);
593 564
 
594
-    if (BitCastInst *BC = dyn_cast<BitCastInst>(User)) {
595
-      if (BC->getOperand(0) == AI)
596
-        BC->setOperand(0, NewElts[0]);
597
-      // If the bitcast type now matches the operand type, it will be removed
598
-      // after processing its uses.
599
-      RewriteForScalarRepl(BC, AI, Offset, NewElts);
600
-    } else if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(User)) {
601
-      RewriteGEP(GEPI, AI, Offset, NewElts);
602
-    } else if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(User)) {
603
-      ConstantInt *Length = dyn_cast<ConstantInt>(MI->getLength());
604
-      uint64_t MemSize = Length->getZExtValue();
605
-      if (Offset == 0 &&
606
-          MemSize == TD->getTypeAllocSize(AI->getAllocatedType()))
607
-        RewriteMemIntrinUserOfAlloca(MI, I, AI, NewElts);
608
-    } else if (LoadInst *LI = dyn_cast<LoadInst>(User)) {
609
-      const Type *LIType = LI->getType();
610
-      if (LIType == AI->getAllocatedType()) {
611
-        // Replace:
612
-        //   %res = load { i32, i32 }* %alloc
613
-        // with:
614
-        //   %load.0 = load i32* %alloc.0
615
-        //   %insert.0 insertvalue { i32, i32 } zeroinitializer, i32 %load.0, 0
616
-        //   %load.1 = load i32* %alloc.1
617
-        //   %insert = insertvalue { i32, i32 } %insert.0, i32 %load.1, 1
618
-        // (Also works for arrays instead of structs)
619
-        Value *Insert = UndefValue::get(LIType);
620
-        for (unsigned i = 0, e = NewElts.size(); i != e; ++i) {
621
-          Value *Load = new LoadInst(NewElts[i], "load", LI);
622
-          Insert = InsertValueInst::Create(Insert, Load, i, "insert", LI);
623
-        }
624
-        LI->replaceAllUsesWith(Insert);
625
-        LI->eraseFromParent();
626
-      } else if (isa<IntegerType>(LIType) &&
627
-                 TD->getTypeAllocSize(LIType) ==
628
-                 TD->getTypeAllocSize(AI->getAllocatedType())) {
629
-        // If this is a load of the entire alloca to an integer, rewrite it.
630
-        RewriteLoadUserOfWholeAlloca(LI, AI, NewElts);
565
+      // If loading the entire alloca in one chunk through a bitcasted pointer
566
+      // to integer, we can transform it.  This happens (for example) when you
567
+      // cast a {i32,i32}* to i64* and load through it.  This is similar to the
568
+      // memcpy case and occurs in various "byval" cases and emulated memcpys.
569
+      if (isa<IntegerType>(LI->getType()) &&
570
+          TD->getTypeAllocSize(LI->getType()) ==
571
+          TD->getTypeAllocSize(AI->getType()->getElementType())) {
572
+        Info.isMemCpySrc = true;
573
+        continue;
631 574
       }
632
-    } else if (StoreInst *SI = dyn_cast<StoreInst>(User)) {
633
-      Value *Val = SI->getOperand(0);
634
-      const Type *SIType = Val->getType();
635
-      if (SIType == AI->getAllocatedType()) {
636
-        // Replace:
637
-        //   store { i32, i32 } %val, { i32, i32 }* %alloc
638
-        // with:
639
-        //   %val.0 = extractvalue { i32, i32 } %val, 0
640
-        //   store i32 %val.0, i32* %alloc.0
641
-        //   %val.1 = extractvalue { i32, i32 } %val, 1
642
-        //   store i32 %val.1, i32* %alloc.1
643
-        // (Also works for arrays instead of structs)
644
-        for (unsigned i = 0, e = NewElts.size(); i != e; ++i) {
645
-          Value *Extract = ExtractValueInst::Create(Val, i, Val->getName(), SI);
646
-          new StoreInst(Extract, NewElts[i], SI);
647
-        }
648
-        SI->eraseFromParent();
649
-      } else if (isa<IntegerType>(SIType) &&
650
-                 TD->getTypeAllocSize(SIType) ==
651
-                 TD->getTypeAllocSize(AI->getAllocatedType())) {
652
-        // If this is a store of the entire alloca from an integer, rewrite it.
653
-        RewriteStoreUserOfWholeAlloca(SI, AI, NewElts);
575
+      return MarkUnsafe(Info);
576
+    } else if (isa<DbgInfoIntrinsic>(UI)) {
577
+      // If one user is DbgInfoIntrinsic then check if all users are
578
+      // DbgInfoIntrinsics.
579
+      if (OnlyUsedByDbgInfoIntrinsics(BC)) {
580
+        Info.needsCleanup = true;
581
+        return;
654 582
       }
583
+      else
584
+        MarkUnsafe(Info);
655 585
     }
656
-  }
657
-  // Delete unused instructions and identity bitcasts.
658
-  if (I->use_empty())
659
-    I->eraseFromParent();
660
-  else if (BitCastInst *BC = dyn_cast<BitCastInst>(I)) {
661
-    if (BC->getDestTy() == BC->getSrcTy()) {
662
-      BC->replaceAllUsesWith(BC->getOperand(0));
663
-      BC->eraseFromParent();
586
+    else {
587
+      return MarkUnsafe(Info);
664 588
     }
589
+    if (Info.isUnsafe) return;
665 590
   }
666 591
 }
667 592
 
668
-/// FindElementAndOffset - Return the index of the element containing Offset
669
-/// within the specified type, which must be either a struct or an array.
670
-/// Sets T to the type of the element and Offset to the offset within that
671
-/// element.
672
-unsigned SROA::FindElementAndOffset(const Type *&T, uint64_t &Offset) {
-  unsigned Idx = 0;
-  if (const StructType *ST = dyn_cast<StructType>(T)) {
-    const StructLayout *Layout = TD->getStructLayout(ST);
-    Idx = Layout->getElementContainingOffset(Offset);
-    T = ST->getContainedType(Idx);
-    Offset -= Layout->getElementOffset(Idx);
-  } else {
-    const ArrayType *AT = dyn_cast<ArrayType>(T);
-    assert(AT && "unexpected type for scalar replacement");
-    T = AT->getElementType();
-    uint64_t EltSize = TD->getTypeAllocSize(T);
-    Idx = (unsigned)(Offset / EltSize);
-    Offset -= Idx * EltSize;
-  }
-  return Idx;
-}
+/// RewriteBitCastUserOfAlloca - BCInst (transitively) bitcasts AI, or indexes
+/// to its first element.  Transform users of the cast to use the new values
+/// instead.
+void SROA::RewriteBitCastUserOfAlloca(Instruction *BCInst, AllocaInst *AI,
+                                      SmallVector<AllocaInst*, 32> &NewElts) {
+  Value::use_iterator UI = BCInst->use_begin(), UE = BCInst->use_end();
+  while (UI != UE) {
+    Instruction *User = cast<Instruction>(*UI++);
+    if (BitCastInst *BCU = dyn_cast<BitCastInst>(User)) {
+      RewriteBitCastUserOfAlloca(BCU, AI, NewElts);
+      if (BCU->use_empty()) BCU->eraseFromParent();
+      continue;
+    }
 
-/// RewriteGEP - Check if this GEP instruction moves the pointer across
-/// elements of the alloca that are being split apart, and if so, rewrite
-/// the GEP to be relative to the new element.
-void SROA::RewriteGEP(GetElementPtrInst *GEPI, AllocaInst *AI, uint64_t Offset,
-                      SmallVector<AllocaInst*, 32> &NewElts) {
-  Instruction *Val = GEPI;
-
-  uint64_t OldOffset = Offset;
-  SmallVector<Value*, 8> Indices(GEPI->op_begin() + 1, GEPI->op_end());
-  Offset += TD->getIndexedOffset(GEPI->getPointerOperandType(),
-                                 &Indices[0], Indices.size());
-
-  const Type *T = AI->getAllocatedType();
-  unsigned OldIdx = FindElementAndOffset(T, OldOffset);
-  if (GEPI->getOperand(0) == AI)
-    OldIdx = ~0U; // Force the GEP to be rewritten.
-
-  T = AI->getAllocatedType();
-  uint64_t EltOffset = Offset;
-  unsigned Idx = FindElementAndOffset(T, EltOffset);
-
-  // If this GEP moves the pointer across elements of the alloca that are
-  // being split, then it needs to be rewritten.
-  if (Idx != OldIdx) {
-    const Type *i32Ty = Type::getInt32Ty(AI->getContext());
-    SmallVector<Value*, 8> NewArgs;
-    NewArgs.push_back(Constant::getNullValue(i32Ty));
-    while (EltOffset != 0) {
-      unsigned EltIdx = FindElementAndOffset(T, EltOffset);
-      NewArgs.push_back(ConstantInt::get(i32Ty, EltIdx));
+    if (MemIntrinsic *MI = dyn_cast<MemIntrinsic>(User)) {
+      // This must be memcpy/memmove/memset of the entire aggregate.
+      // Split into one per element.
+      RewriteMemIntrinUserOfAlloca(MI, BCInst, AI, NewElts);
+      continue;
     }
-    if (NewArgs.size() > 1) {
-      Val = GetElementPtrInst::CreateInBounds(NewElts[Idx], NewArgs.begin(),
-                                              NewArgs.end(), "", GEPI);
-      Val->takeName(GEPI);
-      if (Val->getType() != GEPI->getType())
-        Val = new BitCastInst(Val, GEPI->getType(), Val->getNameStr(), GEPI);
-    } else {
-      Val = NewElts[Idx];
-      // Insert a new bitcast.  If the types match, it will be removed after
-      // handling all of its uses.
-      Val = new BitCastInst(Val, GEPI->getType(), Val->getNameStr(), GEPI);
-      Val->takeName(GEPI);
+
+    if (StoreInst *SI = dyn_cast<StoreInst>(User)) {
+      // If this is a store of the entire alloca from an integer, rewrite it.
+      RewriteStoreUserOfWholeAlloca(SI, AI, NewElts);
+      continue;
     }
 
-    GEPI->replaceAllUsesWith(Val);
-    GEPI->eraseFromParent();
+    if (LoadInst *LI = dyn_cast<LoadInst>(User)) {
+      // If this is a load of the entire alloca to an integer, rewrite it.
+      RewriteLoadUserOfWholeAlloca(LI, AI, NewElts);
+      continue;
+    }
+
+    // Otherwise it must be some other user of a gep of the first pointer.  Just
+    // leave these alone.
+    continue;
   }
-
-  RewriteForScalarRepl(Val, AI, Offset, NewElts);
 }
 
 /// RewriteMemIntrinUserOfAlloca - MI is a memcpy/memset/memmove from or to AI.
 /// Rewrite it to copy or set the elements of the scalarized memory.
-void SROA::RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *Inst,
+void SROA::RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *BCInst,
                                         AllocaInst *AI,
                                         SmallVector<AllocaInst*, 32> &NewElts) {
 
... ...
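To make the per-element split below concrete, here is a minimal standalone C++ sketch (illustrative only; Pair, copyWhole and copySplit are invented names, not code from this commit). A memcpy that covers the entire aggregate is semantically a field-by-field copy, which is what the rewrite emits for each new element alloca:

    #include <cstring>

    struct Pair { int First; long Second; };  // stands in for the alloca'd aggregate

    // Before: one intrinsic covering every field at once.
    void copyWhole(Pair *Dst, const Pair *Src) {
      std::memcpy(Dst, Src, sizeof(Pair));
    }

    // After: one copy per element, so each field can live in its own scalar
    // alloca and later be promoted to a register by mem2reg.
    void copySplit(Pair *Dst, const Pair *Src) {
      Dst->First = Src->First;
      Dst->Second = Src->Second;
    }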
@@ -753,10 +761,10 @@ void SROA::RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *Inst,
   LLVMContext &Context = MI->getContext();
   unsigned MemAlignment = MI->getAlignment();
   if (MemTransferInst *MTI = dyn_cast<MemTransferInst>(MI)) { // memmove/memcopy
-    if (Inst == MTI->getRawDest())
+    if (BCInst == MTI->getRawDest())
       OtherPtr = MTI->getRawSource();
     else {
-      assert(Inst == MTI->getRawSource());
+      assert(BCInst == MTI->getRawSource());
       OtherPtr = MTI->getRawDest();
     }
   }
... ...
@@ -790,7 +798,7 @@ void SROA::RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *Inst,
   // Process each element of the aggregate.
   Value *TheFn = MI->getOperand(0);
   const Type *BytePtrTy = MI->getRawDest()->getType();
-  bool SROADest = MI->getRawDest() == Inst;
+  bool SROADest = MI->getRawDest() == BCInst;
 
   Constant *Zero = Constant::getNullValue(Type::getInt32Ty(MI->getContext()));
 
... ...
@@ -802,9 +810,9 @@ void SROA::RewriteMemIntrinUserOfAlloca(MemIntrinsic *MI, Instruction *Inst,
     if (OtherPtr) {
       Value *Idx[2] = { Zero,
                       ConstantInt::get(Type::getInt32Ty(MI->getContext()), i) };
-      OtherElt = GetElementPtrInst::CreateInBounds(OtherPtr, Idx, Idx + 2,
+      OtherElt = GetElementPtrInst::Create(OtherPtr, Idx, Idx + 2,
                                            OtherPtr->getNameStr()+"."+Twine(i),
-                                                   MI);
+                                           MI);
       uint64_t EltOffset;
       const PointerType *OtherPtrTy = cast<PointerType>(OtherPtr->getType());
       if (const StructType *ST =
... ...
@@ -929,9 +937,15 @@ void SROA::RewriteStoreUserOfWholeAlloca(StoreInst *SI, AllocaInst *AI,
   // Extract each element out of the integer according to its structure offset
   // and store the element value to the individual alloca.
   Value *SrcVal = SI->getOperand(0);
-  const Type *AllocaEltTy = AI->getAllocatedType();
+  const Type *AllocaEltTy = AI->getType()->getElementType();
   uint64_t AllocaSizeBits = TD->getTypeAllocSizeInBits(AllocaEltTy);
 
+  // If this isn't a store of an integer to the whole alloca, it may be a store
+  // to the first element.  Just ignore the store in this case and normal SROA
+  // will handle it.
+  if (!isa<IntegerType>(SrcVal->getType()) ||
+      TD->getTypeAllocSizeInBits(SrcVal->getType()) != AllocaSizeBits)
+    return;
   // Handle tail padding by extending the operand
   if (TD->getTypeSizeInBits(SrcVal->getType()) != AllocaSizeBits)
     SrcVal = new ZExtInst(SrcVal,
... ...
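The shift-and-truncate strategy used by RewriteStoreUserOfWholeAlloca can be seen in self-contained form below (an editorial sketch; struct S and the constant are hypothetical, and little-endian layout is assumed). Storing one wide integer over the whole aggregate is equivalent to storing each element extracted at its structure offset:

    #include <cassert>
    #include <cstdint>

    struct S { std::uint16_t A; std::uint16_t B; std::uint32_t C; };

    int main() {
      const std::uint64_t Val = 0xAABBCCDD11223344ULL; // value stored to the whole alloca

      // Extract each element at its offset, mirroring the emitted shift+trunc.
      S Elts;
      Elts.A = static_cast<std::uint16_t>(Val);        // bits [0,16)  -> offset 0
      Elts.B = static_cast<std::uint16_t>(Val >> 16);  // bits [16,32) -> offset 2
      Elts.C = static_cast<std::uint32_t>(Val >> 32);  // bits [32,64) -> offset 4

      assert(Elts.A == 0x3344 && Elts.B == 0x1122 && Elts.C == 0xAABBCCDD);
      return 0;
    }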
@@ -1045,9 +1059,16 @@ void SROA::RewriteLoadUserOfWholeAlloca(LoadInst *LI, AllocaInst *AI,
                                         SmallVector<AllocaInst*, 32> &NewElts) {
   // Extract each element out of the NewElts according to its structure offset
   // and form the result value.
-  const Type *AllocaEltTy = AI->getAllocatedType();
+  const Type *AllocaEltTy = AI->getType()->getElementType();
   uint64_t AllocaSizeBits = TD->getTypeAllocSizeInBits(AllocaEltTy);
 
+  // If this isn't a load of the whole alloca to an integer, it may be a load
+  // of the first element.  Just ignore the load in this case and normal SROA
+  // will handle it.
+  if (!isa<IntegerType>(LI->getType()) ||
+      TD->getTypeAllocSizeInBits(LI->getType()) != AllocaSizeBits)
+    return;
+
   DEBUG(errs() << "PROMOTING LOAD OF WHOLE ALLOCA: " << *AI << '\n' << *LI
                << '\n');
 
... ...
@@ -1121,6 +1142,7 @@ void SROA::RewriteLoadUserOfWholeAlloca(LoadInst *LI, AllocaInst *AI,
   LI->eraseFromParent();
 }
 
+
 /// HasPadding - Return true if the specified type has any structure or
 /// alignment padding, false otherwise.
 static bool HasPadding(const Type *Ty, const TargetData &TD) {
... ...
@@ -1170,10 +1192,14 @@ int SROA::isSafeAllocaToScalarRepl(AllocaInst *AI) {
   // the users are safe to transform.
   AllocaInfo Info;
 
-  isSafeForScalarRepl(AI, AI, 0, 0, Info);
-  if (Info.isUnsafe) {
-    DEBUG(errs() << "Cannot transform: " << *AI << '\n');
-    return 0;
+  for (Value::use_iterator I = AI->use_begin(), E = AI->use_end();
+       I != E; ++I) {
+    isSafeUseOfAllocation(cast<Instruction>(*I), AI, Info);
+    if (Info.isUnsafe) {
+      DEBUG(errs() << "Cannot transform: " << *AI << "\n  due to user: "
+                   << **I << '\n');
+      return 0;
+    }
   }
 
   // Okay, we know all the users are promotable.  If the aggregate is a memcpy
... ...
@@ -1182,7 +1208,7 @@ int SROA::isSafeAllocaToScalarRepl(AllocaInst *AI) {
   // types, but may actually be used.  In these cases, we refuse to promote the
   // struct.
   if (Info.isMemCpySrc && Info.isMemCpyDst &&
-      HasPadding(AI->getAllocatedType(), *TD))
+      HasPadding(AI->getType()->getElementType(), *TD))
     return 0;
 
   // If we require cleanup, return 1, otherwise return 3.
... ...
@@ -1219,15 +1245,15 @@ void SROA::CleanupGEP(GetElementPtrInst *GEPI) {
   // Insert the new GEP instructions, which are properly indexed.
   SmallVector<Value*, 8> Indices(GEPI->op_begin()+1, GEPI->op_end());
   Indices[1] = Constant::getNullValue(Type::getInt32Ty(GEPI->getContext()));
-  Value *ZeroIdx = GetElementPtrInst::CreateInBounds(GEPI->getOperand(0),
-                                                     Indices.begin(),
-                                                     Indices.end(),
-                                                     GEPI->getName()+".0",GEPI);
+  Value *ZeroIdx = GetElementPtrInst::Create(GEPI->getOperand(0),
+                                             Indices.begin(),
+                                             Indices.end(),
+                                             GEPI->getName()+".0", GEPI);
   Indices[1] = ConstantInt::get(Type::getInt32Ty(GEPI->getContext()), 1);
-  Value *OneIdx = GetElementPtrInst::CreateInBounds(GEPI->getOperand(0),
-                                                    Indices.begin(),
-                                                    Indices.end(),
-                                                    GEPI->getName()+".1", GEPI);
+  Value *OneIdx = GetElementPtrInst::Create(GEPI->getOperand(0),
+                                            Indices.begin(),
+                                            Indices.end(),
+                                            GEPI->getName()+".1", GEPI);
   // Replace all loads of the variable index GEP with loads from both
   // indexes and a select.
   while (!GEPI->use_empty()) {
... ...
@@ -1238,24 +1264,22 @@ void SROA::CleanupGEP(GetElementPtrInst *GEPI) {
     LI->replaceAllUsesWith(R);
     LI->eraseFromParent();
   }
+  GEPI->eraseFromParent();
 }
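The load-rewriting loop in CleanupGEP implements the equivalence sketched below (a hedged illustration with invented function names, not code from the patch): a load through a two-element GEP with a run-time index becomes two fixed loads and a select, leaving nothing that blocks promotion.

    #include <cassert>
    #include <cstdint>

    // Before: a load whose index is only known at run time.
    static std::uint32_t loadVariableIndex(const std::uint32_t (&A)[2],
                                           std::uint64_t Idx) {
      return A[Idx];
    }

    // After: two fixed-index loads plus a select on the index, so mem2reg can
    // promote both slots to registers.
    static std::uint32_t loadWithSelect(const std::uint32_t (&A)[2],
                                        std::uint64_t Idx) {
      std::uint32_t V0 = A[0], V1 = A[1];
      return Idx == 0 ? V0 : V1;
    }

    int main() {
      std::uint32_t A[2] = {7, 9};
      assert(loadVariableIndex(A, 0) == loadWithSelect(A, 0));
      assert(loadVariableIndex(A, 1) == loadWithSelect(A, 1));
      return 0;
    }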
 
+
 /// CleanupAllocaUsers - If SROA reported that it can promote the specified
 /// allocation, but only if cleaned up, perform the cleanups required.
-void SROA::CleanupAllocaUsers(Value *V) {
+void SROA::CleanupAllocaUsers(AllocaInst *AI) {
   // At this point, we know that the end result will be SROA'd and promoted, so
   // we can insert ugly code if required so long as sroa+mem2reg will clean it
   // up.
-  for (Value::use_iterator UI = V->use_begin(), E = V->use_end();
+  for (Value::use_iterator UI = AI->use_begin(), E = AI->use_end();
        UI != E; ) {
     User *U = *UI++;
-    if (isa<BitCastInst>(U)) {
-      CleanupAllocaUsers(U);
-    } else if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(U)) {
+    if (GetElementPtrInst *GEPI = dyn_cast<GetElementPtrInst>(U))
       CleanupGEP(GEPI);
-      CleanupAllocaUsers(GEPI);
-      if (GEPI->use_empty()) GEPI->eraseFromParent();
-    } else {
+    else {
       Instruction *I = cast<Instruction>(U);
       SmallVector<DbgInfoIntrinsic *, 2> DbgInUses;
       if (!isa<StoreInst>(I) && OnlyUsedByDbgInfoIntrinsics(I, &DbgInUses)) {
... ...
@@ -1371,7 +1395,7 @@ bool SROA::CanConvertToScalar(Value *V, bool &IsNotTrivial, const Type *&VecTy,
 
       // Compute the offset that this GEP adds to the pointer.
       SmallVector<Value*, 8> Indices(GEP->op_begin()+1, GEP->op_end());
-      uint64_t GEPOffset = TD->getIndexedOffset(GEP->getPointerOperandType(),
+      uint64_t GEPOffset = TD->getIndexedOffset(GEP->getOperand(0)->getType(),
                                                 &Indices[0], Indices.size());
       // See if all uses can be converted.
       if (!CanConvertToScalar(GEP, IsNotTrivial, VecTy, SawVec,Offset+GEPOffset,
... ...
@@ -1433,7 +1457,7 @@ void SROA::ConvertUsesToScalar(Value *Ptr, AllocaInst *NewAI, uint64_t Offset) {
     if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(User)) {
       // Compute the offset that this GEP adds to the pointer.
       SmallVector<Value*, 8> Indices(GEP->op_begin()+1, GEP->op_end());
-      uint64_t GEPOffset = TD->getIndexedOffset(GEP->getPointerOperandType(),
+      uint64_t GEPOffset = TD->getIndexedOffset(GEP->getOperand(0)->getType(),
                                                 &Indices[0], Indices.size());
       ConvertUsesToScalar(GEP, NewAI, Offset+GEPOffset*8);
       GEP->eraseFromParent();
... ...
@@ -2644,10 +2644,11 @@ bool SimplifyLibCalls::doInitialization(Module &M) {
 //   * strcspn("",a) -> 0
 //   * strcspn(s,"") -> strlen(a)
 //
-// strstr:
+// strstr: (PR5783)
 //   * strstr(x,x)  -> x
-//   * strstr(s1,s2) -> offset_of_s2_in(s1)
-//       (if s1 and s2 are constant strings)
+//   * strstr(x, "") -> x
+//   * strstr(x, "a") -> strchr(x, 'a')
+//   * strstr(s1,s2) -> result   (if s1 and s2 are constant strings)
 //
 // tan, tanf, tanl:
 //   * tan(atan(x)) -> x
... ...
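The strstr entries recorded above are easy to sanity-check; the following small program (an illustration added for this write-up, not part of the commit) exercises each identity, all of which are guaranteed by the C standard:

    #include <cassert>
    #include <cstring>

    int main() {
      const char *S = "hello";
      assert(std::strstr(S, S) == S);                      // strstr(x, x)  -> x
      assert(std::strstr(S, "") == S);                     // strstr(x, "") -> x
      assert(std::strstr(S, "l") == std::strchr(S, 'l'));  // one-char needle -> strchr
      return 0;
    }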
@@ -55,7 +55,6 @@ struct DenseMapInfo<std::pair<BasicBlock*, unsigned> > {
   static bool isEqual(const EltTy &LHS, const EltTy &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return true; }
 };
 }
 
... ...
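The isPod() members deleted in these hunks are superseded by a standalone trait, so the pod-like question is answered once per type rather than per DenseMapInfo specialization. A simplified sketch of the pattern (MyValue is a made-up type; LLVM's real trait lives in llvm/Support/type_traits.h and is more generous, e.g. all non-class types qualify):

    #include <cstdio>

    // Conservative default: a type is not assumed to act like a POD.
    template <typename T>
    struct isPodLike { static const bool value = false; };

    // Genuine PODs and pod-like class types opt in via explicit specialization.
    template <> struct isPodLike<int> { static const bool value = true; };

    struct MyValue {                        // not a POD (it has a constructor)...
      unsigned Index;
      MyValue(unsigned I = 0) : Index(I) {}
    };
    template <>                             // ...but verified safe to memcpy.
    struct isPodLike<MyValue> { static const bool value = true; };

    int main() {
      std::printf("int=%d MyValue=%d\n",
                  int(isPodLike<int>::value), int(isPodLike<MyValue>::value));
      return 0;
    }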
@@ -102,7 +101,7 @@ namespace {
   public:
     typedef std::vector<Value *> ValVector;
 
-    RenamePassData() {}
+    RenamePassData() : BB(NULL), Pred(NULL), Values() {}
     RenamePassData(BasicBlock *B, BasicBlock *P,
                    const ValVector &V) : BB(B), Pred(P), Values(V) {}
     BasicBlock *BB;
... ...
@@ -62,7 +62,6 @@ struct DenseMapAPIntKeyInfo {
   static bool isEqual(const KeyTy &LHS, const KeyTy &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return false; }
 };
 
 struct DenseMapAPFloatKeyInfo {
... ...
@@ -89,7 +88,6 @@ struct DenseMapAPFloatKeyInfo {
   static bool isEqual(const KeyTy &LHS, const KeyTy &RHS) {
     return LHS == RHS;
   }
-  static bool isPod() { return false; }
 };
 
 class LLVMContextImpl {
... ...
@@ -41,6 +41,10 @@ Pass::~Pass() {
 // Force out-of-line virtual method.
 ModulePass::~ModulePass() { }
 
+PassManagerType ModulePass::getPotentialPassManagerType() const {
+  return PMT_ModulePassManager;
+}
+
 bool Pass::mustPreserveAnalysisID(const PassInfo *AnalysisID) const {
   return Resolver->getAnalysisIfAvailable(AnalysisID, true) != 0;
 }
... ...
@@ -60,6 +64,27 @@ const char *Pass::getPassName() const {
   return "Unnamed pass: implement Pass::getPassName()";
 }
 
+void Pass::preparePassManager(PMStack &) {
+  // By default, don't do anything.
+}
+
+PassManagerType Pass::getPotentialPassManagerType() const {
+  // Default implementation.
+  return PMT_Unknown;
+}
+
+void Pass::getAnalysisUsage(AnalysisUsage &) const {
+  // By default, no analysis results are used, all are invalidated.
+}
+
+void Pass::releaseMemory() {
+  // By default, don't do anything.
+}
+
+void Pass::verifyAnalysis() const {
+  // By default, don't do anything.
+}
+
 // print - Print out the internal state of the pass.  This is called by Analyze
 // to print out the contents of an analysis.  Otherwise it is not necessary to
 // implement this method.
... ...
@@ -79,6 +104,10 @@ void Pass::dump() const {
 // Force out-of-line virtual method.
 ImmutablePass::~ImmutablePass() { }
 
+void ImmutablePass::initializePass() {
+  // By default, don't do anything.
+}
+
 //===----------------------------------------------------------------------===//
 // FunctionPass Implementation
 //
... ...
@@ -107,6 +136,20 @@ bool FunctionPass::run(Function &F) {
   return Changed | doFinalization(*F.getParent());
 }
 
+bool FunctionPass::doInitialization(Module &) {
+  // By default, don't do anything.
+  return false;
+}
+
+bool FunctionPass::doFinalization(Module &) {
+  // By default, don't do anything.
+  return false;
+}
+
+PassManagerType FunctionPass::getPotentialPassManagerType() const {
+  return PMT_FunctionPassManager;
+}
+
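As a usage illustration (a hypothetical pass, sketched against the pass API of this era and not part of the patch): with the defaults above defined out of line, a subclass only overrides what it actually needs.

    #include "llvm/Pass.h"
    #include "llvm/Function.h"
    using namespace llvm;

    namespace {
      // Counts basic blocks; inherits the no-op doInitialization,
      // doFinalization and releaseMemory defaults defined above.
      struct CountBlocks : public FunctionPass {
        static char ID;
        unsigned Total;
        CountBlocks() : FunctionPass(&ID), Total(0) {}

        virtual bool runOnFunction(Function &F) {
          Total += F.size();  // one per basic block
          return false;       // the IR is left unmodified
        }
      };
    }
    char CountBlocks::ID = 0;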
 //===----------------------------------------------------------------------===//
 // BasicBlockPass Implementation
 //
... ...
@@ -121,6 +164,30 @@ bool BasicBlockPass::runOnFunction(Function &F) {
   return Changed | doFinalization(F);
 }
 
+bool BasicBlockPass::doInitialization(Module &) {
+  // By default, don't do anything.
+  return false;
+}
+
+bool BasicBlockPass::doInitialization(Function &) {
+  // By default, don't do anything.
+  return false;
+}
+
+bool BasicBlockPass::doFinalization(Function &) {
+  // By default, don't do anything.
+  return false;
+}
+
+bool BasicBlockPass::doFinalization(Module &) {
+  // By default, don't do anything.
+  return false;
+}
+
+PassManagerType BasicBlockPass::getPotentialPassManagerType() const {
+  return PMT_BasicBlockPassManager;
+}
+
 //===----------------------------------------------------------------------===//
 // Pass Registration mechanism
 //
... ...
@@ -1,5 +1,7 @@
-; RUN: llc < %s -mtriple=i386-apple-darwin -asm-verbose=false   | FileCheck %s -check-prefix=32BIT
 ; RUN: llc < %s -mtriple=x86_64-apple-darwin -asm-verbose=false | FileCheck %s -check-prefix=64BIT
+; rdar://7329206
+
+; In 32-bit the partial register stall would degrade performance.
 
 define zeroext i16 @t1(i16 zeroext %c, i16 zeroext %k) nounwind ssp {
 entry:
new file mode 100644
... ...
@@ -0,0 +1,15 @@
+; RUN: llc < %s -march=x86 -o %t
+; RUN: grep "movl	.48, %ecx" %t
+; RUN: grep "movl	.24, %edx" %t
+; RUN: grep "movl	.12, %eax" %t
+
+%0 = type { i32, i32, i32 }
+
+define internal fastcc %0 @ReturnBigStruct() nounwind readnone {
+entry:
+  %0 = insertvalue %0 zeroinitializer, i32 12, 0
+  %1 = insertvalue %0 %0, i32 24, 1
+  %2 = insertvalue %0 %1, i32 48, 2
+  ret %0 %2
+}
+
new file mode 100644
... ...
@@ -0,0 +1,37 @@
+; RUN: llc < %s -mtriple=x86_64-apple-darwin | FileCheck %s
+; XFAIL: *
+; rdar://7329206
+
+; Use sbb x, x to materialize carry bit in a GPR. The value is either
+; all 1's or all 0's.
+
+define zeroext i16 @t1(i16 zeroext %x) nounwind readnone ssp {
+entry:
+; CHECK: t1:
+; CHECK: seta %al
+; CHECK: movzbl %al, %eax
+; CHECK: shll $5, %eax
+  %0 = icmp ugt i16 %x, 26                        ; <i1> [#uses=1]
+  %iftmp.1.0 = select i1 %0, i16 32, i16 0        ; <i16> [#uses=1]
+  ret i16 %iftmp.1.0
+}
+
+define zeroext i16 @t2(i16 zeroext %x) nounwind readnone ssp {
+entry:
+; CHECK: t2:
+; CHECK: sbbl %eax, %eax
+; CHECK: andl $32, %eax
+  %0 = icmp ult i16 %x, 26                        ; <i1> [#uses=1]
+  %iftmp.0.0 = select i1 %0, i16 32, i16 0        ; <i16> [#uses=1]
+  ret i16 %iftmp.0.0
+}
+
+define i64 @t3(i64 %x) nounwind readnone ssp {
+entry:
+; CHECK: t3:
+; CHECK: sbbq %rax, %rax
+; CHECK: andq $64, %rax
+  %0 = icmp ult i64 %x, 18                        ; <i1> [#uses=1]
+  %iftmp.2.0 = select i1 %0, i64 64, i64 0        ; <i64> [#uses=1]
+  ret i64 %iftmp.2.0
+}
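A note on the idiom these tests pin down: on x86, sbb of a register with itself computes r - r - CF = -CF, so after a compare it leaves all ones in the register when the carry flag is set and zero otherwise; AND-ing that mask with a constant materializes "cond ? C : 0" without a branch.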
new file mode 100644
... ...
@@ -0,0 +1,13 @@
+; RUN: llc < %s -march=x86-64 -disable-mmx | grep punpcklwd | count 2
+
+define void @foo() nounwind {
+  %cti69 = trunc <8 x i32> undef to <8 x i16>     ; <<8 x i16>> [#uses=1]
+  store <8 x i16> %cti69, <8 x i16>* undef
+  ret void
+}
+
+define void @bar() nounwind {
+  %cti44 = trunc <4 x i32> undef to <4 x i16>     ; <<4 x i16>> [#uses=1]
+  store <4 x i16> %cti44, <4 x i16>* undef
+  ret void
+}
new file mode 100644
... ...
@@ -0,0 +1,25 @@
+; RUN: llc < %s -march=x86 | FileCheck %s
+
+define i32 @t1(i8 zeroext %x) nounwind readnone ssp {
+entry:
+; CHECK: t1:
+; CHECK: shll
+; CHECK-NOT: movzwl
+; CHECK: ret
+  %0 = zext i8 %x to i16
+  %1 = shl i16 %0, 5
+  %2 = zext i16 %1 to i32
+  ret i32 %2
+}
+
+define i32 @t2(i8 zeroext %x) nounwind readnone ssp {
+entry:
+; CHECK: t2:
+; CHECK: shrl
+; CHECK-NOT: movzwl
+; CHECK: ret
+  %0 = zext i8 %x to i16
+  %1 = lshr i16 %0, 3
+  %2 = zext i16 %1 to i32
+  ret i32 %2
+}
... ...
@@ -336,8 +336,8 @@ separate option groups syntactically.
      it is synonymous with ``required``. Incompatible with ``required`` and
      ``zero_or_one``.
 
-   - ``zero_or_one`` - the option can be specified zero or one times. Useful
-     only for list options in conjunction with ``multi_val``. Incompatible with
+   - ``optional`` - the option can be specified zero or one times. Useful only
+     for list options in conjunction with ``multi_val``. Incompatible with
      ``required`` and ``one_or_more``.
 
    - ``hidden`` - the description of this option will not appear in
... ...
@@ -356,14 +356,15 @@ separate option groups syntactically.
    - ``multi_val n`` - this option takes *n* arguments (can be useful in some
      special cases). Usage example: ``(parameter_list_option "foo", (multi_val
      3))``; the command-line syntax is '-foo a b c'. Only list options can have
-     this attribute; you can, however, use the ``one_or_more``, ``zero_or_one``
+     this attribute; you can, however, use the ``one_or_more``, ``optional``
      and ``required`` properties.
 
    - ``init`` - this option has a default value, either a string (if it is a
-     parameter), or a boolean (if it is a switch; boolean constants are called
-     ``true`` and ``false``). List options can't have this attribute. Usage
-     examples: ``(switch_option "foo", (init true))``; ``(prefix_option "bar",
-     (init "baz"))``.
+     parameter), or a boolean (if it is a switch; as in C++, boolean constants
+     are called ``true`` and ``false``). List options can't have ``init``
+     attribute.
+     Usage examples: ``(switch_option "foo", (init true))``; ``(prefix_option
+     "bar", (init "baz"))``.
 
    - ``extern`` - this option is defined in some other plugin, see `below`__.
 
... ...
@@ -534,6 +534,31 @@ TEST_F(JITTest, FunctionPointersOutliveTheirCreator) {
 #endif
 }
 
+}  // anonymous namespace
+// This variable is intentionally defined differently in the statically-compiled
+// program from the IR input to the JIT to assert that the JIT doesn't use its
+// definition.
+extern "C" int32_t JITTest_AvailableExternallyGlobal;
+int32_t JITTest_AvailableExternallyGlobal = 42;
+namespace {
+
+TEST_F(JITTest, AvailableExternallyGlobalIsntEmitted) {
+  TheJIT->DisableLazyCompilation(true);
+  LoadAssembly("@JITTest_AvailableExternallyGlobal = "
+               "  available_externally global i32 7 "
+               " "
+               "define i32 @loader() { "
+               "  %result = load i32* @JITTest_AvailableExternallyGlobal "
+               "  ret i32 %result "
+               "} ");
+  Function *loaderIR = M->getFunction("loader");
+
+  int32_t (*loader)() = reinterpret_cast<int32_t(*)()>(
+    (intptr_t)TheJIT->getPointerToFunction(loaderIR));
+  EXPECT_EQ(42, loader()) << "func should return 42 from the external global,"
+                          << " not 7 from the IR version.";
+}
+
 // This code is copied from JITEventListenerTest, but it only runs once for all
 // the tests in this directory.  Everything seems fine, but that's strange
 // behavior.
... ...
@@ -13,3 +13,6 @@ LINK_COMPONENTS := asmparser core support jit native
 
 include $(LEVEL)/Makefile.config
 include $(LLVM_SRC_ROOT)/unittests/Makefile.unittest
+
+# Permit these tests to use the JIT's symbolic lookup.
+LD.Flags += $(RDYNAMIC)
... ...
@@ -317,9 +317,9 @@ sub RunLoggedCommand {
   } else {
       if ($VERBOSE) {
          print "$Title\n";
-          print "$Command 2>&1 > $Log\n";
+          print "$Command > $Log 2>&1\n";
      }
-      system "$Command 2>&1 > $Log";
+      system "$Command > $Log 2>&1";
   }
 }
 
... ...
@@ -336,9 +336,9 @@ sub RunAppendingLoggedCommand {
   } else {
       if ($VERBOSE) {
          print "$Title\n";
-          print "$Command 2>&1 > $Log\n";
+          print "$Command >> $Log 2>&1\n";
      }
-      system "$Command 2>&1 >> $Log";
+      system "$Command >> $Log 2>&1";
   }
 }
 
... ...
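The redirection fix in these two hunks matters because the shell processes redirections left to right: in the old "$Command 2>&1 > $Log", stderr was duplicated onto the terminal's stdout before stdout was pointed at the log, so error output never reached the log; "> $Log 2>&1" redirects stdout first and then sends stderr to the same place.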
@@ -393,10 +393,8 @@ sub CopyFile { #filename, newfile
 # to our central server via the post method
 #
 #~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-sub SendData {
-    $host = $_[0];
-    $file = $_[1];
-    $variables = $_[2];
+sub WriteSentData {
+    $variables = $_[0];
 
     # Write out the "...-sentdata.txt" file.
 
... ...
@@ -406,6 +404,12 @@ sub SendData {
         $sentdata.= "$x  => $value\n";
     }
     WriteFile "$Prefix-sentdata.txt", $sentdata;
+}
+
+sub SendData {
+    $host = $_[0];
+    $file = $_[1];
+    $variables = $_[2];
 
     if (!($SUBMITAUX eq "")) {
         system "$SUBMITAUX \"$Prefix-sentdata.txt\"";
... ...
@@ -503,8 +507,8 @@ sub BuildLLVM {
   }
   RunAppendingLoggedCommand("(time -p $NICE $MAKECMD $MAKEOPTS)", $BuildLog, "BUILD");
 
-  if (`grep '^$MAKECMD\[^:]*: .*Error' $BuildLog | wc -l` + 0 ||
-      `grep '^$MAKECMD: \*\*\*.*Stop.' $BuildLog | wc -l` + 0) {
+  if (`grep -a '^$MAKECMD\[^:]*: .*Error' $BuildLog | wc -l` + 0 ||
+      `grep -a '^$MAKECMD: \*\*\*.*Stop.' $BuildLog | wc -l` + 0) {
     return 0;
   }
 
... ...
@@ -531,15 +535,15 @@ sub TestDirectory {
   $LLCBetaOpts = `$MAKECMD print-llcbeta-option`;
 
   my $ProgramsTable;
-  if (`grep '^$MAKECMD\[^:]: .*Error' $ProgramTestLog | wc -l` + 0) {
+  if (`grep -a '^$MAKECMD\[^:]: .*Error' $ProgramTestLog | wc -l` + 0) {
     $ProgramsTable="Error running test $SubDir\n";
     print "ERROR TESTING\n";
-  } elsif (`grep '^$MAKECMD\[^:]: .*No rule to make target' $ProgramTestLog | wc -l` + 0) {
+  } elsif (`grep -a '^$MAKECMD\[^:]: .*No rule to make target' $ProgramTestLog | wc -l` + 0) {
     $ProgramsTable="Makefile error running tests $SubDir!\n";
     print "ERROR TESTING\n";
   } else {
     # Create a list of the tests which were run...
-    system "egrep 'TEST-(PASS|FAIL)' < $ProgramTestLog ".
+    system "egrep -a 'TEST-(PASS|FAIL)' < $ProgramTestLog ".
           "| sort > $Prefix-$SubDir-Tests.txt";
   }
   $ProgramsTable = ReadFile "report.nightly.csv";
... ...
@@ -797,6 +801,9 @@ my %hash_of_data = (
   'a_file_sizes' => ""
 );
 
+# Write out the "...-sentdata.txt" file.
+WriteSentData \%hash_of_data;
+
 if ($SUBMIT || !($SUBMITAUX eq "")) {
   my $response = SendData $SUBMITSERVER,$SUBMITSCRIPT,\%hash_of_data;
   if( $VERBOSE) { print "============================\n$response"; }
... ...
@@ -15,8 +15,6 @@
 #include "Record.h"
 
 #include "llvm/ADT/IntrusiveRefCntPtr.h"
-#include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringSet.h"
 #include <algorithm>
... ...
@@ -211,7 +209,7 @@ OptionType::OptionType stringToOptionType(const std::string& T) {
 namespace OptionDescriptionFlags {
   enum OptionDescriptionFlags { Required = 0x1, Hidden = 0x2,
                                 ReallyHidden = 0x4, Extern = 0x8,
-                                OneOrMore = 0x10, ZeroOrOne = 0x20,
+                                OneOrMore = 0x10, Optional = 0x20,
                                 CommaSeparated = 0x40 };
 }
 
... ...
@@ -260,8 +258,8 @@ struct OptionDescription {
   bool isOneOrMore() const;
   void setOneOrMore();
 
-  bool isZeroOrOne() const;
-  void setZeroOrOne();
+  bool isOptional() const;
+  void setOptional();
 
   bool isHidden() const;
   void setHidden();
... ...
@@ -331,11 +329,11 @@ void OptionDescription::setOneOrMore() {
   Flags |= OptionDescriptionFlags::OneOrMore;
 }
 
-bool OptionDescription::isZeroOrOne() const {
-  return Flags & OptionDescriptionFlags::ZeroOrOne;
+bool OptionDescription::isOptional() const {
+  return Flags & OptionDescriptionFlags::Optional;
 }
-void OptionDescription::setZeroOrOne() {
-  Flags |= OptionDescriptionFlags::ZeroOrOne;
+void OptionDescription::setOptional() {
+  Flags |= OptionDescriptionFlags::Optional;
 }
 
 bool OptionDescription::isHidden() const {
... ...
@@ -548,7 +546,7 @@ public:
       AddHandler("one_or_more", &CollectOptionProperties::onOneOrMore);
       AddHandler("really_hidden", &CollectOptionProperties::onReallyHidden);
       AddHandler("required", &CollectOptionProperties::onRequired);
-      AddHandler("zero_or_one", &CollectOptionProperties::onZeroOrOne);
+      AddHandler("optional", &CollectOptionProperties::onOptional);
       AddHandler("comma_separated", &CollectOptionProperties::onCommaSeparated);
 
       staticMembersInitialized_ = true;
... ...
@@ -595,8 +593,8 @@ private:
 
   void onRequired (const DagInit* d) {
     checkNumberOfArguments(d, 0);
-    if (optDesc_.isOneOrMore() || optDesc_.isZeroOrOne())
-      throw "Only one of (required), (zero_or_one) or "
+    if (optDesc_.isOneOrMore() || optDesc_.isOptional())
+      throw "Only one of (required), (optional) or "
        "(one_or_more) properties is allowed!";
     optDesc_.setRequired();
   }
... ...
@@ -617,8 +615,8 @@ private:
 
   void onOneOrMore (const DagInit* d) {
     checkNumberOfArguments(d, 0);
-    if (optDesc_.isRequired() || optDesc_.isZeroOrOne())
-      throw "Only one of (required), (zero_or_one) or "
+    if (optDesc_.isRequired() || optDesc_.isOptional())
+      throw "Only one of (required), (optional) or "
        "(one_or_more) properties is allowed!";
     if (!OptionType::IsList(optDesc_.Type))
      llvm::errs() << "Warning: specifying the 'one_or_more' property "
... ...
@@ -626,15 +624,15 @@ private:
     optDesc_.setOneOrMore();
   }
 
-  void onZeroOrOne (const DagInit* d) {
+  void onOptional (const DagInit* d) {
     checkNumberOfArguments(d, 0);
     if (optDesc_.isRequired() || optDesc_.isOneOrMore())
-      throw "Only one of (required), (zero_or_one) or "
+      throw "Only one of (required), (optional) or "
        "(one_or_more) properties is allowed!";
     if (!OptionType::IsList(optDesc_.Type))
-      llvm::errs() << "Warning: specifying the 'zero_or_one' property"
+      llvm::errs() << "Warning: specifying the 'optional' property"
        "on a non-list option will have no effect.\n";
-    optDesc_.setZeroOrOne();
+    optDesc_.setOptional();
   }
 
   void onMultiVal (const DagInit* d) {
... ...
@@ -1454,9 +1452,9 @@ void EmitCaseConstructHandler(const Init* Case, unsigned IndentLevel,
            EmitCaseStatementCallback<F>(Callback, O), IndentLevel);
 }
 
-/// TokenizeCmdline - converts from "$CALL(HookName, 'Arg1', 'Arg2')/path" to
-/// ["$CALL(", "HookName", "Arg1", "Arg2", ")/path"] .
-/// Helper function used by EmitCmdLineVecFill and.
+/// TokenizeCmdline - converts from
+/// "$CALL(HookName, 'Arg1', 'Arg2')/path -arg1 -arg2" to
+/// ["$CALL(", "HookName", "Arg1", "Arg2", ")/path", "-arg1", "-arg2"].
 void TokenizeCmdline(const std::string& CmdLine, StrVector& Out) {
   const char* Delimiters = " \t\n\v\f\r";
   enum TokenizerState
... ...
@@ -1537,62 +1535,99 @@ void TokenizeCmdline(const std::string& CmdLine, StrVector& Out) {
   }
 }
 
-/// SubstituteSpecialCommands - Perform string substitution for $CALL
-/// and $ENV. Helper function used by EmitCmdLineVecFill().
-StrVector::const_iterator SubstituteSpecialCommands
-(StrVector::const_iterator Pos, StrVector::const_iterator End, raw_ostream& O)
+/// SubstituteCall - Given "$CALL(HookName, [Arg1 [, Arg2 [...]]])", output
+/// "hooks::HookName([Arg1 [, Arg2 [, ...]]])". Helper function used by
+/// SubstituteSpecialCommands().
+StrVector::const_iterator
+SubstituteCall (StrVector::const_iterator Pos,
+                StrVector::const_iterator End,
+                bool IsJoin, raw_ostream& O)
 {
+  const char* errorMessage = "Syntax error in $CALL invocation!";
+  checkedIncrement(Pos, End, errorMessage);
+  const std::string& CmdName = *Pos;
 
-  const std::string& cmd = *Pos;
-
-  if (cmd == "$CALL") {
-    checkedIncrement(Pos, End, "Syntax error in $CALL invocation!");
-    const std::string& CmdName = *Pos;
+  if (CmdName == ")")
+    throw "$CALL invocation: empty argument list!";
 
-    if (CmdName == ")")
-      throw "$CALL invocation: empty argument list!";
+  O << "hooks::";
+  O << CmdName << "(";
 
-    O << "hooks::";
-    O << CmdName << "(";
 
+  bool firstIteration = true;
+  while (true) {
+    checkedIncrement(Pos, End, errorMessage);
+    const std::string& Arg = *Pos;
+    assert(Arg.size() != 0);
 
-    bool firstIteration = true;
-    while (true) {
-      checkedIncrement(Pos, End, "Syntax error in $CALL invocation!");
-      const std::string& Arg = *Pos;
-      assert(Arg.size() != 0);
+    if (Arg[0] == ')')
+      break;
 
-      if (Arg[0] == ')')
-        break;
+    if (firstIteration)
+      firstIteration = false;
+    else
+      O << ", ";
 
-      if (firstIteration)
-        firstIteration = false;
+    if (Arg == "$INFILE") {
+      if (IsJoin)
+        throw "$CALL(Hook, $INFILE) can't be used with a Join tool!";
       else
-        O << ", ";
-
+        O << "inFile.c_str()";
+    }
+    else {
       O << '"' << Arg << '"';
     }
+  }
 
-    O << ')';
+  O << ')';
 
-  }
-  else if (cmd == "$ENV") {
-    checkedIncrement(Pos, End, "Syntax error in $ENV invocation!");
-    const std::string& EnvName = *Pos;
+  return Pos;
+}
+
+/// SubstituteEnv - Given '$ENV(VAR_NAME)', output 'getenv("VAR_NAME")'. Helper
+/// function used by SubstituteSpecialCommands().
+StrVector::const_iterator
+SubstituteEnv (StrVector::const_iterator Pos,
+               StrVector::const_iterator End, raw_ostream& O)
+{
+  const char* errorMessage = "Syntax error in $ENV invocation!";
+  checkedIncrement(Pos, End, errorMessage);
+  const std::string& EnvName = *Pos;
+
+  if (EnvName == ")")
+    throw "$ENV invocation: empty argument list!";
+
+  O << "checkCString(std::getenv(\"";
+  O << EnvName;
+  O << "\"))";
+
+  checkedIncrement(Pos, End, errorMessage);
+
+  return Pos;
+}
 
-    if (EnvName == ")")
-      throw "$ENV invocation: empty argument list!";
+/// SubstituteSpecialCommands - Given an invocation of $CALL or $ENV, output
+/// handler code. Helper function used by EmitCmdLineVecFill().
+StrVector::const_iterator
+SubstituteSpecialCommands (StrVector::const_iterator Pos,
+                           StrVector::const_iterator End,
+                           bool IsJoin, raw_ostream& O)
+{
 
-    O << "checkCString(std::getenv(\"";
-    O << EnvName;
-    O << "\"))";
+  const std::string& cmd = *Pos;
 
-    checkedIncrement(Pos, End, "Syntax error in $ENV invocation!");
+  // Perform substitution.
+  if (cmd == "$CALL") {
+    Pos = SubstituteCall(Pos, End, IsJoin, O);
+  }
+  else if (cmd == "$ENV") {
+    Pos = SubstituteEnv(Pos, End, O);
   }
   else {
    throw "Unknown special command: " + cmd;
   }
 
+  // Handle '$CMD(ARG)/additional/text'.
  const std::string& Leftover = *Pos;
  assert(Leftover.at(0) == ')');
  if (Leftover.size() != 1)
... ...
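To summarize the refactored behavior (hook and variable names below are hypothetical): a token stream starting with $CALL, such as $CALL(MyHook, 'foo'), now flows through SubstituteCall and emits hooks::MyHook("foo") (with $INFILE becoming inFile.c_str() for non-Join tools), while $ENV(PATH) flows through SubstituteEnv and emits checkCString(std::getenv("PATH")); any trailing ")/suffix" text is handled afterwards by the shared code.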
@@ -1652,7 +1687,7 @@ void EmitCmdLineVecFill(const Init* CmdLine, const std::string& ToolName,
       }
       else {
         O << "vec.push_back(";
-        I = SubstituteSpecialCommands(I, E, O);
+        I = SubstituteSpecialCommands(I, E, IsJoin, O);
         O << ");\n";
       }
     }
... ...
@@ -1665,7 +1700,7 @@ void EmitCmdLineVecFill(const Init* CmdLine, const std::string& ToolName,
 
   O.indent(IndentLevel) << "cmd = ";
   if (StrVec[0][0] == '$')
-    SubstituteSpecialCommands(StrVec.begin(), StrVec.end(), O);
+    SubstituteSpecialCommands(StrVec.begin(), StrVec.end(), IsJoin, O);
   else
     O << '"' << StrVec[0] << '"';
   O << ";\n";
... ...
@@ -1786,17 +1821,36 @@ class EmitActionHandlersCallback
   const OptionDescriptions& OptDescs;
   typedef EmitActionHandlersCallbackHandler Handler;
 
-  void onAppendCmd (const DagInit& Dag,
-                    unsigned IndentLevel, raw_ostream& O) const
+  /// EmitHookInvocation - Common code for hook invocation from actions. Used by
+  /// onAppendCmd and onOutputSuffix.
+  void EmitHookInvocation(const std::string& Str,
+                          const char* BlockOpen, const char* BlockClose,
+                          unsigned IndentLevel, raw_ostream& O) const
   {
-    checkNumberOfArguments(&Dag, 1);
-    const std::string& Cmd = InitPtrToString(Dag.getArg(0));
     StrVector Out;
-    llvm::SplitString(Cmd, Out);
+    TokenizeCmdline(Str, Out);
 
     for (StrVector::const_iterator B = Out.begin(), E = Out.end();
-         B != E; ++B)
-      O.indent(IndentLevel) << "vec.push_back(\"" << *B << "\");\n";
+         B != E; ++B) {
+      const std::string& cmd = *B;
+
+      O.indent(IndentLevel) << BlockOpen;
+
+      if (cmd.at(0) == '$')
+        B = SubstituteSpecialCommands(B, E,  /* IsJoin = */ true, O);
+      else
+        O << '"' << cmd << '"';
+
+      O << BlockClose;
+    }
+  }
+
+  void onAppendCmd (const DagInit& Dag,
+                    unsigned IndentLevel, raw_ostream& O) const
+  {
+    checkNumberOfArguments(&Dag, 1);
+    this->EmitHookInvocation(InitPtrToString(Dag.getArg(0)),
+                             "vec.push_back(", ");\n", IndentLevel, O);
   }
 
   void onForward (const DagInit& Dag,
... ...
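The practical effect of routing onAppendCmd through EmitHookInvocation: an (append_cmd ...) action is no longer limited to literal words. Its argument is tokenized like a command line, so a hypothetical TableGen fragment such as (append_cmd "$CALL(MyHook)") now generates vec.push_back(hooks::MyHook()); instead of pushing the raw "$CALL(MyHook)" string.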
@@ -1845,16 +1899,16 @@ class EmitActionHandlersCallback
     const OptionDescription& D = OptDescs.FindListOrParameter(Name);
 
     O.indent(IndentLevel) << "vec.push_back(" << "hooks::"
-                          << Hook << "(" << D.GenVariableName() << "));\n";
+                          << Hook << "(" << D.GenVariableName()
+                          << (D.isParameter() ? ".c_str()" : "") << "));\n";
   }
 
-
   void onOutputSuffix (const DagInit& Dag,
                        unsigned IndentLevel, raw_ostream& O) const
   {
     checkNumberOfArguments(&Dag, 1);
-    const std::string& OutSuf = InitPtrToString(Dag.getArg(0));
-    O.indent(IndentLevel) << "output_suffix = \"" << OutSuf << "\";\n";
+    this->EmitHookInvocation(InitPtrToString(Dag.getArg(0)),
+                             "output_suffix = ", ";\n", IndentLevel, O);
   }
 
   void onStopCompilation (const DagInit& Dag,
... ...
@@ -2115,7 +2169,7 @@ void EmitToolClassDefinition (const ToolDescription& D,
   else
     O << "Tool";
 
-  O << "{\nprivate:\n";
+  O << " {\nprivate:\n";
   O.indent(Indent1) << "static const char* InputLanguages_[];\n\n";
 
   O << "public:\n";
... ...
@@ -2174,8 +2228,8 @@ void EmitOptionDefinitions (const OptionDescriptions& descs,
     else if (val.isOneOrMore() && val.isList()) {
         O << ", cl::OneOrMore";
     }
-    else if (val.isZeroOrOne() && val.isList()) {
-        O << ", cl::ZeroOrOne";
+    else if (val.isOptional() && val.isList()) {
+        O << ", cl::Optional";
     }
 
     if (val.isReallyHidden())
... ...
@@ -2483,7 +2537,9 @@ public:
   {}
 
   void onAction (const DagInit& Dag) {
-    if (GetOperatorName(Dag) == "forward_transformed_value") {
+    const std::string& Name = GetOperatorName(Dag);
+
+    if (Name == "forward_transformed_value") {
       checkNumberOfArguments(Dag, 2);
       const std::string& OptName = InitPtrToString(Dag.getArg(0));
       const std::string& HookName = InitPtrToString(Dag.getArg(1));
... ...
@@ -2492,29 +2548,16 @@ public:
       HookNames_[HookName] = HookInfo(D.isList() ? HookInfo::ListHook
                                       : HookInfo::ArgHook);
     }
-  }
-
-  void operator()(const Init* Arg) {
-
-    // We're invoked on an action (either a dag or a dag list).
-    if (typeid(*Arg) == typeid(DagInit)) {
-      const DagInit& Dag = InitPtrToDag(Arg);
-      this->onAction(Dag);
-      return;
-    }
-    else if (typeid(*Arg) == typeid(ListInit)) {
-      const ListInit& List = InitPtrToList(Arg);
-      for (ListInit::const_iterator B = List.begin(), E = List.end(); B != E;
-           ++B) {
-        const DagInit& Dag = InitPtrToDag(*B);
-        this->onAction(Dag);
-      }
-      return;
+    else if (Name == "append_cmd" || Name == "output_suffix") {
+      checkNumberOfArguments(Dag, 1);
+      this->onCmdLine(InitPtrToString(Dag.getArg(0)));
     }
+  }
 
-    // We're invoked on a command line.
+  void onCmdLine(const std::string& Cmd) {
     StrVector cmds;
-    TokenizeCmdline(InitPtrToString(Arg), cmds);
+    TokenizeCmdline(Cmd, cmds);
+
     for (StrVector::const_iterator B = cmds.begin(), E = cmds.end();
          B != E; ++B) {
       const std::string& cmd = *B;
... ...
@@ -2524,7 +2567,6 @@ public:
        checkedIncrement(B, E, "Syntax error in $CALL invocation!");
        const std::string& HookName = *B;
 
-
        if (HookName.at(0) == ')')
          throw "$CALL invoked with no arguments!";
 
... ...
@@ -2540,9 +2582,30 @@ public:
            + HookName;
        else
          HookNames_[HookName] = HookInfo(NumArgs);
+      }
+    }
+  }
 
+  void operator()(const Init* Arg) {
+
+    // We're invoked on an action (either a dag or a dag list).
+    if (typeid(*Arg) == typeid(DagInit)) {
+      const DagInit& Dag = InitPtrToDag(Arg);
+      this->onAction(Dag);
+      return;
+    }
+    else if (typeid(*Arg) == typeid(ListInit)) {
+      const ListInit& List = InitPtrToList(Arg);
+      for (ListInit::const_iterator B = List.begin(), E = List.end(); B != E;
+           ++B) {
+        const DagInit& Dag = InitPtrToDag(*B);
+        this->onAction(Dag);
       }
+      return;
     }
+
+    // We're invoked on a command line.
+    this->onCmdLine(InitPtrToString(Arg));
   }
 
   void operator()(const DagInit* Test, unsigned, bool) {