<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Hi Kevin,<div><br></div><div>I assume you are using gcc as your build compiler? At least in gcc v4.9 there seems to be an issue with the register allocator resulting in a stack overwrite in the target-independent part of the machine combiner:</div><div><br></div><div>MachineCombiner.cpp:</div><div><div style="margin: 0px; font-family: Menlo; color: rgb(83, 48, 225);">/// preservesResourceLen - True when the new instructions do not increase</div><div style="margin: 0px; font-family: Menlo; color: rgb(83, 48, 225);">/// resource length</div><div style="margin: 0px; font-family: Menlo; color: rgb(83, 48, 225);">...</div><div style="margin: 0px; font-family: Menlo; min-height: 21px;">>>> Allocate ptr to MBB on the stack</div><div style="margin: 0px; font-family: Menlo;"> ArrayRef<<span style="color: #34bd26">const</span> MachineBasicBlock *> MBBarr(MBB);</div><div style="margin: 0px; font-family: Menlo;"> <span style="color: #34bd26">unsigned</span> ResLenBeforeCombine = BlockTrace.getResourceLength(MBBarr);</div><div style="margin: 0px; font-family: Menlo; min-height: 21px;"><br></div><div style="margin: 0px; font-family: Menlo; min-height: 21px;">>>> During the next few statements the address on the stack is overwritten.</div><div style="margin: 0px; font-family: Menlo; min-height: 21px;"><br></div><div style="margin: 0px; font-family: Menlo;"> SmallVector<<span style="color: #34bd26">const</span> MCSchedClassDesc *, <span style="color: #c33720">16</span>> InsInstrsSC;</div><div style="margin: 0px; font-family: Menlo;"> SmallVector<<span style="color: #34bd26">const</span> MCSchedClassDesc *, <span style="color: #c33720">16</span>> DelInstrsSC;</div><div style="margin: 0px; font-family: Menlo; min-height: 21px;"><br></div><div style="margin: 0px; font-family: Menlo;"> 
instr2instrSC(InsInstrs, InsInstrsSC);</div><div style="margin: 0px; font-family: Menlo;"> instr2instrSC(DelInstrs, DelInstrsSC);</div><div style="margin: 0px; font-family: Menlo; min-height: 21px;"><br></div><div style="margin: 0px; font-family: Menlo;"> ArrayRef<<span style="color: #34bd26">const</span> MCSchedClassDesc *> MSCInsArr = makeArrayRef(InsInstrsSC);</div><div style="margin: 0px; font-family: Menlo;"> ArrayRef<<span style="color: #34bd26">const</span> MCSchedClassDesc *> MSCDelArr = makeArrayRef(DelInstrsSC);</div><div style="margin: 0px; font-family: Menlo; min-height: 21px;"><br></div><div style="margin: 0px; font-family: Menlo;">>>> MBBarr contains garbage ptr to MBB</div><div style="margin: 0px; font-family: Menlo;"><span style="color: rgb(83, 48, 225);"> <span style="color: rgb(52, 189, 38);">unsigned</span> ResLenAfterCombine =</span></div><div style="margin: 0px; font-family: Menlo;"> BlockTrace.getResourceLength(MBBarr, MSCInsArr, MSCDelArr);</div><div style="margin: 0px; font-family: Menlo;"><br></div><div style="margin: 0px;">Any ideas or suggestions on how to proceed from here? There are multiple ways to work around this in the code, e.g. 
avoiding ArrayRefs and using SmallVectors etc., but I would prefer to get help with root-causing the issue.</div><div style="margin: 0px;"><br></div><div style="margin: 0px;">Big thanks also to Justin for helping with access to Linux resources and for his expertise in zooming in on this problem.</div><div style="margin: 0px;"><br></div><div style="margin: 0px;">Cheers</div><div style="margin: 0px;">Gerolf</div><div style="margin: 0px; font-family: Menlo;"><br></div><div style="margin: 0px; font-family: Menlo;"><br></div><div style="margin: 0px; font-family: Menlo;"><br></div><div style="margin: 0px; font-family: Menlo;"><br></div><div style="margin: 0px; font-family: Menlo;"><br></div><div><div>On Aug 5, 2014, at 1:08 PM, Gerolf Hoflehner <<a href="mailto:ghoflehner@apple.com">ghoflehner@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><meta http-equiv="Content-Type" content="text/html charset=windows-1252"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Hi Kevin,<div><br></div><div>Apologies for the multiple inconveniences. There is an issue with the machine model you are using that I don’t understand yet and that has so far stayed hidden in my local testing (not surprising, since I used a different model). The getResourceLength function is supposed to return the resource length of a basic block before any combining action starts. </div><div><br></div><div>I also realized that I had a mail filter in place that was hiding the build breakage news. 
:-(</div><div><br></div><div>Thanks</div><div>Gerolf</div><div><br></div><div><br></div><div><br><div><div>On Aug 4, 2014, at 11:01 PM, Kevin Qin <<a href="mailto:kevinqindev@gmail.com">kevinqindev@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">Hi Gerolf,<div><br></div><div>I reverted it again because it broke compilation of most benchmarks and internal tests: clang crashed with either a segmentation fault or an assertion failure.</div><div>Here is some dump information:</div>
<div><br></div><div><div>0 clang-3.6 0x000000000204502b llvm::sys::PrintStackTrace(_IO_FILE*) + 38</div><div>1 clang-3.6 0x00000000020452a8</div><div>2 clang-3.6 0x0000000002044c4d</div><div>3 libpthread.so.0 0x00002b25774e6340</div>
<div>4 clang-3.6 0x0000000001031e28</div><div>5 clang-3.6 0x0000000001a6c28a llvm::MachineTraceMetrics::Trace::getResourceLength(llvm::ArrayRef<llvm::MachineBasicBlock const*>, llvm::ArrayRef<llvm::MCSchedClassDesc const*>, llvm::ArrayRef<llvm::MCSchedClassDesc const*>) const + 256</div>
<div>6 clang-3.6 0x00000000019fa6ae</div><div>7 clang-3.6 0x00000000019facd9</div><div>8 clang-3.6 0x00000000019fb295</div><div>9 clang-3.6 0x0000000001a12c7d llvm::MachineFunctionPass::runOnFunction(llvm::Function&) + 95</div>
<div>10 clang-3.6 0x0000000001cc89ca llvm::FPPassManager::runOnFunction(llvm::Function&) + 290</div><div>11 clang-3.6 0x0000000001cc8b3a llvm::FPPassManager::runOnModule(llvm::Module&) + 84</div><div>12 clang-3.6 0x0000000001cc8e58</div>
<div>13 clang-3.6 0x0000000001cc94fc llvm::legacy::PassManagerImpl::run(llvm::Module&) + 244</div><div>14 clang-3.6 0x0000000001cc971b llvm::legacy::PassManager::run(llvm::Module&) + 39</div><div>15 clang-3.6 0x0000000002558021</div>
<div>16 clang-3.6 0x00000000025580f0 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::raw_ostream*) + 127</div>
<div>17 clang-3.6 0x000000000254244f</div><div>18 clang-3.6 0x0000000002ea9b48 clang::ParseAST(clang::Sema&, bool, bool) + 776</div><div>19 clang-3.6 0x000000000221fedc clang::ASTFrontendAction::ExecuteAction() + 322</div>
<div>20 clang-3.6 0x0000000002544572 clang::CodeGenAction::ExecuteAction() + 1370</div><div>21 clang-3.6 0x000000000221fa11 clang::FrontendAction::Execute() + 139</div><div>22 clang-3.6 0x00000000021eef47 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 721</div>
<div>23 clang-3.6 0x000000000231ec65 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 993</div><div>24 clang-3.6 0x000000000100e05c cc1_main(char const**, char const**, char const*, void*) + 722</div>
<div>25 clang-3.6 0x000000000100775a main + 769</div><div>26 libc.so.6 0x00002b257814eec5 __libc_start_main + 245</div><div>27 clang-3.6 0x0000000001004c59</div><div>Stack dump:</div><div>0.<span class="" style="white-space:pre"> </span>Program arguments: /home/kevin/llvm_trunk/build/bin/clang-3.6 -cc1 -triple arm64--linux-gnueabi -emit-obj -disable-free -main-file-name compress.c -mrelocation-model static -mdisable-fp-elim -menable-no-infs -menable-no-nans -menable-unsafe-fp-math -ffp-contract=fast -ffast-math -masm-verbose -mconstructor-aliases -fuse-init-array -target-cpu cortex-a57 -target-feature +neon -target-feature +crc -target-feature +crypto -target-abi aapcs -dwarf-column-info -coverage-file /home/kevin/Bench/spec2006/benchspec/CPU2006/401.bzip2/build/build_base_llvm-high-opt.0001/compress.o -resource-dir /home/kevin/llvm_trunk/build/bin/../lib/clang/3.6.0 -D SPEC_CPU -D NDEBUG -D SPEC_CPU_LP64 -isysroot /home/kevin/gcc-linaro-aarch64/aarch64-linux-gnu/libc -internal-isystem /home/kevin/gcc-linaro-aarch64/aarch64-linux-gnu/libc/usr/local/include -internal-isystem /home/kevin/llvm_trunk/build/bin/../lib/clang/3.6.0/include -internal-externc-isystem /home/kevin/gcc-linaro-aarch64/aarch64-linux-gnu/libc/include -internal-externc-isystem /home/kevin/gcc-linaro-aarch64/aarch64-linux-gnu/libc/usr/include -O3 -fdebug-compilation-dir /home/kevin/Bench/spec2006/benchspec/CPU2006/401.bzip2/build/build_base_llvm-high-opt.0001 -ferror-limit 19 -fmessage-length 0 -mstackrealign -fno-signed-char -fobjc-runtime=gcc -fdiagnostics-show-option -vectorize-loops -vectorize-slp -o compress.o -x c compress.c </div>
<div>1.<span class="" style="white-space:pre"> </span><eof> parser at end of file</div><div>2.<span class="" style="white-space:pre"> </span>Code generation</div><div>3.<span class="" style="white-space:pre"> </span>Running pass 'Function Pass Manager' on module 'compress.c'.</div>
<div>4.<span class="" style="white-space:pre"> </span>Running pass 'Machine InstCombiner' on function '@BZ2_compressBlock'</div><div>clang-3.6: error: unable to execute command: Segmentation fault (core dumped)</div>
<div>clang-3.6: error: clang frontend command failed due to signal (use -v to see invocation)</div></div><div><br></div><div>Or</div><div><br></div><div>clang-3.6: /home/llvm-test/slave/pre-commit/build/include/llvm/ADT/SmallVector.h:145: const T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](unsigned int) const [with T = llvm::MachineTraceMetrics::FixedBlockInfo; <template-parameter-1-2> = void; llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::const_reference = const llvm::MachineTraceMetrics::FixedBlockInfo&]: Assertion `begin() + idx < end()' failed.<br>
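<br>
Both failure modes are consistent with getResourceLength reading a bad block pointer, which is what the analysis at the top of this thread describes: a one-element ArrayRef stores the address of the slot that holds the MBB pointer, so it sees whatever that slot contains later. A minimal sketch of that behaviour, and of the "use a SmallVector instead" style of workaround mentioned there, follows. It is only an illustration under stated assumptions: Block and OneElemView are hypothetical stand-ins, std::vector stands in for SmallVector, and none of this is the LLVM code itself.<br>
<pre>
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block; not the LLVM class.
struct Block {
  int Number;
};

// Hypothetical one-element, non-owning view. Like a one-element array
// reference, it keeps the ADDRESS of its argument, not a copy of it.
template <typename T> class OneElemView {
  const T *Data;

public:
  explicit OneElemView(const T &Elt) : Data(&Elt) {}
  const T &front() const { return *Data; }
  std::size_t size() const { return 1; }
};

struct Observed {
  int ViewSees;   // block number reached through the view
  int VectorSees; // block number reached through the owning container
};

Observed demo() {
  Block A{1}, B{2};
  const Block *Slot = &A; // the pointer lives in this stack slot

  OneElemView<const Block *> View(Slot);  // remembers &Slot
  std::vector<const Block *> Owned{Slot}; // copies the pointer value

  Slot = &B; // stand-in for "the slot gets overwritten later"

  // The view follows the slot; the owning container does not.
  return {View.front()->Number, Owned.front()->Number};
}
```
</pre>
Here the slot is overwritten with another valid pointer so that the sketch stays well defined; in the reported crash the slot would hold garbage instead. The owning container is unaffected either way, which is why that workaround hides the symptom without explaining it.<br>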
</div><div><br></div><div>These failures should be easy to reproduce by compiling LNT, SPEC2000 or SPEC2006 on x64 Linux.</div><div><br></div><div>Regards,</div><div>Kevin</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
2014-08-05 9:16 GMT+08:00 Gerolf Hoflehner <span dir="ltr"><<a href="mailto:ghoflehner@apple.com" target="_blank">ghoflehner@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Author: ghoflehner<br>
Date: Mon Aug 4 20:16:13 2014<br>
New Revision: 214832<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=214832&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=214832&view=rev</a><br>
Log:<br>
MachineCombiner Pass for selecting faster instruction<br>
sequence on AArch64<br>
<br>
Re-commit of r214669 without changes to test cases<br>
LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and<br>
LLVM::CodeGen/AArch64/dp-3source.ll<br>
This resolves the reported compfails of the original commit.<br>
<br>
<br>
Added:<br>
llvm/trunk/lib/Target/AArch64/AArch64MachineCombinerPattern.h<br>
llvm/trunk/test/CodeGen/AArch64/madd-lohi.ll<br>
Modified:<br>
llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td<br>
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp<br>
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h<br>
llvm/trunk/lib/Target/AArch64/AArch64TargetMachine.cpp<br>
llvm/trunk/test/CodeGen/AArch64/mul-lohi.ll<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td?rev=214832&r1=214831&r2=214832&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td?rev=214832&r1=214831&r2=214832&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td Mon Aug 4 20:16:13 2014<br>
@@ -1351,14 +1351,15 @@ class BaseMulAccum<bit isSub, bits<3> op<br>
}<br>
<br>
multiclass MulAccum<bit isSub, string asm, SDNode AccNode> {<br>
+ // MADD/MSUB generation is decided by MachineCombiner.cpp<br>
def Wrrr : BaseMulAccum<isSub, 0b000, GPR32, GPR32, asm,<br>
- [(set GPR32:$Rd, (AccNode GPR32:$Ra, (mul GPR32:$Rn, GPR32:$Rm)))]>,<br>
+ [/*(set GPR32:$Rd, (AccNode GPR32:$Ra, (mul GPR32:$Rn, GPR32:$Rm)))*/]>,<br>
Sched<[WriteIM32, ReadIM, ReadIM, ReadIMA]> {<br>
let Inst{31} = 0;<br>
}<br>
<br>
def Xrrr : BaseMulAccum<isSub, 0b000, GPR64, GPR64, asm,<br>
- [(set GPR64:$Rd, (AccNode GPR64:$Ra, (mul GPR64:$Rn, GPR64:$Rm)))]>,<br>
+ [/*(set GPR64:$Rd, (AccNode GPR64:$Ra, (mul GPR64:$Rn, GPR64:$Rm)))*/]>,<br>
Sched<[WriteIM64, ReadIM, ReadIM, ReadIMA]> {<br>
let Inst{31} = 1;<br>
}<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=214832&r1=214831&r2=214832&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp?rev=214832&r1=214831&r2=214832&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp Mon Aug 4 20:16:13 2014<br>
@@ -14,6 +14,7 @@<br>
#include "AArch64InstrInfo.h"<br>
#include "AArch64Subtarget.h"<br>
#include "MCTargetDesc/AArch64AddressingModes.h"<br>
+#include "AArch64MachineCombinerPattern.h"<br>
#include "llvm/CodeGen/MachineFrameInfo.h"<br>
#include "llvm/CodeGen/MachineInstrBuilder.h"<br>
#include "llvm/CodeGen/MachineMemOperand.h"<br>
@@ -697,17 +698,12 @@ static bool UpdateOperandRegClass(Machin<br>
return true;<br>
}<br>
<br>
-/// optimizeCompareInstr - Convert the instruction supplying the argument to the<br>
-/// comparison into one that sets the zero bit in the flags register.<br>
-bool AArch64InstrInfo::optimizeCompareInstr(<br>
- MachineInstr *CmpInstr, unsigned SrcReg, unsigned SrcReg2, int CmpMask,<br>
- int CmpValue, const MachineRegisterInfo *MRI) const {<br>
-<br>
- // Replace SUBSWrr with SUBWrr if NZCV is not used.<br>
- int Cmp_NZCV = CmpInstr->findRegisterDefOperandIdx(AArch64::NZCV, true);<br>
- if (Cmp_NZCV != -1) {<br>
+/// convertFlagSettingOpcode - return opcode that does not<br>
+/// set flags when possible. The caller is responsible to do<br>
+/// the actual substitution and legality checking.<br>
+static unsigned convertFlagSettingOpcode(MachineInstr *MI) {<br>
unsigned NewOpc;<br>
- switch (CmpInstr->getOpcode()) {<br>
+ switch (MI->getOpcode()) {<br>
default:<br>
return false;<br>
case AArch64::ADDSWrr: NewOpc = AArch64::ADDWrr; break;<br>
@@ -727,7 +723,22 @@ bool AArch64InstrInfo::optimizeCompareIn<br>
case AArch64::SUBSXrs: NewOpc = AArch64::SUBXrs; break;<br>
case AArch64::SUBSXrx: NewOpc = AArch64::SUBXrx; break;<br>
}<br>
+ return NewOpc;<br>
+}<br>
<br>
+/// optimizeCompareInstr - Convert the instruction supplying the argument to the<br>
+/// comparison into one that sets the zero bit in the flags register.<br>
+bool AArch64InstrInfo::optimizeCompareInstr(<br>
+ MachineInstr *CmpInstr, unsigned SrcReg, unsigned SrcReg2, int CmpMask,<br>
+ int CmpValue, const MachineRegisterInfo *MRI) const {<br>
+<br>
+ // Replace SUBSWrr with SUBWrr if NZCV is not used.<br>
+ int Cmp_NZCV = CmpInstr->findRegisterDefOperandIdx(AArch64::NZCV, true);<br>
+ if (Cmp_NZCV != -1) {<br>
+ unsigned Opc = CmpInstr->getOpcode();<br>
+ unsigned NewOpc = convertFlagSettingOpcode(CmpInstr);<br>
+ if (NewOpc == Opc)<br>
+ return false;<br>
const MCInstrDesc &MCID = get(NewOpc);<br>
CmpInstr->setDesc(MCID);<br>
CmpInstr->RemoveOperand(Cmp_NZCV);<br>
@@ -2185,3 +2196,448 @@ void AArch64InstrInfo::getNoopForMachoTa<br>
NopInst.setOpcode(AArch64::HINT);<br>
NopInst.addOperand(MCOperand::CreateImm(0));<br>
}<br>
+/// useMachineCombiner - return true when a target supports MachineCombiner<br>
+bool AArch64InstrInfo::useMachineCombiner(void) const {<br>
+ // AArch64 supports the combiner<br>
+ return true;<br>
+}<br>
+//<br>
+// True when Opc sets flag<br>
+static bool isCombineInstrSettingFlag(unsigned Opc) {<br>
+ switch (Opc) {<br>
+ case AArch64::ADDSWrr:<br>
+ case AArch64::ADDSWri:<br>
+ case AArch64::ADDSXrr:<br>
+ case AArch64::ADDSXri:<br>
+ case AArch64::SUBSWrr:<br>
+ case AArch64::SUBSXrr:<br>
+ // Note: MSUB Wd,Wn,Wm,Wi -> Wd = Wi - WnxWm, not Wd=WnxWm - Wi.<br>
+ case AArch64::SUBSWri:<br>
+ case AArch64::SUBSXri:<br>
+ return true;<br>
+ default:<br>
+ break;<br>
+ }<br>
+ return false;<br>
+}<br>
+//<br>
+// 32b Opcodes that can be combined with a MUL<br>
+static bool isCombineInstrCandidate32(unsigned Opc) {<br>
+ switch (Opc) {<br>
+ case AArch64::ADDWrr:<br>
+ case AArch64::ADDWri:<br>
+ case AArch64::SUBWrr:<br>
+ case AArch64::ADDSWrr:<br>
+ case AArch64::ADDSWri:<br>
+ case AArch64::SUBSWrr:<br>
+ // Note: MSUB Wd,Wn,Wm,Wi -> Wd = Wi - WnxWm, not Wd=WnxWm - Wi.<br>
+ case AArch64::SUBWri:<br>
+ case AArch64::SUBSWri:<br>
+ return true;<br>
+ default:<br>
+ break;<br>
+ }<br>
+ return false;<br>
+}<br>
+//<br>
+// 64b Opcodes that can be combined with a MUL<br>
+static bool isCombineInstrCandidate64(unsigned Opc) {<br>
+ switch (Opc) {<br>
+ case AArch64::ADDXrr:<br>
+ case AArch64::ADDXri:<br>
+ case AArch64::SUBXrr:<br>
+ case AArch64::ADDSXrr:<br>
+ case AArch64::ADDSXri:<br>
+ case AArch64::SUBSXrr:<br>
+ // Note: MSUB Wd,Wn,Wm,Wi -> Wd = Wi - WnxWm, not Wd=WnxWm - Wi.<br>
+ case AArch64::SUBXri:<br>
+ case AArch64::SUBSXri:<br>
+ return true;<br>
+ default:<br>
+ break;<br>
+ }<br>
+ return false;<br>
+}<br>
+//<br>
+// Opcodes that can be combined with a MUL<br>
+static bool isCombineInstrCandidate(unsigned Opc) {<br>
+ return (isCombineInstrCandidate32(Opc) || isCombineInstrCandidate64(Opc));<br>
+}<br>
+<br>
+static bool canCombineWithMUL(MachineBasicBlock &MBB, MachineOperand &MO,<br>
+ unsigned MulOpc, unsigned ZeroReg) {<br>
+ MachineRegisterInfo &MRI = MBB.getParent()->getRegInfo();<br>
+ MachineInstr *MI = nullptr;<br>
+ // We need a virtual register definition.<br>
+ if (MO.isReg() && TargetRegisterInfo::isVirtualRegister(MO.getReg()))<br>
+ MI = MRI.getUniqueVRegDef(MO.getReg());<br>
+ // And it needs to be in the trace (otherwise, it won't have a depth).<br>
+ if (!MI || MI->getParent() != &MBB || (unsigned)MI->getOpcode() != MulOpc)<br>
+ return false;<br>
+<br>
+ assert(MI->getNumOperands() >= 4 && MI->getOperand(0).isReg() &&<br>
+ MI->getOperand(1).isReg() && MI->getOperand(2).isReg() &&<br>
+ MI->getOperand(3).isReg() && "MAdd/MSub must have at least 4 regs");<br>
+<br>
+ // The third input reg must be zero.<br>
+ if (MI->getOperand(3).getReg() != ZeroReg)<br>
+ return false;<br>
+<br>
+ // Must only be used by the user we combine with.<br>
+ if (!MRI.hasOneNonDBGUse(MI->getOperand(0).getReg()))<br>
+ return false;<br>
+<br>
+ return true;<br>
+}<br>
+<br>
+/// hasPattern - return true when there is potentially a faster code sequence<br>
+/// for an instruction chain ending in \p Root. All potential patterns are<br>
+/// listed<br>
+/// in the \p Pattern vector. Pattern should be sorted in priority order since<br>
+/// the pattern evaluator stops checking as soon as it finds a faster sequence.<br>
+<br>
+bool AArch64InstrInfo::hasPattern(<br>
+ MachineInstr &Root,<br>
+ SmallVectorImpl<MachineCombinerPattern::MC_PATTERN> &Pattern) const {<br>
+ unsigned Opc = Root.getOpcode();<br>
+ MachineBasicBlock &MBB = *Root.getParent();<br>
+ bool Found = false;<br>
+<br>
+ if (!isCombineInstrCandidate(Opc))<br>
+ return 0;<br>
+ if (isCombineInstrSettingFlag(Opc)) {<br>
+ int Cmp_NZCV = Root.findRegisterDefOperandIdx(AArch64::NZCV, true);<br>
+ // When NZCV is live bail out.<br>
+ if (Cmp_NZCV == -1)<br>
+ return 0;<br>
+ unsigned NewOpc = convertFlagSettingOpcode(&Root);<br>
+ // When opcode can't change bail out.<br>
+ // CHECKME: do we miss any cases for opcode conversion?<br>
+ if (NewOpc == Opc)<br>
+ return 0;<br>
+ Opc = NewOpc;<br>
+ }<br>
+<br>
+ switch (Opc) {<br>
+ default:<br>
+ break;<br>
+ case AArch64::ADDWrr:<br>
+ assert(Root.getOperand(1).isReg() && Root.getOperand(2).isReg() &&<br>
+ "ADDWrr does not have register operands");<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDWrrr,<br>
+ AArch64::WZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULADDW_OP1);<br>
+ Found = true;<br>
+ }<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(2), AArch64::MADDWrrr,<br>
+ AArch64::WZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULADDW_OP2);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::ADDXrr:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDXrrr,<br>
+ AArch64::XZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULADDX_OP1);<br>
+ Found = true;<br>
+ }<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(2), AArch64::MADDXrrr,<br>
+ AArch64::XZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULADDX_OP2);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::SUBWrr:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDWrrr,<br>
+ AArch64::WZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULSUBW_OP1);<br>
+ Found = true;<br>
+ }<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(2), AArch64::MADDWrrr,<br>
+ AArch64::WZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULSUBW_OP2);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::SUBXrr:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDXrrr,<br>
+ AArch64::XZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULSUBX_OP1);<br>
+ Found = true;<br>
+ }<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(2), AArch64::MADDXrrr,<br>
+ AArch64::XZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULSUBX_OP2);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::ADDWri:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDWrrr,<br>
+ AArch64::WZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULADDWI_OP1);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::ADDXri:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDXrrr,<br>
+ AArch64::XZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULADDXI_OP1);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::SUBWri:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDWrrr,<br>
+ AArch64::WZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULSUBWI_OP1);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ case AArch64::SUBXri:<br>
+ if (canCombineWithMUL(MBB, Root.getOperand(1), AArch64::MADDXrrr,<br>
+ AArch64::XZR)) {<br>
+ Pattern.push_back(MachineCombinerPattern::MC_MULSUBXI_OP1);<br>
+ Found = true;<br>
+ }<br>
+ break;<br>
+ }<br>
+ return Found;<br>
+}<br>
+<br>
+/// genMadd - Generate madd instruction and combine mul and add.<br>
+/// Example:<br>
+/// MUL I=A,B,0<br>
+/// ADD R,I,C<br>
+/// ==> MADD R,A,B,C<br>
+/// \param Root is the ADD instruction<br>
+/// \param [out] InsInstr is a vector of machine instructions and will<br>
+/// contain the generated madd instruction<br>
+/// \param IdxMulOpd is index of operand in Root that is the result of<br>
+/// the MUL. In the example above IdxMulOpd is 1.<br>
+/// \param MaddOpc the opcode of the madd instruction<br>
+static MachineInstr *genMadd(MachineFunction &MF, MachineRegisterInfo &MRI,<br>
+ const TargetInstrInfo *TII, MachineInstr &Root,<br>
+ SmallVectorImpl<MachineInstr *> &InsInstrs,<br>
+ unsigned IdxMulOpd, unsigned MaddOpc) {<br>
+ assert(IdxMulOpd == 1 || IdxMulOpd == 2);<br>
+<br>
+ unsigned IdxOtherOpd = IdxMulOpd == 1 ? 2 : 1;<br>
+ MachineInstr *MUL = MRI.getUniqueVRegDef(Root.getOperand(IdxMulOpd).getReg());<br>
+ MachineOperand R = Root.getOperand(0);<br>
+ MachineOperand A = MUL->getOperand(1);<br>
+ MachineOperand B = MUL->getOperand(2);<br>
+ MachineOperand C = Root.getOperand(IdxOtherOpd);<br>
+ MachineInstrBuilder MIB = BuildMI(MF, Root.getDebugLoc(), TII->get(MaddOpc))<br>
+ .addOperand(R)<br>
+ .addOperand(A)<br>
+ .addOperand(B)<br>
+ .addOperand(C);<br>
+ // Insert the MADD<br>
+ InsInstrs.push_back(MIB);<br>
+ return MUL;<br>
+}<br>
+<br>
+/// genMaddR - Generate madd instruction and combine mul and add using<br>
+/// an extra virtual register<br>
+/// Example - an ADD intermediate needs to be stored in a register:<br>
+/// MUL I=A,B,0<br>
+/// ADD R,I,Imm<br>
+/// ==> ORR V, ZR, Imm<br>
+/// ==> MADD R,A,B,V<br>
+/// \param Root is the ADD instruction<br>
+/// \param [out] InsInstr is a vector of machine instructions and will<br>
+/// contain the generated madd instruction<br>
+/// \param IdxMulOpd is index of operand in Root that is the result of<br>
+/// the MUL. In the example above IdxMulOpd is 1.<br>
+/// \param MaddOpc the opcode of the madd instruction<br>
+/// \param VR is a virtual register that holds the value of an ADD operand<br>
+/// (V in the example above).<br>
+static MachineInstr *genMaddR(MachineFunction &MF, MachineRegisterInfo &MRI,<br>
+ const TargetInstrInfo *TII, MachineInstr &Root,<br>
+ SmallVectorImpl<MachineInstr *> &InsInstrs,<br>
+ unsigned IdxMulOpd, unsigned MaddOpc,<br>
+ unsigned VR) {<br>
+ assert(IdxMulOpd == 1 || IdxMulOpd == 2);<br>
+<br>
+ MachineInstr *MUL = MRI.getUniqueVRegDef(Root.getOperand(IdxMulOpd).getReg());<br>
+ MachineOperand R = Root.getOperand(0);<br>
+ MachineOperand A = MUL->getOperand(1);<br>
+ MachineOperand B = MUL->getOperand(2);<br>
+ MachineInstrBuilder MIB = BuildMI(MF, Root.getDebugLoc(), TII->get(MaddOpc))<br>
+ .addOperand(R)<br>
+ .addOperand(A)<br>
+ .addOperand(B)<br>
+ .addReg(VR);<br>
+ // Insert the MADD<br>
+ InsInstrs.push_back(MIB);<br>
+ return MUL;<br>
+}<br>
+/// genAlternativeCodeSequence - when hasPattern() finds a pattern<br>
+/// this function generates the instructions that could replace the<br>
+/// original code sequence<br>
+void AArch64InstrInfo::genAlternativeCodeSequence(<br>
+ MachineInstr &Root, MachineCombinerPattern::MC_PATTERN Pattern,<br>
+ SmallVectorImpl<MachineInstr *> &InsInstrs,<br>
+ SmallVectorImpl<MachineInstr *> &DelInstrs,<br>
+ DenseMap<unsigned, unsigned> &InstrIdxForVirtReg) const {<br>
+ MachineBasicBlock &MBB = *Root.getParent();<br>
+ MachineRegisterInfo &MRI = MBB.getParent()->getRegInfo();<br>
+ MachineFunction &MF = *MBB.getParent();<br>
+ const TargetInstrInfo *TII = MF.getTarget().getSubtargetImpl()->getInstrInfo();<br>
+<br>
+ MachineInstr *MUL;<br>
+ unsigned Opc;<br>
+ switch (Pattern) {<br>
+ default:<br>
+ // signal error.<br>
+ break;<br>
+ case MachineCombinerPattern::MC_MULADDW_OP1:<br>
+ case MachineCombinerPattern::MC_MULADDX_OP1:<br>
+ // MUL I=A,B,0<br>
+ // ADD R,I,C<br>
+ // ==> MADD R,A,B,C<br>
+ // --- Create(MADD);<br>
+ Opc = Pattern == MachineCombinerPattern::MC_MULADDW_OP1 ? AArch64::MADDWrrr<br>
+ : AArch64::MADDXrrr;<br>
+ MUL = genMadd(MF, MRI, TII, Root, InsInstrs, 1, Opc);<br>
+ break;<br>
+ case MachineCombinerPattern::MC_MULADDW_OP2:<br>
+ case MachineCombinerPattern::MC_MULADDX_OP2:<br>
+ // MUL I=A,B,0<br>
+ // ADD R,C,I<br>
+ // ==> MADD R,A,B,C<br>
+ // --- Create(MADD);<br>
+ Opc = Pattern == MachineCombinerPattern::MC_MULADDW_OP2 ? AArch64::MADDWrrr<br>
+ : AArch64::MADDXrrr;<br>
+ MUL = genMadd(MF, MRI, TII, Root, InsInstrs, 2, Opc);<br>
+ break;<br>
+ case MachineCombinerPattern::MC_MULADDWI_OP1:<br>
+ case MachineCombinerPattern::MC_MULADDXI_OP1:<br>
+ // MUL I=A,B,0<br>
+ // ADD R,I,Imm<br>
+ // ==> ORR V, ZR, Imm<br>
+ // ==> MADD R,A,B,V<br>
+ // --- Create(MADD);<br>
+ {<br>
+ const TargetRegisterClass *RC =<br>
+ MRI.getRegClass(Root.getOperand(1).getReg());<br>
+ unsigned NewVR = MRI.createVirtualRegister(RC);<br>
+ unsigned BitSize, OrrOpc, ZeroReg;<br>
+ if (Pattern == MachineCombinerPattern::MC_MULADDWI_OP1) {<br>
+ BitSize = 32;<br>
+ OrrOpc = AArch64::ORRWri;<br>
+ ZeroReg = AArch64::WZR;<br>
+ Opc = AArch64::MADDWrrr;<br>
+ } else {<br>
+ OrrOpc = AArch64::ORRXri;<br>
+ BitSize = 64;<br>
+ ZeroReg = AArch64::XZR;<br>
+ Opc = AArch64::MADDXrrr;<br>
+ }<br>
+ uint64_t Imm = Root.getOperand(2).getImm();<br>
+<br>
+ if (Root.getOperand(3).isImm()) {<br>
+ unsigned val = Root.getOperand(3).getImm();<br>
+ Imm = Imm << val;<br>
+ }<br>
+ uint64_t UImm = Imm << (64 - BitSize) >> (64 - BitSize);<br>
+ uint64_t Encoding;<br>
+<br>
+ if (AArch64_AM::processLogicalImmediate(UImm, BitSize, Encoding)) {<br>
+ MachineInstrBuilder MIB1 =<br>
+ BuildMI(MF, Root.getDebugLoc(), TII->get(OrrOpc))<br>
+ .addOperand(MachineOperand::CreateReg(NewVR, RegState::Define))<br>
+ .addReg(ZeroReg)<br>
+ .addImm(Encoding);<br>
+ InsInstrs.push_back(MIB1);<br>
+ InstrIdxForVirtReg.insert(std::make_pair(NewVR, 0));<br>
+ MUL = genMaddR(MF, MRI, TII, Root, InsInstrs, 1, Opc, NewVR);<br>
+ }<br>
+ }<br>
+ break;<br>
+ case MachineCombinerPattern::MC_MULSUBW_OP1:<br>
+ case MachineCombinerPattern::MC_MULSUBX_OP1: {<br>
+ // MUL I=A,B,0<br>
+ // SUB R,I, C<br>
+ // ==> SUB V, 0, C<br>
+ // ==> MADD R,A,B,V // = -C + A*B<br>
+ // --- Create(MADD);<br>
+ const TargetRegisterClass *RC =<br>
+ MRI.getRegClass(Root.getOperand(1).getReg());<br>
+ unsigned NewVR = MRI.createVirtualRegister(RC);<br>
+ unsigned SubOpc, ZeroReg;<br>
+ if (Pattern == MachineCombinerPattern::MC_MULSUBW_OP1) {<br>
+ SubOpc = AArch64::SUBWrr;<br>
+ ZeroReg = AArch64::WZR;<br>
+ Opc = AArch64::MADDWrrr;<br>
+ } else {<br>
+ SubOpc = AArch64::SUBXrr;<br>
+ ZeroReg = AArch64::XZR;<br>
+ Opc = AArch64::MADDXrrr;<br>
+ }<br>
+ // SUB NewVR, 0, C<br>
+ MachineInstrBuilder MIB1 =<br>
+ BuildMI(MF, Root.getDebugLoc(), TII->get(SubOpc))<br>
+ .addOperand(MachineOperand::CreateReg(NewVR, RegState::Define))<br>
+ .addReg(ZeroReg)<br>
+ .addOperand(Root.getOperand(2));<br>
+ InsInstrs.push_back(MIB1);<br>
+ InstrIdxForVirtReg.insert(std::make_pair(NewVR, 0));<br>
+ MUL = genMaddR(MF, MRI, TII, Root, InsInstrs, 1, Opc, NewVR);<br>
+ } break;<br>
+ case MachineCombinerPattern::MC_MULSUBW_OP2:<br>
+ case MachineCombinerPattern::MC_MULSUBX_OP2:<br>
+ // MUL I=A,B,0<br>
+ // SUB R,C,I<br>
+ // ==> MSUB R,A,B,C (computes C - A*B)<br>
+ // --- Create(MSUB);<br>
+ Opc = Pattern == MachineCombinerPattern::MC_MULSUBW_OP2 ? AArch64::MSUBWrrr<br>
+ : AArch64::MSUBXrrr;<br>
+ MUL = genMadd(MF, MRI, TII, Root, InsInstrs, 2, Opc);<br>
+ break;<br>
+ case MachineCombinerPattern::MC_MULSUBWI_OP1:<br>
+ case MachineCombinerPattern::MC_MULSUBXI_OP1: {<br>
+ // MUL I=A,B,0<br>
+ // SUB R,I, Imm<br>
+ // ==> ORR V, ZR, -Imm<br>
+ // ==> MADD R,A,B,V // = -Imm + A*B<br>
+ // --- Create(MADD);<br>
+ const TargetRegisterClass *RC =<br>
+ MRI.getRegClass(Root.getOperand(1).getReg());<br>
+ unsigned NewVR = MRI.createVirtualRegister(RC);<br>
+ unsigned BitSize, OrrOpc, ZeroReg;<br>
+ if (Pattern == MachineCombinerPattern::MC_MULSUBWI_OP1) {<br>
+ BitSize = 32;<br>
+ OrrOpc = AArch64::ORRWri;<br>
+ ZeroReg = AArch64::WZR;<br>
+ Opc = AArch64::MADDWrrr;<br>
+ } else {<br>
+ OrrOpc = AArch64::ORRXri;<br>
+ BitSize = 64;<br>
+ ZeroReg = AArch64::XZR;<br>
+ Opc = AArch64::MADDXrrr;<br>
+ }<br>
+ int Imm = Root.getOperand(2).getImm();<br>
+ if (Root.getOperand(3).isImm()) {<br>
+ unsigned val = Root.getOperand(3).getImm();<br>
+ Imm = Imm << val;<br>
+ }<br>
+ uint64_t UImm = -Imm << (64 - BitSize) >> (64 - BitSize);<br>
+ uint64_t Encoding;<br>
+ if (AArch64_AM::processLogicalImmediate(UImm, BitSize, Encoding)) {<br>
+ MachineInstrBuilder MIB1 =<br>
+ BuildMI(MF, Root.getDebugLoc(), TII->get(OrrOpc))<br>
+ .addOperand(MachineOperand::CreateReg(NewVR, RegState::Define))<br>
+ .addReg(ZeroReg)<br>
+ .addImm(Encoding);<br>
+ InsInstrs.push_back(MIB1);<br>
+ InstrIdxForVirtReg.insert(std::make_pair(NewVR, 0));<br>
+ MUL = genMaddR(MF, MRI, TII, Root, InsInstrs, 1, Opc, NewVR);<br>
+ }<br>
+ } break;<br>
+ }<br>
+ // Record MUL and ADD/SUB for deletion<br>
+ DelInstrs.push_back(MUL);<br>
+ DelInstrs.push_back(&Root);<br>
+<br>
+ return;<br>
+}<br>
<br>
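A minimal sketch of the arithmetic behind the MC_MULSUBWI_OP1 / MC_MULSUBXI_OP1 case above, not code from the patch. The rewrite relies on A*B - Imm being equal to A*B + (-Imm) modulo 2^BitSize.<br>
Note that in the hunk -Imm has type int, so with BitSize == 32 the expression -Imm << (64 - BitSize) shifts a 32-bit value by 32 bits, which C++ leaves undefined (assuming a 32-bit int).<br>
The helper names negatedImm and rewriteHolds are hypothetical, and zero-extending the low BitSize bits is only one reading of what the original expression intends.<br>
<pre>
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the patch): the low BitSize bits of -Imm,
// computed entirely in 64-bit unsigned arithmetic so that no shift count
// reaches the width of the shifted type.
uint64_t negatedImm(int64_t Imm, unsigned BitSize) {
  assert(BitSize == 32 || BitSize == 64);
  uint64_t Neg = 0 - static_cast<uint64_t>(Imm); // wraps modulo 2^64
  unsigned Shift = 64 - BitSize;                 // 32 or 0
  return (Neg << Shift) >> Shift;                // clear bits above BitSize
}

// Checks A*B - Imm == A*B + V (mod 2^BitSize), where V is the value the
// ORR would materialise. This is the identity the MADD rewrite depends on.
bool rewriteHolds(uint64_t A, uint64_t B, int64_t Imm, unsigned BitSize) {
  uint64_t Mask = BitSize == 64 ? ~0ULL : ((1ULL << BitSize) - 1);
  uint64_t V = negatedImm(Imm, BitSize);
  uint64_t Sub = (A * B - static_cast<uint64_t>(Imm)) & Mask;
  uint64_t Madd = (A * B + V) & Mask;
  return Sub == Madd;
}
```
</pre>
For example, negatedImm(16, 32) yields 0xFFFFFFF0, the 32-bit two's complement of 16. Whether that value is encodable as a logical immediate is a separate question that the patch leaves to processLogicalImmediate.<br>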
Modified: llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=214832&r1=214831&r2=214832&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h?rev=214832&r1=214831&r2=214832&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h Mon Aug 4 20:16:13 2014<br>
@@ -17,6 +17,7 @@<br>
#include "AArch64.h"<br>
#include "AArch64RegisterInfo.h"<br>
#include "llvm/Target/TargetInstrInfo.h"<br>
+#include "llvm/CodeGen/MachineCombinerPattern.h"<br>
<br>
#define GET_INSTRINFO_HEADER<br>
#include "AArch64GenInstrInfo.inc"<br>
@@ -156,9 +157,26 @@ public:<br>
bool optimizeCompareInstr(MachineInstr *CmpInstr, unsigned SrcReg,<br>
unsigned SrcReg2, int CmpMask, int CmpValue,<br>
const MachineRegisterInfo *MRI) const override;<br>
+ /// hasPattern - return true when there is potentially a faster code sequence<br>
+ /// for an instruction chain ending in <Root>. All potential patterns are<br>
+ /// listed<br>
+ /// in the <Pattern> array.<br>
+ virtual bool hasPattern(<br>
+ MachineInstr &Root,<br>
+ SmallVectorImpl<MachineCombinerPattern::MC_PATTERN> &Pattern) const;<br>
+<br>
+ /// genAlternativeCodeSequence - When hasPattern() finds a pattern,<br>
+ /// this function generates the instructions that could replace the<br>
+ /// original code sequence.<br>
+ virtual void genAlternativeCodeSequence(<br>
+ MachineInstr &Root, MachineCombinerPattern::MC_PATTERN P,<br>
+ SmallVectorImpl<MachineInstr *> &InsInstrs,<br>
+ SmallVectorImpl<MachineInstr *> &DelInstrs,<br>
+ DenseMap<unsigned, unsigned> &InstrIdxForVirtReg) const;<br>
+ /// useMachineCombiner - AArch64 supports MachineCombiner<br>
+ virtual bool useMachineCombiner(void) const;<br>
<br>
bool expandPostRAPseudo(MachineBasicBlock::iterator MI) const override;<br>
-<br>
private:<br>
void instantiateCondBranch(MachineBasicBlock &MBB, DebugLoc DL,<br>
MachineBasicBlock *TBB,<br>
<br>
Added: llvm/trunk/lib/Target/AArch64/AArch64MachineCombinerPattern.h<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64MachineCombinerPattern.h?rev=214832&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64MachineCombinerPattern.h?rev=214832&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64MachineCombinerPattern.h (added)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64MachineCombinerPattern.h Mon Aug 4 20:16:13 2014<br>
@@ -0,0 +1,42 @@<br>
+//===- AArch64MachineCombinerPattern.h -===//<br>
+//===- AArch64 instruction pattern supported by combiner -===//<br>
+//<br>
+// The LLVM Compiler Infrastructure<br>
+//<br>
+// This file is distributed under the University of Illinois Open Source<br>
+// License. See LICENSE.TXT for details.<br>
+//<br>
+//===----------------------------------------------------------------------===//<br>
+//<br>
+// This file defines the instruction patterns supported by the machine combiner.<br>
+//<br>
+//===----------------------------------------------------------------------===//<br>
+<br>
+#ifndef LLVM_TARGET_AArch64MACHINECOMBINERPATTERN_H<br>
+#define LLVM_TARGET_AArch64MACHINECOMBINERPATTERN_H<br>
+<br>
+namespace llvm {<br>
+<br>
+/// Enumeration of the instruction patterns supported by the machine combiner<br>
+///<br>
+///<br>
+namespace MachineCombinerPattern {<br>
+enum MC_PATTERN : int {<br>
+ MC_NONE = 0,<br>
+ MC_MULADDW_OP1 = 1,<br>
+ MC_MULADDW_OP2 = 2,<br>
+ MC_MULSUBW_OP1 = 3,<br>
+ MC_MULSUBW_OP2 = 4,<br>
+ MC_MULADDWI_OP1 = 5,<br>
+ MC_MULSUBWI_OP1 = 6,<br>
+ MC_MULADDX_OP1 = 7,<br>
+ MC_MULADDX_OP2 = 8,<br>
+ MC_MULSUBX_OP1 = 9,<br>
+ MC_MULSUBX_OP2 = 10,<br>
+ MC_MULADDXI_OP1 = 11,<br>
+ MC_MULSUBXI_OP1 = 12<br>
+};<br>
+} // end namespace MachineCombinerPattern<br>
+} // end namespace llvm<br>
+<br>
+#endif<br>
<br>
Modified: llvm/trunk/lib/Target/AArch64/AArch64TargetMachine.cpp<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetMachine.cpp?rev=214832&r1=214831&r2=214832&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64TargetMachine.cpp?rev=214832&r1=214831&r2=214832&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/lib/Target/AArch64/AArch64TargetMachine.cpp (original)<br>
+++ llvm/trunk/lib/Target/AArch64/AArch64TargetMachine.cpp Mon Aug 4 20:16:13 2014<br>
@@ -24,6 +24,10 @@ static cl::opt<bool><br>
EnableCCMP("aarch64-ccmp", cl::desc("Enable the CCMP formation pass"),<br>
cl::init(true), cl::Hidden);<br>
<br>
+static cl::opt<bool> EnableMCR("aarch64-mcr",<br>
+ cl::desc("Enable the machine combiner pass"),<br>
+ cl::init(true), cl::Hidden);<br>
+<br>
static cl::opt<bool><br>
EnableStPairSuppress("aarch64-stp-suppress", cl::desc("Suppress STP for AArch64"),<br>
cl::init(true), cl::Hidden);<br>
@@ -174,6 +178,8 @@ bool AArch64PassConfig::addInstSelector(<br>
bool AArch64PassConfig::addILPOpts() {<br>
if (EnableCCMP)<br>
addPass(createAArch64ConditionalCompares());<br>
+ if (EnableMCR)<br>
+ addPass(&MachineCombinerID);<br>
addPass(&EarlyIfConverterID);<br>
if (EnableStPairSuppress)<br>
addPass(createAArch64StorePairSuppressPass());<br>
<br>
Added: llvm/trunk/test/CodeGen/AArch64/madd-lohi.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/madd-lohi.ll?rev=214832&view=auto" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/madd-lohi.ll?rev=214832&view=auto</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/AArch64/madd-lohi.ll (added)<br>
+++ llvm/trunk/test/CodeGen/AArch64/madd-lohi.ll Mon Aug 4 20:16:13 2014<br>
@@ -0,0 +1,19 @@<br>
+; RUN: llc -mtriple=arm64-apple-ios7.0 %s -o - | FileCheck %s<br>
+; RUN: llc -mtriple=aarch64_be-linux-gnu %s -o - | FileCheck --check-prefix=CHECK-BE %s<br>
+<br>
+define i128 @test_128bitmul(i128 %lhs, i128 %rhs) {<br>
+; CHECK-LABEL: test_128bitmul:<br>
+; CHECK-DAG: umulh [[CARRY:x[0-9]+]], x0, x2<br>
+; CHECK-DAG: madd [[PART1:x[0-9]+]], x0, x3, [[CARRY]]<br>
+; CHECK: madd x1, x1, x2, [[PART1]]<br>
+; CHECK: mul x0, x0, x2<br>
+<br>
+; CHECK-BE-LABEL: test_128bitmul:<br>
+; CHECK-BE-DAG: umulh [[CARRY:x[0-9]+]], x1, x3<br>
+; CHECK-BE-DAG: madd [[PART1:x[0-9]+]], x1, x2, [[CARRY]]<br>
+; CHECK-BE: madd x0, x0, x3, [[PART1]]<br>
+; CHECK-BE: mul x1, x1, x3<br>
+<br>
+ %prod = mul i128 %lhs, %rhs<br>
+ ret i128 %prod<br>
+}<br>
<br>
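To make the CHECK lines in madd-lohi.ll easier to read, here is a rough sketch of how a 128-bit multiply decomposes into the umulh / madd / madd / mul sequence the little-endian test expects.<br>
It assumes the usual little-endian argument layout (lhs low half in x0, high half in x1; rhs low half in x2, high half in x3). The U128 struct and the umulh / mul128 helpers are hypothetical names used only for illustration.<br>
<pre>
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value split into two 64-bit halves.
struct U128 {
  uint64_t Lo, Hi;
};

// High 64 bits of the full 64x64 product, i.e. what AArch64 umulh computes.
// Built from 32-bit pieces so it does not rely on a 128-bit integer type.
uint64_t umulh(uint64_t A, uint64_t B) {
  uint64_t ALo = A & 0xFFFFFFFFULL, AHi = A >> 32;
  uint64_t BLo = B & 0xFFFFFFFFULL, BHi = B >> 32;
  uint64_t LL = ALo * BLo, LH = ALo * BHi, HL = AHi * BLo, HH = AHi * BHi;
  // Carry out of the middle 32-bit column.
  uint64_t Mid = (LL >> 32) + (LH & 0xFFFFFFFFULL) + (HL & 0xFFFFFFFFULL);
  return HH + (LH >> 32) + (HL >> 32) + (Mid >> 32);
}

// Low 128 bits of L * R, one statement per expected instruction.
U128 mul128(U128 L, U128 R) {
  uint64_t Carry = umulh(L.Lo, R.Lo);   // umulh CARRY, x0, x2
  uint64_t Part1 = L.Lo * R.Hi + Carry; // madd  PART1, x0, x3, CARRY
  uint64_t Hi = L.Hi * R.Lo + Part1;    // madd  x1, x1, x2, PART1
  uint64_t Lo = L.Lo * R.Lo;            // mul   x0, x0, x2
  return {Lo, Hi};
}
```
</pre>
With -mcpu=cyclone, the modified mul-lohi.ll test instead expects separate mul instructions for the two partial products rather than madd.<br>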
Modified: llvm/trunk/test/CodeGen/AArch64/mul-lohi.ll<br>
URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/mul-lohi.ll?rev=214832&r1=214831&r2=214832&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/AArch64/mul-lohi.ll?rev=214832&r1=214831&r2=214832&view=diff</a><br>
==============================================================================<br>
--- llvm/trunk/test/CodeGen/AArch64/mul-lohi.ll (original)<br>
+++ llvm/trunk/test/CodeGen/AArch64/mul-lohi.ll Mon Aug 4 20:16:13 2014<br>
@@ -1,17 +1,16 @@<br>
-; RUN: llc -mtriple=arm64-apple-ios7.0 %s -o - | FileCheck %s<br>
-; RUN: llc -mtriple=aarch64_be-linux-gnu %s -o - | FileCheck --check-prefix=CHECK-BE %s<br>
-<br>
+; RUN: llc -mtriple=arm64-apple-ios7.0 -mcpu=cyclone %s -o - | FileCheck %s<br>
+; RUN: llc -mtriple=aarch64_be-linux-gnu -mcpu=cyclone %s -o - | FileCheck --check-prefix=CHECK-BE %s<br>
define i128 @test_128bitmul(i128 %lhs, i128 %rhs) {<br>
; CHECK-LABEL: test_128bitmul:<br>
+; CHECK-DAG: mul [[PART1:x[0-9]+]], x0, x3<br>
; CHECK-DAG: umulh [[CARRY:x[0-9]+]], x0, x2<br>
-; CHECK-DAG: madd [[PART1:x[0-9]+]], x0, x3, [[CARRY]]<br>
-; CHECK: madd x1, x1, x2, [[PART1]]<br>
+; CHECK: mul [[PART2:x[0-9]+]], x1, x2<br>
; CHECK: mul x0, x0, x2<br>
<br>
; CHECK-BE-LABEL: test_128bitmul:<br>
+; CHECK-BE-DAG: mul [[PART1:x[0-9]+]], x1, x2<br>
; CHECK-BE-DAG: umulh [[CARRY:x[0-9]+]], x1, x3<br>
-; CHECK-BE-DAG: madd [[PART1:x[0-9]+]], x1, x2, [[CARRY]]<br>
-; CHECK-BE: madd x0, x0, x3, [[PART1]]<br>
+; CHECK-BE: mul [[PART2:x[0-9]+]], x0, x3<br>
; CHECK-BE: mul x1, x1, x3<br>
<br>
%prod = mul i128 %lhs, %rhs<br>
<br>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">Best Regards,<div><br></div><div>Kevin Qin</div></div>
</div>
</blockquote></div><br></div></div></blockquote></div><br></div></body></html>