<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">The flag -enable-aa-sched-mi should do what you want in the MachineScheduler pass.<div><br></div><div>If you want to do it in the selection DAG, there is a subtarget hook that might do it:<br><div><br></div><div>TargetSubtargetInfo::useAA()</div><div><br></div><div>LLVM won’t generate the schedule you want anyway for Intel core processors, but the alias analysis can be useful in general.</div><div><br></div><div>-Andy<br><div><br><div><div>On Dec 16, 2013, at 6:03 AM, Haishan <<a href="mailto:hndxvon@163.com">hndxvon@163.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div style="line-height: 1.7; font-size: 14px; font-family: arial;"><pre>At 2013-12-15 22:43:34,"Caldarale, Charles R" <<a href="mailto:Chuck.Caldarale@unisys.com">Chuck.Caldarale@unisys.com</a>> wrote:

>> From: <a href="mailto:llvmdev-bounces@cs.uiuc.edu">llvmdev-bounces@cs.uiuc.edu</a> [<a href="mailto:llvmdev-bounces@cs.uiuc.edu">mailto:llvmdev-bounces@cs.uiuc.edu</a>]

>> On Behalf Of Haishan

>> Subject: [LLVMdev] Question about Pre-RA-schedule in LLVM3.3

>

>> My clang version is 3.3 and debug build.

>

>> //test.c

>> int a[6] = {1, 2, 3, 4, 5, 6};

>> int main() {

>>  a[0] = a[5];

>>  a[1] = a[4];

>>  a[2] = a[5];

>> }

>> //end test.c

>> Then test.dump is generated by using the objdump tool.

>> //test.dump

>> ldr  r1, [r0, #20]

>> str  r1, [r0]

>> ldr  r1, [r0, #16]

>> str  r1, [r0, #4]

>> ldr  r1, [r0, #12]

>> str  r1, [r0, #8]

>> bx  lr

>> //end test.dump

>

>It appears you have a typo in the above, since the generated array reference offsets do not correspond to the code in test.c.  Presumably, the last array reference in test.c was really from a[3], not a[5].</pre><pre>I'm sorry for making a mistake in the above test.c.

And your presumption is right.</pre><pre>>

>> However, the 3rd and 4th instructions should be allocated a different

>> register from the second instruction.

>

>Why?

>

> - Chuck

>

</pre><pre>Thank you for your answer.</pre><pre>If the 3rd and 4th instructions are allocated a different register from the second instruction,

then the dependence on the same machine register disappears, and

this instruction sequence would execute with fewer stalls and cycles.

However, in the latest version of LLVM, Pre-RA-sched builds a scheduling graph (the original graph), shown below.

//original graph

----> data flow

====> control flow

load1 ----> store1 ====> load2 ----> store2 ====> load3 ----> store3

//end original graph

So Pre-RA-sched is unable to schedule the load/store instruction pairs apart.

Because of the resulting live ranges, the register allocation stage assigns the same register to every load/store instruction pair.

If we change the control flow in the original graph above, we get the modified graph shown below.

//modified graph

----> data flow

====> control flow

load1 ----> store1 ====> store2 ====> store3

load2 ----> store2

load3 ----> store3

//end modified graph

With this graph, I think Pre-RA-sched would be able to schedule the load/store instruction pairs apart.

Then each instruction pair could use a different register.
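
A source-level sketch of that schedule, assuming the corrected test.c (the third statement reads a[3]). The names copy_tail, t0, t1 and t2 are hypothetical, and the compiler is still free to reorder these accesses, so this only illustrates the intended order and does not force it:

```c
int a[6] = {1, 2, 3, 4, 5, 6};

/* Hypothetical helper: all three loads are written before any store,
   mirroring the order load1, load2, load3, store1, store2, store3. */
void copy_tail(void) {
    int t0 = a[5];  /* load1 */
    int t1 = a[4];  /* load2 */
    int t2 = a[3];  /* load3 */
    a[0] = t0;      /* store1 */
    a[1] = t1;      /* store2 */
    a[2] = t2;      /* store3 */
}
```

Each loaded value lives in its own temporary here, so no two load/store pairs have to share a register.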

The scheduled instruction order for test.c could then be load1, load2, load3, store1, store2, store3.</pre><pre>Best Wishes</pre><pre>- Haishan</pre></div><br><br><span title="neteasefooter"><span id="netease_mail_footer"></span></span>_______________________________________________<br>LLVM Developers mailing list<br><a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a>         <a href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a><br><a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><br></blockquote></div><br></div></div></div></body></html>