[llvm-dev] Identification of LEA instructions with complex addressing mode
Saba, Lama via llvm-dev
llvm-dev at lists.llvm.org
Sun Jul 9 00:07:25 PDT 2017
Is this an RFC for (an expansion of) the patch you already have https://reviews.llvm.org/D35014? Or are you planning on making a different design?
Thanks,
Lama
From: Jatin Bhateja [mailto:jatin.bhateja at gmail.com]
Sent: Sunday, July 9, 2017 9:21 AM
To: llvm-dev <llvm-dev at lists.llvm.org>
Cc: llvm-dev at redking.me.uk; Saba, Lama <lama.saba at intel.com>
Subject: RFC: Identification of LEA instructions with complex addressing mode
Current state:
1/ LEA with complex addressing mode are supported by Intel micro architectures after Nehalem.
2/ LEA detection is being done during SelectionDAG based instruction selection through
addressing mode based complex pattern matching.
3/ This does not identify LEA patterns beyond Scaling factor 1.
e.g.
T1 = A + B;
T2 = T1 + B;
T3 = T2 + B;
T4 = T3 + 10;
T5 = T4 + 20;
T6 = T5 + B
Above sequence can be folded to
LEA 30( A , 4 , B);
where BASE = A, SCALE = 4, INDEX = B and OFFSET = 30
4/ Control flow information is not present at SelectionDAG level, as SelectionDAG based selection
work over a single BasicBlock at a time. Which makes it difficult to avoid generation of
complex LEA with 3 operands (even with Scale=1) within Loops.
Proposal:
1/ To have a pre-RA pass to identify LEAs with dense folding. By dense folding I mean scale factor greater than 1.
2/ Since this pass will run over MachineInstrs so it will be usable for FastISel and Global ISel based flows also which
bypass SelectionDAG.
3/ At MI level we have control flow Analysis info in the form of MachineDominatorTree and MachineLoopInfo
to avoid formation of LEAs in loops.
4/ Perform CSE over dense LEAs (which have Scale factor > 1) to factor out overlapping computations.
When two LEAs share BASE , INDEX and OFFSET but have different SCALE we can
take out the common complex LEA and generate a simple LEA with legal Scale.
e.g.
LEA1 : RES1 = LEA 10( A , 4 , B)
LEA2 : RES2 = LEA 10( A , 8 , B)
can be converted to
LEA1 : RES1 = LEA 10( A , 4 , B)
LEA2 : RES2 = LEA (RES1 , 4 , B)
5/ Disintegration of complex LEAs with Scale 1 to simple LEA + ADD which is already being handled during
FixupLEAPass.
Kindly drop your suggestions/comments.
Thanks,
Jatin Bhateja
---------------------------------------------------------------------
Intel Israel (74) Limited
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170709/25965544/attachment.html>
More information about the llvm-dev
mailing list