[llvm-dev] Identification of LEA instructions with complex addressing mode

Sun Jul 9 00:07:25 PDT 2017

Is this an RFC for (an expansion of) the patch you already have at https://reviews.llvm.org/D35014? Or are you planning a different design?

Thanks,
Lama

From: Jatin Bhateja [mailto:jatin.bhateja at gmail.com]
Sent: Sunday, July 9, 2017 9:21 AM
To: llvm-dev <llvm-dev at lists.llvm.org>
Cc: llvm-dev at redking.me.uk; Saba, Lama <lama.saba at intel.com>
Subject: RFC: Identification of LEA instructions with complex addressing mode

  Current state:

     1/ LEAs with complex addressing modes are supported by Intel microarchitectures after Nehalem.

     2/ LEA detection is done during SelectionDAG-based instruction selection, through
         complex pattern matching on addressing modes.

     3/ This matching does not identify LEA patterns with a scale factor greater than 1.
         e.g.
             T1 = A  + B;
             T2 = T1 + B;
             T3 = T2 + B;
             T4 = T3 + 10;
             T5 = T4 + 20;
             T6 = T5 + B;

         The above sequence can be folded to

              LEA   30( A , 4 , B);

         where BASE = A, SCALE = 4, INDEX = B and OFFSET = 30, since T6 = A + 4*B + 30.

     4/ Control flow information is not available at the SelectionDAG level, because SelectionDAG-based
         selection works on a single BasicBlock at a time. This makes it difficult to avoid generating
         complex LEAs with 3 operands (even with Scale = 1) inside loops.
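
The folding in point 3 can be illustrated with a small sketch. This is plain Python over a toy list of (dest, lhs, rhs) add instructions, not LLVM code; the names fold_add_chain and as_lea are hypothetical and only model the arithmetic, under the assumption that the chain consists solely of additions.

```python
from collections import Counter

LEGAL_SCALES = (1, 2, 4, 8)  # scale factors encodable in an x86 address


def fold_add_chain(instrs, result):
    """Expand `result` into (register coefficients, constant offset).

    `instrs` is a list of (dest, lhs, rhs) tuples, each meaning dest = lhs + rhs.
    Operands are register names (str) or integer constants.
    """
    defs = {dest: (lhs, rhs) for dest, lhs, rhs in instrs}

    def expand(operand):
        if isinstance(operand, int):
            return Counter(), operand            # pure constant
        if operand not in defs:
            return Counter({operand: 1}), 0      # live-in register
        lhs_coeffs, lhs_const = expand(defs[operand][0])
        rhs_coeffs, rhs_const = expand(defs[operand][1])
        return lhs_coeffs + rhs_coeffs, lhs_const + rhs_const

    return expand(result)


def as_lea(coeffs, offset):
    """Return (base, index, scale, offset) if the sum fits one LEA, else None.

    Toy rule (an assumption of this sketch): exactly two registers, one with
    coefficient 1 (the base) and one with a legal scale (the index).
    """
    regs = sorted(coeffs.items(), key=lambda kv: kv[1])  # smallest coefficient first
    if len(regs) != 2:
        return None
    (base, base_coeff), (index, scale) = regs
    if base_coeff != 1 or scale not in LEGAL_SCALES:
        return None
    return base, index, scale, offset


chain = [
    ("T1", "A", "B"),
    ("T2", "T1", "B"),
    ("T3", "T2", "B"),
    ("T4", "T3", 10),
    ("T5", "T4", 20),
    ("T6", "T5", "B"),
]
coeffs, offset = fold_add_chain(chain, "T6")
print(as_lea(coeffs, offset))
```

For the chain in point 3 this yields base A, index B, scale 4 and offset 30. A real pass would also have to check that the intermediate values have no other uses before deleting them.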

  Proposal:

     1/ Add a pre-RA pass to identify LEAs with dense folding. By dense folding I mean a scale factor greater than 1.

     2/ Since this pass will run over MachineInstrs, it will also be usable for FastISel and GlobalISel based flows,
         which bypass SelectionDAG.

     3/ At the MI level we have control flow analysis info in the form of MachineDominatorTree and MachineLoopInfo,
         which can be used to avoid forming LEAs in loops.

     4/ Perform CSE over dense LEAs (which have scale factor > 1) to factor out overlapping computations.
         When two LEAs share BASE, INDEX and OFFSET but have different SCALE, we can
         reuse the common complex LEA and generate a simple LEA with a legal scale.
         e.g.

              LEA1 : RES1 = LEA 10( A , 4 , B)
              LEA2 : RES2 = LEA 10( A , 8 , B)

         can be converted to

              LEA1 : RES1 = LEA 10( A , 4 , B)
              LEA2 : RES2 = LEA (RES1 , 4 , B)

      5/  Disintegration of complex LEAs with Scale 1 into a simple LEA + ADD, which is already handled by
           FixupLEAPass.
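
The rewrite in point 4 can be sketched as follows. Again this is toy Python rather than LLVM code; the Lea tuple and the helpers rewrite_in_terms_of and evaluate are hypothetical names introduced for illustration. The sketch assumes the earlier LEA dominates the later one and that the difference of the two scales is itself a legal scale.

```python
from typing import NamedTuple, Optional

LEGAL_SCALES = (1, 2, 4, 8)  # scale factors encodable in an x86 address


class Lea(NamedTuple):
    dest: str
    base: Optional[str]
    index: str
    scale: int
    offset: int


def rewrite_in_terms_of(earlier, later):
    """Express `later` as earlier.dest + index * k when k is a legal scale.

    Returns `later` unchanged if the two LEAs do not share base, index and
    offset, or if the scale difference cannot be encoded.
    """
    same_operands = (earlier.base, earlier.index, earlier.offset) == (
        later.base, later.index, later.offset)
    delta = later.scale - earlier.scale
    if not same_operands or delta not in LEGAL_SCALES:
        return later
    # The shared base and offset are already contained in earlier.dest.
    return Lea(later.dest, earlier.dest, later.index, delta, 0)


def evaluate(lea, env):
    """Compute the address a LEA produces for the register values in `env`."""
    base = env[lea.base] if lea.base is not None else 0
    return base + env[lea.index] * lea.scale + lea.offset


lea1 = Lea("RES1", "A", "B", 4, 10)
lea2 = Lea("RES2", "A", "B", 8, 10)
new_lea2 = rewrite_in_terms_of(lea1, lea2)

# Check on sample register values that the rewrite preserves the result.
env = {"A": 100, "B": 7}
env["RES1"] = evaluate(lea1, env)
assert evaluate(new_lea2, env) == evaluate(lea2, env)
print(new_lea2)
```

The rewritten LEA2 has no displacement, so it is a two-operand (base plus scaled index) LEA, matching the converted form shown in point 4.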

Kindly share your suggestions/comments.

Thanks,
Jatin Bhateja
