[PATCH] D138529: [AVR] Optimize constant 32-bit shifts

Thu Dec 22 11:30:57 PST 2022

aykevl added a comment.

I created a series of patches to replace this one, for easier reviewing. Thank you for the thorough review!

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:314
+    case ISD::SRA:
+      Opc = AVRISD::ASRW;
+      break;
----------------
benshi001 wrote:
> aykevl wrote:
> > benshi001 wrote:
> > > Would it better to use `std::map` or equivalent but more efficient llvm utilities?
> > Do you know of any?
> > Other code seems to use a `switch` and I think this is very readable.
> Though `switch` is more readable, it seems boring, in my opinion. 
> 
> So I suggest
> 
> ```
>   std::map<unsigned, unsigned> OpMap = {{ISD::SHL, AVRISD::LSLW},
>                                         {ISD::SRL, AVRISD::LSRW}, 
>                                         {ISD::SRA, AVRISD::ASRW}};
>   assert(OpMap.find(Op.getOpcode()) != OpMap.end() &&
>          "Unexpected shift opcode");
>   unsigned Opc = OpMap[Op.getOpcode()];
> ```
> 
> which looks more clear.
I'm not really convinced this is more readable. I like boring code, it means it is easy to understand.

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:1843
+// https://aykevl.nl/2021/02/avr-bitshift
+static void insertMultibyteShift(MachineInstr &MI, MachineBasicBlock *BB,
+                                 MutableArrayRef<std::pair<Register, int>> Regs,
----------------
benshi001 wrote:
> I think using negative ShiftAmt for left shift is so counter-intuitive, it would be better to use one more argument `bool leftShift`, or direct expose the Opcode `{shl, lshr, ashr}`.
Hmm, you have a point here. But I kind of like using the signedness of the value.
What do you think of renaming it to `ShiftRightAmt` instead? Then it's clearer that a negative number will be a left shift.
I prefer to avoid opcodes because that would make the code less generic (I want to write a 64-bit shift later, for example).

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:1855
+  // then move registers around to get the correct end result.
+  if (ShiftAmt < 0 && (-ShiftAmt % 8) >= 6) {
+    // Left shift modulo 6 or 7.
----------------
benshi001 wrote:
> Would it be better to split this large function `insertMultibyteShift` to small ones for each `if` statment ? for example, this `if` can be a standalone function `insertMultibyteShiftMod6_7` (or any better name). 
What would be the benefit of that?

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:1861
+    size_t ShiftRegsOffset = -ShiftAmt / 8;
+    size_t ShiftBytes = Regs.size() - ShiftRegsOffset;
+    MutableArrayRef<std::pair<Register, int>> ShiftRegs =
----------------
benshi001 wrote:
> Renaming `ShiftBytes` to `ShiftRegsSize` looks better.
:+1: sounds good

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:1934
+    // Shift one more to the left for modulo 6 shifts.
+    if (ShiftAmt % 8 == 6) {
+      insertMultibyteShift(MI, BB, ShiftRegs, -1, false);
----------------
benshi001 wrote:
> This `if` statement is not tested for ashr. It would be better to add an ashr6 or ashr30 test.
It is, it is tested by `ashr_i32_30`. I have added a few more tests in my updated code.

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:1949
+      } else if (Idx == -1) {
+        Regs[I] = std::pair(Ext, 0);
+      } else {
----------------
benshi001 wrote:
> `Ext` and `ExtMore` seem confusing. May it be better that
> 
> `Ext`     -> `HighByte`
> `ExtMore` -> `ExtByte`
Fair enough. I'll update the code with these new names and some extra comments to explain what they do.

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:2004
+    // Continue shifts with the leftover registers.
+    Regs = Regs.slice(1, Regs.size() - 1);
+
----------------
benshi001 wrote:
> I am a bit confused about here. With `Regs = Regs.slice(1, Regs.size() - 1);`, so the 0 index element is dropped, and only 3 elements left in `Regs` ?
Yes. I'll update the code to use `drop_front` and `drop_back` instead.

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:2008
+  }
+
+  // Shift by four bits, using a complicated swap/eor/andi/eor sequence.
----------------
benshi001 wrote:
> So here should be an `assert((ShiftAmt > -8 && ShiftAmt < 8) && "Unexpect shift amount");`
:+1: seems fine by me.

================
Comment at: llvm/lib/Target/AVR/AVRISelLowering.cpp:2146
+  // already equal to or faster than avr-gcc in all cases except ashr 8).
+  if (ShiftAmt > 0 && (!ArithmeticShift || (ShiftAmt < 16 || ShiftAmt >= 22))) {
+    // Use the resulting registers starting with the least significant byte.
----------------
benshi001 wrote:
> Have your supplemented tests covered all those special conditions ?
Not sure, I did test all cases locally (all 93 cases) and this does not affect or improves code size in all cases. You can see a number of tests in D140573 that are optimized this way.
In any case, this is an optimization. As long as correct code is generated in both cases, the condition is purely a heuristic.

================
Comment at: llvm/test/CodeGen/AVR/shift32.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=avr -mattr=movw -verify-machineinstrs | FileCheck %s
+
----------------
benshi001 wrote:
> Also check for AVRTiny ？
The only difference is the lack of `movw` (which isn't emitted by this patch, instead is uses `AVR::COPY` which is lowered to either `movw` or two `mov` instructions depending on support). All other instructions are supported by avrtiny. So I do not think testing for AVRTiny is necessary here.

(I could add it of course if you think it's useful, but the test output will likely be near-identical).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138529/new/

https://reviews.llvm.org/D138529