[PATCH] D27803: [ARM] GlobalISel: Load i1, i8 and i16 args from stack

Mon Dec 19 07:43:50 PST 2016

rovka added a comment.

================
Comment at: lib/Target/ARM/ARMCallLowering.cpp:134-142
+    if (VA.getLocInfo() == CCValAssign::SExt ||
+        VA.getLocInfo() == CCValAssign::ZExt)
+      // If the argument is zero- or sign-extended by the caller, its size
+      // becomes 4 bytes, so that's what we should load.
+      AddDefaultPred(MIRBuilder.buildInstr(ARM::LDRi12)
+                         .addDef(ValVReg)
+                         .addUse(Addr)
----------------
t.p.northover wrote:
> Is this necessary? Just because the caller stored  4 bytes doesn't mean we have to load 4. You might make some kind of efficiency argument if you expect most arithmetic to be done on i32 (as in C code), but I don't think this really helps there anyway because the G_SEXT/G_ZEXT will still exist in the MIR.
> 
> Also, if this really is necessary the `MMO` is misrecording the size of the memory operation.
Well, actually we're not generating a G_ZEXT or G_SEXT here... Should we? 

It looks like the simplest thing to do, but OTOH it's the caller's job to generate the extension, so if we did anything here we'd just be duplicating the caller's work (and introducing a potential source of bugs).

I'm not sure what the best course of action is: 
* Keep the size and generate the G_LOAD + G_ZEXT/G_SEXT instructions
* Generate G_LOAD with size 4
* Generate LDRi12 directly as in this patch

Are we breaking any assumptions if we choose one of the last 2? Would that be premature optimization?

https://reviews.llvm.org/D27803