[PATCH] [X86] New pass that moves immediate operands to registers.

Quentin Colombet qcolombet at apple.com
Fri Oct 3 14:34:25 PDT 2014


Hi Serge,

A couple of general comments:
- None of the comment uses the doxygen style: /// instead of //. Please fix it.
- Comments start with a capital letter and end with a period (or any appropriate punctuation :)).
- As I said in one of my email, we should do that just for Os and Oz functions (i.e., look for function attribute: OptimizeForSize and MinSize).

The overall approach seems more complicated than I would expect for such a pass. In particular, I think that all the InReg, Scheduled, and Waiting states could be avoided, as well as all the history thing with the expiration date.
My understanding is that you try to limit the impact of this transformation on the register pressure. That seems fragile to me (there are a lot of magic numbers involved here for the different limits) and harder than it should be to understand.

That I would recommend is having more faith in the register allocator. The live-ranges, you are creating/extending are rematerializable. Therefore, if the register pressure becomes too high, the register allocator can theoritically rematerialize the values instead of spilling.
The main problem then, is that you may end up with code like this:
reg = mov imm
reg2 = op reg3, reg
instead of
reg2 = op reg3, imm

That said, we could probably teach the register allocator to “fold” the operand as we do for memory load, so I wouldn’t worry too much about that.

The bottom line is that I would expect the pass to be a relatively simple scan that creates appropriate constants when the whole basic block has been traversed.
I.e., something like:
ConstantsUsage // Sort of dictionary. Map a constant to its uses.
for each instr in bb
  if instr can use register variant
    record instr, instr.imm in ConstantsUsage // This is could be the trick part. See below for some thoughts.

for each constant in ConstantsUsage
  if constant is profitable
    materialize constant // This creates one constant and update all its uses using the appropriate subreg, removing the redundant load imm, etc.

Regarding ‘record instr, imm, ConstantsUsage’.
The hard part, per say, is to enable reuse of wide constant, e.g., 0x0fff is used for 0xff.
I believe that you wouldn’t have that many constants to look through per basic block and a linear search may be sufficient.
Now, if we want something faster, we could do some trick like having a mapping per size, sorted by profitability, etc.

What do you think?

BTW, you could also have some function level scope if you really are interested at saving as much size as possible.

Cheers,
-Quentin

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:10
@@ +9,3 @@
+//
+// This file defines the pass which will move immediate operand into a resister
+// if it is used in several adjacent instructions. This transformation tries to
----------------
s/resister/register

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:32
@@ +31,3 @@
+
+#define DEBUG_TYPE "x86-imm2reg"
+#include "X86.h"
----------------
I believe this should come after the includes. In other words, usually right after using namespace llvm.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:50
@@ +49,3 @@
+// may be set smaller by supplying proper command line option.
+const unsigned MaxHistoryDepth = 8;
+
----------------
I believe most users won’t mess with the related option and I rather have the number I asked on the command line than a hidden maximal value.
The bottom line is, please do not use a hard-coded maximal value here.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:54
@@ +53,3 @@
+// may be set smaller by supplying proper command line option.
+const unsigned MaxHistoryWidth = 4;
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:58
@@ +57,3 @@
+// immediate value before it is dropped from history.
+const unsigned DefMaxSeparatingInstrs = MaxHistoryWidth;
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:62
@@ +61,3 @@
+// values.
+const unsigned DefMaxRegisters = 2;
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:66
@@ +65,3 @@
+// register.
+const unsigned DefMinProfit = 1;
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:72
@@ +71,3 @@
+    cl::desc("Max number of immediate values tracked in history."),
+    cl::init(MaxHistoryWidth), cl::Hidden);
+
----------------
Init this with the maximal value.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:79
@@ +78,3 @@
+    cl::desc("Max number of uses stored for a tracked immediate value."),
+    cl::init(MaxHistoryDepth), cl::Hidden);
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:88
@@ +87,3 @@
+    cl::desc("Max number of instructions separating two immediate uses."),
+    cl::init(DefMaxSeparatingInstrs), cl::Hidden);
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:93
@@ +92,3 @@
+                 cl::desc("Max number of registers used for materialization."),
+                 cl::init(DefMaxRegisters), cl::Hidden);
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:98
@@ +97,3 @@
+    cl::desc("Minimal number of bytes that materialization must save."),
+    cl::init(DefMinProfit), cl::Hidden);
+
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:112
@@ +111,3 @@
+  // Must be 1 << DATA_N == size in bytes
+  DATA_8,
+  DATA_16,
----------------
I would explicitly set the value, to be sure anyone updating this code matches the intent.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:130
@@ +129,3 @@
+  int Opcode;
+  int NewOpCode;      // Opcode if imm is replaced by reg
+  ImmediateSize Size; // of immediate data
----------------
Period at the end of the comments.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:134
@@ +133,3 @@
+  int Profit;         // Gain in bytes if imm is changed to reg
+  bool IsLoad;        // is this an instruction like 'mov imm, reg'?
+};
----------------
IsLoadImm maybe?

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:135
@@ +134,3 @@
+  bool IsLoad;        // is this an instruction like 'mov imm, reg'?
+};
+
----------------
For this patch I think it is fine to have all the information written here, but in the future it would be nice if it could be part of tablegen.
I do not like having yet another place to update when I add instructions.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:241
@@ +240,3 @@
+
+static int GetSizeOfMovImmToRegInstr(int OpCode) {
+  if (OpCode == X86::MOV16ri)
----------------
Could you add a comment on that function?
I do not understand its purpose with just its name.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:243
@@ +242,3 @@
+  if (OpCode == X86::MOV16ri)
+    return 4; // 66 B8 iw
+  if (OpCode == X86::MOV32ri)
----------------
What do these comments mean?

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:253
@@ +252,3 @@
+
+static unsigned GetSubreg(ImmediateSize Entire, ImmediateSize Part) {
+  if (Entire == Part)
----------------
Shouldn’t we assert Entire is bigger than Part?

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:269
@@ +268,3 @@
+static unsigned GetSubregByBytes(unsigned Entire, unsigned Part) {
+  if (Entire == Part)
+    return X86::NoSubRegister;
----------------
Ditto.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:795
@@ +794,3 @@
+  InstrInfo *Rec = OpCodeTable::Find(Instr.getOpcode());
+  if (Rec == 0)
+    return false;
----------------
nullptr instead of 0.

================
Comment at: lib/Target/X86/X86MaterializeImmediates.cpp:947
@@ +946,3 @@
+
+  // If use set of the value contains load instruction, expences for load
+  // instruction can be decreased.
----------------
expences -> expenses.

http://reviews.llvm.org/D5544






More information about the llvm-commits mailing list