[LLVMdev] Subword register allocation
Tzu-Chien Chiu
tzuchien.chiu at gmail.com
Sat Sep 17 02:32:50 PDT 2005
Hi,
I have a question about implementing subword register allocation
(see REFERENCES at the end of this message) in LLVM. I have the
algorithms, but I don't know the best way to implement them in
LLVM.
I asked a similar question before:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2005-May/004001.html
Because I still don't have a satisfactory solution, I will try to
elaborate on it again. Pardon me.
All registers are 128-bit. Each register can be divided into four
32-bit subwords. Each subword can be independently read and written. A
symbolic name is given to each subword: x, y, z, w.
MUL r0.xyz, r1.xyz, r2.xxx
SUB r0.w, r3.y, r4.z
ADD r5.xyzw, r0.xyzw, r2.xyzw
MUL defines three subwords of r0, and SUB defines the remaining one.
Note that ADD uses all four subwords defined by the previous two
instructions. The register allocator must be aware of this; otherwise
additional MOV instructions may have to be inserted:
MUL r0.xyz, r1.xyz, r2.xxx
SUB r9.x, r3.y, r4.z
MOV r0.w, r9.x
ADD r5.xyzw, r0.xyzw, r2.xyzw
In the previous code snippet, when allocating the destination register
for SUB, the register allocator doesn't choose r0.w. Later, when it
finds that all four subwords are referenced together in ADD, an
additional MOV must be inserted to move the subword from r9.x to r0.w.
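The cost of a lane-unaware choice can be sketched in Python as follows.
This is only an illustration under stated assumptions: the value names
(a, b, c, d), the function gather_moves, and the "pick the register that
already holds the most values in the right lane" rule are hypothetical,
and the sketch ignores whether the target lane is already occupied.

```python
from collections import Counter

def gather_moves(wanted, placed):
    """wanted: lane -> value that a four-subword use needs in that lane.
    placed: value -> (register, lane) chosen earlier by the allocator.
    Returns (target register, list of MOVs needed to gather the lanes)."""
    # Vote for the register that already holds the most values in place.
    votes = Counter(placed[v][0] for lane, v in wanted.items()
                    if placed[v][1] == lane)
    if votes:
        target = votes.most_common(1)[0][0]
    else:
        target = placed[next(iter(wanted.values()))][0]
    moves = []
    for lane, v in wanted.items():
        reg, cur = placed[v]
        if (reg, cur) != (target, lane):
            moves.append(f"MOV {target}.{lane}, {reg}.{cur}")
    return target, moves

wanted = {"x": "a", "y": "b", "z": "c", "w": "d"}   # what ADD reads
unaware = {"a": ("r0", "x"), "b": ("r0", "y"), "c": ("r0", "z"), "d": ("r9", "x")}
aware = {"a": ("r0", "x"), "b": ("r0", "y"), "c": ("r0", "z"), "d": ("r0", "w")}
```

Calling gather_moves(wanted, unaware) yields the single MOV from r9.x to
r0.w, while gather_moves(wanted, aware) yields no MOV at all.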
Even if the subwords of a 128-bit register are never referenced
together in one instruction, minimizing the number of registers used
is preferred on many architectures to avoid spills.
For example, the code
ADD r0.xy, r1.xy, r2.xy
MUL r4.xy, r1.xy, r2.xy
can be improved to save r4, since the remaining two subwords of r0,
z and w, are still free:
ADD r0.xy, r1.xy, r2.xy
MUL r0.zw, r1.xy, r2.xy
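The packing idea can be sketched as a first-fit placement in Python.
This is a simplified illustration, not one of the algorithms in [1][2]:
it assumes all values are live at the same time, that a value may take
any free lanes of a register, and that destination registers are
numbered from r0. The function name pack is hypothetical.

```python
LANES = "xyzw"

def pack(widths):
    """widths: number of subwords (1..4) needed by each value.
    Returns one 'rN.lanes' destination per value, using first-fit."""
    free = []   # free[i] holds the lanes still unused in register i
    out = []
    for width in widths:
        for i, lanes in enumerate(free):
            if len(lanes) >= width:
                break               # register i has enough free lanes
        else:
            free.append(LANES)      # no register fits: open a new one
            i = len(free) - 1
        out.append(f"r{i}.{free[i][:width]}")
        free[i] = free[i][width:]   # mark the chosen lanes as used
    return out
```

For the two 2-subword results above, pack([2, 2]) places them in r0.xy
and r0.zw, so only one destination register is used.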
I know some subword register allocation algorithms [1][2]. My question
is how to implement them in LLVM. I would like to avoid making too many
changes to the existing LLVM live interval analysis and register
allocators. I wish they could be reused without modification, perhaps
by using some tricks in the TableGen .td file.
I don't know how to do it, but it might resemble what is done in
X86RegisterInfo.td, where AL and AH are defined as aliases of AX by
RegisterGroup. However, this method doesn't seem to work. I'm not
sure, and I would like your comments before implementing a similar
technique because I have a tight schedule.
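To make the aliasing idea concrete, here is a small Python model of it.
It is not TableGen and does not reflect how LLVM stores aliases; the
names build_aliases and conflicts are hypothetical, and the model
assumes each single subword is its own allocatable register that
aliases the full 128-bit register.

```python
def build_aliases(num_regs):
    """Map every register name to the set of names it aliases."""
    aliases = {}
    for n in range(num_regs):
        full = f"r{n}"
        subs = [f"r{n}.{lane}" for lane in "xyzw"]
        aliases[full] = set(subs)       # r0 aliases r0.x, r0.y, r0.z, r0.w
        for sub in subs:
            aliases[sub] = {full}       # r0.x aliases r0 only
    return aliases

def conflicts(a, b, aliases):
    """Two names conflict if they are identical or alias each other."""
    return a == b or b in aliases[a]
```

In this model r0.x conflicts with r0 but not with r0.y. Note that
multi-subword operands such as r0.xy or r0.xyz are not covered; handling
them the same way would need a separate name for each of the 15
non-empty lane combinations of every register.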
REFERENCES
[1] S. Tallam and R. Gupta, "Bitwidth aware global register
allocation", Annual Symposium on Principles of Programming Languages,
pp. 85-96, 2003.
[2] Bengu Li, Youtao Zhang, and Rajiv Gupta, "Speculative Subword
Register Allocation in Embedded Processors", The 17th International
Workshop on Languages and Compilers for Parallel Computing, 2004.
--
Tzu-Chien Chiu,
3D Graphics Hardware Architect
<URL:http://www.csie.nctu.edu.tw/~jwchiu>