[PATCH] ARM: allow machine-CSE on litpool materialisations of a global
t.p.northover at gmail.com
Tue Nov 26 05:59:26 PST 2013
Currently, when movw/movt are not available, Darwin global address materialisation gets converted early on into something like:
%vreg0<def> = tLDRpci_pic <cp#0>, 0; mem:LD4[ConstantPool] tGPR:%vreg0
where "<cp#0> = var-(LPC0+4)" and the second "0" operand encodes the LPC0 label. A separate LPCn label is created even for non-PIC uses, although it is discarded later.
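As an illustration of why each use gets its own constant-pool entry, here is a minimal sketch of the arithmetic behind "var-(LPC0+4)". It assumes the usual Thumb behaviour that reading pc yields the instruction address plus 4, and that the pc-relative add sits at the LPCn label; the addresses and function names are made up for the example.

```python
# Toy model only: addresses and helper names are hypothetical.
THUMB_PC_READ_OFFSET = 4  # assumption: in Thumb state, reading pc gives instruction address + 4
MASK32 = 0xFFFFFFFF

def litpool_entry(var_addr, lpc_addr):
    """Value stored in the constant pool, i.e. var-(LPCn+4)."""
    return (var_addr - (lpc_addr + THUMB_PC_READ_OFFSET)) & MASK32

def materialise(entry, lpc_addr):
    """What adding pc at label LPCn produces at run time."""
    pc_value = lpc_addr + THUMB_PC_READ_OFFSET
    return (entry + pc_value) & MASK32

var_addr = 0x8000            # address of the global
lpc_labels = (0x100, 0x2A4)  # two different use sites, so two different LPCn labels

entries = [litpool_entry(var_addr, lpc) for lpc in lpc_labels]
results = [materialise(e, lpc) for e, lpc in zip(entries, lpc_labels)]
```

Both use sites end up with the same address, but the pool entries differ because each one is tied to its own label.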
This is bad for code size: as far as CSE is concerned, each of these tLDRpci_pic instructions is completely different, so the address of a global is often calculated redundantly. Even worse, these days this path only kicks in on CPUs like Cortex-M0, where code size is very nearly the most important factor.
This patch reworks the materialisation so that it's closer to the movw/movt case, where only a globaladdress is carried around until very late in the compiler (ARMExpandPseudoInsts to be precise) and this can be CSE'd without difficulty.
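The effect on CSE can be sketched with a toy model. This is not the real machine-CSE pass (which also cares about dominance, side effects and so on); it only treats two instructions as equal when opcode and operands match. The opcode name LOAD_GLOBAL_ADDR and the global @var are hypothetical stand-ins, not names taken from the patch.

```python
from typing import NamedTuple

class Instr(NamedTuple):
    dest: str        # virtual register defined
    opcode: str
    operands: tuple  # everything CSE compares

def toy_cse(instrs):
    """Keep the first instruction for each (opcode, operands) key; map later dests to it."""
    seen = {}
    kept = []
    replaced = {}
    for ins in instrs:
        key = (ins.opcode, ins.operands)
        if key in seen:
            replaced[ins.dest] = seen[key]
        else:
            seen[key] = ins.dest
            kept.append(ins)
    return kept, replaced

# Old form: every use has its own constant-pool index and LPCn number.
old = [Instr(f"%vreg{i}", "tLDRpci_pic", (f"cp#{i}", i)) for i in range(3)]

# New form (hypothetical pseudo): only the global address is carried around.
new = [Instr(f"%vreg{i}", "LOAD_GLOBAL_ADDR", ("@var",)) for i in range(3)]

kept_old, replaced_old = toy_cse(old)
kept_new, replaced_new = toy_cse(new)
```

In this model nothing is removed from the old form, while the new form collapses to a single materialisation that the other uses share.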
Currently this only affects Darwin: ELF uses a different scheme where this could be applied to LowerGLOBAL_OFFSET_TABLE, but the variable-specific offsets are already unaffected. That's hopefully going to be my weekend project!
OK to commit?