[PATCH] Implement ADRP CSE for global symbols

Jiangning Liu liujiangning1 at gmail.com
Thu May 15 02:48:08 PDT 2014


Hi Quentin,

Thanks a lot of your performance data, and it also surprised by those regressions. But I can reproduce the regressions on ARM64 for the benchmarks below,

CINT2000/164.gzip/164.g	 22.8601	 23.2677	 1.02	 +2%
CINT2000/255.vortex/255	 4.8632	 5.0017	 1.03	 +3%

If I don't have other higher priority tasks, I will have a look. But it would be much better if your patch of fixing MOVAddr issue can be shared with me.

Is there any possibility after this optimization the live range of base address is extended, and triggered spill/fill in hot loop by RA? If this is the case the rematerialization needs to be tuned, I think.

> > Anyway, I would be extremely appreciative if you can understand the current situation and agree to upstream this patch first.
>
> I am fine with that assuming:
> 1. This is disabled by default for the arm64-apple-* triple.

Actually both -global-merge-on-external and -global-merge-aligned are false by default, so if end-user doesn't enable them we wouldn't see any performance impact. Therefore, they are disabled by default for all targets.

Actually I was hoping your patch can get it enabled at least for ARM64. :-)

> 2. You have double check the alignment stuff :).

I checked the alignment behavior. If we don't set alignment, which is also the default behavior with/without my patch, the following code in AsmPrinter will be effective,

  // If the alignment is specified, we *must* obey it.  Overaligning a global
  // with a specified alignment is a prompt way to break globals emitted to
  // sections and expected to be contiguous (e.g. ObjC metadata).
  unsigned AlignLog = getGVAlignmentLog2(GV, *DL);

  for (const HandlerInfo &HI : Handlers) {
    NamedRegionTimer T(HI.TimerName, HI.TimerGroupName, TimePassesIsEnabled);
    HI.Handler->setSymbolSize(GVSym, Size);
  }

  // Handle common and BSS local symbols (.lcomm).
  if (GVKind.isCommon() || GVKind.isBSSLocal()) {
    if (Size == 0) Size = 1;   // .comm Foo, 0 is undefined, avoid it.
    unsigned Align = 1 << AlignLog;

So that means the RoundupToPowerof2(sizeof(MergedGlobal)) will be used as alignment.

The test case test/CodeGen/AArch64/global-merge.ll, which only uses "internal globals", also shows below with llc,

$ ../build/bin/llc -mtriple=arm64-apple-ios -O1 < test/CodeGen/AArch64/global-merge.ll
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_f1
	.align	2
_f1:                                    ; @f1
	.cfi_startproc
; BB#0:
Lloh0:
	adrp	x8, __MergedGlobals at PAGE
Lloh1:
	add	x8, x8, __MergedGlobals at PAGEOFF
	stp	w0, w1, [x8]
	ret
	.loh AdrpAdd	Lloh0, Lloh1
	.cfi_endproc

.zerofill __DATA,__bss,__MergedGlobals,8,3 ; @_MergedGlobals

So actually I'm keeping the original behavior.

Anyway, my patch doesn't change the original semantic at all, if only -global-merge-on-external and -global-merge-aligned are not enabled.

Are you happy with this?

Thanks,
-Jiangning

>
> Thanks,
> -Quentin

http://reviews.llvm.org/D3432






More information about the llvm-commits mailing list