[PATCH] [AArch64] Improve and enable the SeparateConstOffsetFromGEP for AArch64 backend.

Thu Nov 13 19:07:18 PST 2014

Hi Jingyue,

Nice suggestion. It really helps a lot for the review.

I've attached a new patch with full context. Review please.

Thanks,
-Hao

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:720
@@ +719,3 @@
+      StructType *StTy = cast<StructType>(*GTI);
+      int64_t Field = cast<ConstantInt>(GEP->getOperand(I))->getSExtValue();
+      // Skip field 0 as the offset is always 0.
----------------
jingyue wrote:
> HaoLiu wrote:
> > jingyue wrote:
> > > Not critical, but I slightly prefer getZExtValue() because a field index is non-negative and getElementOffset takes unsigned. 
> > Aha, previously I use getZExtValue(). You commented I should change to getSExtValue(). I thought it was not critical, so I modified it. 
> > Now, It seems I should roll back...
> > : )
> No, I don't think I suggested you change it to getSExtValue(). When I said "getSExtValue() returns uint64_t", I expected you to write
> ```
> uint64_t Field = ...->getZExtValue();
> ```
> because I slightly prefer delaying truncating which loses precision. But, anyway, I'm fine with "unsigned". 
Oh, I double checked that. I indeed misunderstood your meaning. Last time I used
"unsigned Field = xxx->getZExtValue()". not "uint64_t Field =  xxx->getZExtValue()".
Sorry for the noise. 
Anyway, diff 16135 has already fix it.

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:721
@@ +720,3 @@
+  for (unsigned I = 1, E = GEP->getNumOperands(); I != E; ++I, ++GTI) {
+    if (!isa<ConstantInt>(GEP->getOperand(I))) {
+      Value *Idx = GEP->getOperand(I);
----------------
jingyue wrote:
> HaoLiu wrote:
> > jingyue wrote:
> > > HaoLiu wrote:
> > > > jingyue wrote:
> > > > > Why checking whether it's a constant? Shouldn't you check whether it is an array index? My understanding is, the offsets of structure indices (which are guaranteed to be constant) and the constant offsets of array indices are already counted in accumulativeByteOffsets, and now you only need to rebuild the variadic part of the array indices. 
> > > > Yes, you are right. I'll add comments to explain this.
> > > I wasn't just proposing a comment change. I meant the source code should read something like
> > > ```
> > > if (isa<SequentialType>(GTI))
> > > ```
> > > You are building new GEPs/arithmetics for the remainder of the array indices, and checking whether the indexed type is a SequentialType matches your intention much better. 
> > > 
> > > I also believe checking for SequentialType instead of ConstantInt solves your previous question on handling "a = 1 + 2". I'll go back to that later. 
> > If check SequentialType, we also need to check wheter it is ConstantInt. Then the code will be like:
> >     if (isa<SequentialType>(GTI))
> >       if (!isa<ConstantInt>(GEP->getOperand(I))
> >          ...
> >       else
> >          AccumulativeByteOffset += Const
> > This looks more complicate.
> > 
> > About the "a = 1 + 2", I think handling it when extracting constant in indices is better. Then we don't need to add code in both lowerToXXX functions. Also if we handle it in lowerToXXX functions, the logic for the original splitGEP (I.E. when lowerGEP is diabled) can still not handle this problem.
> > 
> I'm confused. How does AccumulativeByteOffset in your snippet come up here? I think AccumulativeByteOffset is computed before calling transformToSimplerGEPs. Why do you bother changing that again? 
> 
> In case I haven't made this clear, I never expect Find/Extract always extracts all constant offsets. For example, given a = 1 + 2, I'm totally fine if the Find function finds only 2, and in that case the remaining code should treat 1 just as a variable remainder. The GEP after extraction may look like "gep base, i1, 1, i2" which may not be the optimal form, but again I am fine with that because it's sound. If we want to get the optimal form, I am looking for either a more comprehensive way (not just handling tricky cases like adding two constants; your initial diff might serve as a good start but the code was too complicated) or reassociating each index before this pass so that all constant offsets in the index are merged (which I suppose is the approach we want to go with). 
> 
> With that, do we still need the nested "if (!isa<CostantInt>)" check? Do you have an example where simply checking isa<SequentialType> isn't enough? I think a concrete example may help to make things clearer. 
Yeah, I understand your meaning. We just let the "1" as a remainder. It's sound because the logic of Find/Extract is to handle only one constant and return. It's reasonable.
Previously I focused on making it optimal. So I want to extract all the constant. Anyway, this is just corner case and it may be fixed in the future. I'm fine with that. 
BTW, there is already a test case @test_const_add checking we generate correct result for "a = 1 + 2".

================
Comment at: lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp:889
@@ +888,3 @@
+
+void SeparateConstOffsetFromGEP::transformToSimplerGEPs(
+    GetElementPtrInst *GEP, int64_t AccumulativeByteOffset) {
----------------
jingyue wrote:
> HaoLiu wrote:
> > jingyue wrote:
> > > jingyue wrote:
> > > > HaoLiu wrote:
> > > > > jingyue wrote:
> > > > > > Can we merge the two transformToXXX functions? The logics are pretty similar. 
> > > > > I think we'd better not do merge. They logic look similar, but they do quite different things. If we want to merge them, we need to add checks like "if (TransformToSimplerGEP)" in many places (From the first for loop logic to the function end). That would be a quite hard work for other people to understand the code.
> > > > The more I look at these two functions, the more strongly I believe we should merge them. Especially, after you get rid of ResultIndices, you probably only need to check LowerGEP once in the merged function. No? 
> > > I don't see this comment addressed. 
> > I still doubt about your suggestion. The diff 16027 gets ride of ResultIndices. They are still quite different and difficult to combine them.
> > Lowering to arithmetic operations starts from creating ptrtoint, and then creates ADD/MUL/SHL, and ends with creating ADD and inttoptr. 
> > Lowering to GEPs starts from creating bitcast, and then creates MUL/SHL/GEPs, and ends with creating GEP and bitcast. 
> Anyway, I'll probably do that by myself after this diff is in. 
OK, thanks.

http://reviews.llvm.org/D5864