[PATCH] D14064: Convert Returned Constant i1 Values to i32 on PPC64

hfinkel@anl.gov via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 10 07:58:48 PST 2015


hfinkel added a comment.

In http://reviews.llvm.org/D14064#276116, @hfinkel wrote:

> ...
>
> As a general comment, I'd make this a bit more general. The core problem, as you know, if that the logic that takes care of eliminating unnecessary round-tripping through crbit regs is in PPCISelLowering, and thus is BB local. Using crbits makes a lot of sense only when some input comes from a comparison, or feeds a conditional branch. If none of the i1 values comes from a comparison, and none feeds a branch, then it seems likely we'll have unnecessary round-tripping through crbit regs.
>
> Thus, it seems like you could start with all i1 consumers that are not selects or conditional branches (returns, function arguments, [zs]exts), and build a tree of defines all the way back to the initial producers. Unless one of the producers is a comparison, then promote everything to i32.


I feel like you've not addressed this, and this pass is still much too focused on a particular special case. While I'm generally fine with incremental improvement, the comments in the code should at least explain why this particular special case is worth handling in isolation.

I think this is justifiable because handling more-complex cases might require additional logic to take them out of canonical form. To provide a quick example, consider this:

  $ cat /tmp/f.cpp 
  bool test1(bool a, bool b, bool c, bool d) {
    return a && b && c && d;
  }

but as we canonicalize this, we get:

  $ clang++ -O3 -S -emit-llvm -o - /tmp/f.cpp
  ; ModuleID = '/tmp/f.cpp'
  target datalayout = "E-m:e-i64:64-n32:64"
  target triple = "powerpc64-unknown-linux-gnu"
  
  ; Function Attrs: nounwind readnone
  define zeroext i1 @_Z5test1bbbb(i1 zeroext %a, i1 zeroext %b, i1 zeroext %c, i1 zeroext %d) #0 {
  entry:
    %brmerge.demorgan = and i1 %a, %b
    br i1 %brmerge.demorgan, label %land.lhs.true.5, label %land.end
  
  land.lhs.true.5:                                  ; preds = %entry
    %d. = and i1 %c, %d
    ret i1 %d.
  
  land.end:                                         ; preds = %entry
    ret i1 false
  }

and the code we generate, using crbits or not, is pretty bad:

  $ clang++ -O3 -S -o - /tmp/f.cpp
  ...
  .Lfunc_begin0:
  # BB#0:                                 # %entry
  	andi. 6, 6, 1
  	and 3, 3, 4
  	crmove	 20, 1
  	andi. 5, 5, 1
  	crmove	 21, 1
  	andi. 3, 3, 1
  	bc 4, 1, .LBB0_3
  # BB#1:                                 # %land.lhs.true.5
  	crand 20, 21, 20
  	li 3, 1
  	bclr 12, 20, 0
  # BB#2:                                 # %land.lhs.true.5
  	li 3, 0
  	blr
  .LBB0_3:                                # %land.end
  	li 3, 0
  	blr

but this is still not good:

  $ clang++ -O3 -S -o - /tmp/f.cpp -mno-crbits
  ...
  .Lfunc_begin0:
  # BB#0:                                 # %entry
  	cmplwi 0, 3, 0
  	beq	 0, .LBB0_3
  # BB#1:                                 # %entry
  	cmplwi 0, 4, 0
  	beq	 0, .LBB0_3
  # BB#2:                                 # %land.lhs.true.5
  	and 3, 5, 6
  	clrldi	 3, 3, 32
  	blr
  .LBB0_3:                                # %land.end
  	li 3, 0
  	blr

so we have a lot of work to do on this front.

I will point out, however, that:

  $ cat /tmp/f2.cpp 
  bool test1(bool a, bool b, bool c, bool d) {
    return a & b & c & d;
  }
  
  $ clang++ -O3 -S -o - /tmp/f2.cpp
  ...
  .Lfunc_begin0:
  # BB#0:                                 # %entry
  	and 3, 3, 4
  	and 3, 3, 5
  	and 3, 3, 6
  	blr

and that is what we want.


================
Comment at: lib/Target/PowerPC/PPCTargetMachine.cpp:286
@@ -285,1 +285,3 @@
+  if (TM->getOptLevel() != CodeGenOpt::None)
+    addPass(createBoolRetToIntPass());
   addPass(createAtomicExpandPass(&getPPCTargetMachine()));
----------------
kbarton wrote:
> This placement will make this one of the first passes that gets runs, won't it? Is this where we want to run it? I admit I haven't thought through the different placements for this, but for some reason I was expecting it to run later (not sure why). 
This seems like a reasonable place for it. We probably do want it before CodeGenPrep runs, and I see no particular reason for it to be later.



http://reviews.llvm.org/D14064





More information about the llvm-commits mailing list