[llvm-dev] [AArch64/Cyclone] ZCZeroing Feature

Haicheng Wu via llvm-dev llvm-dev at lists.llvm.org
Thu Apr 28 08:07:52 PDT 2016


Hi,

I am trying to port FeatureZCZeroing from Cyclone to Kryo. Using immediate
#0 to zero out W and X registers works great on Kryo. But using #0 to zero
out floating-point registers sometimes causes extra register spills or move
instructions on either Cyclone or Kryo.

 

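Take the following C function as an example:

#include <math.h>

double foo(int n) {
  double r = -10000;
  for (int i = 0; i < n; i++) {
    double x = sin(i);  /* x was undeclared in the original snippet */
    r = fmax(r, x);     /* fmax matches the fmaxnm in the assembly */
  }
  return r;
}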

 

When compiled for Cyclone, the loop body contains one spill and two reloads,
as shown below:

 

.LBB0_1:                                // %for.body
                                        // =>This Inner Loop Header: Depth=1
        str     q0, [sp]                // 16-byte Folded Spill
        ldr     q0, [sp]                // 16-byte Folded Reload
        bl      sin
        fmaxnm  d8, d8, d0
        ldr     q0, [sp]                // 16-byte Folded Reload
        fadd    d0, d0, d9
        add     w20, w20, #1            // =1
        cmp     w20, w19
        b.lt    .LBB0_1

 

If FeatureZCZeroing is disabled (together with FeatureZCRegMove) on Cyclone,
the generated assembly does not contain these load/store instructions:

 

.LBB0_1:                                // %for.body
                                        // =>This Inner Loop Header: Depth=1
        mov     v0.16b, v8.16b
        bl      sin
        fmaxnm  d9, d9, d0
        fadd    d8, d8, d10
        add     w20, w20, #1            // =1
        cmp     w20, w19
        b.lt    .LBB0_1

 

PR27454 has a .ll test case attached. It would be nice if this problem
could be solved so that Kryo and Cyclone can use the same method to zero
out floating-point registers.

 

Best,

Haicheng
