[PATCH] D93080: [RISCV] Use tail agnostic policy for vsetvli instruction emitted in the custom inserter
Craig Topper via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 10 19:52:27 PST 2020
craig.topper added a comment.
In D93080#2447590 <https://reviews.llvm.org/D93080#2447590>, @khchen wrote:
> Hi @craig.topper
> I think defaulting to tail undisturbed would be friendlier and more intuitive for programmers and the vectorizer in the reduction case.
> Please see the example below:
>
> // scalar
> float sum = 0;
> for (int i = 0; i < n; ++i) {
>   sum += src1[i] * src2[i];
> }
> return sum;
>
>
>
> float foo(float *src1, float *src2, size_t n) {
>   size_t len;
>   vsetvlmax_e32m8();
>   vfloat32m8_t v16 = vfmv_v_f_f32m8(0.0);
>   vsetvl_e32m1();
>   vfloat32m1_t v24 = vfmv_s_f_f32m1(vundefined_f32m1(), 0.0);
>   for (; (len = vl_extract(vsetvl_e32m8(n))) > 0; n -= len) {
>     vfloat32m8_t v0 = vle32_v_f32m8(src1);
>     vfloat32m8_t v8 = vle32_v_f32m8(src2);
> #if 0
>     if maxvl = 2, n = 3;
>     src1 = [1, 2, 3]
>     src2 = [2, 3, 4]
>     1st iteration, vl=2, input v16 = [0, 0], result v16 = [2, 6]
>     2nd iteration, vl=1, input v16 = [2, 6], result v16 = [14, 6] // tail is still 6 because of tail undisturbed.
> #endif
>     // v16, v0 and v8 are all LMUL=8 values, so use the m8 variant.
>     v16 = vfmacc_vv_f32m8(v16, v0, v8);
>     src1 += len;
>     src2 += len;
>   }
>   vsetvlmax_e32m8();
>   // input v16 = [14, 6], result = [20, ?]
>   vfloat32m1_t result = vfredosum_vs_f32m8_f32m1(v16, v24);
>   return vfmv_f_s_f32m1_f32(result);
> }
>
> The community also discussed this difference before in issue <https://github.com/riscv/riscv-v-spec/issues/157#issuecomment-527104675>.
Maybe we should use tail undisturbed for instructions that have something like 'let Constraints = "$rd = $rs3"'?
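To make the difference in the quoted example concrete, here is a minimal scalar model of the accumulate loop in plain C. It is only a sketch, not RVV intrinsics: `VLMAX`, `model_vfmacc` and `model_dot` are hypothetical names, and tail agnostic is modelled as overwriting tail elements with all-ones bits, which is one of the behaviours the spec permits (an implementation may also leave them unchanged).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VLMAX 2 /* hypothetical maximum vector length, as in the quoted example */

/* Model of a multiply-accumulate: acc[i] += a[i] * b[i] for i < vl.
   tail_undisturbed != 0: elements vl..VLMAX-1 keep their previous value.
   tail_undisturbed == 0: tail elements are filled with all-ones bits
   (an assumption: one permitted tail-agnostic outcome). */
static void model_vfmacc(float acc[VLMAX], const float *a, const float *b,
                         size_t vl, int tail_undisturbed) {
  assert(vl <= VLMAX);
  for (size_t i = 0; i < vl; ++i)
    acc[i] += a[i] * b[i];
  if (!tail_undisturbed)
    for (size_t i = vl; i < VLMAX; ++i)
      memset(&acc[i], 0xFF, sizeof acc[i]);
}

/* Strip-mined dot product: accumulate per element, then reduce over
   all VLMAX elements after the loop, as the quoted code does. */
static float model_dot(const float *src1, const float *src2, size_t n,
                       int tail_undisturbed) {
  float acc[VLMAX] = {0.0f};
  while (n > 0) {
    size_t vl = n < VLMAX ? n : VLMAX;
    model_vfmacc(acc, src1, src2, vl, tail_undisturbed);
    src1 += vl;
    src2 += vl;
    n -= vl;
  }
  float sum = 0.0f;
  for (size_t i = 0; i < VLMAX; ++i)
    sum += acc[i];
  return sum;
}
```

With src1 = [1, 2, 3] and src2 = [2, 3, 4], calling model_dot with tail_undisturbed = 1 gives 20, matching the quoted trace. With tail_undisturbed = 0 the second iteration (vl = 1) clobbers the partial sum 6 in element 1, and in this model the result is NaN because all-ones is a NaN bit pattern for a 32-bit float.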
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D93080/new/
https://reviews.llvm.org/D93080