[PATCH] D99741: [RISCV][Clang] Add some RVV Floating-Point intrinsic functions. (vfclass, vfmerge, vfrec7, vfrsqrt7, vfsqrt)

Thu Apr 15 00:09:16 PDT 2021

rogfer01 added a comment.

In D99741#2690156 <https://reviews.llvm.org/D99741#2690156>, @craig.topper wrote:

> In D99741#2690124 <https://reviews.llvm.org/D99741#2690124>, @thakis wrote:
>
>> One of the ten commits that landed here (https://github.com/llvm/llvm-project/compare/a3bfddbb6a27...59d5b8c27b43), this being one of them, slowed down check-clang by over 20%: https://bugs.llvm.org/show_bug.cgi?id=49962
>>
>> Since recovering test runtime regressions is very difficult, it's probably a good idea to revert this first and then reland in smaller chunks to see which of the commits causes the problem (?)
>
> The fact that we're running the optimization pipeline in these tests might be to blame. Might also be that riscv_vector.h is currently about 71000 lines which probably isn't quick to parse.

I hacked the `RISCVVEmitter.cpp` file to emit a `riscv_vector_testing.h` that wraps all the definitions with macros like `RVV_TEST_vadd` and `RVV_TEST_vadd_mask` (I used the `IRName` as it was easier for the experiment).

The testing header looks like this

  #if defined(RVV_TEST_vadd)                                                       
  __rvv_overloaded vint8m1_t vadd(vint8m1_t op0, vint8m1_t op1, size_t op2){       
    return vadd_vv_i8m1(op0, op1, op2);                                            
  }                                                                                
  #endif // RVV_TEST_vadd                                                          

  #if defined(RVV_TEST_vadd_mask)                                                  
  __rvv_overloaded vint8m1_t vadd(vbool8_t op0, vint8m1_t op1, vint8m1_t op2, vint8m1_t op3, size_t op4){
    return vadd_vv_i8m1_m(op0, op1, op2, op3, op4);                                
  }                                                                                
  #endif // RVV_TEST_vadd_mask                                                     

In a release build (in a machine that is not very fast) looks like this:

  $ ./bin/llvm-lit --time-test -sv ../llvm-src/clang/test/CodeGen/RISCV/rvv-intrinsics/vadd.c ../llvm-src/clang/test/CodeGen/RISCV/rvv-intrinsics/vadd_fast.c 
  llvm-lit: /home/rferrer/fio/upstream/llvm-src/llvm/utils/lit/lit/llvm/config.py:428: note: using clang: /home/rferrer/fio/upstream/llvm-build-release/bin/clang
  Slowest Tests:
  --------------------------------------------------------------------------
  8.91s: Clang :: CodeGen/RISCV/rvv-intrinsics/vadd.c
  2.11s: Clang :: CodeGen/RISCV/rvv-intrinsics/vadd_fast.c

Where `vadd.c` is the current file and `vadd_fast.c` is the same as vadd.c with this change

  --- vadd.c	2021-04-15 06:51:30.554996630 +0000
  +++ vadd_fast.c	2021-04-15 07:00:38.684074973 +0000
  @@ -8,7 +8,9 @@
   // RUN:   -Werror -Wall -o - %s -S >/dev/null 2>&1 | FileCheck --check-prefix=ASM --allow-empty %s

   // ASM-NOT: warning
  -#include <riscv_vector.h>
  +#define RVV_TEST_vadd
  +#define RVV_TEST_vadd_mask
  +#include <riscv_vector_testing.h>

   // CHECK-RV32-LABEL: @test_vadd_vv_i8mf8(
   // CHECK-RV32-NEXT:  entry:

This seems to suggest that making sure the tested header contains only the necessary will speed the testing up significantly.

We would want to make sure `riscv_vector.h` and `riscv_vector_testing.h` are equivalent when all the macros are enabled. I guess such a test is feasible if we preprocess the testing header with all the `RVV_TEST_nnn` macros enabled and then we do some form of textual comparison (which will likely have to ignore whitespace). This way we can test `riscv_vector.h` using its proxy `riscv_vector_testing.h`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99741/new/

https://reviews.llvm.org/D99741