[cfe-dev] [RFC][RISCV] Add intrinsic and/or builtin functions by #pragma

David Rector via cfe-dev cfe-dev at lists.llvm.org
Tue Jun 15 08:10:53 PDT 2021


IIUC OpenCL faced the same issue, and their solution was pretty clever and generalizable. A similar approach could conceivably improve compile speeds still further, while also minimizing memory usage and making pragmas unnecessary: https://lists.llvm.org/pipermail/cfe-dev/2021-February/067610.html

The basic idea, if I recall correctly (Anastasia, cc'd, might correct me), is to create the necessary declarations whenever lookup fails.  I.e., if lookup of `vint32m1_t` fails, then before giving up, clang checks whether that is the name of one of your intrinsics; if so, it adds the necessary declaration or overloaded declarations (the particulars handled via Tablegen) and returns that.

The effect is to "instantiate" these declarations as needed, as if from a template.  
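
To make the idea concrete, here is a toy sketch in plain C of "declare on failed lookup".  It is not clang code and not the OpenCL implementation: `IntrinsicDesc`, `SymbolTable`, `kIntrinsics` and `lookup` are hypothetical names I made up, the table stands in for whatever Tablegen would generate, and the prototype strings are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Tablegen-generated intrinsic table. */
typedef struct {
    const char *name;      /* identifier the user wrote */
    const char *prototype; /* illustrative only, never parsed here */
} IntrinsicDesc;

static const IntrinsicDesc kIntrinsics[] = {
    {"vadd", "vint32m1_t vadd(vint32m1_t, vint32m1_t, size_t)"},
    {"vsub", "vint32m1_t vsub(vint32m1_t, vint32m1_t, size_t)"},
};

enum { MAX_DECLS = 16 };

/* Toy symbol table: holds only declarations that have been materialized. */
typedef struct {
    const IntrinsicDesc *decls[MAX_DECLS];
    size_t count;           /* declarations currently in the table */
    size_t lazy_insertions; /* how many were created by a failed lookup */
} SymbolTable;

/* Ordinary lookup: search what has already been declared. */
static const IntrinsicDesc *find_declared(const SymbolTable *st,
                                          const char *name) {
    for (size_t i = 0; i < st->count; ++i)
        if (strcmp(st->decls[i]->name, name) == 0)
            return st->decls[i];
    return NULL;
}

/* Last-ditch step: is the name one of the known intrinsics? */
static const IntrinsicDesc *find_intrinsic(const char *name) {
    for (size_t i = 0; i < sizeof kIntrinsics / sizeof kIntrinsics[0]; ++i)
        if (strcmp(kIntrinsics[i].name, name) == 0)
            return &kIntrinsics[i];
    return NULL;
}

/* Lookup that "instantiates" an intrinsic declaration only when needed. */
const IntrinsicDesc *lookup(SymbolTable *st, const char *name) {
    assert(st != NULL && name != NULL);
    const IntrinsicDesc *decl = find_declared(st, name);
    if (decl != NULL)
        return decl; /* already declared: no extra cost */
    const IntrinsicDesc *desc = find_intrinsic(name);
    if (desc == NULL || st->count == MAX_DECLS)
        return NULL; /* genuinely unknown name (or toy table is full) */
    st->decls[st->count++] = desc; /* add the declaration on demand */
    st->lazy_insertions++;
    return desc;
}
```

In this sketch the first `lookup(&st, "vadd")` inserts the declaration and bumps `lazy_insertions`; later lookups of the same name hit the ordinary path, and a name that is not in the table still fails as before.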

What also seems nice about this approach is that heavy-duty users can alternatively choose to just #include the large header, or use a pre-compiled header, and thereby automatically avoid any costs associated with this last-ditch-lookup solution.

> On Jun 15, 2021, at 2:59 AM, Kito Cheng via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> 
> Hi :
> 
> 
> # TL;DR:
> 
> This is the intrinsic and/or builtin function issue again. In this
> RFC we propose using a pragma to import intrinsics and declare
> intrinsic wrapper functions, in order to reduce compilation time.
> 
> And here is the PoC for this RFC:
> https://reviews.llvm.org/D103228
> 
> # Background:
> 
> The RISC-V vector extension defines 25,386 intrinsic functions and
> 2,102 overloaded intrinsic functions in riscv_vector.h, which
> increases compilation time considerably; the header file contains
> ~60k lines for those overloaded functions and intrinsic wrapper
> functions.
> 
> An empty file that includes riscv_vector.h takes 0.395s to compile
> with a release build and 8.067s with a debug build, and this also
> increases the clang test time.
> 
> # Proposal:
> 
> Use Tablegen to generate a table of the intrinsic wrapper functions,
> and then use a pragma to declare those intrinsic wrapper functions.
> 
> Syntax:
> ```c
> #pragma riscv intrinsic vector
> ```
> 
> The pragma then imports all builtin functions and intrinsic wrappers
> into the symbol table, which could save a lot of the time spent
> parsing the prototypes of the intrinsic wrapper functions.
> 
> This trick is borrowed from the AArch64/SVE implementation in GCC:
> https://github.com/gcc-mirror/gcc/blob/master/gcc/config/aarch64/arm_sve.h#L40
> 
> 
> # Experimental Results:
> ## Size of riscv_vector.h:
>        |      size |     LoC |
> -------+-----------+---------+
> Before | 4,434,725 |  69,749 |
> After  |     5,463 |     159 |
> 
> ## Compilation Speed for Simple File
> 
> testcase:
> ```c
> #include <riscv_vector.h>
> 
> vint32m1_t test_vadd_vv_vfloat32m1_t(vint32m1_t op1, vint32m1_t op2,
> size_t vl) {
>  return vadd(op1, op2, vl);
> }
> ```
> 
> Release build:
>  Before: 0m0.417s
>  After:  0m0.090s
> 
> Debug build:
>  Before: 0m8.016s
>  After:  0m2.295s
> 
> 
> ## Regression Time
> LLVM regression tests on our 48-core server:
> Release build:
>  Before : Testing Time: 203.81s
>  After : Testing Time: 181.13s
> 
> Debug build:
>  Before : Testing Time: 675.18s
>  After : Testing Time: 647.20s
> 
> 
> 
> Any comments or feedback are appreciated!