[cfe-dev] [RFC][RISCV] Add intrinsic and/or builtin functions by #pragma
David Rector via cfe-dev
cfe-dev at lists.llvm.org
Tue Jun 15 08:10:53 PDT 2021
IIUC OpenCL faced the same issue, and their solution was pretty clever and generalizable; a similar approach could conceivably improve compile speeds still further, while also minimizing memory usage and making pragmas unnecessary. https://lists.llvm.org/pipermail/cfe-dev/2021-February/067610.html
The basic idea, if I recall (Anastasia, cc’d, might correct me), is to create the necessary declarations whenever lookup fails. I.e., if lookup of `vint32m1_t` fails, then before giving up clang checks whether that is the name of one of your intrinsics; if so, it adds the necessary declaration or overloaded declarations (the particulars handled via Tablegen) and returns that.
The effect is to "instantiate" these declarations as needed, as if from a template.
What also seems nice about this approach is that heavy-duty users can alternatively choose to just #include the large header, or use a pre-compiled header, and thereby automatically avoid any costs associated with this last-ditch-lookup solution.
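To make the last-ditch lookup idea concrete, here is a minimal toy sketch in C. It is not clang's or OpenCL's actual implementation: `Decl`, `SymbolTable`, `kIntrinsicTable`, `lookup`, and `preload_all` are all hypothetical names, and the hard-coded two-entry table merely stands in for whatever Tablegen would generate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical declaration record: a name plus its prototype text. */
typedef struct {
    const char *name;
    const char *prototype;
} Decl;

/* Stand-in for a Tablegen-generated table (two entries for brevity). */
static const Decl kIntrinsicTable[] = {
    {"vadd", "vint32m1_t vadd(vint32m1_t, vint32m1_t, size_t)"},
    {"vsub", "vint32m1_t vsub(vint32m1_t, vint32m1_t, size_t)"},
};
#define NUM_INTRINSICS (sizeof kIntrinsicTable / sizeof kIntrinsicTable[0])

/* Toy symbol table with a fixed capacity. */
#define MAX_DECLS 16
typedef struct {
    Decl decls[MAX_DECLS];
    size_t count;        /* declarations currently visible */
    size_t lazily_added; /* how many came from the fallback path */
} SymbolTable;

/* Ordinary lookup: search only what has already been declared. */
static const Decl *find_declared(const SymbolTable *st, const char *name) {
    for (size_t i = 0; i < st->count; ++i)
        if (strcmp(st->decls[i].name, name) == 0)
            return &st->decls[i];
    return NULL;
}

static const Decl *add_decl(SymbolTable *st, Decl d) {
    if (st->count == MAX_DECLS)
        return NULL; /* toy table is full */
    st->decls[st->count] = d;
    return &st->decls[st->count++];
}

/* Try ordinary lookup first; only on failure consult the intrinsic
   table and materialize the matching declaration on demand. */
const Decl *lookup(SymbolTable *st, const char *name) {
    assert(st != NULL && name != NULL);
    const Decl *found = find_declared(st, name);
    if (found)
        return found;
    for (size_t i = 0; i < NUM_INTRINSICS; ++i) {
        if (strcmp(kIntrinsicTable[i].name, name) == 0) {
            const Decl *added = add_decl(st, kIntrinsicTable[i]);
            if (added)
                st->lazily_added++;
            return added;
        }
    }
    return NULL; /* genuinely undeclared */
}

/* Models including the full header (or a PCH): everything is declared
   up front, so the fallback path is never taken. */
void preload_all(SymbolTable *st) {
    assert(st != NULL);
    for (size_t i = 0; i < NUM_INTRINSICS; ++i)
        if (!find_declared(st, kIntrinsicTable[i].name))
            add_decl(st, kIntrinsicTable[i]);
}
```

Starting from an empty `SymbolTable`, `lookup(&st, "vadd")` adds exactly one declaration, and a second lookup of the same name returns the existing one without touching the table again. After `preload_all`, every lookup is satisfied by the ordinary path, which mirrors the header/PCH option mentioned above.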
> On Jun 15, 2021, at 2:59 AM, Kito Cheng via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> Hi :
>
>
> # TL;DR:
>
> This is the intrinsic and/or builtin function issue again. In this
> RFC we propose using a pragma to import intrinsics and declare the
> intrinsic wrapper functions, in order to reduce compilation time.
>
> And here is the PoC for this RFC:
> https://reviews.llvm.org/D103228
>
> # Background:
>
> The RISC-V vector extension defines 25,386 intrinsic functions and
> 2,102 overloaded intrinsic functions in riscv_vector.h, which
> significantly increases compilation time; the header file contains
> ~60k lines for those overloaded functions and intrinsic wrapper
> functions.
>
> An otherwise empty file that includes riscv_vector.h takes 0.395s to
> compile on a release build and 8.067s on a debug build; this also
> increases the clang test time.
>
> # Proposal:
>
> Using Tablegen to generate the table of the intrinsic wrapper
> functions and then using pragma to declare intrinsic wrapper
> functions.
>
> Syntax:
> ```c
> #pragma riscv intrinsic vector
> ```
>
> The pragma then imports all builtin functions and intrinsic wrappers
> into the symbol table, which saves a lot of the time otherwise spent
> parsing the prototypes of the intrinsic wrapper functions.
>
> This trick is borrowed from the AArch64/SVE implementation in GCC:
> https://github.com/gcc-mirror/gcc/blob/master/gcc/config/aarch64/arm_sve.h#L40
>
>
> # Experimental Results:
> ## Size of riscv_vector.h:
>        | Size (bytes) | LoC    |
> -------|--------------|--------|
> Before | 4,434,725    | 69,749 |
> After  | 5,463        | 159    |
>
> ## Compilation Speed for Simple File
>
> testcase:
> ```c
> #include <riscv_vector.h>
>
> vint32m1_t test_vadd_vv_vfloat32m1_t(vint32m1_t op1, vint32m1_t op2,
>                                      size_t vl) {
>   return vadd(op1, op2, vl);
> }
> ```
>
> Release build:
> Before: 0m0.417s
> After: 0m0.090s
>
> Debug build:
> Before: 0m8.016s
> After: 0m2.295s
>
>
> ## Regression Time
> LLVM regression tests on our 48-core server:
> Release build:
> Before : Testing Time: 203.81s
> After : Testing Time: 181.13s
>
> Debug build:
> Before : Testing Time: 675.18s
> After : Testing Time: 647.20s
>
>
>
> Any comments or feedback are appreciated!