[cfe-dev] [RFC][RISCV] Add intrinsic and/or builtin functions by #pragma

Anastasia Stulova via cfe-dev cfe-dev at lists.llvm.org
Tue Jun 22 10:50:16 PDT 2021


FYI, in case it helps, we have started documenting the internals of the
approach at https://clang.llvm.org/docs/OpenCLSupport.html#opencl-builtins,
although the documentation is still rather brief. There is not much that is
OpenCL specific in the approach we have implemented, so it should be easy to
generalize with some renaming and minor refactoring (CC'ing Sven, who might be
able to provide more info if needed). You might need to add a few special types
if you use any that we don't have in OpenCL yet, although we already cover a
good variety from C99.

We removed the need for the pragmas in recent commits, but mainly because the
pragma, as defined in the OpenCL spec, was not useful: it did not behave like
a header include. The TableGen-based header include is very fast compared to
parsing the large header files, so I can certainly recommend this route.
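
To illustrate the declare-on-failed-lookup idea that David describes in the
quoted message below, here is a minimal sketch in C. It is a toy model only,
not Clang's code: the table contents, the type names `GeneratedBuiltin` and
`SymbolTable`, and the function `lookup_or_declare` are all hypothetical, and
the hand-written array merely stands in for a table that TableGen would
generate.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a TableGen-generated table of builtins. */
typedef struct {
    const char *name;      /* builtin name as spelled in source */
    const char *prototype; /* textual prototype, for illustration only */
} GeneratedBuiltin;

static const GeneratedBuiltin generated_table[] = {
    {"vadd", "vint32m1_t(vint32m1_t, vint32m1_t, size_t)"},
    {"vsub", "vint32m1_t(vint32m1_t, vint32m1_t, size_t)"},
};

#define MAX_SYMBOLS 16

/* Toy symbol table: holds only the declarations created so far. */
typedef struct {
    const GeneratedBuiltin *decls[MAX_SYMBOLS];
    size_t count;
} SymbolTable;

/* Ordinary lookup among names that are already declared. */
static const GeneratedBuiltin *find_declared(const SymbolTable *st,
                                             const char *name) {
    for (size_t i = 0; i < st->count; ++i)
        if (strcmp(st->decls[i]->name, name) == 0)
            return st->decls[i];
    return NULL;
}

/* Try the ordinary lookup first. Only if that fails, consult the generated
   table and declare the builtin on demand. Returns NULL for unknown names
   (or if the toy symbol table is full). */
const GeneratedBuiltin *lookup_or_declare(SymbolTable *st, const char *name) {
    assert(st != NULL && name != NULL);
    const GeneratedBuiltin *found = find_declared(st, name);
    if (found)
        return found;
    size_t n = sizeof generated_table / sizeof generated_table[0];
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(generated_table[i].name, name) == 0 &&
            st->count < MAX_SYMBOLS) {
            st->decls[st->count++] = &generated_table[i];
            return &generated_table[i];
        }
    }
    return NULL;
}
```

In this model a translation unit that only calls `vadd` ends up with a single
declaration in its symbol table, rather than one per table entry, which is the
source of the savings over parsing a large header.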

Cheers,
Anastasia
________________________________
From: cfe-dev <cfe-dev-bounces at lists.llvm.org> on behalf of Kito Cheng via cfe-dev <cfe-dev at lists.llvm.org>
Sent: 22 June 2021 03:41
To: David Rector <davrecthreads at gmail.com>
Cc: Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [RFC][RISCV] Add intrinsic and/or builtin functions by #pragma

Hi David:

Thanks for the info. I have been investigating the OpenCL intrinsics for the
last few days, and I saw that OpenCL already uses some #pragmas to turn
extensions on and off.
So I think the mechanisms are pretty similar; the difference is that the
OpenCL approach needs a new .td file to generate those helper
functions.

Our approach extends the existing builtin declaration mechanism by adding
one field to record the enable condition.
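
As a rough sketch of what "one field to record the enable condition" could
look like, consider the following C fragment. Everything in it is assumed for
illustration: the feature bits, the `BuiltinInfo` layout, the table entries
and `builtin_is_enabled` are hypothetical and are not taken from the PoC
patch or from Clang's builtin tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; real -march handling is far richer. */
enum {
    EXT_NONE   = 0,
    EXT_VECTOR = 1u << 0,
    EXT_ZFH    = 1u << 1,
};

/* A builtin record with one extra field for the enable condition. */
typedef struct {
    const char *name;
    unsigned required_exts; /* all of these bits must be enabled */
} BuiltinInfo;

/* Illustrative entries only. */
static const BuiltinInfo builtin_table[] = {
    {"vadd_vv_i32m1", EXT_VECTOR},
    {"vfadd_vv_f16m1", EXT_VECTOR | EXT_ZFH},
};

/* A builtin is available only if every extension it requires is enabled.
   Unknown names are reported as not enabled. */
bool builtin_is_enabled(const char *name, unsigned enabled_exts) {
    assert(name != NULL);
    size_t n = sizeof builtin_table / sizeof builtin_table[0];
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(builtin_table[i].name, name) == 0) {
            unsigned required = builtin_table[i].required_exts;
            return (enabled_exts & required) == required;
        }
    }
    return false;
}
```

With a field like this, the same table can serve every -march combination:
the check runs when a builtin is declared or looked up, so nothing has to be
regenerated per target configuration.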

We considered pre-compiled headers before, but they do not seem to fit the
RISC-V scenario: different -march combinations affect the content of the
header, so pre-compiled headers do not seem to work for the RISC-V intrinsic
headers.


Thanks :)

On Tue, Jun 15, 2021 at 11:11 PM David Rector via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
>
> IIUC OpenCL faced the same issue, and their solution was pretty clever and generalizable; a similar approach could conceivably improve compile speeds still further, while also minimizing memory usage and making pragmas unnecessary.  https://lists.llvm.org/pipermail/cfe-dev/2021-February/067610.html
>
> The basic idea, if I recall (Anastasia cc’d might correct me), is to create the necessary declarations whenever lookup fails.  I.e., if lookup of `vint32m1_t` fails, before giving up clang checks if that is the name of one of your intrinsics; if so it adds the necessary declaration/overloaded declarations (the particulars handled via Tablegen) and returns that.
>
> The effect is to "instantiate" these declarations as needed, as if from a template.
>
> What also seems nice about this approach is that heavy-duty users can alternatively choose to just #include the large header, or use a pre-compiled header, and thereby automatically avoid any costs associated with this last-ditch-lookup solution.
>
> On Jun 15, 2021, at 2:59 AM, Kito Cheng via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> Hi :
>
>
> # TL;DR:
>
> This is the intrinsic and/or builtin function issue again. In
> this RFC we propose using a pragma to import intrinsics and declare
> intrinsic wrapper functions, in order to reduce compilation time.
>
> And here is the PoC for this RFC:
> https://reviews.llvm.org/D103228
>
> # Background:
>
> The RISC-V vector extension defines 25,386 intrinsics and 2,102
> overloaded intrinsic functions in riscv_vector.h, which significantly
> increases compilation time; the header file contains ~60k lines for those
> overloaded functions and intrinsic wrapper functions.
>
> A file that only includes riscv_vector.h takes 0.395s to compile with a
> release build and 8.067s with a debug build, and this also increases the
> clang test time.
>
> # Proposal:
>
> Use Tablegen to generate a table of the intrinsic wrapper
> functions, and then use a pragma to declare those intrinsic wrapper
> functions.
>
> Syntax:
> ```c
> #pragma riscv intrinsic vector
> ```
>
> The pragma then imports all builtin functions and intrinsic wrappers into
> the symbol table, which saves much of the time otherwise spent parsing the
> prototypes of the intrinsic wrapper functions.
>
> This trick is borrowed from the AArch64/SVE implementation in GCC:
> https://github.com/gcc-mirror/gcc/blob/master/gcc/config/aarch64/arm_sve.h#L40
>
>
> # Experimental Results:
> ## Size of riscv_vector.h:
>        |      size |     LoC |
> -------+-----------+---------+
> Before | 4,434,725 |  69,749 |
> After  |     5,463 |     159 |
>
> ## Compilation Speed for Simple File
>
> testcase:
> ```c
> #include <riscv_vector.h>
>
> vint32m1_t test_vadd_vv_vfloat32m1_t(vint32m1_t op1, vint32m1_t op2,
> size_t vl) {
>  return vadd(op1, op2, vl);
> }
> ```
>
> Release build:
>  Before: 0m0.417s
>  After:  0m0.090s
>
> Debug build:
>  Before: 0m8.016s
>  After:  0m2.295s
>
>
> ## Regression Time
> LLVM regression tests on our 48-core server:
> Release build:
>  Before : Testing Time: 203.81s
>  After : Testing Time: 181.13s
>
> Debug build:
>  Before : Testing Time: 675.18s
>  After : Testing Time: 647.20s
>
>
>
> Any comments or feedback are appreciated!
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev