<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
The OpenCL Tablegen builtin implementation is fairly generic, but right now it only
<div>supports function declarations. </div>
<div><br>
</div>
<div>The X86 headers seem to contain function definitions instead. However, </div>
<div>the definitions seem to follow regular patterns, so a similar Tablegen </div>
<div>solution should probably work, although some infrastructural changes would</div>
<div>be needed first. Alternatively, if the memory needed to store the headers</div>
<div>is not a constraint, perhaps modules could be good enough. <br>
</div>
</div>
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> David Rector &lt;davrecthreads@gmail.com&gt;<br>
<b>Sent:</b> 09 February 2021 00:21<br>
<b>To:</b> Reid Kleckner &lt;rnk@google.com&gt;<br>
<b>Cc:</b> James Y Knight &lt;jyknight@google.com&gt;; nd &lt;nd@arm.com&gt;; clang developer list &lt;cfe-dev@lists.llvm.org&gt;; Sven Van Haastregt &lt;Sven.VanHaastregt@arm.com&gt;; Anastasia Stulova &lt;Anastasia.Stulova@arm.com&gt;<br>
<b>Subject:</b> Re: [cfe-dev] [RFC][OpenCL] Add builtin types and functions from the standard headers implicitly in the driver</font>
<div> </div>
</div>
<div class="" style="word-wrap:break-word; line-break:after-white-space">
<div class="" style="margin:0px; font-stretch:normal; line-height:normal"><span class="" style="font-kerning:none">To me the tablegen proposal seems a clever and generalizable means of basically "instantiating" non-templatable builtins and related declarations
only as they are needed, </span>avoiding not only parsing as a module/PCH would, but also storage of unused AST nodes.</div>
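<div class="" style="margin:0px; font-stretch:normal; line-height:normal; min-height:14px"><br class="">
</div>
<div class="" style="margin:0px; font-stretch:normal; line-height:normal">As a rough illustration of "only as they are needed", here is a minimal Python sketch. It is not clang's actual code: the table contents, the class LazyBuiltinScope and its methods are hypothetical names made up for this example. The assumption is that a generated table maps each builtin name to its overloads and that declarations are created the first time a name is looked up.</div>
<pre class="">
```python
# Hypothetical compact table, standing in for what Tablegen might generate.
# Each builtin name maps to a list of (return type, argument types) overloads.
BUILTIN_TABLE = {
    "cos": [("float", ("float",)), ("float4", ("float4",))],
    "get_global_id": [("size_t", ("uint",))],
}


class LazyBuiltinScope:
    """Creates declarations for a builtin only the first time it is looked up."""

    def __init__(self, table):
        self._table = table
        self._declared = {}  # name: declaration strings created so far

    def lookup(self, name):
        """Return the declarations for name, or None if it is not in the table."""
        if name not in self._declared:
            overloads = self._table.get(name)
            if overloads is None:
                return None
            # Materialize every overload of this one name, and nothing else.
            self._declared[name] = [
                f"{ret} {name}({', '.join(args)});" for ret, args in overloads
            ]
        return self._declared[name]

    def declared_names(self):
        """Names for which declarations have actually been created."""
        return sorted(self._declared)


scope = LazyBuiltinScope(BUILTIN_TABLE)
cos_decls = scope.lookup("cos")
print(cos_decls)
print(scope.declared_names())
```
</pre>
<div class="" style="margin:0px; font-stretch:normal; line-height:normal">In this sketch get_global_id is never looked up, so no declaration is ever created or stored for it.</div>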
<div class="" style="margin:0px; font-stretch:normal; line-height:normal; min-height:14px">
<span class="" style="font-kerning:none"></span><br class="">
</div>
<div class="" style="margin:0px; font-stretch:normal; line-height:normal"><span class="" style="font-kerning:none">I have little familiarity with x86intrin.h et al, but why couldn't the same approach be fruitful there? To the extent their content is not manually
written/maintained but rather is generated from much smaller data via some program (is it? At a glance seems possible), couldn’t tablegen be that program, generating only the bare-bones data needed to create all the necessary declarations, and leaving it to
the compiler to actually create those declarations as needed, just as Anastasia proposes for OpenCL?</span></div>
<div class="" style="margin:0px; font-stretch:normal; line-height:normal; min-height:14px">
<span class="" style="font-kerning:none"></span><br class="">
</div>
<div class="" style="margin:0px; font-stretch:normal; line-height:normal"><span class="" style="font-kerning:none">Modules need not be the ne plus ultra solution for large, universally-#included headers, given the speed + storage improvements the OpenCL folks
have apparently realized with this alternative approach.</span></div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Feb 8, 2021, at 3:34 PM, Reid Kleckner via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div>
<br class="x_Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">My short, not carefully considered, input is that the suggested approach of tablegen + a header seems reasonable.
<div class=""><br class="">
</div>
<div class="">The modules / PCH approach has been considered for the ever-growing immintrin.h on x86, but I don't believe it got much traction:</div>
<div class=""><a href="https://lists.llvm.org/pipermail/cfe-dev/2016-September/050980.html" class="">https://lists.llvm.org/pipermail/cfe-dev/2016-September/050980.html</a></div>
<div class=""></div>
</div>
<br class="">
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Fri, Feb 5, 2021 at 2:13 PM James Y Knight via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:<br class="">
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div dir="ltr" class="">The whole tablegen thing seems like a sad path to have gone down, although I can certainly see the practical benefits. Substantially the same problem of compilation-speed exists for <x86intrin.h> (and friends), and I really don't think
we want to start defining all of <i class="">those</i> with a tablegen rule.
<div class=""><br class="">
</div>
<div class="">It would be really nice to instead somehow take advantage of the modules infrastructure to fix this problem -- I'd really love it if we could somehow start shipping a pre-built module artifact for our giant intrinsics headers, included with the
compiler distribution. And then use that by default, regardless of whether users are otherwise enabling modules. If we got that to work, we could use the same solution for both X86 and opencl.</div>
<div class="">
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">However, independent of that discussion -- and more to the immediate issue you're trying to raise -- your proposal seems like it's resulting in a very complex set of options, and I'm not sure what the purpose is.</div>
<div class=""><br class="">
</div>
<div class="">IIUC, the overall desire is to have, by default, these tens-of-thousands of prototypes available to all OpenCL compilations. But, I don't see any reason why users should care exactly HOW these are provided. I'd expect that a given prototype should
be provided either by the tablegen-based builtin code, OR an auto-included header file -- not both, and not neither. </div>
<div class=""><br class="">
What difference does it make if the builtin-tablegen code doesn't provide 100% of the declarations, so long as the remainder are provided by an automatically-included header? Why do you want to make users choose between getting a half-baked set of function
prototypes (tablegen version) and slow compilation (auto-including giant header), when you can have the fully correct set of functions AND nearly-as-fast compilation, by simply supplementing tablegen with an auto-included header providing the remainder?</div>
<div class=""><br class="">
</div>
<div class="">And then you need just a single user-visible option: to disable the automatic declarations (via both tablegen + autoinclude).<br class="">
</div>
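<div class=""><br class="">
</div>
<div class="">A minimal Python sketch of the "not both, and not neither" split described above. The function names and the builtin names vendor_ext_op and takes_enum_arg are hypothetical placeholders, not real OpenCL functions or clang APIs. The only assumption is that the set of names covered by the table is known, so the remainder header can be derived from it.</div>
<pre class="">
```python
def split_declarations(all_builtins, table_covered):
    """Assign every builtin to exactly one provider: the table or the header."""
    all_names = set(all_builtins)
    from_table = all_names.intersection(table_covered)
    from_header = all_names.difference(from_table)
    return from_table, from_header


def provider_of(name, from_table, from_header):
    """Report which mechanism would declare name, or None if it is unknown."""
    if name in from_table:
        return "tablegen"
    if name in from_header:
        return "header"
    return None


# Hypothetical example: two standard functions covered by the table, plus one
# vendor extension builtin and one builtin with an enum argument that are not.
ALL_BUILTINS = ["cos", "sin", "vendor_ext_op", "takes_enum_arg"]
TABLE_COVERED = ["cos", "sin"]

from_table, from_header = split_declarations(ALL_BUILTINS, TABLE_COVERED)
print(sorted(from_table))
print(sorted(from_header))
```
</pre>
<div class="">With such a split the user never needs to know which mechanism supplied a given prototype.</div>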
<div class=""></div>
</div>
</div>
<br class="">
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Wed, Feb 3, 2021 at 11:58 AM Anastasia Stulova via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a>> wrote:<br class="">
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div dir="ltr" class="">
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
Hello,</div>
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
I would like to check if there is any feedback to the following proposal for</div>
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
improving the interface of standard type and function includes.<br class="">
</div>
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
<br class="">
</div>
<div class="" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt">
Background
<div class=""><br class="">
</div>
<div class="">Currently the default clang driver provides incomplete functionality for OpenCL
</div>
<div class="">because the headers with builtin function declarations are not included by
</div>
<div class="">default. The header can only be added using frontend flags requiring ‘-cc1’ or
</div>
<div class="">other frontend forwarding options</div>
<div class="">(<a href="https://clang.llvm.org/docs/UsersManual.html#opencl-header" target="_blank" class="">https://clang.llvm.org/docs/UsersManual.html#opencl-header</a>). This means it</div>
<div class="">is inaccessible to users in a conventional way. </div>
<div class=""><br class="">
</div>
<div class="">We propose to add the implicit header by default when a source is compiled in
</div>
<div class="">OpenCL mode. A review for this was uploaded by Matt a few months ago:</div>
<div class=""><a href="https://reviews.llvm.org/D78979" target="_blank" class="">https://reviews.llvm.org/D78979</a>. Note that the standard library functionality is
</div>
<div class="">added by default in OpenCL C without using include directives in the compiling
</div>
<div class="">sources. This means all builtin function declarations (there are 17000 of
</div>
<div class="">them) are to be parsed every time the source is compiled because which builtins
</div>
<div class="">are used by the kernels is not known beforehand. This impacts the compilation
</div>
<div class="">speed. For example, parsing a simple kernel with the builtin function
</div>
<div class="">declarations is 138 times slower in a Debug build and 13 times slower in a
</div>
<div class="">Release build than parsing the same kernel without those declarations.
</div>
<div class=""><br class="">
</div>
<div class="">To mitigate the overhead of parsing the full header, a fast Tablegen based
</div>
<div class="">solution has been developed</div>
<div class="">(<a href="https://llvm.org/devmtg/2019-10/talk-abstracts.html#lit5" target="_blank" class="">https://llvm.org/devmtg/2019-10/talk-abstracts.html#lit5</a>). The parsing speed with</div>
<div class="">this mechanism for builtin function declarations is only 1.3 times slower in a Debug</div>
<div class="">build and 1.05 times slower in a Release build compared to clang without the</div>
<div class="">builtins. While this mechanism covers most of OpenCL standard </div>
<div class="">functions it lacks two main classes of builtins: builtins defined by
</div>
<div class="">vendor extensions and builtins with enum arguments. </div>
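<div class=""><br class="">
</div>
<div class="">To put the figures above side by side, the short Python calculation below divides the full-header slowdown by the Tablegen slowdown for each build type. It assumes both sets of figures are measured against the same baseline (parsing without any builtin declarations); the variable names are only for illustration.</div>
<pre class="">
```python
# Slowdown factors quoted above, relative to parsing without the builtins.
slowdown_full_header = {"Debug": 138.0, "Release": 13.0}
slowdown_tablegen = {"Debug": 1.3, "Release": 1.05}

# How many times faster parsing is with Tablegen than with the full header.
relative_speedup = {
    build: round(slowdown_full_header[build] / slowdown_tablegen[build], 1)
    for build in slowdown_full_header
}
print(relative_speedup)
```
</pre>
<div class="">Under that assumption the Tablegen mechanism parses roughly 106 times faster than the full header in a Debug build and roughly 12 times faster in a Release build.</div>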
<div class=""><br class="">
</div>
<div class="">Proposal </div>
<div class=""><br class="">
</div>
<div class="">We propose the following changes in the clang driver interface for OpenCL:
<br class="">
</div>
<div class="">- Enable the fast Tablegen based builtin function declaration mechanism by
</div>
<div class="">default in the clang driver. This makes the majority of OpenCL builtin
</div>
<div class="">functions available.</div>
<div class="">- In addition, include the small header opencl-c-base.h by default in the clang</div>
<div class="">driver. This header provides basic types and constants.</div>
<div class="">No frontend or driver flags will be needed to use the majority of OpenCL
</div>
<div class="">types and functions from the standard, at a very small cost in parsing time.
</div>
<div class=""><br class="">
</div>
<div class="">Since the Tablegen mechanism has some small overhead and it is not fully
</div>
<div class="">complete, we propose to add the following additional clang driver flags:</div>
<div class="">1. Add a new clang driver flag -cl-no-stdinc (*) that disables these extra includes to
</div>
<div class="">further reduce compilation time, or for use cases that don’t require
</div>
<div class="">standard libraries. The majority of OpenCL clang tests will use this option.</div>
<div class="">2. [Optionally, if there is enough interest] Add a new clang driver flag</div>
<div class="">-cl-all-stdinc (*) that will include the full header instead of using the Tablegen
</div>
<div class="">mechanism, at the cost of a significant increase in parsing time. </div>
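<div class=""><br class="">
</div>
<div class="">The intended behaviour of the two flags can be sketched in Python as follows. This is an illustration only, not driver code: the function name and the returned labels are hypothetical, the flag spellings are the provisional ones from this proposal, and rejecting the combination of both flags is an assumption of the sketch that the proposal does not specify.</div>
<pre class="">
```python
def select_builtin_sources(driver_args):
    """Return what the driver would implicitly provide for an OpenCL compile."""
    no_stdinc = "-cl-no-stdinc" in driver_args
    all_stdinc = "-cl-all-stdinc" in driver_args
    if no_stdinc and all_stdinc:
        # Assumption of this sketch: the two flags are mutually exclusive.
        raise ValueError("-cl-no-stdinc and -cl-all-stdinc cannot be combined")
    if no_stdinc:
        return []  # no implicit declarations at all
    if all_stdinc:
        return ["full header"]  # slow, but complete
    # Default: fast Tablegen builtins plus the small base header.
    return ["tablegen builtins", "opencl-c-base.h"]


default_sources = select_builtin_sources(["-c", "kernel.cl"])
print(default_sources)
```
</pre>
<div class="">For example, passing ["-cl-no-stdinc"] yields an empty list, and ["-cl-all-stdinc"] yields only the full header.</div>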
<div class=""><br class="">
</div>
<div class="">At present we propose no change to the ‘cc1’ interface, but in the future it is
</div>
<div class="">expected that the functionality will be aligned between driver and frontend
</div>
<div class="">command line interfaces for the OpenCL headers. </div>
<div class=""><br class="">
</div>
<div class="">(*) The exact spelling of command line options is to be discussed. </div>
<div class=""><br class="">
</div>
<div class="">Summary </div>
<div class=""><br class="">
</div>
<div class="">This proposal enhances the clang driver with full functionality of OpenCL C by
</div>
<div class="">adding builtin function declarations implicitly without the need for any
</div>
<div class="">frontend flags to be given in the command line. </div>
<div class=""><br class="">
</div>
<div class="">The default clang behavior proposed is not expected to negatively impact users
</div>
<div class="">of clang as the parsing speed difference remains within the same order of
</div>
<div class="">magnitude. While the fast header mechanism matures, a fallback mechanism will
</div>
<div class="">be provided if needed that would allow switching to the slow header with the
</div>
<div class="">full functionality using a new driver flag. For backward compatibility, another
</div>
<div class="">flag is provided to disable all OpenCL declarations that are not built into
</div>
<div class="">the compiler. </div>
<div class=""><br class="">
</div>
<div class="">The solution proposed improves the driver interface and reduces the risk of
</div>
<div class="">forcing the OpenCL community to change how they use clang because of a significant
</div>
regression in compilation speed. <br class="">
</div>
</div>
_______________________________________________<br class="">
cfe-dev mailing list<br class="">
<a href="mailto:cfe-dev@lists.llvm.org" target="_blank" class="">cfe-dev@lists.llvm.org</a><br class="">
<a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank" class="">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br class="">
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</body>
</html>