[cfe-dev] [RFC] Proposing an Extended Integer Type
Keane, Erich via cfe-dev
cfe-dev at lists.llvm.org
Tue Feb 4 09:14:14 PST 2020
Hi Chris, thanks for your feedback!
>> Could you provide a patch that updates the Clang extensions manual?
I updated Language Extensions in the patch (D73967 being proposed). Is that sufficient, or are you looking for more information there?
>> Make sure the promotion semantics follow that of the rest of C.
C's usual promotions are quite harmful to these types, so they don’t participate unless otherwise forced to. For example:
SomeI7 + SomeI8 // Operation done at I8 size
SomeI7 + SomeChar // SomeChar goes through usual promotions, so the operation happens as an int.
The second case is necessary for consistency with the C language; the first is necessary because otherwise these types don’t end up being particularly useful. On things like an FPGA (or otherwise limited hardware), rounding up is absurdly expensive.
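To make the two cases above concrete, here is a rough model in standard C++. It is only a sketch: ExtIntModel, low_mask, and the operator overloads are hypothetical names made up for illustration, not part of the patch, and the model uses unsigned wrap-around and widths below 64 bits to stay simple. It only tries to show which width an addition happens at.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for _ExtInt(N); it only tracks a bit width and a value.
template <unsigned N>
struct ExtIntModel {
    static_assert(N >= 1 && N < 64, "this sketch only models widths 1..63");
    static constexpr unsigned bits = N;
    std::uint64_t value;  // only the low N bits are meaningful
};

// Mask selecting the low N bits.
template <unsigned N>
constexpr std::uint64_t low_mask() {
    return (std::uint64_t{1} << N) - 1;
}

// ExtInt + ExtInt: no promotion to int; the operation is done at the wider width.
template <unsigned N, unsigned M>
constexpr ExtIntModel<(N > M ? N : M)> operator+(ExtIntModel<N> a, ExtIntModel<M> b) {
    return {(a.value + b.value) & low_mask<(N > M ? N : M)>()};
}

// ExtInt + char: the char goes through the usual promotions, so the result is an int.
// Assumption of this sketch: N is narrower than int.
template <unsigned N>
constexpr int operator+(ExtIntModel<N> a, char c) {
    static_assert(N <= std::numeric_limits<int>::digits, "sketch assumes N narrower than int");
    return static_cast<int>(a.value) + c;
}
```

With this model, ExtIntModel<7>{100} + ExtIntModel<8>{200} has 8 bits and wraps at 256, while ExtIntModel<7>{1} + 'a' has type int.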
>> - Your comment on the patch that "Unfortunately, this results in arrays on these platforms having ‘padding’ bits, but sufficiently motivated code generation can repair this problem.” I’d pretty strongly recommend that you do *not* special case things like this, because they have user observable behavior. Either you decide to round sizeof up to a power of two, or to the next byte, I don’t see any other reasonable alternative. If the array behavior is important, then rounding up to the next byte and using an alignment of 1 would provide a reasonable approximation of dense arrays.
Ah, I apologize, it seems that you’re looking at my first attempt at this proposal. I realize that I put both links, but https://reviews.llvm.org/D73967 is the active one. We chose to do/propose the sizeof roundup.
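For reference, the two rounding alternatives named in the quoted comment (next byte, or next power of two) can be sketched as below. This message only says that sizeof is rounded up, so the sketch shows both options rather than claiming which one the patch implements. The helper names are hypothetical, and an 8-bit byte is assumed.

```cpp
#include <cstddef>

// Storage in bytes if the bit width is rounded up to the next whole byte.
constexpr std::size_t bytes_next_byte(std::size_t bits) {
    return (bits + 7) / 8;
}

// Smallest power of two that is >= n, for n >= 1.
constexpr std::size_t next_pow2(std::size_t n, std::size_t p = 1) {
    return p >= n ? p : next_pow2(n, p * 2);
}

// Storage in bytes if the size is rounded up to the next power-of-two byte count.
constexpr std::size_t bytes_next_pow2(std::size_t bits) {
    return next_pow2(bytes_next_byte(bits));
}
```

For a 17-bit type, the first rule gives 3 bytes per element and the second gives 4, so an array of 10 elements would occupy 30 or 40 bytes respectively. That difference is the array density trade-off raised in the quoted comment.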
>> - My recollection is that Clang provides non-power-of-two bit width semantics for large bitfields in certain cases (this was required for GCC compatibility) without promoting to larger types. It would be good to make sure this is consistent with that.
I believe they are consistent with that, but if you have an example that you’d like to be particularly consistent with, I’d love to check.
>> - These things will end up being passed and returned as arguments, it is important to nail down the ABI implications. I’d recommend passing them as the next power of two size integer.
This is already the behavior of LLVM in my experience (round to the next power of 2). We currently just defer to the LLVM implementation.
>> - What is the maximum N? If the ABI or other behavior is specified in terms of existing power of two integers, then it would be good to limit it to whatever maxint is for a target.
In the patch, our Max-N is llvm::IntegerType::MAX_INT_BITS. Our language proposal specifies an implementation defined limit, so we chose the max that LLVM can deal with.
From: Chris Lattner <clattner at nondot.org>
Sent: Tuesday, February 4, 2020 9:04 AM
To: Keane, Erich <erich.keane at intel.com>
Cc: Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [RFC] Proposing an Extended Integer Type
This looks like really interesting work, and I think it would be great for Clang to support this. That said, this is a significant language extension, and I think it is important to nail down the corner cases. Could you provide a patch that updates the Clang extensions manual? This would provide a good place to describe the details of the behavior.
On the semantics of this, I’d recommend the following:
- Make sure the promotion semantics follow that of the rest of C.
- I agree that zero padding and rounding in sizeof is the right way to go.
- Your comment on the patch that "Unfortunately, this results in arrays on these platforms having ‘padding’ bits, but sufficiently motivated code generation can repair this problem.” I’d pretty strongly recommend that you do *not* special case things like this, because they have user observable behavior. Either you decide to round sizeof up to a power of two, or to the next byte, I don’t see any other reasonable alternative. If the array behavior is important, then rounding up to the next byte and using an alignment of 1 would provide a reasonable approximation of dense arrays.
- My recollection is that Clang provides non-power-of-two bit width semantics for large bitfields in certain cases (this was required for GCC compatibility) without promoting to larger types. It would be good to make sure this is consistent with that.
- These things will end up being passed and returned as arguments, it is important to nail down the ABI implications. I’d recommend passing them as the next power of two size integer.
- What is the maximum N? If the ABI or other behavior is specified in terms of existing power of two integers, then it would be good to limit it to whatever maxint is for a target.
On Feb 4, 2020, at 7:09 AM, Keane, Erich via cfe-dev <cfe-dev at lists.llvm.org> wrote:
TL;DR: We're proposing _ExtInt(N), a type in C languages that represents llvm iN in the language.
Note: This functionality was proposed earlier this year in https://reviews.llvm.org/D59105 . Valuable feedback was received from Richard Smith and has been considered and integrated into this RFC. In the meantime, we also have user experience with the predecessor of D59105, which we have built on. An updated review based on the extensive feedback by Richard is here: https://reviews.llvm.org/D73967
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax. Integers of non-power-of-two bitwidth aren't particularly interesting or useful on most hardware, so much so that no language in Clang has been motivated to expose them before.
However, on FPGA hardware, using normal integer types where the full bitwidth isn't needed is extremely wasteful and has severe performance/space costs. Because of this, Intel has introduced this functionality in the High Level Synthesis compiler (https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html) under the name "Arbitrary Precision Integer" (ap_int for short). This has been extremely useful and effective for our users, permitting them to optimize their storage and operation space on an architecture where both can be extremely expensive.
We are proposing upstreaming a more palatable version of this to the community, in the form of this proposal and accompanying patch. We are proposing the syntax _ExtInt(N). We intend to propose this to the WG14 committee, and the underscore-capital seems like the active direction for a WG14 paper's acceptance. An alternative that Richard Smith suggested on the initial review was __int(N); however, we believe that is much less acceptable to WG14. We considered _Int; however, _Int is used as an identifier in libstdc++ and there is no good way to fall back to an identifier (since _Int(5) is indistinguishable from an unnamed initializer of a template type named _Int).
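A small illustration of the _Int problem, written in the style a standard library implementation might use. This is a made-up function, not actual libstdc++ code; make_five is a hypothetical name. Underscore-capital identifiers are reserved for the implementation, which is exactly why library headers use names like this.

```cpp
// `_Int` here is an ordinary template parameter name.
template <typename _Int>
constexpr _Int make_five() {
    // Today this is a functional cast: a temporary of type _Int initialized from 5.
    // If _Int(N) were a type specifier, the same tokens would name a 5-bit integer type.
    return _Int(5);
}
```

Calling make_five<long>() yields 5 today. The parser cannot tell the two readings of _Int(5) apart from the tokens alone, so existing headers of this shape would change meaning or stop compiling.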
Extension Proposal Requirements: http://clang.llvm.org/get_involved.html
Below are the extension proposal requirements along with some discussion that we believe we sufficiently meet for acceptance to the Clang/LLVM project.
1: Evidence of a significant user community: This is based on a number of factors, including an existing user community, the perceived likelihood that users would adopt such a feature if it were available, and any secondary effects that come from, e.g., a library adopting the feature and providing benefits to its users.
Our current HLS product has a large number of users who program against the ap_int interface on a near-daily basis. However, this set of types isn't useful just for FPGAs. Thanks to the architecture of LLVM, these types are usable in normal C/C++. Signed versions of these types can be used for loop bounds, which gives the loop optimizers additional information, potentially resulting in better code generation. Both signed and unsigned versions provide more context as to the important bits of a variable, which the optimizers can use to generate better code.
Even absent that, the additional expressivity these types provide is advantageous in many situations.
2: A specific need to reside within the Clang tree: There are some extensions that would be better expressed as a separate tool, and should remain as separate tools even if they end up being hosted as part of the LLVM umbrella project.
These types need to be part of the type system, so no other tool can provide an effective interface for them. A set of library types was considered; however, library types are unable to represent these integers with the guarantees that are necessary for effective code generation.
3: A specification: The specification must be sufficient to understand the design of the feature as well as interpret the meaning of specific examples. The specification should be detailed enough that another compiler vendor could implement the feature.
A more formal specification is provided in the review in the Language Extensions documentation. We intend this to evolve toward completeness as the review progresses.
4: Representation within the appropriate governing organization: For extensions to a language governed by a standards committee (C, C++, OpenCL), the extension itself must have an active proposal and proponent within that committee and have a reasonable chance of acceptance. Clang should drive the standard, not diverge from it. This criterion does not apply to all extensions, since some extensions fall outside of the realm of the standards bodies.
It is our intent to propose these types to the WG14 standards committee, with Melanie Blower authoring and presenting the paper for acceptance to the C committee. Said paper is available here: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf. Additionally, there is an active effort to propose these types to the SYCL standards committee.
5: A long-term support plan: increasingly large or complex extensions to Clang need matching commitments to supporting them over time, including improving their implementation and specification as Clang evolves. The capacity of the contributor to make that commitment is as important as the commitment itself.
It is our intent to move our current customers to these new types (from a predecessor of the D59105 version), so this will be a long-term feature we intend to maintain in the clang code base. Additionally, these types will likely be well used in SYCL, an actively used and maintained clang compiler language (currently maintained out of tree, with active effort to bring in-tree). Finally, contributor Erich Keane (the author of both D59105 as well as the patch accompanying this RFC) will be providing time, effort, and experience to the continued maintenance of these types in the clang codebase. Additionally, we will be evolving both the implementation and specification along with the standardization efforts in WG14 and SYCL.
6: A high-quality implementation: The implementation must fit well into Clang's architecture, follow LLVM's coding conventions, and meet Clang's quality standards, including diagnostics and complete AST representations. This is particularly important for language extensions, because users will learn how those extensions work through the behavior of the compiler.
It is our belief that the accompanying patch under review comes very close to meeting these criteria, and we are confident that it will meet them after community review.
7: A test suite: Extensive testing is crucial to ensure that the language extension is not broken by ongoing maintenance in Clang. The test suite should be complete enough that another compiler vendor could conceivably validate their implementation of the feature against it.
The accompanying patch provides extensive semantic analysis LIT tests that we anticipate will be an effective test suite for other implementers. Additionally, the patch has broad coverage of IR-CodeGen tests that should prevent future breakage inside the clang codebase. We anticipate that the review process and ongoing maintenance of these types will further increase the test coverage for these types.
cfe-dev mailing list
cfe-dev at lists.llvm.org