[PATCH] D142274: [ADT] Add llvm::byteswap to bit.h

David Blaikie via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 23 13:39:33 PST 2023


dblaikie added inline comments.


================
Comment at: llvm/include/llvm/ADT/bit.h:70
+    uint32_t UV = V;
+#if defined(__llvm__) || (defined(__GNUC__) && !defined(__ICC))
+    return __builtin_bswap32(UV);
----------------
barannikov88 wrote:
> dblaikie wrote:
> > RKSimon wrote:
> > > kazu wrote:
> > > > barannikov88 wrote:
> > > > > kazu wrote:
> > > > > > kazu wrote:
> > > > > > > RKSimon wrote:
> > > > > > > > use has_builtin ?
> > > > > > > Shall we stop using the intrinsics and just stick to the ANDs and ORs?  This way, we don't have to worry about the availability of intrinsics, the availability of has_builtin itself, whether we are using ICC, etc.  (I couldn't find out why ICC is excluded here.)  According to compiler explorer, gcc-4.5.3 and clang-10.0.0 generate bswap for the following code snippet:
> > > > > > > 
> > > > > > > ```
> > > > > > > #include <stdint.h>
> > > > > > > 
> > > > > > > uint32_t bs32(uint32_t v) { 
> > > > > > >     uint32_t Byte0 = v & 0x000000FF;
> > > > > > >     uint32_t Byte1 = v & 0x0000FF00;
> > > > > > >     uint32_t Byte2 = v & 0x00FF0000;
> > > > > > >     uint32_t Byte3 = v & 0xFF000000;
> > > > > > >     return (Byte0 << 24) | (Byte1 << 8) | (Byte2 >> 8) | (Byte3 >> 24);
> > > > > > > }
> > > > > > > 
> > > > > > > uint64_t bs64(uint64_t v) {
> > > > > > >     uint64_t Hi = bs32(v);
> > > > > > >     uint32_t Lo = bs32(v >> 32);
> > > > > > >     return (Hi << 32) | Lo;
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > clang prior to 10.0.0 does not generate 64-bit bswap, but we don't attempt to use 64-bit bswap intrinsic in this patch or `SwapByteOrder.h` anyway.
> > > > > > > 
> > > > > > > Note that I mention gcc-4.5.3 above, but we require gcc-7.1.0.
> > > > > > I've removed the dependence on GCC-style `__builtin_xxx`, so I don't use `__has_builtin`,
> > > > > > 
> > > > > > Both gcc and clang generate good code.
> > > > > > According to compiler explorer, gcc-4.5.3 and clang-10.0.0 generate bswap for the following code snippet
> > > > > They may generate the intrinsic for this sample code, but they also may fail to do so when this function is inlined.
> > > > > I'd prefer seeing the intrinsic.
> > > > > 
> > > > OK.  I've restored the calls to intrinsics.
> > > Shouldn't this be `__has_builtin(__builtin_bswap32)` ?
> > > 
> > > > According to compiler explorer, gcc-4.5.3 and clang-10.0.0 generate bswap for the following code snippet
> > > 
> > > They may generate the intrinsic for this sample code, but they also may fail to do so when this function is inlined.
> > I'd prefer seeing the intrinsic.
> > 
> > Are there examples of this failing? If not, I'd prefer not to maintain #ifdef'd codepaths that can diverge/introduce build failures that are harder to test (because they don't show up for some people, etc).
> > 
> > Generally we rely on "sufficiently" well-performing compilers (specifically/especially sometimes as aggressively as "clang with LTO") to get good performance out of LLVM, so if Clang gets this right, that's probably enough justification to remove the special case.
> > Are there examples of this failing?
> 
> I didn't do actual testing, but [[ https://reviews.llvm.org/D142274/new/#4071403 | this comment ]] suggests that MSVC fails to recognize 64-bit version even when the function is not inlined.
> 
> > Generally we rely on "sufficiently" well-performing compilers (specifically/especially sometimes as aggressively as "clang with LTO") to get good performance out of LLVM, so if Clang gets this right, that's probably enough justification to remove the special case.
> 
> It is possible that expanded version may even perform better than the "builtin" version in some cases. My judgement was only relying on the fact that there //are// these builtins, so there must have been a good reason for introducing them.
> 
> > If not, I'd prefer not to maintain #ifdef'd codepaths that can diverge/introduce build failures that are harder to test (because they don't show up for some people, etc).
> 
> It is hard to disagree with it. I guess this particular change could rely on compiler optimizations (compared to e.g. popcount below), and that's why I only expressed my personal preference, not a requirement. Should I have marked it as "(optional)" or something like that?
> 
> > Are there examples of this failing?
> 
> I didn't do actual testing, but this comment suggests that MSVC fails to recognize 64-bit version even when the function is not inlined.

Yeah, I'd be willing to have slightly worse perf on MSVC really - now that we have clang-cl, a bootstrap would recover the performance, I guess.


> > Generally we rely on "sufficiently" well-performing compilers (specifically/especially sometimes as aggressively as "clang with LTO") to get good performance out of LLVM, so if Clang gets this right, that's probably enough justification to remove the special case.
> 
> It is possible that expanded version may even perform better than the "builtin" version in some cases. My judgement was only relying on the fact that there are these builtins, so there must have been a good reason for introducing them.

There's probably loads of history there - perhaps some folks need it for correctness, maybe in the past compilers couldn't optimize it as well as they can now.


> > If not, I'd prefer not to maintain #ifdef'd codepaths that can diverge/introduce build failures that are harder to test (because they don't show up for some people, etc).
> 
> It is hard to disagree with it. I guess this particular change could rely on compiler optimizations (compared to e.g. popcount below), and that's why I only expressed my personal preference, not a requirement. Should I have marked it as "(optional)" or something like that?

(I think even popcount can be pretty well identified/optimized by clang these days?)

Eh, no worries - not enough I'd bother writing a patch now/arguing for changes later. Would've if this wasn't already committed, but just a thing/note for next time sort of thing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142274/new/

https://reviews.llvm.org/D142274



More information about the llvm-commits mailing list