[llvm-dev] [cfe-dev] [RFC] Loading Bitfields with Smallest Needed Types

Craig Topper via llvm-dev llvm-dev at lists.llvm.org
Tue May 26 16:00:15 PDT 2020


We also have the opposite problem to the store shrinking: we'll also try to
widen a 4-byte-aligned i16/i8 extload to i32. I haven't worked out the
alignment in this particular struct layout.
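
As a rough illustration of that case (the struct and names below are mine,
not the kernel layout): a 2-byte field at a 4-byte-aligned offset is the
shape of load that can get widened.

    /* Hypothetical example: 'b' is 16 bits at a 4-byte-aligned offset.
       Clang emits an i16 zero-extending load for it, which the x86
       backend may widen to a 32-bit load when that is safe. */
    struct widen_example {
        unsigned int   a;   /* bytes 0-3 */
        unsigned short b;   /* bytes 4-5, 4-byte aligned */
        unsigned short c;   /* bytes 6-7 */
    };

    unsigned int read_b(const struct widen_example *p) {
        return p->b;        /* i16 extload in IR; may become a wider load */
    }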

~Craig


On Tue, May 26, 2020 at 3:54 PM Eli Friedman via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> By default, clang emits all bitfield load/store operations using the width
> of the entire sequence of bitfield members.  If you look at the LLVM IR for
> your testcase, all the bitfield operations are i16.  (For thread safety,
> the C/C++ standards treat a sequence of bitfield members as a single
> "field".)
>
> If you look at the assembly, though, an "andb $-2, (%rdi)" slips in.  This
> is specific to the x86 backend: it's narrowing the store to save a couple
> bytes in the encoding, and a potential decoding stall due to a 2-byte
> immediate.  Maybe we shouldn't do that, or we should guard it with a better
> heuristic.
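>
> For reference, the kind of source that produces that instruction (a sketch,
> not the kernel code): clearing a one-bit field at the start of a 16-bit
> bitfield run is an i16 load/and/store in the IR, and the backend may shrink
> the store to a byte-wide "and" with an 8-bit immediate.
>
>     struct s {
>         unsigned short flag : 1;
>         unsigned short rest : 15;
>     };
>
>     void clear_flag(struct s *p) {
>         p->flag = 0;   /* i16 in IR; x86 may emit "andb $-2, (%rdi)" */
>     }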
>
> -Eli
>
> -----Original Message-----
> From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Bill Wendling
> via cfe-dev
> Sent: Tuesday, May 26, 2020 3:29 PM
> To: cfe-dev at lists.llvm.org Developers <cfe-dev at lists.llvm.org>; llvm-dev <
> llvm-dev at lists.llvm.org>
> Subject: [EXT] [cfe-dev] [RFC] Loading Bitfields with Smallest Needed Types
>
> We're running into an interesting issue with the Linux kernel, and
> wanted advice on how to proceed.
>
> Example of what we're talking about: https://godbolt.org/z/ABGySq
>
> The issue is that, when working with a bitfield, a load may happen
> shortly after a store. For instance:
>
> struct napi_gro_cb {
>     void *frag;
>     unsigned int frag_len;
>     u16 flush;
>     u16 flush_id;
>     u16 count;
>     u16 gro_remcsum_start;
>     unsigned long age;
>     u16 proto;
>     u8 same_flow : 1;
>     u8 encap_mark : 1;
>     u8 csum_valid : 1;
>     u8 csum_cnt : 3;
>     u8 free : 2;
>     u8 is_ipv6 : 1;
>     u8 is_fou : 1;
>     u8 is_atomic : 1;
>     u8 recursive_counter : 4;
>     __wsum csum;
>     struct sk_buff *last;
> };
>
> void dev_gro_receive(struct sk_buff *skb)
> {
> ...
>         same_flow = NAPI_GRO_CB(skb)->same_flow;
> ...
> }
>
> Right before the "same_flow = ... ->same_flow;" statement is executed,
> a store is made to the bitfield at the end of a called function:
>
>     NAPI_GRO_CB(skb)->same_flow = 1;
>
> The store is a byte:
>
>     orb    $0x1,0x4a(%rbx)
>
> while the read is a word:
>
>     movzwl 0x4a(%r12),%r15d
>
> The problem is that between the store and the load the value hasn't
> been retired / placed in the cache. One would expect store-to-load
> forwarding to kick in, but on x86 that doesn't happen because x86
> requires the store to be of equal or greater size than the load. So
> instead the load takes the slow path, causing unacceptable slowdowns.
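>
> (A reduced sketch of the pattern, with made-up names -- the real case is in
> the godbolt link above: the store touches one byte of the bitfield storage
> unit, while the following load reads the whole 16-bit unit, so the load is
> wider than the store and forwarding fails.)
>
>     struct cb {
>         unsigned char same_flow : 1;
>         unsigned char encap_mark : 1;
>         unsigned char rest : 6;
>         unsigned char more : 8;
>     };
>
>     __attribute__((noinline)) static void mark(struct cb *c) {
>         c->same_flow = 1;    /* likely a byte-wide "orb" after store shrinking */
>     }
>
>     int check(struct cb *c) {
>         mark(c);
>         return c->same_flow; /* likely loaded as the whole 16-bit unit */
>     }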
>
> GCC gets around this by using the smallest load for a bitfield. It
> seems to use a byte for everything, at least in our examples. This is
> intentional: according to the comments in GCC (which are never wrong),
> C++0x doesn't allow one to touch bits outside of the bitfield. (I'm
> not a language lawyer, but I take this to mean that GCC is trying to
> minimize which bits it accesses by using byte stores and loads
> whenever possible.)
>
> The question I have is what should we do to fix this issue? Once we
> get to LLVM IR, the information saying that we're accessing a bitfield
> is gone. We have a few options:
>
> * We can glean this information from how the loaded value is used and
> fix this during DAG combine, but it seems a bit brittle.
>
> * We could correct the size of the load during the front-end's code
> generation (a byte-level sketch of this is below). This benefits from
> using all of LLVM's passes on the code.
>
> * We could perform the transformation in another place, possibly in MIR
> or MC.
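>
> (To make the front-end option concrete -- this is only an illustration of
> the access width, not proposed source: for the same_flow read, the narrowed
> access would be the equivalent of loading just the byte that holds the bit,
> rather than the whole 16-bit run.)
>
>     /* Hypothetical byte-level view: 'bits' points at the byte holding the
>        u8 bitfields of struct napi_gro_cb.  The load is one byte wide, so
>        it is never wider than the earlier "orb" store. */
>     static inline unsigned read_same_flow(const unsigned char *bits) {
>         return *bits & 1u;   /* byte load + mask instead of movzwl */
>     }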
>
> What do people think?
>
> -bw
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>