<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi Johannes,<div><br></div><div>Sure! The underlying problem is that types used for raw-memory access are treated</div><div>as integers, although they are not really integers. This applies especially to std::byte, which</div><div>is explicitly described as having raw-memory access semantics. This semantic mismatch can make</div><div>alias analysis (AA) wrong and let a pointer escape.</div><div><br></div><div>Consider the following LLVM IR that copies a pointer:</div><div><br></div><div><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%src8 = bitcast i8** %src to i8*</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%dst8 = bitcast i8** %dst to i8*</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst8, i8* %src8, i32 8, i1 false)</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%load = load i8*, i8** %dst</p>

<p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%addr = ptrtoint i8* %load to i64</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">ret i64 %addr</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">If we optimize the call to memcpy, then the IR becomes</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%src64 = bitcast i8** %src to i64*</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%dst64 = bitcast i8** %dst to i64*</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%addr = load i64, i64* %src64, align 1</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">store i64 %addr, i64* %dst64, align 1</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">ret i64 %addr</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">Since there is no "data" type in LLVM like a byte, the ptrtoint is optimized out and the</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">pointer escapes. 
If we had used a pair of byte loads/stores, that would not happen.</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">The mentioned bug is just one example, but this pattern can be</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">replicated in other cases that optimize memcpy into integer loads/stores and drop ptrtoint,</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">undermining alias analysis. To illustrate further, consider loading a pointer like:</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%v1 = load i8*, %p</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%v2 = load i8*, %p</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">Instcombine optimizes this into</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%v1 = load i64, %p</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">%v2 = inttoptr %v1</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">changing the underlying provenance (again, not good for AA). 
If we had a byte type instead,</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">then we could</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">1. preserve provenance</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">2. differentiate between integers and the raw data copied</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">This might be beneficial for some backends (e.g. CHERI) that want to differentiate between</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">true integers and bytes that can be integers. The optimizations for known integers<br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">can become more aggressive in this case.</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue""><br></p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">Thanks,</p><p style="margin:0px;font-stretch:normal;font-size:13px;line-height:normal;font-family:"Helvetica Neue"">George</p></div></div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 4, 2021 at 7:03 PM Johannes Doerfert <<a href="mailto:johannesdoerfert@gmail.com">johannesdoerfert@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi George,<br>

<br>

On 6/4/21 10:24 AM, George Mitenkov via cfe-dev wrote:<br>

> Hi all,<br>

><br>

> Together with Nuno Lopes and Juneyoung Lee we propose to add a new byte<br>

> type to LLVM to fix miscompilations due to load type punning. Please see<br>

> the proposal below. It would be great to hear the<br>

> feedback/comments/suggestions!<br>

><br>

><br>

> Motivation<br>

> ==========<br>

><br>

> char and unsigned char are considered to be universal holders in C. They<br>

> can access raw memory and are used to implement memcpy. i8 is LLVM’s<br>

> counterpart but it does not have such semantics, which is also not<br>

> desirable as it would disable many optimizations.<br>

><br>

> We therefore propose to introduce a new byte type that would have raw-memory<br>

> access semantics and can be differentiated from i8. This can help to fix<br>

> unsound optimizations and make the lowering of char, unsigned char or<br>

> std::byte correct. Currently, we denote the byte type as b<N>, where N is<br>

> the number of bits.<br>

><br>

> In our semantics, the byte type carries provenance and copying bytes does not<br>

> escape pointers, thereby benefiting alias analysis. If the memory contains<br>

> poison bits, then the result of loading a byte from it is bitwise poison.<br>

<br>

Could you elaborate on the motivation a bit. I'm not sure I follow.<br>

Maybe examples would be good. The only one I found is in<br>

<a href="https://bugs.llvm.org/show_bug.cgi?id=37469" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=37469</a>. For that one I'm<br>

wondering how much we would in reality lose if we make the required<br>

ptrtoint explicit when people cast a pointer to an integer. In general,<br>

explicit encoding seems to me preferable and overall less work (new<br>

types kinda scare me TBH).<br>

<br>

~ Johannes<br>

<br>

<br>

><br>

><br>

> Benefits<br>

> ========<br>

><br>

> The small calls to memcpy and memmove that are transformed by Instcombine<br>

> into integer load/store pairs would be correctly transformed into the<br>

> loads/stores of the new byte type no matter what underlying value is being<br>

> copied (integral value or a pointer value).<br>

><br>

>     1. The new byte type solves the problem of copying padded data, with no<br>

>        contamination of the loaded value due to poison bits of the padding.<br>

>     2. When copying pointers as bytes, implicit pointer-to-integer casts are<br>

>        avoided. The current setting of performing a memcpy using i8s leads to<br>

>        miscompilations (example: bug report 37469<br>

>        <<a href="https://bugs.llvm.org/show_bug.cgi?id=37469" rel="noreferrer" target="_blank">https://bugs.llvm.org/show_bug.cgi?id=37469</a>>) and is bad for alias<br>

>        analysis.<br>

><br>

><br>

><br>

> Lowering of char, unsigned char and std::byte<br>

><br>

> ======================================<br>

><br>

> For any function that takes char, unsigned char or std::byte as arguments,<br>

> we lower these to b8 instead of i8. The motivation comes from the fact<br>

> that any pointer can be explicitly copied in bytes, and each byte can be<br>

> passed as an argument to the copying function. Since a copy like that can<br>

> escape a pointer, we need to generate the byte type to avoid the issue.<br>

><br>

> Example<br>

><br>

> void foo(unsigned char arg1, char arg2) {...}<br>

><br>

>              =><br>

><br>

> void @foo(zeroext b8 %arg1, signext b8 %arg2) {...}<br>
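
To make the byte-by-byte copy described above concrete, here is a minimal C++ sketch. It is not part of the proposal, and put_byte and copy_pointer_bytewise are hypothetical names chosen for illustration. Each byte of a pointer's object representation is passed by value as an unsigned char, which is the situation where lowering the parameter to b8 rather than i8 is meant to help:<br>

```cpp
#include <cstddef>

// Hypothetical helper: stores one raw byte into a destination buffer.
// The byte may be a fragment of a pointer; in C++ it is just an unsigned char.
void put_byte(unsigned char *dst, std::size_t i, unsigned char b) {
    dst[i] = b;
}

// Copies a pointer value one byte at a time, passing each byte by value.
int *copy_pointer_bytewise(int *src) {
    int *dst = nullptr;
    const unsigned char *from = reinterpret_cast<const unsigned char *>(&src);
    unsigned char *to = reinterpret_cast<unsigned char *>(&dst);
    for (std::size_t i = 0; i < sizeof(int *); ++i) {
        put_byte(to, i, from[i]);
    }
    return dst;
}
```

The returned pointer compares equal to the original, even though at no point did the source contain a pointer-to-integer cast.<br>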

><br>

><br>

><br>

> std::byte is defined as enum class byte : unsigned char {} and therefore<br>

> all bitwise operations over it can be lowered equivalently to operations on<br>

> unsigned char (the operation is performed on operands zero-extended to i32,<br>

> and the result is truncated back).<br>
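
As a rough source-level illustration of that lowering (a sketch assuming C++17; or_via_int is a hypothetical name, not something from the proposal), the widen/operate/narrow sequence can be spelled out by hand and compared against std::byte's own operator|:<br>

```cpp
#include <cstddef>

// Hand-written equivalent of std::byte's operator|: widen both operands
// to unsigned int, OR them as integers, then narrow back to one byte.
constexpr std::byte or_via_int(std::byte a, std::byte b) {
    return static_cast<std::byte>(
        static_cast<unsigned char>(static_cast<unsigned int>(a) |
                                   static_cast<unsigned int>(b)));
}

// Compile-time check that the manual version agrees with the library operator.
static_assert(or_via_int(std::byte{0x0F}, std::byte{0xF0}) ==
              (std::byte{0x0F} | std::byte{0xF0}));
```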

><br>

><br>

> ==============================<br>

><br>

> Byte Type Semantics (Version 1)<br>

><br>

> ==============================<br>

><br>

><br>

> [allowed] alloca/load/store<br>

><br>

> =====================<br>

><br>

> The byte type is allowed to be allocated, stored and loaded. No semantic<br>

> changes needed.<br>

><br>

> [allowed] zext/sext/trunc<br>

><br>

> ====================<br>

><br>

> In order to easily adapt the current replacement of i8 with b8, we want to<br>

> extend zext/sext/trunc instructions’ semantics to allow the first operand<br>

> as a byte type as well. This will reduce the number of instructions needed<br>

> for the cast.<br>

><br>

> Example<br>

><br>

> trunc i32 %int to b8<br>

><br>

> zext i8 %char to b32<br>

><br>

> [modified] bitcast<br>

><br>

> ==============<br>

><br>

> We modify the semantics of the bitcast to allow casts from and to the byte<br>

> types. If the byte has a value of the same kind (integral or a pointer) as<br>

> the other type, then the bitcast is a noop. Otherwise, the result is poison.<br>

><br>

> Example<br>

><br>

> bitcast i<N> %val to b<N><br>

><br>

> bitcast i<N>* %val to b64<br>

><br>

> bitcast b<N> %byte to i<N>     [byte has an integral value => noop]<br>

><br>

><br>

><br>

> [new] bytecast<br>

><br>

> ============<br>

><br>

> During IR generation, we cannot deduce whether the byte value contains a<br>

> pointer or an integral value, and therefore we cannot create a bitcast.<br>

><br>

> Example<br>

><br>

> int cast(char c) { return (int) c; }<br>

><br>

> Current:<br>

><br>

> i32 @cast(i8 signext %c) {<br>

><br>

>    %1 = alloca i8, align 1<br>

><br>

>    store i8 %c, i8* %1, align 1<br>

><br>

>    %2 = load i8, i8* %1, align 1<br>

><br>

>    %3 = sext i8 %2 to i32<br>

><br>

>    ret i32 %3<br>

><br>

> }<br>

><br>

> Proposed:<br>

><br>

> i32 @cast(b8 %c) {<br>

><br>

>    %1 = alloca b8<br>

><br>

>    store b8 %c, b8* %1<br>

><br>

>    %2 = load b8, b8* %1<br>

><br>

>    %3 =    ?    b8 %2 to i32<br>

><br>

>    ret i32 %3<br>

><br>

> }<br>

><br>

> In this example, the operation  ?  can be sext (b8 is i8) or can be<br>

> ptrtoint if the byte has a pointer value.<br>

><br>

> We therefore introduce a new bytecast instruction. The frontend will always<br>

> produce a bytecast, which can then be optimized into a more specific cast<br>

> if necessary. Bytecast operands must have the same bitwidth (or size).<br>

><br>

> The semantics of the bytecast instruction are:<br>

><br>

> bytecast b<N> %byte to T<br>

><br>

> The kind of the value of the byte | T | Semantics<br>

> Pointer | Integral | ptrtoint<br>

> Pointer | Pointer | bitcast<br>

> Integral | Integral | integral casts (zext, sext, etc.)<br>

> Integral | Pointer | inttoptr<br>

><br>

><br>

><br>

> Essentially, this instruction is a selection of the right casts. Once the<br>

> compiler has been able to prove the kind of the byte’s value, the bytecast<br>

> can be replaced with the appropriate cast or become a noop.<br>

><br>

> Example<br>

><br>

> bytecast b64 %bytes to i32*<br>

><br>

> bytecast b8 %byte to i8<br>
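
The selection rule for bytecast can be written down as a small toy model in C++. This is only a sketch of the rule as stated in the proposal, not LLVM code; Kind and bytecast_behaves_as are hypothetical names:<br>

```cpp
#include <string>

// Toy model, not LLVM code: the two kinds of value a byte may hold,
// and the two kinds of target type T.
enum class Kind { Pointer, Integral };

// Returns the name of the cast that a bytecast behaves as,
// given the kind of the byte's value and the kind of the target type.
std::string bytecast_behaves_as(Kind byte_value, Kind target) {
    if (byte_value == Kind::Pointer) {
        return target == Kind::Integral ? "ptrtoint" : "bitcast";
    }
    return target == Kind::Integral ? "integral cast" : "inttoptr";
}
```

Once an analysis proves which Kind a byte holds, the first argument becomes a constant and the bytecast collapses to the single cast named by the function.<br>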

><br>

><br>

><br>

> [disabled] and/or/xor<br>

><br>

> =================<br>

><br>

> We do not allow bitwise operations over byte types because it’s not trivial<br>

> to define all cases, particularly the ones involving pointers. If these<br>

> operations are useful for optimizations and/or some frontends, semantics<br>

> for these can be considered separately.<br>

><br>

> If the operation is desired on the frontend level, then the default<br>

> generated code can always cast the byte type to an integer to operate on<br>

> integer values.<br>

><br>

><br>

><br>

> [disabled] arithmetic<br>

><br>

> =================<br>

><br>

> Performing arithmetic operations over the byte type is similarly not<br>

> allowed (a valid question is: what does it mean to add two bytes of raw<br>

> memory?). If we want to perform arithmetic, we need to cast a byte to an<br>

> integer (via sext/zext/trunc explicitly).<br>

><br>

><br>

><br>

> [modified] comparison<br>

><br>

> ==================<br>

><br>

> We allow performing comparisons, as we may potentially want to compare the<br>

> ordering of the memory instances, check for null, etc. Comparison is also<br>

> needed since char types are commonly compared. We define the following<br>

> semantics for comparisons of the byte type.<br>

><br>

> Case 1: The values of the bytes have the same kind: compare the values.<br>

><br>

> Case 2: The values of the bytes have different kinds: do a bytecast of<br>

> the non-integral type to the other type and compare the integral values.<br>
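
The two comparison cases can be sketched as a toy C++ model (assuming C++17 for std::variant). This is an illustration of the rule only, not LLVM code; ByteValue, as_integer and bytes_equal are hypothetical names, and equality stands in for the general comparison predicates:<br>

```cpp
#include <cstdint>
#include <variant>

// Toy model, not LLVM code: a byte-typed value holds either an integer
// or a pointer.
using ByteValue = std::variant<std::uintptr_t, const void *>;

// Models the bytecast of a value to an integer: integers pass through,
// pointers are converted (the ptrtoint case).
std::uintptr_t as_integer(const ByteValue &v) {
    if (const auto *i = std::get_if<std::uintptr_t>(&v)) {
        return *i;
    }
    return reinterpret_cast<std::uintptr_t>(std::get<const void *>(v));
}

bool bytes_equal(const ByteValue &a, const ByteValue &b) {
    if (a.index() == b.index()) {
        return a == b;  // Case 1: same kind, compare the values directly.
    }
    // Case 2: different kinds, cast the pointer side to an integer first.
    return as_integer(a) == as_integer(b);
}
```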

><br>

><br>

> Example<br>

><br>

> =======<br>

><br>

> unsigned char sum(unsigned char a, unsigned char b) { return a + b; }<br>

><br>

> Current:<br>

><br>

> zeroext i8 @sum(i8 zeroext %a, i8 zeroext %b) {<br>

><br>

>    %1 = alloca i8<br>

><br>

>    %2 = alloca i8<br>

><br>

>    store i8 %a, i8* %1<br>

><br>

>    store i8 %b, i8* %2<br>

><br>

>    %3 = load i8, i8* %1<br>

><br>

>    %4 = zext i8 %3 to i32<br>

><br>

>    %5 = load i8, i8* %2<br>

><br>

>    %6 = zext i8 %5 to i32<br>

><br>

>    %7 = add nsw i32 %4, %6<br>

><br>

>    %8 = trunc i32 %7 to i8<br>

><br>

>    ret i8 %8<br>

><br>

> }<br>

><br>

> Proposed:<br>

><br>

> zeroext b8 @sum(b8 zeroext %a, b8 zeroext %b) {<br>

><br>

>    %1 = alloca b8<br>

><br>

>    %2 = alloca b8<br>

><br>

>    store b8 %a, b8* %1<br>

><br>

>    store b8 %b, b8* %2<br>

><br>

>    %3 = load b8, b8* %1<br>

><br>

>    %4 = bytecast b8 %3 to i8<br>

><br>

>    %5 = zext i8 %4 to i32<br>

><br>

>    %6 = load b8, b8* %2<br>

><br>

>    %7 = bytecast b8 %6 to i8<br>

><br>

>    %8 = zext i8 %7 to i32<br>

><br>

>    %9 = add nsw i32 %5, %8<br>

><br>

>    %10 = trunc i32 %9 to b8<br>

><br>

>    ret b8 %10<br>

><br>

> }<br>

><br>

><br>

><br>

><br>

><br>

> Thanks,<br>

><br>

> George<br>

><br>

><br>

> _______________________________________________<br>

> cfe-dev mailing list<br>

> <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

> <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev</a><br>

</blockquote></div></div>