[llvm] r217708 - llvm-profdata: Avoid undefined behaviour when reading raw profiles

Duncan P. N. Exon Smith dexonsmith at apple.com
Tue Sep 23 15:47:13 PDT 2014


> On 2014-Sep-15, at 16:01, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
>> On Mon, Sep 15, 2014 at 3:51 PM, Justin Bogner <mail at justinbogner.com> wrote:
>> David Blaikie <dblaikie at gmail.com> writes:
>> > On Fri, Sep 12, 2014 at 2:22 PM, Justin Bogner <mail at justinbogner.com> wrote:
>> >
>> >     Author: bogner
>> >     Date: Fri Sep 12 16:22:55 2014
>> >     New Revision: 217708
>> >
>> >     URL: http://llvm.org/viewvc/llvm-project?rev=217708&view=rev
>> >     Log:
>> >     llvm-profdata: Avoid undefined behaviour when reading raw profiles
>> >
>> >     The raw profiles that are generated in compiler-rt always add padding
>> >     so that each profile is aligned, so we can simply treat files that
>> >     don't have this property as malformed.
>> >
>> > Any chance of changing this to not rely on UB (this code looks like it's
>> > violating C++ aliasing rules pretty consistently) and instead using memcpy?
>> 
>> It's always casting from char *, which IIUC doesn't break aliasing
>> rules, so I *think* that this isn't relying on UB.
>> 
> [+Richard for some language lawyering]
> 
> I believe the rules are:
> 
> You can take a pointer to any entity, cast it to a char* then back, and continue using it as the original type. But this only works if the pointer was originally a pointer to that type you're dereferencing by, which isn't the case here.
> 
> You can take a pointer to T (where T is a standard layout(?)), cast to pointer to char, read the bytes of the object then write them back into some other object (via a char* that actually points to a T - derived the same way as the original) you can do that safely.
> 
> But none of this allows taking a char* that wasn't derived from a T*, casting it to a T* and reading.
> 
> But this is just my high-level understanding, I don't actually know where the words are in the standard to support this position.

C++11 has them in the bullet point list in section 3.10.

Sorry about this, looks like I missed that subtlety :/.

> Hoping Richard can fill us in/confirm/deny and/or I can go hunting.
> 
> - David
>  
> 
> Assuming I'm right about that, the current design avoids unnecessarily
> copying from the file to an auxiliary structure just to read the data
> and throw it away.

I think the optimizer will see through the `memcpy`s anyway, so this
probably isn't a big deal.



More information about the llvm-commits mailing list