[llvm] r217708 - llvm-profdata: Avoid undefined behaviour when reading raw profiles

David Blaikie dblaikie at gmail.com
Mon Sep 15 16:01:04 PDT 2014


On Mon, Sep 15, 2014 at 3:51 PM, Justin Bogner <mail at justinbogner.com>
wrote:

> David Blaikie <dblaikie at gmail.com> writes:
> > On Fri, Sep 12, 2014 at 2:22 PM, Justin Bogner <mail at justinbogner.com>
> wrote:
> >
> >     Author: bogner
> >     Date: Fri Sep 12 16:22:55 2014
> >     New Revision: 217708
> >
> >     URL: http://llvm.org/viewvc/llvm-project?rev=217708&view=rev
> >     Log:
> >     llvm-profdata: Avoid undefined behaviour when reading raw profiles
> >
> >     The raw profiles that are generated in compiler-rt always add padding
> >     so that each profile is aligned, so we can simply treat files that
> >     don't have this property as malformed.
> >
> > Any chance of changing this to not rely on UB (this code looks like it's
> > violating C++ aliasing rules pretty consistently) and instead using
> memcpy?
>
> It's always casting from char *, which IIUC doesn't break aliasing
> rules, so I *think* that this isn't relying on UB.
>

[+Richard for some language lawyering]

I believe the rules are:

You can take a pointer to any entity, cast it to a char* then back, and
continue using it as the original type. But this only works if the pointer
was originally a pointer to that type you're dereferencing by, which isn't
the case here.

You can take a pointer to T (where T is a standard layout(?)), cast to
pointer to char, read the bytes of the object then write them back into
some other object (via a char* that actually points to a T - derived the
same way as the original) you can do that safely.

But none of this allows taking a char* that wasn't derived from a T*,
casting it to a T* and reading.

But this is just my high-level understanding, I don't actually know where
the words are in the standard to support this position. Hoping Richard can
fill us in/confirm/deny and/or I can go hunting.

- David


>
> Assuming I'm right about that, the current design avoids unnecessarily
> copying from the file to an auxiliary structure just to read the data
> and throw it away.
>
> >     Caught by Alexey's new ubsan bot. Thanks!
> >
> >     Modified:
> >         llvm/trunk/lib/ProfileData/InstrProfReader.cpp
> >         llvm/trunk/test/tools/llvm-profdata/raw-two-profiles.test
> >
> >     Modified: llvm/trunk/lib/ProfileData/InstrProfReader.cpp
> >     URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/ProfileData/
> >     InstrProfReader.cpp?rev=217708&r1=217707&r2=217708&view=diff
> >
>  ==========================================================================
> >     ====
> >     --- llvm/trunk/lib/ProfileData/InstrProfReader.cpp (original)
> >     +++ llvm/trunk/lib/ProfileData/InstrProfReader.cpp Fri Sep 12
> 16:22:55
> >     2014
> >     @@ -190,6 +190,9 @@ RawInstrProfReader<IntPtrT>::readNextHea
> >        // garbage at the end of the file.
> >        if (CurrentPos + sizeof(RawHeader) > End)
> >          return instrprof_error::malformed;
> >     +  // The writer ensures each profile is padded to start at an
> aligned
> >     address.
> >     +  if (reinterpret_cast<size_t>(CurrentPos) % alignOf<uint64_t>())
> >     +    return instrprof_error::malformed;
> >        // The magic should have the same byte order as in the previous
> header.
> >        uint64_t Magic = *reinterpret_cast<const uint64_t *>(CurrentPos);
> >        if (Magic != swap(getRawMagic<IntPtrT>()))
> >
> >     Modified: llvm/trunk/test/tools/llvm-profdata/raw-two-profiles.test
> >     URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/tools/
> >
>  llvm-profdata/raw-two-profiles.test?rev=217708&r1=217707&r2=217708&view=
> >     diff
> >
>  ==========================================================================
> >     ====
> >     --- llvm/trunk/test/tools/llvm-profdata/raw-two-profiles.test
> (original)
> >     +++ llvm/trunk/test/tools/llvm-profdata/raw-two-profiles.test Fri
> Sep 12
> >     16:22:55 2014
> >     @@ -39,11 +39,9 @@ RUN: printf '\0\0\0\0\0' >> %t-foo-padde
> >      RUN: cat %t-bar.profraw > %t-bar-padded.profraw
> >      RUN: printf '\0\0\0\0\0' >> %t-bar-padded.profraw
> >
> >     -RUN: cat %t-foo.profraw %t-bar.profraw > %t-nopad.profraw
> >      RUN: cat %t-foo-padded.profraw %t-bar.profraw >
> %t-pad-between.profraw
> >      RUN: cat %t-foo-padded.profraw %t-bar-padded.profraw >
> %t-pad.profraw
> >
> >     -RUN: llvm-profdata show %t-nopad.profraw -all-functions -counts |
> >     FileCheck %s
> >      RUN: llvm-profdata show %t-pad-between.profraw -all-functions
> -counts |
> >     FileCheck %s
> >      RUN: llvm-profdata show %t-pad.profraw -all-functions -counts |
> FileCheck
> >     %s
> >
> >     _______________________________________________
> >     llvm-commits mailing list
> >     llvm-commits at cs.uiuc.edu
> >     http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140915/f4df4d5b/attachment.html>


More information about the llvm-commits mailing list