[cfe-dev] C++ objects sometimes initialized to zero before constructor is called?

Richard Smith richard at metafoo.co.uk
Mon Jul 30 15:08:40 PDT 2012


Hi Ryan,

There are a few different C++ notions which are relevant here:

 * default-initialization is what happens when no initializer is specified
for an object
 * value-initialization is what happens when an empty initializer is
specified (as empty parens, or in C++11, as empty braces)
 * zero-initialization is the step that sets every byte of your object to 0
(the memset() you are seeing).

Both default-initialization and value-initialization can result in a
default constructor being called. But value-initialization sometimes
performs zero-initialization first, when the default constructor is not
user-provided.

Example:

#include <iostream>
#include <new>  // placement new

struct A {
  A() { std::cout << n << std::endl; }
  int n;
};
struct B {
  A a;
};
struct C {
  C() {}
  A a;
};

int main() {
  char *p = new char[sizeof(B)];
  *(int*)p = 1;
  new (p) B; // prints 1

  char *q = new char[sizeof(B)];
  *(int*)q = 1;
  new (q) B(); // prints 0

  char *r = new char[sizeof(C)];
  *(int*)r = 1;
  new (r) C; // prints 1

  char *s = new char[sizeof(C)];
  *(int*)s = 1;
  new (s) C(); // prints 1
}

Note that new B() triggers zero-initialization, because it calls a
non-user-provided default constructor using empty parens. The other cases
do not, either because they call a user-provided constructor or because
they use default-initialization rather than value-initialization.
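
Applied to a construction helper like yours, the difference comes down to
whether the placement-new is spelled with empty parens. Here is a minimal
sketch; Plain, construct_value, construct_default and n_after_value_init are
made-up names for illustration, not anything from Unreal or clang:

```cpp
#include <cstring>
#include <new>

// Hypothetical class whose default constructor is NOT user-provided.
struct Plain {
  int n;
};

// Value-initialization: the empty parens request zero-initialization,
// because Plain has no user-provided default constructor.
inline Plain *construct_value(void *buf) { return new (buf) Plain(); }

// Default-initialization: no initializer, so no zero-initialization is
// requested. The language does not promise that n keeps whatever bytes
// were already in buf; that part relies on what compilers do in practice.
inline Plain *construct_default(void *buf) { return new (buf) Plain; }

// Fills a buffer with a nonzero pattern, value-initializes a Plain in it,
// and returns the member, which zero-initialization has set to 0.
inline int n_after_value_init() {
  alignas(Plain) unsigned char buf[sizeof(Plain)];
  std::memset(buf, 0xFF, sizeof buf);
  return construct_value(buf)->n;
}
```

If your construction macro spells the placement-new with empty parens,
dropping them, or giving the affected classes a user-provided default
constructor, is what removes the request for zero-initialization.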

On Mon, Jul 30, 2012 at 12:58 PM, Ryan C. Gordon <icculus at icculus.org> wrote:

>
> I'm running into a problem with clang (both in XCode 4.4 and clang's
> Subversion repository, and maybe earlier versions too) which is a little
> strange. I've tried to dig around in clang, but there's just too much
> background knowledge I lack to solve this on my own. I'll try to explain
> it here the best I can, and maybe someone can point me in the right
> direction?
>
> Here's my problem:
>
> I've got a game built on an old version of the Unreal Engine, which does
> something strange/scary/interesting to support the engine's scripting
> language. The scripting language is bridged to C++, so that some objects
> can be used in either language, back and forth, transparently.
>
> To support this, the game malloc()'s a buffer for a scripted object,
> memcpy()'s a block of bytes over it that has a bunch of default values for
> various data members specified in script, and then does a placement-new
> on that buffer, where the C++ constructor for that object might
> overwrite some of those default values, bubbling up through all the C++
> parent classes.
>
> There's a whole bunch of macro and template magic to make this system
> work. It's impressive and terrifying, but it has worked on several
> titles since 1997 or so, across various platforms, versions and targets
> for CodeWarrior, gcc, and Visual Studio.
>
> Here's the problem: some of the objects, between where we memcpy()'d the
> default values from script and where we call the C++ constructor via
> placement-new, have a memset() inserted that zeroes out the whole object.
> This memset() is being inserted by clang, not our code.
>
> This is happening around line 416 of clang's lib/CodeGen/CGExprCXX.cpp:
>
>    // Otherwise, just memset the whole thing to zero.  This is legal
>    // because in LLVM, all default initializers (other than the ones we just
>    // handled above) are guaranteed to have a bit pattern of all zeros.
>    CGF.Builder.CreateMemSet(DestPtr, CGF.Builder.getInt8(0), SizeVal,
>                             Align.getQuantity());
>
>
> All the scriptable objects get constructed with the same macro, but
> looking at the disassembly, only some of them get a memset() inserted.
> It's not clear to me why some do and some don't.
>
> The same macro magic that constructs the object is used in every C++
> class that is scriptable, but most of these classes are big and
> complicated beyond that piece of code. I could arrange for Apple to take
> a look at the source code off-list if that would be helpful, but it's
> not my code to hand out to the public.
>
> This isn't my area of expertise, but I need to get rid of that memset()
> reliably for Unreal's mechanism to function properly. What might make
> clang decide that a given C++ constructor would need to run through
> EmitNullBaseClassInitialization()? I wasn't able to find the right
> incantation of grep to figure out where semantic analysis makes this
> decision.
>
> (This is tested against clang on Mac OS X. The gcc-llvm that ships in
> Xcode 4.4 works for the game, but I'd rather ship this game with clang
> if possible, because clang is awesome.  :)  )
>
> Thanks,
> --ryan.
>