[cfe-dev] C++ objects sometimes initialized to zero before constructor is called?

Ryan C. Gordon icculus at icculus.org
Mon Jul 30 12:58:26 PDT 2012


I'm running into a problem with clang (both in Xcode 4.4 and clang's 
Subversion repository, and maybe earlier versions too) which is a little 
strange. I've tried to dig around in clang, but there's just too much 
background knowledge I lack to solve this on my own. I'll try to explain 
it here the best I can, and maybe someone can point me in the right 
direction?

Here's my problem:

I've got a game built on an old version of the Unreal Engine, which does 
something strange/scary/interesting to support the engine's scripting 
language. The scripting language is bridged to C++, so that some objects 
can be used in either language, back and forth, transparently.

To support this, the game malloc()'s a buffer for a scripted object, 
memcpy()'s a block of bytes over it that has a bunch of default values for 
various data members specified in script, and then does a placement-new 
on that buffer, where the C++ constructor for that object might 
overwrite some of those default values, bubbling up through all the C++ 
parent classes.
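
To make the pattern concrete, here's a minimal sketch of the sequence 
described above. This is not the engine's code: Actor, its members, and 
the two helper functions are hypothetical names made up for illustration, 
and the real thing goes through macros and templates.

```cpp
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical scriptable class; names are illustrative, not from the engine.
struct Actor {
    int health;  // default is expected to come from script
    int armor;   // default comes from script, then gets overridden in C++
    Actor() { armor = 25; }  // user-provided constructor touches only armor
};

// Build an Actor in three steps: malloc, memcpy the defaults, placement-new.
// `defaults` must point at sizeof(Actor) bytes laid out like an Actor.
Actor *construct_from_defaults(const void *defaults) {
    void *buf = std::malloc(sizeof(Actor));
    if (!buf) return 0;
    std::memcpy(buf, defaults, sizeof(Actor));  // script defaults go in first
    return new (buf) Actor;  // no parentheses: default-initialization
}

// Tear down in the reverse order: explicit destructor call, then free().
void destroy(Actor *a) {
    a->~Actor();
    std::free(a);
}
```

After construct_from_defaults() returns, armor holds the value the 
constructor wrote. Whether health still holds the memcpy()'d default is 
not something the C++ standard promises; the scheme depends on the 
compiler leaving those bytes alone.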

There's a whole bunch of macro and template magic to make this system 
work. It's impressive and terrifying, but it has worked on several 
titles since 1997 or so, across various platforms, versions and targets 
for CodeWarrior, gcc, and Visual Studio.

Here's the problem: for some of the objects, a memset() that zeroes out 
the whole object gets inserted between where we memcpy()'d the default 
values from script and where we call the C++ constructor via 
placement-new. This memset() is being inserted by clang, not our code.

This is happening around line 416 of clang's lib/CodeGen/CGExprCXX.cpp:

   // Otherwise, just memset the whole thing to zero.  This is legal
   // because in LLVM, all default initializers (other than the ones we just
   // handled above) are guaranteed to have a bit pattern of all zeros.
   CGF.Builder.CreateMemSet(DestPtr, CGF.Builder.getInt8(0), SizeVal,
                            Align.getQuantity());


All the scriptable objects get constructed with the same macro, but 
looking at the disassembly, only some of them get a memset() inserted. 
It's not clear to me why some do and some don't.

The same macro magic that constructs the object is used in every C++ 
class that is scriptable, but most of these classes are big and 
complicated beyond that piece of code. I could arrange for Apple to take 
a look at the source code off-list if that would be helpful, but it's 
not my code to hand out to the public.

This isn't my area of expertise, but I need to get rid of that memset() 
reliably for Unreal's mechanism to function properly. What might make 
clang decide that a given C++ constructor would need to run through 
EmitNullBaseClassInitialization()? I wasn't able to find the right 
incantation of grep to figure out where semantic analysis makes this 
decision.
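
For reference, one standard C++ rule that puts a zero-fill ahead of a 
constructor is value-initialization: `new (buf) T()` with parentheses, 
on a class whose default constructor is not user-provided, 
zero-initializes the object first. Whether that rule is what's firing in 
this codebase is an assumption, not something I've confirmed. The struct 
and function below are made-up stand-ins, not engine code.

```cpp
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical class with no user-provided constructor; the compiler
// supplies an implicit default constructor.
struct ScriptedThing {
    int health;
    int armor;
};

// Copies script-style defaults into a buffer, then value-initializes a
// ScriptedThing on top of it and reports what `health` ends up as.
int health_after_value_init() {
    ScriptedThing defaults = {100, 50};
    void *buf = std::malloc(sizeof(ScriptedThing));
    if (!buf) return -1;
    std::memcpy(buf, &defaults, sizeof defaults);

    // Parentheses request value-initialization. With no user-provided
    // constructor, the language requires the object to be zeroed here,
    // which wipes out the bytes memcpy() just wrote.
    ScriptedThing *obj = new (buf) ScriptedThing();
    int h = obj->health;

    obj->~ScriptedThing();
    std::free(buf);
    return h;
}
```

Here health_after_value_init() returns 0, not 100. Writing 
`new (buf) ScriptedThing` without the parentheses asks for 
default-initialization instead, which does not require the zero-fill.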

(This is tested against clang on Mac OS X. The llvm-gcc that ships in 
Xcode 4.4 works for the game, but I'd rather ship this game with clang 
if possible, because clang is awesome.  :)  )

Thanks,
--ryan.



