[LLVMdev] Strange behavior when converting arrays to strings

Martinez, Javier E javier.e.martinez at intel.com
Wed Jul 28 12:51:13 PDT 2010

Hi Duncan,

Thanks for the reply. I had seen that notice before, and I'm sorry I didn't mention it in my original email. In the default case a NULL gets added at the end of the array, which is the way character arrays are usually represented in memory. In this default case, converting back to a string yields a string that is not the same as the original: it is one character longer because it includes the NULL. In a scenario where only the array is passed to a function, with no knowledge of how it was generated, one would have to resort to the c_str() member of the string class and manipulate the string with the old-school str functions. This, I believe, goes against the spirit of the ConstantArray class implementation.
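
To make the mismatch concrete, here is a minimal sketch that uses only std::string and does not touch the LLVM API. The helper names asStringWithTerminator and trimAtFirstNull are made up for illustration; the first one only imitates what I see coming back from getAsString() in the default case.

```cpp
#include <string>

// Hypothetical stand-in for the default (AddNull == true) round trip:
// the returned string carries the stored terminator as a real character.
std::string asStringWithTerminator(const std::string &Text) {
  std::string Result = Text;
  Result.push_back('\0'); // the extra array element becomes character 11
  return Result;
}

// The c_str() workaround: constructing from a C string stops at the
// first null byte, so the trailing terminator is dropped.
std::string trimAtFirstNull(const std::string &Raw) {
  return std::string(Raw.c_str());
}
```

With "HelloWorld" as input, the first helper returns a string of length 11 and the second brings it back to the original 10 characters. The catch is that the caller has to know the trimming is needed.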

I was probably too hasty in proposing a solution. Perhaps a better one is to add a bool member variable to the ConstantArray class that stores whether a NULL was added to the end of the array. The new member variable would be checked in getAsString(), and if it is true the trailing NULL would not be copied into the string. With this change the conversion from string to ConstantArray and back would always result in the same string.
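
A rough sketch of the idea, written against a toy class rather than the real ConstantArray. ToyConstantArray, NullAdded and getNumElements are names I am assuming for illustration; none of this is existing LLVM code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of the proposal, not LLVM's ConstantArray.
class ToyConstantArray {
  std::vector<char> Elements;
  bool NullAdded; // proposed flag: was a terminator appended on creation?

public:
  explicit ToyConstantArray(const std::string &Initializer,
                            bool AddNull = true)
      : Elements(Initializer.begin(), Initializer.end()),
        NullAdded(AddNull) {
    if (AddNull)
      Elements.push_back('\0'); // array grows by one, as documented
  }

  std::size_t getNumElements() const { return Elements.size(); }

  // Skip the terminator only when this object added it itself.
  std::string getAsString() const {
    assert((!NullAdded || !Elements.empty()) && "Missing terminator!");
    std::size_t Count = Elements.size() - (NullAdded ? 1 : 0);
    return std::string(Elements.begin(), Elements.begin() + Count);
  }
};
```

In this model an array built from "HelloWorld" still has 11 elements in the default case, but getAsString() returns the original 10 characters whether or not AddNull was set.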


-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Wednesday, July 28, 2010 12:37 AM
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Strange behavior when converting arrays to strings

Hi Javier,

> I saw some strange behavior (to me) when converting constant
> arrays to strings. Consider the following example:
> std::string Text = "HelloWorld";
> unsigned TextLengthBefore = Text.length();
> ConstantArray *pArray =
> dyn_cast<ConstantArray>(llvm::ConstantArray::get(pModule->getContext(),
> Text, true));

from Constants.h:

   /// This method constructs a ConstantArray and initializes it with a text
   /// string. The default behavior (AddNull==true) causes a null terminator to
   /// be placed at the end of the array. This effectively increases the length
   /// of the array by one (you've been warned).  However, in some situations
   /// this is not desired so if AddNull==false then the string is copied without
   /// null termination.
   static Constant *get(LLVMContext &Context, StringRef Initializer,
                        bool AddNull = true);



> unsigned NumElements = pArray->getNumOperands();
> Text = pArray->getAsString();
> unsigned TextLengthAfter = Text.length();
> After running this example here are the values in each variable:
> TextLengthBefore = 10
> NumElements = 11
> TextLengthAfter = 11
> In the conversion from constant array to a string the null terminating
> character is added as part of the string and becomes the 11th
> character. This becomes a problem when the data is streamed out to a
> buffer because a NULL is inserted in the middle. Below is the code for
> getAsString:
> 1: std::string ConstantArray::getAsString() const {
> 2: assert(isString() && "Not a string!");
> 3: std::string Result;
> 4: Result.reserve(getNumOperands());
> 5: for (unsigned i = 0, e = getNumOperands(); i != e; ++i)
> 6: Result.push_back((char)cast<ConstantInt>(getOperand(i))->getZExtValue());
> 7: return Result;
> 8: }
> I think that the loop terminating condition in line 5 should be changed
> from != to <. Does this look right?
> Thanks,
> Javier
