[LLVMdev] Strange behavior when converting arrays to strings

Martinez, Javier E javier.e.martinez at intel.com
Wed Jul 28 12:51:13 PDT 2010


Hi Duncan,

Thanks for the reply. I had seen that notice before, and I'm sorry I didn't mention it in my original email. In the default case a null terminator gets added at the end of the array, which is the way character arrays are usually represented in memory. In this default case, however, converting back to a string yields one that is not the same as the original: it is one character longer. In a scenario where only the array is passed to a function, with no knowledge of how it was generated, one would have to resort to the c_str() string class member and manipulate the string with the old-school str functions. This, I believe, goes against the spirit of the ConstantArray class implementation.
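
To make the c_str() workaround concrete, here is a minimal sketch in plain C++ with no LLVM dependency. The helper names stringWithEmbeddedNul and viaCStr are made up for illustration; the first only imitates the 11-character result described in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the string that comes back in the default
// (AddNull == true) case: "HelloWorld" plus the terminator as an 11th char.
std::string stringWithEmbeddedNul() {
  std::string S("HelloWorld\0", 11);
  assert(S.size() == 11 && S[10] == '\0');
  return S;
}

// The workaround: rebuilding the string from c_str() stops at the first
// null character, so the trailing terminator is dropped.
std::string viaCStr(const std::string &S) {
  return std::string(S.c_str());
}
```

Calling viaCStr(stringWithEmbeddedNul()) gives back the 10-character "HelloWorld". The catch is that this truncates at the first null character wherever it appears, so an array with an embedded zero in the middle would lose everything after it.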

I was probably too hasty in proposing a solution. Perhaps a better one is to add a bool member variable to the ConstantArray class that records whether a null terminator was added to the end of the array. getAsString() would check the new member and, if it is true, leave the terminator out of the returned string. With this change the conversion from string to ConstantArray and back would always result in the same string.
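
A rough sketch of the idea, written as a standalone toy class rather than a patch to ConstantArray. The class name CharArray, the member NullAdded and the accessor getNumElements are all hypothetical, and the real class stores its elements as operands rather than in a vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model of the proposal; none of this is LLVM's actual code.
class CharArray {
  std::vector<char> Elements;
  bool NullAdded; // proposed member: true if get() appended a terminator

  CharArray(std::vector<char> E, bool N)
      : Elements(std::move(E)), NullAdded(N) {}

public:
  // Mirrors the shape of ConstantArray::get(Context, Initializer, AddNull).
  static CharArray get(const std::string &Initializer, bool AddNull = true) {
    std::vector<char> E(Initializer.begin(), Initializer.end());
    if (AddNull)
      E.push_back('\0');
    return CharArray(std::move(E), AddNull);
  }

  std::size_t getNumElements() const { return Elements.size(); }

  // Skips the terminator only when get() is known to have added it.
  std::string getAsString() const {
    std::size_t N = Elements.size();
    if (NullAdded) {
      assert(N > 0 && Elements.back() == '\0');
      --N;
    }
    return std::string(Elements.begin(),
                       Elements.begin() + static_cast<std::ptrdiff_t>(N));
  }
};
```

With this, CharArray::get("HelloWorld") still holds 11 elements, but getAsString() returns the original 10 characters, and the AddNull == false case is unchanged.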

Thanks,
Javier

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Duncan Sands
Sent: Wednesday, July 28, 2010 12:37 AM
To: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Strange behavior when converting arrays to strings

Hi Javier,

> I saw some strange behavior (to me) when converting constant
> arrays to strings. Consider the following example:
>
> std::string Text = "HelloWorld";
>
> unsigned TextLengthBefore = Text.length();
>
> ConstantArray *pArray =
> dyn_cast<ConstantArray>(llvm::ConstantArray::get(pModule->getContext(),
> Text, true));

from Constants.h:

   /// This method constructs a ConstantArray and initializes it with a text
   /// string. The default behavior (AddNull==true) causes a null terminator to
   /// be placed at the end of the array. This effectively increases the length
   /// of the array by one (you've been warned).  However, in some situations
   /// this is not desired so if AddNull==false then the string is copied without
   /// null termination.
   static Constant *get(LLVMContext &Context, StringRef Initializer,
                        bool AddNull = true);

Ciao,

Duncan.

>
> unsigned NumElements = pArray->getNumOperands();
>
> Text = pArray->getAsString();
>
> unsigned TextLengthAfter = Text.length();
>
> After running this example here are the values in each variable:
>
> TextLengthBefore = 10
>
> NumElements = 11
>
> TextLengthAfter = 11
>
> In the conversion from constant array to a string the null terminating
> character is added as part of the string and becomes the 11th
> character. This becomes a problem when the data is streamed out to a
> buffer because a NULL is inserted in the middle. Below is the code for
> getAsString:
>
> 1: std::string ConstantArray::getAsString() const {
>
> 2: assert(isString() && "Not a string!");
>
> 3: std::string Result;
>
> 4: Result.reserve(getNumOperands());
>
> 5: for (unsigned i = 0, e = getNumOperands(); i != e; ++i)
>
> 6: Result.push_back((char)cast<ConstantInt>(getOperand(i))->getZExtValue());
>
> 7: return Result;
>
> 8: }
>
> I think that the loop terminating condition in line 5 should be changed
> from != to <. Does this look right?
>
> Thanks,
>
> Javier
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
