[cfe-dev] createCXString reads one past end byte

Argyrios Kyrtzidis akyrtzi at gmail.com
Thu Jan 24 12:07:14 PST 2013


On Jan 24, 2013, at 11:47 AM, Dmitri Gribenko <gribozavr at gmail.com> wrote:

> On Wed, Jan 23, 2013 at 6:38 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>> On Jan 22, 2013, at 11:38 AM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
>> 
>>> On Tue, Jan 22, 2013 at 9:29 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>>> On Jan 22, 2013, at 6:49 AM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
>>>> 
>>>>> Hello Argyrios,
>>>>> 
>>>>> On Mon, Jan 21, 2013 at 9:13 PM, Argyrios Kyrtzidis <akyrtzi at gmail.com> wrote:
>>>>>> We could change how CXString works; instead of eagerly malloc'ing in case of a StringRef, have it stored in a "StringRef-kind" form and malloc when clang_getCString is called.
>>>>> 
>>>>> This will add an extra backing storage mode to CXString, let's call
>>>>> that CXS_UnmanagedWithLength.  In this mode we will store the pointer
>>>>> in 'data', and length will be stored in the upper bits of
>>>>> 'private_flags'.  But who will own the memory returned by
>>>>> clang_getCString?  We can not change the original CXString from
>>>>> CXS_UnmanagedWithLength to CXS_Malloc because CXString is a value
>>>>> type.  Or did I misunderstand something?
>>>> 
>>>> You are right, it is so easy to get a copy of it and trigger a memory leak unintentionally.
>>>> 
>>>> I'll do some measurements to see if it makes much difference to malloc on every StringRef (in the meantime any suggestions on how to avoid the malloc would be greatly appreciated).
>>> 
>>> We can add an indirection layer.  Let's add a mode CXS_Lazy.  'data'
>>> would point to a struct:
>>> 
>>> struct {
>>> const char *Str;
>>> unsigned Length:31;
>>> unsigned IsNulTerminated:1;
>>> };
>>> 
>>> Then CXString would remain a value type, and all copies would point to
>>> the same underlying data.
>> 
>> I like it! We can also enhance the mechanism a bit to be able to use a free list of those so that we avoid malloc'ing them for the normal use pattern of having a CXString as a temporary object to get at the underlying string, what do you think ?
> 
> A good malloc implementation should already do that.  Anyway, we do
> have ArrayRecycler, so we can use it here.  But I don't see a good
> place to put an instance of ArrayRecycler.  We can not make it static
> because of thread safety.

How about putting the instance (or whoever owns it) in the indirection struct  ?

> 
> Dmitri
> 
> -- 
> main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
> (j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/





More information about the cfe-dev mailing list