[PATCH] Allow for the use of StringRef command line options instead of cl::opt<std::string>

David Blaikie dblaikie at gmail.com
Tue Oct 1 15:02:42 PDT 2013


On Tue, Oct 1, 2013 at 2:50 PM, Puyan Lotfi <plotfi at apple.com> wrote:

> EmitAssemblyHelper::CreateTargetMachine() in
> clang/lib/CodeGen/BackendUtil.cpp
>
> The CodeGenOptons class holds option std:strings copied from ArgsLists
> that came from either argv in clangs main,
>

So it sounds like for the caller from 'main' this is safe, yes? It comes
from argv, which stays live long enough and isn't tainted/modified in
undesirable ways?


> or from whoever is calling into CompilerInvocation in libclang.
>

OK, so if I understand you correctly this is an API question for libclang?

A quick grep (so maybe I've missed things - I likely have) of libclang for
CompilerInvocation finds clang_indexSourceFile_Impl which uses a
temporary/local std::vector Args that is passed to the
createInvocationFromCommandLine but the result of that call is used and
destroyed within the lifetime of this function, so again it looks like the
backing store of the caller is sufficiently live to keep the strings for
the region of usage.

Are there other callers in libclang or elsewhere where you believe the
caller is not currently keeping things sufficiently live? (and it's not
practical to modify it to keep it live for that period)


> It’s not very easy to know if the compiler invocation instance that lead
> to that point in the code in clang will be alive for the lifetime of the
> program.
>
> I think we’ll have a similar set of problems with ParseEnvironmentOptions
> in unit tests/Support/CommandLineTest.cpp if we were to change the
> cl::opt<std::string>’s to cl::opt<StringRef>’s.
>

Which callers are you concerned about here? Each test seems fairly
standalone and the Input strings are scoped for the life of the function.


>
> -Puyan
>
>
> On Oct 1, 2013, at 2:36 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> I've been following this thread only tangentially for the last few weeks.
> Personally, I would like to live in a world where callers of
> ParseCommandLineOptions handle the lifetime of those strings to ensure they
> remain live for the lifetime of the code that needs the command line
> options.
>
> Can you point me to a particular situation where a caller is unable to do
> this that ends up necessitating the GetPersistentClArgStr you're referring
> to?
>
>
> On Tue, Oct 1, 2013 at 2:33 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>
>> David, Chris:
>>
>> It seems that in certain cases it not very easy for callers of
>> ParseCommandLineOptions to get to the global static instances of a given
>> string and that is being parsed (when the strings come from some
>> dynamically populated string table rather than main’s argv that is passed,
>> as is the case with the driver code in clang).
>>
>> I’ve found that in one case (in clang), a uniquing table fixes the
>> problem of handing strings given to the ParseCommandLineOptions function;
>> something like this:
>>
>> static const char* GetPersistentClArgStr(const StringRef &key) {
>>   static StringMap<const char*> PersistentClArgString;
>>   StringMap<const char*>::const_iterator I =
>> PersistentClArgString.find(key);
>>   if (I == PersistentClArgString.end()) {
>>     const char* persistVal = strdup(key.data());
>>     PersistentClArgString[key] = persistVal;
>>     return persistVal;
>>   }
>>   return I->second;
>> }
>>
>> I used this in one particular case in clang, but it now seems like it
>> would be useful to integrate into the CommandLine library and optionally
>> have it turned on with an optional parameter to Parse*CommandLine*Options.
>> For example, the Parse*Environment*Options (which calls
>> ParseCommandLineOptions) function will fail when used with a
>> cl::opt<StringRef> rather than std::string. It actually looks like
>> ParseEnvironmentOptions has a StrDup class it uses to hold on to argument
>> string duplicates but it seems to then go and delete them. afterward.
>>
>> I am very eager for suggestions on this.
>>
>> Thanks
>>
>> -Puyan
>>
>> On Aug 15, 2013, at 10:57 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>> On Thu, Aug 15, 2013 at 4:56 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>
>>
>> This sounds like a workable compromise.
>> To avoid leaking how about a CommandLine.cpp handled string pool that
>> allocates copies on demand (specified maybe by an optional flag to
>> ParseCommandLineOptions/ParseEnvironmentOptions) but also has the option of
>> flushing/deallocating the pool?
>>
>>
>> I'm with Chris - I think it should be simpler to have
>> ParseCommandLineOptions/CommandLine in general not be concerned with
>> task of owning the strings. That should be the responsibility of the
>> caller.
>>
>>
>> -Puyan
>>
>> On Aug 15, 2013, at 1:47 PM, Chris Lattner <clattner at apple.com> wrote:
>>
>>
>> On Aug 14, 2013, at 4:06 PM, Puyan Lotfi <plotfi at apple.com> wrote:
>>
>> Oh wow. I see that.
>>
>> I am not completely sure how the char* arrays would get handled this way.
>> I did not make any clang modifications in this patch (although since
>> clang is using the cl library it is affected) but I noticed that the use of
>> cl:: in the llvm source tree made use of global cl::opt variables that
>> specify the opt type.
>> I don’t see that in the files handling the -mllvm options; I’ll have to
>> look at the CommandLine.cpp code a little more closely to be sure in this
>> case that the char*’s aren’t being implicitly put into StringRefs that
>> might have their data pointers deallocated after
>> cl::ParseCommandLineOptions() is called.
>>
>>
>> After thinking about it for a bit, how about we go with this approach:
>>
>> 1. Document ParseCommandLineOptions/ParseEnvironmentOptions to only work
>> with strings that live forever in CommandLine.h
>> 2. Change clang to leak (or store globally) the strings that are affected.
>>
>> I don't think it is worth designing a context system or anything like
>> that to handle this case.
>>
>> -Chris
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131001/bd67720b/attachment.html>


More information about the llvm-commits mailing list