[lldb-dev] Backslashes in command arguments
Zachary Turner
zturner at google.com
Wed Jan 14 10:54:34 PST 2015
I take back my first sentence. It would be two different syntaxes if one
argument is a file path and one argument is a non file-path string. Like
for example:
(lldb) somecommand testarg c:\foobar
In this case yes, the two arguments would have different rules. But a)
they would only have different rules on Windows, and b) I think that's ok,
people whose primary operating system is Windows have the \ so engrained in
their heads that this will be the most natural behavior.
On Wed Jan 14 2015 at 10:47:41 AM Zachary Turner <zturner at google.com> wrote:
> It wouldn't be two arguments to the same command having a different
> syntax, it would be one argument to a command having a different syntax
> depending on how you were using the command.
>
> If I'm debugging Windows remotely from a Linux machine, and I need to
> specify a remote executable, it's more natural to use a Windows syntax to
> specify it. Of course the default would be the host's default.
>
> The reason I'm not crazy about the protection character is because any
> character you use is going to be awkward. Escaping characters with # signs
> just isn't very intuitive to anyone on any platform. On the other hand,
> escaping characters in filenames just isn't a thing that anyone needs to do
> on Windows, because characters that require escaping are not valid filename
> characters. I guess ultimately I'm looking for a way so that each platform
> can have the most natural way of doing things on that platform, so nobody
> feels like a second class citizen. Using # characters (or really any
> character other than \) for escaping is going to be unnatural, but even
> more importantly unnecessary since we only need this for file paths.
>
> That's what led me to the aforementioned idea. We already have the
> ability to mark option values as file paths by using eOptionTypeFileSpec.
> So why not just put the logic there, and then there's no setting anywhere,
> and everyone can use the native path syntax for the platform they're
> working with.
>
> On Wed Jan 14 2015 at 10:28:26 AM <jingham at apple.com> wrote:
>
>> I don't like this. Having two arguments to the same command with
>> different syntaxes seems pretty ugly to me.
>>
>> I didn't completely follow why having the protection character be
>> settable ended up being annoying to do, but short of that, I think the best
>> option is to have the protection character be a host setting, leave it at
>> "\" for everything but Windows, and pick some likely character for
>> Windows. Then we'll have to document this somewhere that the average
>> debugger user is likely to find it.
>>
>> Jim
>>
>>
>> > On Jan 13, 2015, at 6:38 PM, Zachary Turner <zturner at google.com> wrote:
>> >
>> > For example, one idea i had to fix this in a deeper way was to audit
>> all settings and options. For each one that's a path, make sure the option
>> type is eOptionTypeFileSpec. Then sink the escaping logic into FileSpec.
>> The added benefit of this is that it easily gives you access to the native
>> syntax of a remote platform's file system by way of the PathSyntax enum i
>> added to FileSpec some time ago. So if you're running some command that
>> takes a path in the context of a remote host, you can use that hosts
>> escaping rules. Plus this only changes the behavior for paths, which is the
>> only place we need it, so doesn't break anything else.
>> >
>> > This is a bigger change though, and not needed immediately, which is
>> why I would prefer postponing.
>> >
>> > Thoughts about this approach?
>> > On Tue, Jan 13, 2015 at 6:13 PM Zachary Turner <zturner at google.com>
>> wrote:
>> > I definitely need to be able to specify paths with backslashes
>> unquoted, but whether this is the best approach I'm still undecided.
>> >
>> > To be honest, I'm ok for now just disabling backslash processing with
>> an #ifndef _WIN32. This might cause other issues down the line for us, but
>> they will be fairly uncommon, and it will fix the common case now. And we
>> can do a more proper fix if/when the problem resurfaces.
>> > On Tue, Jan 13, 2015 at 5:06 PM Greg Clayton <gclayton at apple.com>
>> wrote:
>> >
>> > > On Jan 12, 2015, at 5:03 PM, Zachary Turner <zturner at google.com>
>> wrote:
>> > >
>> > > The way Process does it is a little confusing to me. It has this
>> ProcessProperties() class which takes a boolean, is_global. If this value
>> is true, it appends some sub-properties to it, and if it's false, it just
>> initializes itself by calling itself again with true. So false and true
>> make no difference, it's always the same instance (?)
>> >
>> > No. is_global is set to true if this is the initial "seed" options for
>> "process" under "target". This is why when "is_global" is set to true that
>> it appends the settings to the "target" global settings, and it doesn't
>> when not.
>> >
>> > If you think about it you need a set of options that can be editable
>> before any processes exist. So the "is_global" means we are creating this
>> one set of settings, it has nothing to do with the fact that individual
>> settings can be global or instance based. When a lldb_private::Process is
>> made, it inherits from ProcessProperties, and these are the settings that
>> get modified and any global properties are shared with the global versions,
>> and any instance variables copy the current global settings to be the basis
>> for the initial settings in Process.
>> >
>> > So lets say ProcessProperties has two settings "a" which is global and
>> "b" which is not (set by the bool in each property definition). When we
>> first start out we create the global properties which contain default
>> values for "a" and "b":
>> >
>> > Global ProcessProperties:
>> > a = true
>> > b = false
>> >
>> > Now we run some commands:
>> >
>> > (lldb) settings set process.a false
>> >
>> > So now we have:
>> >
>> > Global ProcessProperties:
>> > a = false
>> > b = false
>> >
>> > Now we create a process whose pid is 123:
>> >
>> > (lldb) file /bin/ls
>> > (lldb) b malloc
>> > (lldB) r
>> >
>> > Now a new process gets created and it makes a copy of "Global
>> ProcessProperties":
>> >
>> > Process 123 ProcessProperties:
>> > a = false
>> > b = false
>> >
>> > Now someone says:
>> >
>> > (lldb) settings set process.a true
>> >
>> > Since "a" is global we get:
>> >
>> > Global ProcessProperties:
>> > a = true
>> > b = false
>> >
>> > Process 123 ProcessProperties:
>> > a = true
>> > b = false
>> >
>> > But if we do:
>> >
>> > (lldb) settings set process.b true
>> >
>> > The Global properties remain unchanged, and only Process 123's settings
>> get changed:
>> >
>> > Global ProcessProperties:
>> > a = true
>> > b = false
>> >
>> > Process 123 ProcessProperties:
>> > a = true
>> > b = true
>> >
>> >
>> > >
>> > > Then, to further confuse matters, there's a single PropertyDefinition
>> array, where some of them are specified as global and some not via the
>> third argument. When you call GetGlobalProperties(), it then just returns
>> the whole array. There's not much choice it has, because it's all just one
>> array, so it can't really return a filtered array with only the items where
>> global==true.
>> > >
>> > > I feel like the correct way to do this is to have two separate
>> arrays. One for global properties and one for instance properties. Then,
>> instead of inheriting from Properties, just have two instances of it in the
>> class. An instance instance and a static instance. This way you don't
>> even need the third argument to PropertyDefinition at all. And the global
>> properties are added as a subcategory, such as interpreter.global. This
>> also makes it clear to the user when they run "settings list" which
>> settings are global and which are not. It might be clearer with an
>> example, so I'll upload this patch so you can see what I'm talking about.
>> If you don't like the way I've done it let me know.
>> >
>> > Read what I typed above carefully and make sure that:
>> > 1 - we need to be able to set non-global properties without any
>> instances so there needs to be global settings that contain all global and
>> all instance based properties so we can set these values before we have any
>> instances around
>> > 2 - when a new instance is made, it needs to get a copy of the global
>> pref's global and non-global option values
>> > 3 - if you modify a value that is non-global, there needs to be a way
>> to find the right instance so you can set the settings as the code
>> currently does
>> >
>> > I would rather you not change this code as it currently works and is
>> relatively bug free. If you do still feel you need to change the code do
>> NOT break anything. This means a lot of testing to make sure it behaves the
>> exact way it used to. Messing with this can really hose things up, so
>> again, don't break anything if you still feel you need to change the code.
>> >
>> > Greg
>> >
>> >
>> > >
>> > > On Mon Jan 12 2015 at 3:01:21 PM Greg Clayton <gclayton at apple.com>
>> wrote:
>> > > Actually we do have command interpreter settings as Jim just pointed
>> out to me, so feel free to add it there and hook the global settings up
>> correctly for that with a static function on CommandInterpreter to get the
>> escape character.
>> > >
>> > > Greg
>> > >
>> > > > On Jan 12, 2015, at 2:59 PM, Greg Clayton <gclayton at apple.com>
>> wrote:
>> > > >
>> > > > Probably the command interpreter, but we have all the command
>> interpreter settings on the debugger right now anyway, so don't change
>> anything on that end..
>> > > >
>> > > >> On Jan 12, 2015, at 2:35 PM, Zachary Turner <zturner at google.com>
>> wrote:
>> > > >>
>> > > >> Thanks for the response. By the way, is this setting more
>> appropriate on the CommandInterpreter instead of the debugger?
>> > > >>
>> > > >> On Mon Jan 12 2015 at 2:34:06 PM Greg Clayton <gclayton at apple.com>
>> wrote:
>> > > >>
>> > > >>> On Jan 12, 2015, at 2:05 PM, Zachary Turner <zturner at google.com>
>> wrote:
>> > > >>>
>> > > >>> There doesn't seem to be a good way to get access to the Debugger
>> object from the Args class. I tried changing Args to take a Debugger to
>> its constructor, but it seems this isn't always possible, for example with
>> anything having to do with OptionValue.
>> > > >>> I could provide a default value of NULL for the Debugger, and if
>> NULL it would fall back to the default escape character, but this seems
>> awkward.
>> > > >>>
>> > > >>> Since I've declared this as a global property, why should I even
>> need a Debugger instance? Shouldn't the global settings be static on the
>> Debugger so that they can be accessed without an instance?
>> > > >>
>> > > >>
>> > > >> This is the problem. See how process does it:
>> > > >>
>> > > >> class Process {
>> > > >> static const ProcessPropertiesSP &
>> > > >> GetGlobalProperties();
>> > > >> }
>> > > >>
>> > > >> The debugger should do the same thing. It should also add static
>> accessors for all of the settings that are truly global settings. Looking
>> at the global settings, they all seem to be set to be global values which
>> probably isn't what we want. Why? Xcode creates a new Debugger for each
>> debugging window that it creates and if someone types "setting set
>> auto-confirm false" in one command interpreter, it shouldn't affect the
>> other.
>> > > >>
>> > > >> So I would say we need to switch all debugger settings to be
>> non-global (third initialized in each of the PropertyDefinition values
>> should be set to "false" in g_properties in Debugger.cpp) and the only one
>> that should remain global is your new escape character. Then Debugger
>> should get a new static accessor:
>> > > >>
>> > > >> class Debugger
>> > > >> {
>> > > >> static char
>> > > >> GetEscapeCharacter();
>> > > >> };
>> > > >>
>> > > >> This in turn will access the global properties and return it to
>> outside folks.
>> > > >>
>> > > >> Check out the ProcessProperties class in Process.h and check out
>> its constructor. We will need to do something similar for the debugger's
>> settings. Right now Debugger inherits directly from Properties, but we will
>> need to change it to be like the process where there is a
>> DebuggerProperties class. The global version gets constructed with a "bool
>> is_global" set to false, and the instance ones get constructed with that
>> set to true.
>> > > >>
>> > > >> The Process class is the model we will need to follow if you want
>> to make this change and have a true global property that can be accessed
>> without a Debugger instance.
>> > > >>
>> > > >>
>> > > >>>
>> > > >>> On Thu Jan 08 2015 at 3:00:30 PM <jingham at apple.com> wrote:
>> > > >>> It's not just file names, you would also have to make sure
>> there's nothing else that might be in command argument or option value that
>> has a single & a double quote, since without the backslashes this would be
>> impossible.
>> > > >>>
>> > > >>> If you really want to do this, I think it would be better to add
>> a debugger setting for the "protection character". Then this would be set
>> to something else (maybe "#") on Windows, and backslash on all other
>> systems. That way you could probably always find a protection character
>> that you could use for whatever gnarly string you ended up having to pass
>> through the command interpreter.
>> > > >>>
>> > > >>> Jim
>> > > >>>
>> > > >>>
>> > > >>>> On Jan 8, 2015, at 2:55 PM, Zachary Turner <zturner at google.com>
>> wrote:
>> > > >>>>
>> > > >>>> Actually ' is a valid filename character in Windows, but not ".
>> That being said, it's so rare, and having a backslash is so common that I
>> would prefer the common case to be easy, and the rare case being either
>> unsupported or difficult is ok with me. So I'd still prefer to disable
>> this backslash handling on Windows for now, and then fix ' on Windows
>> separately in the future.
>> > > >>>>
>> > > >>>> On Thu Jan 08 2015 at 2:51:51 PM Zachary Turner <
>> zturner at google.com> wrote:
>> > > >>>> On Windows, that's not a valid file name, and in fact any file
>> name that has an escaped character is not a valid filename. I see what
>> you're getting at though, that for non-Windows it's necessary. Can I wrap
>> the backslash handling in #ifndef _WIN32 then?
>> > > >>>>
>> > > >>>> On Thu Jan 08 2015 at 2:49:49 PM <jingham at apple.com> wrote:
>> > > >>>> If I have a file called foo"bar'baz, how what would I put in the
>> place of <AWKWARD NAME> for
>> > > >>>>
>> > > >>>> (lldb) file <AWKWARD NAME>
>> > > >>>>
>> > > >>>> Right now, I can do:
>> > > >>>>
>> > > >>>> (lldb) file foo\"bar\'baz
>> > > >>>> Current executable set to 'foo"bar'baz' (x86_64).
>> > > >>>>
>> > > >>>> Jim
>> > > >>>>
>> > > >>>>
>> > > >>>>> On Jan 8, 2015, at 2:31 PM, Zachary Turner <zturner at google.com>
>> wrote:
>> > > >>>>>
>> > > >>>>> FWIW, Removing backslash handling from this function
>> (essentially ignoring backslashes in command arguments) fixes about 12-15
>> tests on Windows with no other changes. I see there's some logic in Args
>> for encoding and decoding escape sequences, but this only happens if a
>> particular command option is set, and that command only only appears to be
>> set for the "prompt" command. I'm not sure if removing the backslash logic
>> from SetCommandString would interfere with this command in any way, but it
>> doesn't seem like it would interfere with any other commands, unless I'm
>> missing something.
>> > > >>>>>
>> > > >>>>> On Thu Jan 08 2015 at 1:43:16 PM Zachary Turner <
>> zturner at google.com> wrote:
>> > > >>>>> One thing causing many tests to fail on Windows is the presence
>> of backslashes in argument. Until now, I've worked around this in many
>> cases by making sure that arguments with backslashes are always quoted.
>> > > >>>>>
>> > > >>>>> But there are some cases where this is not easy to guarantee
>> and now I'm leaning towards (at least on Windows) allowing backslashes in
>> argument strings. The code in question comes from the function void
>> SetCommandString (const char *command) in the file Args.cpp
>> > > >>>>>
>> > > >>>>> In particular, it implements special handling of whitespace,
>> single quotes, double quotes, and backslashes. For the case of backslashes
>> it removes them from the string.
>> > > >>>>>
>> > > >>>>> What would be the implications of removing backslash handling
>> from this function for all platforms? I would prefer to keep platform
>> specific code out of generic code, but I think this needs to be changed on
>> Windows.
>> > > >>>>> _______________________________________________
>> > > >>>>> lldb-dev mailing list
>> > > >>>>> lldb-dev at cs.uiuc.edu
>> > > >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>> > > >>>>
>> > > >>>
>> > > >>> _______________________________________________
>> > > >>> lldb-dev mailing list
>> > > >>> lldb-dev at cs.uiuc.edu
>> > > >>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
>> > > >>
>> > > >
>> > >
>> >
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20150114/2fc380eb/attachment.html>
More information about the lldb-dev
mailing list