[llvm-dev] Syntax for FileCheck numeric variables and expressions

Thomas Preudhomme via llvm-dev llvm-dev at lists.llvm.org
Wed Aug 22 02:07:30 PDT 2018


Hi James,

Yes, I think your summary proposal is a good one, though I disagree with the
colon being optional, because [[# %x, VAR5]] would then be ambiguous: it could
also mean looking for the value of VAR5 in the %x format. If anything,
[[# %x, VAR5]] is equivalent to [[#:%x, VAR5]] (or [[#:%x = VAR5]] with your
proposal). My other suggestion would be to use == rather than =, since = could
be confused with assignment.
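
To make the ambiguity concrete, here is a minimal Python sketch of how the
body between "[[#" and "]]" could be classified if the colon is mandatory
for a capture. The grammar, the BODY_RE pattern and the classify() helper
are assumptions made for illustration only; they are not FileCheck's parser.

```python
import re

# Hypothetical grammar for the body between "[[#" and "]]", loosely following
# the summary proposal in this thread; this is not FileCheck's actual parser.
BODY_RE = re.compile(
    r"""\s*
        (?:(?P<fmt>%[udxX])\s*,?\s*)?       # optional format specifier
        (?:(?P<name>[A-Za-z_]\w*)\s*:\s*)?  # a name followed by a colon marks a capture
        (?P<rest>.*?)\s*$                   # constraint (capture) or expression (use)
    """,
    re.VERBOSE,
)

def classify(body):
    """Return (kind, format, captured name, remaining text) for a body."""
    m = BODY_RE.match(body)
    kind = "capture" if m.group("name") else "use"
    return kind, m.group("fmt"), m.group("name"), m.group("rest")
```

With the colon required, classify(" %x, VAR5") can only be a use of VAR5,
while classify(" %x, VAR3 :") is a capture. If the colon were optional, the
first form would have two possible readings.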

Note that I'll stick to only implementing = for now, as supporting <, <=, >
or >= requires different logic from what I'm doing now.
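
One plausible way to see the difference, sketched in Python under assumed
names (FORMATS, DIGITS, BASES, match_equal and match_less_than are all
hypothetical and not taken from the patch): for equality the expected value
can be formatted up front and matched as literal text, whereas for an
inequality a number has to be matched first, then converted and compared.

```python
import re

# Hypothetical subset of format specifiers and how to print/recognise them.
FORMATS = {"%u": "{:d}", "%x": "{:x}", "%X": "{:X}"}
DIGITS = {"%u": r"[0-9]+", "%x": r"[0-9a-f]+", "%X": r"[0-9A-F]+"}
BASES = {"%u": 10, "%x": 16, "%X": 16}

def match_equal(text, fmt, expected):
    # "=": the expected value is known before matching, so it can be
    # formatted and compared against the input as literal text.
    literal = FORMATS[fmt].format(expected)
    return re.fullmatch(re.escape(literal), text) is not None

def match_less_than(text, fmt, bound):
    # "<": no single literal exists; match any number in the given format,
    # then convert it and compare numerically.
    m = re.fullmatch(DIGITS[fmt], text)
    return m is not None and int(m.group(0), BASES[fmt]) < bound
```

For example, match_equal("1a", "%x", 26) succeeds by pure substitution,
while match_less_than("f", "%x", 16) has to parse the matched text before
it can decide.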

By the way, I have already started working on the new syntax. There is still
a fair amount to do as I was busy with other tasks, but I'm progressing.

Best regards,

Thomas

On Fri, 17 Aug 2018 at 10:39, James Henderson <jh7370.2008 at my.bristol.ac.uk>
wrote:

> Hi,
>
> I had some more thoughts. Summary of my proposal is at the bottom, but
> basically I wonder if we need to look again at the syntax a little.
>
> +++Details+++
>
> In https://reviews.llvm.org/D49964, the proposed test at the time of
> writing has the size of a compressed and a decompressed version of a
> section hard-coded in. This feels a little fragile to me, and really what's
> interesting is whether the compressed section is smaller than the
> decompressed section. That then led me to think that the current test
> harness we use for some of our tools allows us to capture an integer and
> then compare it against another integer to report success or failure.
> Sometimes, the value in the comparison is computed from a captured number
> too. I don't have a fully-thought out syntax for this, but I think it
> should be complementary to the variable expression syntax.
>
> Example proposal:
>
> [[# %x < VAR - 10]]
> [[# < VAR - 10]]
>
> The first would match a hex number that is more than 10 less than
> the value of VAR, and the second would match whatever the default pattern
> is. Thus the format specifier still works as before. The only difference is
> that we replace the ',' with a comparison operator (equally valid would be
> '<=', '>' etc). That then led me to wonder, why not use '==' (or just '=')
> to indicate equality, instead of ',' i.e:
>
> [[# %x == VAR - 10]]
>
> Related to this, it occurred to me that sometimes, we might want to
> capture the variable for reuse later, but also verify that it is based on
> some other variable (e.g. END is 4 higher than BEGIN, and written in hex).
> So maybe both can live alongside each other:
>
> [[# %x, VAR1 < VAR2 - 10]]
>
> This would capture a hex number, store it in VAR1, but fail if that number
> is not more than ten less than VAR2. Maybe we might want to use a colon to
> delineate the capture side from the verification side.
>
> +++Summary+++
>
> I think the following would be my proposal:
>
> [[# %x, VAR1 : < VAR2 - 10]] // Capture VAR1 from hex, fail if it doesn't
> meet the right-hand expression.
> [[# %x, VAR3 :]] // Capture VAR3 from hex, always succeeding.
> [[# %x = VAR4 + 10]] // Capture a hex string that must equal VAR4 + 10.
>
> Thus if nothing is after the colon, just capture a variable (which I think
> is what we agreed on before). Anything after a colon is used as a variable
> expression that must match the captured expression.
>
> That would suggest that the following would not want to be valid syntax,
> but maybe ',' could be treated as synonymous with '=', or maybe the colon
> can be omitted in cases where no verification is needed (and thus the
> following becomes an assignment)?
> [[# %x, VAR5]]
>
> Thoughts?
>
> James
>
>
> On 31 July 2018 at 17:36, <paul.robinson at sony.com> wrote:
>
>> I can certainly envision a use case for a [BASE + LENGTH + 4] computation
>> to verify the address of a next-thingy.  Comes up in DWARF dumps all the
>> time.
>>
>> --paulr
>>
>>
>>
>> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *James
>> Henderson via llvm-dev
>> *Sent:* Tuesday, July 31, 2018 11:53 AM
>> *To:* Thomas Preudhomme
>> *Cc:* llvm-dev
>>
>> *Subject:* Re: [llvm-dev] Syntax for FileCheck numeric variables and
>> expressions
>>
>>
>>
>> This looks like a reasonable subset of features to me. My only question
>> is related to this one:
>>
>>
>>
>> > - arithmetic expression involving several variables
>>
>>
>>
>> Is it actually harder to write FileCheck to handle this case than to not
>> handle it? I'm (naively) assuming that the variables will be in some form
>> of container, and are just substituted in. If it is harder, that's fine.
>> Otherwise, I just say do it.
>>
>>
>>
>> James
>>
>>
>>
>> On 31 July 2018 at 11:51, Thomas Preudhomme via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>> Hi Alex,
>>
>> On Fri, 27 Jul 2018 at 11:53, Alexander Richardson
>> <arichardson.kde at gmail.com> wrote:
>> >
>> > On Thu, 26 Jul 2018 at 10:28 Thomas Preudhomme <
>> thomas.preudhomme at linaro.org> wrote:
>> >>
>> >> Hi Alexander,
>> >>
>> >> Please forgive me if I'm missing the obvious but I do not see how the
>> >> order helps allowing a comma in the expression. It seems to me that
>> >> what would allow it is to make FMTSPEC mandatory or at least the comma
>> >> to separate it (ie. [[#,EXPR]] for the default format specifier). In
>> >> any case comma in a function-call like expression can be distinguished
>> >> from comma for the format specifier since one is always inside a
>> >> parenthesized expression.
>> >>
>> > Hi Thomas,
>> >
>> > I thought that FMTSPEC first might be easier to implement because you
>> can just check if the first non-whitespace character after # is a %. If it
>> is parse a fmtspec followed by a comma and if not treat everything else as
>> the expression. But you are right a function-like syntax would always
>> contain parentheses so there is no ambiguity.
>> > I think [[#,EXPR]]  looks a bit strange and I think we can determine
>> default format vs format specifier based on the first character after the #
>> being a % or not. I.e. [[#EXPR]] means default format and [[#%x,EXPR]] is
>> hex. Does that sound reasonable?
>>
>> Yes it does. I've started reworking the changes I made to
>> FileCheck.rst to document the agreed upon syntax. At the moment I'm
>> thinking about supporting %u, %d, %x and %X as input and output format
>> specifier, the optionality of format specifier (defaulting to %u) and
>> basic numeric variable definition and numeric expression use involving
>> a variable and an immediate. In particular, I do *not* plan to
>> implement the following:
>> - defining a numeric variable from a numeric expression
>> - arithmetic operations other than - and +
>> - arithmetic expression involving several variables
>>
>> I'll make sure that this can easily be added later and will mention in
>> the doc that the syntax for these features has already been agreed as
>> well.
>>
>> Feel free to give me feedback on the set of features I intend to
>> implement in this initial patch.
>>
>> Best regards,
>>
>> Thomas
>>
>>
>> >
>> >
>> >>
>> >> That said I don't have a strong opinion about the ordering of the
>> >> expression wrt. the format specifier. I find EXPR, FMTSPEC more
>> >> natural but as 2 people (James and you) expressed a preference for the
>> >> reverse order so I'll assume that's the general preference.
>> >>
>> >
>> > I don't have a strong preference whether it should come before or after
>> and agree with James that whatever is easiest to implement should be done.
>> >
>> > Thanks,
>> > Alex
>> >
>> >
>> >> Best regards,
>> >>
>> >> Thomas
>> >>
>> >> P.S.: My apologies for only asking now but how do you prefer to be
>> >> called? Alexander Vs Alex Vs something else?
>> >
>> > Most people call me Alex but if you prefer Alexander is also fine.
>> >
>> >>
>> >>
>> >> On Sun, 22 Jul 2018 at 20:23, Alexander Richardson
>> >> <arichardson.kde at gmail.com> wrote:
>> >> >
>> >> > On Wed, 18 Jul 2018 at 13:50 Thomas Preudhomme <
>> thomas.preudhomme at linaro.org> wrote:
>> >> >>
>> >> >> Hi Alex,
>> >> >>
>> >> >> Thanks for the feedback. My first thought was that introducing the
>> new
>> >> >> pseudo var @EXPR is a nice way to generalize that syntax beyond
>> @LINE
>> >> >> since it would also evaluate to an arithmetic value. On the other
>> hand
>> >> >> there is a small inconsistency because @LINE evaluates to a value
>> >> >> which can be part of an expression while @EXPR is an expression, and
>> >> >> so the @ syntax as a whole becomes defined as introducing something
>> >> >> which is not a regular variable, ie. a negative definition.
>> >> >>
>> >> >> I'll stick with the # syntax because # is usually associated with
>> >> >> numbers and can be defined as introducing an integer
>> >> >> expression/variable. The one question I wonder is if the # should be
>> >> >> next to the variable name or next to the [[ as proposed by James. I
>> >> >> like the former better *but* I think the latter makes more sense
>> since
>> >> >> [[#VAR + 1]] would suggest that the [[<something>]] syntax already
>> >> >> allows numeric expression without numeric variable which is not the
>> >> >> case. Having the # right at the start also clearly indicates that
>> the
>> >> >> whole expression might have a conversion specifier. Finally, the #
>> >> >> syntax can allow defining a variable with the result of an
>> arithmetic
>> >> >> expression:
>> >> >> [[#BAR, %x:]]
>> >> >> [[# FOO:BAR+12]]
>> >> >>
>> >> >> So BAR takes a hex value in lower case syntax, the value gets 12 added
>> >> >> (in decimal) and the result is put into FOO. In which case there
>> >> >> should be no format specifier when defining FOO. ie. format
>> specifier
>> >> >> for definition is only when there's nothing after the colon. Of
>> course
>> >> >> we could allow hex immediate with 0x syntax if needed. Again, I'm
>> not
>> >> >> advocating for implementing all this from the start, but make sure
>> >> >> that the syntax would allow it if we realize we need this later and
>> I
>> >> >> think James's proposal does.
>> >> >>
>> >> >> It seems this syntax would suit all your current uses (albeit the
>> >> >> rewriting necessary), did I miss something?
>> >> >>
>> >> >
>> >> > Hi Thomas,
>> >> >
>> >> > That would indeed work fine for me and it would be easy to update
>> our tests with a few regex replaces.
>> >> >
>> >> > I think I prefer the [[# %FMTSPEC, EXPR]] syntax since that would
>> also make it possible to have commas in the expression part. This might be
>> useful if we allow function-call like expressions such as [[# %X, pow(10,
>> FOO) + 20]].
>> >> >
>> >> >
>> >> > Alex
>> >> >
>> >> >
>> >> >
>> >> >>
>> >> >> Best regards,
>> >> >>
>> >> >> Thomas
>> >> >>
>> >> >> On Tue, 17 Jul 2018 at 21:59, Alexander Richardson
>> >> >> <arichardson.kde at gmail.com> wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Tue, 17 Jul 2018 at 10:02 Thomas Preudhomme via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>> >> >> >>
>> >> >> >> To be clear, I do not intend to add support for hex specifier in
>> the
>> >> >> >> current patch, I just want to make sure the syntax we choose is
>> going
>> >> >> >> to allow it later. My immediate use case is decimal integer and I
>> >> >> >> intend to write the code so that it's easy to extend to more
>> type of
>> >> >> >> numeric variables and expressions later. This way we'll only add
>> >> >> >> specifier that are actually required by actual testcases.
>> >> >> >>
>> >> >> >
>> >> >> > I also added FileCheck expressions to our fork of LLVM in order
>> to allow testing both the 128-bit and the 256-bit versions of our CHERI ISA in
>> a single test case [1].
>> >> >> > I used [[@EXPR foo * 2 + 1]] for FileCheck expressions [2]. I'm
>> not particularly happy with this syntax since it is quite verbose (but then
>> again we don't need it that often so it doesn't really matter). It also
>> doesn't allow saving the expression result so it needs to be repeated
>> everywhere. I could probably use [[@EXPR:OUTVAR INVAR + 42]] but I haven't
>> really had the need for that yet.
>> >> >> >
>> >> >> > We currently need the following two features:
>> >> >> >
>> >> >> > - Simple arithmetic with multiple operations. Example:
>> >> >> > `cld $gp, $zero, [[@EXPR 2 * $CAP_SIZE - 8]]($c11)`
>> >> >> >
>> >> >> > - Conversion to hex (upper and lower case since not all tools are
>> consistent here) and to decimal.
>> >> >> > Example: // READOBJ-NEXT: 0x50 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE
>> .data 0x[[@EXPR hex($CAP_SIZE * 2)]]
>> >> >> >
>> >> >> > Alex
>> >> >> >
>> >> >> > [1] For most test cases the simple -DVAR=value flag in FileCheck
>> is good enough: we have a %cheri_FileCheck lit substitution that expands to
>> `FileCheck '-D$CAP_SIZE=16/32'` . This works for most IR level tests since
>> usually the only thing that is different is "align 16" vs "align 32".
>> However, when checking the assembly output or linker addresses we often
>> need something more complex.
>> >> >> >
>> >> >> > [2] A test case showing all the currently supported expressions
>> can be found here: <
>> https://github.com/CTSRD-CHERI/llvm/blob/master/test/FileCheck/expressions.txt
>> >
>> >> >> >
>> >> >> >
>> >> >> >>
>> >> >> >> On Mon, 16 Jul 2018 at 18:39, <paul.robinson at sony.com> wrote:
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > > -----Original Message-----
>> >> >> >> > > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On
>> Behalf Of
>> >> >> >> > > Thomas Preudhomme via llvm-dev
>> >> >> >> > > Sent: Monday, July 16, 2018 6:24 AM
>> >> >> >> > > To: jh7370.2008 at my.bristol.ac.uk
>> >> >> >> > > Cc: llvm-dev at lists.llvm.org
>> >> >> >> > > Subject: Re: [llvm-dev] Syntax for FileCheck numeric
>> variables and
>> >> >> >> > > expressions
>> >> >> >> > >
>> >> >> >> > > Hi James,
>> >> >> >> > >
>> >> >> >> > > I like that suggestion very much but I think keeping the
>> order of the
>> >> >> >> > > two sides as initially proposed makes more sense. In
>> printf/scanf the
>> >> >> >> > > string is first because the primary use of these functions
>> is to do
>> >> >> >> > > I/O and so you first specify what you are going to
>> output/input and
>> >> >> >> > > then where to capture variables. The primary objective of
>> FileCheck
>> >> >> >> > > variables and expressions is to capture/print them, the
>> specifier is
>> >> >> >> > > an addon to allow some conversion. Does it make sense?
>> >> >> >> >
>> >> >> >> > My immediate reaction is that I'd rather not have FileCheck
>> get into
>> >> >> >> > the business of handling printf specifiers.  OTOH, while LLVM
>> tools
>> >> >> >> > do typically print lowercase hex, that's not guaranteed, and
>> looking
>> >> >> >> > at the output of other tools can be useful too.  So, a way to
>> specify
>> >> >> >> > the case for a hex conversion seems worthwhile.
>> >> >> >> >
>> >> >> >> > I had also been thinking in terms of the trailing colon to
>> distinguish
>> >> >> >> > definition from use, as James suggested, that's sort-of
>> consistent
>> >> >> >> > with the current syntax.
>> >> >> >> >
>> >> >> >> > This is starting to make parsing the insides of [[]] much more
>> involved,
>> >> >> >> > so you'll want to pay attention to making that code
>> well-structured and
>> >> >> >> > readable.
>> >> >> >> > --paulr
>> >> >> >> >
>> >> >> >> > >
>> >> >> >> > > In the interest of speeding things up I plan to start
>> implementing
>> >> >> >> > > this proposal starting tomorrow unless someone gives some
>> more
>> >> >> >> > > feedback.
>> >> >> >> > >
>> >> >> >> > > Best regards,
>> >> >> >> > >
>> >> >> >> > > Thomas
>> >> >> >> > >
>> >> >> >> > > On Fri, 13 Jul 2018 at 15:51, James Henderson
>> >> >> >> > > <jh7370.2008 at my.bristol.ac.uk> wrote:
>> >> >> >> > > >
>> >> >> >> > > > Hi Thomas,
>> >> >> >> > > >
>> >> >> >> > > > In general, I think this is a good proposal. However, I
>> don't think that
>> >> >> >> > > using ">" or "<" to specify base (at least alone) is a good
>> idea, as it
>> >> >> >> > > might clash with future ideas to do comparisons etc. I also
>> think it would
>> >> >> >> > > be nice to have the syntax consistent between definition and
>> use. My first
>> >> >> >> > > thought on a reasonable alternative was to use commas to
>> separate the two
>> >> >> >> > > parts, so something like:
>> >> >> >> > > >
>> >> >> >> > > > [[# VAR, 16:]] to capture a hexadecimal number (where the
>> spaces are
>> >> >> >> > > optional). [[# VAR, 16]] to use a variable, converted to a
>> hexadecimal
>> >> >> >> > > string. In both cases, the base component is optional, and
>> defaults to
>> >> >> >> > > decimal.
>> >> >> >> > > >
>> >> >> >> > > > This led me to think that it might be better to use
>> something similar to
>> >> >> >> > > printf style for the latter half, so to capture a
>> hexadecimal number with
>> >> >> >> > > a leading "0x" would be: "0x[[# VAR, %x:]]" and to use it
>> would be "0x[[#
>> >> >> >> > > VAR, %x]]". Indeed, that would allow straightforward
>> conversions between
>> >> >> >> > > formats, so say you defined it by capturing a decimal
>> integer and using it
>> >> >> >> > > to match a hexadecimal in upper case, with leading 0x and 8
>> digits
>> >> >> >> > > following the 0x:
>> >> >> >> > > >
>> >> >> >> > > > CHECK: [[# VAR, %d:]] # Defines
>> >> >> >> > > > CHECK: 0x[[# VAR + 1, %8X]] # Uses
>> >> >> >> > > >
>> >> >> >> > > > Of course, if we go down that route, it would probably
>> make more sense
>> >> >> >> > > to reverse the two sides (e.g. to become "[[# %d, VAR:]]" to
>> capture a
>> >> >> >> > > decimal and "[[# %8X, VAR + 1]]" to use it).
>> >> >> >> > > >
>> >> >> >> > > > Regards,
>> >> >> >> > > >
>> >> >> >> > > > James
>> >> >> >> > > >
>> >> >> >> > > > On 12 July 2018 at 15:34, Thomas Preudhomme via llvm-dev
>> <llvm-
>> >> >> >> > > dev at lists.llvm.org> wrote:
>> >> >> >> > > >>
>> >> >> >> > > >> Hi all,
>> >> >> >> > > >>
>> >> >> >> > > >> I've written a patch to extend FileCheck to support
>> matching
>> >> >> >> > > >> arithmetic expressions involving variables [1] (eg. to
>> match REG+1
>> >> >> >> > > >> where REG is a variable with a numeric value). It was
>> suggested to me
>> >> >> >> > > >> in the review to introduce the concept of numeric
>> variable and to
>> >> >> >> > > >> allow for specifying the base the values are written in.
>> >> >> >> > > >>
>> >> >> >> > > >> [1] https://reviews.llvm.org/D49084
>> >> >> >> > > >>
>> >> >> >> > > >> I think the syntax should satisfy the below requirements:
>> >> >> >> > > >>
>> >> >> >> > > >> * based off the [[]] construct since anything else might
>> overload an
>> >> >> >> > > >> existing valid syntax (eg. $$ is supposed to match
>> literally now)
>> >> >> >> > > >> * consistent with syntax for expressions using @LINE
>> >> >> >> > > >> * consistent with using ':' to define regular variable
>> >> >> >> > > >> * allows to specify base of the number a numeric variable
>> is being set
>> >> >> >> > > to
>> >> >> >> > > >> * allows to specify base of the result of the numeric
>> expression
>> >> >> >> > > >>
>> >> >> >> > > >> I've come up with the following syntax for which I'd like
>> feedback:
>> >> >> >> > > >>
>> >> >> >> > > >> Numeric variable definition: [[#X<base:]] (eg.
>> [[#ADDR<16:]]) where X
>> >> >> >> > > >> is the numeric variable being defined and <base is
>> optional in which
>> >> >> >> > > >> case base defaults to 10
>> >> >> >> > > >> Numeric variable use: [[#X>base]] (eg. [[#ADDR>2]]) where >base is
>> >> >> >> > > >> optional in which case base defaults to 10
>> >> >> >> > > >> Numeric expression: [[exp>base]] (eg. [[#ADDR+2>16]]) where expression
>> >> >> >> > > >> must contain at least one numeric variable
>> >> >> >> > > >>
>> >> >> >> > > >>
>> >> >> >> > > >> I'm not a big fan of the > for the output base being
>> inside the
>> >> >> >> > > >> expression but [[exp]]>base would match >base literally.
>> >> >> >> > > >>
>> >> >> >> > > >> Any suggestions / opinions?
>> >> >> >> > > >>
>> >> >> >> > > >> Best regards,
>> >> >> >> > > >>
>> >> >> >> > > >> Thomas
>> >> >> >> > > >> _______________________________________________
>> >> >> >> > > >> LLVM Developers mailing list
>> >> >> >> > > >> llvm-dev at lists.llvm.org
>> >> >> >> > > >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >> >> >> > > >
>> >> >> >> > > >
>>
>>
>>
>
>