[llvm-dev] Syntax for FileCheck numeric variables and expressions

Alexander Richardson via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 27 03:53:42 PDT 2018


On Thu, 26 Jul 2018 at 10:28 Thomas Preudhomme <thomas.preudhomme at linaro.org>
wrote:

> Hi Alexander,
>
> Please forgive me if I'm missing the obvious but I do not see how the
> order helps allowing a comma in the expression. It seems to me that
> what would allow it is to make FMTSPEC mandatory or at least the comma
> to separate it (ie. [[#,EXPR]] for the default format specifier). In
> any case comma in a function-call like expression can be distinguished
> from comma for the format specifier since one is always inside a
> parenthesized expression.
>
Hi Thomas,

I thought that FMTSPEC first might be easier to implement because you can
just check if the first non-whitespace character after # is a %. If it is,
parse a fmtspec followed by a comma, and if not, treat everything else as
the expression. But you are right: a function-like syntax would always
contain parentheses, so there is no ambiguity.
I think [[#,EXPR]] looks a bit strange, and I think we can determine
default format vs format specifier based on the first character after the #
being a % or not. I.e. [[#EXPR]] means default format and [[#%x,EXPR]] is
hex. Does that sound reasonable?
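
To make the check concrete, here is a minimal Python sketch of that rule. It
is only an illustration, not FileCheck code: the function name
split_numeric_block is made up, and it assumes the text between "[[#" and
"]]" has already been extracted.

```python
def split_numeric_block(body):
    """Split the text between '[[#' and ']]' into (fmtspec, expr).

    fmtspec is None when the default format applies.
    Hypothetical helper for illustration only.
    """
    text = body.lstrip()
    if not text.startswith("%"):
        # No leading '%': the whole body is the expression, commas included.
        return None, text.strip()
    # Leading '%': the format specifier runs up to the first comma.
    fmtspec, sep, expr = text.partition(",")
    if not sep:
        raise ValueError("expected ',' after format specifier")
    return fmtspec.strip(), expr.strip()
```

Because only the first comma after a leading % is treated as the separator,
any later commas (e.g. inside a function-call like expression) stay in the
expression part.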



> That said I don't have a strong opinion about the ordering of the
> expression wrt. the format specifier. I find EXPR, FMTSPEC more
> natural but at least 2 people (James and you) expressed a preference for
> the reverse order so I'll assume that's the general preference.
>
>
I don't have a strong preference whether it should come before or after and
agree with James that whatever is easiest to implement should be done.

Thanks,
Alex


> Best regards,
>
> Thomas
>
> P.S.: My apologies for only asking now but how do you prefer to be
> called? Alexander Vs Alex Vs something else?
>
Most people call me Alex, but Alexander is also fine if you prefer.


>
> On Sun, 22 Jul 2018 at 20:23, Alexander Richardson
> <arichardson.kde at gmail.com> wrote:
> >
> > On Wed, 18 Jul 2018 at 13:50 Thomas Preudhomme
> > <thomas.preudhomme at linaro.org> wrote:
> >>
> >> Hi Alex,
> >>
> >> Thanks for the feedback. My first thought was that introducing the new
> >> pseudo var @EXPR is a nice way to generalize that syntax beyond @LINE
> >> since it would also evaluate to an arithmetic value. On the other hand
> >> there is a small inconsistency because @LINE evaluates to a value
> >> which can be part of an expression while @EXPR is an expression, and
> >> so the @ syntax as a whole becomes defined as introducing something
> >> which is not a regular variable, ie. a negative definition.
> >>
> >> I'll stick with the # syntax because # is usually associated with
> >> numbers and can be defined as introducing an integer
> >> expression/variable. The one thing I wonder is whether the # should be
> >> next to the variable name or next to the [[ as proposed by James. I
> >> like the former better *but* I think the latter makes more sense since
> >> [[#VAR + 1]] would suggest that the [[<something>]] syntax already
> >> allows numeric expression without numeric variable which is not the
> >> case. Having the # right at the start also clearly indicates that the
> >> whole expression might have a conversion specifier. Finally, the #
> >> syntax can allow defining a variable with the result of an arithmetic
> >> expression:
> >> [[#BAR, %x:]]
> >> [[# FOO:BAR+12]]
> >>
> >> So BAR takes a hex value in lower case syntax, the value gets 12 added
> >> (in decimal) and the result is put into FOO. In which case there
> >> should be no format specifier when defining FOO, ie. a format specifier
> >> in a definition is only valid when there's nothing after the colon. Of
> >> course we could allow hex immediates with 0x syntax if needed. Again,
> >> I'm not advocating for implementing all this from the start, but for
> >> making sure that the syntax would allow it if we realize we need this
> >> later, and I think James's proposal does.
> >>
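
The two-step definition described above can be sketched in Python as
follows. This is only an illustration of the intended semantics, not
FileCheck code: the helper capture_hex, the input line and the "0x" prefix
are all assumptions made for the example.

```python
import re

def capture_hex(line, prefix):
    """Return the integer value of the lowercase hex digits after prefix.

    Hypothetical helper standing in for a %x capture.
    """
    match = re.search(re.escape(prefix) + r"([0-9a-f]+)", line)
    if match is None:
        raise ValueError("no hex value found")
    return int(match.group(1), 16)

variables = {}
# Roughly what 0x[[#BAR, %x:]] would do on a made-up input line.
variables["BAR"] = capture_hex("load at 0x1f", "0x")
# Roughly what [[# FOO:BAR+12]] would do: 12 is a decimal immediate.
variables["FOO"] = variables["BAR"] + 12
```

Here BAR ends up as 31 (0x1f) and FOO as 43; the format specifier only
affects how text is converted to or from the integer, not the arithmetic.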
> >> It seems this syntax would suit all your current uses (albeit with
> >> some rewriting necessary), did I miss something?
> >>
> >
> > Hi Thomas,
> >
> > That would indeed work fine for me and it would be easy to update our
> > tests with a few regex replaces.
> >
> > I think I prefer the [[# %FMTSPEC, EXPR]] syntax since that would also
> > make it possible to have commas in the expression part. This might be
> > useful if we allow function-call like expressions such as
> > [[# %X, pow(10, FOO) + 20]].
> >
> >
> > Alex
> >
> >
> >
> >>
> >> Best regards,
> >>
> >> Thomas
> >>
> >> On Tue, 17 Jul 2018 at 21:59, Alexander Richardson
> >> <arichardson.kde at gmail.com> wrote:
> >> >
> >> >
> >> >
> >> > On Tue, 17 Jul 2018 at 10:02 Thomas Preudhomme via llvm-dev
> >> > <llvm-dev at lists.llvm.org> wrote:
> >> >>
> >> >> To be clear, I do not intend to add support for hex specifier in the
> >> >> current patch, I just want to make sure the syntax we choose is going
> >> >> to allow it later. My immediate use case is decimal integers and I
> >> >> intend to write the code so that it's easy to extend to more types of
> >> >> numeric variables and expressions later. This way we'll only add
> >> >> specifiers that are actually required by actual testcases.
> >> >>
> >> >
> >> > I also added FileCheck expressions to our fork of LLVM in order to
> >> > allow testing both 128-bit and 256-bit versions of our CHERI ISA in a
> >> > single test case [1].
> >> > I used [[@EXPR foo * 2 + 1]] for FileCheck expressions [2]. I'm not
> >> > particularly happy with this syntax since it is quite verbose (but then
> >> > again we don't need it that often so it doesn't really matter). It also
> >> > doesn't allow saving the expression result so it needs to be repeated
> >> > everywhere. I could probably use [[@EXPR:OUTVAR INVAR + 42]] but I
> >> > haven't really had the need for that yet.
> >> >
> >> > We currently need the following two features:
> >> >
> >> > - Simple arithmetic with multiple operations. Example:
> >> > `cld $gp, $zero, [[@EXPR 2 * $CAP_SIZE - 8]]($c11)`
> >> >
> >> > - Conversion to hex (upper and lower case since not all tools are
> >> > consistent here) and to decimal.
> >> > Example: // READOBJ-NEXT: 0x50 R_MIPS_64/R_MIPS_NONE/R_MIPS_NONE .data 0x[[@EXPR hex($CAP_SIZE * 2)]]
> >> >
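
For reference, this is what the two example expressions above evaluate to
for the two capability sizes mentioned in [1] (16 and 32). A small Python
sketch; the function name expected_operands is made up, and it assumes that
hex() in the fork yields lowercase digits without a "0x" prefix, since the
check line already spells out the "0x".

```python
def expected_operands(cap_size):
    """Evaluate the two example expressions for a given $CAP_SIZE.

    Hypothetical helper for illustration only.
    """
    offset = 2 * cap_size - 8              # [[@EXPR 2 * $CAP_SIZE - 8]]
    hex_size = format(cap_size * 2, "x")   # [[@EXPR hex($CAP_SIZE * 2)]]
    return offset, hex_size

for cap_size in (16, 32):
    offset, hex_size = expected_operands(cap_size)
    print(f"CAP_SIZE={cap_size}: cld offset {offset}, .data 0x{hex_size}")
```

With CAP_SIZE=16 the checks would expect an offset of 24 and 0x20; with
CAP_SIZE=32 they would expect 56 and 0x40.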
> >> > Alex
> >> >
> >> > [1] For most test cases the simple -DVAR=value flag in FileCheck is
> >> > good enough: we have a %cheri_FileCheck lit substitution that expands
> >> > to `FileCheck '-D$CAP_SIZE=16/32'`. This works for most IR level tests
> >> > since usually the only thing that is different is "align 16" vs
> >> > "align 32". However, when checking the assembly output or linker
> >> > addresses we often need something more complex.
> >> >
> >> > [2] A test case showing all the currently supported expressions can
> >> > be found here:
> >> > <https://github.com/CTSRD-CHERI/llvm/blob/master/test/FileCheck/expressions.txt>
> >> >
> >> >
> >> >>
> >> >> On Mon, 16 Jul 2018 at 18:39, <paul.robinson at sony.com> wrote:
> >> >> >
> >> >> >
> >> >> >
> >> >> > > -----Original Message-----
> >> >> > > From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
> >> >> > > Thomas Preudhomme via llvm-dev
> >> >> > > Sent: Monday, July 16, 2018 6:24 AM
> >> >> > > To: jh7370.2008 at my.bristol.ac.uk
> >> >> > > Cc: llvm-dev at lists.llvm.org
> >> >> > > Subject: Re: [llvm-dev] Syntax for FileCheck numeric variables and
> >> >> > > expressions
> >> >> > >
> >> >> > > Hi James,
> >> >> > >
> >> >> > > I like that suggestion very much but I think keeping the order of the
> >> >> > > two sides as initially proposed makes more sense. In printf/scanf the
> >> >> > > string is first because the primary use of these functions is to do
> >> >> > > I/O and so you first specify what you are going to output/input and
> >> >> > > then where to capture variables. The primary objective of FileCheck
> >> >> > > variables and expressions is to capture/print them, the specifier is
> >> >> > > an addon to allow some conversion. Does it make sense?
> >> >> >
> >> >> > My immediate reaction is that I'd rather not have FileCheck get into
> >> >> > the business of handling printf specifiers.  OTOH, while LLVM tools
> >> >> > do typically print lowercase hex, that's not guaranteed, and looking
> >> >> > at the output of other tools can be useful too.  So, a way to specify
> >> >> > the case for a hex conversion seems worthwhile.
> >> >> >
> >> >> > I had also been thinking in terms of the trailing colon to distinguish
> >> >> > definition from use, as James suggested, that's sort-of consistent
> >> >> > with the current syntax.
> >> >> >
> >> >> > This is starting to make parsing the insides of [[]] much more involved,
> >> >> > so you'll want to pay attention to making that code well-structured and
> >> >> > readable.
> >> >> > --paulr
> >> >> >
> >> >> > >
> >> >> > > In the interest of speeding things up I plan to start implementing
> >> >> > > this proposal starting tomorrow unless someone gives some more
> >> >> > > feedback.
> >> >> > >
> >> >> > > Best regards,
> >> >> > >
> >> >> > > Thomas
> >> >> > >
> >> >> > > On Fri, 13 Jul 2018 at 15:51, James Henderson
> >> >> > > <jh7370.2008 at my.bristol.ac.uk> wrote:
> >> >> > > >
> >> >> > > > Hi Thomas,
> >> >> > > >
> >> >> > > > In general, I think this is a good proposal. However, I don't think
> >> >> > > > that using ">" or "<" to specify base (at least alone) is a good idea,
> >> >> > > > as it might clash with future ideas to do comparisons etc. I also
> >> >> > > > think it would be nice to have the syntax consistent between
> >> >> > > > definition and use. My first thought on a reasonable alternative was
> >> >> > > > to use commas to separate the two parts, so something like:
> >> >> > > >
> >> >> > > > [[# VAR, 16:]] to capture a hexadecimal number (where the spaces are
> >> >> > > > optional). [[# VAR, 16]] to use a variable, converted to a hexadecimal
> >> >> > > > string. In both cases, the base component is optional, and defaults to
> >> >> > > > decimal.
> >> >> > > >
> >> >> > > > This led me to think that it might be better to use something similar
> >> >> > > > to printf style for the latter half, so to capture a hexadecimal number
> >> >> > > > with a leading "0x" would be: "0x[[# VAR, %x:]]" and to use it would be
> >> >> > > > "0x[[# VAR, %x]]". Indeed, that would allow straightforward conversions
> >> >> > > > between formats, so say you defined it by capturing a decimal integer
> >> >> > > > and using it to match a hexadecimal in upper case, with leading 0x and
> >> >> > > > 8 digits following the 0x:
> >> >> > > >
> >> >> > > > CHECK: [[# VAR, %d:]] # Defines
> >> >> > > > CHECK: 0x[[# VAR + 1, %8X]] # Uses
> >> >> > > >
> >> >> > > > Of course, if we go down that route, it would probably make more sense
> >> >> > > > to reverse the two sides (e.g. to become "[[# %d, VAR:]]" to capture a
> >> >> > > > decimal and "[[# %8X, VAR + 1]]" to use it).
> >> >> > > >
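
As a quick illustration of the decimal-to-hex conversion described above,
here is a Python sketch using printf-style formatting. The captured text
"255" is a made-up input. Note that with C printf semantics (which Python's
% operator follows here) %8X pads with spaces, so getting eight digits after
the 0x would need %08X; whether FileCheck would adopt exactly these
semantics is not settled in this thread.

```python
# Value that [[# VAR, %d:]] would capture from the made-up text "255".
captured = int("255")

# printf-style width 8, upper-case hex: padded with spaces.
space_padded = "0x%8X" % (captured + 1)

# Width 8 with the 0 flag: padded with zeros, i.e. 8 hex digits.
zero_padded = "0x%08X" % (captured + 1)

print(repr(space_padded))  # '0x     100'
print(repr(zero_padded))   # '0x00000100'
```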
> >> >> > > > Regards,
> >> >> > > >
> >> >> > > > James
> >> >> > > >
> >> >> > > > On 12 July 2018 at 15:34, Thomas Preudhomme via llvm-dev
> >> >> > > > <llvm-dev at lists.llvm.org> wrote:
> >> >> > > >>
> >> >> > > >> Hi all,
> >> >> > > >>
> >> >> > > >> I've written a patch to extend FileCheck to support matching
> >> >> > > >> arithmetic expressions involving variables [1] (eg. to match REG+1
> >> >> > > >> where REG is a variable with a numeric value). It was suggested to me
> >> >> > > >> in the review to introduce the concept of numeric variables and to
> >> >> > > >> allow for specifying the base the values are written in.
> >> >> > > >>
> >> >> > > >> [1] https://reviews.llvm.org/D49084
> >> >> > > >>
> >> >> > > >> I think the syntax should satisfy the below requirements:
> >> >> > > >>
> >> >> > > >> * based off the [[]] construct since anything else might overload an
> >> >> > > >> existing valid syntax (eg. $$ is supposed to match literally now)
> >> >> > > >> * consistent with syntax for expressions using @LINE
> >> >> > > >> * consistent with using ':' to define regular variables
> >> >> > > >> * allows specifying the base of the number a numeric variable is
> >> >> > > >> being set to
> >> >> > > >> * allows specifying the base of the result of the numeric expression
> >> >> > > >>
> >> >> > > >> I've come up with the following syntax for which I'd like feedback:
> >> >> > > >>
> >> >> > > >> Numeric variable definition: [[#X<base:]] (eg. [[#ADDR<16:]]) where X
> >> >> > > >> is the numeric variable being defined and <base is optional, in which
> >> >> > > >> case base defaults to 10
> >> >> > > >> Numeric variable use: [[#X>base]] (eg. [[#ADDR>2]]) where >base is
> >> >> > > >> optional, in which case base defaults to 10
> >> >> > > >> Numeric expression: [[exp>base]] (eg. [[#ADDR+2>16]]) where expression
> >> >> > > >> must contain at least one numeric variable
> >> >> > > >>
> >> >> > > >>
> >> >> > > >> I'm not a big fan of the > for the output base being inside the
> >> >> > > >> expression but [[exp]]>base would match >base literally.
> >> >> > > >>
> >> >> > > >> Any suggestions / opinions?
> >> >> > > >>
> >> >> > > >> Best regards,
> >> >> > > >>
> >> >> > > >> Thomas
> >> >> > > >> _______________________________________________
> >> >> > > >> LLVM Developers mailing list
> >> >> > > >> llvm-dev at lists.llvm.org
> >> >> > > >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >> >> > > >
> >> >> > > >
>