[llvm-commits] [llvm] r123990 - in /llvm/trunk: include/llvm/ADT/Triple.h lib/Support/Triple.cpp unittests/ADT/TripleTest.cpp

Sun Jan 23 05:21:58 PST 2011

Hi Renato,

On 22/01/11 12:41, Renato Golin wrote:
> On 21/01/11 20:53, Duncan Sands wrote:
>> if there is no vendor then you could just output nothing, resulting
>> in triples like eg x86_64--linux-gnu
>> However if triples with "none" are floating around in the real world then
>> this is OK.
>
> Hi Duncan,
>
> Indeed, I completely agree with you, but the way gcc computes the
> triples for ARM is to use the "none" when nothing is in there. So, if
> the Triples class reports "arm--eabi", it'll never find the
> "arm--eabi-gcc" in the path...

I don't understand... what does it mean to find a triple "in the path"?
Anyway, the goal of the LLVM triple class is to provide useful information
to LLVM, not to be a cure-all for triple problems.  LLVM needs to be able
to reliably extract information like the architecture and the O/S because
it makes use of this information.  If LLVM isn't going to use something then
there is no point in reasoning about it.  The basic problem is that
people like to write their triples in any old order, eg gnu-linux-x86_64.
The normalize method solves this problem by permuting triple components into
their correct positions.  It does this in a very simple way: if the first
component does not parse as a known architecture but some other component
does, then it moves that component into the first position, and likewise for
the vendor, O/S and environment positions.  Components that don't parse as a
valid architecture, vendor etc are just left alone.  So
given the above example "gnu-linux-x86_64" it recognizes "x86_64" as a valid
architecture and "linux" as a valid O/S, while it doesn't recognize "gnu".
It moves the architecture to position 1, the O/S to position 3 and it leaves
the unrecognised "gnu" component alone, giving "x86_64-gnu-linux".  Note that
in a sense this is wrong because "gnu" should come at the end rather than as
the vendor component.  But it doesn't matter because the vendor will parse as
"UnknownVendor" whether "gnu" is there or placed at the end, so moving "gnu"
would provide no added value: LLVM would act the same whether it is moved to
the end or not.
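
To make the scheme concrete, here is a rough standalone sketch of the idea.
It is not the code in lib/Support/Triple.cpp: the function name
normalizeSketch, the helper predicates and the tiny tables of known names
are all made up for illustration, and treating "eabi" as a known environment
is an assumption.  Recognised components are moved to their slot; an
unrecognised component stays where it is if that slot is free, and otherwise
slides right to the next free slot.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical recognisers with deliberately tiny tables of known names.
static bool isArch(const std::string &S) {
  static const std::set<std::string> Known = {"x86_64", "i386", "arm"};
  return Known.count(S) != 0;
}
static bool isVendor(const std::string &S) {
  static const std::set<std::string> Known = {"apple", "pc"};
  return Known.count(S) != 0;
}
static bool isOS(const std::string &S) {
  static const std::set<std::string> Known = {"linux", "darwin"};
  return Known.count(S) != 0;
}
static bool isEnvironment(const std::string &S) {
  static const std::set<std::string> Known = {"eabi"}; // assumption
  return Known.count(S) != 0;
}

// Sketch only: permute components into arch-vendor-os-environment order.
std::string normalizeSketch(const std::string &Triple) {
  // Split the triple on '-'.
  std::vector<std::string> Comps;
  std::size_t Start = 0;
  for (;;) {
    std::size_t Dash = Triple.find('-', Start);
    if (Dash == std::string::npos) {
      Comps.push_back(Triple.substr(Start));
      break;
    }
    Comps.push_back(Triple.substr(Start, Dash - Start));
    Start = Dash + 1;
  }

  using Pred = bool (*)(const std::string &);
  const std::array<Pred, 4> Kinds = {isArch, isVendor, isOS, isEnvironment};
  std::vector<std::string> Slots(4);
  std::vector<bool> SlotUsed(4, false);
  std::vector<bool> Placed(Comps.size(), false);

  // Pass 1: for each slot, take the first component that parses as that kind.
  for (std::size_t K = 0; K < Kinds.size(); ++K) {
    for (std::size_t I = 0; I < Comps.size(); ++I) {
      if (!Placed[I] && Kinds[K](Comps[I])) {
        Slots[K] = Comps[I];
        SlotUsed[K] = true;
        Placed[I] = true;
        break;
      }
    }
  }

  // Pass 2: unrecognised components keep their position if it is free,
  // otherwise they slide right to the next free slot.
  for (std::size_t I = 0; I < Comps.size(); ++I) {
    if (Placed[I] || Comps[I].empty())
      continue;
    std::size_t Pos = I;
    while (Pos < Slots.size() && SlotUsed[Pos])
      ++Pos;
    if (Pos >= Slots.size()) {
      Slots.resize(Pos + 1);
      SlotUsed.resize(Pos + 1, false);
    }
    Slots[Pos] = Comps[I];
    SlotUsed[Pos] = true;
  }

  // Drop trailing empty slots and join with '-'.
  while (!Slots.empty() && Slots.back().empty())
    Slots.pop_back();
  std::string Result;
  for (std::size_t I = 0; I < Slots.size(); ++I) {
    if (I != 0)
      Result += '-';
    Result += Slots[I];
  }
  return Result;
}

// The example from this mail, under the toy tables above.
void checkExamples() {
  assert(normalizeSketch("gnu-linux-x86_64") == "x86_64-gnu-linux");
}
```

With these toy tables "gnu" is not recognised, so it ends up in the vendor
slot exactly as described above.  Nothing downstream would care, since the
vendor parses as unknown either way.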

The advantage of this very simple scheme is that we don't need 1500 lines of
crazy logic like in GCC's config.sub.  The disadvantage is that while it is
good enough for what LLVM does with triples, front-ends like clang may need
more.  In my opinion if clang needs more it should build its own infrastructure
for providing more on top of the triple class, using the output of Normalize as
a useful starting point.

> This patch is by no means complete and was working around some
> deficiencies of the triple mechanism (gcc-wise) not to break all the
> other tests.

Not sure what you are saying here.

>> I don't think you should be doing this either. Why do you touch this? It
>> seems
>> quite wrong to me. If you want to do special case logic don't do it
>> here, do it
>> below at the point that says:
>>
>> // Special case logic goes here. At this point Arch, Vendor and OS have the
>> // correct values for the computed components.
>
> Long story short, all the kludge added to the triple was to make up for
> the fact that GCC puts environment in the place of OS (eabi in the third
> slot) and the validation of the triple is done while building it.
>
> Meaning that validation will fail if I don't change it on the fly, which
> is the same as not doing the change at all. And by putting the code in
> the special area below means I'll have to re-normalize after that.

I don't understand what you mean by "the validation of the triple is done
while building it".  Anyway, if you teach the triple class that eabi is a
valid environment then Normalize will automagically move it to the right
place, that's what Normalize is for.

>> This is a sign that something is wrong - I suspect this fix is a bandaid
>> and the problem would go away if you removed the logic trying to relate OS
>> and environment earlier in the normalization logic.
>
> The normalization logic does not know that environment went in the slot
> of the OS, and that's why you need the empty bucket to put it in, just
> in case. (I agree it's not the best solution, but it was a first step).

Sure it does, that's what Normalize is for: if it sees that the value in the
environment slot does not parse as a valid environment but some other value
("eabi") does then it will move eabi to the environment slot.

>> In short, I think you should at most just add NoOs and NoVendor in a trivial
>> way, without poking at Parse or Normalize at all. At worst any special case
>> logic (and why do you need any?) should go at the point I mentioned above.
>
> Fair enough, but for that I need the validation to be a separate pass,
> after the normalization.

What is validation?  LLVM has no notion of a valid or invalid triple.  It just
wants to know what the architecture is and what the O/S is.  If the user
provided a valid arch and O/S in the triple string then Normalize will put them
in the right positions, and the rest of LLVM will be able to use them without
any problems.  Is this something you need for clang?

> That way, I can have a failed OS and Environment, use the special region
> to do the env->os logic and only validate after that. Because, according
> to GCC, "arm-none-eabi" is a valid triple (widely used as well), but
> EABI is not the OS and "none" can mean several things...
>
> If that's ok, I can work on decouple the normalization from the
> validation and put the special section in between them two.

As you can see I didn't get where you are coming from.  By teaching Triple that
eabi is a valid environment, Normalize will automagically turn arm-none-eabi
into arm-none--eabi, and thus the rest of LLVM will see an arch of arm and an
environment of eabi which is what you want, right?  I don't see why the extra
logic you added is useful.
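
To illustrate what I mean by "the rest of LLVM will see", here is a small
standalone sketch of reading a triple purely by position.  The names
(ParsedTriple, parsePositional, fieldAt and the enums) are hypothetical and
are not the Triple class API; it also assumes "eabi" has been taught as an
environment.  The point is only that the normalized string puts eabi where
a positional reader looks for the environment, while the raw string does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, deliberately tiny enums for the sketch.
enum class ArchKind { Unknown, Arm, X86_64 };
enum class OSKind { Unknown, Linux };
enum class EnvKind { Unknown, EABI };

struct ParsedTriple {
  ArchKind Arch;
  OSKind OS;
  EnvKind Env;
};

// Return the Index-th '-' separated component, or "" if there is none.
static std::string fieldAt(const std::string &Triple, unsigned Index) {
  std::size_t Start = 0;
  for (unsigned I = 0; I < Index; ++I) {
    std::size_t Dash = Triple.find('-', Start);
    if (Dash == std::string::npos)
      return "";
    Start = Dash + 1;
  }
  std::size_t End = Triple.find('-', Start);
  if (End == std::string::npos)
    return Triple.substr(Start);
  return Triple.substr(Start, End - Start);
}

// Read arch, O/S and environment from positions 0, 2 and 3 respectively.
ParsedTriple parsePositional(const std::string &Triple) {
  ParsedTriple P = {ArchKind::Unknown, OSKind::Unknown, EnvKind::Unknown};
  const std::string Arch = fieldAt(Triple, 0);
  const std::string OS = fieldAt(Triple, 2);
  const std::string Env = fieldAt(Triple, 3);
  if (Arch == "arm")
    P.Arch = ArchKind::Arm;
  else if (Arch == "x86_64")
    P.Arch = ArchKind::X86_64;
  if (OS == "linux")
    P.OS = OSKind::Linux;
  if (Env == "eabi") // assumption: eabi is a known environment
    P.Env = EnvKind::EABI;
  return P;
}

// Normalized form: eabi is found in the environment slot.
void checkNormalizedForm() {
  ParsedTriple P = parsePositional("arm-none--eabi");
  assert(P.Arch == ArchKind::Arm);
  assert(P.OS == OSKind::Unknown);
  assert(P.Env == EnvKind::EABI);
}
```

Reading the un-normalized "arm-none-eabi" the same way finds "eabi" in the
O/S position, where it does not parse, and nothing in the environment
position, which is why the permutation matters.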

Ciao, Duncan.