[LLVMdev] make DataLayout a mandatory part of Module

Sat Feb 1 00:57:09 PST 2014

FWIW, I strongly support making this a mandatory part of the module. There
is *so* much code to delete, and this clearly simplifies the IR model.

On Fri, Jan 31, 2014 at 5:21 PM, Nick Lewycky <nlewycky at google.com> wrote:

> On 30 January 2014 02:10, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
>
>> On 30 Jan 2014, at 00:04, Nick Lewycky <nlewycky at google.com> wrote:
>>
>> > This is also what many clang tests do, where TUs get parsed using the
>> > host triple. If we keep target datalayout out of the test files and fill
>> > it in with the host's information, then our test coverage expands as our
>> > buildbot diversity grows, which is a neat property.
>>
>> Unfortunately, reproducibility suffers.  You commit a change, a test
>> fails on two buildbots but passes on all of the others and on your local
>> system.  Now what do you do?
>
>
> There are two issues here. One is what to do if we encounter a .ll/.bc with
> no target data. We're obliged to support llvm 3.0 bitcode files, so we need
> to have an answer to this question.
>
> Second is what to do in our test suite. If the answer to the first
> question is "make it use the host target data" then the second part is a
> choice either to leave the tests with no explicit layout and thereby use
> the host target, or to require that tests in the testsuite specify their
> datalayout. The tradeoff is that in one case we get more coverage across
> different machines, and in the other case we get better reproducibility,
> which is important for a regression suite or for a new user to verify that
> their build of llvm is valid.
>

Since you mentioned this I've been of two minds, but increasingly I think
following the host is the wrong behavior here.

Following the host makes lots of sense for Clang because a) there is a
reasonable portable subset of C and C++ in which we can (and should) write
test cases, and b) there is a need for Clang itself to default to the host,
so exercising that behavior seems reasonable. Still, we try to isolate many
parts of it by using the CC1 layer, etc, to make things more explicit and
more reproducible. It isn't perfect, but the tradeoff makes sense to me.

With llc we have a different tradeoff -- without following the host there
is really nothing sensible to do. I would mildly prefer that failing to
specify a target be an error but, :: shrug ::, this just doesn't seem to
matter much.

But with opt, I feel like there is a different and better tradeoff. Here,
the primary use case is driving, testing, experimenting, and regression
analysis of the optimizer. Also, there seems to be a very good default of a
completely boring or "typical" layout with minimal information. This might
correspond to "", or not, it doesn't matter to me. So I would set up a
stable and reliable default. That way test cases and other things can be
simpler and totally reproducible. We can generate RUN lines with different
triples and/or data layouts to get testing across different configurations,
but that kind of coverage doesn't really seem like the primary role we need
from the opt tool.

So to sum up, I'm increasingly a fan of what makes testing and playing with
optimizations easier. A simple, representative default that is consistent
across all platforms seems to fit that bill really nicely.
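
To make the options in this thread concrete, here is a rough sketch of the
three ways a tool could treat a module with no data layout: reject it,
follow the host, or fall back to one fixed default. It is a toy model in
Python, not LLVM code. The names MissingLayoutPolicy, FIXED_DEFAULT_LAYOUT
and resolve_layout are hypothetical, the layout strings are only
illustrative, and using "" as the fixed default is an assumption (as noted
above, the right default might or might not be "").

```python
from enum import Enum
from typing import Optional


class MissingLayoutPolicy(Enum):
    """Hypothetical policies for a module that carries no data layout."""
    ERROR = "error"                  # refuse to guess
    HOST = "host"                    # use whatever machine runs the tool
    FIXED_DEFAULT = "fixed-default"  # same answer on every machine


# Assumption for illustration only: the "boring" default is the empty string.
FIXED_DEFAULT_LAYOUT = ""


def resolve_layout(module_layout: Optional[str],
                   policy: MissingLayoutPolicy,
                   host_layout: str) -> str:
    """Return the layout string a tool would use for a module."""
    # An explicit layout in the module always wins, under every policy.
    if module_layout is not None:
        return module_layout
    if policy is MissingLayoutPolicy.ERROR:
        raise ValueError("module has no data layout")
    if policy is MissingLayoutPolicy.HOST:
        return host_layout
    return FIXED_DEFAULT_LAYOUT


if __name__ == "__main__":
    # Two illustrative "hosts" with different layouts.
    for host in ("e-p:64:64", "E-p:32:32"):
        followed = resolve_layout(None, MissingLayoutPolicy.HOST, host)
        fixed = resolve_layout(None, MissingLayoutPolicy.FIXED_DEFAULT, host)
        print(f"host={host!r} follow-host={followed!r} fixed={fixed!r}")
```

Under the host policy the result changes with the machine, which is where
the reproducibility concern comes from. Under the fixed default it never
does, and a test that cares about a particular configuration can still
state its layout explicitly.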