[PATCH] Refactor sameNoopInput (determining tail-call legitimacy)

Tue Jul 30 21:31:11 PDT 2013

Hi Tim,

I changed my mind and decided that it's probably not worth doing my
smaller point fix first, given that there are more issues with the
current code than just the truncate issue you found originally :)

Anyway, thanks for this! The code looks great from what I can tell,
although I haven't worked through all the logic yet. (I just know I
wrote pretty good tests when I touched this stuff last, so if those
all pass and your new ones do, I suspect everything works as it should
:D.)

Minor thought, though: I noticed the condition check for a truncate is
now the following:

+               TLI.isTypeLegal(EVT::getEVT(Op->getType())) &&
+               TLI.isTruncateFree(Op->getType(), I->getType())) {

I understand that this is an improvement from where it is before, but
do you think this is enough to guarantee that there are no false
positives? I can imagine a target implementing a weird calling
convention in which a type was legal AND a truncation to a smaller
type was "free", but for some reason didn't assign both types to the
same register for return value purposes or where somehow the presence
of the larger type in the argument list leads to a different
assignment of registers for subsequent arguments...in fact, I'm pretty
sure it's pretty easy create such a calling convention using tablegen
definitions without any custom code, given the flexibility of those
definitions.

Given that possibility, and since this is already being changed, maybe
a more direct hook should be implemented specifically for the purpose
of this function with the exact semantics so that there is no
possibility of false positives? Basically, it would guarantee that the
larger type is passed in the same register in the small type and
(furthermore) that the presence of the larger type does not lead to
any different register assignments for later return values, for all
calling conventions supported by the target. (This could also be more
granular check that is parameterized by the caller and callee calling
conventions, but it's probably overkill for now.)

Similarly, I think your method of comparing struct elements one "real"
type at a time and ignoring any structural differences is OK with all
calling conventions currently implemented, but is there any guarantee
that this will always be true? Maybe this should call into the target
lowering info to check that this is OK, as well? (All current targets
can just trivially implement this hook to return "true", so it won't
really make a difference in practice, but it at least provides a
mechanism for documenting what the assumptions are and opting out of
this behavior if desired.)

Stephen

On Thu, Jul 25, 2013 at 5:19 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> This change came about primarily because of two issues in the existing code. Niether of:
>
> define i64 @test1(i64 %val) {
>   %in = trunc i64 %val to i32
>   tail call i32 @ret32(i32 returned %in)
>   ret i64 %val
> }
>
> define i64 @test2(i64 %val) {
>   tail call i32 @ret32(i32 returned undef)
>   ret i32 42
> }
>
> should be tail calls, and the function sameNoopInput is responsible. The main problem is that it is completely symmetric in the "tail call" and "ret" value, but in reality different things are allowed on each side.
>
> For these cases:
> 1. Any truncation should lead to a larger value being generated by "tail call" than needed by "ret".
> 2. Undef should only be allowed as a source for ret, not as a result of the call.
>
> Along the way I noticed that a mismatch between what this function treats as a valid truncation and what the backends see can lead to invalid calls as well (see x86-32 test case).
>
> This patch refactors the code so that instead of being based primarily on values which it recurses into when necessary, it starts by inspecting the type and considers each fundamental slot that the backend will see in turn. For example, given a pathological function that returned {{}, {{}, i32, {}}, i32} we would consider each "real" i32 in turn, and ask if it passes through unchanged. This is much closer to what the backend sees as a result of ComputeValueVTs.
>
> Aside from the bug fixes, this eliminates the recursion that's going on and, I believe, makes the bulk of the code significantly easier to understand. The trade-off is the nasty iterators needed to find the real types inside a returned value.
>
>
> http://llvm-reviews.chandlerc.com/D1216
>
> Files:
>   lib/CodeGen/Analysis.cpp
>   test/CodeGen/X86/returned-trunc-tail-calls.ll
>   test/CodeGen/X86/tail-call-legality.ll
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>