[PATCH] D88823: [CodeGen][TailDuplicator] Don't duplicate blocks with INLINEASM_BR

James Y Knight via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 5 23:04:47 PDT 2020


jyknight added a comment.

> I disagree. You'll be inserting an instruction that won't be executed on the default path before the asm goto block. While this may be okay in a majority of cases, I don't suspect that it'll be good to rely on it being always okay.

Being able to handle a PHI in the indirect target block is required. We can, and _must_ be able to do so. E.g., consider two asm goto statements both having the same block as their indirect destination.

> I've explained way too many times what this patch is doing. You and Nick seem to think that this patch is bad for some reason or that it's confusing. If so, then please present one of your own.

The problem is that what the patch is doing is not consistent with the description of the problem. If "PHIs on indirect targets are bad", this patch cannot be the right fix, because in that case, we'd need to actually prevent PHIs in indirect targets, which this doesn't do. It avoids only one way to cause them to be created.

However, looking at your last example, I do believe it's the _description_ which is wrong. That is -- the bug is actually in tail duplication, not with PHIs in indirect targets in general. (phew!) In the "After Tail Duplication" dump, you can see tail duplication has introduced a COPY after the INLINEASM_BR, which gets used in the indirect block. This is obviously not OK -- the COPY should need to be before the INLINEASM_BR in order to be used in IR block bb17.i.i.i.

  INLINEASM_BR &"1:.byte 0x0f,0x1f,0x44,0x00,0\0A\09.pushsection __jump_table,  \22aw\22 \0A\09 .balign 8 \0A\09.long 1b - ., ${2:l} - . \0A\09 .quad ${0:c} + ${1:c} - .\0A\09.popsection \0A\09" [sideeffect] [mayload] [attdialect], $0:[imm], @slub_debug_enabled, $1:[imm], 0, $2:[imm], blockaddress(@get_partial_node, %ir-block.bb17.i.i.i), $3:[clobber], implicit-def early-clobber $df, $4:[clobber], implicit-def early-clobber $fpsw, $5:[clobber], implicit-def early-clobber $eflags, !11

^^ This instruction can jump to bb.19.bb17.i.i.i

  %61:gr64 = COPY %4:gr64

Therefore, this copy ^^ cannot be placed here validly.

    JMP_1 %bb.20
  
  bb.19.bb17.i.i.i (address-taken):
  ; predecessors: %bb.16, %bb.17
    successors: %bb.20(0x7c000000), %bb.21(0x04000000); %bb.20(96.88%), %bb.21(3.12%)
  
    %62:gr64 = PHI %60:gr64, %bb.16, %61:gr64, %bb.17

^^^ Here's the use.

Here's a slightly modified version of your minimized test-case which exhibits these same invalid copy insertions. (The difference is using the phi variable somewhere else besides just that one phi.) I also got rid of the undefs, just because they seem icky --

  declare void @foo()
  
  define i8* @test1(i8** %arg1, i8* %arg2) {
  bb:
    %i28.i = load i8*, i8** %arg1, align 8
    %if = icmp ne i8* %i28.i, %arg2
    br i1 %if, label %bb100, label %bb106
  
  bb100:                                            ; preds = %bb
    store i8* null, i8** %arg1, align 8
    br label %bb110
  
  bb106:                                            ; preds = %bb
    call void @foo()
    br label %bb110
  
  bb110:                                            ; preds = %bb106, %bb100
    %i10.1 = phi i8* [ %arg2, %bb106 ], [ %i28.i, %bb100 ]
    callbr void asm sideeffect "#$0 $1 $2", "i,i,X,~{dirflag},~{fpsr},~{flags}"(i32 42, i1 false, i8* blockaddress(@test1, %bb17.i.i.i))
            to label %kmem_cache_has_cpu_partial.exit [label %bb17.i.i.i]
  
  bb17.i.i.i:                                       ; preds = %bb110
    br label %kmem_cache_has_cpu_partial.exit
  
  kmem_cache_has_cpu_partial.exit:                  ; preds = %bb110
    ret i8* %i10.1
  }

In all these sorts of bugs, it's useful to ask why it's not broken with invoke (in case it is!). In this case, it looks like what prevents this bug from occurring for invoke is that EH_LABEL is marked `isNotDuplicable = 1`, checked a few lines up from your change. So, good.

I think the fix you have here is OK for now, if you fix the test and commentary.

As a follow-on, we could re-allow tail duplication, but actually place the copies in the correct location, by calling llvm::findPHICopyInsertPoint within TailDuplicator::appendCopies.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88823/new/

https://reviews.llvm.org/D88823



More information about the llvm-commits mailing list