[PATCH][X86] Fix for cannot select broadcast

Quentin Colombet qcolombet at apple.com
Fri Mar 21 15:55:57 PDT 2014


Hi Nadav,

Here is a new patch that provides the fallback patterns.

That said, the new patterns creates less efficient code that my initial proposal.

I.e., fallback patterns:
	movb	(%rdi), %al
	vmovd	%eax, %xmm1
	vpbroadcastb	%xmm1, %xmm1
Orignial patch:
	vpbroadcastb	(%rdi), %xmm1

You told me that a later pass is supposed to catch the new opportunities of load folding. Looks like it does not know how to do that across copies.
We would have to fix that at some point.

Anyhow, I think both patches are orthogonal, having fallback patterns looks like a good idea but selection directly the instructions that match what X86ISelLowering thought it was lowering seems a good approach too :).

What do you think?

As a side question, do you happen to have a test case for the existing fallback patterns?
The original commit  r155437 do not have any test case and I was wondering how a load could have new users added to it (some constant load?). Thus, I wanted to check we were not already working around the problem I mentioned.

Thanks,
-Quentin

On Mar 21, 2014, at 1:25 PM, Quentin Colombet <qcolombet at apple.com> wrote:

> Talked with Nadav off-line.
> 
> In fact X86 backend has fallback patterns for such cases. We may just miss a few of them. Looking into it.
> 
> Thanks,
> -Quentin
> 
> On Mar 21, 2014, at 12:49 PM, Quentin Colombet <qcolombet at apple.com> wrote:
> 
>> Hi Nadav and Owen,
>> 
>> Here is a patch that fixes a cannot select failure for broadcast instruction in the X86 backend <rdar://problem/16074331>.
>> Although the fix is in the x86 backend (Nadav’s domain), the root cause of the problem is in SelectionDAGISel (Owen’s domain).
>> 
>> Thanks for your review and feedbacks.
>> 
>> 
>> ** Symptom **
>> 
>> Compiling the test case in the patch with x86_64 and avx2 would produce:
>> --
>> LLVM ERROR: Cannot select: 0x7f9d02059128: v8i16 = X86ISD::VBROADCAST 0x7f9d02037178 [ORD=9] [ID=13]
>>   0x7f9d02037178: i16,ch = load 0x7f9d0205ab18, 0x7f9d02059230, 0x7f9d02036b48<LD2[%cV_R.addr](align=4)> [ORD=7] [ID=11]
>>     0x7f9d02059230: i64,ch = CopyFromReg 0x7f9d01512a30, 0x7f9d02036a40 [ORD=1] [ID=8]
>>       0x7f9d02036a40: i64 = Register %vreg0 [ID=1]
>>     0x7f9d02036b48: i64 = undef [ID=3]
>> In function: isel_crash_broadcast
>> FileCheck error: '-' is empty.
>> 
>> --
>> 
>> 
>> ** Problem **
>> 
>> The instruction supposed to match is a broadcast of a value coming from the memory, i.e., a load folded into a broadcast.
>> The load is used only once and the broadcast does not have any other input dependencies. Thus, the folding is possible modulo a proper scheduling of the broadcast node.
>> 
>> The problem here is that during the select phase of the  isel process, SelectionDAGISel performs a check to see if the load is foldable into the broadcast (HandleMergeInputChains). This test is overly conservative and fails to detect that the folding is okay.
>> 
>> 
>> ** Cause **
>> 
>> HandleMergeInputChains is not actually checking that a cycle will be created if something is folded into something else.
>> In particular, it is not checking for reachability but instead relies on heuristics to give a quick answer. This answer is conservatively correct:
>> - No: no cycle will be created.
>> - Yes: a cycle *may* be created. 
>> 
>> In this example we have something like this:
>> a = ld @a
>> |   \
>> |    st b, @b
>> C = vbroadcast a
>> 
>> Here, we are trying to fold a into C and there is a chain between the load of a and the store of b.
>> Since this chain is not part of the pattern it assumes it will create a cycle:
>>  st b, @b
>>    ^
>>     |
>>     v  
>> C = vbroadcast ld @b
>> 
>> This is wrong, because the chain is in one direction C -> st b, i.e., we can schedule C before st b.
>> I beleive this limitation is intended for three reasons:
>> 1. to avoid costly reachability checks.
>> 2. to handle only the rewriting of token factor from the not-yet-matched nodes to the matched node.
>> 3. everything that has been matched is ready to schedule.
>> 
>> Interestingly, inverting the selection order of the st and vbroadcast nodes in this dag, solves the issue, because now the store is considered as scheduled and therefore cannot create a cycle (condition #3).
>> 
>> 
>> ** Proposed Solution **
>> 
>> Fixing the root cause of the problem requires to change a lot of assumption in SelectionDAGISel, in particular #3. Moreover, #1 may be very harmful for the compile time.
>> Therefore, the proposed fix works around the problem by forcing a valid schedule for the broadcast instruction to please the select phase. In parallel, I will file a PR for the general issue.
>> 
>> Other ideas are welcome!
>> 
>> For the details, the patch is transforming this:
>> load -- chain --> someNode  -- more deps ->
>>       \
>>         +-- use --> broadcast -- more deps ->
>> 
>> Into this:
>> load                          +-- chain  --> someNode  -- more deps ->
>>       \                       /
>>         +-- use --> broadcast -- more deps ->
>> 
>> Thus, the broadcast will be schedule in place of the load. By construction, this is valid, because load has only one use and broadcast only one input dependency (the load).
>> 
>> Thanks,
>> -Quentin
>> <broadcast.patch>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140321/87f7b07d/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: broadcast_fallback.patch
Type: application/octet-stream
Size: 8218 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140321/87f7b07d/attachment.obj>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140321/87f7b07d/attachment-0001.html>


More information about the llvm-commits mailing list