[LLVMdev] Making Sense of ISel DAG Output
Dan Gohman
gohman at apple.com
Thu Oct 2 17:32:34 PDT 2008
On Oct 2, 2008, at 2:19 PM, David Greene wrote:
> On Thursday 02 October 2008 12:42, David Greene wrote:
>
>> But let's say you _could_ write such a pattern (because I can). The input
>> DAG looks like this:
>>
>> 0x391a220: <multiple use>
>> 0x391c970: v2f64 = scalar_to_vector 0x391a220 srcLineNum= 10
>> 0x391ac10: <multiple use>
>> 0x391c8b0: v2f64 = scalar_to_vector 0x391ac10 srcLineNum= 10
>> 0x3927b10: <multiple use>
>> 0x3923100: v2f64 = vector_shuffle 0x391c970, 0x391c8b0, 0x3927b10<0,2> srcLineNum= 10
>>
>> The code that gets produced looks like this:
>>
>> %reg1071<def> = MOVSD2PDrm %reg1026, 8, %reg1065, 4294967288, Mem:LD(8,8) [r66428 + 0]LD(8,8) [r78427 + 0] ; srcLine 10
>> %reg1072<def> = MOVSD2PDrm %reg1026, 8, %reg1065, 4294967288, Mem:LD(8,8) [r66428 + 0]LD(8,8) [r78427 + 0] ; srcLine 10
>> %reg1073<def> = SHUFPDrri %reg1071, %reg1072, 0 ; srcLine 10
>
> Actually, it's worse than this. I wanted to check that something else
> wasn't causing the problem, but it appears to come from isel. The full
> output for the DAG looks like this:
>
> %reg1059<def> = MOVSX64rm32 %reg1033, 1, %reg0, 4, Mem:LD(4,4) [tmp163 + 0] ; srcLine 10
> %reg1060<def> = MOVSDrm %reg1026, 8, %reg1059, 4294967288, Mem:LD(8,8) [r45154 + 0] ; srcLine 10
> %reg1061<def> = MOVSX64rm32 %reg1033, 1, %reg0, 0, Mem:LD(4,4) [iv. 161162 + 0] ; srcLine 10
> %reg1062<def> = MOVSDrm %reg1026, 8, %reg1061, 4294967288, Mem:LD(8,8) [r30158 + 0] ; srcLine 10
> %reg1063<def> = MOVSD2PDrm %reg1026, 8, %reg1059, 4294967288, Mem:LD(8,8) [r30158 + 0]LD(8,8) [r45154 + 0] ; srcLine 10
> %reg1064<def> = MOVSD2PDrm %reg1026, 8, %reg1059, 4294967288, Mem:LD(8,8) [r30158 + 0]LD(8,8) [r45154 + 0] ; srcLine 10
> %reg1065<def> = SHUFPDrri %reg1063, %reg1064, 0 ; srcLine 10
>
> Where the <bleep> are these extra dead MOVSDrms coming from? Note that the
> extra MOVSDrms at least seem to use the correct addresses.
Looking at your dump() output above, it looks like the pre-selection
loads have multiple uses, so even though you've managed to match a
larger pattern that incorporates them, they still need to exist to
satisfy some other users.
Dan
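
The point about multiple uses can be illustrated with a toy model. This is only a sketch, not LLVM code: the dictionary-based DAG, the select() helper, and the node names "ld", "s2v", and "other_user" are all hypothetical, and the only rule modeled is "fold a load into its scalar_to_vector user". It shows that a folded load disappears only when the folding user was its last user; otherwise a standalone load is still emitted.

```python
def select(dag):
    """Toy instruction selection over a hypothetical DAG.

    dag maps a node name to (opcode, [operand names]).
    Returns a dict of node name -> selected opcode for nodes that survive.
    """
    # Count how many times each node is used as an operand.
    uses = {name: 0 for name in dag}
    for _, operands in dag.values():
        for operand in operands:
            uses[operand] += 1

    selected = {}
    for name, (opcode, operands) in dag.items():
        if opcode == "scalar_to_vector" and dag[operands[0]][0] == "load":
            # The larger pattern absorbs the load: the selected instruction
            # reads memory itself, so this user no longer needs the load node.
            selected[name] = "MOVSD2PDrm"
            uses[operands[0]] -= 1
        elif opcode == "load":
            selected[name] = "MOVSDrm"
        else:
            selected[name] = opcode

    # A load goes away only if no user is left after folding.
    return {
        name: opcode
        for name, opcode in selected.items()
        if not (dag[name][0] == "load" and uses[name] == 0)
    }


# The load has a second user, so it must still be selected on its own.
shared = {
    "ld": ("load", []),
    "s2v": ("scalar_to_vector", ["ld"]),
    "other_user": ("store", ["ld"]),
}

# The load's only user is the folded pattern, so it disappears.
single = {
    "ld": ("load", []),
    "s2v": ("scalar_to_vector", ["ld"]),
}

print(select(shared))
print(select(single))
```

In the shared case the result keeps both a MOVSDrm for "ld" and a MOVSD2PDrm for "s2v", which has the same shape as the MOVSDrm / MOVSD2PDrm pairs in the output quoted above.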