[cfe-dev] Difference in generated code between variadic parameter pack and manual version

Johannes Doerfert via cfe-dev cfe-dev at lists.llvm.org
Mon Sep 21 22:00:27 PDT 2020


We should probably discuss this on llvm-dev.
Arthur already pointed out the differences caused by clang; judging from 
the generated IR, the two versions do not look "substantially different" 
at all (https://godbolt.org/z/6onnfh).

As was reported, this is a vectorizer "issue": probably some pattern 
matching gone wrong, but maybe more.
The vectorizer's optimization remarks might already be useful here.

~ Johannes


On 9/21/20 7:07 AM, Bart Samwel via cfe-dev wrote:
> On Mon, Sep 21, 2020 at 1:44 PM Arthur O'Dwyer <arthur.j.odwyer at gmail.com>
> wrote:
>
>> On Mon, Sep 21, 2020 at 7:02 AM Bart Samwel via cfe-dev <
>> cfe-dev at lists.llvm.org> wrote:
>>
>>> Hi there folks,
>>>
>>> I wonder if anybody can shed some light on this. I'm looking at a
>>> function with a parameter pack argument and one without, that should do the
>>> exact same thing.
>>>
>>> https://godbolt.org/z/Keqzcj
>>>
>>> However, the version with the parameter pack expands (at -O3
>>> -march=broadwell, on clang 10.0.1, on godbolt) into a loop per 128 bytes,
>>> plus a loop per 64 bytes, plus nonvectorized instructions to process the
>>> remaining <=63 bytes. The manual version expands to just a loop per 128
>>> bytes (256-bit vectors, unrolled 4x), and nonvectorized instructions to
>>> process the remaining <=127 bytes.
>>>
>> It's about the fold expression.
>> https://godbolt.org/z/EPETj9
>>
>> With C++17 fold-expressions, (args | ...) doesn't mean (arg1 | arg2 |
>> arg3); it means (arg1 | (arg2 | arg3)).  So with the right-fold you wrote,
>> you're telling the compiler to OR the values together "right-to-left",
>> whereas the non-template version does it "left-to-right": ((arg1 | arg2) |
>> arg3). And apparently this makes some huge difference to the codegen (which
>> is still mysterious to me, but out of my depth).
>>
> That is just plain weird, and probably interesting for the codegen folks to
> look at. :) Thanks a lot for figuring this out!
>
>
>> Switch the right-fold to a left-fold and the codegen becomes identical, at
>> least to my eyes. (In the above Godbolt, put -DVARIADIC in one compiler
>> frame and nothing in the other.)
>>
>> –Arthur
>>
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
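
To make the grouping Arthur describes concrete, here is a minimal sketch, not the code behind the Godbolt links. All names in it (Traced, or_right_fold, or_left_fold, or_manual, check_same_value) are hypothetical and exist only for illustration. Traced is a helper type whose operator| records how its operands were grouped, so the expression tree produced by each spelling can be inspected directly.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: its operator| records how operands were grouped.
struct Traced {
    std::string text;
};

inline Traced operator|(const Traced& lhs, const Traced& rhs) {
    return Traced{"(" + lhs.text + " | " + rhs.text + ")"};
}

// Unary right fold: expands to (a | (b | c)).
template <typename... Ts>
auto or_right_fold(const Ts&... args) {
    return (args | ...);
}

// Unary left fold: expands to ((a | b) | c).
template <typename... Ts>
auto or_left_fold(const Ts&... args) {
    return (... | args);
}

// Hand-written version: a | b | c parses left to right as ((a | b) | c).
template <typename T>
T or_manual(const T& a, const T& b, const T& c) {
    return a | b | c;
}

// For integers all three spellings compute the same value, because
// bitwise OR is associative; only the shape of the expression differs.
inline void check_same_value() {
    assert(or_right_fold(1u, 2u, 4u) == 7u);
    assert(or_left_fold(1u, 2u, 4u) == 7u);
    assert(or_manual(1u, 2u, 4u) == 7u);
}
```

Calling or_right_fold on Traced{"a"}, Traced{"b"}, Traced{"c"} yields the text "(a | (b | c))", while or_left_fold and or_manual both yield "((a | b) | c)". To see what the vectorizer does with each shape, clang's -Rpass=loop-vectorize, -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize flags print the remarks Johannes mentions.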

