[llvm-dev] RFC: Inline expansion of memcmp vs call to standard library
Philip Reames via llvm-dev
llvm-dev at lists.llvm.org
Thu Dec 29 13:14:49 PST 2016
Improving lowering for memcmp is definitely something we should do for
all targets. Doing it in a target specific way is decidedly non-ideal.
It looks like we already have some code in SelectionDAGBuilder which
tries to optimize the lowering for the memcpy library call. I am a bit
confused by the problem you are trying to solve. Are you specifically
interested in lowering for constant lengths greater than a legal size?
(i.e. do you need the loop?)
If so, there are two approaches you might consider:
- Expand the memcmp call into the loop form in CodeGenPrep (or a similar
timed pass) where working with multiple basic blocks is much easier.
Long term, the "right place" for this type of thing is clearly
GlobalISEL, but we have a number of other such hacks in lowering today
and continuing to build off of that seems reasonable.
- Emit the non-early exit form for small constant values (a[0] == b[0]
&& a[1] == b[1] ...). Assuming your backend has handling for
efficiently lowering and chains using branches, you may very well get
the code you want.
Using the psuedo instruction here feels messy. In particular, I don't
like the fact it basically opts out of all of the combines which might
further improve the lowering.
Philip
On 12/29/2016 11:35 AM, Zaara Syeda via llvm-dev wrote:
>
> Currently on PowerPC, calls to memcmp are not expanded and are left as
> library calls. In certain conditions, expansion can improve
> performance rather than calling the library function as done for
> functions like memcpy, memmove, etc. This patch
> *(**https://reviews.llvm.org/D28163**)*is an initial implementation
> for PowerPC to expand memcmp when the size is an 8 byte multiple.
>
> The approach currently added for this expansion tries to use the
> existing infrastructure by overriding the virtual function
> EmitTargetCodeForMemcmp. This function works on the SelectionDAG, but
> the expansion requires control flow for early exit. So, instead of
> implementing the expansion within EmitTargetCodeForMemcmp, a new
> pseudo instruction is added for memcmp and a SelectionDAG node for
> this new pseudo is created in EmitTargetCodeForMemcmp. This pseudo
> instruction is then expanded during lowering in
> EmitInstrWithCustomInserter.
>
> The advantage of this approach is that it uses the existing
> infrastructure and does not impact other targets. If other targets
> would like to expand memcmp, they can also override
> EmitTargetCodeForMemcmp and create their own expansion.
>
> Another option to consider is adding a new optimization pass for this
> expansion that isn’t target specific if other targets would benefit
> from a more general infrastructure.
>
> Please provide feedback if this approach should be continued to
> implement the PowerPC specific memcmp expansions or whether the
> community is interested in devising a more general approach.
>
> Thanks,
>
> Zaara Syeda
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20161229/ec08affd/attachment.html>
More information about the llvm-dev
mailing list