[llvm-dev] RFC: Inline expansion of memcmp vs call to standard library

Philip Reames via llvm-dev llvm-dev at lists.llvm.org
Thu Dec 29 13:14:49 PST 2016


Improving lowering for memcmp is definitely something we should do for 
all targets.  Doing it in a target specific way is decidedly non-ideal.

It looks like we already have some code in SelectionDAGBuilder which 
tries to optimize the lowering for the memcmp library call. I am a bit 
confused by the problem you are trying to solve.  Are you specifically 
interested in lowering for constant lengths greater than a legal size?  
(i.e. do you need the loop?)

If so, there are two approaches you might consider:
- Expand the memcmp call into the loop form in CodeGenPrep (or a 
similarly timed pass) where working with multiple basic blocks is much 
easier.  Long term, the "right place" for this type of thing is clearly 
GlobalISel, but we have a number of other such hacks in lowering today 
and continuing to build off of that seems reasonable.
- Emit the non-early exit form for small constant values (a[0] == b[0] 
&& a[1] == b[1] ...).  Assuming your backend has handling for 
efficiently lowering AND chains using branches, you may very well get 
the code you want.
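
To make the two shapes concrete, here is a minimal C sketch of what each 
expansion computes. It is only an illustration at the source level, not 
the IR or DAG that either approach would actually emit. It assumes the 
caller only needs an equal / not-equal answer (memcmp(...) == 0) rather 
than the ordered result, and that the length is a multiple of 8. The 
names load64, equal_loop and equal16_straight are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes with no alignment assumptions.
   Compilers typically turn this fixed-size memcpy into a single load. */
static uint64_t load64(const unsigned char *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Loop form with early exit. Assumes n is a multiple of 8.
   Returns 1 if the buffers are equal, 0 otherwise. Needs a branch per
   iteration, which is why it wants multiple basic blocks. */
static int equal_loop(const void *a, const void *b, size_t n) {
    const unsigned char *pa = a, *pb = b;
    for (size_t i = 0; i < n; i += 8) {
        if (load64(pa + i) != load64(pb + i))
            return 0; /* first mismatching word ends the comparison */
    }
    return 1;
}

/* Non-early-exit form for a constant length of 16 bytes. All loads are
   done up front and the per-word results are combined with a bitwise
   AND, so the function itself contains no control flow. */
static int equal16_straight(const void *a, const void *b) {
    const unsigned char *pa = a, *pb = b;
    uint64_t a0 = load64(pa), a1 = load64(pa + 8);
    uint64_t b0 = load64(pb), b1 = load64(pb + 8);
    return (a0 == b0) & (a1 == b1);
}
```

Getting the ordered (negative / zero / positive) result would also need a 
lexicographic comparison of the first mismatching word, which this sketch 
deliberately leaves out.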

Using the pseudo instruction here feels messy.  In particular, I don't 
like the fact that it basically opts out of all of the combines which 
might further improve the lowering.

Philip


On 12/29/2016 11:35 AM, Zaara Syeda via llvm-dev wrote:
>
> Currently on PowerPC, calls to memcmp are not expanded and are left as 
> library calls. Under certain conditions, expanding the call inline can 
> perform better than calling the library function, as is already done 
> for functions like memcpy, memmove, etc. This patch 
> (https://reviews.llvm.org/D28163) is an initial implementation 
> for PowerPC to expand memcmp when the size is an 8 byte multiple.
>
> The approach currently added for this expansion tries to use the 
> existing infrastructure by overriding the virtual function 
> EmitTargetCodeForMemcmp. This function works on the SelectionDAG, but 
> the expansion requires control flow for early exit. So, instead of 
> implementing the expansion within EmitTargetCodeForMemcmp, a new 
> pseudo instruction is added for memcmp and a SelectionDAG node for 
> this new pseudo is created in EmitTargetCodeForMemcmp. This pseudo 
> instruction is then expanded during lowering in 
> EmitInstrWithCustomInserter.
>
> The advantage of this approach is that it uses the existing 
> infrastructure and does not impact other targets. If other targets 
> would like to expand memcmp, they can also override 
> EmitTargetCodeForMemcmp and create their own expansion.
>
> Another option to consider is adding a new optimization pass for this 
> expansion that isn’t target specific if other targets would benefit 
> from a more general infrastructure.
>
> Please provide feedback on whether this approach should be continued 
> to implement the PowerPC specific memcmp expansions or whether the 
> community is interested in devising a more general approach.
>
> Thanks,
>
> Zaara Syeda