[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer

David Chisnall david.chisnall at cl.cam.ac.uk
Thu Dec 4 23:46:54 PST 2014


On 3 Dec 2014, at 23:36, Robert Lougher <rob.lougher at gmail.com> wrote:

> On 2 December 2014 at 22:18, Alex Rosenberg <alexr at leftfield.org> wrote:
>> 
>> Our C library amplifies this problem by being in a dynamic library, so the
>> call has additional overhead, which for small trip counts swamps the
>> copy/set.
>> 
> 
> I can't imagine we're the only platform (now or in the future) that
> has comparatively slow library calls.  We had discussed some sort of
> platform flag (has slow library calls) but this would be too late to
> affect the loop-idiom.  However, it could affect lowering.  Following
> on from Reid's earlier idea to lower short memcpys to an inlined,
> slightly widened loop, we could expand into a guarded loop for small
> values and a call?

I think the bug is not that we recognise the loop as a memcpy; it's that we then generate an inefficient memcpy.  We do this for a variety of reasons, some of which apply elsewhere.  One issue I hit a few months ago was that the vectoriser doesn't check whether unaligned loads and stores are supported, so it will happily replace two adjacent i32 align 4 loads followed by two adjacent i32 align 4 stores with an i64 align 4 load followed by an i64 align 4 store, which more than doubles the number of instructions that the back end emits on a target without unaligned access.
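
As a rough illustration of that load/store merging (written in C rather than IR; the function names are made up for this sketch and nothing here is taken from the vectoriser itself), the two forms look something like this:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Original form: two 32-bit copies, each needing only 4-byte alignment. */
static void copy_pair_narrow(uint32_t *dst, const uint32_t *src) {
    assert(dst != NULL && src != NULL);
    dst[0] = src[0];
    dst[1] = src[1];
}

/* Merged form: one 64-bit load and one 64-bit store through pointers that
 * are still only guaranteed 4-byte alignment.  memcpy keeps the C
 * well-defined; it stands in for the "i64 align 4" accesses. */
static void copy_pair_merged(uint32_t *dst, const uint32_t *src) {
    uint64_t tmp;
    assert(dst != NULL && src != NULL);
    memcpy(&tmp, src, sizeof tmp);  /* i64 load, align 4  */
    memcpy(dst, &tmp, sizeof tmp);  /* i64 store, align 4 */
}
```

Both functions copy the same eight bytes.  The difference only shows up in code generation: if the target cannot do a 64-bit access at 4-byte alignment, the back end has to split the merged form into narrower pieces again.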

We expand memcpy and friends in several different places (in the IR in at least one place, then in SelectionDAG, and then again in the back end, as I recall - I remember playing whack-a-bug with this for a while as the lowering was differently broken for our target in each place).  In SelectionDAG, we're dealing with a single basic block, so we can't construct the loop.  In the back end we've already lost a lot of high-level type information that would make this easier.

I'd be in favour of consolidating the memcpy / memset / memmove expansion into an IR pass that would take a cost model from the target.
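
To make the shape of that concrete, here is a minimal sketch in C of the guarded expansion Robert describes, driven by a target-supplied cost model.  The struct, its fields, and the function name are all hypothetical; this is not an existing LLVM interface, just an assumption about what such a hook might carry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-target knobs (assumed for this sketch). */
struct mem_cost_model {
    size_t inline_limit;  /* copy inline at or below this many bytes */
    int slow_libcalls;    /* nonzero if calling memcpy is expensive   */
};

/* Guarded expansion: a short inline loop for small sizes on targets with
 * slow library calls, and a call to memcpy otherwise. */
static void *expand_memcpy(const struct mem_cost_model *cm,
                           void *dst, const void *src, size_t n) {
    assert(cm != NULL);
    if (cm->slow_libcalls && n <= cm->inline_limit) {
        unsigned char *d = dst;
        const unsigned char *s = src;
        for (size_t i = 0; i < n; ++i)  /* byte loop; a real pass could widen it */
            d[i] = s[i];
        return dst;
    }
    return memcpy(dst, src, n);
}
```

This only shows the control flow that the pass would emit.  Compiled as ordinary C, an optimiser may of course recognise the byte loop and turn it back into a memcpy call, which is exactly the behaviour this thread is about.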

David




