[llvm-dev] [PATCH] strlen -> strnlen optimization

Joerg Sonnenberger via llvm-dev llvm-dev at lists.llvm.org
Sun Feb 7 12:19:49 PST 2016


On Sun, Feb 07, 2016 at 01:21:29PM -0500, Michael McConville via llvm-dev wrote:
> Joerg Sonnenberger wrote:
> > On Sat, Feb 06, 2016 at 11:05:14PM -0500, Michael McConville via llvm-dev wrote:
> > > This addition converts strlen() calls to strnlen() when the result is
> > > compared to a constant. For example, the following:
> > > 
> > > strlen(s) < 5
> > > 
> > > Becomes:
> > > 
> > > strnlen(s, 5) < 5
> > > 
> > > That way, we don't have to walk through the entire string. There is the
> > > added overhead of maintaining a counter when using strnlen(), but I
> > > thought I'd start with the general case. It may make sense to only use
> > > this optimization with small constants. On the other hand, the impact of
> > > the counter may be negligible in many or most cases due to
> > > hardware-level optimizations.
> > 
> > This is an optimisation I am very vary of two two reasons:
> > (1) strnlen is only POSIX2008, so missing on many slightly older
> > systems.
> 
> That's why we check whether it exists.
> 
> glibc has had strnlen since 1996, OpenBSD since 2010, FreeBSD since
> 2009, OS X since ~2010. I expect that the vast majority of Unix-like
> systems have it. Regardless, I don't think this optimization does any
> significant harm to systems that lack strnlen.

Breaking correct code is the poster child of significant harm.

> > (2) I do believe that the counting is quite harmful to the performance.
> 
> I'll benchmark. If we limit the optimization to cases where the constant
> is (for example) < 20, a small performance hit from the counter may not
> matter.
> 
> > Additionally, I wouldn't be surprised if many systems don't consider
> > strnlen to be the fast path, so it would be even worse than using e.g.
> > memchr for this.
> 
> The strnlen implementations I've looked at don't seem much different
> from their strlen counterparts in the common libc's I looked at. For
> example, glibc has asm implementations of both for common platforms.
> FreeBSD only has asm implementations of strlen, but those are only for
> ARM and MIPS. OpenBSD doesn't use asm for either. I wouldn't be
> surprised if the extremely simple nature of the functions means that asm
> implementations aren't noticeably faster.

I'm a bit surprised by FreeBSD, since even on amd64 the trivial lowering
is almost always quite a bit slower than optimized versions...

Joerg


More information about the llvm-dev mailing list