[LLVMdev] MemoryBuffer

Thu Sep 24 18:07:14 PDT 2009

On Thu, Sep 24, 2009 at 5:38 PM, Gordon Henriksen
<gordonhenriksen at me.com> wrote:
> On 2009-09-24, at 18:56, OvermindDL1 wrote:
>
> Out of curiosity, what code in Clang is optimized by doing a
> pointer dereference then compare to 0, rather than just comparing two
> pointers directly?  Does not seem that efficient when laid out like that,
> which is why I am curious what code actually is helped by that pattern?
>
> Consider parsing an integer:
>
> // With NUL termination.
>
> while ('0' <= *p && *p <= '9')
>   n = n * 10 + (*p++ - '0');
>
> // Without.
>
> while (p != e && '0' <= *p && *p <= '9')
>   n = n * 10 + (*p++ - '0');

That does not help with binary files though (as in my case).  Why not
use a templated range interface (I could write one if you wish)?  You
would specify binary or not as a policy to the template, and the range
would test for the end internally: comparing pointers if the buffer is
not null terminated, or comparing the current character to null if it
is, as an optimization.  With a policy design it could even terminate
on any terminator you pass in, with the end test chosen at compile time
in each case.  In my experience policy-based code is at least as fast
as hand-rolled code, which in turn beats a generic run-time
implementation.  I write code like that all the time and it would be
quite simple; would you accept such code?  Off the top of my head, the
interface would change from using
getBufferStart()/getBufferEnd()/getBufferSize() to something like
getRange(), which returns a random_access_range_type<no_policy> and
would be used roughly like this (view in a monospaced font so things
line up):

typedef const char cchar;
MemoryBuffer mb;
MemoryBuffer::range_type<policy> rt(mb.getRange());

Current MemoryBuffer                       Range-based MemoryBuffer
___________________________________________________________________
cchar *c=mb.getBufferStart();              cchar *c=rt.begin();
cchar *ce=mb.getBufferEnd();               cchar *ce=rt.end();
size_t s=mb.getBufferSize();               size_t s=rt.size();
cchar  ch=*c;                              cchar  ch=*rt;
c++;                                       rt++;
++c;                                       ++rt;
c--;                                       rt--;
--c;                                       --rt;
c[n];                                      rt[n];      // for unchecked
(c+n<ce&&c+n>=mb.getBufferStart())?c[n]:0; rt.at(n);   // for checked
c==ce;                                     rt.empty(); // for binary-style end test
*c==0;                                     rt.empty(); // for text-style nul end test
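
To make the table above a bit more concrete, here is a rough sketch of
the kind of range I have in mind.  None of this is existing LLVM API:
binary_policy, nul_terminated_policy, at_end() and parse_uint() are
names I made up for illustration, and a real version would need more
members than this.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy: binary data, so the only valid end test is a
// pointer comparison.
struct binary_policy {
  static bool at_end(const char *cur, const char *end) { return cur == end; }
};

// Hypothetical policy: text known to be NUL-terminated, so the end test
// looks at the current byte and ignores the end pointer.
struct nul_terminated_policy {
  static bool at_end(const char *cur, const char *) { return *cur == 0; }
};

template <typename Policy>
class random_access_range_type {
  const char *first_, *cur_, *last_;
public:
  random_access_range_type(const char *b, const char *e)
      : first_(b), cur_(b), last_(e) { assert(b <= e); }

  const char *begin() const { return cur_; }
  const char *end() const { return last_; }
  std::size_t size() const { return static_cast<std::size_t>(last_ - cur_); }
  bool empty() const { return Policy::at_end(cur_, last_); }

  char operator*() const { return *cur_; }
  random_access_range_type &operator++() { ++cur_; return *this; }

  // Unchecked access relative to the current position.
  char operator[](std::ptrdiff_t n) const { return cur_[n]; }

  // Checked access: returns 0 when the index falls outside the buffer.
  char at(std::ptrdiff_t n) const {
    std::ptrdiff_t i = (cur_ - first_) + n;
    return (i >= 0 && i < last_ - first_) ? first_[i] : 0;
  }
};

// The integer parser from the quoted mail, written once for any policy.
template <typename Policy>
unsigned parse_uint(random_access_range_type<Policy> &r) {
  unsigned n = 0;
  while (!r.empty() && '0' <= *r && *r <= '9') {
    n = n * 10 + static_cast<unsigned>(*r - '0');
    ++r;
  }
  return n;
}
```

With nul_terminated_policy the empty() test is redundant with the digit
test, so I would expect an optimizer to drop it, though I have not
checked the generated code for this sketch.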

And so forth.  I could also use bcp to pull in just the needed parts of
the Boost.Range library and build on that (although it uses
free-standing functions for *full* generality while still using
policies), but a hand-grown range like the one above (similar to one I
have made and use) has a simpler interface.  The "rt.empty()" calls
could also become just "!rt" using the safe bool idiom; I could do that
as well.
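
For anyone who has not seen the safe bool idiom, this is roughly how it
would look on a range.  It is only a sketch: char_range, true_tag() and
count_bytes() are placeholder names, and the class is cut down to the
minimum needed to show the conversion.

```cpp
#include <cassert>

class char_range {
  const char *cur_, *last_;

  // Safe bool idiom: convert to a pointer-to-member-function instead of
  // bool, so "if (r)" and "!r" work but "r + 1" or "r == 5" do not compile.
  typedef void (char_range::*bool_type)() const;
  void true_tag() const {}

public:
  char_range(const char *b, const char *e) : cur_(b), last_(e) {
    assert(b <= e);
  }

  bool empty() const { return cur_ == last_; }

  // Non-null when there is data left, null when the range is empty.
  operator bool_type() const { return empty() ? 0 : &char_range::true_tag; }

  char operator*() const { return *cur_; }
  char_range &operator++() { ++cur_; return *this; }
};

// Walks the range using the conversion as the loop condition.
inline unsigned count_bytes(char_range r) {
  unsigned n = 0;
  for (; r; ++r)
    ++n;
  return n;
}
```

So "!r" reads the same as "r.empty()", and "while (r)" means "while
there is still data".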

There appear to be ~34 places in the LLVM sources that would require
some changes; I could do that.  It should compile to the same assembly
too, so there should be no performance cost.  It is also very easy to
add new policy types should it ever need to be extended in the future,
and such range types also let you transparently handle non-contiguous
memory.
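
As a rough illustration of the non-contiguous case, a range with the
same empty()/operator*/operator++ surface can walk several separate
chunks of memory without the caller noticing.  chunked_range and its
(start, length) chunk type are invented for this example; they are not
part of LLVM or of any existing library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical range over several disjoint chunks of memory.
class chunked_range {
public:
  typedef std::pair<const char *, std::size_t> chunk;  // (start, length)

private:
  std::vector<chunk> chunks_;
  std::size_t chunk_;   // index of the current chunk
  std::size_t offset_;  // offset inside the current chunk

  // Move past chunks that are exhausted (or were empty to begin with).
  void settle() {
    while (chunk_ < chunks_.size() && offset_ == chunks_[chunk_].second) {
      ++chunk_;
      offset_ = 0;
    }
  }

public:
  explicit chunked_range(const std::vector<chunk> &chunks)
      : chunks_(chunks), chunk_(0), offset_(0) {
    settle();
  }

  bool empty() const { return chunk_ == chunks_.size(); }

  char operator*() const {
    assert(!empty());
    return chunks_[chunk_].first[offset_];
  }

  chunked_range &operator++() {
    ++offset_;
    settle();
    return *this;
  }
};

// Same digit loop as before; it never sees the chunk boundaries.
inline unsigned parse_uint(chunked_range &r) {
  unsigned n = 0;
  while (!r.empty() && '0' <= *r && *r <= '9') {
    n = n * 10 + static_cast<unsigned>(*r - '0');
    ++r;
  }
  return n;
}
```

This version pays for a bounds check on every increment, so it is not
free the way the contiguous policies are; the point is only that the
calling code does not have to change.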