[cfe-commits] r154204 - /cfe/trunk/lib/Basic/SourceManager.cpp
benny.kra at googlemail.com
Sat Apr 7 00:42:14 PDT 2012
On 07.04.2012, at 08:57, John McCall <rjmccall at apple.com> wrote:
> On Apr 6, 2012, at 11:26 PM, Jonathan Sauer wrote:
>>> The speedup from vectorization isn't very large, as we fall back to bytewise
>>> scanning when we hit a newline. There might be a way to avoid leaving the sse
>>> loop but everything I tried didn't work out because a call to push_back
>>> clobbers xmm registers.
>> Wouldn't that indicate a codegen bug? Judging from <http://www.agner.org/optimize/calling_conventions.pdf>,
>> most calling conventions specify that at least some xmm registers are scratch registers, i.e. must be
>> saved on the stack before calling a function and restored afterwards, if they should keep their value.
>> If that doesn't happen, it seems to me to be a bug in the compiler you used to compile clang.
> When we talk about clobbering a register, we usually assume that the compiler understands that the register has been invalidated and is taking precautions. Usually those precautions are expensive, e.g. spilling the register to the stack a lot. In this case, Benjamin is saying that the performance impact of vectorization isn't as large as it could be because the compiler has to spill and rematerialize his vectors across a call that occurs at every newline.
Exactly. In this case LLVM reloads the vectors holding the '\r' and
'\n' patterns from memory on every iteration if there is a call in the
loop, which kills performance. push_back may call arbitrary functions
such as malloc, so the compiler has to assume that the %xmm registers
no longer hold their values after the call.
It might be possible to use a preallocated buffer of some size and
leave the loop only when it's full. I haven't implemented that so far for
> The only ABI I know of that makes any of the XMM registers non-scratch is Win64.
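For illustration, here is a minimal sketch of the preallocated-buffer
idea mentioned above. It is not the code from r154204. The function
name computeLineOffsets, the buffer capacity of 64, and the rule that a
line ends at "\n", "\r\n" or a lone "\r" are all assumptions made for
this example. The inner SSE2 loop writes hits into a fixed stack array
and makes no out-of-line call, so the vector is only touched when the
array is nearly full. Whether a given compiler then keeps the pattern
vectors in registers is not guaranteed.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper for illustration; not the code committed in r154204.
// Returns the offset of the first character of every line in Text.
// Assumption: a line ends at "\n", "\r\n", or a lone "\r".
std::vector<unsigned> computeLineOffsets(const std::string &Text) {
  const char *Buf = Text.data();
  const std::size_t Size = Text.size();
  std::vector<unsigned> Offsets(1, 0u); // the first line starts at offset 0
  std::size_t I = 0;

  // A '\r' directly followed by '\n' does not end the line by itself;
  // the '\n' after it does. Only called on '\r' or '\n' bytes.
  auto EndsLine = [&](std::size_t P) {
    return !(Buf[P] == '\r' && P + 1 < Size && Buf[P + 1] == '\n');
  };

#if defined(__SSE2__)
  const __m128i CRs = _mm_set1_epi8('\r');
  const __m128i NLs = _mm_set1_epi8('\n');
  const unsigned Capacity = 64; // assumed size of the preallocated buffer
  unsigned Pending[Capacity];
  unsigned NumPending = 0;

  for (;;) {
    // Vector loop: runs while a full 16-byte chunk is available and the
    // buffer can absorb the worst case of 16 hits. No push_back in here.
    while (I + 16 <= Size && NumPending + 16 <= Capacity) {
      __m128i Chunk =
          _mm_loadu_si128(reinterpret_cast<const __m128i *>(Buf + I));
      __m128i Hit = _mm_or_si128(_mm_cmpeq_epi8(Chunk, CRs),
                                 _mm_cmpeq_epi8(Chunk, NLs));
      unsigned Mask = static_cast<unsigned>(_mm_movemask_epi8(Hit));
      // Walk the set bits; bit N corresponds to byte I + N.
      for (unsigned Bit = 0; Mask != 0; ++Bit, Mask >>= 1) {
        if ((Mask & 1) && EndsLine(I + Bit))
          Pending[NumPending++] = static_cast<unsigned>(I + Bit + 1);
      }
      I += 16;
    }
    // Leave the vector loop only here: flush the buffer in one call.
    Offsets.insert(Offsets.end(), Pending, Pending + NumPending);
    NumPending = 0;
    if (I + 16 > Size)
      break; // fewer than 16 bytes left, finish bytewise below
  }
#endif

  // Bytewise tail (or the whole input when SSE2 is unavailable).
  for (; I < Size; ++I) {
    if ((Buf[I] == '\n' || Buf[I] == '\r') && EndsLine(I))
      Offsets.push_back(static_cast<unsigned>(I + 1));
  }
  return Offsets;
}
```

Calling computeLineOffsets("ab\ncd\r\nef\rgh") yields the offsets
0, 3, 7 and 10. The trade-off is an extra copy from the stack array into
the vector, in exchange for at most one potentially allocating call per
64 recorded line starts instead of one per newline.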