[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
baldrick at free.fr
Sat Nov 10 01:37:27 PST 2012
On 09/11/12 22:49, Justin Holewinski wrote:
> Test cases exist under test/CodeGen/NVPTX (name changed in May).
I've deleted the empty PTX directory.
> Now that I'm
> back at NVIDIA, I'm going to be running through the bugzilla issues (thanks
> Dmitry for the reports!). I have practically the exact same patch here in my
> queue. :)
> In this case, I would prefer ABI alignment for compatibility with the vendor
I don't really understand this argument. If the vendor compiler is aligning to
4 (say) then some globals will have address a multiple of 4, some will have
address a multiple of 8, some will have address a multiple of 16 etc, depending
on the accidents of just where in memory they happen to be placed. For
example, if you have two 4 byte globals that follow each other in memory, and
that are 4 byte aligned, then if the first one has an address that is a multiple
of 4 but not of 8, the second will have an address that is a multiple of 8. In
short lots of variables will
be 8 byte aligned by accident. If LLVM gives them all an alignment of 8, what
does that change? OK, I will now admit that there is an effect if assumptions
are being made about globals being placed next to each other: if you declare
two globals A and B immediately after each other in the IR then the LLVM
semantics doesn't guarantee that they will be laid out one immediately after
the other in memory. But that's how it happens in practice so maybe people
are (wrongly) relying on that. Bumping up the alignment to a multiple of 8
may add extra padding between A and B, causing B to not be at the position that
such naughty people are expecting.
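To make the arithmetic concrete, here is a minimal sketch of a back-to-back
layout. It is only a model: alignTo and layoutGlobals are names made up for
illustration, and it assumes globals are placed consecutively in declaration
order, which is exactly the thing LLVM does not promise.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Round `offset` up to the next multiple of `align` (must be a power of two).
inline std::uint64_t alignTo(std::uint64_t offset, std::uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0 &&
           "alignment must be a power of two");
    return (offset + align - 1) & ~(align - 1);
}

// Hypothetical layout model (an assumption, not what any real linker must do):
// place globals of the given sizes one after another starting at `base`,
// aligning each one to `align`. Returns the start address of every global.
inline std::vector<std::uint64_t> layoutGlobals(
        std::uint64_t base,
        const std::vector<std::uint64_t>& sizes,
        std::uint64_t align) {
    std::vector<std::uint64_t> addrs;
    std::uint64_t cursor = base;
    for (std::uint64_t size : sizes) {
        cursor = alignTo(cursor, align);  // may insert padding before this global
        addrs.push_back(cursor);
        cursor += size;
    }
    return addrs;
}
```

With two 4 byte globals A and B, layoutGlobals(4, {4, 4}, 4) puts A at 4 and B
at 8, so B is 8 byte aligned by accident. Starting from 0, an alignment of 4
puts B at 4, while an alignment of 8 puts B at 8: the extra 4 bytes of padding
are what would surprise code that assumes B sits right after A.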
> It should work either way, but I do need to audit the codebase and
> tie up any issues here.
The IR optimizers already bump the alignment of some globals up to the
preferred alignment, check out enforceKnownAlignment in Local.cpp (it ends
up being called from instcombine).
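Roughly, the idea can be modelled as below. This is a simplified sketch and not
LLVM's code: GlobalModel, its fields and bumpToPreferred are all made up for
illustration, and the exact conditions under which the real function refuses to
touch a global are an assumption here, so check Local.cpp for the details.

```cpp
#include <cassert>

// Hypothetical stand-in for a global variable; this is not an LLVM class.
struct GlobalModel {
    unsigned align;        // current alignment in bytes
    bool isDefinition;     // defined in this module, not merely declared
    bool mayBeOverridden;  // e.g. weak linkage: another definition may win
};

// If this module controls the global's definition, raise its alignment to
// `prefAlign` and report the new value; otherwise only the alignment that was
// already known (`knownAlign`) can be relied upon.
inline unsigned bumpToPreferred(GlobalModel& g, unsigned knownAlign,
                                unsigned prefAlign) {
    if (!g.isDefinition || g.mayBeOverridden)
        return knownAlign;   // not ours to change
    if (g.align >= prefAlign)
        return g.align;      // already aligned at least that well
    g.align = prefAlign;     // bump up to the preferred alignment
    return prefAlign;
}
```

In this model a plain 4 byte aligned definition asked for a preferred alignment
of 8 ends up with alignment 8, while one that may be overridden keeps 4.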