[LLVMdev] [NVPTX] llc -march=nvptx64 -mcpu=sm_20 generates invalid zero align for device function params
justin.holewinski at gmail.com
Fri Nov 9 13:49:35 PST 2012
Test cases exist under test/CodeGen/NVPTX (name changed in May). Now that
I'm back at NVIDIA, I'm going to be running through the bugzilla issues
(thanks Dmitry for the reports!). I have practically the exact same patch
here in my queue. :)
In this case, I would prefer ABI alignment for compatibility with the
vendor compiler. It should work either way, but I do need to audit the
codebase and tie up any issues here.
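To make the choice concrete, here is a minimal sketch of the two fallback policies discussed in the quoted thread below. It is not the NVPTX backend code: `TypeAlignment`, `Fallback`, and `emittedAlignment` are hypothetical names used only for illustration, and the 4/8 values are the ones mentioned in the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-type alignments a data layout reports.
struct TypeAlignment {
  unsigned abi;        // minimum alignment every object of the type must have
  unsigned preferred;  // alignment the target would like; never below abi
};

// Which alignment to fall back on when none was given explicitly.
enum class Fallback { ABI, Preferred };

// Returns the value to print after ".align". An explicit alignment of 0
// means "unspecified", so a type-based fallback has to be used instead.
inline unsigned emittedAlignment(unsigned explicitAlign, TypeAlignment ty,
                                 Fallback fallback) {
  if (explicitAlign != 0)
    return explicitAlign;
  return fallback == Fallback::ABI ? ty.abi : ty.preferred;
}
```

With an ABI alignment of 4 and a preferred alignment of 8, the ABI fallback yields 4 (the value attributed to the vendor compiler in the thread) and the preferred fallback yields 8. Neither policy ever emits 0.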
On Fri, Nov 9, 2012 at 5:23 AM, Duncan Sands <baldrick at free.fr> wrote:
> Hi Dmitry,
>> You're right, global variables use preferred alignment. And - yes,
>> preferred alignment in this case is bigger: 8 instead of 4. NVIDIA's
>> prop. compiler gives 4. However, since CUDA 5.0 ptx modules are
>> linkable with each other, I think alignments for externally visible
>> functions and data should all follow ABI rules.
> giving it an alignment of 8 does follow ABI rules: everything that is
> 8 byte aligned has an address that is a multiple of 4, i.e. it is also
> 4 byte aligned.
> It would be wrong to assume that external globals have the preferred
> alignment: they can only be assumed to have the ABI alignment. But
> as we are aligning this variable we can give it the alignment we like
> as long as it is at least 4.
> In short, I think using the preferred alignment is correct in this
> case.
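The argument above can be checked with a small sketch. It assumes alignments are powers of two; `satisfiesAlignment` and `isAddressAligned` are illustrative helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// True if every address aligned to `actual` bytes is also aligned to
// `required` bytes. For power-of-two alignments this holds exactly when
// `actual` is a multiple of `required`.
constexpr bool satisfiesAlignment(std::uint64_t actual, std::uint64_t required) {
  return required != 0 && actual >= required && actual % required == 0;
}

// True if `address` is a multiple of the power-of-two `alignment`.
inline bool isAddressAligned(std::uint64_t address, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  return (address & (alignment - 1)) == 0;
}

// Emitting 8 for a variable we define satisfies an ABI requirement of 4...
static_assert(satisfiesAlignment(8, 4), "8-byte alignment implies 4-byte alignment");
// ...but a consumer that only knows the ABI alignment cannot assume 8.
static_assert(!satisfiesAlignment(4, 8), "4-byte alignment does not imply 8-byte alignment");
```

This is the asymmetry in the reply: the definer of a variable may over-align it, while code referring to an external variable may rely only on the ABI alignment.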
>> Is there a guide on making tests? I have ~5 pending patches never
>> accepted simply because I'm not familiar with LLVM test system :-/
> Your test would go in test/CodeGen/PTX/. It should use FileCheck,
> take a look at test/CodeGen/X86/zext-trunc.ll for an example. As
> there are no PTX tests at all for the moment, you need to create a
> lit.local.cfg file in PTX/. Imitate the X86 one.
> Ciao, Duncan.
>> - D.
>> 2012/11/9 Duncan Sands <baldrick at free.fr>:
>>> Hi Dmitry,
>>>> I'm attaching a patch that should fix the issue mentioned above. It
>>>> simply makes the same check seen in the same file for global
>>>> variables:
>>>>   emitPTXAddressSpace(PTy->getAddressSpace(), O);
>>>>   if (GVar->getAlignment() == 0)
>>>>     O << " .align " << (int) TD->getPrefTypeAlignment(ETy);
>>>>   else
>>>>     O << " .align " << GVar->getAlignment();
>>> it's not quite the same because your patch uses the ABI alignment, while
>>> in this snippet it is the preferred alignment (which is usually the same
>>> as the ABI alignment, but may be bigger).
>>>> Could you please review and commit? Do you think it needs a test case?
>>> Yes, it needs a testcase.
>>> Ciao, Duncan.
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu