[llvm-dev] [AMDGPU] Strange results with different address spaces

Matt Arsenault via llvm-dev llvm-dev at lists.llvm.org
Fri Dec 8 11:54:28 PST 2017



> On Dec 8, 2017, at 01:51, Haidl, Michael <michael.haidl at uni-muenster.de> wrote:
> 
> Hi Matt, 
>  
> Thanks for your response. I agree that the IR should be generated with the correct AS in the first place. However, for my project this is effectively impossible. I need the same IR with everything in AS 0 for CPU execution, and again with GPU-specific address spaces to avoid the performance impact of the generic address space.
We have an optimization pass to eliminate generic accesses, so for the most part you shouldn’t have to worry about this too much. You can insert casts to flat and generally expect them to be eliminated. This is what HCC is doing now.
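
To illustrate the idea only, here is a toy C++ model of what "eliminating a cast to flat" means. It is not LLVM code: AddrSpace, Value, Access, stripFlatCasts and rewriteAccesses are hypothetical names made up for this sketch, and the real pass handles many more cases (GEPs, phis, selects) than this does. The sketch assumes each value either has a specific address space or is a flat pointer produced by casting another value.

```cpp
#include <cassert>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM classes.
enum class AddrSpace { Flat, Global, Local, Private };

struct Value {
    AddrSpace space;  // address space of the pointer this value produces
    int castSource;   // index of the value this one was cast from, or -1
};

struct Access {
    int pointer;      // index of the pointer value being loaded or stored
};

// Walk back through casts to flat until reaching either a pointer with a
// specific address space or a flat pointer with no known origin.
int stripFlatCasts(const std::vector<Value>& values, int idx) {
    while (values[idx].space == AddrSpace::Flat && values[idx].castSource >= 0)
        idx = values[idx].castSource;
    return idx;
}

// Redirect every access to the most specific pointer it can be traced to.
void rewriteAccesses(const std::vector<Value>& values,
                     std::vector<Access>& accesses) {
    for (Access& a : accesses)
        a.pointer = stripFlatCasts(values, a.pointer);
}
```

An access through a flat pointer that was cast from a global pointer ends up using the global pointer directly, while an access through a flat pointer of unknown origin is left as a flat access.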


> Doing this in the front-end means a far more intrusive change to clang, which is a route I did not want to take in the first place.
>  
> I have the IR that goes into the pass manager attached to the mail.
> The PM is set up as follows:
>  
>   llvm::TargetOptions options;
>   options.UnsafeFPMath = false;
>   options.NoInfsFPMath = false;
>   options.NoNaNsFPMath = false;
>   options.HonorSignDependentRoundingFPMathOption = false;
>   options.AllowFPOpFusion = FPOpFusion::Fast;
>  
>   Triple TheTriple = Triple(M.getTargetTriple());
>   std::string Error;
>   SmallString<128> hsaString;
>   llvm::raw_svector_ostream hsaOS(hsaString);
>   if (!_target)
>     _target = TargetRegistry::lookupTarget("amdgcn", TheTriple, Error);
>   if (!_target) {
>     throw common::generic_exception(Error);
>   }
>  
>   llvm::legacy::PassManager PM;
>   PassManagerBuilder builder;
>   builder.OptLevel = 3;
>   builder.populateModulePassManager(PM);
>  
>   _machine.reset(_target->createTargetMachine(
>       TheTriple.getTriple(), _cpu, _features, options, Reloc::Model::Static,
>       CodeModel::Model::Medium, CodeGenOpt::Aggressive));
>  
>   if (_machine->addPassesToEmitFile(PM, hsaOS,
>                                     TargetMachine::CGFT_ObjectFile, false)) {
>     throw std::logic_error(
>         "target does not support generation of this file type!\n");
>   }
>  
>   PM.run(M);
I can’t see what’s going on with this. I would look into what happens when AMDGPUTTIImpl::isSourceOfDivergence is called in your broken example. I don’t see exactly how it would happen or how it would cause this, but I’m guessing the wrong address space mapping is being used at some point.

-Matt