[PATCH] D29542: [TargetInfo] Adjust x86-32 atomic support to the CPU used
Michał Górny via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Feb 6 09:32:41 PST 2017
mgorny added inline comments.
================
Comment at: lib/Basic/Targets.cpp:1808
+ if (HostTriple.getArch() == llvm::Triple::x86)
+ HostTarget->setCPU("i586");
+
----------------
jlebar wrote:
> mgorny wrote:
> > jlebar wrote:
> > > Okay, is this still needed now?
> > Yes. I've specifically tested with it commented out, and the CPU gets initiated to generic (=no inline atomics) then.
> Yes, but is that a bug? Does that break the test?
>
> I thought the problem we were trying to solve here was that CUDA host and device builds did not define the same macros. And I thought that setCPU modified the values for MaxAtomicInlineWidth and MaxAtomicPromoteWidth. Moreover I thought that we called HostTarget->setCPU before calling this function.
>
> If all of those things are true, I don't see what problem we're solving by calling HostTarget->setCPU("i586") here.
Well, the thing is, we don't call `HostTarget->setCPU()` before this function. We just call `AllocateTarget()`, and it does not set the CPU.
Normally the CPU is set in Driver, based on `-march` etc. if provided, with fallback to platform-specific defaults. In the case of host-side CUDA build, the Driver sets x86-specific CPU. While the defaults differ per platform, for all platforms supporting CUDA it's i586+.
Now, for the target-side, the Driver creates NVPTX target, and sets NVPTX-specific CPU. The `HostTarget` instance is only created within `NVPTXTargetInfo`, and so we need to `setCPU()` explicitly. Since we can reliably assume that the host-side will be i586+, we use `i586` here.
So far this didn't matter since all atomic properties were defined statically. However, this patch changes them to adjust to the CPU used, and so if the `X8632TargetInfo` instance is allocated without an explicit `setCPU()` call, it defaults to generic x86 (= no inline atomics available) which is different from the host platform default. As a result, different macros are defined and the test fails.
https://reviews.llvm.org/D29542
More information about the cfe-commits
mailing list