[PATCH] Add support for part-word atomics for PPC
nemanja.i.ibm at gmail.com
Fri Mar 6 19:54:20 PST 2015
Comment at: lib/Target/PowerPC/PPCInstrInfo.td:1469
@@ +1468,3 @@
+ "lharx $rD, $src", IIC_LdStLWARX,
+ [(set i32:$rD, (PPClarx xoaddr:$src))]>,
> This pattern is identical to the pattern for LBARX above; can they both really be selected?
> (and the same as the LWARX pattern below)
> Maybe you need PPClarx to carry a type node operand, like the extload nodes?
Please correct any of the below if my understanding of how this works is incorrect (I do not yet fully understand how we go from IR to SDAGs to actual instructions).
This is a good point, there would be no way to disambiguate which instruction to select if a selection DAG matched this pattern. However, I don't see anything in the PPC target implementation that would produce such a DAG node, so I am not sure how useful having this pattern is at all. As far as I can see, these instructions are selected only through the Pseudo instructions and are all emitted through
And with my limited understanding of how things work, it would make sense that this can only be emitted through a custom code sequence since a load with reservation makes little sense without a subsequent store conditional (and likely the associated conditional branch). On the other hand, something in the PPC back end may decide to produce the PPClarx SD node so it would be good if we can match a likely pattern.
I understand what you mean about the SD Node carrying the type (much like all the versions of the [sign|zero] extend loads that you mention).
What I propose is the following (**please comment on validity of the proposal**):
- Provide PPClarx8, PPClarx16, PPClarx32 and PPClarx64 SD nodes (PPClarx128 to follow with implementation of quadword version)
- Provide PPCstcx8, PPCstcx16, PPCstcx32 and PPCstcx64 SD nodes (PPCstcx128 to follow)
- Modify the patterns to use the respective version
- Remove the existing PPClarx and PPCstcx SD nodes
In a subsequent patch, provide test coverage for all of the atomic instructions in the IR (currently we seem to only have testing for some of them and only for i32 and i64 types). Maybe even add another test case that contains the IR from existing tests of all of the gcc __sync_* builtins to ensure we produce expected code for all of those.
More information about the llvm-commits