[LLVMdev] RFC: Missing canonicalization in LLVM
Pete Cooper
peter_cooper at apple.com
Wed Jan 21 15:06:19 PST 2015
> On Jan 21, 2015, at 3:02 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
>
> On Wed, Jan 21, 2015 at 2:43 PM, Pete Cooper <peter_cooper at apple.com> wrote:
> The first thing that springs to mind is that I don’t trust the backend to get this right. I don’t think it will understand when an i32 load/store would have been preferable to a float one or vice versa. I have no evidence of this, but given how strongly typed tablegen is, I don’t think it can make a good choice here.
>
> I don't think tablegen is relevant to making a good choice here. This only comes up when we have a load which is only ever stored. See below, I'll come back to this after clarifying...
>
>
> So I think we probably need to teach the backend how to undo whatever canonical form we choose if it has a reason to. And the best long term solution is for tablegen to have sized load/stores, not typed ones.
>
> One (potentially expensive) way to choose the canonical form here is to look at the users of the load and see what type works best. If we load an i32, but bit cast and do an fp operation on it, then a float load was best. If we just load it then store, then in theory either type works.
>
> We actually already do this. =] I taught instcombine to do this recently. The way this works is as follows:
>
> If we find a load with a single bitcast of its value to some other type, we try to load that type instead. We rely on the fact that if there is in fact a single type that the load is used as, some other part of LLVM will fold the redundant bitcasts. I can easily handle redundant bitcasts if you like.
>
> If we find a store of a bitcasted value, we try to store the original value instead.
>
> This works really well *except* for the case when the only (transitive) users of the loaded value are themselves stores. In that case, there is no "truth" we can rely on from operational semantics. We need to just pick a consistent answer IMO. Integers seem like the right consistent answer, and this isn't the only place LLVM does this -- we also lower a small memcpy as an integer load and store.
Yeah, thinking about this more, integers do seem like the right answer. If a backend wants to do something special, then it's up to that backend to handle it. For example, x86 might load-balance issue ports by turning an i32 load/store into an SSE insert/extract, although I cannot imagine any time this would actually be a good idea.
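
To make the rule concrete, here is a rough toy model in Python of the type choice described above. It is only a sketch, not instcombine's actual code: the tuple encoding of uses, the BITS table, and the function names are all made up for illustration.

```python
# Toy model only. Each use of a loaded value is one of:
#   ("store",)                  the value is stored
#   ("op", name)                the value feeds a real operation, e.g. "fadd"
#   ("bitcast", to_type, uses)  the value is bitcast, with the bitcast's own uses
BITS = {"i32": 32, "float": 32, "i64": 64, "double": 64}  # hypothetical type table


def only_stored(uses):
    """Return True if every transitive user is a store, looking through bitcasts."""
    for use in uses:
        if use[0] == "store":
            continue
        if use[0] == "bitcast" and only_stored(use[2]):
            continue
        return False
    return True


def canonical_load_type(load_ty, uses):
    """Pick the type a load should produce under the rule discussed in this thread."""
    # No operation tells us the "true" type, so pick an integer of the same width.
    if uses and only_stored(uses):
        return f"i{BITS[load_ty]}"
    # A single bitcast user: load the bitcast's destination type directly.
    if len(uses) == 1 and uses[0][0] == "bitcast":
        return uses[0][1]
    # Otherwise the existing type is already the one the users want.
    return load_ty
```

With this model, an i32 load whose only use is a bitcast to float feeding an fadd becomes a float load, while a float load that is only ever stored becomes an i32 load.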
>
> As for the backend, I agree I don't trust them to do anything clever here, but I think you may be missing how clever they would need to be. =D The only way this matters is that, for example, you have a store-to-load forwarding unit in your CPU that only works within a register class, and thus a stored integer will fail to get forwarded in the CPU to a load of a floating point value at the same address. If LLVM is ever able to forward this in the IR, it should re-write all the types to match whatever operation we have.
>
> I don't think any backend today (or in the foreseeable future) would have such smarts. But if CPUs are actually impacted by this kind of thing (and I have no evidence that they are), we might be getting lucky some of the time.
>
> Personally, I don't think this is compelling enough to delay making a change because I have test cases where we are getting *unlucky* today, and picking a canonical form will either directly fix them or make it much easier to fix them.
Sounds good to me. Integers it is then.
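
As a small sanity check on why an integer type is safe when the value is only copied, here is a sketch in Python (not LLVM code; the helper name is made up). It "loads" four bytes as an unsigned 32-bit integer and "stores" them back, which leaves the bit pattern untouched whatever the bytes originally meant.

```python
import struct


def copy_as_i32(src: bytes) -> bytes:
    """Copy 4 bytes by reading them as a 32-bit unsigned integer and writing it back."""
    (value,) = struct.unpack("<I", src)  # the "integer load"
    return struct.pack("<I", value)      # the "integer store"


# Bytes of an ordinary float, and of a NaN carrying a payload (0x7FA00000).
float_bytes = struct.pack("<f", 1.5)
nan_bytes = bytes.fromhex("0000a07f")  # little-endian encoding of 0x7FA00000

copied_float = copy_as_i32(float_bytes)
copied_nan = copy_as_i32(nan_bytes)
```

Both copies are byte-for-byte identical to their sources, so a later float load from the destination sees exactly the value that was there before.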
Pete