[PATCH] D81373: [ARM] Basic bfloat support
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 18 08:40:45 PDT 2020
dmgreen added inline comments.
================
Comment at: llvm/lib/Target/ARM/ARMISelLowering.cpp:727
+ if (Subtarget->hasBF16())
+ addRegisterClass(MVT::bf16, &ARM::HPRRegClass);
+
----------------
stuij wrote:
> dmgreen wrote:
> > labrinea wrote:
> > > dmgreen wrote:
> > > > We only add the FP16 regclass if we have HasFullFP16. Not if we just have the vcvt instructions (HasFP16). I agree it is probably good to make bf16 a legal type if we can, but a lot of operations will not be supported. As far as I understand they are not expected to work, and there will be nothing that can promote them at the moment?
> > > >
> > > > I'm not sure how the AArch64 backend handled it, but do we need something like setAllExpand(..)? Honestly I would expect it to still break if you tried to add two bfloats in IR, but it might be more future-proof.
> > > Maybe for now it'd be better to add the bfloat type if both fullfp16 and bf16 are present. Makes sense?
> > We will need to fix bfloat + nofullfp16 anyway, and so I'm not sure it will make a lot of difference in the short term to make this fullfp16.
> >
> > My point is that there will be a lot of operations - like adding two bfloats, or extracting from a bfloat vector or concating two bfloat vectors. In the long run we are going to make sure all those things work and don't crash during legalization. At the moment I expect a lot of them to break. For this patch it is probably OK to ignore that for the moment. But it should be something we look at eventually.
> Yes, we are not supporting the full range yet, either on AArch32 or AArch64. Adding more support is a todo. We do have follow-up patches on phab that add support for extraction/insertion, concatenation, and a number of others.
>
> One nice thing is that since all the bfloat operations go through intrinsics (when using C/C++), the surface area of possible operations is greatly reduced and known. Currently on AArch64, the IR generated from those intrinsics is the input for testing the SelDag lowering (like we do for half as well).
Yeah OK. I would worry about the optimizer producing weird and wonderful variants on the input, so you may find a few more are needed. But hopefully most of that will be covered with what we have.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D81373/new/
https://reviews.llvm.org/D81373
More information about the llvm-commits
mailing list