[LLVMdev] Handling Masked Vector Operations
nrotem at apple.com
Thu May 2 09:48:55 PDT 2013
> It seems the only solution is to create an intrinsic:
> llvm_int_load_masked mask, [addr]
> But this unnecessarily shuts down optimization.
I think that using intrinsics is the right solution. I imagine that most interesting load/store optimizations happen before vectorization, so I am not sure how much we can gain by optimizing masked loads and stores.
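To make the semantics of such an intrinsic concrete, here is a scalar sketch in C of what a masked load could mean. The function name `masked_load`, the `passthru` operand, and the byte-per-lane mask are assumptions made for illustration; the thread only names `llvm_int_load_masked mask, [addr]` and does not define its operands or results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked vector load (not an LLVM API).
   A lane reads memory only when its mask bit is set; otherwise it takes
   the value from passthru. Inactive lanes never touch addr[i], so they
   cannot fault even if that address is unmapped. */
void masked_load(int32_t *dst, const int32_t *addr, const uint8_t *mask,
                 const int32_t *passthru, size_t n)
{
    assert(dst && mask && passthru);
    for (size_t i = 0; i < n; ++i) {
        /* The conditional operator evaluates addr[i] only when mask[i] != 0. */
        dst[i] = mask[i] ? addr[i] : passthru[i];
    }
}
```

Because the memory access is guarded per lane, an optimizer cannot treat this as an ordinary load, which is why it would be expressed as an intrinsic rather than a plain load followed by a select.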
> Similar problems exist with any trapping instruction (div, mod, etc.).
> It gets even worse when you consider that any floating point operation
> can trap on a signalling NaN input.
For DIV/MOD you can blend the inputs BEFORE the operation, so that the masked-off lanes hold a value that cannot trap. You can place ones or zeros in those lanes, depending on the operation.
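The blend-before-divide idea can be sketched in scalar C as follows. The function name `masked_sdiv`, the `passthru` operand, and the byte-per-lane mask are assumptions made for illustration; a real vectorizer would emit a vector select followed by a vector divide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of lane-wise signed division under a mask.
   Inactive lanes get a divisor of 1 blended in BEFORE the division, so
   the divide can run unconditionally on every lane without a
   divide-by-zero trap coming from a masked-off lane. Active lanes still
   divide by their real divisor, as in the original scalar code. */
void masked_sdiv(int32_t *dst, const int32_t *num, const int32_t *den,
                 const uint8_t *mask, const int32_t *passthru, size_t n)
{
    assert(dst && num && den && mask && passthru);
    for (size_t i = 0; i < n; ++i) {
        int32_t safe_den = mask[i] ? den[i] : 1;  /* blend the input */
        int32_t q = num[i] / safe_den;            /* unconditional divide */
        dst[i] = mask[i] ? q : passthru[i];       /* drop inactive results */
    }
}
```

With a divisor of 1 in the inactive lanes, the quotient there is just the numerator and is discarded by the final blend, so no mask needs to be attached to the divide itself.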
> So are there any ideas out there for how to efficiently handle this?
> We've talked about llvm and masks before and it's clear that there is
> strong resistance to adding masks to the IR.
Yes. I think that the consensus is that we don't need to predicate the IR itself to support MIC-like processors.