[LLVMdev] Adding masked vector load and store intrinsics

Fri Oct 24 10:22:13 PDT 2014

"Das, Dibyendu" <Dibyendu.Das at amd.com> writes:

> This looks to be a reasonable proposal. However, might native
> instructions that support such masked ld/st have a high latency? Also,
> it would be good to state some workloads where this will have a
> positive impact.

Any significant vector workload will see a giant gain from this.

The masked operations really shouldn't have any more latency than their
unmasked counterparts.  The time of the memory operation itself
dominates.
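
For illustration only, here is a minimal scalar sketch in C of the
semantics under discussion.  The names masked_load, masked_store,
scale_positive_masked and LANES, and the passthru argument, are
hypothetical and chosen for this example; they are not the proposal's
intrinsic signatures.  The point is that a lane whose mask bit is clear
never touches memory, which is what lets a loop with a conditional
store be processed a block of lanes at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { LANES = 4 }; /* assumed vector width, for illustration only */

/* Scalar model of a masked load: lane i reads src[i] only when mask[i]
   is set; otherwise it takes passthru[i] and src[i] is never read. */
static void masked_load(float *dst, const float *src, const bool *mask,
                        const float *passthru, size_t lanes)
{
    assert(dst && src && mask && passthru);
    for (size_t i = 0; i < lanes; ++i)
        dst[i] = mask[i] ? src[i] : passthru[i];
}

/* Scalar model of a masked store: lane i writes dst[i] only when
   mask[i] is set; other lanes leave memory untouched. */
static void masked_store(float *dst, const float *val, const bool *mask,
                         size_t lanes)
{
    assert(dst && val && mask);
    for (size_t i = 0; i < lanes; ++i)
        if (mask[i])
            dst[i] = val[i];
}

/* Blocked form of the loop
       for (i = 0; i < n; ++i) if (b[i] > 0) a[i] = 2 * b[i];
   Each block computes a mask and all candidate values, then issues one
   masked store instead of a branch per element. */
static void scale_positive_masked(float *a, const float *b, size_t n)
{
    for (size_t base = 0; base < n; base += LANES) {
        size_t lanes = (n - base < LANES) ? n - base : LANES;
        bool mask[LANES];
        float val[LANES];
        for (size_t i = 0; i < lanes; ++i) {
            mask[i] = b[base + i] > 0.0f;
            val[i] = 2.0f * b[base + i];
        }
        masked_store(a + base, val, mask, lanes);
    }
}
```

On hardware with native masked memory instructions, each inner block
would map to a handful of vector instructions; the scalar loops above
only model the behaviour and say nothing about actual latency.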

                            -David