gareevroman added a comment.

> I think throughput and latency of vector fma instructions are pretty constant across micro-architectures too. Can we also add them?

Sorry, that would probably require specifying them for each architecture.

https://reviews.llvm.org/D37051