[PATCH] D37579: [InstCombine] Fix PR21780 Expansion of 256 bit vector loads fails to fold into shuffles

Mon Nov 27 11:28:15 PST 2017

reames requested changes to this revision.
reames added a comment.
This revision now requires changes to proceed.

Initial round of comments focused on the metadata semantics only.  I completely skipped the instcombine changes.

You also need verifier tests.

================
Comment at: docs/LangRef.rst:4919

+'``speculation.marker``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^
----------------
This needs a name change.  

We already have dereferenceable attribute which marks arguments and return values as being *globally* deferenceable.  This is marking a memory location as being dereferenceable at a particular contextual location.

Maybe: dereferenceable_offsets?

================
Comment at: docs/LangRef.rst:4935
+    ...
+    !0 = !{i64 -1, i64 2}
+
----------------
More information on the format is needed.  Specific missing pieces:
- what do both fields mean?
- i64 or any constant?
- signed or unsigned?
- is it a list?

I'd also suggest requiring sorting for easy of access.

================
Comment at: test/Transforms/InstCombine/speculation_marker.ll:3
+
+define <4 x double> @foo(double* %ptr) {
+; CHECK-LABEL: @foo(
----------------
Just to comment: in your test case, we can prove that the loads post dominate the entry to the function and could update the argument with the existing dereferenceability attribute.  This might be an alternate approach and separately worth implementation.  

https://reviews.llvm.org/D37579