[clang] [HLSL][Doc] Document multi-argument resolution (PR #104474)

Thu Aug 15 13:56:50 PDT 2024

https://github.com/llvm-beanz updated https://github.com/llvm/llvm-project/pull/104474

>From a68223e1d0ee5e1b41ea7ec2385c9d581c901e70 Mon Sep 17 00:00:00 2001
From: Chris Bieneman <chris.bieneman at me.com>
Date: Thu, 15 Aug 2024 12:03:55 -0500
Subject: [PATCH] [HLSL][Doc] Document multi-argument resolution

This updates the expected differences document to capture the
difference in multi-argument overload resolution between Clang and DXC.

Fixes #99530
---
 clang/docs/HLSL/ExpectedDifferences.rst | 106 +++++++++++++++++++++---
 1 file changed, 94 insertions(+), 12 deletions(-)

diff --git a/clang/docs/HLSL/ExpectedDifferences.rst b/clang/docs/HLSL/ExpectedDifferences.rst
index 4782eb3cda754a..1bc29ae47aa270 100644
--- a/clang/docs/HLSL/ExpectedDifferences.rst
+++ b/clang/docs/HLSL/ExpectedDifferences.rst
@@ -71,18 +71,23 @@ behavior between Clang and DXC. Some examples include:
     uint U;
     int I;
     float X, Y, Z;
-    double3 A, B;
+    double3 R, G;
   }
 
-  void twoParams(int, int);
-  void twoParams(float, float);
+  void takesSingleDouble(double);
+  void takesSingleDouble(vector<double, 1>);
+
+  void scalarOrVector(double);
+  void scalarOrVector(vector<double, 2>);
 
   export void call() {
-    halfOrInt16(U); // DXC: Fails with call ambiguous between int16_t and uint16_t overloads
-                    // Clang: Resolves to halfOrInt16(uint16_t).
-    halfOrInt16(I); // All: Resolves to halfOrInt16(int16_t).
     half H;
+    halfOrInt16(I); // All: Resolves to halfOrInt16(int16_t).
+
   #ifndef IGNORE_ERRORS
+    halfOrInt16(U); // All: Fails with call ambiguous between int16_t and uint16_t
+                    // overloads
+
     // asfloat16 is a builtin with overloads for half, int16_t, and uint16_t.
     H = asfloat16(I); // DXC: Fails to resolve overload for int.
                       // Clang: Resolves to asfloat16(int16_t).
@@ -94,21 +99,28 @@ behavior between Clang and DXC. Some examples include:
 
     takesDoubles(X, Y, Z); // Works on all compilers
   #ifndef IGNORE_ERRORS
-    fma(X, Y, Z); // DXC: Fails to resolve no known conversion from float to double.
+    fma(X, Y, Z); // DXC: Fails to resolve no known conversion from float to
+                  //   double.
                   // Clang: Resolves to fma(double,double,double).
-  #endif
 
-    double D = dot(A, B); // DXC: Resolves to dot(double3, double3), fails DXIL Validation.
+    double D = dot(R, G); // DXC: Resolves to dot(double3, double3), fails DXIL Validation.
                           // FXC: Expands to compute double dot product with fmul/fadd
-                          // Clang: Resolves to dot(float3, float3), emits conversion warnings.
+                          // Clang: Fails to resolve as ambiguous against
+                          //   dot(half, half) or dot(float, float)
+  #endif
 
   #ifndef IGNORE_ERRORS
     tan(B); // DXC: resolves to tan(float).
             // Clang: Fails to resolve, ambiguous between integer types.
 
-    twoParams(I, X); // DXC: resolves twoParams(int, int).
-                     // Clang: Fails to resolve ambiguous conversions.
   #endif
+
+    double D;
+    takesSingleDouble(D); // All: Fails to resolve ambiguous conversions.
+    takesSingleDouble(R); // All: Fails to resolve ambiguous conversions.
+
+    scalarOrVector(D); // All: Resolves to scalarOrVector(double).
+    scalarOrVector(R); // All: Fails to resolve ambiguous conversions.
   }
 
 .. note::
@@ -119,3 +131,73 @@ behavior between Clang and DXC. Some examples include:
   diagnostic notifying the user of the conversion rather than silently altering
   precision relative to the other overloads (as FXC does) or generating code
   that will fail validation (as DXC does).
+
+Multi-Argument Overloads
+------------------------
+
+In addition to the differences in single-element conversions, Clang and DXC
+differ dramatically in multi-argument overload resolution. C++ multi-argument
+overload resolution behavior (or something very similar) is required to
+implement
+`non-member operator overloading <https://github.com/microsoft/hlsl-specs/blob/main/proposals/0008-non-member-operator-overloading.md>`_.
+
+Clang adopts the C++-inspired language from the
+`draft HLSL specification <https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf>`_,
+where an overload ``f`` is a better candidate than ``f'`` if, for every
+argument, its conversion sequence is not worse than the corresponding sequence
+for ``f'``, and for at least one argument it is better.
+
+.. code-block:: c++
+
+  cbuffer CB {
+    int I;
+    float X;
+    float4 V;
+  }
+
+  void twoParams(int, int);
+  void twoParams(float, float);
+  void threeParams(float, float, float);
+  void threeParams(float4, float4, float4);
+
+  export void call() {
+    twoParams(I, X); // DXC: resolves twoParams(int, int).
+                     // Clang: Fails to resolve ambiguous conversions.
+
+    threeParams(X, V, V); // DXC: resolves threeParams(float4, float4, float4).
+                          // Clang: Fails to resolve ambiguous conversions.
+  }
+
+In the example above, calling ``twoParams`` with mixed arguments produces the
+implicit conversion sequences { ExactMatch, FloatingIntegral } and {
+FloatingIntegral, ExactMatch }. Each candidate has an argument whose conversion
+is worse than in the other candidate, so the call is ambiguous.
+
+In the ``threeParams`` example the sequences are { ExactMatch, VectorTruncation,
+VectorTruncation } and { VectorSplat, ExactMatch, ExactMatch }. Again, each
+candidate has at least one argument with a worse conversion than in the other,
+so the call is ambiguous.
+
+.. note::
+
+  DXC's overload resolution is undocumented, so the description below is
+  gleaned from observation and a bit of reading the source.
+
+DXC's approach for determining the best overload produces an integer score value
+for each implicit conversion sequence for each argument expression. Scores for
+casts are based on a bitmask construction that is complicated to reverse
+engineer. It seems that:
+
+* Exact match is 0
+* Dimension increase is 1
+* Promotion is 2
+* Integral -> Float conversion is 4
+* Float -> Integral conversion is 8
+* Cast is 16
+
+The masks are OR'd together to produce a score for the cast.
+
+The scores of each conversion sequence are then summed to generate a score for
+the overload candidate. The overload candidate with the lowest score is the best
+candidate. If more than one overload shares the lowest score, the call is
+ambiguous.
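
To make the contrast between the two rules concrete, here is a minimal C++ sketch of both, written for illustration only. It is not code from Clang or DXC. Each candidate is modeled as a vector of per-argument integer costs where lower is better. The names `Sequence`, `isBetter`, `resolvePairwise`, and `resolveSummed` are hypothetical, and the numeric costs are assumptions: 1 for a splat follows the "Dimension increase" entry in the list above, while the cost used for vector truncation is a placeholder because the list does not give one.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// One cost per argument; lower is better. The numeric values are
// illustrative assumptions, not values taken from either compiler.
using Sequence = std::vector<int>;

// Pairwise rule: A is better than B if no argument converts worse
// and at least one argument converts strictly better.
// Assumes both sequences have the same number of arguments.
bool isBetter(const Sequence &A, const Sequence &B) {
  bool StrictlyBetter = false;
  for (std::size_t I = 0; I < A.size(); ++I) {
    if (A[I] > B[I])
      return false;
    if (A[I] < B[I])
      StrictlyBetter = true;
  }
  return StrictlyBetter;
}

// Returns the index of the candidate that is better than every other
// candidate, or -1 if no such candidate exists (ambiguous call).
int resolvePairwise(const std::vector<Sequence> &Cands) {
  for (std::size_t I = 0; I < Cands.size(); ++I) {
    bool BeatsAll = true;
    for (std::size_t J = 0; J < Cands.size(); ++J)
      if (I != J && !isBetter(Cands[I], Cands[J]))
        BeatsAll = false;
    if (BeatsAll)
      return static_cast<int>(I);
  }
  return -1;
}

// Summed-score rule: add the per-argument costs and pick the lowest
// total. Returns -1 if the lowest total is shared (ambiguous call).
int resolveSummed(const std::vector<Sequence> &Cands) {
  int Best = -1;
  int BestScore = 0;
  bool Tie = false;
  for (std::size_t I = 0; I < Cands.size(); ++I) {
    int Score = std::accumulate(Cands[I].begin(), Cands[I].end(), 0);
    if (Best == -1 || Score < BestScore) {
      Best = static_cast<int>(I);
      BestScore = Score;
      Tie = false;
    } else if (Score == BestScore) {
      Tie = true;
    }
  }
  return Tie ? -1 : Best;
}
```

Under these assumptions, modeling `threeParams(X, V, V)` as the candidates `{0, 16, 16}` (scalar overload, with 16 as the placeholder truncation cost) and `{1, 0, 0}` (`float4` overload) makes `resolvePairwise` return -1, matching the ambiguity Clang reports, while `resolveSummed` returns 1, the `float4` overload that DXC is described as choosing.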