[clang] [HIP] fix host min/max in header (PR #82956)

Mon Feb 26 14:28:48 PST 2024

================
@@ -1306,15 +1306,73 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); }
 __DEVICE__
 double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
 
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __arg1, int __arg2) {
-  return __arg1 < __arg2 ? __arg1 : __arg2;
+// Define host min/max functions.
+#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&                  \
+    !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
+
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#define DEFINE_MIN_MAX_FUNCTIONS(ret_type, type1, type2)                       \
+  inline ret_type min(const type1 __a, const type2 __b) {                      \
+    return (__a < __b) ? __a : __b;                                            \
+  }                                                                            \
+  inline ret_type max(const type1 __a, const type2 __b) {                      \
+    return (__a > __b) ? __a : __b;                                            \
+  }
+
+// Define min and max functions for same type comparisons
+DEFINE_MIN_MAX_FUNCTIONS(int, int, int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(long, long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(long long, long long, long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long,
+                         unsigned long long)
+
+// CUDA defines host min/max functions with mixed signed/unsgined integer
+// parameters where signed integers are casted to unsigned integers. However,
+// this may not be users' intention. Therefore do not define them by default
+// unless users specify -D__HIP_DEFINE_MIXED_HOST_MIN_MAX__.
----------------
Artem-B wrote:

Nit: signed integers are implicitly promoted to unsigned ones due to the integer promotion rules. Cast would imply intentional cast and we're not doing that.

I'd rephrase it a bit along the lines of:

The routines below will perform unsigned comparison, which may produce invalid results if a signed integer was passed unintentionally. We do not want it happen silently, and do not provide these overloads by default. However for compatibility with CUDA, we allow them, if explicitly requested by the user by defining `__HIP_DEFINE_MIXED_HOST_MIN_MAX__`. 


https://github.com/llvm/llvm-project/pull/82956