[clang] [HLSL] Implement output parameter (PR #101083)

Thu Aug 15 17:28:04 PDT 2024

================
@@ -4689,6 +4720,31 @@ void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
   assert(type->isReferenceType() == E->isGLValue() &&
          "reference binding to unmaterialized r-value!");
 
+  // Add writeback for HLSLOutParamExpr.
+  if (const HLSLOutArgExpr *OE = dyn_cast<HLSLOutArgExpr>(E)) {
+    LValue LV = EmitLValue(E);
+    llvm::Type *ElTy = ConvertTypeForMem(LV.getType());
+    llvm::Value *Addr, *BaseAddr;
+    if (LV.isExtVectorElt()) {
+      llvm::Constant *VecElts = LV.getExtVectorElts();
+      BaseAddr = LV.getExtVectorAddress().getBasePointer();
+      Addr = Builder.CreateGEP(
+          ElTy, BaseAddr,
+          {Builder.getInt32(0), VecElts->getAggregateElement((unsigned)0)});
+    } else // LV.getAddress() will assert if this is not a simple LValue.
+      Addr = BaseAddr = LV.getAddress().getBasePointer();
+
+    llvm::TypeSize Sz =
+        CGM.getDataLayout().getTypeAllocSize(ConvertTypeForMem(LV.getType()));
+
+    llvm::Value *LifetimeSize = EmitLifetimeStart(Sz, BaseAddr);
+
+    Address TmpAddr(Addr, ElTy, LV.getAlignment());
+    args.addWriteback(EmitLValue(OE->getBase()->IgnoreImpCasts()), TmpAddr,
----------------
rjmccall wrote:

This is double-emitting the base, right?  L-value expressions can have side-effects, so this is not okay.

What you want to do in this function is:
1. emit the original l-value,
2. create the temporary,
3. do your potentially-converted load from the l-value into the temporary if it's an `inout`, and then
4. set up the potentially-converted writeback.
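
For illustration only, the four steps above can be modeled in plain C++ outside of Clang. This is a minimal sketch under stated assumptions, not the CodeGen implementation: `LValueExpr`, `callee`, and `callWithInOut` are hypothetical names. An argument l-value is modeled as a callable that may have side effects each time it is evaluated, the parameter type is `float`, and the argument type is `int`.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical model, not Clang code: an "l-value expression" is a callable
// that yields a reference and may have side effects each time it is evaluated.
using LValueExpr = std::function<int &()>;

// Stands in for a callee declared with an `inout float` parameter.
void callee(float &param) { param += 1.5f; }

// Copy-in/copy-out for an int argument passed to an inout float parameter.
void callWithInOut(const LValueExpr &arg) {
  int &lv = arg();               // 1. emit the original l-value, exactly once
  float tmp;                     // 2. simple temporary of the parameter type
  tmp = static_cast<float>(lv);  // 3. converted load (only needed for inout)
  callee(tmp);                   //    the callee only ever sees the temporary
  lv = static_cast<int>(tmp);    // 4. converted writeback to the same l-value
}
```

Because the l-value is evaluated once and the resulting reference is reused for the writeback, an argument such as `data[idx++]` increments `idx` a single time. Evaluating `arg()` again for the writeback would be the double emission described above.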

The temporary should always be a simple temporary of the parameter type; I don't understand the logic above about an ext-vector l-value, but there's no reason you should be doing any of that anyway.  See the comment in Sema.

You don't want to have a generic `EmitHLSLOutArgExpr` method, because it's really not correct to encounter one of these things "in the wild".  Only this call-argument-specific code that sets up the writeback is going to actually generate correct code, so if we screw up IR generation somehow and encounter one of these expressions without coming through here, we want the compiler to assert (or emit a hard error), not to generate half the expression.  And that simplifies the organization here — you don't have to double-emit expressions or try to push information through the abstract `EmitLValue` function because this one place in the code is responsible for setting everything up.
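
A rough sketch of that organization, using a toy AST rather than Clang's real classes: `ExprKind`, `Expr`, `emitLValue`, and `emitCallArg` here are hypothetical stand-ins, and the strings stand in for emitted IR. The generic path refuses the out-argument node, and only the call-argument path knows how to lower it. The sketch throws an exception so the behavior is easy to check in isolation; a compiler would more likely use an assertion or a hard error.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical toy AST, not Clang's: just enough to show the dispatch shape.
enum class ExprKind { DeclRef, OutArg };

struct Expr {
  ExprKind kind;
  std::string name;  // used by DeclRef
  const Expr *base;  // used by OutArg: the wrapped argument l-value
};

// Generic l-value emission: deliberately has no lowering for OutArg.
std::string emitLValue(const Expr &e) {
  switch (e.kind) {
  case ExprKind::DeclRef:
    return "addr(" + e.name + ")";
  case ExprKind::OutArg:
    // Reaching this means IR generation took a wrong path; fail loudly.
    throw std::logic_error("out-argument expression outside a call argument");
  }
  throw std::logic_error("unknown expression kind");
}

// Call-argument emission: the one place responsible for the writeback setup.
std::string emitCallArg(const Expr &e) {
  if (e.kind == ExprKind::OutArg) {
    std::string base = emitLValue(*e.base);  // base emitted exactly once
    return "tmp with writeback to " + base;
  }
  return emitLValue(e);
}
```

With this shape, nothing needs to be threaded through the generic entry point, and a stray out-argument node fails immediately instead of producing half of the expression.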

https://github.com/llvm/llvm-project/pull/101083