[PATCH] D69878: Consolidate internal denormal flushing controls

Thu Jan 9 13:51:04 PST 2020

arsenm marked an inline comment as done.
arsenm added inline comments.

================
Comment at: llvm/docs/LangRef.rst:1837
+   are present, this overrides ``"denormal-fp-math"``. Not all targets
+   support separately setting the denormal mode per type.
+
----------------
andrew.w.kaylor wrote:
> arsenm wrote:
> > andrew.w.kaylor wrote:
> > > Can you document which targets do support the option? What happens if I try to use the option on a target where it is not supported?
> > I'm not sure where to document this, or if/how/where to diagnose it. I don't think the high level LangRef description is the right place to discuss specific target handling.
> > 
> > Currently it won't produce an error. Code checking the denormal mode will see the f32-specific mode, even if the target ultimately isn't going to respect it.
> > 
> > One problem is this potentially does require coordination with other toolchain components. For AMDGPU, the compiler can directly tell the driver what FP mode to set on each entry point, but for x86 it requires linking in crtfastmath to set the default mode bits. If another target had a similar runtime environment requirement, I don't think we can be sure the attribute is correct or not.
> There is precedent for describing target-specific behavior in LangRef. It just doesn't seem useful to say that not all targets support the attribute without saying which ones do. We should also say what is expected if a target doesn't support the attribute. It seems reasonable for the function attribute to be silently ignored.
> 
> > One problem is this potentially does require coordination with other toolchain components. For AMDGPU, the compiler can directly tell the driver what FP mode to set on each entry point, but for x86 it requires linking in crtfastmath to set the default mode bits.
> 
> This is a point I'm interested in. I don't like the current crtfastmath.o handling. It feels almost accidental when FTZ works as expected. My understanding is that we link crtfastmath.o if we find it, but if not, everything just goes about its business. The Intel compiler injects code into main() to explicitly set the FTZ/DAZ control modes. That obviously has problems too, but it's at least consistent and predictable. As I understand it, crtfastmath.o sets these modes from a static initializer, but I'm not sure anything is done to determine the order of that initializer relative to others.
> 
> How does the compiler identify entry points for AMDGPU? And does it emit code to set FTZ based on the function attribute here?
The entry points are identified by a specific calling convention; there's no real concept of main. Each kernel has an associated blob of metadata that the driver uses to set up various config registers on dispatch.

I don't think specially recognizing main in the compiler is fundamentally different from having it done in a static constructor. Either way, the mode setting is still a construct that isn't associated with any particular function.
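
For illustration, here is a minimal sketch of the static-constructor approach on x86. This is not the actual crtfastmath.o source. The names `withFlushing`, `FlushDenormalsAtStartup`, and `flushingEnabled` are hypothetical. It assumes an SSE-capable x86 target whose CPU supports the DAZ bit, and it compiles to a no-op elsewhere.

```cpp
#include <cstdint>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // _mm_getcsr / _mm_setcsr
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0
#endif

// MXCSR control bits: FTZ (flush-to-zero) is bit 15, DAZ
// (denormals-are-zero) is bit 6.
constexpr std::uint32_t kFTZ = 1u << 15;
constexpr std::uint32_t kDAZ = 1u << 6;

// Hypothetical helper: the MXCSR value with both flushing bits set.
constexpr std::uint32_t withFlushing(std::uint32_t mxcsr) {
  return mxcsr | kFTZ | kDAZ;
}

// 0x1F80 is the power-on default MXCSR value.
static_assert(withFlushing(0x1F80u) == 0x9FC0u, "FTZ and DAZ bits set");

// Hypothetical stand-in for a startup object: the constructor runs
// before main(), but its order relative to static initializers in
// other translation units is unspecified.
struct FlushDenormalsAtStartup {
  FlushDenormalsAtStartup() {
#if HAVE_MXCSR
    // Assumes the CPU supports DAZ; setting an unsupported MXCSR bit faults.
    _mm_setcsr(withFlushing(_mm_getcsr()));
#endif
  }
};
static FlushDenormalsAtStartup flushDenormalsAtStartup;

// Reports whether the calling thread currently has both bits set.
bool flushingEnabled() {
#if HAVE_MXCSR
  const std::uint32_t both = kFTZ | kDAZ;
  return (_mm_getcsr() & both) == both;
#else
  return false;  // No MXCSR on this target; nothing was changed.
#endif
}
```

Moving the `_mm_setcsr` call to the top of main() would give the same effect with a predictable order, but in both cases the mode is process startup state and isn't tied to any function carrying the attribute.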

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69878/new/

https://reviews.llvm.org/D69878