[llvm-dev] Unnecessary spill/fill issue

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Mon May 9 15:14:38 PDT 2016


----- Original Message -----
> From: "Quentin Colombet via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Jason" <thesurprises at gmail.com>
> Cc: llvm-dev at lists.llvm.org
> Sent: Monday, May 9, 2016 5:09:35 PM
> Subject: Re: [llvm-dev] Unnecessary spill/fill issue
> 
> 
> Hi Jason,
> 
> 
> I am guessing that the problem is that we do not recognize the
> sequence as rematerializable because we do not directly load
> LCPI0_212 into a ymm register.
> One way to fix that is to use a pseudo instruction that does the
> load from the constant pool into a ymm register (while defining a
> dead GPR so that the pseudo can be expanded), then teach the folding
> code how to deal with that.
> 
> 
> Another option is to make the rematerialization smarter, but that is
> more complicated :).
> 

Making rematerialization smarter, however, is certainly work that would be broadly appreciated.
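
To make the trade-off concrete, here is a toy sketch in Python. It is not LLVM code: the `Value` class, the `plan_eviction` function, and the `rematerializable` flag are all hypothetical names made up for illustration, and the model ignores instruction costs entirely. It only shows why the reported kernel ends up with 4KB of stack when the 128 constants are treated as ordinary values, and none when their defining loads can be re-executed at the use.

```python
from dataclasses import dataclass

LANES = 8            # "8-wide" vectors, as in the report
BYTES_PER_LANE = 4   # assumption: 32-bit lanes (vpaddd adds packed dwords)
YMM_BYTES = LANES * BYTES_PER_LANE  # 32, matching the "32-byte Spill" comment


@dataclass(frozen=True)
class Value:
    """Hypothetical stand-in for a value that lost its register."""
    name: str
    size_bytes: int
    rematerializable: bool  # can the defining load be redone at the use?


def plan_eviction(values):
    """Return (action per value, total stack bytes needed for spill slots)."""
    actions = {}
    stack_bytes = 0
    for v in values:
        if v.rematerializable:
            # Re-execute the def next to the use; no stack slot needed.
            actions[v.name] = "remat"
        else:
            # Store once to a stack slot, reload (or fold) at the use.
            actions[v.name] = "spill"
            stack_bytes += v.size_bytes
    return actions, stack_bytes


# The 128 constant vectors from the report, modeled both ways.
consts_spilled = [Value(f"LCPI0_{i}", YMM_BYTES, False) for i in range(128)]
consts_remat = [Value(f"LCPI0_{i}", YMM_BYTES, True) for i in range(128)]

if __name__ == "__main__":
    print(plan_eviction(consts_spilled)[1])  # stack bytes when spilling
    print(plan_eviction(consts_remat)[1])    # stack bytes when rematerializing
```

The real allocator of course has to weigh the cost of redoing the address materialization and load at each use; with a single use per constant, as in this report, that cost is paid once either way.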

 -Hal

> 
> Cheers,
> -Quentin
> 
> 
> 
> 
> On May 9, 2016, at 2:41 PM, Jason via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> 
> 
> Does anyone have any insight into this problem? Is there a way to
> minimize excessive spill/fill for this kind of scenario?
> Thanks,
> Jason
> 
> 
> 
> 
> On Fri, May 6, 2016 at 10:44 AM, Jason <thesurprises at gmail.com>
> wrote:
> 
> 
> 
> Hi, I am using MCJIT in LLVM 3.6 to JIT kernels to x86 AVX2. I've
> noticed some inefficient use of the stack around constant vectors.
> In one example, I have code that computes a series of constant
> vectors at compile time. Each vector has a single use. In the final
> asm, I see a series of spills at the top of the function that store
> all the constant vectors to the stack immediately, and then each use
> references its stack slot directly:
> 
> 
> Lots of these at top of function:
> 
> 
> 
> movabsq $.LCPI0_212, %rbx
> vmovaps (%rbx), %ymm0
> vmovaps %ymm0, 2816(%rsp) # 32-byte Spill
> 
> 
> 
> Later on, each use references the stack pointer:
> 
> 
> vpaddd 2816(%rsp), %ymm4, %ymm1 # 32-byte Folded Reload
> 
> 
> It seems the spill to stack is unnecessary. In one particularly bad
> kernel, I have 128 8-wide constant vectors, and so there is 4KB of
> stack use just for these constants. I think a better approach could
> be to load the constant vector pointers as needed:
> 
> 
> movabsq $.LCPI0_212, %rbx
> vpaddd (%rbx), %ymm4, %ymm1
> 
> 
> 
> 
> Thanks,
> Jason
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
