[llvm-dev] multiply-accumulate instruction

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Mon Sep 21 05:48:55 PDT 2015


----- Original Message -----
> From: "Chris.Dewhurst via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "James Y Knight" <jyknight at google.com>
> Cc: llvm-dev at lists.llvm.org
> Sent: Monday, September 21, 2015 2:43:30 AM
> Subject: Re: [llvm-dev] multiply-accumulate instruction
> 
> I've been looking to see if there's a way to get the instruction
> below (SMAC) emitted from a higher-level construct, but I'm starting
> to think this is unrealistic.
> 
> To do so, I'd have to tie-in two other instructions: Firstly,
> clearing the ASR18 and Y register somewhere near the start of the
> method, then copying out the value of these registers somewhere near
> the end of the method, or wherever the value needs to be used.
> 
> In addition, it would only make sense to use the construct inside a
> loop of some form, otherwise, some variation on MUL would be better.
> That would either require detecting the loop, or optimising further
> down the line to convert the above construct *into* a simple MUL.
> 
> This now feels to me to be unrealistic and likely to be prone to
> bugs.
> 
> On that basis, I'm going to go with the simple "assembler-only
> support" recommended below, unless anyone can recommend a simple way
> of achieving the above (and direct me to a suitable reference). I
> can't find anything sufficiently similar in any of the other
> processors supported by LLVM.

Can you provide an example or two (written in C is fine) showing the kinds of loops or sequences of operations you're trying to pattern-match to use this instruction? I don't know of anything that works exactly like this, but some targets do have IR-level preprocessing passes that introduce certain kinds of intrinsics (see lib/Target/PowerPC/PPCCTRLoops.cpp for an example involving loops). There may be other ways to do this as well. I wouldn't give up so easily, but I need to see some concrete examples in order to provide advice.
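
For illustration only, one candidate shape is a 16-bit dot product. This is a hypothetical sketch, not code from the original poster; the name dot16_low32 and its signature are assumptions. It keeps only the low 32 bits of the sum, which is what SMAC writes to rd; a loop that needs the full 40-bit accumulator would also have to read the low 8 bits of %y afterwards.

```c
#include <stdint.h>

/* Hypothetical example: dot product of two int16_t arrays, keeping the
 * low 32 bits of the running sum (wrap-around arithmetic). */
uint32_t dot16_low32(const int16_t *a, const int16_t *b, int n)
{
    uint32_t acc = 0;  /* would correspond to clearing %y and %asr18 */
    for (int i = 0; i < n; ++i) {
        /* 16 x 16 -> 32-bit signed product; cannot overflow int32_t */
        int32_t prod = (int32_t)a[i] * (int32_t)b[i];
        acc += (uint32_t)prod;  /* candidate for one SMAC per iteration */
    }
    return acc;  /* would correspond to the last rd, or a read of %asr18 */
}
```

Using an unsigned accumulator keeps the wrap-around well defined in C; a signed 32-bit accumulator would make overflow undefined behaviour and so would not describe the same operation.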

 -Hal

> 
> 
> Thanks for the feedback
> Chris Dewhurst
> University of Limerick.
> 
> 
> 
> From: James Y Knight [jyknight at google.com]
> Sent: 18 September 2015 16:39
> To: Chris.Dewhurst
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] multiply-accumulate instruction
> 
> 
> 
> 
> Do you only want to define assembler syntax for this, or do you need
> to be able to automatically emit it from some higher-level
> construct? I'd expect the former would be entirely sufficient, in
> which case this should be enough:
> 
> 
> let Predicates = [HasLeon3, HasLeon4], Defs = [Y, ASR18], Uses = [Y, ASR18] in
> def SMACrr : F3_1<3, 0b111110,
>                   (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2),
>                   "smac $rs1, $rs2, $rd",
>                   []>;
> 
> 
> 
> 
> If you want the latter, I'm not sure how you'd go about
> pattern-matching it, because of the unusual 40-bit accumulator input
> and output, and the 16-bit inputs, which are unusual for SPARC.
> Hopefully you don't really need that. :)
> 
> 
> On Fri, Sep 18, 2015 at 10:19 AM, Chris.Dewhurst via llvm-dev <
> llvm-dev at lists.llvm.org > wrote:
> 
> 
> 
> 
> 
> 
> I’m trying to define a multiply-accumulate instruction for the LEON
> processor, a Subtarget of the Sparc target.
> 
> 
> 
> The documentation for the processor is as follows:
> 
> 
> 
> ===
> 
> To accelerate DSP algorithms, two multiply&accumulate instructions
> are implemented: UMAC and SMAC. The UMAC performs an unsigned 16-bit
> multiply, producing a 32-bit result, and adds the result to a 40-bit
> accumulator made up by the 8 lsb bits from the %y register and the
> %asr18 register. The least significant 32 bits are also written to
> the destination register. SMAC works similarly but performs signed
> multiply and accumulate. The MAC instructions execute in one clock
> but have two clocks latency, meaning that one pipeline stall cycle
> will be inserted if the following instruction uses the destination
> register of the MAC as a source operand.
> 
> 
> 
> Assembler syntax:
> 
> smac rs1, reg_imm, rd
> 
> 
> 
> Operation:
> 
> prod[31:0] = rs1[15:0] * reg_imm[15:0]
> 
> result[39:0] = (Y[7:0] & %asr18[31:0]) + prod[31:0]
> 
> (Y[7:0] & %asr18[31:0]) = result[39:0]
> 
> rd = result[31:0]
> 
> 
> 
> %asr18 can be read and written using the rdasr and wrasr
> instructions.
> 
> ===
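
The operation described in the quoted documentation can be sketched as a small C reference model. This is a reading of the text above, not vendor code: it assumes that "&" in the documentation means bit concatenation, that the 32-bit product is sign-extended before the 40-bit add, and that the upper 24 bits of %y are left unchanged. The names mac_state and smac_model are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the accumulator state held in %y and %asr18. */
typedef struct {
    uint32_t y;      /* only bits 7:0 take part in the accumulator */
    uint32_t asr18;  /* low 32 bits of the accumulator */
} mac_state;

/* Model of "smac rs1, rs2, rd"; returns the value written to rd. */
static uint32_t smac_model(mac_state *st, uint32_t rs1, uint32_t rs2)
{
    assert(st != 0);

    /* prod[31:0] = rs1[15:0] * rs2[15:0], as a signed 16 x 16 multiply.
     * The casts to int16_t rely on two's-complement conversion. */
    int32_t prod = (int32_t)(int16_t)(rs1 & 0xFFFFu) *
                   (int32_t)(int16_t)(rs2 & 0xFFFFu);

    /* Concatenate Y[7:0] and %asr18[31:0] into a 40-bit value. */
    uint64_t acc = ((uint64_t)(st->y & 0xFFu) << 32) | st->asr18;

    /* Assumption: sign-extend the product, add, wrap to 40 bits. */
    uint64_t result = (acc + (uint64_t)(int64_t)prod) & 0xFFFFFFFFFFull;

    /* Write the 40-bit result back; upper bits of %y assumed preserved. */
    st->y = (st->y & ~0xFFu) | (uint32_t)(result >> 32);
    st->asr18 = (uint32_t)result;

    return (uint32_t)result;  /* rd = result[31:0] */
}
```

The model makes the dependency explicit: rd is always a copy of the new %asr18, and %y changes only when the sum carries or borrows across bit 31.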
> 
> 
> 
> I have the following in SparcInstrInfo to define the lowering rules
> for this instruction, but I feel that this isn’t likely to work as I
> need to somehow tie together the fact that %Y, %ASR18 and %rd are
> all related to each other in the output.
> 
> 
> 
> let Predicates = [HasLeon3, HasLeon4], Defs = [Y, ASR18], Uses = [Y, ASR18] in
> def SMACrr : F3_1<3, 0b111110,
>                   (outs IntRegs:$rd), (ins IntRegs:$rs1, IntRegs:$rs2, ASRRegs:$asr18),
>                   "smac $rs1, $rs2, $rd",
>                   [(set i32:$rd,
>                        (add i32:$asr18, (mul i32:$rs1, i32:$rs2)))]>;
> 
> 
> 
> Perhaps a well-chosen “let Constraints=” might be used here? If so,
> I’m not sure what to put in there. If not, can anyone help me work
> out how to define the lowering rules for this instruction, please?
> 
> 
> 
> Chris Dewhurst, University of Limerick.
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

