[llvm-dev] IPRA, interprocedural register allocation, question

vivek pandya via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 14 04:19:09 PDT 2016


On Thu, Jul 14, 2016 at 4:01 PM, zan jyu Wong <zyfwong at gmail.com> wrote:

> Vivek,
>
> First of all, I'd like to thank you for your hard work. Your work really
> helps me a lot.
>

Thanks, Zan Jyu Wong. I am glad that this helped, but I would like to share
the credit with my mentors and the LLVM community, who have helped me with this.

I am adding the llvm-dev list here so that, if my reasoning is wrong, someone
can help both of us understand it correctly.

> But I have a question about the regmask collector.
> In lib/CodeGen/RegUsageInfoCollector.cpp, there's a for-loop that iterates
> over all registers to check
> if they are modified:
>   for (unsigned PReg = 1, PRegE = TRI->getNumRegs(); PReg < PRegE; ++PReg)
>     if (MRI->isPhysRegModified(PReg, true))
>       markRegClobbered(TRI, &RegMask[0], PReg);
>
> void RegUsageInfoCollector::markRegClobbered(const TargetRegisterInfo *TRI,
>                                              uint32_t *RegMask,
>                                              unsigned PReg) {
>   // If PReg is clobbered then all of its alias are also clobbered.
>   for (MCRegAliasIterator AI(PReg, TRI, true); AI.isValid(); ++AI) {
>     DEBUG(dbgs() << "mark: " << TRI->getName(*AI) << "\n");
>     RegMask[*AI / 32] &= ~(1u << (*AI % 32));
>   }
> }
>
> Suppose that r0 and r1 are sub-registers of d0, and a function uses only r0.
> Then both r0 and d0 will return true
> when passed to MRI->isPhysRegModified. When `markRegClobbered' is called with
> d0, r1 will be marked as clobbered, too.
> But I don't think that r1 should be marked as clobbered.
>
> I'm wondering whether this is the expected behavior. Thanks again.
>
No, I don't think that r1 will be clobbered here. My reasons are as follows,
with a slightly different example:
Consider AL | AH | AX | EAX | RAX and the way LLVM models these registers:
AL is aliased to AX, EAX and RAX, and similarly for AH. This can be
verified from the lib/Target/X86/X86RegisterInfo.td file; consider the
following comments from that file:
// In the register alias definitions below, we define which registers alias
// which others.  We only specify which registers the small registers alias,
// because the register file generator is smart enough to figure out that
// AL aliases AX if we tell it that AX aliased AL (for example).

See the definitions of registers such as:
def AL : X86Reg<"al", 0>;
def AH : X86Reg<"ah", 4>;
def AX : X86Reg<"ax", 0, [AL,AH]>;
...
So if we mark AX as used/modified, then obviously we can't use AL or AH; but
if only AL is clobbered, then AH can still be used.
I hope this helps, and LLVM devs, please correct me if necessary.
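
To make the aliasing argument above concrete, here is a minimal,
self-contained sketch of how clearing regmask bits over an alias set behaves
for the AL/AH/AX example. It is not the LLVM implementation: the `Reg` enum,
the hand-written table in `aliasesIncludingSelf` (a stand-in for
MCRegAliasIterator with IncludeSelf = true), and the helpers `allPreserved`
and `isPreserved` are all hypothetical names invented for this illustration,
and the table covers only the five registers named above. Only the
bit-clearing expression is taken from the quoted markRegClobbered.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical register numbering for this sketch only. 0 is reserved for
// "no register", which is why the quoted loop starts at PReg = 1.
enum Reg : unsigned { NoReg = 0, AL, AH, AX, EAX, RAX, NumRegs };

// Toy stand-in for MCRegAliasIterator(PReg, TRI, /*IncludeSelf=*/true):
// returns the register itself plus every register that overlaps it.
// AL and AH do not overlap each other, so neither appears in the other's list.
std::vector<unsigned> aliasesIncludingSelf(unsigned R) {
  switch (R) {
  case AL:  return {AL, AX, EAX, RAX};
  case AH:  return {AH, AX, EAX, RAX};
  case AX:  return {AX, AL, AH, EAX, RAX};
  case EAX: return {EAX, AL, AH, AX, RAX};
  case RAX: return {RAX, AL, AH, AX, EAX};
  default:  return {};
  }
}

// One bit per register, packed into 32-bit words.
// A set bit means "preserved across the call"; a cleared bit means clobbered.
using RegMask = std::vector<uint32_t>;

// Start from a mask in which every register is preserved.
RegMask allPreserved() {
  return RegMask(static_cast<std::size_t>((NumRegs + 31) / 32), ~0u);
}

// Same bit manipulation as the quoted markRegClobbered: clear the bit of
// PReg and of every register that aliases it.
void markRegClobbered(RegMask &Mask, unsigned PReg) {
  for (unsigned A : aliasesIncludingSelf(PReg))
    Mask[A / 32] &= ~(1u << (A % 32));
}

// Query a single register's bit.
bool isPreserved(const RegMask &Mask, unsigned R) {
  return ((Mask[R / 32] >> (R % 32)) & 1u) != 0;
}
```

In this toy model, calling markRegClobbered with AL leaves AH's bit set, while
calling it with AX clears the bits of both AL and AH. Which registers the
collector actually passes to markRegClobbered depends on what
MRI->isPhysRegModified reports, which this sketch does not model.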

-Vivek

>
> On Thu, Jul 14, 2016 at 12:50 PM, vivek pandya via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>>
>>
>> On Thu, Jul 14, 2016 at 1:10 AM, Mehdi Amini <mehdi.amini at apple.com>
>> wrote:
>>
>>>
>>> On Jul 13, 2016, at 12:26 PM, Lawrence, Peter <c_plawre at qca.qualcomm.com>
>>> wrote:
>>>
>>> Vivek,
>>>              I apologize if you took my original email as a request for
>>> implementation.
>>> I meant to ask what is already available. I think the answer to
>>> that
>>> is the ‘preserve_most’ and ‘preserve_all’ attributes, but I will also
>>> use ‘regmask’ if those prove to be too sub-optimal.
>>>
>> Peter, there is no need to apologize, as we want to get the most benefit out
>> of this work (that is our aim for the GSoC project).
>> Yes, 'regmask' can be useful when you can't exactly describe register
>> usage with preserve_most/preserve_all. I only asked before sending the patch
>> because getting this feature into trunk will take some time (review process).
>>
>> As far as LLC is concerned, what Mehdi has suggested should be enough.
>> Also, as I have already mentioned, even if you want to compile multiple
>> source files and get the benefits with LLC, I believe you can use llvm-link
>> to combine all the .bc files into one module and pass the resulting .bc file
>> to LLC to get the most benefit from IPRA.
>>
>> -Vivek
>>
>>>
>>> I am still interested in figuring out the necessary and sufficient
>>> conditions
>>> for LLC to do optimal IPRA when given a “whole program”
>>> (as per my previous definition of “whole program”),
>>> as opposed to how to accomplish this with LTO.
>>>
>>>
>>> Easy: mark *all* of your functions “static” (or “internal” in LLVM
>>> terminology).
>>>
>>>
>>> If you are open to having such discussions, even though your focus,
>>> IIUC, is supposed to be LTO, then great. I think Mehdi is stuck trying
>>> to convince me to use LTO, but given all the changes I’ve had to make
>>> to CodeGen (i.e. outside of my Target sub-dir) to support separate Data
>>> and Address
>>> register sets, I think using LTO is a long-term solution that I can’t
>>> take
>>> on just now (i.e. the svn branch merge problem).
>>>
>>>
>>> As one of my old math professors used to say, “don’t use a sledgehammer
>>> to crush a pea”. To wit, I am only compiling a single source file as an
>>> entire whole
>>> program and I don’t do any linking, so why should I have to use a linker?
>>>
>>>
>>> It is just a semantic issue: you need to tell the optimizer what it can and
>>> can’t do. In general we can’t assume that the code being optimized or
>>> generated won’t be accessed through dlopen/dlsym, for instance.
>>> I’d prefer everything to be hidden/private by default, with
>>> the user having to explicitly export symbols, but unfortunately that’s not
>>> the current model.
>>>
>>> The LTO API is here to circumvent this issue: by delaying the
>>> optimizations/codegen to link time, we have more information about which
>>> functions can / can’t be called from another module.
>>> One of the key points of LTO is the linker telling us “I don’t need to
>>> export this symbol” and we turn it into an “internal” one.
>>>
>>> -- Mehdi
>>>
>>>
>>>
>>>
>>> --Peter Lawrence
>>>
>>>
>>>
>>> Vivek,
>>>           I have an application where many of the leaf functions are
>>> hand-coded assembly language, because they use special IO instructions
>>> that only the assembler knows about. These functions typically don't
>>> use any registers besides the incoming argument registers, i.e. they don't
>>> need to use any additional callee-save or caller-save registers.
>>> Perhaps using some form of __attribute__ ?
>>> Maybe __attribute__ ((registermask = ....))  ?
>>>
>>>
>>> --Peter Lawrence.
>>>
>>>
>>>
>>>
>>>
>>> *From:* vivek pandya <vivekvpandya at gmail.com>
>>> *Sent:* Wednesday, July 13, 2016 11:47 AM
>>> *To:* Lawrence, Peter <c_plawre at qca.qualcomm.com>
>>> *Cc:* mehdi.amini at apple.com; llvm-dev <llvm-dev at lists.llvm.org>;
>>> llvm-dev-request at lists.llvm.org; Hal Finkel <hfinkel at anl.gov>
>>> *Subject:* Re: [llvm-dev] IPRA, interprocedural register allocation,
>>> question
>>>
>>> Hello Peter,
>>>
>>> Are you still interested in __attribute__(regmask)?
>>> I have done some hacking (in both clang and IPRA) to get it working. If you
>>> want to play around with it, I can send a patch by tomorrow.
>>>
>>> Sincerely,
>>> Vivek
>>>
>>>
>>>
>>
>