[cfe-commits] PATCH: Large re-working of bitfield IR-gen, and a fix for PR13691

Daniel Dunbar daniel at zuster.org
Fri Sep 14 14:42:06 PDT 2012


On Tue, Sep 11, 2012 at 11:02 AM, Chandler Carruth <chandlerc at google.com> wrote:
> On Fri, Aug 31, 2012 at 11:46 AM, Daniel Dunbar <daniel at zuster.org> wrote:
>>
>> Hi Chandler,
>>
>> I haven't done an extensive review of the patch yet (by which I would
>> mostly mean looking at the codegen, I trust you to verify the
>> correctness), so treat this as initial comments.
>
>
> Thanks for these. I'm attaching an updated patch with some fixes and
> improvements, see below.
>
>>
>> My one main initial comment is that I would like to treat the PR and
>> the rewrite as distinct issues. Obviously fixing the PR is a
>> no-brainer, but it would be nice to put in the minimal patch for that
>> in the current code. I wouldn't expect this to be hard or to result in
>> byte-by-byte accesses, but I haven't looked closely at what the change
>> should be. It sounds however like you attempted this and decided it
>> was hard enough to prefer a rewrite...
>
>
> Yes. Both Richard Smith and I looked at it, and neither of us had any clear
> ideas about how to adapt the existing code to avoid the problem without
> creating  worse stores. As I explained in my previous email, the existing
> strategy has two properties that make this hard: 1) it assumes it can read
> before the bitfield in order to start off with an aligned load, and 2) it
> wants to pre-split into strides which requires the frontend to implement
> intelligent striding logic.
>
>>
>> The one thing I am worried about is that the simplicity of your
>> approach is contingent on us using irregular integer sized types. If
>> we don't do that, I feel like the current version of the code is
>> pretty good (feel free to disagree though), so really the question
>> comes down to do we want to move to irregular sized integer types. At
>> the time the code was written, it was a conscious decision not to do
>> that. I agree with the general sentiment that SROA is already doing
>> this, I am just worried that if we ever needed to break that we would
>> be back to doing something like the current code is doing.
>
>
> Yes, we may, some day need to go back and change things. But we'll have the
> old code kicking around in history. This at least gets us to correct
> behavior today and simplifies the code today.
>
>>
>> I did a reasonable amount of looking at the codegen on bitfield test
>> cases when I wrote the old code, but I probably did a poor job of
>> turning these into test cases. I'm assuming you probably already did
>> the same kind of testing on the new code? Do you have any test cases
>> that demonstrate better or worse codegen from the new model?
>
>
> Test cases I have looked at in depth are included. ;] Mostly, I carefully
> checked the output of the existing bitfield tests. There are actually quite
> a few. I added a test case to cover the correctness issue.
>
> I'll work on a few stress test cases based on the examples you and eli gave,
> as well as some that I've come up with.
>
>
>>
>> I don't understand why this is true (the byte-by-byte part). If your
>> rewrite is generating a wider load, then there is nothing in the
>> structure of the old code that would prevent generating larger
>> accesses as well. This seems independent of the structural change. It
>> seems like implementing this in the current framework is a small
>> amount of code.
>
>
> Well, both Richard and I looked at it, and after I had tried a few times,
> neither of us had really good ideas about how to make it a small amount of
> code. To be fair, I didn't really *work* to find an elegant way to implement
> it because after a few initial stabs and readings, I came to this
> conclusion:
>
>>
>> I agree that after we did that it might be very questionable why we
>> were doing it in the frontend.
>
>
> And once I was there, I wanted to do the right thing and solve this using
> the facilities LLVM provides.
>
>>
>> >> (1) How well-specified and -tested are the paths for non-power-of-2
>> >> integer types in LLVM?  Do we actually promise that, say, storing to an
>> >> i17
>> >> will not touch the fourth byte?  Do we provide reasonably portable
>> >> behavior
>> >> across targets, or is this an instance of the x86 backend having
>> >> received a
>> >> lot of attention and everything else being a festering morass of latent
>> >> issues?  I know that SROA's behavior has exposed bugs before (IIRC with
>> >> the
>> >> occasional response of making SROA less aggressive), not least of which
>> >> because SROA's behavior doesn't really kick in a lot.
>> >
>> >
>> > Sure, this is a great point. I'm happy to do any testing I can here. One
>> > thing that I think I've already done is ensure that we only produce
>> > i8-multiples, not i17s and such. I'll double check, but I would not
>> > expect
>> > to see any of those. My hope was that this would in the worst cased be
>> > trivially legalized to a sequence of i8 operations, and in the best
>> > case,
>> > the optimal pattern.
>>
>> This is a concern of mine too. Of course, its also something we should
>> just probably make sure is nicely defined in the backend given that
>> SROA already uses it.
>
>
> So I talked to Duncan specifically about this and this is actually the use
> case he feels was intended originally and should be best supported. I've
> done some LLVM-side testing and it seems to bear that out. Very reasonable
> code was generated for integers in my testing.

Ok, I have more or less been convinced this is a better approach.

I think you should get explicit sign off from Chris though, because
when I discussed it in person with him he had a very strong feeling
that we did not want the FE to start generating off-size integer
memory accesses (as opposed to off-sized integer values). I don't
think I can clearly convey his argument, though, so I'd like to see
him chime in.

 - Daniel

>
>>
>> I agree with the approach. I'm worried there might be cases where the
>> backend does a worse job in codegen *today* on the off size integer
>> types, but I could be wrong. And of course the right thing to do here
>> is fix up the backend.
>>
>> Regardless, thanks for working in this area!
>
>
> Well, I've fixed a tiny bug, and done significantly more correctness
> testing. The patch attached passes self-host and the full testsuite. The
> test suite shows no significant performance regressions.
>
> I've also added something to try to mitigate the performance concerns. I've
> added logic to detect a case when there is guranteed to be padding bytes
> after the bitfield and to widen the bitfield into the padding when doing so
> produces more natural integer sizes.



More information about the cfe-commits mailing list