[llvm] [TableGen] fix tlbgen for EncodingByHwMode (PR #84906)

Mon Mar 18 04:08:59 PDT 2024

superZWT123 wrote:

> > > > @nvjle @topperc Hello, could you please help me review my code.
> > > 
> > > 
> > > @superZWT123 Please give me a day or two to review this code. From a very quick scan, it actually undoes some of the work I did recently (or at least does it differently)-- which may be fine-- but I need to examine carefully. I'm using this feature in a downstream production backend with a complicated ISA (generously speaking) and even more complicated encoding schemes (100+ decoder tables). It tends to exercise the encode/decode heavily.
> > > Also, my previous patch to suppress duplicates had a command line option so that existing downstream backends could opt-in to that feature-- at least temporarily. That is, a downstream backend could rebase and their disassembler would not need any changes due to decoder table naming. Something like that would probably be good to keep initially.
> > > All of that said, I am certainly glad to see others helping to production-harden this feature.
> > 
> > 
> > Hi [nvjle](https://github.com/nvjle). I hope all is well with you. Since you mentioned needing a day or two to carefully review my code, three days have now passed, and I’m curious if you’ve had the opportunity to complete your review. Are there any suggestions or adjustments you’d like me to make? If not, may I proceed with merging my code into the main branch? I look forward to your valuable feedback. Thanks!
> 
> Hi @superZWT123, I've done some reviewing and done a trial incorporation into a production downstream backend. To summarize briefly, there are some significant problems with this patch in its current form. The first problem is that, in the mentioned downstream backend with these decoder changes applied, all decoder duplicate suppression is reverted compared to the existing duplicate suppression. This definitely can not be merged as is. The second problem is that with the encoding changes, I'm seeing miscompiles-- but I've not had time to investigate precisely what is happening there (enormous 200,000+ line TD files, complicated encoding schemes, binary diffs-- TBD).
> 
> Patches to the decode and encode can be subtle-- especially in larger backends. Here are some specific requests and observations:
> 
> 1. Typically in LLVM we adhere to this policy for submitting patches: https://llvm.org/docs/Contributing.html#how-to-submit-a-patch . Specifically, independent/orthogonal changes should be separate patches. This patch incorporates two completely independent issues (even though both rely on `EncodingByHwMode`) and each is subtle and needs to be considered separately. In this patch, I would respectfully request that it be split into two orthogonal parts-- one for the `CodeEmitterGen`+`SubtargetEmitter`+`RegisterInfoEmitter` set of changes and the other for the `DecoderEmitter` changes. This will allow detailed attention to each (which they need) and independent timelines for potential merging of each (and likewise for downstream backends-- they can independently incorporate/test/revert orthogonal changes). In any case, I'll give some initial thoughts below which can go into the follow-up patches preemptively for quicker turnaround.
> 2. The decoder emitter changes here essentially undo/cancel much of the existing support for duplicate suppression (in our backend, 100% of duplicates are no longer detected[1]). It also undoes the command line option to opt out of suppression. This means any existing downstream backend that already relies on `EncodingByHwMode` may need to have its decoder tables altered. It is better in the short term to allow targets to decide whether or not to apply suppression-- particularly in production backends who are necessarily conservative or just want to change on their own timeline (consider downstream backends who pull-in upstream changes daily/weekly on an automated basis). This can be further discussed in a separate decoder patch, but at the moment, this is significantly worse than that current state of affairs. The good news is, I think by merely using "" (empty name) instead of `AllModes` as the shared/common decoder table name in the existing code (trivial change, and use `-gen-disassembler --suppress-per-hwmode-duplicates`), you would have the effect you are trying to achieve. I originally used that name to make it explicit in the generated code that the table applies to all instructions that do not have a per-HwMode override, but a blank name is perfectly acceptable too-- specifically for the case when a backend did not previously have `EncodingByHwMode`, but will add it. [1] For example, in unit test `HwModeEncodeDecode2.td` (which was carefully constructed to show/test duplicate suppression), instruction `bar` should only appear in one table after suppression-- and indeed that is true with the current suppression code in TOT. Your patch regresses that case and leaves `bar` duplicated. The current TOT code seems to entirely obviate the decoder part of the patch (save the `AllModes` name change if it is desired).
> 3. I see some obvious inefficiencies and coding standard nits (https://llvm.org/docs/CodingStandards.html) that you could attend to while splitting the patch. For example, in the generated `getRegInfosHwModeId`, there is an entire loop nest to simply detect that a single bit is set. That can be replaced entirely with a single bitwise expression (or use `llvm::has_single_bit` + zero check). A number of comments need to be updated (all comments should be standard prose/caps/full-stop/etc).
> 4. A simple TableGen unit test is needed for the reginfo/subtarget-emitters changes.  Also, I think it might make sense to have two different APIs for this since `getHwMode` can now return a set. Maybe the original for callers who expect exactly one live mode at a given moment (the original intent, as far as I know), and a second `getHwModeSet` for callers expecting a bit vector of live modes. If not that, at the very least, some comments need to be added explaining what this API now returns for future readers of this and related code.
> 
> I would say 3 days of review time is not terrible in LLVM time, especially for this sort of patch. I will certainly try to turn around as quickly as possible-- with the emitter changes being likely quicker to turnaround.
> 
> FWIW, in recent changes to TableGen, I've been very careful myself about making narrow, isolated changes. This helps reviewers more easily understand and review the code (lessen cognitive load for people who largely work full time and need to context switch). It also makes it easier to pinpoint any problems later and revert/change relatively small amounts of code. This does require a bit more effort to upstream patches from downstream/private backends, but this is exactly what all of us "living downstream" are doing.

Hi [nvjle](https://github.com/nvjle), I have carefully read your comments and discussed your suggestions with our team members. Thank you very much for taking the time to review this patch and providing valuable feedback.
I also work for a production downstream backend. Our backend has twenty to thirty subtargets, so the instructions' definitions in our backend are quite complex, and there are many outdated instruction definitions that are difficult to maintain. So we are very much looking forward to enabling the ‘EncodingByHwMode’ feature to release the pressure on our backend's instructions' definitions. It is foreseeable that with the development of LLVM, more and more dowmstream backend may face the same problem (instruction definition inflation).

- For your first point,  I will split my patch into two parts, one is about the changes of DecoderEmitter(eliminate table duplication) and the other is about  storing HwModes in bits form and sinking HwModes selection logic down to per instructions level. 

- For your second point, I will keep the '--suppress-per-hwmode-duplicates' option, allowing downstream backends who already used this option to decide whether or not to apply suppression.

- For your third point, I will consider using your solution to handle bitwise related operations, if it can reduce time complexity.

- For your fourth point, after our discussion, we have decided to keep the current approach because it is compatible with the previous approach, and adding an new interface may seem a bit redundant in our opinion. In fact, the old logic cannot detect that a subtarget has two hwmode attributes, which is actually a bug. Of course, I will add more comments to describe the changes to this interface and specific testcases for this feature will be added.

What worries me the most is what you mentioned about seeing **miscompiles** in your backend. If you can provide some error information, I believe I could better pinpoint where the problem lies. Alternatively, after I split the patch into two parts, it might be easier to reproduce and locate the problem on your backend, but this would probably require your continued assistance in reviewing the split patches.  In fact, I enabled the current patch in my backend and assigned the 'EncodingByHwMode' attribute to seven or eight instructions to resolve encoding conflicts. So far, I have not encountered any problems.

If there are any points that I haven't replied to, please point them out. Thank you very much!

https://github.com/llvm/llvm-project/pull/84906