[PATCH] D107378: Make enum iteration with seq safe by default

Tue Oct 19 05:48:49 PDT 2021

aaron.ballman added a comment.

In D107378#3030890 <https://reviews.llvm.org/D107378#3030890>, @kuhar wrote:

> Let me start with some brief history:
>
> - I started something similar for an out-of-tree llvm-based compiler, LLPC: https://github.com/GPUOpen-Drivers/llpc/blob/dev/lgc/interface/lgc/EnumIterator.h. At the time, the implementation could not use `llvm::seq` because it did not support enums.
> - A couple of weeks later, I learned that `seq` was enhanced to support enums. But when I looked at the implementation, I learned that it blindly accepts all enum types, even though it does not make sense to iterate over all of them.
>
> Before this work, the status quo was that people would create hand written for loops like `for (MyEnum val = MyEnum::A; val != MyEnum::X, val = static_cast<MyEnum>(static_cast<unsigned>(val) + 1))`. This has 3 main issues:
>
> 1. The loop is verbose and ugly.

FWIW, I see this as a benefit. The loop is deeply strange and needs extra attention because of the use of enumerations, so making it stick out like a sore thumb is a good thing IMO.

> 2. The underlying type may not be `unsigned`, and in our codebase nobody used `std::underlying_type` instead. But that would make it even more verbose. Some places implemented custom operators or helper increment functions instead.
> 3. The enum may not be continuous. I don't have links to public bugs in llvm or llpc to point to, but this showed up a few times internally, and I personally messed it up a few times. The issue is that even if the original code had continuous values, modifying the enum definition may invalidate all loops that use this type. This was difficult for me to grep for, given that for loops like this can be written slightly differently, with values hoisted out or custom increment operators, etc. In my opinion, this pattern is very bugprone.

Strongly agreed! That's actually why I'm not super keen on adding facilities to make it easier to iterate over an enumeration. I feel like this need should be somewhat rare in the code base, so it's already not strongly motivated, but the fact that these facilities only make sense on enumerations that are defined in a very particular way makes it feel like blessing a code pattern we don't want people to use in the first place.

> I believe that this patch fixes all 3 issues. The most tricky one to handle is 3.: I found that the most reliable way to ensure that enumeration is safe is to disallow it by default and place this logic/declaration just after enum definition. This way, a person modifying the enum does not have to hunt for all of the uses across the codebase, and is likely to notice potential issues because of the close proximity of the declaration in the source code.

I think this approach alleviates my concerns about the danger of using the API (I think the original API was way too dangerous, so fixes here are really appreciated), but it doesn't really alleviate my concerns of "do we want to encourage people to do this by making it easier?".

> From my perspective, I'm happy with `enumRange` in LLPC, which is even more restrictive than what is proposed here (it also forces users to specify the first and end enum value). However, I'm concerned that the unsafe default is behavior is allowed with `llvm::seq`.
>
> If you see this patch as overcomplicated, maybe it would make sense to keep the underlying iteration mechanism of `iota_range` in place, but disallow enum iteration with `seq` and implement it separately, with extra safety checks outside of `iota_range`?

Personally, I sort of wonder if it makes sense to to disallow enum iteration entirely. Are we convinced we want to encourage people to do this even when the enumeration is held just right?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107378/new/

https://reviews.llvm.org/D107378