[PATCH] D64790: [LLVM][NFC] Adding an Alignment type to LLVM

Thu Jul 18 13:32:44 PDT 2019

jakehehrlich added a comment.

In D64790#1588982 <https://reviews.llvm.org/D64790#1588982>, @gchatelet wrote:

> In D64790#1588231 <https://reviews.llvm.org/D64790#1588231>, @jakehehrlich wrote:
>
> >
>
>
> First things first, thx a lot for the review. I really appreciate it.
>
> > Ok so I thought about endianness more and I think that's not an issue because computations are generally done in native endianness.
>
> Ok
>
> > I do however think you should definitely have either the 64 bit version be the default or have both a 32 and 64 bit version. This type is most useful in cases where tools like lld, llvm-objcopy, yaml2obj, etc... would be using an internal structure or temporary variable and generally you want to handle both 32-bit and 64-bit operations in these cases. The solution is always to default to the highest bit size which is 64-bits because it will handle the 32-bit case just fine. Unless there's a compelling reason to not use 64-bits I think you should just make it your default. Certainly before lld or llvm-objcopy can use this we need to have a 64-bit version.
>
> Throughout LLVM we mostly use `unsigned` to represent an alignment. I don't know for sure what was the rationale for using it instead of uint64_t, it can be for historic reasons (just use the machine's `word` type) or for space reasons.
>  A quick estimation via `git grep -E "^\s+unsigned [Aa]lign.*" | wc -l` gives the following number of occurrences (for the whole `llvm-project` repo)
>
> - `unsigned` 200
> - `uint16_t`  4
> - `uint32_t`  132
> - `uint64_t`  44
>
>   One of the occurrences for the `uint16_t` is in `llvm/include/llvm/CodeGen/TargetLowering.h` so for this one at least is seems that storage is important.
>
>   This tends to prove that the Alignment type should be parameterized as you suggest.

In those 44 cases and 132 cases includes all ELF code. Also consider lld for more uses. You alienate no one if you support up to 64-bits but you alienate all ELF programmers if you don't support 64-bits. I am one of those people. I cannot use this if 64-bits isn't supported. Other than trying to help llvm code quality improve and being a good citizen I have no reason to review this if I don't get 64-bit support.

================
Comment at: llvm/include/llvm/Support/Alignment.h:101-103
+  bool needsPadding(uint64_t Size) const {
+    return (Size & (uint64_t(Value) - 1)) != 0;
+  }
----------------
gchatelet wrote:
> jakehehrlich wrote:
> > Seems like you'd just want to provide helper functions like "alignTo" that do this for you. There's a whole family of them but personally I've primarily used "alignTo" and one other really odd highly specific one for ELF to deal with segment alignment.
> I've encountered this pattern a number of times and considered it was worth factoring.
>  - https://github.com/llvm/llvm-project/blob/9eb95903da4575b3b95d1a743d48e51bb5026ccd/llvm/lib/IR/DataLayout.cpp#L58
>  - https://github.com/llvm/llvm-project/blob/9eb95903da4575b3b95d1a743d48e51bb5026ccd/llvm/lib/IR/DataLayout.cpp#L75
>  - https://github.com/llvm/llvm-project/blob/9eb95903da4575b3b95d1a743d48e51bb5026ccd/llvm/lib/Target/PowerPC/PPCRegisterInfo.cpp#L506
>  - https://github.com/llvm/llvm-project/blob/9eb95903da4575b3b95d1a743d48e51bb5026ccd/llvm/lib/Transforms/Coroutines/CoroFrame.cpp#L329
> 
> I agree it doesn't have to be a struct member and can be changed to a free function. Any suggestion for a better name?
Those are all easily refactored. I'm not sure how "IsPadded" is being propogated though but if that's needed you can check to see if the value changes. In general needing to check for padding like most of those examples are is anti pattern. The error checking case can just be `X != alignTo(X, Align)`

================
Comment at: llvm/include/llvm/Support/Alignment.h:107
+  // Needed to interact with C for instance.
+  uint64_t value() const { return Value; }
+};
----------------
gchatelet wrote:
> jakehehrlich wrote:
> > If size is a concern yet we only want to represent powers of two I think encoding the power two only would let us represent 64-bit alignments using only a 1 byte value at the cost of a slight bit of increased computation on each operation. These operations tend to not be very performance critical however.
> Yes that's a possibility I wanted to explore as well, but I'm not sure about the tradeoffs at this point.
> IMHO the code would be much more straightforward compared to a templated version.
> 
> **I'm interested in comments from other developers here**.
> 
> Storing alignment as a byte would allow alignment up to 2^254
> ```
> 0:0
> 1:1
> 2:2
> 3:4
> 4:8
> ...
> 255: 2^254
> ```
0 would represent an alignment of '1' not '0' but you might take that route in the "maybe" to encode a missing value.

================
Comment at: llvm/include/llvm/Support/Alignment.h:115
+// It is suitable for use as static global constants.
+struct MaybeAlign final {
+private:
----------------
gchatelet wrote:
> jakehehrlich wrote:
> > What's the use case for this type? I don't think any ELF tools have a use case for this at least. ELF uses 0 to mean the same thing as 1 so we'd just want something that converted to 1 asap.
> > 
> > I think its interface should copy Optional if this has a use case. e.g. using Optional<Align> should function close to the same, perhaps with some operations from Align that act in sensible ways.
> Yes semantically this is exactly Optional<Align>, the use of this type is pervasive in LLVM where the alignment requirements are first unknown and set down the road when needed.
> Problem is `llvm::Optional` needs an additional `bool` to know if the value is set or not so `MaybeAlign` is more compact than `Optional<Align>` (https://godbolt.org/z/8_uA_D)  it's also more customizable.
Right I understand the optimization I'm saying you should match the interface modulo a few operations specific to alignments.

================
Comment at: llvm/include/llvm/Support/Alignment.h:179-181
+  ALIGN_DEPRECATED(
+      bool needsPadding(uint64_t Size) const,
+      "needsPadding involving MaybeAlign mixes semantics, use Align instead");
----------------
gchatelet wrote:
> jakehehrlich wrote:
> > Why add a deprecated method to new code? Nothing is using this so there's no need for backward compatibility. Also the two methods above look equally dangerous but boil down to Optional's 'get' or dereference.
> ALIGN_DEPRECATED is defined only when ALIGN_HARDENED_ALIGNMENT is set.
> As of this patch there will be no deprecation message, it's useful as a second step to find fishy code.
> Fixing the code while introducing the type is not feasible considering the patch is already humongous.
You can just remove all the deprecated methods and then gradually refactor other code. Nothing calls this yet so there's no reason to add deprecated functionality.

================
Comment at: llvm/include/llvm/Support/Alignment.h:192-194
+inline bool MaybeAlign::needsPadding(uint64_t Size) const {
+  return Value ? Align(Value).needsPadding(Size) : false;
+}
----------------
gchatelet wrote:
> jakehehrlich wrote:
> > Why does choosing false make sense here? It isn't clear to me what should be returned here one way or the other. If padding is needed then one would assume you'd also need a method to perform the padding required. If this was an empty alignment how could you do that?
> I concur this does not make sense but this is what the code is doing right now. Since this patch is about introducing a type and not fix the code, I've to add this function for now.
> This is why these functions will be deprecated afterwards.
You should introduce a new proper type, and fix code as its migrated over, not introduce more broken code and gradually unbreak it.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64790/new/

https://reviews.llvm.org/D64790