[PATCH] D121556: [randstruct] Add randomize structure layout support

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 17 05:23:58 PDT 2022


aaron.ballman added a comment.

Thank you for working on this new security feature! I like it and I think it's heading in the right direction.

Can you please add a release note for the new functionality?

This is missing tests for the Driver diagnostics, the Sema diagnostics (including existing diagnostics like writing the attribute on the wrong subject or giving it args when it doesn't expect them, wrong language mode, etc), and the CodeGen behavior, so those should be added. Also, the Windows pre-commit CI is failing:

  Failed Tests (7):
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.AnonymousStructsAndUnionsRetainFieldOrder
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.CheckAdjacentBitfieldsRemainAdjacentAfterRandomization
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.CheckVariableLengthArrayMemberRemainsAtEndOfStructure
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.MarkedRandomize
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.MultipleAttrsRandomize
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.RandstructDoesNotOverrideThePackedAttr
    Clang-Unit :: AST/./ASTTests.exe/StructureLayoutRandomization.ZeroWidthBitfieldsSeparateAllocationUnits
   
  Testing Time: 676.49s
    Skipped          :     4
    Unsupported      :   202
    Passed           : 29832
    Expectedly Failed:    31
    Failed           :     7



================
Comment at: clang/include/clang/AST/Decl.h:2842
   mutable unsigned CachedFieldIndex : 30;
+  mutable unsigned OriginalFieldIndex : 30;
 
----------------
It's unfortunate that every field node in the AST is now going to be 4 bytes larger for this feature; I worry about the extra memory pressure from the additional overhead, so if there's a way for us to save some space here, I think it might be worth it. (I'm worried that our max template instantiation depth will shrink because of this.)


================
Comment at: clang/include/clang/AST/Decl.h:4067
+
+  void setIsRandomized(bool V) const { RecordDeclBits.IsRandomized = V; }
+
----------------
A setter that is marked `const` does not spark joy for me... Can we use friendship rather than mutability to solve this?


================
Comment at: clang/include/clang/AST/Randstruct.h:34
+
+extern std::string SEED;
+
----------------
Doing this with a global variable is unfortunate; it could make things harder when we multithread the frontend. Can we solve this without a global? (And if not, does this global need to be a `ManagedStatic`?)


================
Comment at: clang/include/clang/Basic/Attr.td:3938-3939
+def RandomizeLayout : InheritableAttr {
+  let Spellings = [GCC<"randomize_layout">, Declspec<"randomize_layout">,
+    Keyword<"randomize_layout">];
+  let Subjects = SubjectList<[Record]>;
----------------
Microsoft does not implement this `__declspec`, correct? If they don't, we shouldn't either (even if GCC does) unless there's a *very* good reason to do so. That's Microsoft's design space we're encroaching on.

I'd also like to understand the justification for adding a new keyword for this.


================
Comment at: clang/include/clang/Basic/Attr.td:3948-3949
+def NoRandomizeLayout : InheritableAttr {
+  let Spellings = [GCC<"no_randomize_layout">, Declspec<"no_randomize_layout">,
+    Keyword<"no_randomize_layout">];
+  let Subjects = SubjectList<[Record]>;
----------------
Same here.


================
Comment at: clang/include/clang/Basic/AttrDocs.td:6367
+program should be compiled with the same seed, but keep the seed safe
+otherwise.
+
----------------
I think it would help to explain the effects of this attribute in conjunction with structure packing techniques (the `packed` attribute and bit-fields) and whether any attempt is made to keep the structure packed or those fields together or not. I'd expect no such attempt is made, but people may want to know "attempts to pack this structure" and "randomize this structure layout" are not compatible. (We may want to diagnose in the case of seeing the `packed` attribute, in fact.)

We should also be more clear that the seed specified does not have to be numeric in nature (from the file or when directly on the command line).


================
Comment at: clang/lib/AST/DeclBase.cpp:28
 #include "clang/AST/ExternalASTSource.h"
+#include "clang/AST/Randstruct.h"
 #include "clang/AST/Stmt.h"
----------------
Is this include necessary?


================
Comment at: clang/lib/AST/Randstruct.cpp:72
+  // All of the Buckets produced by best-effort cache-line algorithm.
+  // TODO: Figure out a better initial size.
+  SmallVector<std::unique_ptr<randstruct::Bucket>, 16> Buckets;
----------------
This size seems as defensible as most others; do you plan to do more here, or should this comment be removed?


================
Comment at: clang/lib/AST/Randstruct.cpp:99
+
+    if (FD->isBitField() && !FD->isZeroLengthBitField(Context)) {
+      // Start a bitfield run if this is the first bitfield we have found.
----------------
Oh cool, we do bit-field runs!

How should we handle anonymous bit-fields (if anything special should be done for them at all)? People usually use them for padding between bit-fields, so it's not clear to me how to treat them when randomizing.


================
Comment at: clang/lib/AST/Randstruct.cpp:179
+
+  SmallVector<FieldDecl *, 64> RandomizedFields;
+
----------------
This one seems a bit large to me; any reason not to use `16` again?


================
Comment at: clang/lib/AST/Randstruct.cpp:185
+    ++TotalNumFields;
+    if (auto FD = dyn_cast<FieldDecl>(D)) {
+      FD->setOriginalFieldIndex(Index++);
----------------



================
Comment at: clang/lib/Parse/ParseDeclCXX.cpp:2063-2069
+      if (RD->isRandomized()) {
+        for (Decl *D : FinalOrdering)
+          RD->removeDecl(D);
+
+        for (Decl *D : FinalOrdering)
+          RD->addDecl(D);
+      }
----------------
I think this works okay for C, but I think if we ever attempted to provide this functionality in C++, the calls to `addDecl()` would be rather expensive because it eventually calls `addedMember()` which calculates information about the structure (whether it's copy constructible, has a trivial destructor, etc) and we don't need to redo any of that work.

However, for now, I think this is fine.


================
Comment at: clang/lib/Sema/AnalysisBasedWarnings.cpp:2166-2167
+      if (isa<PointerType>(ExprTy) && isa<PointerType>(SubExprTy)) {
+        const QualType CastTy = cast<PointerType>(ExprTy)->getPointeeType();
+        const QualType BaseTy = cast<PointerType>(SubExprTy)->getPointeeType();
+
----------------



================
Comment at: clang/lib/Sema/AnalysisBasedWarnings.cpp:2503
+
+  // FIXME: Any way to get a handle to a RecordDecl struct here?
+  clang::randstruct::checkForBadCasts(S, AC);
----------------
No, this function is only called when popping a function scope, and so the only declaration it has access to is the `FunctionDecl`. Or do I misunderstand the question?


================
Comment at: clang/lib/Sema/SemaDecl.cpp:27
 #include "clang/AST/NonTrivialTypeVisitor.h"
+#include "clang/AST/Randstruct.h"
 #include "clang/AST/StmtCXX.h"
----------------
Is this include necessary?


================
Comment at: clang/lib/Sema/SemaDeclAttr.cpp:8550-8556
+  case ParsedAttr::AT_RandomizeLayout:
+    handleSimpleAttribute<RandomizeLayoutAttr>(S, D, AL);
+    break;
+  case ParsedAttr::AT_NoRandomizeLayout:
+    // Drop the "randomize_layout" attribute if it's on the decl.
+    D->dropAttr<RandomizeLayoutAttr>();
+    break;
----------------
I don't think this is sufficient. Consider redeclaration merging cases:
```
struct [[gnu::randomize_layout]] S;
struct [[gnu::no_randomize_layout]] S {};

struct [[gnu::no_randomize_layout]] T;
struct [[gnu::randomize_layout]] T {};
```
I think if the user accidentally does something like this, it should be diagnosed. I would have assumed this would warrant an error diagnostic because the user is obviously confused as to what they want, but it seems GCC ignores the subsequent diagnostic with a warning: https://godbolt.org/z/1q8dazYPW.

There's also the "confused attribute list" case which GCC just... does... things... with: https://godbolt.org/z/rnfsn7hG1. I think we want to behave more consistently with how we treat these cases.

Either way, we shouldn't be silent.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121556/new/

https://reviews.llvm.org/D121556



More information about the cfe-commits mailing list