[PATCH] D153376: Introducing llvm-cm: A Cost Model Tool

James Henderson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 29 00:21:28 PDT 2023


jhenderson requested changes to this revision.
jhenderson added a comment.
This revision now requires changes to proceed.

I haven't attempted to review any of the logic of this - I wanted to take a quick whizz through the code to see what was going on and got sucked in by a rather hefty heap of style issues.

There seems to be very limited testing for what is a fairly chunky block of code here. You should have testing for all your error paths, as well as other other code paths you have created here, at the very least.

Is this intended to be a user-facing tool? If so, please create a user guide in llvm/docs/CommandGuide (and make sure to update the index there). That doesn't need to be in this patch, as long as it gets added at some point.



================
Comment at: llvm/test/tools/llvm-cm/inst_count.s:113
+# CHECK-NEXT: total # of instructions: 6
+
+
----------------
Get rid of all the blank lines at EOF. The file should end with precisely one \n, at the end of the last line containing text.


================
Comment at: llvm/tools/llvm-cm/CMakeLists.txt:1
+#include_directories(include)
+
----------------
Get rid of commented out code here and below.


================
Comment at: llvm/tools/llvm-cm/CMakeLists.txt:20
+#set(LLVM_CM_SOURCE_DIR ${CURRENT_SOURCE_DIR})
\ No newline at end of file

----------------
mtrofin wrote:
> needs a newline here (makes diff happy)
Now you have too many new lines (as noted above, files should end with precisely one \n).


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:47
+#include <cstdint>
+#include <iostream>
+#include <map>
----------------
https://llvm.org/docs/CodingStandards.html#include-iostream-is-forbidden


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:57
+
+using namespace llvm;
+
----------------
Would it make sense to `using namespace llvm::object;`?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:71
+
+// Class for error handling.  
+
----------------
This comment is probably unnecessary, but even if it were, it should be immediately next to the class in question, without any blank lines separating it.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:73
+
+// Rewrite ExitIf to override the << operator to print the message. 
+class ExitIf {
----------------
What is this comment about?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:74
+// Rewrite ExitIf to override the << operator to print the message. 
+class ExitIf {
+   public:
----------------
Why is this a class when a simple function would do?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:75
+class ExitIf {
+   public:
+    ExitIf(bool Cond) : Condition(Cond) {}
----------------
Have you run clang-format on your new code? Because this doesn't look formatted according to the standard format.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:79
+        if (Condition) {
+            errs() << "Error: " << Message.str() << "\n";
+            exit(1);
----------------
This isn't a normal way to print errors in LLVM tools. Please see existing examples in tools like llvm-objdump and llvm-readobj. In particular, you should be using `WithColor` for printing error messages. I see you've already implemented an `error` function below, so should you be using that?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:80
+            errs() << "Error: " << Message.str() << "\n";
+            exit(1);
+        }
----------------
Should this be `std::exit`?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:89
+    return *this;
+  } 
+   private:
----------------
Blank line after this function.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:110
+checkSectionFilter(object::SectionRef S, StringSet<> FoundSectionSet,
+                   std::vector<std::string> FilterSections) {
+  if (FilterSections.empty())
----------------
It's extremely unusual to pass in a vector by value. Should this be `const &` or `&&`?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:116
+  if (!SecNameOrErr) {
+    consumeError(SecNameOrErr.takeError());
+    return {/*Keep=*/false, /*IncrementIndex=*/false};
----------------
Using `consumeError` in new code is usually a code smell. Is there a reason you don't report the Error (at least as a warning)?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:121
+
+  // StringSet does not allow empty key so avoid adding sections with
+  // no name (such as the section with index 0) here.
----------------



================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:132
+
+llvm::object::SectionFilter
+gettoolSectionFilter(object::ObjectFile const &O, uint64_t *Idx,
----------------
This can just be `object::SectionFilter`, right? Same below at the return site.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:134
+gettoolSectionFilter(object::ObjectFile const &O, uint64_t *Idx,
+                  std::vector<std::string> FilterSections) {
+  StringSet<> FoundSectionSet;
----------------
Same as above re. passing by value.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:137
+  if (Idx)
+    *Idx = UINT64_MAX;
+  return llvm::object::SectionFilter(
----------------
I don't think it's a hard rule, but I've tended to see `std::numeric_limits<uint64_t>::max()` used rather than the C-style macros.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:149
+
+// Implement the "error" function.
+[[noreturn]] static void error(Error Err) {
----------------
This seems like an unnecessary comment.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:164
+
+// TODO: Share this with llvm-objdump.cpp. 
+static uint8_t getElfSymbolType(const llvm::object::ObjectFile &Obj, const llvm::object::SymbolRef &Sym) {
----------------
Any reason you can't do this upfront, like you did with the section filter stuff?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:182
+
+// TODO: Share this with llvm-objdump.cpp. 
+SymbolInfoTy createSymbolInfo(const object::ObjectFile &Obj, const object::SymbolRef Symbol) {
----------------
Ditto.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:187
+  return SymbolInfoTy(Addr, SymName, Obj.isELF() ? getElfSymbolType(Obj, Symbol)
+                                    : (uint8_t)ELF::STT_NOTYPE); 
+  
----------------
It's still unclear there's a 100% consensus, but at least it seems like this should be a `static_cast` rather than a C-style one (there's a debate about functional-style casts that's ongoing). See D151187.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:188
+                                    : (uint8_t)ELF::STT_NOTYPE); 
+  
+}
----------------
No blank line at end of function, please.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:191
+
+// Rewrite the printFunction function to only take in aliases.
+void printFunctionNames(ArrayRef<SymbolInfoTy> &Aliases) {
----------------
This seems like a weird comment? Should this be a TODO too?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:194
+  for (size_t I = 0; I < Aliases.size(); ++I) {
+    const StringRef SymbolName = Aliases[I].Name;
+    outs() << SymbolName << ":\n";
----------------
This seems like an unnecessary temporary variable. Can this and the next line be folded together?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:199
+
+// TODO: Share this with llvm-objdump.cpp. 
+static void collectBBtoAddressLabels(const std::unordered_map<uint64_t, llvm::object::BBAddrMap> &AddrToBBAddrMap,
----------------
Same as above.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:211
+    return;
+  for (unsigned I = 0, Size = Iter->second.BBEntries.size(); I < Size; ++I) {
+    uint64_t BBAddress = Iter->second.BBEntries[I].Offset + Iter->second.Addr;
----------------
`size()` returns a `size_t`, so we should match that type.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:219-220
+
+void processInsts(std::unique_ptr<MCDisassembler> &DisAsm,
+               const uint64_t &SectionAddr, ArrayRef<uint8_t> &Bytes,
+               raw_svector_ostream &CommentStream, uint64_t &Start,
----------------
Is there a reason you're passing in the `unique_ptr` here by reference, rather than simply the underlying pointer (i.e. `MCDisassembler *`)?
Also, simple types like `uint64_t` are usually passed in by value, not by `const &`.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:251
+                                                    << Twine::utohexstr(CurrAddr).str()
+                                                    << "\n";
+    ++NumInstructions;
----------------
Do you need this new line? `ExitIf` already adds its own.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:253
+    ++NumInstructions;
+    ++NumInstsInBb;
+    if (Size == 0) {
----------------
The usual style is that acronyms (i.e. in this case "BB" = "BasicBlock"), are all caps.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:259
+    Index += Size;
+  }
+  // If enteredbb is true and there is more than one label in the basic block, output the number of instructions in the basic block.
----------------
I'd add a blank line after this one, after the if block.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:260
+  }
+  // If enteredbb is true and there is more than one label in the basic block, output the number of instructions in the basic block.
+  if (EnteredBb && Labels.size() > 1) {
----------------
This just seems to be a comment that describes what the (relatively simple) code is doing. Is it really useful?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:272
+
+  // Parse the command line options.
+  cl::ParseCommandLineOptions(argc, argv, "llvm cost model tool\n");
----------------
Comments like this are not useful. It simply is saying what the name of the function on the next line literally says it is doing. Use variable and function names to describe the //what// of what your code is doing, and comments only when that is insufficient (which should be rare) or where the "why" is important.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:290
+  std::string Error;
+
+  const Target *TheTarget = TargetRegistry::lookupTarget(TripleName, Error);
----------------
There are a lot of unnecessary blank lines within this function, which disrupt the flow of someone reading it. Keep your code grouped into logical bits, e.g. variable declarations and the function that then uses them all together.

Also don't initialise local variables until you need them. E.g. `TheTarget` isn't used until quite a way down from here, so move its initialization until then.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:302
+  
+  for (unsigned I = 0; I != MAttrs.size(); ++I) {
+      TrueFeatures.AddFeature(MAttrs[I]);
----------------
`size_t`


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:338
+
+
+
----------------
Please clean up all these double/triple blank lines.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:340
+
+  std::map<object::SectionRef, SectionSymbolsTy> AllSymbols;
+  SectionSymbolsTy UndefinedSymbols;
----------------
Should this be a `StringMap`?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:345
+
+  bool Check64Bits = Obj->getBytesInAddress() > 4; 
+
----------------
This variable is named as a verb, but it is a variable. Please use a noun or adjective phrase.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:377
+ 
+  std::unordered_map<uint64_t, object::BBAddrMap> BBAddrMap; 
+  auto BBAddrMapping = [&]() 
----------------
I suspect `std::unordered_map` is not the type you actually want. Take a look here: https://llvm.org/docs/ProgrammersManual.html#map-like-containers-std-map-densemap-etc


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:378
+  std::unordered_map<uint64_t, object::BBAddrMap> BBAddrMap; 
+  auto BBAddrMapping = [&]() 
+  { 
----------------
This should probably be called `GetBBAddrMapping`.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:419
+    for (size_t SI = 0, SE = SortedSymbols.size(); SI != SE;) {
+
+      // Find all symbols in the same "location" by incrementing over
----------------
Don't start a block with a blank line.


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:463-464
+      // Make sure to keep track of the number of instructions.
+      int NumInstructions = 0;
+      int NumInstsInBB = 0; 
+
----------------
Does `int` make sense here? Can they be negative? Should they actually be `uint64_t`?


================
Comment at: llvm/tools/llvm-cm/llvm-cm.cpp:472
+  }
+  return 0;
+}
----------------
Explicit `return 0;` at end of `main` is unnecessary.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153376/new/

https://reviews.llvm.org/D153376



More information about the llvm-commits mailing list