[PATCH] D43522: [clang-format] New API guessLanguage()

Wed Feb 21 15:20:56 PST 2018

djasper added inline comments.

================
Comment at: cfe/trunk/lib/Format/Format.cpp:2308
+      Guesser.process();
+      if (Guesser.isObjC()) {
+        result = FormatStyle::LK_ObjC;
----------------
benhamilton wrote:
> djasper wrote:
> > In LLVM, we generally don't add braces for single statement ifs.
> Mmmm.. is this a hard requirement? I've personally been bitten so many times by adding statements after missing braces, I'd rather add them unless they're required to not be there by the style guide.
Yes. This is done as consistently as possible throughout LLVM and Clang and I'd like to keep clang-format's codebase consistent.

clang-format itself makes it really hard to get bitten by missing braces :).

================
Comment at: cfe/trunk/lib/Format/Format.cpp:2309
+      if (Guesser.isObjC()) {
+        result = FormatStyle::LK_ObjC;
+      }
----------------
benhamilton wrote:
> djasper wrote:
> > Why not just return here?
> I don't like early returns in case an else sneaks in later:
> 
> https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return
But I would argue exactly the opposite. At this point, you have pretty uniquely determined that this is ObjC (from this originally being viewed as LK_Cpp). Now suppose you add additional logic before the return statement in line 2313: That would make the state space that this function can have quite complex.

I would even go one step further and completely eliminate the variable "result". That would be slightly less efficient, so maybe I'd be ok with:

  const auto GuessFromFilename = getLanguageByFileName(FileName);

And then you can return this at the end or have early exits / other code paths if you come up with different languages. Having both, a complex control flow and state in local variables seems unnecessarily complex here.

================
Comment at: cfe/trunk/unittests/Format/FormatTest.cpp:11955

+struct GuessLanguageTestCase {
+  const char *const FileName;
----------------
benhamilton wrote:
> djasper wrote:
> > IMO, this is a bit over-engineered. IIUC, this only calls a single free standing function with two different parameters and expects a certain result. You don't need this struct, a test fixture or parameterized tests for that. Just write:
> > 
> >   TEST(GuessLanguageTest, FileAndCode) {
> >     EXPECT_EQ(guessLanguage("foo.cc", ""), FormatStyle::LK_Cpp);
> >     EXPECT_EQ(guessLanguage("foo.m", ""), FormatStyle::LK_ObjC);
> >     ...
> >   }
> > 
> > Yes, you'd be duplicating the "EXPECT_EQ" and "guessLanguage", but I think that's actually good here. It makes the tests intuitively readable.
> I hear you. I much prefer it when a single test tests a single issue, so failures are isolated to the actual change:
> 
> https://elgaard.blog/2011/02/06/multiple-asserts-in-a-single-unit-test-method/
> 
> If this isn't a hard requirement, I'd like to keep this the way it is.
It certainly makes sense for asserts, as a tests stops upon finding the first assert. But these are expectations. Each failing expectation will be reported individually, with a direct reference to the line in question and an easily understandable error message.

I understand what you are saying but I think my proposal will actually make test failures easier to diagnose and understand. Please do change it.

Repository:
  rL LLVM

https://reviews.llvm.org/D43522