[PATCH] Initial code coverage mapping data structures, and reader and writers + C interface for ProfileData library

Fri Jun 27 16:07:17 PDT 2014

> On Jun 25, 2014, at 4:32 PM, Alex L <arphaman at gmail.com> wrote:
> 
> Hi everyone,
> This is a first patch that implements the data structures, readers and writers used by the new code coverage system. I added the new code to the ProfileData library. I also added a very minimal C api for the ProfileData library.

Review, part 2…. I’m skipping over the reader and writer code for now.

> diff --git a/lib/ProfileData/CoverageMapping.cpp b/lib/ProfileData/CoverageMapping.cpp
> new file mode 100644
> index 0000000..33ea866
> --- /dev/null
> +++ b/lib/ProfileData/CoverageMapping.cpp
> @@ -0,0 +1,175 @@
> +//=-- CoverageMapping.cpp - Code coverage mapping support ---------*- C++ -*-=//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains support for clang's and llvm's instrumentation based
> +// code coverage.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "llvm/ProfileData/CoverageMapping.h"
> +
> +using namespace llvm;
> +
> +// Encode the counter in the following way:
> +// Low 2 bits - tag/counter kind.
> +// Remaining bits - Counter/Expression ID.
> +unsigned Counter::Encode() const {
> +  assert(ID <= (std::numeric_limits<unsigned>::max() >> EncodingTagBits));
> +
> +  return unsigned(Kind) | (ID << EncodingTagBits);
> +}
> +
> +bool Counter::Decode(unsigned Value) {
> +  Kind = CounterKind(Value & EncodingTagMask);
> +  ID = Value >> EncodingTagBits;
> +  return false;
> +}
> +
> +CounterExpressionBuilder::CounterExpressionBuilder(unsigned NumCounters) {
> +  Terms.resize(NumCounters);
> +}

It looks like NumCounters here refers to the number of counts from the profile data and does not include all of the “abstract Counters”. In the CounterKind enum, you refer to the former as “CounterValueReference”. Perhaps the argument here should be named “NumCounterValues” to help make that distinction more clear?

> +
> +Counter CounterExpressionBuilder::get(Counter::CounterKind Kind,
> +                                      const CounterExpression &E) {
> +  for (unsigned I = 0, S = Expressions.size(); I < S; ++I) {
> +    if (Expressions[I] == E)
> +      return Counter::getExpression(Kind, I);
> +  }
> +  Expressions.push_back(E);
> +  return Counter::getExpression(Kind, Expressions.size() - 1);
> +}
> +
> +void CounterExpressionBuilder::ExtractTerms(Counter C, int Sign) {
> +  switch (C.getKind()) {
> +  case Counter::Zero:
> +    break;
> +  case Counter::CounterValueReference:
> +    Terms[C.getCounterID()] += Sign;
> +    break;
> +  case Counter::AdditionExpression:
> +    ExtractTerms(Expressions[C.getExpressionID()].LHS, Sign);
> +    ExtractTerms(Expressions[C.getExpressionID()].RHS, Sign);
> +    break;
> +  case Counter::SubtractionExpression:
> +    ExtractTerms(Expressions[C.getExpressionID()].LHS, Sign);
> +    ExtractTerms(Expressions[C.getExpressionID()].RHS, Sign == 1 ? -1 : 1);

How about just “-Sign”?

> +    break;
> +  }
> +}
> +
> +Counter CounterExpressionBuilder::Simplify(Counter ExpressionTree) {
> +  // Gather constant terms
> +  for (auto &I : Terms)
> +    I = 0;
> +  ExtractTerms(ExpressionTree);
> +
> +  Counter C;
> +  // Create additions
> +  // Note: the additions are created first
> +  // to avoid creation of a tree like ((0 - X) + Y)
> +  // instead of (Y - X).
> +  for (unsigned I = 0, S = Terms.size(); I < S; ++I) {
> +    if (Terms[I] <= 0)
> +      continue;
> +    for (int J = 0; J < Terms[I]; ++J) {
> +      if (C.isZero())
> +        C = Counter::getCounter(I);
> +      else
> +        C = get(Counter::AdditionExpression,
> +                CounterExpression(CounterExpression::Add, C,
> +                                  Counter::getCounter(I)));
> +    }
> +  }
> +
> +  // Create subtractions
> +  for (unsigned I = 0, S = Terms.size(); I < S; ++I) {
> +    if (Terms[I] >= 0)
> +      continue;
> +    for (int J = 0; J < (-Terms[I]); ++J)
> +      C = get(Counter::SubtractionExpression,
> +              CounterExpression(CounterExpression::Subtract, C,
> +                                Counter::getCounter(I)));
> +  }
> +  return C;
> +}
> +
> +Counter CounterExpressionBuilder::Add(Counter LHS, Counter RHS) {
> +  return Simplify(get(Counter::AdditionExpression,
> +                      CounterExpression(CounterExpression::Add, LHS, RHS)));
> +}
> +
> +Counter CounterExpressionBuilder::Subtract(Counter LHS, Counter RHS) {
> +  return Simplify(
> +      get(Counter::SubtractionExpression,
> +          CounterExpression(CounterExpression::Subtract, LHS, RHS)));
> +}

You’re using the same set of Expressions in the CounterExpressionBuilder both before and after the simplification. That will leave some unused entries. I see that you have code in the writer to only emit the ones that are used in the end result, but I’m curious if you’ve measured how often you get unused entries.

Have you done anything to evaluate the effectiveness of your simplification code? Did you try anything different before settling on the current approach? How much does the Simplify function reduce the number of counter expressions?

> +
> +void CounterMappingContext::dump(const Counter &C,
> +                                 llvm::raw_ostream &OS) const {
> +  switch (C.getKind()) {
> +  default:

Shouldn’t this be “case Counter::Zero”?

> +    OS << '0';
> +    return;
> +  case Counter::CounterValueReference:
> +    OS << '#' << C.getCounterID();
> +    break;
> +  case Counter::SubtractionExpression:
> +  case Counter::AdditionExpression: {
> +    if (C.getExpressionID() >= Expressions.size())
> +      return;
> +    const auto &E = Expressions[C.getExpressionID()];
> +    OS << '(';
> +    dump(E.LHS);
> +    OS << (C.getKind() == Counter::SubtractionExpression ? " - " : " + ");
> +    dump(E.RHS);
> +    OS << ')';
> +    break;
> +  }
> +  }
> +  if (CounterValues.empty())
> +    return;
> +  std::error_code Error;
> +  auto Value = evaluate(C, Error);
> +  if (Error)
> +    return;
> +  OS << '[' << Value << ']';
> +}
> +
> +int64_t CounterMappingContext::evaluate(const Counter &C,
> +                                        std::error_code *EC) const {
> +  switch (C.getKind()) {
> +  case Counter::Zero:
> +    return 0;
> +  case Counter::CounterValueReference:
> +    if (C.getCounterID() >= CounterValues.size()) {
> +      if (EC)
> +        *EC = std::make_error_code(std::errc::argument_out_of_domain);
> +      break;
> +    }
> +    return CounterValues[C.getCounterID()];
> +  case Counter::SubtractionExpression:
> +  case Counter::AdditionExpression: {
> +    if (C.getExpressionID() >= Expressions.size()) {
> +      if (EC)
> +        *EC = std::make_error_code(std::errc::argument_out_of_domain);
> +      break;
> +    }
> +    const auto &E = Expressions[C.getExpressionID()];
> +    auto LHS = evaluate(E.LHS, EC);
> +    if (EC && *EC)
> +      return 0;
> +    auto RHS = evaluate(E.RHS, EC);
> +    if (EC && *EC)
> +      return 0;
> +    return C.getKind() == Counter::SubtractionExpression ? LHS - RHS
> +                                                         : LHS + RHS;
> +  }
> +  }
> +  return 0;
> +}

Responding to your email about the header file review:
> Ditto for the CounterMappingContext. Let’s come back to that later.
> 
> Okay. However, it's used in the implementation of the C api in this library.

Good point. I’m not strongly opposed to keeping this here for now.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140627/e8448f47/attachment.html>