[PATCH] Initial code coverage mapping data structures, and reader and writers + C interface for ProfileData library

Alex L arphaman at gmail.com
Fri Jun 27 16:45:00 PDT 2014


2014-06-27 16:07 GMT-07:00 Bob Wilson <bob.wilson at apple.com>:

>
> On Jun 25, 2014, at 4:32 PM, Alex L <arphaman at gmail.com> wrote:
>
> Hi everyone,
> This is a first patch that implements the data structures, readers and
> writers used by the new code coverage system. I added the new code to the
> ProfileData library. I also added a very minimal C api for the ProfileData
> library.
>
>
> Review, part 2…. I’m skipping over the reader and writer code for now.
>
> diff --git a/lib/ProfileData/CoverageMapping.cpp
> b/lib/ProfileData/CoverageMapping.cpp
> new file mode 100644
> index 0000000..33ea866
> --- /dev/null
> +++ b/lib/ProfileData/CoverageMapping.cpp
> @@ -0,0 +1,175 @@
> +//=-- CoverageMapping.cpp - Code coverage mapping support ---------*- C++
> -*-=//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains support for clang's and llvm's instrumentation based
> +// code coverage.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#include "llvm/ProfileData/CoverageMapping.h"
> +
> +using namespace llvm;
> +
> +// Encode the counter in the following way:
> +// Low 2 bits - tag/counter kind.
> +// Remaining bits - Counter/Expression ID.
> +unsigned Counter::Encode() const {
> +  assert(ID <= (std::numeric_limits<unsigned>::max() >> EncodingTagBits));
> +
> +  return unsigned(Kind) | (ID << EncodingTagBits);
> +}
> +
> +bool Counter::Decode(unsigned Value) {
> +  Kind = CounterKind(Value & EncodingTagMask);
> +  ID = Value >> EncodingTagBits;
> +  return false;
> +}
> +
> +CounterExpressionBuilder::CounterExpressionBuilder(unsigned NumCounters) {
> +  Terms.resize(NumCounters);
> +}
>
>
> It looks like NumCounters here refers to the number of counts from the
> profile data and does not include all of the “abstract Counters”. In the
> CounterKind enum, you refer to the former as “CounterValueReference”.
> Perhaps the argument here should be named “NumCounterValues” to help make
> that distinction more clear?
>

Yes, it does refer to the number of counts from profile data. This rename
here is a good idea.


>
> +
> +Counter CounterExpressionBuilder::get(Counter::CounterKind Kind,
> +                                      const CounterExpression &E) {
> +  for (unsigned I = 0, S = Expressions.size(); I < S; ++I) {
> +    if (Expressions[I] == E)
> +      return Counter::getExpression(Kind, I);
> +  }
> +  Expressions.push_back(E);
> +  return Counter::getExpression(Kind, Expressions.size() - 1);
> +}
> +
> +void CounterExpressionBuilder::ExtractTerms(Counter C, int Sign) {
> +  switch (C.getKind()) {
> +  case Counter::Zero:
> +    break;
> +  case Counter::CounterValueReference:
> +    Terms[C.getCounterID()] += Sign;
> +    break;
> +  case Counter::AdditionExpression:
> +    ExtractTerms(Expressions[C.getExpressionID()].LHS, Sign);
> +    ExtractTerms(Expressions[C.getExpressionID()].RHS, Sign);
> +    break;
> +  case Counter::SubtractionExpression:
> +    ExtractTerms(Expressions[C.getExpressionID()].LHS, Sign);
> +    ExtractTerms(Expressions[C.getExpressionID()].RHS, Sign == 1 ? -1 :
> 1);
>
>
> How about just “-Sign”?
>
> +    break;
> +  }
> +}
> +
> +Counter CounterExpressionBuilder::Simplify(Counter ExpressionTree) {
> +  // Gather constant terms
> +  for (auto &I : Terms)
> +    I = 0;
> +  ExtractTerms(ExpressionTree);
> +
> +  Counter C;
> +  // Create additions
> +  // Note: the additions are created first
> +  // to avoid creation of a tree like ((0 - X) + Y)
> +  // instead of (Y - X).
> +  for (unsigned I = 0, S = Terms.size(); I < S; ++I) {
> +    if (Terms[I] <= 0)
> +      continue;
> +    for (int J = 0; J < Terms[I]; ++J) {
> +      if (C.isZero())
> +        C = Counter::getCounter(I);
> +      else
> +        C = get(Counter::AdditionExpression,
> +                CounterExpression(CounterExpression::Add, C,
> +                                  Counter::getCounter(I)));
> +    }
> +  }
> +
> +  // Create subtractions
> +  for (unsigned I = 0, S = Terms.size(); I < S; ++I) {
> +    if (Terms[I] >= 0)
> +      continue;
> +    for (int J = 0; J < (-Terms[I]); ++J)
> +      C = get(Counter::SubtractionExpression,
> +              CounterExpression(CounterExpression::Subtract, C,
> +                                Counter::getCounter(I)));
> +  }
> +  return C;
> +}
> +
> +Counter CounterExpressionBuilder::Add(Counter LHS, Counter RHS) {
> +  return Simplify(get(Counter::AdditionExpression,
> +                      CounterExpression(CounterExpression::Add, LHS,
> RHS)));
> +}
> +
> +Counter CounterExpressionBuilder::Subtract(Counter LHS, Counter RHS) {
> +  return Simplify(
> +      get(Counter::SubtractionExpression,
> +          CounterExpression(CounterExpression::Subtract, LHS, RHS)));
> +}
>
>
> You’re using the same set of Expressions in the CounterExpressionBuilder
> both before and after the simplification. That will leave some unused
> entries. I see that you have code in the writer to only emit the ones that
> are used in the end result, but I’m curious if you’ve measured how often
> you get unused entries.
>
> Have you done anything to evaluate the effectiveness of your
> simplification code? Did you try anything different before settling on the
> current approach? How much does the Simplify function reduce the number of
> counter expressions?
>
>
Yes, there are unused expressions stored in the builder, but the writer
writes out only the ones that are used.
This simplification code brought a lot of improvement when compared to the
old method. It managed to get rid of expression trees like ((X - Y) + Y),
etc. It also reduced the number of emitted mapping regions in some specific
test cases, because now those regions had the same counters (they where
found to be the same because they contained the same simplified expression).
The total reduction of expressions for a function with this simplification
method and the writer only writing the used expressions is very significant
- one test case went from 60 something to around 3 or 4 serialized
expressions.
The unused entries make up the majority of expressions in the builder, so
the additional filtering code in the writer is necessary.


> +
> +void CounterMappingContext::dump(const Counter &C,
> +                                 llvm::raw_ostream &OS) const {
> +  switch (C.getKind()) {
> +  default:
>
>
> Shouldn’t this be “case Counter::Zero”?
>

Thanks, It should, the default case was from a very early version when I
had multiple regions types with zero counters.


>
>

> +    OS << '0';
> +    return;
> +  case Counter::CounterValueReference:
> +    OS << '#' << C.getCounterID();
> +    break;
> +  case Counter::SubtractionExpression:
> +  case Counter::AdditionExpression: {
> +    if (C.getExpressionID() >= Expressions.size())
> +      return;
> +    const auto &E = Expressions[C.getExpressionID()];
> +    OS << '(';
> +    dump(E.LHS);
> +    OS << (C.getKind() == Counter::SubtractionExpression ? " - " : " + ");
> +    dump(E.RHS);
> +    OS << ')';
> +    break;
> +  }
> +  }
> +  if (CounterValues.empty())
> +    return;
> +  std::error_code Error;
> +  auto Value = evaluate(C, Error);
> +  if (Error)
> +    return;
> +  OS << '[' << Value << ']';
> +}
> +
> +int64_t CounterMappingContext::evaluate(const Counter &C,
> +                                        std::error_code *EC) const {
> +  switch (C.getKind()) {
> +  case Counter::Zero:
> +    return 0;
> +  case Counter::CounterValueReference:
> +    if (C.getCounterID() >= CounterValues.size()) {
> +      if (EC)
> +        *EC = std::make_error_code(std::errc::argument_out_of_domain);
> +      break;
> +    }
> +    return CounterValues[C.getCounterID()];
> +  case Counter::SubtractionExpression:
> +  case Counter::AdditionExpression: {
> +    if (C.getExpressionID() >= Expressions.size()) {
> +      if (EC)
> +        *EC = std::make_error_code(std::errc::argument_out_of_domain);
> +      break;
> +    }
> +    const auto &E = Expressions[C.getExpressionID()];
> +    auto LHS = evaluate(E.LHS, EC);
> +    if (EC && *EC)
> +      return 0;
> +    auto RHS = evaluate(E.RHS, EC);
> +    if (EC && *EC)
> +      return 0;
> +    return C.getKind() == Counter::SubtractionExpression ? LHS - RHS
> +                                                         : LHS + RHS;
> +  }
> +  }
> +  return 0;
> +}
>
>
> Responding to your email about the header file review:
>
> Ditto for the CounterMappingContext. Let’s come back to that later.
>>
>
> Okay. However, it's used in the implementation of the C api in this
> library.
>
>
> Good point. I’m not strongly opposed to keeping this here for now.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140627/82fe4ad1/attachment.html>


More information about the llvm-commits mailing list