[llvm-dev] Bugpoint Redesign

Fri Jun 7 15:01:43 PDT 2019

On 6/7/19 4:19 PM, Diego Treviño via llvm-dev wrote:
Hey all,

I wanted to share a proposal<https://docs.google.com/document/d/171ecPTeXw68fbCghdGw_NPBouWvmvUX8vePlbhhHEdA/edit?usp=sharing> to revamp the current go-to IR debugging tool: Bugpoint. i'd love to hear any feedback or general thoughts.

Here's the markdown version of the doc:
---
# Bugpoint Redesign
Author: Diego Treviño (diegotf at google.com<mailto:diegotf at google.com>)

Date: 2016-06-05

Status: Draft

## Introduction
As use of bugpoint has grown several areas of improvement have been identified through years of use: confusing to use, slow, it doesn’t always produce high quality test cases, etc. This document proposes a new approach with a narrower focus: minimization of IR test cases.

## Proposed New Design

### Narrow focus: test-case reduction
The main focus will be a code reduction strategy to obtain much smaller test cases that still have the same property as the original one. This will be done via classic delta debugging and by adding some IR-specific reductions (e.g. replacing globals, removing unused instructions, etc), similar to what already exists, but with more in-depth minimization.

Granted, if the community differs on this proposal, the legacy code could still be present in the tool, but with the caveat of still being documented and designed towards delta reduction.

### Command-Line Options
We are proposing to reduce the plethora of bugpoint’s options to just two: an interesting-ness test and the arguments for said test, similar to other delta reduction tools such as CReduce, Delta, and Lithium; the tool should feel less cluttered, and there should also be no uncertainty about how to operate it.

The interesting-ness test that’s going to be run to reduce the code is given by name:
        `--test=<test_name>`
If a `--test`  option is not given, the program exits; this option is similar to bugpoint’s current `-compile-custom` option, which lets the user run a custom script.

The interesting-ness test would be defined as a script that returns 0 when the IR achieves a user-defined behaviour (e.g. failure to compile on clang) and a nonzero value when otherwise. Leaving the user the freedom to determine what is and isn’t interesting to the tool, and thus, streamlining the process of reducing a test-case.

If the test accepts any arguments (excluding the input ll/bc file), they are given via the following flag:
        `--test_args=<test_arguments>`
If unspecified, the test is run as given. It’s worth noting that the input file would be passed as a parameter to the test, similar how `-compile-custom` currently operates.

### Implementation
The tool would behave similar to CReduce’s functionality in that it would have a list of passes that try to minimize the given test-case. We should be able to modularize the tool’s behavior, as well as making it easier to maintain and expand.

The first version of this redesign would try to:

* Split the code into chunks and discard those that fail the given test
* Discard functions, instructions and metadata that don’t influence the interesting-ness test
* Remove unused parameters from functions
* Eliminate unvisited conditional paths

Do you plan on doing this by running with instrumentation?

* Rename variables to more regular ones (such as “a”, “b”, “c”, etc.)

Great.

Once these passes are implemented, more meaningful reductions (such as type reduction) would be added to the tool, to even further reduce IR.

Thanks for taking this on. We will all benefit from improvements to bugpoint.

One concern that I have is that, from personal experience, the ability for bugpoint to reduce the set of optimization passes applied in order to reproduce a bug is extremely helpful. I understand your desire to decouple the logic somewhat, and maybe there's some way to generalize that functionality by enabling simultaneous delta reduction on some secondary inputs (some of which may happen to be a pass list), but I'd like to see us somehow retain that capability to isolate the problematic set of transformations.

bugpoint currently has the ability to debug miscompiles by splitting the code into a "known good" set of functions (which is puts into a separate library) and the remainder of the code (which, in theory, is the smallest part of the code necessary to reproduce the bug). This has also proven useful in the past. Is this something you intend to keep?

My largest set of problems with bugpoint has been that bugpoint's logic for doing things like loop extraction will themselves often crash, and also, there's no easy way to start with multiple input files (e.g., what you start with from a program with multiple source files).

 -Hal

## Background on historical bugpoint issues

### Root Cause Analysis
Presently, bugpoint takes a long time to find the source problem in a given IR file, mainly due to the fact that it tries to debug the input by running various strategies to classify the bug, which in turn run multiple optimizer and compilation passes over the input, taking up a lot of time. Furthermore, when the IR crashes, it tries to reduce it by performing some sub-optimal passes (e.g. a lot of unreachable blocks), and sometimes even fails to minimize at all.

### "Quirky" Interface
Bugpoint’s current interface overwhelms and confuses the user, the help screen alone ends up confusing rather providing guidance, as seen below:

![Bugpoint's help option showcase](https://lh6.googleusercontent.com/sbpaSVHzpVVZKKAgHL9gvfzTWdgh3ju0KiDYql6WmWZfDYrdauOJMcuo9PP_V1dq8JQfMHOSKTv3lJcSpVytUyU8r5tJ2KTlGB0b2ve7jsZ3nVX8K8ItAbsA0JWkFKw67VJnq99m)

And, not only are there numerous features and options, but some of them also work in unexpected ways and most of the time the user ends up using a custom script. Pruning and simplifying the interface will be worth considering in order to make the tool more useful in the general case and easier to maintain.

--
Cheers,
Diego

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190607/668db684/attachment.html>