<table border="1" cellspacing="0" cellpadding="8">

    <tr>

        <th>Issue</th>

        <td>

            <a href=https://github.com/llvm/llvm-project/issues/115333>115333</a>

        </td>

    </tr>

    <tr>

        <th>Summary</th>

        <td>

            [AArch64] BOLT does not support SPE branch data

        </td>

    </tr>

    <tr>

      <th>Labels</th>

      <td>

            BOLT

      </td>

    </tr>

    <tr>

      <th>Assignees</th>

      <td>

      </td>

    </tr>

    <tr>

      <th>Reporter</th>

      <td>

          paschalis-mpeis

      </td>

    </tr>

</table>

<pre>

    ## Purpose

In an upcoming Linux kernel release, perf will gain the ability to process additional branch entries from Arm's Statistical Profiling Extension ([SPE](https://developer.arm.com/documentation/101136/22-1-3/MAP/Arm-Statistical-Profiling-Extension--SPE-)). At the moment, those changes are part of [perf-tools-next](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next).

The purpose of this issue is to make the community aware that BOLT will need some changes to take advantage of this additional information. Below, I briefly explain what the extra information is, how to check that the right perf version is in use, and how the information might be used in the future.

## What extra branch information will be provided

With the upcoming update, perf will be able to process [all sampled branches](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=c1b67c85108f99af0a80aa9e59a2b94ad95428d7), not just branch misses. Each branch sample will point to the [correct target](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=19966d792b9e6b055aeb2f0e573b4d4573d5e15b) (ip → to_ip).

In short, with SPE one can get a statistical sample of the branches, where each event has a source and a destination (fields ip → addr).
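
As a rough illustration of the ip → addr pairing, the sketch below extracts (source, destination) pairs from text shaped like `perf script -F event,ip,addr` output. The `branch: <ip> => <addr>` line layout and the `spe_branch_pairs` helper name are assumptions made for this example; the real output may be laid out differently.

```bash
# Hypothetical helper: reads lines such as "branch: <ip> => <addr>" on stdin
# (layout assumed for illustration) and prints "<source> <destination>".
spe_branch_pairs() {
    awk '
        {
            n = 0
            for (i = 1; i <= NF; i++) {
                # Keep only bare hexadecimal tokens; skip the event name and "=>".
                if ($i ~ /^(0x)?[0-9a-fA-F]+$/) { addr[++n] = $i }
            }
            # A usable sample has exactly one source and one destination.
            if (n == 2) { print addr[1], addr[2] }
        }'
}
```

It could be fed directly from perf, for example `perf script -F event,ip,addr | spe_branch_pairs`, provided the output matches the assumed layout.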

## Ensuring the updated perf version is used

Currently, one can use the updated perf by compiling [perf-tools-next](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next).

The relevant patches have already been merged there: [#patch1](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=19966d792b9e6b055aeb2f0e573b4d4573d5e15b), [#patch2](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=c1b67c85108f99af0a80aa9e59a2b94ad95428d7), [#patch3](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=edff8dad3f9a483259140fb814586b39da430a38), [#patch4](https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/commit/?h=perf-tools-next&id=35f5aa9ccc83f4a4171cdb6ba023e514e2b2ecff).

A couple of checks need to be done:

1. Verify that the machine has SPE available. If not, enable SPE (see [guide](https://developer.arm.com/documentation/ka005362/1-0)).

2. Confirm that perf is on the correct version and can process all branches. It should report 'branch' events instead of 'branch-miss' events.

Both checks can be performed using:

```bash

perf record -e arm_spe/branch_filter=1/u 2>/dev/null -- ls . \
        && (perf script -F event,ip,addr 2> /dev/null | grep -v branch-miss -q \
        && echo "SUCCESS: perf can process all SPE branch samples" \
        || echo "FAILURE: perf cannot process all SPE branch samples") \
        || echo "FAILURE: SPE is not available"

```

## How the additional information may be used

There are two possible approaches to supporting SPE in BOLT. Some points on how this information could map onto existing BOLT infrastructure are discussed briefly below.

### Using the BasicAggregation format:

For this format, one could parse the source and destination of each branch and create two simple events. BOLT can then process those as normal under the `-nl` flag, using an accurate sample of branch locations with hotness counts for the source and destination blocks.
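
A minimal sketch of the counting idea, not BOLT's actual implementation: every sampled branch contributes one simple event at its source and one at its destination. The `spe_basic_counts` helper and its `<source> <destination>` input layout are hypothetical.

```bash
# Hypothetical helper: reads "<source> <destination>" pairs on stdin and
# prints "<address> <hit count>", counting both ends of every branch.
spe_basic_counts() {
    awk '
        NF == 2 { hits[$1]++; hits[$2]++ }
        END     { for (a in hits) print a, hits[a] }
    ' | LC_ALL=C sort
}
```

For example, the pairs `aaaa1000 aaaa2000` and `aaaa3000 aaaa1000` would yield a count of 2 for `aaaa1000` and 1 for each of the other two addresses.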

### Using the LBR format:

For this format, one could convert each source and destination pair into the LBR format. The trace will have a shallow depth of 2, namely { 'destination', 'source' }, matching the expected branch history.

The `flags` field in `perf script` could be enabled to distinguish between branch misses and hits. There seems to be no point in using branch fall-through information because of how SPE operates: in contrast with LBR, SPE is a non-contiguous statistical sampling, so consecutive samples are unrelated. In other words, ignoring fall-throughs only drops information that BOLT cannot infer from SPE samples anyway.
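
To make the conversion concrete, the sketch below turns each sampled branch into a single stand-alone `from/to/flag` entry, loosely modelled on the entries that `perf script -F brstack` prints. The `spe_to_lbr_entries` helper, the three-column input, and the `miss` marker are assumptions for illustration; how SPE samples expose mispredictions in the `flags` field would need to be checked against real output.

```bash
# Hypothetical helper: reads "<source> <destination> [miss]" on stdin and
# prints one "0x<from>/0x<to>/<M|P>" entry per sampled branch.
spe_to_lbr_entries() {
    awk '
        NF >= 2 {
            src = $1; dst = $2
            # Accept addresses with or without a leading 0x.
            sub(/^0x/, "", src); sub(/^0x/, "", dst)
            # M marks a mispredicted branch, P a predicted one (assumed marker).
            flag = ($3 == "miss") ? "M" : "P"
            printf "0x%s/0x%s/%s\n", src, dst, flag
        }'
}
```

Each entry is emitted on its own line because SPE samples are independent, so no fall-through is implied between consecutive entries.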

</pre>
