[PATCH] D49243: [X86] Improved sched models for X86 BT*rr instructions
Clement Courbet via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 18 08:22:58 PDT 2018
courbet added inline comments.
================
Comment at: lib/Target/X86/X86SchedHaswell.td:632
}
def : InstRW<[HWWriteBTRSCmr], (instregex "BT(R|S|C)(16|32|64)mr")>;
----------------
craig.topper wrote:
> RKSimon wrote:
> > @craig.topper @courbet @gchatelet These look completely wrong (and BTmr above) - and Broadwell appears to be missing them as well - any suggestions for the bit tests memory cases?
> Skylake doesn't even have an InstRW for them.
>
> They're also missing from the copy of the database used by IACA that I have. I believe that's where Gadi got most of the info from. I wonder what IACA does if you feed it those instructions.
I can't comment on latencies, because we do not support memory operands yet.
For uops, I have working support in this patch:
https://reviews.llvm.org/D48935
On Haswell, this gives:
```
---
mode: uops
key:
instructions:
- 'BTC64mr RDI i_0x1x i_0x0x R9'
- 'BTC64mr RDI i_0x1x i_0x64x RBX'
- 'BTC64mr RDI i_0x1x i_0x128x RSI'
- 'BTC64mr RDI i_0x1x i_0x192x RCX'
- 'BTC64mr RDI i_0x1x i_0x256x R8'
- 'BTC64mr RDI i_0x1x i_0x320x RDX'
config: ''
cpu_name: haswell
llvm_triple: x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
- { key: '3', value: 1.3771, debug_string: HWPort0 }
- { key: '4', value: 1.8848, debug_string: HWPort1 }
- { key: '5', value: 1.3687, debug_string: HWPort2 }
- { key: '6', value: 0.728, debug_string: HWPort3 }
- { key: '7', value: 1.0025, debug_string: HWPort4 }
- { key: '8', value: 1.6272, debug_string: HWPort5 }
- { key: '9', value: 2.1307, debug_string: HWPort6 }
- { key: '10', value: 0.0002, debug_string: HWPort7 }
error: ''
info: instruction is parallel, repeating a random one.
assembled_snippet: 5349C7C10100000048C7C30100000048C7C60100000048C7C10100000049C7C00100000048C7C2010000004C0FBB0F480FBB5F40480FBBB780000000480FBB8FC00000004C0FBB8700010000480FBB97400100004C0FBB0F480FBB5F40480FBBB780000000480FBB8FC00000004C0FBB8700010000480FBB97400100004C0FBB0F480FBB5F40480FBBB780000000480FBB8FC00000004C0FBB8700010000480FBB97400100004C0FBB0F480FBB5F40480FBBB780000000480FBB8FC00000005BC3
...
```
Other instructions are similar.
This is a bit noisy, unfortunately. It looks like 2*P23 (or maybe P23 + P237, with P7 going unused for some reason?) + 7*P0156 + P4.
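The grouping can be sanity-checked by summing the measured values per port group. A minimal sketch in Python, with the numbers copied from the measurements above; the names `ports` and `group_sum` are made up for this illustration and are not part of any tool:

```python
# Per-port uop measurements copied from the Haswell BTC64mr run above.
ports = {
    "HWPort0": 1.3771,
    "HWPort1": 1.8848,
    "HWPort2": 1.3687,
    "HWPort3": 0.728,
    "HWPort4": 1.0025,
    "HWPort5": 1.6272,
    "HWPort6": 2.1307,
    "HWPort7": 0.0002,
}


def group_sum(names):
    """Sum the measured uops over a group of ports (hypothetical helper)."""
    return sum(ports[name] for name in names)


p0156 = group_sum(["HWPort0", "HWPort1", "HWPort5", "HWPort6"])  # ALU ports
p23 = group_sum(["HWPort2", "HWPort3"])                          # load/AGU ports
p4 = ports["HWPort4"]                                            # store data
p7 = ports["HWPort7"]                                            # store AGU

print(f"P0156 ~ {p0156:.2f}, P23 ~ {p23:.2f}, P4 ~ {p4:.2f}, P7 ~ {p7:.2f}")
```

The group sums land close to 7, 2 and 1 respectively, with P7 essentially at zero. The noise shows up mostly in how the uops are split between the individual ports within each group.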
https://reviews.llvm.org/D49243