[llvm] [X86][MC] Reject out-of-range segment and debug registers encoded with APX (PR #82584)
Timothy Herchen via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 21 21:53:47 PST 2024
https://github.com/anematode created https://github.com/llvm/llvm-project/pull/82584
Fixes #82557. The APX specification states that the high bits in REX2 used to encode GPRs can also be used to encode control and debug registers, although all such encodings raise #UD. Therefore, when disassembling, we reject attempts to create control or debug registers with an index of 16 or more.
See page 22 of the [specification](https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html):
> Note that the R, X and B register identifiers can also address non-GPR register types, such as vector registers, control registers and debug registers. When any of them does, the highest-order bits REX2.R4, REX2.X4 or REX2.B4 are generally ignored, except when the register being addressed is a control or debug register. [...] The exception is that REX2.R4 and REX2.R3 [*sic*] are not ignored when the R register identifier addresses a control or debug register. Furthermore, if any attempt is made to access a non-existent control register (CR*) or debug register (DR*) using the REX2 prefix and one of the following instructions:
“MOV CR*, r64”, “MOV r64, CR*”, “MOV DR*, r64”, “MOV r64, DR*”. #UD is raised.
The invalid encodings are 64-bit only because `0xd5` is a valid instruction in 32-bit mode.
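
As a rough illustration of why the two byte sequences added to the test file are rejected, the sketch below splits a REX2 payload byte into its fields and builds the 5-bit register index from REX2.R4, REX2.R3 and ModRM.reg. The helper names (`Rex2Fields`, `decodeRex2Payload`, `regIndex`, `isValidControlOrDebugIndex`) are hypothetical and not part of LLVM, and the payload bit layout (M0 R4 X4 B4 W R3 X3 B3, high to low) is an assumption based on my reading of the APX specification. Only the `index > 15` check mirrors the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for illustration only; not an LLVM structure.
struct Rex2Fields {
  bool M0, R4, X4, B4, W, R3, X3, B3;
};

// Split the REX2 payload byte (the byte that follows 0xD5 in 64-bit mode).
// Assumed bit layout, from bit 7 down to bit 0: M0 R4 X4 B4 W R3 X3 B3.
inline Rex2Fields decodeRex2Payload(uint8_t payload) {
  Rex2Fields f;
  f.M0 = (payload & 0x80) != 0;
  f.R4 = (payload & 0x40) != 0;
  f.X4 = (payload & 0x20) != 0;
  f.B4 = (payload & 0x10) != 0;
  f.W = (payload & 0x08) != 0;
  f.R3 = (payload & 0x04) != 0;
  f.X3 = (payload & 0x02) != 0;
  f.B3 = (payload & 0x01) != 0;
  return f;
}

// Build the 5-bit register index: R4 and R3 extend the 3-bit ModRM.reg field.
inline unsigned regIndex(const Rex2Fields &f, uint8_t modrm) {
  return (unsigned(f.R4) << 4) | (unsigned(f.R3) << 3) | ((modrm >> 3) & 7u);
}

// Same condition as the patch: control/debug register indices above 15
// are treated as invalid.
inline bool isValidControlOrDebugIndex(unsigned index) { return index <= 15; }

// Walk through the test bytes 0xd5,0xc5,0x20,0xef from this patch.
inline void checkPatchTestVector() {
  Rex2Fields f = decodeRex2Payload(0xC5); // payload byte after 0xD5
  unsigned index = regIndex(f, 0xEF);     // ModRM = 0xEF, reg field = 5
  assert(index == 29);                    // 16 (R4) + 8 (R3) + 5
  assert(!isValidControlOrDebugIndex(index));
}
```

With payload `0xc5`, both R4 and R3 are set, so ModRM `0xef` selects register index 29; the `0x20` and `0x21` opcodes would then name CR29 and DR29, neither of which exists.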
From 17603fa1c931388808768ee1a458ceda5d8590f4 Mon Sep 17 00:00:00 2001
From: Timothy Herchen <timothy.herchen at gmail.com>
Date: Wed, 21 Feb 2024 21:37:43 -0800
Subject: [PATCH] [X86][MC] Reject out-of-range segment and debug registers
encoded with APX
The APX specification states that the high bits in REX2 used to encode GPRs can also be used to encode control and debug registers, although all such encodings raise #UD. Therefore, when disassembling, we reject attempts to create control or debug registers with an index of 16 or more.
---
llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp | 4 ++++
llvm/test/MC/Disassembler/X86/x86-64-err.txt | 4 ++++
2 files changed, 8 insertions(+)
diff --git a/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp b/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
index 5f852613610664..dbc2cef39d8682 100644
--- a/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
+++ b/llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp
@@ -819,8 +819,12 @@ static int readModRM(struct InternalInstruction *insn) {
         *valid = 0; \
       return prefix##_ES + (index & 7); \
     case TYPE_DEBUGREG: \
+      if (index > 15) \
+        *valid = 0; \
       return prefix##_DR0 + index; \
     case TYPE_CONTROLREG: \
+      if (index > 15) \
+        *valid = 0; \
       return prefix##_CR0 + index; \
     case TYPE_MVSIBX: \
       return prefix##_XMM0 + index; \
diff --git a/llvm/test/MC/Disassembler/X86/x86-64-err.txt b/llvm/test/MC/Disassembler/X86/x86-64-err.txt
index 3eca239e60f5c7..bd744790fe33d5 100644
--- a/llvm/test/MC/Disassembler/X86/x86-64-err.txt
+++ b/llvm/test/MC/Disassembler/X86/x86-64-err.txt
@@ -13,3 +13,7 @@
0xc4,0xe2,0xfd,0x1a,0x08
# 64: invalid instruction encoding
0xc4,0xe3,0xfd,0x39,0xc5,0x01
+# 64: invalid instruction encoding
+0xd5,0xc5,0x20,0xef
+# 64: invalid instruction encoding
+0xd5,0xc5,0x21,0xef
\ No newline at end of file