[LLVMdev] Lower CFI IDs Using Target Intrinsic
Joseph Battaglia
jbattagl at andrew.cmu.edu
Mon Mar 3 11:06:45 PST 2014
Hello,
I’m a newbie here, working on a project to enforce Control Flow Integrity (CFI) on programs compiled with LLVM. We’re using LLVM 3.3 so we can leverage poolalloc's DSA analysis. Ideally this will be as target-independent as possible, but our primary target is ARM. One of our passes requires inserting different i32 IDs at various points in the code we’re compiling. As far as I can tell, it’s impossible to do this with just LLVM IR, so we’re looking into ways of getting these IDs through CodeGen.
One thing that looked promising is the function “prefix” value in LLVM 3.4, which is able to emit a global value into the asm. This is the right idea, except that we need it at arbitrary points in the code. We then looked at defining a custom intrinsic function (@llvm.cfiid) that we can insert into the IR and then lower to assembly. That didn’t seem to be exactly what we wanted either, because the asm that is generated has to be target-dependent. We’ve checked out the poolalloc/SAFECode projects and there are some helpful analysis tools, but we didn’t find anything relevant to ID lowering.
Our current thrust is to define a custom target intrinsic function (@llvm.arm.cfiid) that we can insert into the IR and lower using a definition in the ARMInstrInfo.td file. Right now, I’m trying to define the pattern and instruction in that file. At first, I just inserted a pattern to lower our intrinsic into a “trap” instruction, which worked fine:
/* Code in IR/IntrinsicsARM.td */
/* Note, I’m not positive that IntrNoReturn is correct here, but IntrNoMem type wouldn’t lower to an SDNode because of lack of “results” */
def int_arm_cfiid : Intrinsic<[], [llvm_i32_ty], [IntrNoReturn]>;
…
/* Code in Target/ARM/ARMInstrInfo.td */
def : Pat<(int_arm_cfiid (i32 imm)),
          (TRAP)>;
...
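For context, here is a rough sketch of how I'd expect a call to the intrinsic to look in the IR once our pass has inserted it. The function name @example and the ID value are made up for illustration, and the declaration assumes the int_arm_cfiid definition above, so treat this as an assumption rather than output from a working build:

```llvm
; Hypothetical IR after our pass has run; assumes int_arm_cfiid is
; defined as in IntrinsicsARM.td above.
declare void @llvm.arm.cfiid(i32)

define void @example() {
entry:
  ; 305419896 == 0x12345678, a made-up ID that we want emitted as raw bytes
  call void @llvm.arm.cfiid(i32 305419896)
  ret void
}
```

With the trap pattern above, a call like this would lower to a plain trap instruction, and the i32 ID itself is dropped.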
Next, I’m trying to create my own “AXI” definition based on the TRAP definition, and then put that into the pattern. I admit that I don’t fully grok the tablegen syntax, so a lot of what I’ve been doing is trial and error, and based on examples in other *.td files.
Here’s what I think I’m shooting for...
/* Code in Target/ARM/ARMInstrInfo.td */
def ARMCFIID : AXI<(outs), (ins i32imm:$opt), MiscFrm, NoItinerary,
                   "cfiid", "\t$opt", [(int_arm_cfiid i32imm:$opt)]>,
               Requires<[IsARM]> {
  bits<32> opt;
  let Inst{31-0} = opt;
}
...
I realize this is very wrong, but just to give you an idea of what I’m trying to do… basically take the i32 param of the intrinsic and encode it as raw bytes. Obviously, this is broken…
TL;DR:
What’s the best way to lower an IR i32 into code as raw bytes?
If an Intrinsic is the answer, can it be done entirely in the TableGen files or do I need to do some SDNode stuff as well?
If a TargetIntrinsic is the answer, what’s the proper syntax to define an ARM instruction and match it with my intrinsic pattern?
Sorry if this is pretty basic stuff… I’ve been looking at the archives and couldn’t find any other threads that worked for me.
Also, I noticed that there is an llvm-devs Google group as well. Is it a faux pas to cross-post to that list, or are these lists disjoint enough that it wouldn’t be spammy?
Thanks,
Joe
--
Joseph Battaglia
M.S. Information Security '14
Information Networking Institute
Carnegie Mellon University
jabat at cmu.edu