[llvm-dev] What semantics does lli provide?
Wilfred Hughes via llvm-dev
llvm-dev at lists.llvm.org
Tue Sep 29 15:39:33 PDT 2015
The behaviour of lli surprised me recently when investigating
out-of-bounds memory access in some LLVM IR.
Consider the code:
declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
define i32 @main() {
entry:
%cells = alloca i8, i32 9
%offset_cell_ptr = getelementptr i8* %cells, i32 0
call void @llvm.memset.p0i8.i32(i8* %offset_cell_ptr, i8 0, i32 9, i32 1, i1 true)
%cell_index_ptr = alloca i32
store i32 0, i32* %cell_index_ptr
; This is an out-of-bounds memory access.
%target_cell_ptr = getelementptr i8* %cells, i32 -1
%target_cell_val = load i8* %target_cell_ptr
%new_target_val = add i8 %target_cell_val, 11
store i8 %new_target_val, i8* %target_cell_ptr
; This memory access is in bounds.
%cell_index5 = load i32* %cell_index_ptr
%current_cell_ptr6 = getelementptr i8* %cells, i32 %cell_index5
; Removing this line stops lli segfaulting:
store i8 0, i8* %current_cell_ptr6
ret i32 0
}
Running the file as follows:
$ lli example.ll
produces a segfault. However, I was surprised to see that removing the
last store is sufficient to prevent the segfault.
I was further surprised to see that *replacing a variable with its
value* also prevents the segfault:
; Changing this line prevents the segfault, even though it should
; be equivalent.
%current_cell_ptr6 = getelementptr i8* %cells, i32 0
It looks to me as though lli is performing some dead code elimination
(DCE) rather than simply executing my code as written.
Looking at the man page for lli, I found the -force-interpreter=true
option, which makes lli use the interpreter even when a JIT is
available. With it, I can reduce my test case whilst preserving the
behaviour:
declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
define i32 @main() {
entry:
%cells = alloca i8, i32 9
%offset_cell_ptr = getelementptr i8* %cells, i32 0
call void @llvm.memset.p0i8.i32(i8* %offset_cell_ptr, i8 0, i32 9, i32 1, i1 true)
%cell_index_ptr = alloca i32
store i32 0, i32* %cell_index_ptr
; This is an out-of-bounds memory access.
%target_cell_ptr = getelementptr i8* %cells, i32 -1
%target_cell_val = load i8* %target_cell_ptr
%new_target_val = add i8 %target_cell_val, 11
store i8 %new_target_val, i8* %target_cell_ptr
ret i32 0
}
Even with the interpreter, the segfault only occurs when the store
instruction is present. In an ideal world, I'd like the out-of-bounds
load, `%target_cell_val = load i8* %target_cell_ptr`, to generate an
error itself, rather than the later store instruction.
The problem with the existing behaviour is that it makes it hard to
bisect a piece of faulty IR to find the source of a problem.
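One way to get a failure at the access itself, independent of how lli
executes the code, might be to emit an explicit bounds check before each
computed access. The following is only a minimal sketch in the same
older typed-pointer syntax as the examples above. The names %index,
%in_bounds, and the labels access and out_of_bounds are hypothetical,
and it assumes that calling @llvm.trap aborts the program in the lli
mode being used.

```llvm
declare void @llvm.trap()

define i32 @main() {
entry:
  %cells = alloca i8, i32 9
  ; Hypothetical index standing in for a computed cell offset.
  %index = add i32 0, -1
  ; Unsigned compare: -1 is 0xFFFFFFFF as an unsigned i32, so a single
  ; test rejects both negative and too-large indices.
  %in_bounds = icmp ult i32 %index, 9
  br i1 %in_bounds, label %access, label %out_of_bounds

access:
  ; Only reached when 0 <= %index < 9.
  %target_cell_ptr = getelementptr i8* %cells, i32 %index
  %target_cell_val = load i8* %target_cell_ptr
  ret i32 0

out_of_bounds:
  ; Fail here, at the faulty access, rather than at some later store.
  call void @llvm.trap()
  unreachable
}
```

With a check like this, the failure no longer depends on whether a later
instruction keeps the load alive, which should make bisecting easier.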
So, my questions are:
(1) Is this expected behaviour of lli? Have I missed the documentation
that describes this?
(2) Is this a sane way to debug LLVM IR?
Wilfred