[llvm-dev] Representations of IR in the output of opt

Fri May 24 12:52:55 PDT 2019

Hi LLVM,

I'm currently setting up some tools to investigate the influence of the 
order of optimization passes on the performance of compiled programs; 
nothing exceptional here.

I noticed something inconvenient with opt, namely that splitting one 
invocation into two piped invocations does not always give the same output:

% llvm-stress > stress.ll
% opt -dse -verify -dce stress.ll -o stress-1.bc
% opt -dse stress.ll | opt -dce -o stress-2.bc
% diff stress-{1,2}.bc
Binary files stress-1.bc and stress-2.bc differ

The difference seems meaningful; it's ~180 bytes out of ~1400 bytes of 
output in my random case. I can't decode it, however, because 
disassembling the bitcode produces identical text files, even with 
annotations (!).
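
For reference, a minimal sketch of how such a byte count can be obtained with the standard cmp utility. The helper name count_diff_bytes is hypothetical and not part of any LLVM tool. Note that this compares position by position, so a single inserted byte makes everything after it count as different, and the figure is only an upper bound on the real change.

```shell
# Hypothetical helper: count the byte positions at which two files differ.
# `cmp -l` prints one line per differing byte; its exit status of 1 only
# means "the files differ", so it is not treated as an error here.
# Usage: count_diff_bytes stress-1.bc stress-2.bc
count_diff_bytes() {
    cmp -l "$1" "$2" 2>/dev/null | wc -l | tr -d ' '
}
```

To see which part of the files is affected rather than just how many bytes, running llvm-bcanalyzer -dump on each file and diffing the two dumps may help, assuming that tool is available in the installation.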

I made sure that the pass sequence for [-dse -verify -dce] is the 
concatenation of the individual sequences; this falls into place naturally 
because -dce has no dependencies. The -verify pass is there to force two 
separate function pass managers, just in case.

Now if I do the same thing but staying in text format, I get the same IR 
(up to module name):

% opt -S -dse -verify -dce stress.ll -o stress-1.ll
% opt -S -dse stress.ll | opt -S -dce -o stress-2.ll
% diff -y --suppress-common-lines stress-{1,2}.ll
; ModuleID = 'stress.ll'	|	; ModuleID = '<stdin>'
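
The "up to module name" comparison above can also be scripted. This is a minimal sketch, assuming the only line expected to differ is the "; ModuleID = ..." comment; the function name same_ir_text is hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if two textual IR files are identical
# once their "; ModuleID = ..." comment lines are dropped.
# Usage: same_ir_text stress-1.ll stress-2.ll && echo "same IR"
same_ir_text() {
    a=$(grep -v '^; ModuleID = ' "$1")
    b=$(grep -v '^; ModuleID = ' "$2")
    [ "$a" = "$b" ]
}
```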

Is there a specific behavior of opt that could explain this situation? 
What kind of difference could there be in the bitcode files that is 
lost in translation to text format?

Cheers,
Sébastien Michelland