[PATCH] [InstructionMerge - GVN] hoisting and sinking of equivalent memory instructions in diamonds

Gerolf Hoflehner ghoflehner at apple.com
Tue Apr 29 12:31:23 PDT 2014


Hi,

The attached patch iteratively hoists two loads from the same address out of a diamond (hammock) and merges them
into a single load in the header. Similarly, it sinks two stores to the same address into the tail block and merges
them. The algorithm iterates over the instructions on one side of the diamond and attempts to find a matching
load/store on the other side. It hoists / sinks only when it determines that doing so is safe. It runs as part of
GVN, in the GVN preparation loop. I kept the code as conservative as possible so that it catches the initial cases
we are interested in while keeping code size and complexity in check. The optimization helps hide load latencies
and trigger if-conversion.
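
For readers who want the matching loop spelled out, here is a minimal sketch in Python. It is a toy model, not the
code in the patch: the Inst tuple, the hoist_loads function and the "no store earlier in the block" safety rule are
all assumptions made for illustration (the real pass works on LLVM instructions and has its own safety checks).
Arithmetic instructions are left out, and an address is modeled as a (base value, field index) pair.

```python
from typing import Dict, List, NamedTuple, Tuple


class Inst(NamedTuple):
    op: str     # "load" or "store"
    name: str   # value defined by a load, or value written by a store
    base: str   # value holding the base pointer
    field: int  # field index off the base (stands in for a GEP)


def no_store_before(block: List[Inst], idx: int) -> bool:
    # Hypothetical, deliberately conservative safety rule for this toy model.
    return all(inst.op != "store" for inst in block[:idx])


def rewrite(inst: Inst, renamed: Dict[str, str]) -> Inst:
    # Replace uses of a merged-away load result with the surviving name.
    base = renamed.get(inst.base, inst.base)
    name = renamed.get(inst.name, inst.name) if inst.op == "store" else inst.name
    return inst._replace(base=base, name=name)


def hoist_loads(left: List[Inst], right: List[Inst]
                ) -> Tuple[List[Inst], List[Inst], List[Inst], Dict[str, str]]:
    left, right = list(left), list(right)
    header: List[Inst] = []
    renamed: Dict[str, str] = {}
    changed = True
    while changed:  # iterate: one hoist can expose the next match
        changed = False
        for li, l in enumerate(left):
            if l.op != "load" or not no_store_before(left, li):
                continue
            match = next((ri for ri, r in enumerate(right)
                          if r.op == "load"
                          and (r.base, r.field) == (l.base, l.field)
                          and no_store_before(right, ri)), None)
            if match is None:
                continue
            renamed[right[match].name] = l.name
            header.append(l)          # the merged load lives in the header
            del left[li]
            del right[match]
            right = [rewrite(r, renamed) for r in right]
            changed = True
            break
    return header, left, right, renamed


# Memory instructions of if.then / if.else from the 181.mcf example.
then_block = [Inst("load", "%4", "%node.251", 8), Inst("load", "%5", "%4", 4),
              Inst("load", "%6", "%node.251", 2), Inst("load", "%7", "%6", 11),
              Inst("store", "%add", "%node.251", 11)]
else_block = [Inst("load", "%8", "%node.251", 2), Inst("load", "%9", "%8", 11),
              Inst("load", "%10", "%node.251", 8), Inst("load", "%11", "%10", 4),
              Inst("store", "%sub", "%node.251", 11)]

header, then_rest, else_rest, renamed = hoist_loads(then_block, else_block)
```

In this model the loads of %5 and %11 only become recognizably equal after %10 has been merged into %4, which is
why the loop restarts after every successful hoist.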

The optimization gives gains on some internal benchmarks and about 2% on SPEC mcf.

I have not measured a sizable compile-time impact.

Cheers
Gerolf



Example for the optimization from 181.mcf, refresh_potential():

IR before:
while.body6:                                      ; preds = %while.body6.lr.ph, %if.end
  %checksum.152 = phi i64 [ %checksum.057, %while.body6.lr.ph ], [ %checksum.2, %if.end ]
  %node.251 = phi %struct.node* [ %node.155, %while.body6.lr.ph ], [ %12, %if.end ]
  %orientation = getelementptr inbounds %struct.node* %node.251, i64 0, i32 7
  %3 = load i64* %orientation, align 8, !tbaa !13
  %cmp7 = icmp eq i64 %3, 1
  br i1 %cmp7, label %if.then, label %if.else

if.then:                                          ; preds = %while.body6
  %basic_arc = getelementptr inbounds %struct.node* %node.251, i64 0, i32 8
  %4 = load %struct.arc** %basic_arc, align 8, !tbaa !14
  %cost = getelementptr inbounds %struct.arc* %4, i64 0, i32 4
  %5 = load i64* %cost, align 8, !tbaa !15
  %pred = getelementptr inbounds %struct.node* %node.251, i64 0, i32 2
  %6 = load %struct.node** %pred, align 8, !tbaa !17
  %potential8 = getelementptr inbounds %struct.node* %6, i64 0, i32 11
  %7 = load i64* %potential8, align 8, !tbaa !11
  %add = add nsw i64 %7, %5
  %potential9 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 11
  store i64 %add, i64* %potential9, align 8, !tbaa !11
  br label %if.end

if.else:                                          ; preds = %while.body6
  %pred10 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 2
  %8 = load %struct.node** %pred10, align 8, !tbaa !17
  %potential11 = getelementptr inbounds %struct.node* %8, i64 0, i32 11
  %9 = load i64* %potential11, align 8, !tbaa !11
  %basic_arc12 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 8
  %10 = load %struct.arc** %basic_arc12, align 8, !tbaa !14
  %cost13 = getelementptr inbounds %struct.arc* %10, i64 0, i32 4
  %11 = load i64* %cost13, align 8, !tbaa !15
  %sub = sub nsw i64 %9, %11
  %potential14 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 11
  store i64 %sub, i64* %potential14, align 8, !tbaa !11
  %inc = add nsw i64 %checksum.152, 1
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then
  %checksum.2 = phi i64 [ %checksum.152, %if.then ], [ %inc, %if.else ]
  %child15 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 3
  %12 = load %struct.node** %child15, align 8, !tbaa !12
  %tobool = icmp eq %struct.node* %12, null
  br i1 %tobool, label %while.cond5.while.cond16.preheader_crit_edge, label %while.body6

IR after:
while.body6:                                      ; preds = %while.body6.lr.ph, %if.end
  %checksum.152 = phi i64 [ %checksum.057, %while.body6.lr.ph ], [ %checksum.2, %if.end ]
  %node.251 = phi %struct.node* [ %node.155, %while.body6.lr.ph ], [ %13, %if.end ]
  %orientation = getelementptr inbounds %struct.node* %node.251, i64 0, i32 7
  %3 = load i64* %orientation, align 8, !tbaa !13
  %cmp7 = icmp eq i64 %3, 1
  %4 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 8
  %5 = load %struct.arc** %4, align 8, !tbaa !14
  %6 = getelementptr inbounds %struct.arc* %5, i64 0, i32 4
  %7 = load i64* %6, align 8, !tbaa !15
  %8 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 2
  %9 = load %struct.node** %8, align 8, !tbaa !17
  %10 = getelementptr inbounds %struct.node* %9, i64 0, i32 11
  %11 = load i64* %10, align 8, !tbaa !11
  br i1 %cmp7, label %if.then, label %if.else

if.then:                                          ; preds = %while.body6
  %add = add nsw i64 %11, %7
  br label %if.end

if.else:                                          ; preds = %while.body6
  %sub = sub nsw i64 %11, %7
  %inc = add nsw i64 %checksum.152, 1
  br label %if.end

if.end:                                           ; preds = %if.else, %if.then
  %add.sink = phi i64 [ %sub, %if.else ], [ %add, %if.then ]
  %checksum.2 = phi i64 [ %checksum.152, %if.then ], [ %inc, %if.else ]
  %12 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 11
  store i64 %add.sink, i64* %12, align 8, !tbaa !11
  %child15 = getelementptr inbounds %struct.node* %node.251, i64 0, i32 3
  %13 = load %struct.node** %child15, align 8, !tbaa !12
  %tobool = icmp eq %struct.node* %13, null
  br i1 %tobool, label %while.cond5.while.cond16.preheader_crit_edge, label %while.body6
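
The %add.sink phi in if.end above is what store sinking produces: the two stores to field 11 of %node.251 become
one store in the tail, fed by a phi of the two stored values. A minimal Python sketch of that step follows. It is
a toy model under stated assumptions, not the code in the patch: the Store tuple and sink_trailing_stores are
hypothetical names, plain strings stand for instructions that do not touch memory, and the only check is that the
last store on each side writes the same (base, field) address.

```python
from typing import List, NamedTuple, Optional, Tuple, Union


class Store(NamedTuple):
    value: str  # value being stored
    base: str   # value holding the base pointer
    field: int  # field index off the base (stands in for a GEP)


Item = Union[str, Store]  # str = non-memory instruction (assumption)


def last_store(block: List[Item]) -> Optional[int]:
    for i in range(len(block) - 1, -1, -1):
        if isinstance(block[i], Store):
            return i
    return None


def sink_trailing_stores(then_block: List[Item], else_block: List[Item],
                         then_label: str, else_label: str
                         ) -> Optional[Tuple[List[Item], List[Item], List[Item]]]:
    ti, ei = last_store(then_block), last_store(else_block)
    if ti is None or ei is None:
        return None
    t, e = then_block[ti], else_block[ei]
    if (t.base, t.field) != (e.base, e.field):
        return None  # different addresses: nothing to merge
    phi_name = t.value + ".sink"
    phi = f"{phi_name} = phi [ {e.value}, {else_label} ], [ {t.value}, {then_label} ]"
    new_then = then_block[:ti] + then_block[ti + 1:]
    new_else = else_block[:ei] + else_block[ei + 1:]
    return new_then, new_else, [phi, Store(phi_name, t.base, t.field)]


# if.then / if.else after load hoisting, as in the "IR after" listing.
then_block = ["%add = add nsw i64 %11, %7", Store("%add", "%node.251", 11)]
else_block = ["%sub = sub nsw i64 %11, %7", Store("%sub", "%node.251", 11),
              "%inc = add nsw i64 %checksum.152, 1"]

result = sink_trailing_stores(then_block, else_block, "%if.then", "%if.else")
```

Note that the store in if.else is not the last instruction of its block; in this model it can still be sunk
because the instruction after it is assumed not to access memory.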
-------------- next part --------------
A non-text attachment was scrubbed...
Name: instruction_merge.patch
Type: application/octet-stream
Size: 15245 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140429/dabfd8a9/attachment.obj>