RKSimon added a comment.

I think this might have been to avoid the high cost of calling combineX86ShufflesRecursively again. Can you see any compile-time diffs, possibly in one of the vector-shuffle-*.ll test files?

https://reviews.llvm.org/D49569