[PM] Flesh out almost all of the late loop passes.

With this the per-module pass pipeline is *extremely* close to the legacy PM. The missing pieces are: - PruneEH (or some equivalent) - ArgumentPromotion - LoopLoadElimination - LoopUnswitch I'm going to work through those in essentially that order but this seems like a worthwhile incremental step toward the end state. One difference in what I have here from the legacy PM is that I've consolidated some of the per-function passes at the very end of the pipeline into the main optimization function pipeline. The intervening passes are *really* uninteresting and so this seems very likely to have any effect other than minor improvement to locality. Note that there are still some failures in the test suite, but the compiler doesn't crash or assert. Differential Revision: https://reviews.llvm.org/D29114 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293241 91177308-0d34-0410-b5e6-96231b3b80d8
author: Chandler Carruth <chandlerc@gmail.com> 2017-01-27 00:50:21 +0000
committer: Chandler Carruth <chandlerc@gmail.com> 2017-01-27 00:50:21 +0000
commit: f8e4cd6c85e17f68316b93b556f6885e759f3539 (patch)
tree: 576a1a02c93f77ad56ccbbe09652c5cf94d3b7c8 /lib/Passes
parent: 49c910dae19133b77acefc094e9b203dcaf609a2 (diff)
1 files changed, 37 insertions, 9 deletions
diff --git a/lib/Passes/PassBuilder.cpp b/lib/Passes/PassBuilder.cpp
index fcfe3ce9b35e..d9ce15dc377a 100644
--- a/lib/Passes/PassBuilder.cpp
+++ b/lib/Passes/PassBuilder.cpp
@@ -491,32 +491,60 @@ PassBuilder::buildPerModuleDefaultPipeline(OptimizationLevel Level,
   // Optimize the loop execution. These passes operate on entire loop nests
   // rather than on each loop in an inside-out manner, and so they are actually
   // function passes.
+
+  // First rotate loops that may have been un-rotated by prior passes.
+  OptimizePM.addPass(createFunctionToLoopPassAdaptor(LoopRotatePass()));
+
+  // Distribute loops to allow partial vectorization.  I.e. isolate dependences
+  // into separate loop that would otherwise inhibit vectorization.  This is
+  // currently only performed for loops marked with the metadata
+  // llvm.loop.distribute=true or when -enable-loop-distribute is specified.
   OptimizePM.addPass(LoopDistributePass());
+
+  // Now run the core loop vectorizer.
   OptimizePM.addPass(LoopVectorizePass());
+
   // FIXME: Need to port Loop Load Elimination and add it here.
+
+  // Cleanup after the loop optimization passes.
   OptimizePM.addPass(InstCombinePass());
 
+
+  // Now that we've formed fast to execute loop structures, we do further
+  // optimizations. These are run afterward as they might block doing complex
+  // analyses and transforms such as what are needed for loop vectorization.
+
   // Optimize parallel scalar instruction chains into SIMD instructions.
   OptimizePM.addPass(SLPVectorizerPass());
 
-  // Cleanup after vectorizers.
+  // Cleanup after all of the vectorizers.
   OptimizePM.addPass(SimplifyCFGPass());
   OptimizePM.addPass(InstCombinePass());
 
   // Unroll small loops to hide loop backedge latency and saturate any parallel
-  // execution resources of an out-of-order processor.
-  // FIXME: Need to add once loop pass pipeline is available.
-
-  // FIXME: Add the loop sink pass when ported.
-
-  // FIXME: Add cleanup from the loop pass manager when we're forming LCSSA
-  // here.
+  // execution resources of an out-of-order processor. We also then need to
+  // clean up redundancies and loop invariant code.
+  // FIXME: It would be really good to use a loop-integrated instruction
+  // combiner for cleanup here so that the unrolling and LICM can be pipelined
+  // across the loop nests.
+  OptimizePM.addPass(createFunctionToLoopPassAdaptor(LoopUnrollPass::create()));
+  OptimizePM.addPass(InstCombinePass());
+  OptimizePM.addPass(createFunctionToLoopPassAdaptor(LICMPass()));
 
   // Now that we've vectorized and unrolled loops, we may have more refined
   // alignment information, try to re-derive it here.
   OptimizePM.addPass(AlignmentFromAssumptionsPass());
 
-  // ADd the core optimizing pipeline.
+  // LoopSink pass sinks instructions hoisted by LICM, which serves as a
+  // canonicalization pass that enables other optimizations. As a result,
+  // LoopSink pass needs to be a very late IR pass to avoid undoing LICM
+  // result too early.
+  OptimizePM.addPass(LoopSinkPass());
+
+  // And finally clean up LCSSA form before generating code.
+  OptimizePM.addPass(InstSimplifierPass());
+
+  // Add the core optimizing pipeline.
   MPM.addPass(createModuleToFunctionPassAdaptor(std::move(OptimizePM)));
 
   // Now we need to do some global optimization transforms.
author	Chandler Carruth <chandlerc@gmail.com>	2017-01-27 00:50:21 +0000
committer	Chandler Carruth <chandlerc@gmail.com>	2017-01-27 00:50:21 +0000
commit	f8e4cd6c85e17f68316b93b556f6885e759f3539 (patch)
tree	576a1a02c93f77ad56ccbbe09652c5cf94d3b7c8 /lib/Passes
parent	49c910dae19133b77acefc094e9b203dcaf609a2 (diff)