MLIR Reduce Design
Last Updated: 06/25/2020
This document aims to serve as a guide for the development of a test case reducer tool for the Multi-Level Intermediate Representation compiler framework. It provides a general list of objectives, detailed design aspects and a guide for collaboration.
Test case reducers are important tools in the debugging process of any compiler framework. They serve to reduce the size of a piece of input code that triggers a defined interesting behavior (i.e. a bug) in a specific compiler pass. This speeds up the debugging process by producing more concise test cases for bug reporting.
The LLVM Project currently contains two test case reducing tools at the LLVM level: Bugpoint and LLVM Reduce. The latter was developed with the purpose of being later integrated into Bugpoint. The current MLIR Reduce design adopts important ideas from LLVM Reduce and the mlir-opt tool. It utilizes a similar framework and provides a similar user experience to that of LLVM Reduce and it implements sequential independent passes like the ones in mlir-opt.
The MLIR Reduce tool will attempt to reduce large MLIR test cases into smaller ones while preserving their interesting behavior. The tool will accept a MLIR file containing the initial test case along with a shell script containing the bug reproduction commands and respective checks for the interesting behavior as inputs. It will execute multiple reduction passes aimed at narrowing down different components of the test case with the final goal of outputting a minimized version.
To build the MLIR Reduce Tool we will need:
- Parse the command line arguments
- Parse the MLIR test case into a module
- Set up the testing environment
- Call the different reduction passes
- Output the final reduced test case
- Provide an interface for running the testing script on a MLIR File
- Keep track of the most reduced test case version
- Reduction Passes
- Multiple passes that perform transformations on a given test case
- Include a method to run the Tester’s testing script on an MLIR module
- Include a method to order two test case variants in shortlex order
- Include other utilities for the transformations required by the Reducer Tool that are not currently available in the MLIR framework
- Bug Producer Compiler Pass
- An MLIR compiler pass that can be called using mlir-opt to produce different types of bugs for the purpose of testing the reducer tool
The implementation of MLIR Reduce will follow a set of general guidelines to standardize the behavior of the different reduction passes and the behavior of the tool as a whole.
Total Test Case Ordering
It is important to define a total ordering between all the different reduced test case variants created to determine if the reducer is making progress. For the purpose of standardizing the reduction process, the test cases will be ordered using shortlex ordering where smaller test cases as measured by number of bytes are preferred. In the case that two test cases are of the same size, the number of lines will be used as a tiebreaker. If the number of lines are also the same, the test case that comes first in lexicographical order is preferred.
Each reduction pass will be independent from each other and will have a single well defined reduction objective. The reduction objective for the pass should describe the transformations being performed on the test case and the hierarchical level at which they are performed. The proper definition of these objectives will prevent two different passes from performing transformations with a similar goal. Each pass will accept a Tester object containing the most reduced test case version and an interface to run the interestingness test on the generated reduced variants of the test case as input. This modular approach will allow for different algorithms to be implemented for different types of transformations. The passes will iteratively perform reduction transformations to the test case until a stopping condition is met.
No specific pass ordering should be assumed when implementing a reduction pass. At first, passes will be performed sequentially in an order defined by hierarchy. Passes with transformations at a higher level will be performed first. The idea behind this is that transformation performed at a higher level might be more effective in reducing significant portions of the IR. In the future, there might be an incentive to consider the reordering and repetition of certain passes since some passes may influence the efficiency of others.
The most significant cost factor to be considered when designing and implementing a pass is the number of times the testing script is invoked. Since the tool accepts any type of testing script, the time complexity of the testing script will significantly impact the running time of the tool.
Command Line Arguments
The MLIR Reduce tool shall accept the following command line arguments:
- The file name for the MLIR test case
- The file name for the interestingness testing script
- The interestingness script arguments
- An optional file name to which the reduced test case should be outputted to
- An optional flag to apply a canonicalization compiler pass to the initial test case
- Optional flags to apply only specific reduction passes
- An optional flag to enable the concurrency in the reduction passes
The MLIR Reduce tool will initiate a LLVM environment and parse the MLIR input test case into a module to which transformations will be applied. A canonicalization pass shall be applied to the module if the respective flag is specified. A Tester object will be created with the testing script, the testing script arguments and the original test case module.
The Tester class will be responsible for checking the interesting behavior in the reduced variants. It will contain:
- A pointer to a MLIR ModuleOp object which will reference the most reduced test case variant up to that point
- A setter method to update the most reduced test case variant
- A getter method to fetch the most reduced test case variant
- A method that will run the testing script on a specified MLIR file containing a test case variant
The testing scripts shall contain the commands necessary to reproduce the bug with the input test case as the first argument and the respective checks to detect the presence of the interestingness behavior in the output. The testing script shall return a 0 if the test case produces the interesting behavior or a 1 otherwise.
The following example runs the “bug-producer” compiler pass using the mlir-opt utility and inspects the output for the presence of the keyword “BUG”:
#!/bin/bash ./bin/mlir-opt $1 -bug-producer | grep 'BUG' &> /dev/null if [ $? == 0 ]; then #Interesting behavior exit 0 else exit 1 fi
Each pass shall have a well defined reduction objective for the purpose of preventing duplicate work. The pass shall attempt to reduce a specific component of the IR at a specific level.
The reduction algorithm will vary for each specific pass depending on the different objectives and transformations performed. However, each pass should follow the following structure:
- Transformation: A specific type of transformation shall be implemented to achieve the reduction goal for the given reduction pass. This transformation shall be applicable to different sections of the test case and should ultimately be able to reduce the size as defined by the number of bytes of the given sections.
- Variant Creation: The transformations shall be applied to different sections of the test case to create new variants to which the same transformation can be performed subsequently at a higher granularity. All these test case variants shall be organizable in a “reduction tree” structure in which the root is the initial test case, the leafs of each node are the different variants created by performing the defined transformation to different sections of that node and the most reduced variant is a leaf for which no new variants can be created.
- Variant Iteration: At any given level, the different variants are tested to determine if they exhibit the interesting behavior. Those that don’t are discarded and the remaining ones are ordered using the total test case ordering defined above. The algorithm can then create new variants using the smallest remaining variant as a starting point or inspect multiple paths in the reduction tree. Backtracking can also be implemented if the reducer is not making sufficient progress on the current branch being evaluated.
- Stopping Function: Each pass shall have a well defined stopping function that ensures the polynomial time completion of the algorithm.
The following example describes the ReduceOp pass which has a main objective reducing the number of operations in a module.
- Transformation: Elimination of all operations in a given section of the module.
- Variant Creation: The current test case module will be bisected and two variants will be created by applying the transformation to either the first section or the second section.
- Variant iteration: The smallest successful variant will now serve as a starting point for future variant creation.
- Stopping condition: The algorithm stops when no new successful variants can be created from the current test case module.
Given the specified pass framework. Pessimistic concurrency can be implemented to enable the concurrent traversal of the reduction tree. A general pessimistic concurrency algorithm can be implemented and later integrated to each reduction pass without modifying their behavior. This concurrency will be optional in the running of the tool since specific bugs might be triggered because of the concurrency in the compiler passes and the concurrent execution of MLIR Reduce might introduce race conditions to the testing environment.