ASMGraph is a simple framework that aims to help compiler developers who work on the RISC-V architecture.
This framework intends to:
- Visualize functions' assembly code through dot graphs
- Highlight the executed and the hottest basic blocks (BBs) in each function
- Extract singleton BBs from the whole function
- Find several instruction fusion cases like:
  - Extending (SI->DI, HI->DI)
  - Integer indexed loads
  - Load with preincrement
  - Load from constant addresses
  - Address and constant formation
  - Double address and constant formation
- Compute the executed instruction count like llvm-mca
The instrument can be easily integrated into CI/CD systems to evaluate the changes you make.
Note: To harness the full power of the instrument, you need the Qemu bbexec plugin.
To use the instrument, you need:
- python3
- pip3
- dot
- libgirepository1.0-dev (only for GUI)
- libcairo2-dev (only for GUI)
- gir1.2-webkit-4.0 (only for GUI)
Suggested command line to install them
sudo apt-get install python3 python3-pip xdot libgirepository1.0-dev libcairo2-dev gir1.2-webkit-4.0
The tool also requires some python libraries that can be found in requirements.txt. They can be easily installed by the following command:
pip3 install -r requirements.txt
Note: Some of the modules may require additional tools such as cmake and pkg-config
asm_graph.py is the main script of the framework. It handles the core functionality of processing binary and assembly files, generating dot graphs, and performing various checks.
asm_graph.py (-a ASM | -b BIN -d OBJDUMP | --add_plugin PLUGIN_NAME PLUGIN_PATH) [OPTIONS]
-h, --help  Show this help message and exit.
-a ASM, --asm ASM  Path to the assembly file.
-b BIN, --bin BIN  Path to the binary file.
-d OBJDUMP, --objdump OBJDUMP  Path to the disassembler (riscv-objdump).
-f FUNC, --func FUNC  The name of the function that should be extracted. By default, all functions from the text segment are produced.
-c BBEXEC, --bbexec BBEXEC  Path to the bbexec file.
--dot  Create dot graphs for functions.
--min_exec_count MIN_EXEC_COUNT  Minimum number of times a BB must be executed to be processed by plugins.
-s, --singletons  Collect singleton basic blocks into singletons.xlsx.
-o OUTPUT, --output OUTPUT  The name of the output directory (default: cwd/output).
--run_plugins  Run the enabled plugins from plugins/plugins.json.
--add_plugin PLUGIN_NAME PLUGIN_PATH  Add a custom plugin. Provide the plugin name and file path. The file must contain a 'run' function with a 'Node' object as input (see plugins/example.py).
ASMGraph can be used in two modes: with and without a bbexec file.
Providing a bbexec file enables additional features such as highlighting the hottest BBs
or computing instruction groups like llvm-mca.
To run ASMGraph, you must specify either the assembly file or both the binary and the disassembler path.
NOTE: If you disassemble the binary yourself, you must pass the --no-show-raw-insn option to the disassembler.
- Here is an example of running ASMGraph on a binary.
./asm_graph.py -b ./path/to/test.exe --objdump /full/path/to/riscv-**objdump -o output
In this case, the output directory will contain all function bodies extracted in separate files.
- To run ASMGraph on your own disassembled file, run the following command.
./asm_graph.py -a ./path/to/test.asm -o output
- To visualize my_function, run the following command.
./asm_graph.py -a ./path/to/test.asm -f my_function --dot -o output
After this, the output directory will contain the my_function.asm and my_function.dot files, holding the function body and its visualization, respectively.
Unfortunately, the --dot option may slow down execution. We set a time limit of 10 minutes for each function.
NOTE: If processing a function takes longer than the time limit, the function is added to the blacklist stored in functions_blacklist.json.
- To apply execution info to the visualization, you must also add the -c option.
./asm_graph.py -a ./path/to/test.asm -c ./path/to/test.bbexec -f my_function --dot -o output
BBs that did not execute at all will be colored in blue, and the most executed ones in dark red
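The coloring rule above can be illustrated with a small sketch that emits DOT text. This is not ASMGraph's own rendering code: the helper names `color_for` and `to_dot`, the input layout (a mapping from block name to execution count plus an edge list), and the choice of white for all other blocks are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def color_for(count: int, hottest: int) -> str:
    """Pick a fill color: blue if never executed, dark red if hottest."""
    if count == 0:
        return "blue"
    if count == hottest:
        return "darkred"
    return "white"  # assumption: intermediate blocks stay uncolored


def to_dot(name: str,
           exec_counts: Dict[str, int],
           edges: List[Tuple[str, str]]) -> str:
    """Render a control-flow graph as DOT text with execution-based colors."""
    hottest = max(exec_counts.values(), default=0)
    lines = [f'digraph "{name}" {{', "  node [shape=box, style=filled];"]
    for block, count in exec_counts.items():
        color = color_for(count, hottest)
        # "\\n" yields a literal backslash-n, which DOT treats as a line break.
        lines.append(
            f'  {block} [label="{block}\\nexec: {count}", fillcolor={color}];'
        )
    for src, dst in edges:
        lines.append(f"  {src} -> {dst};")
    lines.append("}")
    return "\n".join(lines)
```

Calling `to_dot("main", {"B0": 1, "B1": 0, "B2": 500}, [("B0", "B1"), ("B0", "B2")])` with these made-up counts produces text that the dot tool can render, with B1 in blue and B2 in dark red.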
- To run plugins, use the following command. Don't forget to enable the necessary plugins in plugins/plugins.json.
./asm_graph.py -a ./path/to/test.asm --run_plugins -o output
After the execution, you can find the test.xlsx file in the output directory. It contains the results of each plugin, separated into corresponding sheets.
SUGGESTION:
Run plugins with bbexec info to get only the actual and interesting cases.
./asm_graph.py -a ./path/to/test.asm -c ./path/to/test.bbexec --run_plugins -o output
ASMGraph discards every BB that has fewer than 1M dynamic instructions.
To adjust that value, use the --min_exec_count option. For example, to check only BBs that have at least 5M executions, use this command line.
./asm_graph.py -a ./path/to/test.asm -c ./path/to/test.bbexec --run_plugins --min_exec_count 5000000 -o output
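The effect of the threshold can be sketched as a simple filter. The function name `hot_blocks` and the dictionary layout (block name mapped to its dynamic count) are hypothetical; only the 1M default and the "at least" comparison come from the description above.

```python
from typing import Dict

# Mirrors the 1M default described above.
DEFAULT_MIN_EXEC_COUNT = 1_000_000


def hot_blocks(dynamic_counts: Dict[str, int],
               min_exec_count: int = DEFAULT_MIN_EXEC_COUNT) -> Dict[str, int]:
    """Keep only the blocks whose dynamic count reaches the threshold."""
    return {
        block: count
        for block, count in dynamic_counts.items()
        if count >= min_exec_count
    }
```

With made-up counts `{"B1": 999_999, "B2": 1_000_000, "B3": 7_500_000}`, the default keeps B2 and B3, while passing `min_exec_count=5_000_000` keeps only B3.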
- In order to extract singleton BBs run the following command.
./asm_graph.py -a ./path/to/test.asm -s -o output
Suppose we have the following code:
#include <stdio.h>

int main() {
    int n, sum = 0;
    scanf("%d", &n);
    for (int i = 0; i < n; i++) {
        sum += i * 2;
    }
    printf("Sum is %d\n", sum);
    return 0;
}
Our goal is to find and highlight the most executed fragment of the code. For that purpose we compile the target program, run it under QEMU, and pass it some input. After execution, we get the QEMU bbexec file. Next we pass the target binary and the bbexec file to ASMGraph with the following command line.
./asm_graph.py -b ./path/to/binary -d /path/to/objdump -c ./path/to/target.bbexec -f main --dot -o output
After executing the instrument we can see the visualization like this one.
As we can see B3 is the most executed (hottest) basic block and B1 was not executed at all.
The cost or dynamic instruction count of B3 is 1497.
The .bbexec file contains information about the execution of BBs. To obtain the .bbexec file, you must apply the qemu.patch patch, which adds the required plugin, to the QEMU sources and then build QEMU.
sudo apt-get install build-essential libcairo2-dev libpango1.0-dev \
libjpeg-dev libgif-dev librsvg2-dev ninja-build
sudo apt install qemu-system-misc qemu-user-static binfmt-support
Next, apply the QEMU basic block execution plugin as a patch for the QEMU sources:
cd riscv-gnu-toolchain/qemu
git apply /path/to/qemu.patch
Then configure and build QEMU with plugins enabled, using the build directory as the prefix:
./configure --prefix="$PWD/build" --enable-plugins
make
make plugins
After that, execute your binary file with QEMU using the following steps:
- Navigate to the qemu directory.
- Run the command:
find . -iname "libbbexec.so"
- Set the QEMU_PLUGIN environment variable to the path of libbbexec.so:
export QEMU_PLUGIN="file=lib_bbexec_path"
- Finally, execute your binary file using the appropriate QEMU architecture.
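The search and environment steps above could also be scripted. The following is a minimal sketch, not part of ASMGraph: `find_plugin` and `plugin_env` are hypothetical helper names, and the sketch assumes the plugin file is named libbbexec.so somewhere under the QEMU source tree.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def find_plugin(qemu_root: str, name: str = "libbbexec.so") -> Optional[Path]:
    """Case-insensitive search, similar to `find . -iname "libbbexec.so"`."""
    for path in sorted(Path(qemu_root).rglob("*")):
        if path.is_file() and path.name.lower() == name.lower():
            return path.resolve()
    return None


def plugin_env(plugin: Path,
               base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with QEMU_PLUGIN set to the plugin."""
    env = dict(os.environ if base is None else base)
    env["QEMU_PLUGIN"] = f"file={plugin}"
    return env
```

The resulting dictionary can be passed as `env=` to `subprocess.run` together with the QEMU user-mode binary for your target (for example `qemu-riscv64`, which is an assumption here) and the path to your program.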
We provide a script that will help you with that issue.
The evaluate_versions.py script compares the performance of two compilers.
For example, suppose you have *bbexec files gathered in the dir_1 and dir_2 directories,
for compilers C1 and C2 respectively. To compare them, run the following command line.
./evaluate_versions.py --fd ./dir_1 --sd ./dir_2 --all
To compare individual .bbexec files use this command line:
./evaluate_versions.py --ff ./bbexec_file_1 --sf ./bbexec_file_2
The evaluation_result.xlsx workbook is created in the current working directory. It contains the comparison of each *bbexec file, one sheet per file.
The names of the *bbexec files are important: the script compares only files that have the same name.
Additionally, it provides a general comparison sheet (general_diff) that shows the overall differences.
If you wish to see only the general comparison sheet, omit the --all option.
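The name-matching rule can be sketched as follows. This is an illustration of the idea only, not the implementation inside evaluate_versions.py; the function name `pair_bbexec_files` and the `*bbexec` glob pattern are assumptions.

```python
from pathlib import Path
from typing import Dict, Tuple


def pair_bbexec_files(first_dir: str,
                      second_dir: str) -> Dict[str, Tuple[Path, Path]]:
    """Pair *bbexec files that have the same file name in both directories."""
    first = {p.name: p for p in Path(first_dir).glob("*bbexec")}
    second = {p.name: p for p in Path(second_dir).glob("*bbexec")}
    # Only names present on both sides can be compared.
    common = sorted(first.keys() & second.keys())
    return {name: (first[name], second[name]) for name in common}
```

A file that exists in only one of the two directories has no counterpart and is left out of the result, which is why consistent naming across compiler runs matters.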
The project supports a flexible plugin system that allows users to run custom and built-in plugins on the basic blocks of the assembly code. Each plugin provides a specific analysis or transformation, and you can easily add, enable, or disable plugins.
Plugins are defined in a JSON structure, where each plugin has:
- name: A human-readable name for the plugin.
- enabled: A boolean indicating whether the plugin is active.
- function: The path to the function that the plugin runs.
- args: The arguments passed to the function.
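To make the four fields concrete, here is a hedged sketch that builds two plugin entries and selects the enabled ones. The top-level layout (a list of entries), the plugin names, the format of the function path, and the empty args are all assumptions; check plugins/plugins.json in the repository for the actual structure.

```python
import json
from typing import Any, Dict, List

# Hypothetical entries: names, paths and args are illustrative only.
EXAMPLE_PLUGINS: List[Dict[str, Any]] = [
    {"name": "Example plugin", "enabled": True,
     "function": "plugins/example.py", "args": []},
    {"name": "Disabled plugin", "enabled": False,
     "function": "plugins/other.py", "args": []},
]

REQUIRED_KEYS = {"name", "enabled", "function", "args"}


def enabled_plugins(entries: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Validate each entry and return those whose 'enabled' flag is true."""
    for entry in entries:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"plugin entry missing keys: {sorted(missing)}")
    return [entry for entry in entries if entry["enabled"]]


def to_json(entries: List[Dict[str, Any]]) -> str:
    """Serialize the entries the way they could be stored on disk."""
    return json.dumps(entries, indent=2)
```

Here `enabled_plugins(EXAMPLE_PLUGINS)` returns only the first entry, which corresponds to toggling a plugin on or off through its enabled field.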
- Create Plugin File: Add a new Python file.
- Define run() Function: Each new plugin must define a run() function, which will be called by the system. The function should follow this signature:
from typing import List

def run(basic_block: Node) -> List[dict]:
    # Your plugin logic here
    return []
- Add your new plugin by using either the --add_plugin flag of asm_graph.py or the ASMGraph GUI.
Now, when the application runs, it will import your new plugin and execute it.
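As an illustration of the steps above, here is a sketch of a plugin that reports integer load instructions in a basic block. The `Node` class below is a stand-in written for this example: the real ASMGraph `Node` may expose different attributes, and the keys of the returned dictionaries are also assumptions (see plugins/example.py for the actual conventions).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Stand-in for ASMGraph's Node; the real class may differ."""
    name: str
    instructions: List[str] = field(default_factory=list)


# RV64I integer load mnemonics.
LOAD_MNEMONICS = {"lb", "lbu", "lh", "lhu", "lw", "lwu", "ld"}


def run(basic_block: Node) -> List[dict]:
    """Report every integer load instruction found in the basic block."""
    results = []
    for index, text in enumerate(basic_block.instructions):
        parts = text.split()
        if parts and parts[0] in LOAD_MNEMONICS:
            results.append({
                "block": basic_block.name,
                "index": index,
                "instruction": text,
            })
    return results
```

Each returned dictionary describes one finding; returning an empty list means the plugin found nothing in that block.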
Enable the necessary plugins in plugins/plugins.json and add the --run_plugins flag when running asm_graph.py:
./asm_graph.py --run_plugins ...
Use the asm_gui.py script to open ASMGraph GUI.
The GUI provides an easy-to-use interface for visualizing and interacting with the different features of ASMGraph.
A key feature of the GUI is the capability to visualize dot files.
The ASMGraph GUI incorporates a plugin management system that enables users to execute plugins on the visualized data,
with the results being stored in the xlsx file. Users can easily add new plugins through the GUI by providing a file with the plugin.
The file must include a run() function from which the plugin execution begins. An example of a custom plugin can be found in the plugins directory.
