A deep learning-based approach to measuring code maintainability
This repository contains the materials for our paper "Measuring Code Maintainability with Deep Neural Networks":
- two datasets:
  - TrainingData: an automatically constructed large dataset, which contains 1,394,514 Java classes.
  - TestingData: a manually constructed small dataset, which contains 240 Java classes.
- source code (written in Java) that is used to extract features from Java classes.
  - With this code, you can quickly extract features for a single Java class or for a directory that contains a number of Java classes, because the extraction is multithreaded.
- source code (`/Model/Model/*.py`):
  - `/Model/Model/GenerateDataset.py` generates a TFRecords file in TensorFlow based on the training data.
  - `/Model/Model/TrainOrTest.py` trains and tests our deep learning-based model.
  - `/Model/Model/DeepM.py` computes the maintainability index of a Java class using the trained model.
  - `/Model/Model/Model.py` contains the model of DeepM and the models for the ablation study.
- baselines
- a trained model (in `/Model/Model/all/`)
- `/Model/requirements.txt`, which describes the requirements for running our code in `/Model/Model/`.
- `/Model/runDeepM.sh`, which computes the maintainability indexes of the Java classes in a directory using the trained model.
  - usage: `sh runDeepM.sh [The directory containing /Model/Model/DeepM.py] [The directory containing Java classes] [The file saving results]`
- the maintainability indexes (computed by the trained model) of Java classes in the testing data
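
The `runDeepM.sh` usage listed above can also be scripted, for example to score several projects in a row. The following is a minimal sketch, not part of the repository: the helper names `build_deepm_command` and `run_deepm`, the example paths, and the assumption that the script is launched from the directory holding `runDeepM.sh` are all hypothetical. The three arguments are passed in the order given in the usage line.

```python
import subprocess
from pathlib import Path


def build_deepm_command(deepm_dir, java_dir, result_file):
    """Build the argument list for runDeepM.sh (hypothetical helper).

    Arguments follow the order of the usage line:
    1. the directory containing /Model/Model/DeepM.py
    2. the directory containing Java classes
    3. the file saving results
    Paths are made absolute so they stay valid if the working directory changes.
    """
    return [
        "sh",
        "runDeepM.sh",
        str(Path(deepm_dir).resolve()),
        str(Path(java_dir).resolve()),
        str(Path(result_file).resolve()),
    ]


def run_deepm(deepm_dir, java_dir, result_file, script_dir="Model"):
    """Run runDeepM.sh from script_dir.

    Assumption: script_dir is the directory that holds runDeepM.sh.
    Raises subprocess.CalledProcessError if the script exits with an error.
    """
    cmd = build_deepm_command(deepm_dir, java_dir, result_file)
    return subprocess.run(cmd, cwd=script_dir, check=True)


if __name__ == "__main__":
    # Hypothetical paths; this only prints the command and does not run it.
    print(" ".join(build_deepm_command("Model/Model", "my_project/src", "results.txt")))
```

Calling `run_deepm(...)` with real paths would execute the script. The format of the results file is defined by the repository's code and is not assumed here.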
If you have questions, please contact me directly: ymhu@bit.edu.cn. Thank you!