SATD refers to self-admitted technical debt, which is introduced intentionally (e.g., through temporary fix) and admitted by developers themselves and always recorded in source code comments. SATD Detector[1] is a tool that is able to automatically detect SATD comments text mining. This is the back-end of SATD Detector, which provides command-line interface and Java API of SATD Detector. More details can be found in [1].
Download binaries: satd_detector.jar
Use build-in models to test whether a comment is SATD comment or not:
$ java -jar satd_detector.jar test
>This is an ugly implementation.
SATD
>This function read lines from file.
Not SATD
>/exit
bye!To train your own models, you need provide three data files:
- Comment file: contains all the comments in your dataset. One line for each comment.
- Label file: contains the labels of comments.
- Project file: contains the project names of comments.
Comment file example: comments.txt
// This is comment 1
// TODO: fix this function later
/* This is comment 3 */Label file example: labels.txt
No
Yes
NoProject file example: projects.txt
project1
project2
project3Train your own model
$ java -jar satd_detector.jar train -comments comments.txt -labels labels.txt -projects projects.txt -out_dir ./models/
Finish Saving classifier of project1
Finish Saving classifier of project2
Finish Saving classifier of project3Now you can see 9 files in ./models/:
- models
- project1.as
- project1.model
- project1.stw
- project2.as
- project2.model
- project2.stw
- project3.as
- project3.model
- project3.stwAll these files will be loaded while classifying.
$ java -jar satd_detector.jar test -model_dir ./models/
>This is an ugly implementation.
SATD
>This function read lines from file.
Not SATD
>/exit
bye!Your can also use SATD Detector in your Java projects.
- Download satd_detector.jar
- Add it to the classpath
import satd_detector.core.train.Train;
String commentFile = "./comments.txt";
String labelsFile = "./labels.txt";
String projectsFile = "./projects.txt";
String outDir = "./models/";
// train your own models
try{
Train.buildModels(commentFile, labelFile, projectFile, outDir);
} catch (Exception e) {
e.printStackTrace();
}import satd_detector.core.utils.SATDDetector;
// create an instance using build-in models
SATDDetector detector1 = new SATDDetector();
String comment = "This is an ugly implementation."
boolean result = detector1.isSATD(comment)
// create an instance using your own models
String modelDir = "./models/"
SATDDetector detector2 = new SATDDetector(modelDir);
boolean result = detector2.isSATD(comment)$ java -jar satd_detector.jar test -h
usage: test
-h Show help message
-model_dir <arg> Dir which stores all the models. Using build-in models
if not specified.
$ java -jar satd_detector.jar train -h
usage: train
-comments <arg> The file which stores all comments.
-h Show help messages.
-labels <arg> The file which stores labels of all comments.
-out_dir <arg> Dir to store trained models.
-projects <arg> The file which stores projects of all comments.- weka 3.8.1 or above
- snowball-stemmers
- which can be installed using weka package manager
- Apache commons cli 1.4 or above
- Git clone or download zip
- Add
resourcesfolder to your classpath - Add
weka.jar,snowball-stemmers.jarandcommons-cli.jarto your classpath
For Eclipse, you can modify classpath quickly by:
- Righ-click on the project
- Choose
Build Path->Configure Build Path- for
resourcesfolder, chooseSourcetab -> Add Folder - for jar files, choose
Librariestab -> Add External JARs
- for
[1] SATD Detector: A Text-Mining-Based Self-Admitted Technical Debt Detection Tool.