SMOKE: Scalable Path-Sensitive Memory Leak Detection for Millions of Lines of Code
SMOKE is built on LLVM 3.6. It analyzes the Bitcode files (.bc files) of software projects to check for vulnerabilities. A Bitcode file is an intermediate representation of the source code. We have prepared the Bitcode files of all 29 projects for evaluation.
We prepared an Ubuntu server for the artifact evaluation. Reviewers can access the server via ssh (the password is given in the INSTALL file):
ssh icse2019ae@chcpu12h.cse.ust.hk
NOTE: When running large benchmark programs (>0.5 MLoC), the server's computing resources may be exhausted if several users work on it at the same time. We therefore recommend running the small benchmark programs to verify our approach.
On the server, we have installed all the binaries needed for the evaluation:
The home directory contains several folders, including three groups of benchmarks and other folders:
Since we have a large set of benchmarks (29 projects) and five different tools to evaluate (SMOKE/Pinpoint/Saber/CSA/Infer), running the full evaluation and collecting the results takes a long time. To make the results easier to check within the limited evaluation time, we have converted all reports to the Pinpoint bug report format and made the reports and our inspection results available on the bug reporting system (Pinpoint Report System). Here is how to access them:
The address is SMOKE/Saber/CSA/Pinpoint/Infer Reports
Username/pass : testtest/testtest
Mapping:
Confirmed = True Positive
False Positive = False Positive.
SSU = SMOKE
PSA = PINPOINT (Pinpoint Static Analyzer)
CSA = CSA (Clang Static Analyzer)
Saber = Saber
Infer = Infer
Note: When collecting the statistics, we use the default “Cluster” feature provided by Pinpoint. Reports are merged if they share the same root cause.
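The merging rule in the note above can be illustrated with a minimal sketch. It is not Pinpoint's actual “Cluster” implementation, which is not described here. It assumes, purely for illustration, that a root cause is identified by the file and line of the allocation site; the field names `file` and `line` and the sample reports are hypothetical.

```python
from collections import defaultdict


def cluster_by_root_cause(reports):
    """Group reports that share the same (file, line) root-cause key.

    Assumption: the root cause is the allocation site. The real
    "Cluster" feature may use a different criterion.
    """
    clusters = defaultdict(list)
    for report in reports:
        key = (report["file"], report["line"])  # hypothetical field names
        clusters[key].append(report)
    return dict(clusters)


# Made-up reports, for illustration only.
sample_reports = [
    {"id": 1, "file": "a.c", "line": 10},
    {"id": 2, "file": "a.c", "line": 10},
    {"id": 3, "file": "b.c", "line": 42},
]

clusters = cluster_by_root_cause(sample_reports)
for key, members in clusters.items():
    print(key, [r["id"] for r in members])
```

Three raw reports collapse into two clusters here, which is the kind of reduction the statistics in this document rely on.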
Readers interested in how to generate Bitcode files and compilation databases can refer to PP-Capture.
To evaluate SMOKE and PINPOINT, we need the location of the LLVM 3.6 Bitcode files. The Bitcode file for project [Proj Name] is located at:
/home/icse2019ae/bench36/[Proj Name]/
To evaluate Saber, we need the location of the LLVM 4.0 Bitcode files. The Bitcode file for project [Proj Name] is located at:
/home/icse2019ae/bench40/[Proj Name]/
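As a small convenience, the directory layout described above can be captured in a helper. This is only a sketch: the function name `bitcode_dir` and the lower-case tool keys are our own choices, and it returns the directory only, since the name of the `.bc` file inside it is not specified for every project.

```python
from pathlib import PurePosixPath

# Paths taken from the text above: bench36 holds LLVM 3.6 Bitcode,
# bench40 holds LLVM 4.0 Bitcode.
BENCH_ROOT = PurePosixPath("/home/icse2019ae")
BENCH_DIR_BY_TOOL = {
    "smoke": "bench36",
    "pinpoint": "bench36",
    "saber": "bench40",
}


def bitcode_dir(tool, project):
    """Return the Bitcode directory that `tool` should read for `project`."""
    try:
        bench = BENCH_DIR_BY_TOOL[tool.lower()]
    except KeyError:
        raise ValueError(f"{tool!r} does not take Bitcode input") from None
    return BENCH_ROOT / bench / project


print(bitcode_dir("SMOKE", "tmux"))
print(bitcode_dir("Saber", "tmux"))
```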
For Infer and CSA, we need to know the [Source Dir] for a project [Proj Name], which is specified in the following table:
Note: /home/fangang and /bigdata/fangang are accessible to user icse2019ae.
[Proj Name] | [Source Dir] (a path that contains .piggy folder) |
---|---|
164.gzip | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/164.gzip/src |
175.vpr | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/175.vpr/src |
176.gcc | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/176.gcc/src |
181.mcf | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/181.mcf/src |
186.crafty | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/186.crafty/src |
197.parser | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/197.parser/src |
252.eon | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/252.eon/src |
253.perlbmk | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/253.perlbmk/src |
254.gap | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/254.gap/src |
255.vortex | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/255.vortex/src |
256.bzip2 | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/256.bzip2/src |
300.twolf | /home/fangang/cases/researchbench/spec2000/benchspec/CINT2000/300.twolf/src |
bftpd | /home/fangang/cases/researchbench/bftpd/bftpd |
htop | /home/fangang/cases/researchbench/htop/ |
caffe | /home/fangang/cases/researchbench/caffe/build |
memcached | /home/fangang/cases/researchbench/memcached/memcached-master |
lame | /bigdata/fangang/cases/lame/lame-3.100 |
zlib | /home/fangang/cases/researchbench/zlib/zlib |
tmux | /bigdata/fangang/cases/tmux/tmux-2.5 |
httpd | /home/fangang/cases/researchbench/apache2/httpd-2.4.29 |
openssl | /home/fangang/cases/researchbench/openssl |
ffmpeg | /home/fangang/benchmarks/FFmpeg/buildssu |
godot | /bigdata/fangang/cases/superlarge/godot/godot |
mysql | /home/fangang/cases/researchbench/mysql/mysql-server-mysql-5.5.51/build |
v8 | /home/fangang/cases/researchbench/v8/v8 |
skia | /home/fangang/cases/superlarge/skia/skia-master |
blender | /bigdata/fangang/cases/superlarge/blender/blender |
wine | /bigdata/fangang/cases/superlarge/wine/wine/build |
firefox | /bigdata/fangang/cases/superlarge/firefox/src/mozilla-unified/obj-x86_64-pc-linux-gnu |
We use Tmux as an example to illustrate the evaluation process. To evaluate a project, we need to run five tools to collect their running time and bug reports. We manually inspect the bug reports to classify them into False Positives and True Positives.
Note that the analysis time can be slightly different from the time reported in the paper. It is affected by the configurations of the running machine and the workload of that machine when running the analysis.
Commands:
pp-smoke -report=tmux_smoke.json bench36/tmux/tmux.bc
Program output:
icse2019ae@ubuntu:~$ pp-smoke -report=tmux_smoke.json bench36/tmux/tmux.bc
=== Glancing Mode : A quick glance of the project, not deep but fast ===
Code Metrics
------------------------------------------------------------------------------------------
Total Number of Functions: 1483
Number of Implemented Functions: 1293
Number of External Functions: 190
External Functions Ratio: 12.81%
Instruction Count: 138155 (Avg: 106.85 per function)
Function With Most Instructions: 1480 (Function: window_copy_command)
Average Cyclomatic Complexity: 7.49
Max Cyclomatic Complexity: 151 (Function: window_copy_command)
------------------------------------------------------------------------------------------
Analyzer Execution
------------------------------------------------------------------------------------------
Kept function size 542
VFG_Slicing Time: 22296us
Code preprocessing ........Done!
Constructing call graph ........Done!
Before CDG Time: 1303793us
CDG-Construction Time: 23382us
CDG-Construction Memory: 0KB
DomTreePass Construction Time: 157502us
DomTreePass Construction Memory: 0KB
[Falcon] [################################################################################] 100%
[SPEG] [################################################################################] 100%
SEG-Building Time: 2082713us
SEG-Building Memory: 245M 360KB
SVFG-Global-Building Time: 2341851us
SVFG-Global-Building Memory: 347M 144KB
SVFG-Building Time: 2360265us
SVFG-Building Memory: 347M 144KB
[PSA Checking] [################################################################################] 100%
PPMaster Time Time: 5847us
SSUMemoryLeakChecker Find Candidate Traces Time Time: 25822us
Peak Memory: 1G 565M 784KB
SSUMemoryLeakChecker Post Verification Time Time: 1267382us
Peak Memory: 1G 573M 236KB
[SSU Checking] [################################################################################] 100%
SSUMemoryLeakChecker Time Time: 1293500us
Peak Memory: 1G 573M 236KB
Parallel Scheduler Time Time: 1299422us
------------------------------------------------------------------------------------------
Bug reports summary per checker:
------------------------------------------------------------------------------------------
Checker Name Total bugs valid/qualified/found
------------------------------------------------------------------------------------------
SSU Memory Leak Checker 13/13/19
Bug reports summary per bug type
------------------------------------------------------------------------------------------
Bug type Number of reports
------------------------------------------------------------------------------------------
Memory Leak : CWE-401 13
Total Time: 7516350us
Peak Memory: 1G 573M 236KB
The report file (see ReportFileFormat for its format):
You can also visit the following URL for a visualized bug report (Username/pass: testtest/testtest): tmux-smoke-pinpoint-report
SMOKE takes 7516350us (7.52 seconds) to analyze tmux.bc. When it finishes, the time and memory usage are printed on the screen. SMOKE finds 19 reports in total and marks 13 of them as valid. The six invalid reports are filtered out in the “path-sensitive verification” phase described in the paper.
In SMOKE, we treat different reports with the same root cause as one. The paper therefore lists nine memory leak reports for tmux, two of which are false positives.
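The figures quoted for SMOKE can be re-derived from two lines of the program output. The sketch below is not part of the tool; the helper names are hypothetical, and the regular expressions assume that the output keeps the `Total Time: <n>us` and `valid/qualified/found` shapes shown above.

```python
import re


def microseconds_to_seconds(microseconds):
    """Convert a duration in microseconds to seconds."""
    return microseconds / 1_000_000


def parse_total_time(line):
    """Extract the microsecond count from a 'Total Time: <n>us' line."""
    match = re.search(r"Total Time:\s*(\d+)us", line)
    if match is None:
        raise ValueError(f"no total time in: {line!r}")
    return int(match.group(1))


def parse_checker_summary(line):
    """Extract (valid, qualified, found) from a per-checker summary line."""
    match = re.search(r"(\d+)/(\d+)/(\d+)\s*$", line)
    if match is None:
        raise ValueError(f"no valid/qualified/found triple in: {line!r}")
    valid, qualified, found = (int(group) for group in match.groups())
    return valid, qualified, found


total_us = parse_total_time("Total Time: 7516350us")
print(round(microseconds_to_seconds(total_us), 2))  # 7.52

valid, qualified, found = parse_checker_summary("SSU Memory Leak Checker 13/13/19")
print(found - valid)  # 6 reports rejected
```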
Commands:
time saber -leak bench40/tmux/tmux.bc
Output
AvgIndOutDeg 3
MaxIndInDeg 896
MaxIndOutDeg 225
#######################################################
PartialLeak : memory allocation at : (ln: 70 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/imsg.c)
conditional free path:
--> (ln: 70 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/imsg.c)
--> (ln: 74 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/imsg.c)
--> (ln: 82 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/imsg.c)
PartialLeak : memory allocation at : (ln: 145 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/imsg.c)
conditional free path:
--> (ln: 27 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/freezero.c)
--> (ln: 68 fl: /bigdata/fangang/cases/tmux/tmux-2.5/proc.c)
--> (ln: 72 fl: /bigdata/fangang/cases/tmux/tmux-2.5/proc.c)
PartialLeak : memory allocation at : (ln: 215 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/vis.c)
conditional free path:
--> (ln: 101 fl: /bigdata/fangang/cases/tmux/tmux-2.5/log.c)
--> (ln: 105 fl: /bigdata/fangang/cases/tmux/tmux-2.5/log.c)
--> (ln: 216 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/vis.c)
--> (ln: 221 fl: /bigdata/fangang/cases/tmux/tmux-2.5/compat/vis.c)
PartialLeak : memory allocation at : (ln: 342 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
conditional free path:
--> (ln: 275 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 402 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 404 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 410 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 412 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 414 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 425 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
PartialLeak : memory allocation at : (ln: 236 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
conditional free path:
--> (ln: 1510 fl: /bigdata/fangang/cases/tmux/tmux-2.5/server-client.c)
--> (ln: 239 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 240 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
--> (ln: 275 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
PartialLeak : memory allocation at : (ln: 1517 fl: /bigdata/fangang/cases/tmux/tmux-2.5/server-client.c)
conditional free path:
--> (ln: 275 fl: /bigdata/fangang/cases/tmux/tmux-2.5/cmd.c)
real 1m6.410s
user 1m3.516s
sys 0m2.892s
From the output, you can see that Saber takes around 63.5 seconds (user time) to analyze Tmux. It reports six memory leaks, all of which turn out to be false positives after close inspection.
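The timings for Saber, CSA and Infer come from the shell's `time` output, and the figures quoted in this document appear to correspond to the `user` line. A minimal sketch of the conversion to seconds follows; the helper name `time_to_seconds` is hypothetical, and the pattern assumes the `<minutes>m<seconds>s` form shown in the output.

```python
import re


def time_to_seconds(value):
    """Convert a `time` value such as '1m3.516s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# 'user' values copied from the Saber run above.
print(round(time_to_seconds("1m3.516s"), 1))  # 63.5
```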
For a visualized bug report: tmux-saber-pinpoint-report
Commands. We run CSA in single-thread mode:
cd ~/srcs/tmux/tmux-2.5
time ~/pinpoint/bin/pp-capture --run-csa --thread-pool=1 -- make -j4
Output:
icse2019ae@ubuntu:~/srcs/tmux/tmux-2.5$ time ~/pinpoint/bin/pp-capture --run-csa --thread-pool=1 -- make -j4
Current pp-capture version: 01.04.374
Caching compile commands ...
Running symbolic executor ...
Analyzed [############################################################] 100%
Total files: 136, analyzed 131 files, found 40 bugs(redundantly).
real 2m49.957s
user 2m44.072s
sys 0m4.868s
The report file is located in .piggy/reports/csa_report0.json, for the format of this report file please refer to ReportFileFormat.
The report file: csa_report0.json
CSA reports no memory leak for Tmux. (Note that there are two “Logic Error” bugs reported by the “cplusplus.NewDeleteLeaks” checker. They are not memory leaks).
For a visualized report (this report contains other types of bugs, no memory leak has been reported): tmux-csa-pinpoint-report
Commands. Infer can take a compilation database file as input.
cd ~/srcs/tmux/tmux-2.5
time infer -a infer -j 1 --biabduction-only --keep-going --compilation-database .piggy/compile_commands.json
Output:
icse2019ae@ubuntu:~$ cd ~/srcs/tmux/tmux-2.5
icse2019ae@ubuntu:~/srcs/tmux/tmux-2.5$ time infer -a infer -j 1 --biabduction-only --keep-going --compilation-database .piggy/compile_commands.json
Capturing using compilation database...
Starting translating 136 files
....................cmd-if-shell.c:150:7: ERROR translating statement 'InitListExpr'
cmd-if-shell.c:150:7: ERROR translating statement 'CompoundLiteralExpr'
cmd-if-shell.c:150:7: ERROR translating statement 'ParenExpr'
cmd-if-shell.c:150:7: ERROR translating statement 'MemberExpr'
cmd-if-shell.c:150:7: ERROR translating statement 'ParenExpr'
.................
Found 136 source files to analyze in /home/icse2019ae/srcs/tmux/tmux-2.5/infer-out
Starting analysis...
legend:
"F" analyzing a file
"." analyzing a procedure
F...........F........F..................................................................................................................................................
..FF....F.....
Analysis finished in 9min5ss
Found 76 issues
key-bindings.c:27: error: NULL_DEREFERENCE
pointer `tmp` last assigned on line 27 could be null and is dereferenced at line 27, column 1.
25. #include "tmux.h"
26.
27. > RB_GENERATE(key_bindings, key_binding, entry, key_bindings_cmp);
28. RB_GENERATE(key_tables, key_table, entry, key_table_cmp);
29. struct key_tables key_tables = RB_INITIALIZER(&key_tables);
...too many issues to display (limit=10 exceeded), please see /home/icse2019ae/srcs/tmux/tmux-2.5/infer-out/bugs.txt or run `infer-explore` for the remaining issues.
Summary of the reports
NULL_DEREFERENCE: 38
MEMORY_LEAK: 38
real 9m6.837s
user 8m20.956s
sys 0m9.948s
Infer takes around 8m20.956s (501.0 seconds of user time) to check Tmux. Because it finds more issues than it displays on the screen, the full report is written to “/home/icse2019ae/srcs/tmux/tmux-2.5/infer-out/bugs.txt”. Here is the link to it:
The report file: bugs.txt
Visualized report: tmux-infer-pinpoint-report
At the end of this file, Infer reports 38 memory leaks, which become 18 after clustering. After close inspection, only 1 of the 18 is a true positive.
Summary of the reports
NULL_DEREFERENCE: 38
MEMORY_LEAK: 38
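The per-type counts can be read from the summary lines with a few lines of code. This is only a sketch: the function name is hypothetical, and it assumes each summary line has the `NAME: count` shape shown above.

```python
def parse_issue_summary(lines):
    """Map each issue type to its count from 'NAME: count' lines."""
    counts = {}
    for line in lines:
        name, separator, value = line.partition(":")
        # Skip headings and anything that is not 'NAME: <integer>'.
        if separator and value.strip().isdigit():
            counts[name.strip()] = int(value)
    return counts


summary_lines = [
    "Summary of the reports",
    "NULL_DEREFERENCE: 38",
    "MEMORY_LEAK: 38",
]

counts = parse_issue_summary(summary_lines)
print(counts["MEMORY_LEAK"])  # 38
```

Only the `MEMORY_LEAK` entry is relevant to this comparison; the clustering and manual inspection steps still have to be done separately.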
Commands:
pp-pinpoint -report=tmux_pinpoint.json bench36/tmux/tmux.bc
Output:
Settings
------------------------------------------------------------------------------------------
Execution Mode: Normal. Typical balanced settings for finding software vulnerabilities.
------------------------------------------------------------------------------------------
Code Metrics
------------------------------------------------------------------------------------------
Total Number of Functions: 1483
Number of Implemented Functions: 1293
Number of External Functions: 190
External Functions Ratio: 12.81%
Instruction Count: 138155 (Avg: 106.85 per function)
Function With Most Instructions: 1480 (Function: window_copy_command)
Average Cyclomatic Complexity: 7.49
Max Cyclomatic Complexity: 151 (Function: window_copy_command)
------------------------------------------------------------------------------------------
Analyzer Execution
------------------------------------------------------------------------------------------
Code preprocessing ........Done!
Constructing call graph ........Done!
Before CDG Time: 1691310us
CDG-Construction Time: 67364us
CDG-Construction Memory: 0KB
[Falcon] [################################################################################] 100%
Resolving indirect calls ........Done
[SPEG] [################################################################################] 100%
SEG-Building Time: 14317544us
SEG-Building Memory: 987M 952KB
[Canary] [################################################################################] 100%
[PSA Checking] [################################################################################] 100%
PPMaster Time Time: 28484658us
[Post-processing] [################################################################################] 100%
PSA Memory Leak Checker Time Time: 95588us
Peak Memory: 2G 952M 816KB
------------------------------------------------------------------------------------------
Bug reports summary per bug type
------------------------------------------------------------------------------------------
Bug type Number of reports
------------------------------------------------------------------------------------------
Memory Leak : CWE-401 10
Total Time: 47946278us
Peak Memory: 2G 952M 816KB
Report file: tmux_pinpoint.json
Visualized report: tmux-pinpoint-pinpoint-report
Pinpoint takes 47946278us (47.9 seconds) to analyze Tmux. It initially reports ten bugs. As with SMOKE, we cluster them by root cause, which yields two clusters (xmalloc.c:33 and xmalloc.c:47). After inspection, one report is a true positive and the other is a false positive.
Here is the summary for this evaluation:
Performance (in seconds):
Project | SMOKE | Pinpoint | Saber | CSA | Infer |
---|---|---|---|---|---|
tmux | 7.52 | 47.9 | 63.5 | 164.0 | 501.0 |
Precision ([# False Positives]/[# Total Reports]):
Project | SMOKE | Pinpoint | Saber | CSA | Infer |
---|---|---|---|---|---|
tmux | 2/9 | 1/2 | 6/6 | 0/0 | 17/18 |
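The table entries are ratios of false positives to total reports, so they can be turned into percentages for easier comparison. The sketch below uses a few entries from the table; the function name is hypothetical, and treating `0/0` as "not applicable" is our own choice, since a tool that reports nothing has no defined false-positive rate.

```python
from fractions import Fraction


def false_positive_rate(entry):
    """Convert an 'FP/total' table entry to a Fraction, or None for 0 reports."""
    false_positives, total = (int(part) for part in entry.split("/"))
    if total == 0:
        return None  # no reports, so the rate is undefined
    return Fraction(false_positives, total)


entries = {"SMOKE": "2/9", "Saber": "6/6", "CSA": "0/0", "Infer": "17/18"}

for tool, entry in entries.items():
    rate = false_positive_rate(entry)
    label = "n/a" if rate is None else f"{float(rate):.1%}"
    print(f"{tool}: {label}")
```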