[Tech] Symbolic Execution

KLEE

不同版本的KLEE需要不同版本的LLVM,首先安装LLVM,对于新版KLEE,可以使用LLVM高版本,对于KLEE 1.3版本,需要搭配LLVM 3.4 https://github.com/tum-i4/klee-install/blob/master/README.md

高版本KLEE编译方式

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
sudo apt-get install build-essential curl libcap-dev git cmake libncurses5-dev python-minimal python-pip unzip libtcmalloc-minimal4 libgoogle-perftools-dev libsqlite3-dev doxygen

sudo apt-get install clang-9 llvm-9 llvm-9-dev llvm-9-tools

curl -OL https://github.com/google/googletest/archive/release-1.7.0.zip
unzip release-1.7.0.zip

git clone https://github.com/klee/klee-uclibc.git
cd klee-uclibc
./configure --make-llvm-lib
make -j2
cd ..

mkdir build
cd build
cmake -DENABLE_SOLVER_Z3=ON \
-DENABLE_POSIX_RUNTIME=ON \
-DENABLE_KLEE_UCLIBC=ON \
-DKLEE_UCLIBC_PATH=../klee-uclibc \
-DGTEST_SRC_DIR=../googletest-release-1.7.0 \
-DENABLE_SYSTEM_TESTS=ON \
-DENABLE_UNIT_TESTS=ON \
../

make
sudo make install

KLEE使用LLVM IR进行符号执行,对于如下给定程序example.c

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#include <memory.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

char* get_message(const char* message) {
if (strlen(message) <= 4)
return NULL;

if (!('F' == message[0] && '\x1' == message[1]))
return NULL;

unsigned int message_length = *(unsigned int*)&message[2];
char* output_message = (char*)malloc(message_length);

memcpy(output_message,&message[2],message_length);
return output_message;
}

int main(int argc,char** argv) {
char input_buffer[0x10] = {0};
scanf("%s",input_buffer);
char* message_data = get_message(input_buffer);

if (NULL == message_data)
printf("Error for resolve message\n");
else
printf("message data : %s\n",message_data);
return 0;
}

KLEE有concrete执行模式和symbolize模式,如果没有指定任何symbolize的变量,KLEE就模拟执行

KLEE模拟执行

我们首先将其编译为LLVM bitcode example.bc

1
2
3
# -emit-llvm -c for bitcode generation
# generate example.bc
clang -emit-llvm -c example.c -o example.bc

该命令生成example.bc文件,我们将其传递给KLEE,没有任何符号化输入的情况下,KLEE模拟执行程序,仅探索一条path。

1
2
# run klee
klee --libc=uclibc --posix-runtime example.bc
image-20210403151736478

执行后,KLEE将相关信息输出到klee-out-N目录下,其中N表示运行次数。

image-20210403152646024

KLEE符号执行

我们首先将其编译为LLVM bitcode example.bc

1
2
3
4
5
# -emit-llvm -c for bitcode generation
# -g for debug information
# -O0 -Xclang -disable-O0-optnone for optimal level
# generate example_1.bc
clang -I ../klee/include/ -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone example.c

该命令生成example.bc文件,我们将其传递给KLEE,

1
2
# run klee
klee --libc=uclibc --posix-runtime --debug-z3-dump-queries=./z3_queries --solver-backend=z3 --write-paths --write-smt2s --external-calls=all --check-overshift=0 --check-div-zero=0 example_1.bc

As expected, KLEE informs us that it explored three paths in the program and generated one test case for each path explored. The output of a KLEE execution is a directory (in our case klee-out-0) containing the test cases generated by KLEE. KLEE names the output directory klee-out-N where N is the lowest available number (so if we run KLEE again it will create a directory called klee-out-1), and also generates a symbolic link called klee-last to this directory for convenience:

CSmith

1
2
3
4
5
6
7
8
cd csmith
./configure
make

csmith --no-arrays --no-structs --no-unions --no-safe-math --no-pointers --no-longlong --no-float --max-expr-complexity 5 --max-funcs 3 > example.c

csmith > example.c
clang -I ../csmith-2.3.0/runtime example.c

Z3 Solver

1
2
3
4
python scripts/mk_make.py --java
cd build
make
sudo make install
Z3 for Java

https://stackoverflow.com/questions/60403775/how-to-setup-a-java-development-environment-for-z3

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
wget https://github.com/Z3Prover/z3/releases/download/z3-4.8.10/z3-4.8.10-x64-ubuntu-18.04.zip
unzip z3-4.8.10-x64-ubuntu-18.04.zip
export Z3_DIR=<some_path>/z3-4.8.10-x64-ubuntu-18.04

# install maven by apt
sudo apt install maven
# install maven manually
wget https://apache.website-solution.net/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz
tar -xvzf apache-maven-3.6.3-bin.tar.gz
mv apache-maven-3.6.3 /opt
export PATH=/opt/apache-maven-3.6.3/bin:$PATH

mvn install:install-file \
-Dfile=$Z3_DIR/bin/com.microsoft.z3.jar \
-DgroupId=com.microsoft \
-DartifactId=z3 \
-Dversion=4.8.7 \
-Dpackaging=jar \
-DgeneratePom=true

Add the following source to maven pom.xml

1
2
3
4
5
<dependency>
<groupId>com.microsoft</groupId>
<artifactId>z3</artifactId>
<version>4.8.7</version>
</dependency>

https://github.com/Z3Prover/z3/tree/z3-4.8.7/src/api/java

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
import Z3Helper.z3Utility;
import com.microsoft.z3.*;

public class z3Test1 {
public static void main(String[] args) {
Context z3 = new Context();
IntExpr zero = z3.mkInt(0);
IntExpr one = z3.mkInt(1);
IntExpr variable = z3.mkIntConst("a");
IntExpr variable2 = z3.mkIntConst("c");
IntExpr variable3 = z3.mkIntConst("b");
System.out.println(variable.toString());
BoolExpr greater = z3.mkGe(variable,zero);
BoolExpr lessthan = z3.mkLt(variable2,one);
BoolExpr formula = z3.mkAnd(greater,lessthan);
System.out.println(greater);
Expr[] copy =new Expr[2];
copy[0] = variable;
copy[1] = variable2;
Expr[] beCopied =new Expr[2];
beCopied[0] = variable3;
beCopied[1] = variable3;
System.out.println(formula);
System.out.println(z3Utility.constructFreshExpr(formula,z3));
Solver s = z3.mkSolver();
s.add(greater);
Status result = s.check();
System.out.println(result);

}
}

没有Main时打包为Jar

image-20210329131754749

image-20210329131811471

image-20210329131847136

image-20210329131900567

Z3 for Python
1
pip3 install z3-solver
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
s = z3.Solver()
s.from_string(constraint1)
res = s.check()
print('constraint 1:', res)
print(s.model())
print()

s = z3.Solver()
s.from_string(constraint2)
res = s.check()
print('constraint 2:', res)
print(s.model())
print()

s = z3.Solver()
s.from_string(constraint3)
res = s.check()
print('constraint 3:', res)

Combined Constraints

1
2
3
4
5
6
7
8
9
csmith --no-arrays 
--no-structs
--no-unions
--no-safe-math
--no-pointers
--no-longlong
--no-float
--max-expr-complexity 5
--max-funcs 3 > example.c

Add the following statements before main to symbolize the global variables

1
2
3
4
klee_make_symbolic(&g_21, sizeof g_21, "g_21");
klee_make_symbolic(&g_125, sizeof g_125, "g_125");
klee_make_symbolic(&g_22, sizeof g_22, "g_22");
klee_make_symbolic(&g_184, sizeof g_184, "g_184");

Compile the code to LLVM bitcode for symbolic execution

1
2
3
4
5
6
7
8
9
10
# https://clang.llvm.org/docs/ClangCommandLineReference.html
# -emit-llvm -c for bitcode generation
# -g for debug information
# -O0 -Xclang -disable-O0-optnone for optimal level
# generate example_1.bc
clang -I ../csmith-2.3.0/runtime/ -I ../klee/include/ -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone example_1.c
# generate a.out
export LD_LIBRARY_PATH=/home/wchenbt/Projects/klee/build/lib/:$LD_LIBRARY_PATH
clang -I ../csmith-2.3.0/runtime/ -I ../klee/include/ -L ../klee/build/lib -g -O0 -Xclang -disable-O0-optnone example_1.c -lkleeRuntest
KTEST_FILE=klee-last/test000001.ktest ./a.out

Run klee on generate bitcode

1
2
3
4
5
6
7
8
9
10
11
# -write-smt2s: Write .smt2 (SMT-LIBv2) files for each test case (default=false)

klee --libc=uclibc \
--posix-runtime \
--debug-z3-dump-queries=./z3_queries \
--solver-backend=z3 \
--write-paths \
--write-smt2s \
--external-calls=all \
--check-overshift=0 \
--check-div-zero=0 example_1.bc

solve the generated Klee constraint

http://smtlib.cs.uiowa.edu/logics-all.shtml

https://ericpony.github.io/z3py-tutorial/guide-examples.htm

https://ericpony.github.io/z3py-tutorial/advanced-examples.htm

https://rise4fun.com/z3/tutorialcontent/guide

https://www.cs.tau.ac.il/~msagiv/courses/asv/z3py/guide-examples.htm

https://theory.stanford.edu/~nikolaj/programmingz3.html

image-20210407231220676

image-20210407231300975

https://z3prover.github.io/api/html/classz3py_1_1_solver.html#ad1255f8f9ba8926bb04e1e2ab38c8c15