TY - JOUR
T1 - Machine learning steered symbolic execution framework for complex software code
AU - Bu, Lei
AU - Liang, Yongjuan
AU - Xie, Zhunyi
AU - Qian, Hong
AU - Hu, Yi Qi
AU - Yu, Yang
AU - Chen, Xin
AU - Li, Xuandong
N1 - Publisher Copyright:
© 2021, British Computer Society.
PY - 2021/6
Y1 - 2021/6
N2 - During program traversing, symbolic execution collects pathconditions and feeds them to a constraint solver to obtain feasiblesolutions. However, complex path conditions, like nonlinearconstraints, which widely appear in programs, are hard to be handledefficiently by the existing solvers. In this paper, we adapt theclassical symbolic execution framework with a machine learningapproach for constraint satisfaction. The approach samples andlearns from different solutions to identify potentially feasiblearea. This sampling-learning style solving can be applied indifferent class of complex problems easily. Therefore, incorporatingthis approach, our framework, MLBSE, supports the symbolicexecution of not only simple linear path conditions, but alsononlinear arithmetic operations, and even black-box function callsof library methods. Meanwhile, thanks to the theoretical foundationof the machine learning based approach, when the solver fails tosolve a path condition, we can have an estimation of the confidencein the satisfiability (ECS) of the problem to give users insightsabout how the problem is analyzed and whether they could ultimatelyfind a solution. We implement MLBSE on the basis of SymbolicPath Finder (SPF) into a fully automatic Java symbolic executionengine. Users can feed their code to MLBSE directly, which isvery convenient to use. To evaluate its performance, 22 real caseprograms are used as the benchmarks for MLBSE to generate testcases, which involve a total number of 1042 methods that are full ofnonlinear operations, floating-point arithmetic as well as nativemethod calls. Experiment results show that the coverage achieved byMLBSE is much higher than the state-of-the-art tools.
AB - During program traversing, symbolic execution collects pathconditions and feeds them to a constraint solver to obtain feasiblesolutions. However, complex path conditions, like nonlinearconstraints, which widely appear in programs, are hard to be handledefficiently by the existing solvers. In this paper, we adapt theclassical symbolic execution framework with a machine learningapproach for constraint satisfaction. The approach samples andlearns from different solutions to identify potentially feasiblearea. This sampling-learning style solving can be applied indifferent class of complex problems easily. Therefore, incorporatingthis approach, our framework, MLBSE, supports the symbolicexecution of not only simple linear path conditions, but alsononlinear arithmetic operations, and even black-box function callsof library methods. Meanwhile, thanks to the theoretical foundationof the machine learning based approach, when the solver fails tosolve a path condition, we can have an estimation of the confidencein the satisfiability (ECS) of the problem to give users insightsabout how the problem is analyzed and whether they could ultimatelyfind a solution. We implement MLBSE on the basis of SymbolicPath Finder (SPF) into a fully automatic Java symbolic executionengine. Users can feed their code to MLBSE directly, which isvery convenient to use. To evaluate its performance, 22 real caseprograms are used as the benchmarks for MLBSE to generate testcases, which involve a total number of 1042 methods that are full ofnonlinear operations, floating-point arithmetic as well as nativemethod calls. Experiment results show that the coverage achieved byMLBSE is much higher than the state-of-the-art tools.
KW - Constraint solving
KW - Machine learning
KW - Nonlinear path condition
KW - Symbolic execution
UR - https://www.scopus.com/pages/publications/85106508716
U2 - 10.1007/s00165-021-00538-3
DO - 10.1007/s00165-021-00538-3
M3 - 文章
AN - SCOPUS:85106508716
SN - 0934-5043
VL - 33
SP - 301
EP - 323
JO - Formal Aspects of Computing
JF - Formal Aspects of Computing
IS - 3
ER -