GenTree: Using Decision Trees to Learn Interactions for Configurable Software
Source code: https://github.com/unsat/gentree
Modern software systems are increasingly designed to be highly configurable, which increases flexibility but can make programs harder to develop, test, and analyze, e.g., how configuration options are set to reach certain locations, what characterizes the configuration space of an interesting or buggy program behavior? We introduce GenTree, a new dynamic analysis that automatically learns a program's interactions - logical formulae that describe how configuration option settings map to code coverage. GenTree uses an iterative refinement approach that runs the program under a small sample of configurations to obtain coverage data; uses a custom classifying algorithm on these data to build decision trees representing interaction candidates; and then analyzes the trees to generate new configurations to further refine the trees and interactions in the next iteration. Our experiments on 17 configurable systems spanning 4 languages show that GenTree efficiently finds precise interactions using a tiny fraction of the configuration space.
icon search Searchable Transcript
Toggle between list and paragraph view.
[00:00:00.100]Hi, my name is KimHao Nguyen.
[00:00:02.670]I am a sophomore majoring in Computer Science.
[00:00:06.000]My advisor is Professor ThanhVu Nguyen
[00:00:09.700]Our research project
[00:00:10.990]is GenTree: Using Decision Trees to Learn Interactions
[00:00:16.140]for Configurable Software.
[00:00:19.070]The full paper will be presented
[00:00:21.400]at the ICSE'21 Conference,
[00:00:24.650]and the pre-print version is available on this slide.
[00:00:28.850]So let's go over some background.
[00:00:31.600]Many software systems
[00:00:33.290]are designed to be highly configurable,
[00:00:36.370]which increases flexibility,
[00:00:39.140]but make programs harder to develop, test and analyze.
[00:00:44.770]So what's a configurable software?
[00:00:49.350]Look at the UNIX cat command.
[00:00:52.810]So user can tweak the program behavior
[00:00:56.040]by command line arguments
[00:00:58.633]like -A, -b, et cetera.
[00:01:02.570]It has 12 boolean options
[00:01:05.550]and 4,096 possible configurations.
[00:01:10.800]The problem is that faults are often visible
[00:01:14.490]under only some specific configurations
[00:01:18.920]and the configuration space grows exponentially,
[00:01:22.820]which make it infeasible to use an exhaustive search.
[00:01:27.610]Our technique, GenTree, is a dynamic analysis
[00:01:32.560]that can infer interactions of a program.
[00:01:36.850]An interaction for a location or behavior
[00:01:40.970]is defined as a logically weakest formula
[00:01:45.320]over configuration options
[00:01:48.270]such that any configuration satisfying that formula
[00:01:52.670]would cover that location or trigger that behavior.
[00:01:59.170]The interaction can help us
[00:02:01.720]quickly characterize the condition of a fault
[00:02:06.360]or iterate over all configurations causing the fault.
[00:02:13.090]The figure on the right shows an example C program
[00:02:17.870]with the location L0 to L8
[00:02:21.560]annotated with their interactions.
[00:02:27.390]L0's, interaction is true,
[00:02:29.890]so it's always executed,
[00:02:32.470]but only configuration having both u and v be true
[00:02:37.860]will cover L3.
[00:02:40.780]So how does GenTree work?
[00:02:43.750]GenTree is based on iGen and iterative refinement ideas.
[00:02:50.450]First, we create an initial set of configurations
[00:02:55.830]and run the program to obtain location coverage.
[00:03:00.590]Then for each location l,
[00:03:03.230]we build a decision tree
[00:03:05.160]representing a candidate interaction
[00:03:08.590]from the set of configurations that do and do not cover l.
[00:03:15.820]Because we are only using a subset of all configurations,
[00:03:21.430]the trees may be inaccurate.
[00:03:24.240]So we analyze the trees
[00:03:26.920]and generate the candidate counterexamples.
[00:03:31.670]If we found a counterexample,
[00:03:35.140]then we rebuild the trees and repeat the process.
[00:03:40.520]Then we stop when we obtain no new coverage or trees
[00:03:45.320]for several consecutive iterations.
[00:03:49.190]Finally, we convert the decision trees to logical formulae.
[00:03:55.810]Let's go through an example of how GenTree works
[00:03:59.410]to find the interaction for L8 in this C program.
[00:04:07.060]GenTree will generate three initial configurations,
[00:04:11.140]c1 to c3,
[00:04:13.610]and then build a decision tree
[00:04:16.500]and obtain an interaction "not-s",
[00:04:19.940]which is not quite accurate.
[00:04:22.820]So we analyze the tree
[00:04:25.530]and then generate four more configurations,
[00:04:29.450]c4 to c7.
[00:04:31.670]And we found that the configuration c4, 5, and 6
[00:04:37.270]contradict with the first decision tree here.
[00:04:41.380]So we have found the counterexamples.
[00:04:44.950]So we proceed to generate a new decision tree
[00:04:49.330]which correspond to this interaction
[00:04:52.410]which is more accurate than the first one.
[00:04:56.700]Then we repeat the process
[00:04:59.760]and we generate nine more configurations
[00:05:04.330]and generate this decision tree
[00:05:07.280]and then convert it to this interaction
[00:05:10.250]which is exactly our target interaction
[00:05:15.329]To evaluate GenTree,
[00:05:17.580]we have tested it on 17 configurable systems
[00:05:22.460]spanning four languages:
[00:05:24.400]C, Python, Ocaml and Perl.
[00:05:28.110]The programs have configuration space ranges
[00:05:31.830]from 1000 to 100 trillion configurations.
[00:05:39.160]From the experiment,
[00:05:40.850]we found that GenTree is performant and scalable
[00:05:45.680]because it only uses a small sample
[00:05:49.150]out of all possible configurations.
[00:05:54.160]the configuration space of ls is 100 trillion,
[00:05:59.220]but GenTree only need around 100,000 of them
[00:06:04.750]to infer the interactions.
[00:06:07.500]To check the accuracy of the inferred interaction,
[00:06:11.790]we obtain the ground truth
[00:06:13.660]by an exhaustive search, when feasible.
[00:06:17.560]Then we plot the ratio of precise interactions
[00:06:23.160]inferred by GenTree over time.
[00:06:26.370]We found that GenTree can find many precise interactions
[00:06:31.300]at around 20% progress
[00:06:36.380]and spend the rest of time refining hard interactions.
[00:06:41.300]At the end, GenTree found precise interactions
[00:06:45.620]for almost all locations reachable by the testsuite.
[00:06:51.430]GenTree also consistently outperforms
[00:06:54.760]the randomized algorithm,
[00:06:57.180]which means that our counterexample generation heuristic
[00:07:04.660]In conclusion, we have presented GenTree,
[00:07:08.750]a new dynamic analysis technique
[00:07:11.260]to learn program interactions,
[00:07:13.710]which are formulae describing the configurations
[00:07:17.460]covering a location.
[00:07:19.920]The implementation and evaluation data
[00:07:22.860]are available on GitHub.
[00:07:25.080]We will extend our work
[00:07:27.290]to improve the counterexamples generation algorithm
[00:07:31.960]and integrate fuzzing and genetic techniques
[00:07:36.710]to obtain better initial traces.
[00:07:39.790]This project was advised by Professor ThanhVu Nguyen,
[00:07:44.430]and supported by the National Science Foundation,
[00:07:48.250]the Army Research Office and the UCARE Award.
[00:07:52.980]Thank you for listening.
Log in to post comments