Index: /reasoner/evaluation.tex
===================================================================
--- /reasoner/evaluation.tex	(revision 170)
+++ /reasoner/evaluation.tex	(revision 171)
@@ -11,97 +11,98 @@
 \emph{Subjects:} The test cases of EASy-Producer involving reasoning, i.e., the test suites for the SSE reasoner (reused from the reasoner core), the runtime extension for VIL, the scenario test cases (including the models of QualiMaster) as well as the scenario test cases for the BMWi ScaleLog project. It is important to note that the base version of the reasoner\footnote{\label{reasonerBaseVersion}\TBD{Version 1.2.0-SNAPSHOT, git hash}} did not contain the generic data collector, so we had to backport it to the original base version. Moreover, several test cases that have been created for testing the advanced features of the recent version of the SSE reasoner are also not included (and cannot be executed on the base version). However, while the subject sets differ in detail, the most important small and large models are the same in both subject sets.
 
-\emph{Procedure:} We run the four test suites mentioned above each 5 times to collect response time and model. Typically, each test suite runs individually in a JVM. To compensate delayed JIT optimization, we include a ramp-up run that warms up the JVM. For most test cases, a simple in-memory model with a compound type, a collection over that type and a quantor constraint over the container variable is sufficient. However, for the QualiMaster models, we added as ramp-up run a full run of one of the largest models without accounting for reasoning time or without performing instantiation. \TBD{We execute all tests in a script outside Eclipse to avoid disturbances caused by functionality of the IDE. We execute this procedure on the most recent version of EASy-Producer\footnote{Version 1.3.0-SNAPSHOT, TBD{git-hash}} on an actual development machine, a Dell laptop... with Windows 10 and JDK9. For comparison, we run the same version of EASy-Producer on a Dell laptop ... with Windows 7 and JDK8. For both windows machines, we disable first the virus scanner and terminate all programs that are not required for the execution of the tests. On the Windows 7 machine we also run the base version of the reasoner\footref{reasonerBaseVersion}. Finally, for curiosity, we run the the test execution script also on our continuous integration server, a Linux... VM at a point in time when no Jenkins tasks are running.}
+\emph{Procedure:} We run each of the four test suites mentioned above 5 times to collect response time and model. Typically, each test suite runs individually in a JVM. To compensate for delayed JIT optimization, we include a ramp-up run that warms up the JVM. For most test cases, a simple in-memory model with a compound type, a collection over that type and a quantor constraint over the container variable is sufficient. However, for the QualiMaster models, we added a full run of one of the largest models as ramp-up run, without accounting for reasoning time and without performing instantiation. \TBD{We execute all tests in a script outside Eclipse to avoid disturbances caused by functionality of the IDE. We execute this procedure on the most recent version of EASy-Producer\footnote{Version 1.3.0-SNAPSHOT, \TBD{git-hash}} on an actual development machine, a Dell laptop \TBD{XXX} with Windows 10 and JDK9. We select Windows for a better comparison with the base version and also to measure the reasoner in a typical environment. For comparison, we run the same version of EASy-Producer on a Dell laptop \TBD{XXX} with Windows 7 and JDK8. On both Windows machines, we first disable the virus scanner and terminate all programs that are not required for the execution of the tests. On the Windows 7 machine we also run the base version of the reasoner\footref{reasonerBaseVersion}. Finally, out of curiosity, we also run the test execution script on our continuous integration server, a Linux... VM at a point in time when no Jenkins tasks are running.}
 
-\emph{Analysis:} \TBD{XXX}
+\emph{Analysis:} After collecting the runtime results, we execute an R script which combines the results of all test executions for one complete run. The script produces various plots, all relating runtime and model complexity (cf. Section \ref{sectModelComplexity}). Some plots are intended to provide an overview of the reasoning time for different modes, some show how the reasoning time is composed (in terms of model translation time and constraint evaluation time) and some indicate the deviations across all series of test suite executions.
 
 \subsection{Model Complexity}\label{sectModelComplexity}
 
+When evaluating the reasoning capabilities for traditional variability models, such as feature models, typically a measure combining the number of features and the number of constraints is applied, e.g., the average number of constraints per feature or the constraint ratio \TBD{refs}. However, just relating the number of variables to the number of constraints in an IVML model can give a misleading view of the complexity of the configuration / reasoning problem. As long as only top-level variables and Boolean constraints are used, as in some generated models, a measure based on the number of variables and constraints seems to classify such models correctly, as illustrated in \TBD{Figure}. However, models with compounds, containers and quantor constraints appear less complex than they are, as the nested variables are not considered and iterating container expressions are just counted as basic Boolean constraints.
+
+For this evaluation, we combine the basic idea of counting variables / constraints with the approach used in the McCabe complexity metric, i.e., we also count the nested elements and weight them according to their perceived complexity. Although we `calibrate' the weights according to our test models to obtain an objective criterion for displaying complexity vs. reasoning time, we do not claim that the weights are universal and hold for all kinds of IVML models.
+
+The complexity metric applied here consists of four parts:
+\begin{enumerate}
+\item the measure of the variable structure $cpx_v(e)$ of a given model element $e$,
+\item the measure of the constraints $cpx_c(e)$ for a given model element $e$, which builds on
+\item the measure $cpx_e(e)$ of a (constraint or default value) expression $e$, and
+\item the weighting $w_{cpx}(e)$ for a model element, constraint or expression $e$.
+\end{enumerate}
+
+Due to the nested structure of IVML models, most of the formulae for calculating the complexity are recursive. Within these formulae, the weighting function $w_{cpx}(e)$ is mostly applied additively. In two cases, we also use $w_{cpx}(e)$ in a multiplicative fashion, in particular to disable parts of the calculation, e.g., for constraints or nested variables, so that we can also express the traditional counting approach for features and constraints through $cpx_v(e)$ and $cpx_c(e)$ in an integrated way. We now explain the formulae for the four parts. Thereby, we implicitly introduce some new functions for model elements that have not been used so far and that will only be used within this section.
+
+The measure of the variable structure $cpx_v(e)$ calculates a weighted sum over the number of nested variables for an IVML model element, starting with a given configuration. We do not rely here on the meta-model, i.e., the project, as a configuration contains all actually available variables, i.e., also those created by assignment constraints in terms of compound or container instances. When applying $cpx_v(e)$ to a configuration, the sum of the measures for all variables is calculated. In turn, for a variable, we add a weight for the type of the variable (e.g., if we want to weight complex types like containers or compounds higher) to the (recursive) sum of $cpx_v(e)$ over all nested variables. This sum is multiplied by the weight for the IVML type \IVML{Var} for decision variables, so that measuring nested variables can be disabled.
+%
 $$
    cpx_v(e) = \begin{cases} 
        \sum_{v\in vars(e)}cpx_v(v) & \text{if } isConfiguration(e)\\
-       w_{cpx}(type(e)) + cpx(default(e)) & \text{if } isVariable(e)\\
+       w_{cpx}(type(e)) +  & \\
+       w_{cpx}(\IVML{Var}) \cdot \sum_{n\in vars(e)}cpx_v(n) & \text{if } isVariable(e)\\
        0 & \text{else}
        \end{cases}
 $$
-
+%
+The measure of the constraints $cpx_c(e)$ considers all types of model elements that may contain constraints and sums the measures for these constraints recursively. In more detail, for a configuration we consider the contained (instantiated) variables as well as all top-level elements in the underlying project. For a project, we calculate the sum of $cpx_c(e)$ for all model elements. For a decision variable (not the underlying declaration), we build the recursive sum over all nested variables and add the complexity of the default value expression. Here we use the weighting $w_{cpx}(\IVML{Expr})$ for the IVML type \IVML{Expr} (expression) to allow disabling the measure for the default value expression. For an eval-block, we sum up the measures for all nested constraints and all (potentially recursively nested) eval-blocks. Similarly, for annotation assignments, we sum up the measures for all nested constraints and all (potentially recursively nested) annotation assignments. %implicit default values missing
+For a user-defined operation (identified via $isOpDef(e)$), we just take the measure for the function defining the operation into account. For a constraint, we combine a basic additive weight (to enable just counting constraints) with the weighted measure for the expression of the constraint (so that the calculation for the actual expression can be disabled).
+%
 $$
    cpx_c(e) = \begin{cases} 
        \sum_{v\in vars(e)}cpx_c(v) + cpx_c(project(e)) & \text{if } isConfiguration(e)\\
        \sum_{f\in elements(e)}cpx_c(f)  & \text{if } isProject(e)\\
-       \sum_{v\in vars(v)}cpx_c(e) + w_{cpx}(type(e)) & \\
+       \sum_{v\in vars(e)}cpx_c(v) \\
        \text{ } + w_{cpx}(\IVML{Expr}) \cdot cpx_e(default(e)) & \text{if } isVariable(e)\\
-       \sum_{c\in constraints(e)} cpx_c(c) & \text{if } isEval(e)\\ 
+       \sum_{x\in constraints(e)~\cup~evals(e)} cpx_c(x) & \text{if } isEval(e)\\ 
        \sum_{x\in constraints(e)~\cup~assignments(e)} cpx_c(x) & \text{if } isAssignment(e)\\
-       cpx(function(e)) & \text{if } isOpDef(e)\\
+       cpx_e(function(e)) & \text{if } isOpDef(e)\\
        w_{cpx}(e) + w_{cpx}(\IVML{Expr}) \cdot cpx_e(expr(e)) & \text{if } isConstraint(e)\\
        0 & \text{else}
        \end{cases}
 $$
-
-
-%$$
- %  cpx(e) = \closedCases{ 
-  %     cpx(project(cfg)) & \text{if } isConfiguration(e)\\
-  %     %& \text{if } isProject(e)\\ only r_{cpx}
-   %    w_{cpx}(type(e)) + cpx(default(e)) & \text{if } isVariable(e)\\
-    %   cpx(default(e)) & \text{if } isDeclaration(e)\\ % may count compound slots twice
-%       %\sum_{c\in constraints(e)} & \text{if } isEval(e)\\ only r_{cpx}
- %      %\sum_{x\in r_a(e)} cpx(x) & \text{if } isAssignment(e)\\ only r_{cpx}
- %      cpx(function(e)) & \text{if } isOpDef(e)\\
- %      c_{cpx}(e) & \text{if } isConstraint(e)\\
- %      c_{cpx}(e) & \text{if } isExpression(e)\\
- %      0 & \text{else}
- %      } + \sum_{x\in r_{cpx}(x)}cpx(e)
-%$$
-
-%$$
-%   r_{cpx}(e) = \begin{cases}
-%           vars(e) & \text{if } isConfiguration(e)\\
-%           constraints(p)~\cup~evals(p)~\cup~assignments(p) & \text{if } isProject(e)\\
-%           vars(e)~\cup~allRefines^+(type(e)) & \text{if } isVariable(e)\\
-%           constraints(e)~\cup~assignments(e) & \text{if } isCompound(e)\\
-%           constraints(e)~\cup~evals(e) & \text{if } isEval(e)\\
-%           vars(e)~\cup~constraints(e)~\cup~assignments(e) & \text{if } isAssignment(e)\\
-%           \emptySet & \text{else}
-%      \end{cases}
-%$$
-
+%
+The measure for an expression is calculated along the expression tree, i.e., we weight each tree node and typically sum up the measures of the connected sub-trees. A tree node can have various types that we enumerate as cases here. Most of the operations and functions that can be used in an IVML expression are internally represented as a call. For a call, we just sum up the measures of all arguments, e.g., for the plus operation the measures of the left-hand and right-hand side expressions. Parentheses, container iterators, let expressions and accessors mainly consist of a single expression that makes up the measure. For an if-then-else expression, we sum up the measures of the expressions constituting the condition, the then part and the else part. For (compound or container) initializers as well as for expression blocks, we just sum up the measures for the contained expressions.
+%
 $$
     cpx_{e}(e) = w_{cpx}(e) + \begin{cases}
        \sum_{a\in args(e)}cpx_e(a) & \text{if } isCall(e)\\
-       c_{cpx}(expr(e)) & \text{if } isParenthesis(e)~\vee \\
-                                  &  \text{ } isIter(e) \vee isLet(e)\\
+       cpx_e(expr(e)) & \text{if } isParenthesis(e)\\
+       cpx_e(expr(e)) & \text{if } isLet(e)\\
+       cpx_e(expr(e)) & \text{if } isIter(e)\\
+       cpx_e(expr(e)) & \text{if } isAccessor(e)\\
        cpx_e(expr(e)) + cpx_e(then(e)) + cpx_e(else(e)) & \text{if } isIfThen(e)\\
-       c_{cpx}(expr(e)) & \text{if } isAccessor(e)\\
-       \sum_{e\in expressions(e)}c_{cpx}(e) & \text{if } isInitializer(e)~\vee \\ % compound + container
-                                  &  \text{ } isBlock(e)\\
+       \sum_{x\in expressions(e)}cpx_e(x) & \text{if } isInitializer(e)\\ % compound + container
+       \sum_{x\in expressions(e)}cpx_e(x) & \text{if } isBlock(e)\\
+       0 & \text{else}
+    \end{cases}
+$$
+%
+The traditional measure applied to an IVML model would determine the number of top-level variables and the number of constraints. This can be achieved as follows: we disable measuring nested variables and expressions, and enable counting constraints and the remaining variables by setting the type weight for $cpx_v(e)$ and the constraint weight for $cpx_c(e)$ both to 1. All remaining weights, in particular those for constraint expressions, are set to 0.
+%
+$$
+    w_{cpx}(e) = \begin{cases}
+       0 & \text{if } e = \IVML{Var}~\vee~e = \IVML{Expr}\\
+       1 & \text{if } isConstraint(e)\\ % in particular 0 for Expr
+       1 & \text{if } isType(e)\\ %count variables equally
+       0 & \text{else} % scope out \IVML{Variable}
+    \end{cases}
+$$
+%
+As mentioned above, the traditional measure gives a wrong impression of the actual complexity, as nested variables are not counted and the different types of constraints are all considered with the same constant value. According to our experience and the algorithms presented in this document, compound and container types as well as quantor or iterator expressions imply a higher complexity, while accessing the actual value of a variable or of a constant (often used in default value expressions) is typically not so complex. For this experiment, we defined the weighting function as follows and calibrated it by sample runs over the subjects (not claiming to present a universal complexity measure for IVML models here).
+%
+$$
+    w_{cpx}(e) = \begin{cases}
+       1 & \text{if } e = \IVML{Var}~\vee~e = \IVML{Expr}\\
+       2 & \text{if } isCompound(e)~\vee~isContainer(e)\\
+       1 & \text{if } isType(e)\\
+       1 & \text{if } isCall(e)~\vee~isLet(e)\\
+       1 & \text{if } isParenthesis(e)~\vee~isAccessor(e)\\
+       1 & \text{if } isIfThen(e) \\
+       1 & \text{if } isInitializer(e)~\vee~isBlock(e)\\
+       1 & \text{if } e = \IVMLself{}\\
+       5 & \text{if } isIter(e)\\
+       0.25 & \text{if } isVariableUse(e)~\vee~isConstant(e)\\
        0 & \text{else}
     \end{cases}
 $$
 
-Calibrated for the projects in this set
-
-$$
-    w_{cpx}(e) = \begin{cases}
-       2 & \text{if } isCompound(e)\\
-       2 & \text{if } isContainer(e)\\
-%       1 & \text{if } isExpression(e)\\
-       1 & \text{if } \IVML{Expr}\\
-       0.25 & \text{if } isVariable(e)\\
-       0.25 & \text{if } isConstant(e)\\ % 0 for isCall
-       5 & \text{if } isIter(e)\\
-       1 & \text{if } isAccessor(e)\\
-       1 & \text{if } e = \IVMLself{}\\
-       0 & \text{else}
-    \end{cases}
-$$
-
-Tranditional
-$$
-    w_{cpx}(e) = \begin{cases}
-       1 & \text{if } isConstraint(e)\\ % in particular 0 for Expr
-       0 & \text{else}
-    \end{cases}
-$$
+In short, we enable considering nested variables and the contents of expressions. Compounds and containers are considered with a double weight compared with all other types. Most of the expression tree nodes are weighted by 1, including access to the actual instance of a compound (\IVMLself{}). The exceptions are container iterator operations, which we weight higher (5), and access to the values of decision variables and constants, which we weight lower (0.25).
 
 \subsection{Results}\label{sectResults}