Index: /reasoner/approach.tex
===================================================================
--- /reasoner/approach.tex	(revision 183)
+++ /reasoner/approach.tex	(revision 184)
@@ -1,21 +1,23 @@
 \section{Approach}\label{sectApproach}
 
-IVML is a textual variability and configuration modeling language. The syntax of IVML, which is mostly out of scope here (please refer to~\cite{IVML-LS} for the complete specification), is based on well known (programming) language concepts and can be used to express the (meta-)model of a configuration as well as individual configurations. The main concept of IVML is the \emph{typed variable}, which may have a default value expression. For short, IVML offers basic types (Boolean, Integer, Real, String), user-defined compound types (groups of variables, can be subject to multi-inheritance), container types (set, sequence), derived types (constraining or aliasing another type) and constraint types, i.e. variables holding a constraint so that it can be overridden by assigning a new constraint. IVML variables can be further detailed by and interrelated with other variables through constraints. The constraint (expression) language is largely inspired by concepts (and notation) of the Object Constraint Language (OCL)~\cite{OCL24}, but, in contrast to OCL, allows side-effect to enable the definition of configurations as well as the propagation of configuration values. IVML models (given in terms of so called \IVML{projects}), can be imported, staged~\cite{CzarneckiHelsenEisenecker05a} and conditionally frozen, in particular to enable partial instantiation of code, including removal of variation points.
+IVML is a textual variability and configuration modeling language. The syntax of IVML, which is mostly out of scope here (please refer to~\cite{IVML-LS} for the complete specification), is based on well known (programming) language concepts and can be used to express the (meta-)model of a configuration as well as individual configurations. The main concept of IVML is the \emph{typed variable}, which may detailed with a default value (expression). For short, IVML offers basic types (Boolean, Integer, Real, String), user-defined compound types (groups of variables, can be subject to multi-inheritance), container types (set, sequence), and derived types (constraining or aliasing another type). 
 
-Over the last years, we approached the analysis and reasoning for IVML from different perspectives, including traditional reasoning mechanisms or rule engines~\cite{QMD42}. Although these mechanisms are powerful in their own respect and typically available as implementation, using them for IVML requires translating an IVML model and its constraints into the model of the respective reasoning mechanism and executing that mechanism on the translated model. In the past, we identified in particular this transformation as performance bottleneck as well as an obstacle to achieve IVML-completeness, i.e., we were often not able to express all IVML concepts in the respective reasoner model. While traditional reasoners allow for completing a model through feasible ground instances, they typically also perform a kind of constrained state space exploration. Considering the size of usual configuration models (currently the largest IVML model we are aware of has more than 16.000 variables~\cite{EichelbergerQinSizonenko+16, EichelbergerQinSchmid17a}), we experienced that just applying a traditional reasoner to usual IVML models is an inefficient approach. Other related mechanisms such as rule engines can be applied to some degree, however, according to our experience~\cite{EichelbergerQinSizonenko+16, QMD42, QMD43} lead to a significant and infeasible runtime overhead. Moreover, traditional reasoners as well as rule engines are limited as they typically do not support the range of OCL operations required for IVML, in particular container iterators, quantifiers and user-defined constraint operations. Further, also specific reasoning approaches, e.g., for the Object Web Language (OWL)~\cite{KahnPorres15, KolliaGlimmHorrocks11} might be an option as they partially support quantification, but they are similarly limited regarding the range of OCL operations. Moreover, OCL-based approaches such as~\cite{DemuthLoecherZschaler04, ClavelEgea06} typically also focus on forward-chaining, but would require deep extensions, as OCL does not allow for side-effects or value propagation as required by IVML. 
+IVML variables can be further detailed by and interrelated through constraints. The constraint (expression) language is mostly inspired by concepts (and notation) of the Object Constraint Language (OCL)~\cite{OCL24}. In contrast to OCL, IVML allows side-effects to specify the value of variables in a configuration and to enable the propagation of configuration values through constraints. An IVML model (given in terms of a so called \IVML{project}), can be imported, staged~\cite{CzarneckiHelsenEisenecker05a} and conditionally frozen, in particular to enable partial instantiation of code, including removal of variation points. A specific type in IVML not mentioned above is the constraint type, which allows declaring variables that hold a constraint as value. The constraint is used as any other constraint to validate the model, but as also for any other variable, the value can be defined, changed and override by the configuration. 
 
-Due to these experiences, we decided to pursue a \emph{mixed reasoning approach}, which attempts to reduce / simplify the reasoning problem towards an instance that (if needed) can finally be handled by a traditional reasoning approach. In our case, the basis for a mixed approach is \emph{forward-chaining}, i.e., evaluation of a given constraint base as long as no constraints are left over, while re-scheduling constraints that depend on a variable changed during reasoning. The SSE IVML reasoner as described in this document realizes this basis, while the actual mixed approach is part of future work. Other work also combine multiple reasoning approaches to achieve advanced reasoning capabilities, e.g., evolutionary algorithms with SAT-solving to realize many-objective optimization for feature modeling~\cite{XiangZhouZheng+2018}.
+Over the last years, we approached the analysis and reasoning for IVML from different perspectives, including traditional reasoning mechanisms or rule engines~\cite{QMD42}. Although these mechanisms are powerful in their own respect, using them for IVML requires translating an IVML model and its constraints into the model of the respective reasoning mechanism and executing that mechanism on the translated model. In the past, we identified in particular this transformation as a significant performance bottleneck as well as an obstacle to achieve IVML-completeness, i.e., we were often not able to express all IVML concepts in the respective reasoner model. While traditional reasoners allow for completing a model through feasible ground instances, they typically also perform a kind of (constrained) state space exploration. Considering the size of usual configuration models (currently the largest IVML model we are aware of has more than 16.000 variables~\cite{EichelbergerQinSchmid17a, EichelbergerQinSizonenko+16}), we experienced that just applying a traditional reasoner to usual IVML models is an inefficient approach. Other related mechanisms such as rule engines can be applied to some degree, however, according to our experience~\cite{EichelbergerQinSizonenko+16, QMD42, QMD43} lead to a significant and infeasible runtime overhead. Moreover, traditional reasoners as well as rule engines are limited as they typically do not support the range of OCL concepts required for IVML, in particular container iterators, quantifiers and user-defined functions. Further, also specific reasoning approaches, e.g., for the Object Web Language (OWL)~\cite{KahnPorres15, KolliaGlimmHorrocks11} might be an option as they partially support quantification. However, they are often similarly limited regarding the range of OCL operations. Moreover, OCL-based approaches such as~\cite{DemuthLoecherZschaler04, ClavelEgea06} typically also focus on forward-chaining, but would require deep extensions, as OCL does not allow for side-effects or value propagation as required by IVML. 
 
-It is important to notice that IVML~\cite{IVML-LS} supports declarative specification of constraints, i.e., modeling concepts and constraints can be specified regardless of their actual evaluation sequence. To ensure that values for variables are uniquely determined, in particular in presence of default values, IVML allows changing the value of a variable only once within a project scope. Otherwise the reasoning mechanism must issue an error. Thus, reasoning must happen project-wise, in a depth-first traversal along the project import dependencies. Moreover, to speed up reasoning in specific situations, some form of sequence of the constraints can be indicated through the concept of partial evaluations, i.e., eval-blocks stated within a project or a compound. An eval-block consists of constraints or nested eval-blocks, while nested eval-blocks implicitly define an evaluation sequence. In other words, a project is the outermost (implicit) eval-block and nested eval-blocks shall be evaluated in their nesting sequence, while eval-blocks on the same nesting level do not imply any evaluation sequence. 
+Due to these experiences, we decided to pursue a \emph{mixed reasoning approach}, which attempts to reduce / simplify the reasoning problem towards an instance that (if needed) can be handled by a traditional reasoner. Other work also combine multiple reasoning approaches to achieve advanced reasoning capabilities, e.g., evolutionary algorithms with SAT-solving to realize many-objective optimization for feature modeling~\cite{XiangZhouZheng+2018}. In our case, the basis for a mixed approach is \emph{forward-chaining}, i.e., evaluation of a given constraint base as long as no constraints are left over, while re-scheduling constraints that depend on a variable changed during reasoning. The SSE IVML reasoner as described in this document realizes this basis, while the actual mixed approach is part of future work. 
 
-However, not all IVML constraints can be directly used for reasoning. This is in particular true for constraints defined in compound types, i.e., these constraints must hold for all instances, or, indirectly, through types used within collections, i.e., these constraints must hold for the respective collection elements. Such constraints typically utilize local variables defined within the compound including the special pre-defined variable \IVMLself{} pointing to the actual compound instance. Allowing such implicit constraints rather than requiring that all constraints are defined directly on the configuration variables increases consistency and helps reducing model sizes and complexity. 
+It is important to notice that IVML~\cite{IVML-LS} supports declarative specification of constraints, i.e., modeling concepts and constraints can be specified regardless of their actual evaluation sequence. To ensure that values for variables are uniquely determined, in particular in presence of default values, IVML allows changing the value of a variable only once within a project scope. Otherwise the reasoning mechanism must issue an error. Thus, reasoning must happen project-wise, along a depth-first traversal along the project import dependencies. Moreover, to speed up reasoning in specific situations, some form of sequence of the constraints can be indicated through the concept of partial evaluations, i.e., in terms of so called eval-blocks stated within a project or a compound. An eval-block consists of constraints or nested eval-blocks, while nested eval-blocks implicitly define an evaluation sequence. In other words, a project is the outermost (implicit) eval-block and nested eval-blocks shall be evaluated in their nesting sequence, while eval-blocks on the same nesting level do not imply any evaluation sequence. 
+
+However, not all IVML constraints can be directly used for reasoning. This is in particular true for constraints defined within compound types. These constraints must hold for all instances of the compound. Such constraints are typically based on variables defined within the compound type (slot) as well as te special pre-defined variable \IVMLself{} that points to the actual compound instance. Allowing such implicit constraints rather than requiring that all constraints are defined directly on variables increases consistency and helps reducing model sizes and complexity. 
 %Cross-references to other types are required to be specified through an explicit accessor, i.e., an expression mentioning the variable and the respective nested variable(s). 
-As a consequence, the reasoning mechanism must relate type-based constraints to the correct underlying configuration variables. In our approach, we perform first a \emph{constraint translation} to instantiate the constraints for configuration variables based on the actual type of the variable or its value. An alternative could be reasoning over type constraints and modifying the mapping to the variables on demand before evaluation. This may save memory and runtime (required for constraint translation / creation), but, however, also increases the complexity of managing the actual constraints to be evaluated.
+As a consequence, the reasoning mechanism must relate type-based constraints to the correct underlying variables. In our approach, we perform first a \emph{constraint translation} to instantiate the constraints for variables based on the actual type of the variable or its value. An alternative could be to rely on a dynamic mapping for the variables used in type constraints, i.e., to substitute the actual value on-the-fly before evaluation. While this may save memory and the time required for constraint translation, however, this also increases the complexity of constraint management while reasoning. For now, we opted for an approach based on constraint translation, which separates translation and evaluation into two clear phases. Based on this version, we may evaluate such an on-demand mapping in an alternative implementation of the reasoner in future work.
 
-Based on our experience~\cite{EichelbergerQinSizonenko+16, QMD42, QMD43} and taking into account the specific requirements of the default values, project scopes, eval-block scopes, we are convinced that (at least the first step in a mixed reasoning approach) shall directly operate on an IVML model, also to achieve a high runtime performance. In particular, model transformations such as the constraint translation shall happen only if required to reduce overhead on the reasoning time. After the required constraint translation, constraint evaluation and forward-chaining is executed on the collected constraint base. Besides the resulting configuration, \emph{detained reasoning output} is required, so that subsequent reasoners in a reasoning chain can continue on the results. For this purpose, reasoning error messages shall not only contain the failed constraint, but also additional information such as failing sub-expressions. Based on such output, we can imagine that more advanced reasoning engines such as SMT-reasoners could complement the described approach. Translating the information about failed constraints into a model that is suitable for sophisticated reasoning approaches could enable advanced capabilities, such as value determination, instance creation or configuration completion / repair. However, such chained reasoning is out of the scope of this document.
+Based on our experience~\cite{EichelbergerQinSizonenko+16, QMD42, QMD43} and taking into account the specific requirements of IVML default values, project scopes, and eval-block scopes, we are convinced that basic reasoning for IVML shall directly operate on an IVML model, in particular to enable for high performance. Constraint transformations such as the constraint translation shall happen only if required to reduce overhead during reasoning time. After the required constraint translation, constraint evaluation and forward-chaining is executed on the collected constraint base. However, this may not be the end of reasoning operations, as subsequent reasoning phases may continue working on the results of the forward-reasoning. Such a mixed / chained approach can then apply advanced reasoning functionality to a partial configuration determined by forward reasoning, i.e., an already reduced reasoning problem. Therefore, besides the resulting configuration, \emph{detailed reasoning output} is required, subsequent reasoners can rely on. For this purpose, reasoning error messages shall not only contain the failed constraint, but also additional information such as failing sub-expressions. This is also helpful for other forms of analysis, e.g., determining constraint violations as a basis for adaptive decision making~\cite{Eichelberger18}.
 
-Basically, reasoning over all variables of a project including all constraints is needed. This may differ for certain use cases where considering all constraints neither required nor efficient in terms of reasoning time. The first use case is \emph{incremental reasoning}, i.e., starting with a given configuration and just analyzing the changes made by the user. The idea is to provide immediate reasoning results, so that reasoning even on complex and large models can happen upon every change made by the user. The second use case is \emph{runtime reasoning}, e.g., reasoning for validity of a configuration at runtime of a system to identify validations and to trigger re-configuration or adaptation~\cite{Eichelberger16}. In this case, we may assume that the underlying IVML model is not changed, so that initially translated constraints can be re-used. This assumption may does not hold in the incremental reasoning case. For supporting these two cases, we introduce two main \emph{reasoning modes}, namely complete (full) reasoning and incremental reasoning, while incremental reasoning can be achieved through the following two techniques.
+Often, reasoning over all variables of a project including all defined constraints, i.e., \emph{full reasoning}, is needed. This may differ for use cases in which considering all constraints is neither required nor efficient in terms of reasoning time. The first use case is \emph{incremental reasoning}, i.e., starting with a given configuration and performing reasoning just for those variables that have been changed between two subsequent reasoner runs, e.g., by the user. The idea is to provide immediate reasoning results, so that reasoning can happen upon every change made by the user even on complex and large models. It is important to note that in incremental reasoning the underlying model structures may change, so that from time to time also a full reasoning run may be required. The second use case is \emph{runtime reasoning}, e.g., reasoning for validity of a configuration at runtime of a system to identify validations and to trigger re-configuration or adaptation~\cite{Eichelberger16, Eichelberger18}. In contrast to incremental reasoning, in this case we assume that the underlying IVML model is not changed, so that initially translated constraints can always be re-used. For supporting these two cases, we will relay on the following two techniques:
 \begin{itemize}
-  \item \emph{Creation of a partial constraint base:} Constraints assigning already processed default values or constraints over variables that cannot change anymore (frozen variables) are not added to the constraint base. The remaining configuration variables are assumed to be properly initialized or shall be instantiated by the reasoner if possible. Thus, reasoning focuses on non-frozen variables\footnote{In EASy-Producer, this technique can be enabled through the reasoner configuration.}, in particular changed variables. In our experiments, this mechanism can save up to 65\% reasoning time.
+  \item \emph{Creation of a partial constraint base} due to changed variables, in particular non-frozen variables\footnote{In EASy-Producer, this technique can be enabled through the reasoner configuration.}. In more detail, constraints assigning already processed default values or constraints over variables that cannot change anymore (frozen variables) are not added to the constraint base. The remaining configuration variables are assumed to be properly initialized (initial full reasoning) or shall be determined during the actual run by the reasoner if possible.  
   \item \emph{Re-using a previously created constraint base:} The initial constraint base created by the constraint translation is stored in memory and re-stored when applying the reasoner again to the same (structurally unchanged) model, i.e., we require that no new types or variables are added between two reasoning runs. This technique\footnote{This technique can be applied by requesting a re-usable reasoner instance from the EASy-Producer reasoner core and by using that instance instead of the default interface provided by the static reasoning fronted.} trades off reasoning time for memory consumption and can even be combined with creating a partial constraint base, saving up to further 75\% reasoning time.
 \end{itemize}
Index: /reasoner/reasoner.bib
===================================================================
--- /reasoner/reasoner.bib	(revision 183)
+++ /reasoner/reasoner.bib	(revision 184)
@@ -446,3 +446,11 @@
 } 
 
+@inproceedings{Eichelberger18,
+  author = "Holger Eichelberger",
+  title = "Flexible System-level Monitoring of Heterogeneous Big Data Streaming Systems",
+  year = 2018,
+  booktitle = "Proceedings of the 44th Euromicro Conference on Software Engineering and Advanced Applications",
+  pages = "289--292",
+}
+
 @Comment{jabref-meta: databaseType:bibtex;}