Index: /reasoner/mainAlgorithms.tex
===================================================================
--- /reasoner/mainAlgorithms.tex	(revision 168)
+++ /reasoner/mainAlgorithms.tex	(revision 169)
@@ -222,5 +222,5 @@
         $hasTimeout \assng checkForTimeout()$\; \label{algEvalLoopTimeout}
     } \label{algEvalLoopLoopEnd}
- \caption{Constraint evaluation loop (\IVML{evaluateConstraints}).}\label{algEvalLoop}
+ \caption{Constraint evaluation loop (\IVML{evaluateConstraints}).}\label{algEvalLoop}\label{algEvaluateConstraints}
 \end{algorithm}
 
Index: /reasoner/performance.tex
===================================================================
--- /reasoner/performance.tex	(revision 168)
+++ /reasoner/performance.tex	(revision 169)
@@ -1,21 +1,32 @@
 \section{Performance Considerations}\label{sectPerformance}
 
-In this section, we briefly collect some observations seen/refactored/solved/left over during the improvement of the reasoner. Although the considerations may appear rather reasoner specific or specific to the IVML object model, they typically stem from more general Java performance optimization knowledge typically hidden behind constraints, reasoner or model data structures. \TBD{The individual texts may be pretty short, and are currently just intended for remembering/recording the obstacles.}
+In this section, we discuss some observations made during the revision of the reasoner. We document these observations here as a general explanation of the implementation, but also to avoid that future revisions accidentally violate undocumented performance improvements. Baseline for this discussion is again the version from January 2018. Although the considerations may at a glance appear rather specific, they typically stem from more general Java performance optimization knowledge typically hidden behind constraints, reasoner or model data structures.
 
 \begin{itemize}
-    \item For lists of unknown size and for the expected capacity of IVML models, use \textbf{linked lists} instead of array lists. Main operations are adding, iterating over all, lookup of and removing elements to clean up memory incrementally, i.e., linked lists can avoid performance-impacting resizing effects. To speed up removal and lookup, we combined a linked list and a hashtable in a special constraint base class. Index-based access is not needed, constraints are just added and one iteration is needed to build up the constraint base or to perform reasoning. Pre-calculation of the expected size of the data structures per model seems to be unfeasible as this requires a second run over all model elements.
-    \item Process the \textbf{constraint base} iteratively and remove processed constraints directly after processing. This again requires a linked list as otherwise incremental resizing effects can affect the performance. In summary, building the constraint base and reasoning causes peak memory effects that are better from a performance point of view than (re-)allocating big array memory blocks.
-    \item Avoid having too many unneeded \textbf{partitions of the constraint base} being stored during the translation in different lists, that are finally added to the constraint base. Try to directly add constraints where possible to the constraint base, e.g., for default constraints to avoid iterating over the complete constraint base before reasoning. However, due to the prioritization of the constraint types and their required sequence, this is not always possible. One alternative, a numeric prioritization of the individual constraints would require sorting the constraint base, at the new parts not the left-over constraints to be re-evaluated from a previous reasoning steps. However, implicit sorting on insertion according to priorities could help here. To speed up operations for such constraint lists, we use a specific linked constraint list, which transfers the nodes the linked nodes upon adding the constraints to the constraint base (based on this specific linked list).
-    \item Create/translate a \textbf{constraint in one step}, if required in compounds split into filling the variability mapping and then creating the constraints based on the actual mapping. 
-    \item Try to figure out whether a certain \textbf{constraint (type) is actually needed} for reasoning and avoid its translation/creation wherever feasible. \TBD{Can we move the decision about keeping a constraint from adding it to the constraint base before constraint creation. However, this requires changes/additional alternatives for different code parts.}. Thus, don't re-process or filter certain constraint types at the end (done in the initial version for default and derived type constraints), as this implies iterating multiple times over the (partial) constraint base and re-creating/throwing away constraint instances in memory.
-    \item \TBD{Avoid storing constraints that can be \textbf{processed immediately}, e.g., default constraints without dependencies. However, if a default constraint fails, it must be added to the constraint basis for later possible re-evaluation if dependent values become available/right.}
-    \item For identifying whether constraints are already in the constraint base, use a \textbf{fast look-up structure} instead of an iteration over the constraint base. Although the initial implementation utilized here reference equality (faster than the equals method), the current lookup structure is significantly faster (approx. 600 ms on large models, at that time around 25\% of runtime) than the original approach trading off execution speed for (peek) memory. Here a lookup structure based on reference equality could further speed up reasoning, which, however, may in the end be slower than the built in collections due to JIT compilation and JVM optimization effects. This is even true for the stack-based variable mapping $\variableMapping$, which performs as fast as a flat map, which may roughly serve for the main purpose of $\variableMapping$, but not for composing complex constraints from nested compunds and containers.
-    \item Incremental reasoning using a \textbf{stored constraint base} may speed up reasoning but only achieves this through large memory allocations. Ensure initializing the model through default constraints so that the actual stored constraint base only contains really needed constraint types, i.e., no default constraints and, if possible, no constraints for which all depending variables are already frozen.
-    \item \textbf{Reuse visitor instances} if applied more than once, in particular reasoners with more complex internal data structures such as maps or lists. This required some refactoring of the IVML model implementation.
-    \item \textbf{Avoid temporary instances}, e.g., iterators, temporary constraint creation, temporary data structures wherever possible. One option replacing a collection structure is to define a functor for processing the elements and to create a reusable instance of that functor. If permitted by the Java environment settings (EASy often compiles against JDK 1.6), also a lambda-function instead of a heavy-weight functor class may be appropriated.
-    \item \textbf{Avoid multipe variable substitutions} per constraint expressions, i.e., join variable substitutions wherever possible. Variable substitution copies the expression. Initially, performing a substitution multiple time on the same expression caused creating multiple copies of all nodes in memry, i.e., a follow-up copy usually was used instead of the original copy making the previous copies effectively superfluous. As sometimes multiple substitutions can only be avoided with additional (not straightforward) code, e.g., for a sequential call of Algorithm \ref{algTranslateCompoundDeclaration} and \ref{algTranslateDeclaration}, we revised the IVML expression copy mechanism to create instances of expression tree nodes only if a substitution happend, i.e., new expression trees are absolutely required.
-    \item \textbf{Use instance pooling} if datastructures are temporarily used, e.g., variable accessors in expression evaluation or sets of compounds in reasoning (20-30 ms on larger models, at that time around 10\% of runtime).
-\item visitor is faster than (IVML-)instanceof
+    \item Typically, array-based lists should be initialized with (an estimate of) the required size~\cite{Shirazi02}. For IVML models, pre-computing such sizes appears infeasible, as it would require a further traversal of the input model. For lists of unknown size, we therefore use \emph{linked lists} instead of array-based lists. The main operations required for reasoning are adding elements, iterating over all elements, looking up elements and removing elements to clean up memory incrementally. Here, linked lists avoid performance-impacting resizing operations, although linked lists in Java typically require explicit use of iterators, which should be avoided~\cite{Shirazi02}. 
+
+   \item For the central list-based data structures, namely constraint lists, index-based access is not needed. Constraints are added to (temporary) lists and a single iteration is needed to transfer the constraints into the (initial) constraint base. To \emph{avoid iterators} here, we created our own container class (\class{ConstraintList}), which performs typical traversal operations such as element transfers internally, using basic loops~\cite{Shirazi02}. Moreover, we equipped these operations with the ability to directly transfer the internal list nodes instead of only the payload, to avoid unnecessary (peak) memory allocation during the transfer operations. 
+
+   \item The \emph{constraint base} stores the constraints to be processed. Initially, while reasoning on a single project, a pointer-based iterator traversed the array-based constraint list and the algorithm kept all processed constraints in memory; constraint re-scheduling just added more constraints. Using a linked list avoids the re-allocation of big memory blocks. Moreover, using the list as a queue enables dynamic memory usage according to the actual needs, i.e., the current constraint is obtained by removing the head element and the main reasoning loop in Algorithm \ref{algEvaluateConstraints} can simply operate until the constraint base is empty (if no further re-scheduled constraints are added). To also speed up lookup and removal for re-scheduling, we extended the \class{ConstraintList} and integrated it with a hashtable (\class{ConstraintBase}). This avoids the initially quadratic complexity of constraint re-scheduling in forward reasoning. On large models, the difference in ad hoc performance comparisons was roughly 600 ms, i.e., around 25\% of the reasoning time, even though we do not use JIT-optimized data structures.
+
+    \item In general, maintaining unneeded \emph{partitions of the constraint base} should be avoided. Although we were able to speed up the creation of the constraint base by special transfer operations, it is better to directly add constraints to the constraint base where possible. Thus, we removed several unneeded temporary lists. However, adding constraints directly is not always possible due to the incremental construction while traversing the model and the prioritization of certain constraints, such as default values or constraints in eval-blocks. As an alternative, we could use a numeric priority stored along with the individual constraints (like the attached type information) together with a sorted data structure. So far, we have avoided storing additional information in the constraints; we have also not yet investigated specific management records for each individual constraint, which would be required when avoiding constraint translation by changing variable contexts as discussed in Section \ref{sectApproach}.
+
+    \item Determine whether a certain \emph{constraint (type) is actually needed} for reasoning and avoid its translation/creation wherever feasible. While this is easy for some constraint types, e.g., annotations in incremental reasoning mode, checking for frozen constraints would require tests in various places of the constraint translation. So far, we opted for keeping these tests in a central place ($add$ function) for better maintainability, but testing constraint properties earlier would probably allow for better performance.
+
+    \item Create/translate a \emph{constraint in one step}; in particular, avoid multiple variable substitutions for the same constraint. Initially, some constraint types, such as those for default values and derived types, were substituted partially, then collected and finally substituted again. This implies iterating multiple times over the same constraint base as well as re-creating and throwing away constraint instances. We managed to avoid such multiple substitutions and, moreover, improved the variable substitution so that new constraint instances are only created if variable substitutions are actually applied.
+
+    \item \emph{Reuse visitor instances} if the same visitor is applied more than once (as also recommended for other types of instances~\cite{Shirazi02}), in particular if the visitor maintains complex internal data structures such as maps or lists. If possible, even avoid internal storage structures that are consulted only once, e.g., for transferring the results of a traversal into another data structure. To avoid this, we created an abstract visitor which leaves the handling of the results open. One concrete visitor replaces the original visitor, still storing the results in an internal data structure as before. In contrast, the visitor used by the reasoner stores the results directly in the target data structure. Informal performance observations showed that replacing a single visitor (out of 5 used by the implementation) improved the performance on large models by roughly 80--100 ms, i.e., 3--4\% of the reasoning time.
+
+    \item If data structures are used only temporarily, \emph{instance pooling} can be rather helpful~\cite{EichelbergerSassSchmid16, Shirazi02}, but requires explicit memory management, i.e., obtaining and releasing instances. For example, we pool frequently used sets of compounds in the translation of compounds, which saves 20--30 ms on larger models. Moreover, we pool frequently used internal structures of the expression evaluation (but do not have measurements available, as this was part of the original implementation).
+
+    \item In general, \emph{avoid temporary instances}, e.g., iterators, temporary constraints, or temporary data structures, wherever possible. Possible strategies are maintaining a singleton instance in single-threaded processing, using instance pools or simply using different types of loops. Sometimes, applying a (reusable) functor to a collection is also more efficient than an explicit loop. If permitted by the compilation settings (EASy-Producer is usually compiled against JDK 1.6), a lambda function instead of a heavy-weight functor class may also be appropriate.
+
+    \item Avoid \emph{tests with instanceof}. We experienced that a (reusable) visitor can be significantly faster than a series of alternatives with instanceof comparisons (or similar operations of the IVML type system). However, implementing a visitor class can interrupt the code reading flow.
+
+    \item Incremental reasoning using a \emph{stored constraint base} may speed up reasoning, but only achieves this through significant memory use. Thus, where possible, irrelevant constraints such as default value and frozen constraints should be excluded. Moreover, we experienced a ramp-up time of around 40 ms when using the stored constraint base, which is probably due to the required constraint transfer operations, i.e., for keeping the stored constraint base and copying the stored constraints to the active constraint base.
+
+    \item During the revision, we replaced an originally \emph{flat hash table} realizing $\variableMapping$ by a stack of hash tables (cf. Section \ref{sectVarMapping}) due to correctness issues and to enable the incremental creation of nested container constraints. Although the stacked structure may require more memory, the overall reasoning speed was roughly the same after the exchange of the data structures.
+
 \end{itemize}
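The node-transfer idea behind \class{ConstraintList} can be sketched as a minimal singly linked list whose transfer operation re-links the nodes of the source list in constant time instead of copying each payload into a freshly allocated node. All names and method signatures in this sketch are illustrative, not the actual implementation:

```java
// Hypothetical sketch: a singly linked constraint list with O(1) node
// transfer and plain-loop traversal (no iterator allocation).
final class ConstraintList<T> {
    private static final class Node<T> {
        T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private Node<T> head;
    private Node<T> tail;
    private int size;

    void add(T value) {
        Node<T> node = new Node<>(value);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
        }
        tail = node;
        size++;
    }

    /** Moves all nodes of {@code other} to the end of this list in O(1),
     *  leaving {@code other} empty; no per-element allocation happens. */
    void transferFrom(ConstraintList<T> other) {
        if (other.head == null) {
            return;
        }
        if (tail == null) {
            head = other.head;
        } else {
            tail.next = other.head;
        }
        tail = other.tail;
        size += other.size;
        other.head = null;
        other.tail = null;
        other.size = 0;
    }

    int size() { return size; }

    /** Snapshot via a basic loop instead of an iterator. */
    Object[] toArray() {
        Object[] result = new Object[size];
        int i = 0;
        for (Node<T> n = head; n != null; n = n.next) {
            result[i++] = n.value;
        }
        return result;
    }
}
```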
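The combination of queue-based processing and hash-based lookup described for the \class{ConstraintBase} can be sketched as follows. For brevity, this hypothetical sketch relies on the built-in \class{ArrayDeque} and \class{HashSet} rather than on the linked \class{ConstraintList}; the point is the O(1) duplicate check during re-scheduling:

```java
// Hypothetical sketch: a FIFO work queue of constraints combined with a
// hash-based membership set, so that re-scheduling checks "already queued?"
// in O(1) instead of scanning the whole queue.
final class ConstraintBase<C> {
    private final java.util.ArrayDeque<C> queue = new java.util.ArrayDeque<>();
    private final java.util.HashSet<C> scheduled = new java.util.HashSet<>();

    /** Adds the constraint unless it is already scheduled. */
    boolean schedule(C constraint) {
        if (scheduled.add(constraint)) {   // O(1) duplicate check
            queue.add(constraint);
            return true;
        }
        return false;
    }

    /** Removes and returns the next constraint, or null if empty;
     *  processed constraints release their memory incrementally. */
    C next() {
        C c = queue.poll();
        if (c != null) {
            scheduled.remove(c);
        }
        return c;
    }

    boolean isEmpty() { return queue.isEmpty(); }
}
```

The main reasoning loop then simply calls `next()` until the base is empty, while constraint re-scheduling goes through `schedule(...)`.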
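The instance pooling applied, e.g., to the sets of compounds can be sketched as follows; the pool class and its obtain/release protocol are illustrative, and the caller must balance every obtain with a release:

```java
// Hypothetical sketch: a pool of temporarily used HashSet instances to
// avoid repeated allocation; released sets are cleared and reused.
final class SetPool<T> {
    private final java.util.ArrayDeque<java.util.HashSet<T>> pool =
        new java.util.ArrayDeque<>();

    /** Returns a pooled set if available, otherwise allocates a new one. */
    java.util.HashSet<T> obtain() {
        java.util.HashSet<T> set = pool.poll();
        return set != null ? set : new java.util.HashSet<>();
    }

    /** Resets the set and returns it to the pool for later reuse. */
    void release(java.util.HashSet<T> set) {
        set.clear();
        pool.push(set);
    }
}
```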
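The difference between instanceof chains and visitor-based dispatch can be sketched with hypothetical stand-ins for IVML model elements; a single visitor instance is reused across all nodes, replacing a series of instanceof alternatives by one virtual dispatch per node:

```java
// Hypothetical sketch: visitor dispatch over expression nodes instead of
// instanceof chains; the node classes are illustrative stand-ins.
interface Expression {
    void accept(ExprVisitor v);
}
interface ExprVisitor {
    void visitConstant(Constant c);
    void visitVariable(Variable v);
}
final class Constant implements Expression {
    final int value;
    Constant(int value) { this.value = value; }
    public void accept(ExprVisitor v) { v.visitConstant(this); }
}
final class Variable implements Expression {
    final String name;
    Variable(String name) { this.name = name; }
    public void accept(ExprVisitor v) { v.visitVariable(this); }
}
// One reusable visitor instance; per node, a single virtual call replaces
// a chain of instanceof tests.
final class CountingVisitor implements ExprVisitor {
    int constants;
    int variables;
    public void visitConstant(Constant c) { constants++; }
    public void visitVariable(Variable v) { variables++; }
}
```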
 
-There are still opportunities for further speeding up the reasoning operations. On an Intel core i7, the reasoner reaches around 15-43 constraint evaluations in average per millisecond - runtime vs. full reasoning on the latest QualiMasster test model, currently reasoning time includes translation and evaluation time \TBD{Sepearte timing}. This is already an improvement over the initial version, which operated at around 10 evaluations per millisecond in average (also with a higher number of constraint evaluations due to unneeded and uneffective constraints). The constraint evaluation time probably suffers from the immutable structured values in EASy-Producer, which are created for each assignment. Knowing that most of the assignments in this model are done for basic IVML datatypes, a fast track for setting integer, double, boolean and String values could increase the number of evaluations, but requires careful extension, as changing values shall only be allowed via decision variables, which take care of the current assignment state, e.g., \IVML{FROZEN}.
+Although we improved the performance of the reasoner significantly, there are still opportunities for further speeding up the reasoning operations. \TBD{On an Intel Core i7, the reasoner reaches around 15--43 constraint evaluations per millisecond on average (runtime vs. full reasoning on the latest QualiMaster test model); currently, the reasoning time includes both translation and evaluation time. This is already an improvement over the initial version, which operated at around 10 evaluations per millisecond on average (also with a higher number of constraint evaluations due to unneeded and ineffective constraints).} Potential improvements are filtering out irrelevant constraints as early as possible, evaluating certain constraints immediately and storing them in the constraint base only in case of failures, or revising inefficient implementations of several functions and structures in the IVML object model.
Index: /reasoner/reasoner.bib
===================================================================
--- /reasoner/reasoner.bib	(revision 168)
+++ /reasoner/reasoner.bib	(revision 169)
@@ -154,4 +154,25 @@
 keywords = "Consistency, Ontology, UML, Reasoning"
 }
+
+@inproceedings {EichelbergerSassSchmid16,
+author={Holger Eichelberger and Aike Sass and Klaus Schmid},
+title={From Reproducibility Problems to Improvements: A Journey},
+booktitle={Proceedings of the 7th Symposium on Software Performance},
+number={4},
+journal={Softwaretechnik-Trends},
+year={2016},
+pages={43--45},
+abstract={Reproducibility and repeatability are key properties of benchmarks. However, achieving reproducibility can be difficult. We faced this while applying the micro-benchmark MooBench to the resource monitoring framework SPASS-meter. In this paper, we discuss some interesting problems that occurred while trying to reproduce previous benchmarking results. In the process of reproduction, we extended MooBench and made improvements to the performance of SPASS-meter. We conclude with lessons learned for reproducing (micro-)benchmarks.}
+} 
+
+@book{Shirazi02,
+ author = {Shirazi, Jack},
+ title = {Java Performance Tuning},
+ year = {2002},
+ isbn = {0596003773},
+ edition = {2nd},
+ publisher = {O'Reilly \& Associates, Inc.},
+ address = {Sebastopol, CA, USA},
+} 
 
 @article{XiangZhouZheng+2018,
