
Data Processing Library compilation patterns

A key feature of the Data Processing Library is its compilation patterns. These patterns guide you in implementing incremental, distributed compilers. Each task executes one compiler, which can consist either of your code alone or of your code combined with one of the provided patterns.

There are two types of patterns:

  • Functional patterns: provide narrowly defined interfaces that structure how you develop the compiler. Spark is hidden inside the pattern implementation, so the compiler contains only business logic while the processing library takes care of the distributed processing and incremental compilation details.
  • Spark RDD-based patterns: expose Spark RDDs, allowing the compiler implementation to perform parallel operations on data and metadata using Spark, such as join, cogroup, filter, or map. These interfaces are less rigid, and you may need to actively support incremental compilation yourself.
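
To make the distinction concrete, the sketch below contrasts the two styles using deliberately simplified, hypothetical interfaces; the real library's classes and signatures differ. In the functional style, the compiler is a pure per-tile function and the framework (not modeled here) owns distribution; in the RDD-based style, the compiler itself transforms the dataset, with a Java stream standing in for a Spark RDD.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class PatternSketch {
    // Functional style (hypothetical interface): the compiler is a pure
    // per-tile function; the library would handle distribution and
    // incremental recompilation around it.
    public interface TileCompiler {
        String compile(long tileId, String payload);
    }

    // Business logic only: no Spark, no partitioning concerns.
    public static final TileCompiler upperCase =
        (tileId, payload) -> payload.toUpperCase();

    // RDD-based style (hypothetical shape): the compiler sees the whole
    // dataset and applies parallel-style operations itself.
    public static Map<Long, String> rddStyleCompile(Map<Long, String> tiles) {
        return tiles.entrySet().stream()
            .filter(e -> !e.getValue().isEmpty())         // analogous to RDD.filter
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().toUpperCase()));        // analogous to RDD.map
    }

    public static void main(String[] args) {
        System.out.println(upperCase.compile(1L, "road-segment")); // ROAD-SEGMENT
        Map<Long, String> tiles = new HashMap<>();
        tiles.put(1L, "link");
        tiles.put(2L, "");
        System.out.println(rddStyleCompile(tiles));
    }
}
```

Note how the functional compiler never mentions the dataset as a whole, which is what lets the library decide which tiles to recompute, while the RDD-style compiler takes explicit responsibility for the dataset-wide transformation.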

Table 1: Compilation patterns overview

| Compiler Class         | Incremental Processing | References to Other Tiles | Global Algorithms | Functional or RDD-Based | Complexity |
|------------------------|------------------------|---------------------------|-------------------|-------------------------|------------|
| DirectCompiler         | Yes                    | No                        | No                | Functional              | Simple     |
| MapGroupCompiler       | Yes                    | No                        | No                | Functional              | Simple     |
| RefTreeCompiler        | Yes                    | Yes                       | No                | Functional              | Medium     |
| NonIncrementalCompiler | No                     | Yes                       | Yes               | RDD                     | Simple     |
| DepCompiler            | Partially              | No                        | Yes               | RDD                     | Medium     |
| IncrementalDepCompiler | Yes                    | No                        | Yes               | RDD                     | Complex    |

Note: Where possible, prefer functional patterns over Spark RDD-based patterns.
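
To make the "Incremental Processing" column concrete, the sketch below (a hypothetical illustration, not the library's actual API) shows why pure per-tile compilers lend themselves to incrementality: a framework can reuse the previous output for every tile whose input is unchanged and recompile only the rest.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class IncrementalSketch {
    // Hypothetical illustration: with a pure per-tile compile function,
    // the framework can skip tiles whose input did not change since the
    // previous run and reuse the cached output instead.
    public static Map<Long, String> incrementalCompile(
            Map<Long, String> currentInputs,
            Map<Long, String> previousInputs,
            Map<Long, String> previousOutputs,
            Function<String, String> compile) {
        Map<Long, String> outputs = new HashMap<>();
        for (Map.Entry<Long, String> e : currentInputs.entrySet()) {
            if (e.getValue().equals(previousInputs.get(e.getKey()))) {
                // Input unchanged: reuse the cached output.
                outputs.put(e.getKey(), previousOutputs.get(e.getKey()));
            } else {
                // Input new or changed: recompile this tile only.
                outputs.put(e.getKey(), compile.apply(e.getValue()));
            }
        }
        return outputs;
    }
}
```

This shortcut is only valid because the compile function depends on nothing but the tile's own input. Patterns such as RefTreeCompiler extend the idea by also tracking references to other tiles, while in RDD-based patterns you may need to implement this kind of change tracking yourself.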