Data Processing Library modules
The Data Processing Library contains the following modules:
batch-core: provides the core functionality, including the Driver, DriverContext, and DriverTask abstractions and all compilation patterns.
batch-catalog: provides the Catalog abstraction to access catalogs via Spark. This module contains the abstract interfaces only.
batch-catalog-dataservice: provides the implementation of the batch-catalog abstractions for the HERE Data API.
pipeline-runner: provides the PipelineRunner class that constitutes the entry point of a batch pipeline. This module depends on batch-core.
batch-core-java: Java bindings for the batch-core module.
batch-catalog-java: Java bindings for the batch-catalog module.
pipeline-runner-java: Java bindings for the pipeline-runner module.
batch-validation: provides a set of classes and DeltaSet transformations to implement data validation pipelines.
batch-validation-scalatest: provides scalatest bindings to implement data validation suites using the scalatest Domain Specific Language.
To use the Data Processing Library in your Scala applications, it is
sufficient to include pipeline-runner and batch-catalog-dataservice
as dependencies.
For Java applications, you also need to include pipeline-runner-java
as a dependency.
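As a rough illustration of the dependency setup described above, an sbt build definition might look like the following sketch. The group ID, the version placeholder, and the use of %% (which assumes the artifacts are published with a Scala binary version suffix) are assumptions that do not come from this page; check them against the coordinates published for your SDK version.

```scala
// build.sbt (sketch only; coordinates below are assumptions, not taken from this page)

// Assumed group ID; verify against the artifacts available in your repository.
val dplGroup = "com.here.platform.data.processing"

// Placeholder; substitute the library version you actually use.
val dplVersion = "x.y.z"

// Scala applications: the two modules named above are sufficient.
libraryDependencies ++= Seq(
  dplGroup %% "pipeline-runner" % dplVersion,
  dplGroup %% "batch-catalog-dataservice" % dplVersion
)

// Java applications: additionally include the Java bindings.
// libraryDependencies += dplGroup %% "pipeline-runner-java" % dplVersion
```

Because pipeline-runner depends on batch-core, the build tool should resolve batch-core transitively, so it does not need to be listed explicitly.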
For more information on how to manage dependencies, see Dependency Management.