Data Processing Library logging
The Data Processing Library uses context-logging to deliver logs enriched with log contexts, a set of key/value pairs that provide additional information about the computational context at the time when a log message is generated.
context-logging is based on SLF4J.
Note
context-logging has no dependencies on the Data Processing Library and can be used as a standalone component in your projects.
Log context and context aware loggers
The log context is a dynamically bound, thread-local set of key/value pairs that are automatically appended to each log message delivered through a ContextAwareLogger.
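To make the idea of a dynamically scoped, thread-local context concrete, here is a minimal, library-independent sketch. The class and method names (ScopedContext, withChild, log) are hypothetical illustrations, not the context-logging implementation:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: a dynamically scoped, thread-local key/value context.
public class ScopedContext {
    // Each thread keeps its own stack of context frames.
    private static final ThreadLocal<Deque<Map<String, String>>> frames =
        ThreadLocal.withInitial(ArrayDeque::new);

    // Bind key=value for the duration of the block, then restore the previous scope.
    public static void withChild(String key, String value, Runnable block) {
        Map<String, String> child = new LinkedHashMap<>(current());
        child.put(key, value);
        frames.get().push(child);
        try {
            block.run();
        } finally {
            frames.get().pop();
        }
    }

    // The currently bound key/value pairs; empty at the outermost scope.
    public static Map<String, String> current() {
        Deque<Map<String, String>> stack = frames.get();
        return stack.isEmpty() ? Map.of() : stack.peek();
    }

    // A context-aware logger would append current() to every message.
    public static void log(String message) {
        Map<String, String> ctx = current();
        System.out.println(ctx.isEmpty() ? message : message + " // " + ctx);
    }

    public static void main(String[] args) {
        log("plain message");
        withChild("key", "value", () -> {
            log("scoped message");           // -> scoped message // {key=value}
            withChild("another-key", "another-value",
                () -> log("nested message")); // both bindings are visible here
        });
        log("plain again");                   // outer scope restored, no context
    }
}
```

Because the bindings live in a thread-local stack, leaving a `withChild` block automatically restores the previous context, which is the same scoping behavior the real `LogContext.withChild` provides.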
The following snippet is an example of a log message augmented with the log context:
2020/10/13 22:33:54.393 INFO Main$: Final assessment: Assessment(true,0.0) // Task=main deltaset=mapValues13 catalog=output layer=assessment partition=assessment spark.stageId=59 spark.partitionId=16 spark.attemptNumber=0 spark.taskAttemptId=400
The log context can be arbitrarily augmented via LogContext.withChild,
specifying an execution block where the new binding is in place until the control flow leaves its
scope, as shown in the following code snippet:
import com.here.platform.pipeline.logging.{ContextLogging, LogContext}
object LogContextExample extends ContextLogging {
// extending `ContextLogging` is roughly equivalent to:
// protected val logger = new ContextAwareLogger(getClass)
def someLoggingMethod(): Unit = logger.info("some info message")
def main(arg: Array[String]): Unit = {
// => some info message
someLoggingMethod()
LogContext.withChild("key", "value") {
// => some info message // key=value
someLoggingMethod()
LogContext.withChild("another-key", "another-value") {
// => some info message // key=value another-key=another-value
someLoggingMethod()
}
// => some info message // key=value
someLoggingMethod()
}
}
}
import com.here.platform.pipeline.logging.LogContext;
import com.here.platform.pipeline.logging.java.ContextAwareLogger;
public class LogContextExample {
static ContextAwareLogger logger = new ContextAwareLogger(LogContextExample.class);
public static void someLoggingMethod() {
logger.info("some info message");
}
public static void main(String[] args) {
// => some info message
someLoggingMethod();
LogContext.withChild("key", "value", () -> {
// => some info message // key=value
someLoggingMethod();
LogContext.withChild("another-key", "another-value", () -> {
// => some info message // key=value another-key=another-value
someLoggingMethod();
});
// => some info message // key=value
someLoggingMethod();
});
}
}
Log context in the Data Processing Library
The Data Processing Library populates the log context around most user-defined functions passed to the
framework, with information about the current driver task, the compilation phase, and the partition
that is being processed. You should therefore prefer a ContextAwareLogger over other
logging solutions, so that this useful source of debugging information is not lost, even if you do not plan
to augment the log context yourself.
Log context in functional compilers
In each functional compiler method (mappingFn, resolveFn, compileInFn, compileOutFn), the
log context includes at least the name of the method, the partition key that is being processed,
and information about the current Spark stage as well as the task within the stage.
Log context in DeltaSet transformations
In every DeltaSet transformation, the log context includes the id of the DeltaSet and
information about the current Spark stage as well as the task within the stage. Mapping
transformations also include the key that is being processed.
Sampling
Logging is a valuable source of information, but sometimes you are not interested in every message produced by a single logger invocation. Too many instances of the same log message, albeit with different data, can clutter the logs and increase costs while providing little additional information.
In these cases you can use a sampling logger, a special context-aware logger that outputs only a subset of all logger invocations, based on a sampling strategy:
import com.here.platform.pipeline.logging.{ContextLogging, SamplingLogger}
object SamplingLoggingExample extends ContextLogging {
// extending `ContextLogging` is roughly equivalent to:
// protected val logger = new ContextAwareLogger(getClass)
val samplingLogger: SamplingLogger = logger.oneEvery(3)
def main(arg: Array[String]): Unit = {
// => Without sampling: 1
// => Every 3: 1
// => Once: 1
// => Without sampling: 2
// => Without sampling: 3
// => Without sampling: 4
// => [3 times] Every 3: 4
// => Without sampling: 5
// => Exiting...
// => [flush] Every 3: 5
// => [flush] [4 times] Once: 2
(1 to 5).foreach { count =>
logger.info(s"Without sampling: $count")
samplingLogger.info(s"Every 3: $count")
// a sampling strategy can be specified for single logger calls
logger.limit(1).warn(s"Once: $count")
}
logger.info("Exiting...")
}
}
import com.here.platform.pipeline.logging.java.ContextAwareLogger;
import com.here.platform.pipeline.logging.java.SamplingLogger;
public class SamplingLoggingExample {
static ContextAwareLogger logger = new ContextAwareLogger(SamplingLoggingExample.class);
static SamplingLogger samplingLogger = logger.oneEvery(3);
public static void main(String[] args) {
// => Without sampling: 1
// => Every 3: 1
// => Once: 1
// => Without sampling: 2
// => Without sampling: 3
// => Without sampling: 4
// => [3 times] Every 3: 4
// => Without sampling: 5
// => Exiting...
// => [flush] Every 3: 5
// => [flush] [4 times] Once: 2
for (int i = 1; i <= 5; ++i) {
logger.info("Without sampling: {}", i);
samplingLogger.info("Every 3: {}", i);
// a sampling strategy can be specified for single logger calls
logger.limit(1).warn("Once: {}", i);
}
logger.info("Exiting...");
}
}
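For intuition, the one-every-N behavior can be sketched independently of the library. The following self-contained class is a hypothetical illustration, not the SamplingLogger implementation; in particular, the real SamplingLogger also reports how many messages were suppressed (the `[3 times]` prefix) and flushes pending messages on exit, which this sketch omits:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of one-every-N sampling; not the context-logging SamplingLogger.
public class OneEverySampler {
    private final long n;
    private final AtomicLong count = new AtomicLong();

    public OneEverySampler(long n) {
        this.n = n;
    }

    // Emit the first invocation, then every n-th one after that.
    public boolean shouldLog() {
        return count.getAndIncrement() % n == 0;
    }

    public void info(String message) {
        if (shouldLog()) {
            System.out.println(message);
        }
    }

    public static void main(String[] args) {
        OneEverySampler sampler = new OneEverySampler(3);
        for (int i = 1; i <= 7; i++) {
            sampler.info("Every 3: " + i); // prints for i = 1, 4, 7
        }
    }
}
```

Keeping the counter in the sampler rather than in the logger call site is what makes `oneEvery(3)` stateful across invocations, while a per-call strategy such as `limit(1)` in the examples above tracks its own independent count.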