AWS S3
The AWS S3 connector is designed for use with the Amazon Simple Storage Service (S3).
It supports two operational modes:
- processing of batch data stored in S3
- runtime configuration using the anonymization management queue
Batch data processing
This is a minimal example of an S3 connector URI for processing batch data:
s3+batch+files://bucket/path?region=eu-west-1
When batch data processing is configured to use the example URI as a source, HERE Anonymizer Self-Hosted reads the files in the given bucket that match the defined filter.
When batch data processing is configured to use the example URI as a sink, HERE Anonymizer Self-Hosted writes the anonymized data as files to that location.
For explicit authentication and optional parameters, see the Authentication and Configuration sections below.
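To make the structure of the example URI concrete, the following minimal sketch splits it into scheme, bucket, path, and query parameters using Python's standard library. The helper name `parse_connector_uri` is hypothetical and is not part of HERE Anonymizer Self-Hosted; it only illustrates how the pieces of the URI relate to each other.

```python
from urllib.parse import urlsplit, parse_qs

def parse_connector_uri(uri):
    """Split an S3 connector URI into its visible parts (illustrative helper only)."""
    parts = urlsplit(uri)
    # Keep the first value of each query parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    if "region" not in params:
        raise ValueError("the region parameter is mandatory")
    return {
        "scheme": parts.scheme,
        # Drop optional "user:password@" credentials in front of the bucket name.
        "bucket": parts.netloc.rpartition("@")[2],
        "path": parts.path.lstrip("/"),
        "params": params,
    }

parsed = parse_connector_uri("s3+batch+files://bucket/path?region=eu-west-1")
print(parsed)
```

The same helper works for the `s3+management` scheme, because only the scheme differs between the two modes.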
Runtime configuration
This is a minimal example of an S3 connector URI for runtime configuration:
s3+management://bucket/path?region=eu-west-1
When the anonymization management queue is configured to use the example URI, HERE Anonymizer Self-Hosted reads the following predefined S3 objects in one-minute intervals:
s3://bucket/path/anonymization.conf
s3://bucket/path/HERE_ANONYMIZER_LICENSE
For explicit authentication and optional parameters, see the Authentication and Configuration sections below.
Logging
HERE Anonymizer Self-Hosted logs one of the following messages, depending on the state of the objects:
- Both objects recognized, detected changes to objects
  INFO c.h.a.f.c.s3.S3FilesSourceConnector - Configured s3 files source of s3://bucket/path/ reading every PT1M all the files: [HERE_ANONYMIZER_LICENSE, anonymization.conf]
  ...
  INFO c.h.a.f.c.s3.S3FilesSourceFunction - File bucket/path/HERE_ANONYMIZER_LICENSE has been changed, downloading new version
  INFO c.h.a.f.c.s3.S3FilesSourceFunction - File bucket/path/anonymization.conf has been changed, downloading new version
- Objects detected, no changes to objects (checked by S3 Object's eTag) - the system doesn't log any messages.
- One or both objects unavailable
  WARN c.h.a.f.c.s3.S3FilesSourceFunction - Unable to read s3://bucket/path/anonymization.conf: null (Service: S3, Status Code: 404, Request ID: *******, Extended Request ID: ******) (Service: S3, Status Code: 404, Request ID: *******)
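The eTag-based change check described above can be pictured with a small sketch. This is not the product's implementation; the function `detect_changes` and the eTag values are hypothetical, and the sketch only shows the general idea of comparing the eTag seen in the current poll with the one remembered from the previous poll.

```python
def detect_changes(current_etags, last_seen):
    """Return the keys whose eTag differs from the previous poll.

    current_etags: mapping of object key -> eTag observed in this poll
    last_seen:     mapping of object key -> eTag remembered so far (updated in place)
    """
    changed = []
    for key, etag in current_etags.items():
        if last_seen.get(key) != etag:
            changed.append(key)
            last_seen[key] = etag
    return changed

last_seen = {}
poll = {"anonymization.conf": "etag-1", "HERE_ANONYMIZER_LICENSE": "etag-a"}

first = detect_changes(poll, last_seen)    # both objects are new
second = detect_changes(poll, last_seen)   # nothing changed, nothing to log
third = detect_changes(
    {"anonymization.conf": "etag-2", "HERE_ANONYMIZER_LICENSE": "etag-a"},
    last_seen,
)                                          # only the configuration changed
print(first, second, third)
```

In this picture, a non-empty result corresponds to the "has been changed, downloading new version" messages, and an empty result corresponds to the silent case.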
Authentication
This connector supports two authentication methods:
- Default authentication
  This method is used when credentials aren't provided in the connector URI, for example:
  s3+management://bucket/path?region=...
  In that case, the default AWS SDK credentials provider (reading environment variables, system properties, ~/.aws/credentials) is used.
- Explicit (static) credentials provider
  This method is used when the credentials are provided explicitly in the connector URI, for example:
  s3+batch+files://aws-access-id:aws-secret-key@bucket/path?region=...
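The choice between the two methods depends only on whether credentials appear in the URI. As a rough sketch, the hypothetical helper `credentials_mode` below inspects the user-info part of a URI and reports which method would apply. The region value `eu-west-1` is filled in only to make the sample URIs complete. How special characters in a secret key should be escaped is not stated here, so that case is not handled.

```python
from urllib.parse import urlsplit

def credentials_mode(uri):
    """Report which authentication method a connector URI implies (illustrative only)."""
    parts = urlsplit(uri)
    # Explicit credentials take the form "access-id:secret-key@" before the bucket.
    if parts.username and parts.password:
        return "explicit"
    return "default"

explicit = credentials_mode(
    "s3+batch+files://aws-access-id:aws-secret-key@bucket/path?region=eu-west-1"
)
default = credentials_mode("s3+management://bucket/path?region=eu-west-1")
print(explicit, default)
```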
Configuration
Use the following parameters to configure the connector.
The region parameter
Mandatory parameter. Defines the AWS region.
The poll_interval parameter
Optional parameter. Defines the time interval for checking for changes in the configured files. The value must be in the Java Duration format.
This parameter is applicable for runtime configuration mode only.
Default value: PT1M (1 minute)
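The Java Duration format is the ISO-8601 style that `java.time.Duration` parses, such as PT1M or PT30S. The sketch below shows one way to produce such a value from a Python `timedelta` and append it to a management URI. The helper `to_iso_duration` is hypothetical and handles only non-negative whole seconds.

```python
from datetime import timedelta

def to_iso_duration(delta):
    """Format a timedelta as an ISO-8601 duration such as PT1M30S (whole seconds only)."""
    total = delta.total_seconds()
    if total < 0 or total != int(total):
        raise ValueError("only non-negative whole-second durations are supported")
    hours, rest = divmod(int(total), 3600)
    minutes, seconds = divmod(rest, 60)
    out = "PT"
    if hours:
        out += f"{hours}H"
    if minutes:
        out += f"{minutes}M"
    if seconds or out == "PT":
        out += f"{seconds}S"
    return out

interval = to_iso_duration(timedelta(seconds=30))
uri = f"s3+management://bucket/path?region=eu-west-1&poll_interval={interval}"
print(uri)
```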
The endpoint parameter
Optional parameter. Enables the use of an alternatively hosted S3 API.
The files parameter
Optional parameter. Comma-separated list of files to read. Enables the use of alternative filenames for the managed objects (configuration and license).
This parameter is applicable for runtime configuration mode only.
Default value: anonymization.conf,HERE_ANONYMIZER_LICENSE
With the parameter configured as follows:
s3+management://bucket/path/to/folder?files=new-config-name.conf,CUSTOM_LICENSE_NAME
the connector observes these S3 objects:
s3://bucket/path/to/folder/new-config-name.conf
s3://bucket/path/to/folder/CUSTOM_LICENSE_NAME
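The mapping from a management URI to the observed objects can be sketched as follows. The helper `observed_objects` is hypothetical and not part of the product; it simply joins the bucket and path with each name from the files parameter, and falls back to the documented default when the parameter is absent.

```python
from urllib.parse import urlsplit, parse_qs

# Documented default of the files parameter.
DEFAULT_FILES = "anonymization.conf,HERE_ANONYMIZER_LICENSE"

def observed_objects(uri):
    """List the S3 object URIs a management URI refers to (illustrative only)."""
    parts = urlsplit(uri)
    files = parse_qs(parts.query).get("files", [DEFAULT_FILES])[0]
    bucket = parts.netloc.rpartition("@")[2]  # ignore optional credentials
    prefix = parts.path.strip("/")
    base = f"s3://{bucket}/{prefix}" if prefix else f"s3://{bucket}"
    return [f"{base}/{name}" for name in files.split(",")]

custom = observed_objects(
    "s3+management://bucket/path/to/folder?files=new-config-name.conf,CUSTOM_LICENSE_NAME"
)
defaults = observed_objects("s3+management://bucket/path?region=eu-west-1")
print(custom)
print(defaults)
```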
The path_resolver_recursive parameter
Optional parameter. Defines whether the S3 keys are traversed recursively in a directory-like way, using the / character as a delimiter.
This parameter is applicable for batch data processing mode only.
Default value: true
The path_resolver_filter parameter
Optional parameter. Defines the PathMatcher pattern that is applied as a filter to the S3 object list, relative to the bucket and path given in the URI.
This parameter is applicable for batch data processing mode only.
Default value: glob:*
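As an illustration of how the batch-only parameters combine, the sketch below assembles a batch URI from its parts. The helper `batch_uri` and the pattern `glob:*.json` are hypothetical. Writing the boolean as lowercase `true`/`false` and leaving `:` and `*` unescaped are assumptions based on the default values shown in this section; whether the connector expects percent-encoding is not stated.

```python
from urllib.parse import urlencode

def batch_uri(bucket, path, region, recursive=True, path_filter="glob:*"):
    """Assemble a batch-mode connector URI (illustrative only)."""
    params = {
        "region": region,
        # Assumption: booleans are written as lowercase true/false.
        "path_resolver_recursive": str(recursive).lower(),
        "path_resolver_filter": path_filter,
    }
    # Assumption: keep ":", "*", "," and "/" literal, as in the documented defaults.
    query = urlencode(params, safe=":*,/")
    return f"s3+batch+files://{bucket}/{path}?{query}"

uri = batch_uri("bucket", "path", "eu-west-1", recursive=False, path_filter="glob:*.json")
print(uri)
```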