Guides

AWS EKS

Prerequisites

  1. Install tools:
  2. Create an AWS user with the following policies attached:
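
The commands in this guide call aws, docker, eksctl, kubectl, envsubst, and jq. As a minimal sketch, a pre-flight check like the following could confirm that they are on the PATH before you start. The function name missing_tools and the variable GUIDE_TOOLS are hypothetical and are not part of the Preprocessor distribution.

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on the PATH, one name per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command-line tools called by the commands in this guide.
GUIDE_TOOLS="aws docker eksctl kubectl envsubst jq"
```

Running missing_tools $GUIDE_TOOLS prints nothing when every tool is installed.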

Deploy HERE Anonymizer Preprocessor playbook

You can complete the deployment scenario by running the supplied scripts from the path where you unpacked the Preprocessor.

Export the environment variables and run the commands shown in the following code block.

# The full hostname of AWS ECR registry, for example "{ACCOUNT}.dkr.ecr.eu-west-1.amazonaws.com"
export AWS_ECR_HOST="{ACCOUNT}.dkr.ecr.eu-west-1.amazonaws.com"

# AWS region, for example "eu-west-1"
export AWS_REGION="{AWS_REGION}"

# Configure aws-cli credentials in ~/.aws/credentials or export these variables
export AWS_ACCESS_KEY_ID={USER_ACCESS_KEY_ID}
export AWS_SECRET_ACCESS_KEY={USER_SECRET_ACCESS_KEY}

# Path where the HERE Anonymizer Preprocessor is unzipped
export PREPROC_DIST_DIR={PATH_TO_UNPACKED_ANONYMIZER}

# Build and push container images, create EKS cluster, deploy
# HERE Anonymizer Preprocessor images and Apache Ignite server for demo purposes
./deployments/kubernetes/aws/aws-deploy-eks.sh

# At this point, the HERE Anonymizer Preprocessor is deployed
# and is processing the data in the source bucket.

# Shut down the cluster and all its resources,
# delete HERE Anonymizer Preprocessor container image
./deployments/kubernetes/aws/aws-shutdown-eks.sh
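
The deploy script depends on the variables exported above, so a typo in an export only surfaces later as a failing AWS or Docker call. The following sketch shows one way to catch that early. The function name require_vars is hypothetical; the supplied scripts may perform their own validation.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 0 when all variables have a value, 1 otherwise.
require_vars() {
  rc=0
  for name in "$@"; do
    # POSIX-compatible indirect lookup of the variable named by $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf 'missing: %s\n' "$name" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

For example, require_vars AWS_ECR_HOST AWS_REGION PREPROC_DIST_DIR could run before the deploy script. The access key variables are left out of that call because credentials can also come from ~/.aws/credentials.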

Additional variables

List of optional variables for configuring aws-deploy-eks.sh and aws-shutdown.sh with their default values:

APP_NAME=here-anon-preproc-eks
APP_VERSION=latest
AWS_CLUSTER_NAME=here-anon-preproc-eks-latest
AWS_REGION=eu-west-1
BILLING_TAG=test
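
As an illustration of how such defaults behave, the sketch below assigns each value only when the variable is unset or empty, so anything you export beforehand takes precedence. The function name apply_defaults is hypothetical, and deriving the cluster name from APP_NAME and APP_VERSION is an assumption based on the default values listed above; the supplied scripts may work differently.

```shell
# Hypothetical sketch: apply the documented defaults without overwriting
# values that are already set in the environment.
apply_defaults() {
  : "${APP_NAME:=here-anon-preproc-eks}"
  : "${APP_VERSION:=latest}"
  # Assumption: the default cluster name is "<APP_NAME>-<APP_VERSION>".
  : "${AWS_CLUSTER_NAME:=${APP_NAME}-${APP_VERSION}}"
  : "${AWS_REGION:=eu-west-1}"
  : "${BILLING_TAG:=test}"
}
```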

Deploy manually

Push container images to ECR

  1. Log in to ECR.

    aws ecr get-login-password --region "$AWS_REGION" | docker login --username AWS --password-stdin $AWS_ECR_HOST
  2. Build and push the HERE Anonymizer Preprocessor container image. The commands check whether the ECR repository exists and create it if it doesn't.

    APP_IMAGE=${AWS_ECR_HOST}/${APP_NAME}-flink:${APP_VERSION}
    docker build --tag "$APP_IMAGE" "$PREPROC_DIST_DIR"
    aws ecr describe-repositories --repository-names ${APP_NAME}-flink || \
      aws ecr create-repository --repository-name ${APP_NAME}-flink
    docker push "$APP_IMAGE"
  3. Push the Apache Ignite image.

    IGNITE_IMAGE=$AWS_ECR_HOST/ignite
    docker pull apacheignite/ignite:3.0.0
    docker tag apacheignite/ignite:3.0.0 "$IGNITE_IMAGE"
    aws ecr describe-repositories --repository-names ignite || \
      aws ecr create-repository --repository-name ignite
    docker push "$IGNITE_IMAGE"
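
Steps 2 and 3 repeat the same describe-or-create pattern for the ECR repository. A small wrapper, sketched below, could factor it out. The function name ensure_ecr_repo is hypothetical; it uses only the two aws ecr commands already shown and relies on the region configured for aws-cli.

```shell
# Hypothetical helper: create the ECR repository only if it does not exist.
# describe-repositories fails for an unknown repository, which triggers
# the create-repository call.
ensure_ecr_repo() {
  repo_name=$1
  aws ecr describe-repositories --repository-names "$repo_name" >/dev/null 2>&1 || \
    aws ecr create-repository --repository-name "$repo_name" >/dev/null
}
```

With this in place, the two checks become ensure_ecr_repo "${APP_NAME}-flink" and ensure_ecr_repo ignite.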

Create Kubernetes cluster

  1. Use eksctl to create a cluster and all required resources. The creation process takes about 10-15 minutes.

    eksctl create cluster \
        --version 1.27 \
        -n $AWS_CLUSTER_NAME \
        --region $AWS_REGION \
        --tags "app_name=${APP_NAME},app_version=${APP_VERSION}"
  2. Check Kubernetes connection and cluster readiness.

    kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout=300s
    kubectl get nodes,pods,deployments -A

Deploy HERE Anonymizer Preprocessor

  1. Prepare the input data in S3. The commands expect AWS_S3_BUCKET to contain the name of your S3 bucket:

    export SOURCE_PATH=$AWS_S3_BUCKET/$APP_NAME/$APP_VERSION/input
    export SINK_PATH=$AWS_S3_BUCKET/$APP_NAME/$APP_VERSION/output
    aws s3 cp $PREPROC_DIST_DIR/deployments/common/here-probe-example-data/ s3://"$SOURCE_PATH" --recursive
  2. Edit configuration files and environment variables. See Configuration for details.

    export USERNAME=$AWS_ACCESS_KEY_ID
    export PASSWORD=$AWS_SECRET_ACCESS_KEY
    envsubst '${USERNAME} ${PASSWORD} ${SOURCE_PATH} ${SINK_PATH}' <"${PREPROC_DIST_DIR}/deployments/kubernetes/env-configmap.yml" | kubectl apply -f -
    kubectl create -f "${PREPROC_DIST_DIR}/deployments/kubernetes/conf-files-configmap.yml"
  3. Configure container registry credentials as a Kubernetes secret. Note that the command exports all local Docker credentials.

    kubectl create secret generic regcred \
      --from-file=.dockerconfigjson="$HOME/.docker/config.json" \
      --type=kubernetes.io/dockerconfigjson
  4. Because a private container registry is used, substitute the container image reference into each template before deploying it.

    # Deploy ignite service
    kubectl create -f "${PREPROC_DIST_DIR}/deployments/kubernetes/ignite/ignite-service-account-role.yml"
    kubectl create -f "${PREPROC_DIST_DIR}/deployments/kubernetes/ignite/ignite-service.yml"
    IMAGE=$IGNITE_IMAGE envsubst <"${PREPROC_DIST_DIR}/deployments/kubernetes/ignite/ignite-deployment-template.yml" | kubectl apply -f -
    
    # Wait for ignite to be running
    kubectl wait --for=condition=available --timeout=300s deployment/ignite-service
    
    # Deploy flink cluster for Indexer
    IMAGE=$APP_IMAGE envsubst <"${PREPROC_DIST_DIR}/deployments/kubernetes/indexer/jobmanager-batch-job.template.yml" | kubectl apply -f -
    kubectl create -f "${PREPROC_DIST_DIR}/deployments/kubernetes/indexer/jobmanager-service.yml"
    IMAGE=$APP_IMAGE envsubst <"${PREPROC_DIST_DIR}/deployments/kubernetes/indexer/taskmanager-deployment.template.yml" | kubectl apply -f -
    
    # Wait for the indexer to finish
    kubectl wait --for=condition=complete job/indexer-jobmanager --timeout=300s
    
    # Deploy flink cluster for Preprocessor
    IMAGE=$APP_IMAGE envsubst <"${PREPROC_DIST_DIR}/deployments/kubernetes/preprocessor/jobmanager-batch-job.template.yml" | kubectl apply -f -
    kubectl create -f "${PREPROC_DIST_DIR}/deployments/kubernetes/preprocessor/jobmanager-service.yml"
    IMAGE=$APP_IMAGE envsubst <"${PREPROC_DIST_DIR}/deployments/kubernetes/preprocessor/taskmanager-deployment.template.yml" | kubectl apply -f -
  5. Review all deployed resources:

    kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout=300s
    kubectl get nodes,pods,deployments,jobs,configmaps,secrets,services
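
When envsubst runs without a variable list, it replaces references to unset variables with empty strings, so an empty IMAGE produces a manifest with a blank image field. The sketch below wraps the substitute-and-apply pattern from step 4 with two guards. The function name deploy_template is hypothetical and is not part of the distribution.

```shell
# Hypothetical helper: render a manifest template with IMAGE set and apply it.
# Refuses to run when the image reference is empty or the template is unreadable.
deploy_template() {
  image=$1
  template=$2
  if [ -z "$image" ]; then
    echo "deploy_template: image reference is empty" >&2
    return 1
  fi
  if [ ! -r "$template" ]; then
    echo "deploy_template: cannot read $template" >&2
    return 1
  fi
  # Same substitution as in step 4: IMAGE is set only for the envsubst call.
  IMAGE=$image envsubst <"$template" | kubectl apply -f -
}
```

For example, deploy_template "$IGNITE_IMAGE" "${PREPROC_DIST_DIR}/deployments/kubernetes/ignite/ignite-deployment-template.yml" corresponds to the Ignite deployment command above.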

Run smoke test

Run a smoke test on the deployed cluster to check the preprocessed data.

  1. Run this command to check the indexer report:

    aws s3 cp "s3://${SOURCE_PATH}/INDEXER_REPORT.json" - | jq "."
  2. In a new terminal, run this command to access the Flink UI:

    kubectl port-forward jobs/preprocessor-jobmanager 8081:8081
    Note: The Flink UI is accessible only while the preprocessor job is running.

  3. Next, go to http://localhost:8081, open Running Jobs -> HERE Anonymizer preprocessor, and then open the Accumulators tab of the Source: Preprocess ... task.

  4. Check that essential metrics such as HERE_decoding_point_info_all and HERE_output_point_info_all have values greater than zero. See all the anonymization metrics explained here.

  5. After the preprocessor job completes, run this command to access the preprocessor report:

     aws s3 cp "s3://${SINK_PATH}/PREPROCESSOR_REPORT.json" - | jq "."
  6. Additionally, you can browse the preprocessed data at the location referred to by ${SINK_PATH}.
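
Step 5 requires the preprocessor job to have completed. Rather than polling by hand, you could block on the job with kubectl wait, in the same way the deployment steps wait for the indexer. The function name wait_for_job and the 300s default timeout are assumptions; the job name preprocessor-jobmanager is taken from the port-forward command in step 2.

```shell
# Hypothetical helper: block until the named Kubernetes job reports the
# "complete" condition, or fail when the timeout expires.
wait_for_job() {
  job_name=$1
  timeout=${2:-300s}   # Assumed default; long-running jobs need a larger value.
  kubectl wait --for=condition=complete "job/${job_name}" --timeout="$timeout"
}
```

For example, wait_for_job preprocessor-jobmanager 1800s returns once the job completes, after which the report command in step 5 can run.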

Cleanup

  1. Use eksctl to clean up all AWS resources created for the Kubernetes cluster, including all EC2 resources. Run the following command:

    eksctl delete cluster --force -n $AWS_CLUSTER_NAME --region $AWS_REGION

    It's recommended to check that the CloudFormation stack eksctl-$APP_NAME-$APP_VERSION-cluster (by default eksctl-here-anon-preproc-eks-latest-cluster) was deleted successfully. If eksctl did not delete it, you can delete the stack and its inherited resources manually.

  2. Remove the HERE Anonymizer Preprocessor container image:

    aws ecr batch-delete-image \
      --repository-name ${APP_NAME}-flink \
      --image-ids imageTag=${APP_VERSION}
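
To confirm that the cluster's CloudFormation stack is gone after step 1, a check like the following sketch could be used. It builds the stack name from the cluster name following the eksctl-<cluster>-cluster pattern mentioned above and then calls the aws cloudformation wait stack-delete-complete waiter. Both function names are hypothetical.

```shell
# Build the CloudFormation stack name that eksctl uses for a cluster.
eks_stack_name() {
  printf 'eksctl-%s-cluster\n' "$1"
}

# Hypothetical helper: block until the cluster stack has been deleted.
# Returns non-zero if the deletion fails or the waiter times out.
wait_for_stack_deletion() {
  cluster_name=$1
  region=$2
  aws cloudformation wait stack-delete-complete \
    --stack-name "$(eks_stack_name "$cluster_name")" \
    --region "$region"
}
```

For example, wait_for_stack_deletion "$AWS_CLUSTER_NAME" "$AWS_REGION" returns 0 once the stack no longer exists.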

Example of required IAM policies

ECR_CreateRepository

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Action": [
        "ecr:DescribeRegistry",
        "ecr:DescribeRepositories",
        "ecr:CreateRepository"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

ECR_Push

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:CompleteLayerUpload",
        "ecr:GetAuthorizationToken",
        "ecr:UploadLayerPart",
        "ecr:InitiateLayerUpload",
        "ecr:BatchCheckLayerAvailability",
        "ecr:PutImage"
      ],
      "Resource": "*"
    }
  ]
}

ECR_DeleteImageOrRepository

Optional policy. It is required only for deleting the most recently deployed version of the HERE Anonymizer container image, as done by the example script ./deployments/kubernetes/aws/aws-shutdown.sh.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ecr:BatchDeleteImage",
        "ecr:DeleteRepository"
      ],
      "Resource": "*"
    }
  ]
}
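
If you save each policy document above to a JSON file, the policies can be created and attached from the command line. The sketch below is one possible approach using aws iam create-policy and aws iam attach-user-policy. The function name create_and_attach_policy, the user name, and the file names are hypothetical, and the caller needs IAM permissions to manage policies.

```shell
# Hypothetical helper: create a customer managed policy from a JSON file
# and attach it to an existing IAM user.
create_and_attach_policy() {
  user_name=$1
  policy_name=$2
  policy_file=$3
  # create-policy returns the new policy's ARN, needed for the attach call.
  policy_arn=$(aws iam create-policy \
    --policy-name "$policy_name" \
    --policy-document "file://${policy_file}" \
    --query 'Policy.Arn' --output text) || return 1
  aws iam attach-user-policy --user-name "$user_name" --policy-arn "$policy_arn"
}
```

For example, create_and_attach_policy my-deploy-user ECR_Push ./ecr-push.json attaches the ECR_Push policy, where my-deploy-user and ./ecr-push.json are placeholder names.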