Rescaling

Apply the following configuration to enable adaptive rescaling with checkpointing.

Adaptive mode

For Amazon Elastic Kubernetes Service (EKS), edit $HERE_ANONYMIZER_DIST/deployments/kubernetes/aws/conf-files-configmap.yml to create or update flink-conf.yaml.

    export AWS_CLUSTER_NAME=hereanoneks-latest
    export S3_CHECKPOINTS_BUCKET_NAME="${AWS_CLUSTER_NAME}-checkpoints"
    export AWS_SECRET_ACCESS_KEY={AWS_SECRET_ACCESS_KEY}
    export AWS_ACCESS_KEY_ID={AWS_ACCESS_KEY_ID}
    export CHECKPOINTS_DIR="s3://${S3_CHECKPOINTS_BUCKET_NAME}/checkpoints/"
    export SAVEPOINTS_DIR="s3://${S3_CHECKPOINTS_BUCKET_NAME}/savepoints/"

apiVersion: v1
kind: ConfigMap
metadata:
  name: here-anonymizer-configs
  labels:
    app: here-anonymizer
    component: config-files
data:
  flink-conf.yaml: |
    jobmanager.scheduler: adaptive
    execution.checkpointing.mode: EXACTLY_ONCE
    state.checkpoint-storage: filesystem
    state.checkpoints.dir: ${CHECKPOINTS_DIR}
    state.savepoints.dir: ${SAVEPOINTS_DIR}
    state.checkpoints.num-retained: 5
    execution.checkpointing.interval: 60s
    execution.checkpointing.timeout: 30s
    execution.checkpointing.max-concurrent: 1
    taskmanager.numberOfTaskSlots: 1
    parallelism.default: 4
    s3.access-key: ${AWS_ACCESS_KEY_ID}
    s3.secret-key: ${AWS_SECRET_ACCESS_KEY}
  # ...

For Azure Kubernetes Service (AKS), edit $HERE_ANONYMIZER_DIST/deployments/kubernetes/azure/conf-files-configmap.yml to create or update flink-conf.yaml.

    export AZURE_STORAGE_ACCOUNT={AZURE_STORAGE_ACCOUNT}
    export AZURE_CLIENT_ID={AZURE_CLIENT_ID}
    export AZURE_CLIENT_PASSWORD={AZURE_CLIENT_PASSWORD}
    export AZURE_STORAGE_KEY={AZURE_STORAGE_KEY}
    export CHECKPOINTS_CONTAINER={CHECKPOINTS_CONTAINER}
    export CHECKPOINTS_DIR="wasbs://${CHECKPOINTS_CONTAINER}@${AZURE_STORAGE_ACCOUNT}.blob.core.windows.net/checkpoints/"
    export SAVEPOINTS_DIR="wasbs://${CHECKPOINTS_CONTAINER}@${AZURE_STORAGE_ACCOUNT}.blob.core.windows.net/savepoints/"

apiVersion: v1
kind: ConfigMap
metadata:
  name: here-anonymizer-configs
  labels:
    app: here-anonymizer
    component: config-files
data:
  flink-conf.yaml: |
    jobmanager.scheduler: adaptive
    state.checkpoint-storage: filesystem
    state.checkpoints.dir: ${CHECKPOINTS_DIR}
    state.savepoints.dir: ${SAVEPOINTS_DIR}
    state.checkpoints.num-retained: 5
    execution.checkpointing.mode: EXACTLY_ONCE
    execution.checkpointing.interval: 60s
    execution.checkpointing.timeout: 30s
    execution.checkpointing.max-concurrent: 1
    taskmanager.numberOfTaskSlots: 1
    parallelism.default: 4
    fs.azure.account.key.${AZURE_STORAGE_ACCOUNT}.blob.core.windows.net: ${AZURE_STORAGE_KEY}
  # ...
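
Both configurations depend on environment variables such as CHECKPOINTS_DIR, SAVEPOINTS_DIR, and, on Azure, CHECKPOINTS_CONTAINER. The following is a minimal sketch of a pre-flight check, assuming the variables must be set in the current shell before the ConfigMap is rendered or applied. The helper name require_vars is hypothetical and is not part of the distribution.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns non-zero if at least one variable is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing environment variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

For the AWS configuration, a call might look like `require_vars CHECKPOINTS_DIR SAVEPOINTS_DIR AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY`; for Azure, `require_vars CHECKPOINTS_CONTAINER CHECKPOINTS_DIR SAVEPOINTS_DIR AZURE_STORAGE_ACCOUNT AZURE_STORAGE_KEY`.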

Restart the Flink cluster so that the new configuration takes effect. Once the cluster is running, scale the TaskManager deployment up or down as needed.

Upscaling

Scaling the TaskManager deployment up to three replicas, with one task slot per TaskManager, increases the job parallelism to three. Use the following command:

kubectl scale deployment taskmanager --replicas=3
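
After scaling, it can be useful to confirm that the new TaskManager pods are ready before expecting the parallelism to change. The sketch below assumes the deployment is named taskmanager, as in the command above, and lives in the current namespace. The function name wait_for_replicas and the default timeout of 120s are illustrative choices.

```shell
# Hypothetical helper: wait for a rollout to finish, then check that the
# deployment reports the desired number of ready replicas.
wait_for_replicas() {
  deployment=$1
  desired=$2
  timeout=${3:-120s}
  # Block until the deployment has rolled out or the timeout expires.
  kubectl rollout status "deployment/${deployment}" --timeout="${timeout}" || return 1
  # Read the ready replica count from the deployment status.
  ready=$(kubectl get deployment "${deployment}" -o jsonpath='{.status.readyReplicas}')
  [ "${ready:-0}" -eq "${desired}" ]
}
```

For example, `wait_for_replicas taskmanager 3` returns successfully once three TaskManager replicas are ready.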

Downscaling

Scaling the TaskManager deployment down to two replicas, with one task slot per TaskManager, decreases the job parallelism to two. Use the following command:

kubectl scale deployment taskmanager --replicas=2
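
The relationship between replicas, task slots, and parallelism in the two examples above can be sketched as a small calculation. It assumes that the adaptive scheduler uses every available slot and treats parallelism.default as an upper bound; the function name expected_parallelism is hypothetical.

```shell
# Hypothetical helper: estimate job parallelism from the number of
# TaskManager replicas, slots per TaskManager, and the configured maximum.
expected_parallelism() {
  replicas=$1
  slots_per_taskmanager=$2
  max_parallelism=$3
  total_slots=$((replicas * slots_per_taskmanager))
  # The job cannot use more slots than exist, nor exceed the configured bound.
  if [ "${total_slots}" -lt "${max_parallelism}" ]; then
    echo "${total_slots}"
  else
    echo "${max_parallelism}"
  fi
}
```

With the configuration shown earlier (taskmanager.numberOfTaskSlots: 1, parallelism.default: 4), `expected_parallelism 3 1 4` prints 3 and `expected_parallelism 2 1 4` prints 2. Under the same assumption, scaling beyond four replicas would not raise the parallelism further.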