Azure VM
Prerequisites
- Install tools:
- Create an Azure Storage Account
Deploy HERE Anonymizer Self-Hosted playbook
You can complete the deployment scenario by running the supplied scripts from the path where you unpacked HERE Anonymizer Self-Hosted.
Run the commands and export environment variables indicated in the code block.
# Azure Storage Account name (just the name, not the FQDN)
export AZURE_STORAGE_ACCOUNT={YOUR_STORAGE_ACCOUNT_NAME}
export HERE_ANONYMIZER_DIST={PATH_TO_UNPACKED_ANONYMIZER}
export HERE_ANONYMIZER_LICENSE={YOUR_LICENSE_FILE_CONTENT}
# You can use login from a service principal, see an example in
# ./deployments/kubernetes/azure/azure-login.sh
az login
# Main deployment scenario including uploading here-anonymizer.jar to Azure storage,
# creating RabbitMQ VM, JobManager VM and the ScaleSet of two instances for
# Task Managers. The script produces two files:
# - ./azure-deploy-vm.env: contains env variables for connecting to the JobManager and RabbitMQ VMs
# - ./azure-deploy-vm.jobmanager.log: contains logs of JobManager
./deployments/vm/azure/azure-deploy-vm.sh
# To deploy Flink in session mode for batch processing, run:
# ./deployments/vm/azure/azure-deploy-vm.sh batch
# Export SSH_RABBITMQ_CMD and SSH_JOBMANAGER_CMD from ./azure-deploy-vm.env
set -a ; . ./azure-deploy-vm.env ; set +a
# The anonymizer is now deployed and ready to accept data from the RabbitMQ queue
# Push test data:
$SSH_RABBITMQ_CMD 'rabbitmqadmin publish exchange=amq.default routing_key="input-queue"' \
< ./deployments/common/here-probe-example.json
# Last command output must be: Message published
# Get the job ID and print metrics statistics
JOB_ID=$($SSH_JOBMANAGER_CMD curl -s "http://localhost:8081/jobs" | jq -r '.jobs[0].id')
$SSH_JOBMANAGER_CMD curl -s "http://localhost:8081/jobs/${JOB_ID}/accumulators" | jq
# For running a batch sample, set the environment variables
# APP_CLIENT_AZURE_STORAGE_ACCOUNT and APP_CLIENT_AZURE_STORAGE_KEY and run:
# ./deployments/vm/azure/batch-anonymize-sample-data.sh
# Shutdown the cluster and all its resources
./deployments/vm/azure/azure-shutdown-vm.sh
Additional variables
List of optional variables for configuring azure-deploy-vm.sh and
azure-shutdown-vm.sh with their default values:
export APP_NAME="hereanonvm"
export APP_VERSION="latest"
export AZURE_RESOURCE_GROUP="hereanonvm-latest"
export AZURE_LOCATION="germanywestcentral"
export AZURE_VM_IMAGE="Ubuntu2204"
export AZURE_VM_SIZE="Standard_B2s"
export TAGS="no_user_tags=true"
Deploy manually
Login and create resource group
- Log in to Azure using the CLI.
  # User must have the Contributor role
  az login
- Create a resource group.
  az group create --name "$AZURE_RESOURCE_GROUP" --location "$AZURE_LOCATION"
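The manual steps rely on several environment variables. The sketch below applies the default values listed under "Additional variables" when a variable is unset, then reports variables that the later commands reference but for which this guide gives no default. The helper name require_vars is hypothetical and not part of the distribution, and the list of required variables is an assumption based on the commands in this guide.

```shell
# Apply the documented defaults only when a variable is not already set.
: "${APP_NAME:=hereanonvm}"
: "${APP_VERSION:=latest}"
: "${AZURE_RESOURCE_GROUP:=hereanonvm-latest}"
: "${AZURE_LOCATION:=germanywestcentral}"
: "${AZURE_VM_IMAGE:=Ubuntu2204}"
: "${AZURE_VM_SIZE:=Standard_B2s}"
: "${TAGS:=no_user_tags=true}"
# envsubst only sees exported variables.
export APP_NAME APP_VERSION AZURE_RESOURCE_GROUP AZURE_LOCATION \
    AZURE_VM_IMAGE AZURE_VM_SIZE TAGS

# require_vars NAME...: hypothetical helper (assumption, not shipped).
# Prints each unset or empty variable and returns non-zero if any is missing.
# Pass only trusted variable names, because the lookup uses eval.
require_vars() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Variables used by the manual steps that have no documented default.
require_vars AZURE_STORAGE_ACCOUNT FLINK_DIST_DOWNLOAD_URL TASKMANAGER_INSTANCE_COUNT \
    || echo "Export the variables listed above before continuing." >&2
```

The check only prints a warning rather than exiting, so it can be pasted into an interactive shell without closing the session.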
Upload HERE Anonymizer Self-Hosted files
- Create an Azure Storage container.
  CONTAINER="${APP_NAME}-${APP_VERSION}"
  az storage container create -n "${CONTAINER}"
- Upload distribution files and anonymization config.
  az storage blob upload -f "./here-anonymizer.jar" -c "${CONTAINER}"
  az storage blob upload -f "./azure-blob-connector.jar" -c "${CONTAINER}"
  az storage blob upload -f "./rabbit-connector.jar" -c "${CONTAINER}"
  az storage blob upload -f "./simple-anonymization.conf" -c "${CONTAINER}" -n "anonymization.conf"
  TMP_DIR=$(mktemp -d -t "${APP_NAME}-${APP_VERSION}-XXXXXX")
  curl -L "${FLINK_DIST_DOWNLOAD_URL}" -o "${TMP_DIR}/flink.tgz"
  az storage blob upload -f "${TMP_DIR}/flink.tgz" -c "${CONTAINER}"
  az storage blob upload -f "./deployments/vm/azure/flink-install.sh" -c "${CONTAINER}"
  az storage blob upload -f "./deployments/vm/azure/rabbit-install-and-run.sh" -c "${CONTAINER}"
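Because the container name is built from APP_NAME and APP_VERSION, a custom version such as 1.2.0 would produce a name that Azure Storage rejects. Container names must be 3 to 63 characters long, use only lowercase letters, digits, and hyphens, start and end with a letter or digit, and contain no consecutive hyphens. The following sketch checks a candidate name locally before any container is created. The function name is hypothetical and not part of the distribution.

```shell
# is_valid_container_name NAME: hypothetical helper (assumption, not shipped).
# Returns 0 when NAME satisfies the Azure Storage container naming rules.
is_valid_container_name() {
    name=$1
    # Length must be between 3 and 63 characters.
    if [ "${#name}" -lt 3 ] || [ "${#name}" -gt 63 ]; then
        return 1
    fi
    case $name in
        # Only lowercase letters, digits, and hyphens are allowed.
        *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
        # The name must start and end with a letter or digit.
        -*|*-) return 1 ;;
        # Consecutive hyphens are not allowed.
        *--*) return 1 ;;
    esac
    return 0
}

# Example with the default values from this guide.
CANDIDATE="${APP_NAME:-hereanonvm}-${APP_VERSION:-latest}"
if is_valid_container_name "$CANDIDATE"; then
    echo "Container name is valid: $CANDIDATE"
else
    echo "Invalid container name: $CANDIDATE" >&2
fi
```

If the check fails, one option is to pick an APP_VERSION value without dots or uppercase letters before creating the container.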
Deploy RabbitMQ virtual machine
Note
The RabbitMQ deployment is used for demo purposes only. For production deployments, configure SOURCE_URI and SINK_URI in all of the cloud-config-***.template.yml files to point to production data streams.
- Create a virtual machine.
  VM_RABBIT="${APP_NAME}-${APP_VERSION}-rabbit"
  az vm create \
    --name "${VM_RABBIT}" \
    --image "${AZURE_VM_IMAGE}" \
    --size "${AZURE_VM_SIZE}" \
    --vnet-name "${AZURE_RESOURCE_GROUP}-vnet" \
    --subnet "${AZURE_RESOURCE_GROUP}-subnet" \
    --admin-username "azureuser" \
    --generate-ssh-keys \
    --tags ${TAGS} \
    --public-ip-sku Standard
  IP_RABBIT=$(az vm show --name "${VM_RABBIT}" --show-details -o tsv --query publicIps)
- Apply the CustomScript Azure VM extension.
  envsubst < "./deployments/vm/azure/rabbit-custom-script.template.json" > "${TMP_DIR}/rabbit-custom-script.json"
  az vm extension set --vm-name "${VM_RABBIT}" -n CustomScript --publisher Microsoft.Azure.Extensions \
    --protected-settings "${TMP_DIR}/rabbit-custom-script.json"
Note
Optionally, you can open the RabbitMQ UI port 15672 and access the management console at http://${IP_RABBIT}:15672/. Note that this link uses the default and insecure guest:guest credentials.
az vm open-port --port 15672 --resource-group $AZURE_RESOURCE_GROUP --name ${VM_RABBIT}
- Prepare the SOURCE_URI and SINK_URI variables for configuring HERE Anonymizer Self-Hosted.
  # TMP_VM_NAME is not set by the preceding steps; it presumably holds the RabbitMQ VM name (${VM_RABBIT}).
  export SOURCE_URI=rabbit://guest:guest@${TMP_VM_NAME}:5672/input-queue
  export SINK_URI=rabbit://guest:guest@${TMP_VM_NAME}:5672/output-queue
Deploy JobManager virtual machine
- Create a virtual machine.
  VM_FLINK_JM="${APP_NAME}-${APP_VERSION}-jobmanager"
  az vm create \
    --name "${VM_FLINK_JM}" \
    --image "${AZURE_VM_IMAGE}" \
    --size "${AZURE_VM_SIZE}" \
    --vnet-name ${AZURE_RESOURCE_GROUP}-vnet \
    --subnet ${AZURE_RESOURCE_GROUP}-subnet \
    --admin-username "azureuser" \
    --generate-ssh-keys \
    --tags ${TAGS} \
    --public-ip-sku Standard
  IP_FLINK_JM=$(az vm show --name "${VM_FLINK_JM}" --show-details -o tsv --query publicIps)
- Configure and upload environment configuration.
  JM_RPC_HOST=$VM_FLINK_JM PARALLELISM_DEFAULT=$TASKMANAGER_INSTANCE_COUNT \
  SOURCE_URI=$SOURCE_URI SINK_URI=$SINK_URI \
  envsubst \
    < "./deployments/vm/azure/flink-config.template.env" \
    > "${TMP_DIR}/config.env"
  az storage blob upload -f "${TMP_DIR}/config.env" -c "${CONTAINER}" --overwrite
- Apply the CustomScript Azure VM extension to install Flink.
  JM_CMD=jobmanager-stream
  # For running Flink in session mode for batch processing
  #JM_CMD=jobmanager
  FLINK_CMD=$JM_CMD \
  envsubst \
    < "./deployments/vm/azure/flink-custom-script.template.json" \
    > "${TMP_DIR}/flink-jm-custom-script.json"
  az vm extension set --no-wait --vm-name "${VM_FLINK_JM}" -n CustomScript --publisher Microsoft.Azure.Extensions \
    --protected-settings "${TMP_DIR}/flink-jm-custom-script.json"
- Start the jobmanager process.
  # Start bootstrap service for keeping the license and anonymization config during streaming
  ssh azureuser@${IP_FLINK_JM} "nohup java -cp '/opt/flink/usrlib/*:/opt/flink/lib/*' com.here.anonymization.opa.flink.BootstrapServer > /opt/flink/bootstrap-server.log 2>&1 < /dev/null &"
  # Load initial license and anonymization config into bootstrap service
  ssh azureuser@${IP_FLINK_JM} "java -cp '/opt/flink/usrlib/*:/opt/flink/lib/*' com.here.anonymization.opa.flink.Bootstrap"
  # Start main anonymization stream
  ssh azureuser@${IP_FLINK_JM} "/opt/flink/bin/standalone-job.sh start --job-classname com.here.anonymization.opa.flink.MainStream"
Note
Optionally, you can open the Flink UI port 8081 and access the UI at http://${IP_FLINK_JM}:8081/. Note that this link is not secured.
az vm open-port --port 8081 --resource-group $AZURE_RESOURCE_GROUP --name ${VM_FLINK_JM}
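The CustomScript extension for the JobManager is applied with --no-wait, so the Flink installation may still be running when the first ssh command is issued. One way to handle this is a small retry helper that repeats a command until it succeeds. The following is a minimal sketch. The retry function is hypothetical and not part of the distribution, and the attempt count and delay in the usage comment are arbitrary assumptions.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND...: hypothetical helper (assumption).
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# Uses plain shell variables, so avoid reusing the names n, attempts, delay.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "Command failed after ${n} attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Possible usage before starting the jobmanager process (not executed here):
# wait until the Flink start script exists on the JobManager VM.
# retry 30 10 ssh azureuser@${IP_FLINK_JM} test -x /opt/flink/bin/standalone-job.sh

# Local demonstration that needs no Azure resources.
retry 3 0 true && echo "retry helper is working"
```

The same helper could wrap the taskmanager start commands if the scale set instances are still being provisioned.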
Deploy Flink Task Managers Scale Set
- Create a virtual machine scale set.
  VM_FLINK_TM="${APP_NAME}-${APP_VERSION}-taskmanager"
  az vmss create \
    --orchestration-mode Uniform \
    --upgrade-policy-mode Automatic \
    --name "$VM_FLINK_TM" \
    --image "$AZURE_VM_IMAGE" \
    --vnet-name ${AZURE_RESOURCE_GROUP}-vnet \
    --subnet ${AZURE_RESOURCE_GROUP}-subnet \
    --vm-sku "$AZURE_VM_SIZE" \
    --instance-count $TASKMANAGER_INSTANCE_COUNT \
    --admin-username "azureuser" \
    --generate-ssh-keys \
    --public-ip-address-allocation static --public-ip-per-vm \
    --lb-sku Standard \
    --tags ${TAGS}
- Apply the CustomScript Azure VM extension to install Flink.
  FLINK_CMD="taskmanager" \
  envsubst \
    < "./deployments/vm/azure/flink-custom-script.template.json" \
    > "${TMP_DIR}/flink-tm-custom-script.json"
  # The extension upgrades all VMs in the scale set only with --orchestration-mode Uniform and --upgrade-policy-mode Automatic
  az vmss extension set --vmss-name "${VM_FLINK_TM}" -n CustomScript --publisher Microsoft.Azure.Extensions \
    --protected-settings "${TMP_DIR}/flink-tm-custom-script.json"
- Start the taskmanager process.
  for ip in $(az vmss list-instance-public-ips --name $VM_FLINK_TM --query "[*].ipAddress" -o tsv); do
    echo "Starting flink on taskmanager $ip"
    ssh azureuser@${ip} "/opt/flink/bin/taskmanager.sh start"
  done
Check the deployed HERE Anonymizer Self-Hosted
To smoke test HERE Anonymizer Self-Hosted, publish an example probe file and check whether the anonymization metrics changed.
- Publish the example probe file.
  ssh azureuser@${IP_RABBIT} 'rabbitmqadmin publish exchange=amq.default routing_key="input-queue"' \
    < ./deployments/common/here-probe-example.json
- Check if anonymization metrics are not empty.
  JOB_ID=$(ssh azureuser@${IP_FLINK_JM} curl -s "http://localhost:8081/jobs" | jq -r '.jobs[0].id')
  ssh azureuser@${IP_FLINK_JM} curl -s "http://localhost:8081/jobs/${JOB_ID}/accumulators" | jq
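When no job is running yet, jq -r '.jobs[0].id' prints the literal string null, and the accumulators request then fails in a confusing way. The sketch below validates the value before it is used. It assumes that Flink job IDs are rendered as 32 lowercase hexadecimal characters. The function name is hypothetical and not part of the distribution.

```shell
# is_flink_job_id VALUE: hypothetical helper (assumption, not shipped).
# Returns 0 when VALUE looks like a Flink job ID, assumed here to be
# exactly 32 lowercase hexadecimal characters.
is_flink_job_id() {
    id=$1
    if [ "${#id}" -ne 32 ]; then
        return 1
    fi
    case $id in
        *[!0123456789abcdef]*) return 1 ;;
    esac
    return 0
}

# JOB_ID would normally come from the /jobs request in this section.
# The fallback value "null" mimics the jq output when no job exists.
JOB_ID=${JOB_ID:-null}
if is_flink_job_id "$JOB_ID"; then
    echo "Found job: $JOB_ID"
else
    echo "No running job found yet (got: '$JOB_ID')" >&2
fi
```

If the check fails, the jobmanager log on the VM is a reasonable first place to look before repeating the request.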
Cleanup
- Remove the Azure Resource Group to delete all associated resources.
  az group delete --name "$AZURE_RESOURCE_GROUP" -y
- Delete the container created for the uploaded application files.
  az storage container delete \
    --name "${APP_NAME}-${APP_VERSION}" \
    --account-name "$AZURE_STORAGE_ACCOUNT"