Manage Secrets in Azure Databricks Using Azure Key Vault

Sampat Budankayala
Posted on September 13, 2022

Introduction

Databricks' Unified Data Analytics Platform helps organisations accelerate innovation by unifying data science with engineering and business, letting you quickly prepare and clean data at massive scale.

Sometimes accessing data requires that you authenticate to external data sources. Instead of directly entering your credentials into a notebook, use Azure Databricks secrets to store your credentials and reference them in notebooks and jobs.

Databricks has introduced Secret Management, which allows users to leverage and share credentials within Databricks in a secured manner.

Securing your confidential digital assets has always been a challenge on the cloud. Thanks to Azure Key Vault, protecting your API keys, passwords, access tokens, and digital certificates is now a breeze. Using its key management solution, Key Vault encrypts your data effectively. In this article let us see how we can manage our Azure Databricks secrets using Key Vault.
 

Overview

Azure Databricks supports two types of secret scopes:

  • Azure Key Vault Backed

  • Databricks Backed

Over the course of this blog we will focus on the Azure Key Vault-backed secret scope.

To reference secrets stored in an Azure Key Vault, you can create a secret scope backed by Azure Key Vault. You can then leverage all of the secrets in the corresponding Key Vault instance from that secret scope.
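If you prefer automation over the UI walkthrough below, the same scope can also be created through the Databricks Secrets API. Below is a minimal sketch in Python; the workspace URL, Azure AD token, and Key Vault identifiers are placeholders you must supply. Note that creating a Key Vault-backed scope through the API requires an Azure AD token rather than a Databricks personal access token.

import requests

# Placeholders -- substitute your workspace URL and an Azure AD access token.
DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"
AAD_TOKEN = "<azure-ad-access-token>"

# Create an Azure Key Vault-backed secret scope named "myKey".
resp = requests.post(
  f"{DATABRICKS_HOST}/api/2.0/secrets/scopes/create",
  headers={"Authorization": f"Bearer {AAD_TOKEN}"},
  json={
    "scope": "myKey",
    "scope_backend_type": "AZURE_KEYVAULT",
    "backend_azure_keyvault": {
      "resource_id": "<key-vault-resource-id>",
      "dns_name": "https://<vault-name>.vault.azure.net/",
    },
  },
)
resp.raise_for_status()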

Note:
Because the Azure Key Vault-backed secret scope is a read-only interface to the Key Vault, the PutSecret and DeleteSecret Secrets API 2.0 operations are not allowed. To manage secrets in Azure Key Vault, you must use the Azure SetSecret REST API or Azure portal UI.
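For example, a secret can be written into the vault itself with the Azure SDK for Python. A minimal sketch, where the vault name and secret value are placeholders:

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Placeholder vault URL -- replace <vault-name> with your Key Vault's name.
client = SecretClient(
  vault_url="https://<vault-name>.vault.azure.net/",
  credential=DefaultAzureCredential(),
)

# Store (or update) the SAS token under the name "storageKey".
client.set_secret("storageKey", "<your-sas-token>")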
 

Our Learning Objectives

  • Create a Secret Scope connected to Azure Key Vault

  • Mount Blob Storage to DBFS using a SAS token

  • Write data to Blob using a SAS token in Spark Configuration
     

Access the Azure Databricks Secrets UI

Open a new web browser tab and navigate to https://<your_azure_databricks_url>#secrets/createScope

The digits following the ?o= represent the unique workspace identifier.

Append the text #secrets/createScope next to the identifier.

This will take you to the Azure Databricks Secret Scope UI shown below.


 

To make the secret scope operational, the Azure Key Vault must be linked to Azure Databricks.

As a prerequisite, we need to create a Key Vault in the Azure Portal.

Copy the Vault URL and Resource ID from the Azure Portal.

Navigate to your Key Vault tab:

  1. Go to Properties

  2. Copy the Vault URL and Resource ID


 

Adding Config Values to Databricks Secret Scope UI

Add the configuration values that you copied from the Azure Key Vault to the Databricks Secret Scope UI.

In the Databricks Secrets Scope UI:

  1. Enter the Scope Name (we are using myKey)

  2. Paste the Vault URL into the DNS Name field.

  3. Paste the Resource ID into the Resource ID field.

  4. Click Create

After a while, a dialog appears to verify that the secret scope has been created. Click OK.
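Optionally, you can verify from a notebook that the scope now exists. A quick check, assuming the scope name myKey used above:

# List the secret scopes visible in this workspace; "myKey" should appear.
display(dbutils.secrets.listScopes())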
 

Accessing the Key

Once the secret scope is created we can access the key as shown below.

Using your scope name, the key name, and the dbutils.secrets.get method, you can access your secrets.
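For example, assuming a secret named storageKey exists in the linked Key Vault:

# Fetch the secret from the Key Vault-backed scope created above
sas_token = dbutils.secrets.get(scope="myKey", key="storageKey")

print(sas_token)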

Secrets are not displayed in clear text: observe that the printed value appears as [REDACTED]. This prevents your secrets from being exposed in notebook output.
 

Mount Azure Blob Container - Read/List Using Stored Secrets

Any user within the workspace can view and read the files mounted using the key, and the same key can be used to mount any other container in the storage account with the privileges granted by the token.

If the directory was previously mounted, unmount it first. Run the script below in a Databricks notebook to unmount (if needed) and mount the container.

MOUNTPOINT = "/mnt/commonfiles"

# Unmount first if the directory is already mounted
if MOUNTPOINT in [mnt.mountPoint for mnt in dbutils.fs.mounts()]:
  dbutils.fs.unmount(MOUNTPOINT)

# Set the Storage Account and Container, and reference the secret to pass the SAS token
STORAGE_ACCOUNT = "<your_storage_account>"  # storage account name (plain text, not a secret)
CONTAINER = "commonfiles"
SAS_TOKEN = dbutils.secrets.get(scope="myKey", key="storageKey")

# Do not change these values
SOURCE = "wasbs://{container}@{storage_acct}.blob.core.windows.net/".format(container=CONTAINER, storage_acct=STORAGE_ACCOUNT)
URI = "fs.azure.sas.{container}.{storage_acct}.blob.core.windows.net".format(container=CONTAINER, storage_acct=STORAGE_ACCOUNT)

try:
  dbutils.fs.mount(
    source=SOURCE,
    mount_point=MOUNTPOINT,
    extra_configs={URI: SAS_TOKEN})  # pass the SAS token via the fs.azure.sas config key
  print("Success.")
except Exception as e:
  if "Directory already mounted" in str(e):
    pass  # Ignore error if already mounted.
  else:
    raise e

dbutils.fs.ls(MOUNTPOINT)
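To cover the last learning objective, writing data to Blob Storage using a SAS token in the Spark configuration, you can skip the mount entirely and write directly to the wasbs:// path. A minimal sketch, reusing STORAGE_ACCOUNT, CONTAINER, and the secret from above; the output path sample_output is a placeholder:

# Set the SAS token for this container in the Spark configuration
spark.conf.set(
  "fs.azure.sas.{container}.{storage_acct}.blob.core.windows.net".format(
    container=CONTAINER, storage_acct=STORAGE_ACCOUNT),
  dbutils.secrets.get(scope="myKey", key="storageKey"))

# Write a small sample DataFrame directly to the container
df = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "name"])
df.write.mode("overwrite").parquet(
  "wasbs://{container}@{storage_acct}.blob.core.windows.net/sample_output".format(
    container=CONTAINER, storage_acct=STORAGE_ACCOUNT))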


You should now be able to use the following tools in your workspace:

  • Databricks Secrets

  • Azure Key Vault

  • dbutils.fs.mount
     

References:

https://docs.microsoft.com/en-us/azure/databricks/security/secrets/secret-scopes#create-an-azure-key-vault-backed-secret-scope-using-the-ui