Connect to cloud object storage using Unity Catalog

Learn how to connect to cloud object storage using Unity Catalog. This guide covers creating storage credentials, setting up an external location, and managing access control.
Published: June 3, 2024

How to Connect to Cloud Object Storage Using Unity Catalog?

Connecting to cloud object storage using Unity Catalog involves three steps. First, create a storage credential that holds the cloud identity used to reach the storage. Then, create an external location that ties a cloud storage path to that credential. Finally, specify a managed storage location in Unity Catalog. Unity Catalog uses two securable objects to access and work with external cloud storage: storage credentials and external locations (databricks_storage_credential and databricks_external_location in the Databricks Terraform provider).

  • Databricks Storage Credential: This represents the authentication and authorization mechanism used to access cloud storage. It comes first because every external location depends on a credential for secure access.
  • Databricks External Location: This combines a cloud storage path with a storage credential that authorizes access to that path, so the path and the credential are managed together as one object.
  • Unity Catalog Access-Control Policies: These control which users and groups can use storage credentials and external locations. To grant permissions to users or groups, click the name of an external location, select Permissions, and then click Grant.
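
The relationship between these objects can be sketched as a small data model. This is a conceptual illustration only: the class names, the covers method, and the find_location helper are hypothetical and are not part of any Databricks API. The prefix-matching rule is an assumption about how a path relates to an external location.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StorageCredential:
    """Hypothetical stand-in for a storage credential: a named cloud identity."""
    name: str
    identity: str  # placeholder for an IAM role ARN, managed identity ID, etc.


@dataclass(frozen=True)
class ExternalLocation:
    """Hypothetical stand-in for an external location: a path plus a credential."""
    name: str
    url: str
    credential: StorageCredential

    def covers(self, path: str) -> bool:
        # A path is covered when it equals the location URL or sits beneath it.
        base = self.url.rstrip("/")
        return path == base or path.startswith(base + "/")


def find_location(path: str, locations: List[ExternalLocation]) -> Optional[ExternalLocation]:
    """Return the first location whose URL covers the path, or None."""
    for location in locations:
        if location.covers(path):
            return location
    return None


cred = StorageCredential(name="example_cred", identity="placeholder-identity")
landing = ExternalLocation(
    name="landing_zone",
    url="s3://example-bucket/landing",
    credential=cred,
)

match = find_location("s3://example-bucket/landing/2024/events.json", [landing])
print(match.name if match else "no external location covers this path")
```

The point of the model is that a path never carries a credential on its own: access is always resolved through the external location that covers it.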

Why Use Unity Catalog for Cloud Storage Connection?

Databricks recommends using Unity Catalog to manage connections to storage. Unity Catalog offers a unified governance layer for data and AI within the Databricks Data Intelligence Platform. It simplifies the management of connections and provides a secure way to access cloud storage.

  • Unified Governance Layer: Unity Catalog provides a unified platform for managing data and AI. This makes it easier to oversee and control all aspects of data management and AI implementation.
  • Secure Access: With Unity Catalog, you can control who has access to your cloud storage. This ensures that only authorized users can access your data.
  • Easy Management: Unity Catalog simplifies the process of connecting to cloud storage. It streamlines the process by providing a single platform for managing connections.

What are the Benefits of Using Unity Catalog for Cloud Storage Connection?

In practice, these benefits follow from how Unity Catalog models storage access. Cloud identities are registered once as storage credentials, storage paths are registered as external locations, and access to both is granted to users and groups through Unity Catalog permissions.

  • Unified Governance: Data and AI assets are governed on a single platform, so access to cloud storage is managed alongside access to the rest of your data.
  • Secure Access: Users reach cloud storage only through the external locations they have been granted, which protects your data from unauthorized access.
  • Simplified Management: A connection is defined once and then reused, which gives a streamlined process for setting up and maintaining connections.

How to Create Storage Credentials for Connecting to Cloud Storage?

To connect to cloud object storage using Unity Catalog, first create a storage credential. The credential represents the authentication method required to access the cloud storage, and it is what makes access to that storage secure.


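// Illustrative pseudocode: CloudStorageCredentials is a hypothetical class,
// not part of any Databricks SDK
CloudStorageCredentials credentials = new CloudStorageCredentials();
credentials.setStorageAccountName("your-storage-account-name");
credentials.setStorageAccountKey("your-storage-account-key");

The snippet above is a conceptual sketch of a storage credential rather than a working Databricks API call. Replace "your-storage-account-name" and "your-storage-account-key" with your actual values. In an actual workspace, a Unity Catalog storage credential typically wraps a cloud identity, such as an AWS IAM role, an Azure managed identity or service principal, or a Google Cloud service account, rather than a raw account key. It is created through Catalog Explorer, the Databricks CLI or REST API, or Terraform.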

  • CloudStorageCredentials: The (hypothetical) class that represents the storage credential.
  • setStorageAccountName: Sets the name of your storage account.
  • setStorageAccountKey: Sets the secret used to authenticate to your storage account.
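
For a sketch that is closer to what Databricks expects, the example below builds the JSON body for a storage credential backed by an AWS IAM role. It only constructs and prints the request; nothing is sent. The endpoint path and the field names (name, aws_iam_role, role_arn, comment) reflect the Unity Catalog REST API as best understood here and should be checked against the API reference for your workspace. The function name, credential name, and role ARN are placeholders.

```python
import json

# Assumed endpoint path; confirm it in the REST API reference before use.
STORAGE_CREDENTIALS_ENDPOINT = "/api/2.1/unity-catalog/storage-credentials"


def aws_storage_credential_payload(name: str, role_arn: str, comment: str = "") -> dict:
    """Build a request body for a storage credential backed by an AWS IAM role."""
    if ":role/" not in role_arn:
        raise ValueError("role_arn does not look like an IAM role ARN")
    payload = {"name": name, "aws_iam_role": {"role_arn": role_arn}}
    if comment:
        payload["comment"] = comment
    return payload


payload = aws_storage_credential_payload(
    name="example_cred",
    role_arn="arn:aws:iam::123456789012:role/example-uc-access",  # placeholder ARN
    comment="Access to the example bucket",
)

# Print the request instead of sending it, so the sketch runs without a workspace.
print("POST", STORAGE_CREDENTIALS_ENDPOINT)
print(json.dumps(payload, indent=2))
```

Note that the credential carries a role reference, not a secret key: the cloud provider's trust policy decides whether Databricks may assume that role.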

How to Create an External Location to Connect Cloud Storage to Databricks?

The next step is to create an external location, which connects your cloud storage to Databricks. An external location combines a cloud storage path with the storage credential that authorizes access to that path.


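// Illustrative pseudocode: ExternalLocation is a hypothetical class,
// not part of any Databricks SDK
ExternalLocation location = new ExternalLocation();
location.setPath("your-cloud-storage-path");
location.setCredentials(credentials);

The snippet above is a conceptual sketch of how an external location is assembled. Replace "your-cloud-storage-path" with the actual path to your cloud storage. The 'credentials' object is the storage credential from the previous step. In Databricks itself, the equivalent operation is the CREATE EXTERNAL LOCATION SQL statement or the corresponding form in Catalog Explorer.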

  • ExternalLocation: The (hypothetical) class that represents the external location.
  • setPath: Sets the path to your cloud storage.
  • setCredentials: Sets the storage credential that the external location uses.
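
As a more concrete sketch, the helper below builds the CREATE EXTERNAL LOCATION statement for a given path and credential. The function names, the location name, the bucket path, and the credential name are all placeholders, and the quoting helpers are a simple assumption about how identifiers and string literals are escaped. In a notebook, the resulting string would typically be passed to spark.sql.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def quote_string(value: str) -> str:
    """Wrap a value in single quotes, escaping backslashes and single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def create_external_location_sql(name: str, url: str, credential: str) -> str:
    """Build a CREATE EXTERNAL LOCATION statement for a path and a credential."""
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {quote_identifier(name)} "
        f"URL {quote_string(url)} "
        f"WITH (STORAGE CREDENTIAL {quote_identifier(credential)})"
    )


statement = create_external_location_sql(
    name="landing_zone",                # placeholder location name
    url="s3://example-bucket/landing",  # placeholder storage path
    credential="example_cred",          # placeholder credential name
)
print(statement)
```

Building the statement in one function keeps the path and the credential together, which mirrors what the external location itself does.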

How to Specify a Managed Storage Location in Unity Catalog?

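After creating the external location, the next step is to specify a managed storage location in Unity Catalog. A managed storage location is the cloud storage path where Unity Catalog stores the data for managed tables and managed volumes. It can be set at the metastore, catalog, or schema level, and a catalog-level or schema-level path must fall within an external location.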


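// Illustrative pseudocode: ManagedStorageLocation is a hypothetical class,
// not part of any Databricks SDK
ManagedStorageLocation managedLocation = new ManagedStorageLocation();
managedLocation.setLocation(location);

The snippet above is a conceptual sketch of how a managed storage location refers to an external location. The 'location' object is the external location from the previous step. In Databricks itself, a managed location is set with the MANAGED LOCATION clause of CREATE CATALOG or CREATE SCHEMA.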

  • ManagedStorageLocation: The (hypothetical) class that represents the managed storage location.
  • setLocation: Sets the external location that backs the managed storage location.
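
The following sketch builds the SQL statements that assign a managed location to a catalog and to a schema. The helper names, the catalog and schema names, and the paths are placeholders chosen for illustration. The check that the managed path sits under the external location URL is a simplified assumption about the containment rule, not a reproduction of how Unity Catalog validates it.

```python
def _reject(value: str, forbidden: str) -> str:
    """Return value unchanged, or raise if it contains a character that would break quoting."""
    if any(ch in value for ch in forbidden):
        raise ValueError(f"unsupported character in {value!r}")
    return value


def is_within(path: str, external_location_url: str) -> bool:
    """Simplified containment check: the path equals the URL or sits beneath it."""
    base = external_location_url.rstrip("/")
    return path == base or path.startswith(base + "/")


def create_catalog_sql(catalog: str, managed_location: str) -> str:
    """Build a CREATE CATALOG statement with a managed storage location."""
    return (
        f"CREATE CATALOG IF NOT EXISTS `{_reject(catalog, '`')}` "
        f"MANAGED LOCATION '{_reject(managed_location, chr(39) + chr(92))}'"
    )


def create_schema_sql(catalog: str, schema: str, managed_location: str) -> str:
    """Build a CREATE SCHEMA statement with a managed storage location."""
    return (
        f"CREATE SCHEMA IF NOT EXISTS `{_reject(catalog, '`')}`.`{_reject(schema, '`')}` "
        f"MANAGED LOCATION '{_reject(managed_location, chr(39) + chr(92))}'"
    )


external_url = "s3://example-bucket/landing"   # placeholder external location URL
managed_path = external_url + "/managed/sales"  # placeholder managed path

if not is_within(managed_path, external_url):
    raise ValueError("managed location must sit inside the external location")

catalog_statement = create_catalog_sql("sales", managed_path)
schema_statement = create_schema_sql("sales", "raw", managed_path + "/raw")
print(catalog_statement)
print(schema_statement)
```

Setting the location on the catalog gives every schema in it a default, while the schema-level statement overrides that default for one schema.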

How to Control Access to the Credential in Unity Catalog?

Unity Catalog access-control policies control which users and groups can use a storage credential or an external location. To grant permissions in the UI, click the name of an external location to open its properties, click Permissions, select each identity, and then click Grant.


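// Illustrative pseudocode: AccessControlPolicy is a hypothetical class,
// not part of any Databricks SDK
AccessControlPolicy policy = new AccessControlPolicy();
policy.grantPermission("user-or-group-id", "permission-type");

The snippet above is a conceptual sketch of granting a permission to a user or group. Replace "user-or-group-id" with the actual user or group, and "permission-type" with the permission you want to grant, such as READ FILES or WRITE FILES on an external location. In Databricks itself, the equivalent operation is a SQL GRANT statement.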

  • AccessControlPolicy: The (hypothetical) class that represents the access-control policy.
  • grantPermission: Grants a specific type of permission to a user or group.
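
The grant can also be expressed as SQL. The sketch below builds a GRANT statement for an external location and rejects privilege names it does not recognize. The function name, the location, and the group are placeholders, and the privilege set is a partial list assumed here for illustration; consult the Unity Catalog privilege reference for the complete set.

```python
from typing import Iterable

# Partial, assumed list of privileges that apply to external locations.
EXTERNAL_LOCATION_PRIVILEGES = {
    "READ FILES",
    "WRITE FILES",
    "CREATE EXTERNAL TABLE",
    "CREATE MANAGED STORAGE",
}


def grant_on_external_location_sql(
    privileges: Iterable[str], location: str, principal: str
) -> str:
    """Build a GRANT statement for one external location and one principal."""
    normalized = [p.strip().upper() for p in privileges]
    if not normalized:
        raise ValueError("at least one privilege is required")
    unknown = [p for p in normalized if p not in EXTERNAL_LOCATION_PRIVILEGES]
    if unknown:
        raise ValueError(f"unrecognized privileges: {unknown}")
    if "`" in location or "`" in principal:
        raise ValueError("backticks are not supported in names")
    return (
        f"GRANT {', '.join(normalized)} "
        f"ON EXTERNAL LOCATION `{location}` TO `{principal}`"
    )


grant_statement = grant_on_external_location_sql(
    privileges=["read files", "write files"],
    location="landing_zone",     # placeholder external location
    principal="data-engineers",  # placeholder group name
)
print(grant_statement)
```

Granting to a group rather than to individual users keeps the number of grants small as team membership changes.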
