How to Configure Doris with dbt

Configure Doris views & dbt models. Connect to Doris/SelectDB with user/password auth. Learn dbt Core installation & profile creation.
Published
May 10, 2024

How Can You Configure a Doris View and dbt Model?

According to the dbt Developer Hub, you can configure a dbt model to be built as a Doris view in two places: the project file (dbt_project.yml) or a config block at the top of the model's SQL file. The project file sets the materialization for every model under a given path, while a config block applies only to the model that contains it.


# Project file (dbt_project.yml)
name: 'my_project'
version: '1.0.0'
config-version: 2

models:
  my_project:
    +materialized: view

# Config block (first line of a model file such as models/my_model.sql)
{{ config(materialized='view') }}

The code above shows both options. The project name my_project and the file name my_model.sql are placeholders; use the names from your own project.

  • Project file: dbt_project.yml holds the project's name, version, and config-version, along with project-wide settings such as the materialization declared under the models key.
  • Config block: A Jinja config() call inside a model's SQL file that sets the materialization for that model alone.
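When a materialization is set in more than one place, dbt uses the most specific setting: a config block in the model file takes priority over dbt_project.yml, and dbt falls back to a view when nothing is configured. The sketch below only illustrates that ordering; resolve_materialization is a hypothetical helper, not part of dbt.

```python
from typing import Optional

# dbt's built-in default when no materialization is configured anywhere.
DBT_DEFAULT_MATERIALIZATION = "view"


def resolve_materialization(project_file: Optional[str],
                            config_block: Optional[str]) -> str:
    """Return the materialization that applies to one model.

    Simplified illustration (hypothetical helper): the most specific
    setting wins.
    """
    if config_block is not None:   # set in the model's own SQL file
        return config_block
    if project_file is not None:   # set in dbt_project.yml
        return project_file
    return DBT_DEFAULT_MATERIALIZATION


print(resolve_materialization(project_file="view", config_block=None))
print(resolve_materialization(project_file="view", config_block="table"))
```

The first call prints view and the second prints table, because the model-level setting overrides the project-level one.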

How Can You Connect to Doris/SelectDB with dbt-doris?

dbt-doris connects to Doris/SelectDB using username and password authentication. The connection details live in a profile in profiles.yml, and dbt_project.yml selects that profile by name.


# profiles.yml: dbt-doris connection
default:
  target: dev
  outputs:
    dev:
      type: doris
      host: localhost
      port: 9030            # query port of the Doris FE (MySQL protocol)
      schema: my_database   # Doris database that dbt builds into
      username: username
      password: password

The code above is an example of a dbt-doris connection. The host, database name, and credentials are placeholders for your own values.

  • profile: The top-level key (default here) is the profile name. It must match the profile setting in dbt_project.yml.
  • target: This is the name of the output that dbt uses by default.
  • outputs: This section contains the connection parameters for each target.
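Before running dbt, it can help to confirm that the Doris query port is reachable from your machine. The following is a minimal sketch using only the Python standard library. It assumes a Doris frontend on localhost with the default query port 9030, and can_reach is a hypothetical helper name. It checks TCP reachability only, not the credentials.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Assumed values: a local Doris FE on its default query port.
    print(can_reach("localhost", 9030))
```

A result of False usually points to a wrong host or port, or to a firewall between you and the cluster. Once the port is reachable, dbt debug verifies the full profile, including the login.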

What Are the Different Ways to Install dbt Core on the Command Line?

There are several ways to install dbt Core on the command line: using pip, Homebrew, a Docker image, or installing from source. The dbt Cloud CLI is a separate tool that runs dbt commands against dbt Cloud rather than a local dbt Core installation. To work with Doris you also need the dbt-doris adapter, which is installed with pip.


# Use pip (installs dbt Core and the Doris adapter)
python -m pip install dbt-core dbt-doris

# Use Homebrew (macOS; the dbt-labs tap bundles dbt Core with a specific adapter)
brew tap dbt-labs/dbt
brew install dbt-postgres

# Use a Docker image (replace <version_tag> with a published tag)
docker pull ghcr.io/dbt-labs/dbt-core:<version_tag>

# Install dbt Core from source
git clone https://github.com/dbt-labs/dbt-core.git
cd dbt-core
python -m pip install -r requirements.txt

# dbt Cloud CLI (separate from dbt Core; runs commands against dbt Cloud)
dbt run

The code above shows each installation method. For a Doris project, pip is the most direct route because it installs dbt Core and the dbt-doris adapter together.

  • pip: This is the package manager for Python. You can use it to install dbt Core and the dbt-doris adapter.
  • Homebrew: This is a package manager for macOS. The dbt-labs tap installs dbt Core together with one specific adapter.
  • Docker image: This is a lightweight, standalone, executable package that includes everything needed to run a piece of software, including dbt Core.
  • dbt from source: This method involves cloning the dbt-core repository and installing its requirements.
  • dbt Cloud CLI: This is a command line interface for dbt Cloud. It runs dbt commands against your dbt Cloud environment and does not install dbt Core.
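After installing, you can check from Python which dbt packages are present in the active environment. This is a small sketch built on the standard library's importlib.metadata. The distribution names dbt-core and dbt-doris are the PyPI package names, and installed_versions is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(["dbt-core", "dbt-doris"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running dbt --version on the command line gives similar information, including the adapters dbt has detected.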

How Can You Create a dbt Profile?

Creating a dbt profile involves specifying the type of data warehouse you are connecting to, getting warehouse credentials from your database administrator, specifying the schema that dbt will build objects in, and specifying the number of threads the dbt project will run on. By default, dbt reads profiles from ~/.dbt/profiles.yml.


# dbt profile (profiles.yml)
default:
  target: dev
  outputs:
    dev:
      type: doris
      threads: 1
      host: localhost
      port: 9030
      schema: my_database
      username: username
      password: password

The code above is an example of a dbt profile for Doris. It includes the parameters for connecting to the data warehouse, the schema for building objects, and the number of threads for running the dbt project.

  • type: This is the type of data warehouse you are connecting to (doris for the dbt-doris adapter).
  • threads: This is the number of threads the dbt project will run on.
  • host: This is the host name or IP address of your data warehouse.
  • port: This is the query port of your data warehouse. The Doris frontend listens on 9030 by default.
  • username and password: These are the credentials for your data warehouse.
  • schema: This is the schema that dbt will build objects in. Doris has no separate schema level, so this is the name of the Doris database.
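To keep the password out of profiles.yml, dbt lets you reference an environment variable with env_var(). The sketch below generates such a profile as text. It is one possible approach, with assumptions: render_doris_profile is a hypothetical helper, DORIS_PASSWORD is an example variable name, and the key names follow the dbt-doris profile fields (type, host, port, schema, username, password).

```python
def render_doris_profile(profile: str, host: str, port: int, schema: str,
                         username: str, password_env: str,
                         threads: int = 1) -> str:
    """Build profiles.yml text for a single 'dev' target.

    The password is written as a dbt env_var() lookup, so the secret
    itself never appears in the generated file.
    """
    password_ref = "{{ env_var('" + password_env + "') }}"
    lines = [
        f"{profile}:",
        "  target: dev",
        "  outputs:",
        "    dev:",
        "      type: doris",
        f"      threads: {threads}",
        f"      host: {host}",
        f"      port: {port}",
        f"      schema: {schema}",
        f"      username: {username}",
        f'      password: "{password_ref}"',
    ]
    return "\n".join(lines) + "\n"


print(render_doris_profile("default", "localhost", 9030,
                           "my_database", "username", "DORIS_PASSWORD"))
```

Set DORIS_PASSWORD in the shell before running dbt so that the lookup resolves when dbt parses the profile.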

What is Apache Doris and What are its Data Sources?

Apache Doris is an open-source real-time data warehouse that can collect data from various sources. These sources include relational databases, logs, and time series data from IoT devices.


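-- Apache Doris
CREATE DATABASE example_db;
USE example_db;
CREATE TABLE example_table (
    siteid INT,
    city CHAR(16),
    username VARCHAR(32),
    pv BIGINT SUM
)
AGGREGATE KEY(siteid, city, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10;

The code above creates a database and a table that uses the Aggregate Key model: siteid, city, and username form the key, and pv is a value column whose SUM aggregation adds up the values of rows that share the same key. The table is hash-distributed on siteid into 10 buckets. On a single-node test cluster you may also need to set PROPERTIES ("replication_num" = "1"), since the default replication factor is 3. Data from any of the following sources can be loaded into a table like this one.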

  • Relational databases: Apache Doris can collect data from relational databases, which store data in a structured format.
  • Logs: Apache Doris can collect data from logs, which record the events that occur in an operating system or other software.
  • Time series data from IoT devices: Apache Doris can collect data from IoT devices, which generate time series data.
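The SUM keyword on the pv column in the table definition above means that Doris merges rows with the same key columns and adds their pv values together. The following sketch imitates that behavior in plain Python to show what ends up stored. The rows are made-up sample data and aggregate_sum is a hypothetical helper, not a Doris API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Tuple[int, str, str, int]  # (siteid, city, username, pv)


def aggregate_sum(rows: Iterable[Row]) -> Dict[Tuple[int, str, str], int]:
    """Merge rows that share (siteid, city, username) by summing pv."""
    totals: Dict[Tuple[int, str, str], int] = defaultdict(int)
    for siteid, city, username, pv in rows:
        totals[(siteid, city, username)] += pv
    return dict(totals)


# Made-up sample rows: the first two share the same key.
rows = [
    (1, "Beijing", "alice", 2),
    (1, "Beijing", "alice", 3),
    (2, "Shanghai", "bob", 5),
]
print(aggregate_sum(rows))
# {(1, 'Beijing', 'alice'): 5, (2, 'Shanghai', 'bob'): 5}
```

Three loaded rows become two stored rows, which is why this table model suits page-view counters and similar metrics.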
