
Continuous integration jobs in dbt Cloud

You can set up continuous integration (CI) jobs to run when someone opens a new pull request (PR) in your dbt Git repository. By running and testing only modified models, dbt Cloud keeps these jobs as efficient and resource-conscious as possible on your data platform.

Prerequisites

Availability of features by Git provider

The following outlines the integration options available for each Git provider (native dbt Cloud integration, automated CI job, and Git clone) and the plans they apply to.

  • Azure DevOps (Enterprise): Organizations on the Team and Developer plans can connect to Azure DevOps using a deploy key. You won't be able to configure automated CI jobs, but you can still develop.
  • GitHub (Developer, Team, Enterprise)
  • GitLab (Developer, Team, Enterprise)
  • All other Git providers using Git clone (BitBucket, AWS CodeCommit, and others): Refer to the Customizing CI/CD with custom pipelines guide to set up continuous integration and continuous deployment (CI/CD).

Set up CI jobs

dbt Labs recommends that you create your CI job in a dedicated dbt Cloud deployment environment that's connected to a staging database. Having a separate environment dedicated for CI will provide better isolation between your temporary CI schema builds and your production data builds. Additionally, sometimes teams need their CI jobs to be triggered when a PR is made to a branch other than main. If your team maintains a staging branch as part of your release process, having a separate environment will allow you to set a custom branch and, accordingly, the CI job in that dedicated environment will be triggered only when PRs are made to the specified custom branch. To learn more, refer to Get started with CI tests.

To make CI job creation easier, many options on the CI job page are set to default values that dbt Labs recommends that you use. If you don't want to use the defaults, you can change them.

  1. On your deployment environment page, click Create job > Continuous integration job to create a new CI job.

  2. Options in the Job settings section:

    • Job name — Specify the name for this CI job.
    • Description — Provide a description about the CI job.
    • Environment — By default, this will be set to the environment you created the CI job from. Use the dropdown to change the default setting.
  3. Options in the Git trigger section:

    • Triggered by pull requests — By default, it’s enabled. Every time a developer opens up a pull request or pushes a commit to an existing pull request, this job will get triggered to run.
      • Run on draft pull request — Enable this option if you want to also trigger the job to run every time a developer opens up a draft pull request or pushes a commit to that draft pull request.
  4. Options in the Execution settings section:

    • Commands — By default, this includes the dbt build --select state:modified+ command. This informs dbt Cloud to build only new or changed models and their downstream dependents. Importantly, state comparison can only happen when there is a deferred environment selected to compare state to. Click Add command to add more commands that you want to be invoked when this job runs.

    • Linting — Enable this option for dbt to lint the SQL files in your project as the first step of the job run. If this check runs into an error, dbt can either Stop running on error or Continue running on error.

    • dbt compare (Enterprise) — Enable this option to compare the last applied state of the production environment (if one exists) with the latest changes from the pull request, and identify what those differences are. To enable record-level comparison and primary key analysis, you must add a primary key constraint or uniqueness test. Otherwise, you'll receive a "Primary key missing" error message in dbt Cloud.

      To review the comparison report, navigate to the Compare tab in the job run's details. A summary of the report is also available from the pull request in your Git provider (see the CI report example).

      Optimization tip

      When you enable the dbt compare checkbox, you can customize the comparison command to optimize your CI job. For example, if you have large models that take a long time to compare, you can exclude them to speed up the process using the --exclude flag. Refer to compare changes custom commands for more details.

      Additionally, if you set event_time in your models, seeds, snapshots, or sources, dbt can compare matching date ranges between tables by filtering to overlapping date ranges. This is useful for faster CI workflows or custom sampling setups.

    • Compare changes against an environment (Deferral) — By default, it’s set to the Production environment if you created one. This option allows dbt Cloud to check the state of the code in the PR against the code running in the deferred environment, so as to only check the modified code, instead of building the full table or the entire DAG.

      info

      Older versions of dbt Cloud only allow you to defer to a specific job instead of an environment. Deferral to a job compares state against the project code that was run in the deferred job's last successful run. Deferral to an environment is more efficient as dbt Cloud will compare against the project representation (which is stored in the manifest.json) of the last successful deploy job run that executed in the deferred environment. By considering all deploy jobs that run in the deferred environment, dbt Cloud will get a more accurate, latest project representation state.

    • Run timeout — Cancel the CI job if the run time exceeds the timeout value. You can use this option to help ensure that a CI check doesn't consume too much of your warehouse resources. If you enable the dbt compare option, the timeout value defaults to 3600 seconds (one hour) to prevent long-running comparisons.

  5. (optional) Options in the Advanced settings section:

    • Environment variables — Define environment variables to customize the behavior of your project when this CI job runs. You can specify that a CI job is running in a Staging or CI environment by setting an environment variable and modifying your project code to behave differently, depending on the context. It's common for teams to process only a subset of data for CI runs, using environment variables to branch logic in their dbt project code.
    • Target name — Define the target name. Similar to Environment Variables, this option lets you customize the behavior of the project. You can use this option to specify that a CI job is running in a Staging or CI environment by setting the target name and modifying your project code to behave differently, depending on the context.
    • dbt version — By default, it’s set to inherit the dbt version from the environment. dbt Labs strongly recommends that you don't change the default setting. This option to change the version at the job level is useful only when you upgrade a project to the next dbt version; otherwise, mismatched versions between the environment and job can lead to confusing behavior.
    • Threads — By default, it’s set to 4 threads. Increase the thread count to increase model execution concurrency.
    • Generate docs on run — Enable this if you want to generate project docs when this job runs. This is disabled by default since testing doc generation on every CI check is not a recommended practice.
    • Run source freshness — Enable this option to invoke the dbt source freshness command before running this CI job. Refer to Source freshness for more details.
    Example of CI Job page in the dbt Cloud UI
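To make the default dbt build --select state:modified+ command from step 4 more concrete, the following is a conceptual sketch of what the selector means: start from the nodes that differ from the deferred state, then add everything downstream of them. It is not dbt's implementation. The children mapping, the model names, and the select_modified_plus helper are all hypothetical, and the real state comparison is done by dbt against the deferred environment's manifest.json.

```python
from collections import deque


def select_modified_plus(children, modified):
    """Return the modified nodes plus all of their downstream dependents.

    children: dict mapping a node name to the nodes that depend on it directly.
    modified: iterable of node names that changed relative to the deferred state.
    """
    modified = list(modified)
    selected = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Hypothetical project: stg_orders feeds orders, which feeds revenue.
# stg_customers feeds customers, and neither is touched by this PR.
children = {
    "stg_orders": ["orders"],
    "orders": ["revenue"],
    "stg_customers": ["customers"],
}

print(sorted(select_modified_plus(children, ["stg_orders"])))
```

In this toy project, a PR that changes only stg_orders selects stg_orders, orders, and revenue. The stg_customers and customers models are skipped, which is why a CI run scoped this way is typically cheaper than building the entire DAG.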

Example of CI check in pull request

The following is an example of a CI check in a GitHub pull request. The green checkmark means the dbt build and tests were successful. Clicking on the dbt Cloud section takes you to the relevant CI run in dbt Cloud.

Example of CI check in GitHub pull request

Example of CI report in pull request preview

The following is an example of a CI report in a GitHub pull request, which is shown when the dbt compare option is enabled for the CI job. It displays a high-level summary of the models that changed from the pull request.

Example of CI report comment in GitHub pull request

Trigger a CI job with the API

If you're not using dbt Cloud’s native Git integration with GitHub, GitLab, or Azure DevOps, you can use the Administrative API to trigger a CI job to run. However, dbt Cloud will not automatically delete the temporary schema for you. This is because automatic deletion relies on incoming webhooks from Git providers, which are only available through the native integrations.

Prerequisites

  1. Set up a CI job with the Create Job API endpoint using "job_type": ci or from the dbt Cloud UI.
  2. Call the Trigger Job Run API endpoint to trigger the CI job. You must include both of the following in the payload:
    • Provide the pull request (PR) ID using one of these fields:

      • github_pull_request_id
      • gitlab_merge_request_id
      • azure_devops_pull_request_id
      • non_native_pull_request_id (for example, BitBucket)
    • Provide the git_sha or git_branch to target the correct commit or branch to run the job against.
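As a rough illustration of step 2, the sketch below assembles a payload containing one PR ID field plus git_sha or git_branch, and wraps it in a POST request. Only the PR ID field names, git_sha, and git_branch come from the steps above. The helper names, the cause field, the URL layout, and the Token authorization header are assumptions for illustration, so check them against the Administrative API reference for your account before relying on them. The access URL, account ID, job ID, and token are placeholders.

```python
import json
import urllib.request

# PR ID field names listed in the steps above; one of them carries the PR ID.
PR_ID_FIELDS = {
    "github_pull_request_id",
    "gitlab_merge_request_id",
    "azure_devops_pull_request_id",
    "non_native_pull_request_id",
}


def build_trigger_payload(pr_field, pr_id, git_sha=None, git_branch=None,
                          cause="Triggered by custom CI pipeline"):
    """Build the JSON body for triggering a CI job run (hypothetical helper)."""
    if pr_field not in PR_ID_FIELDS:
        raise ValueError(f"unknown PR ID field: {pr_field}")
    if git_sha is None and git_branch is None:
        raise ValueError("provide git_sha or git_branch")
    # "cause" is an assumption about the endpoint, not taken from the steps above.
    payload = {"cause": cause, pr_field: pr_id}
    if git_sha is not None:
        payload["git_sha"] = git_sha
    if git_branch is not None:
        payload["git_branch"] = git_branch
    return payload


def build_request(base_url, account_id, job_id, token, payload):
    """Wrap the payload in a POST request; URL layout and auth header are assumed."""
    url = f"{base_url}/api/v2/accounts/{account_id}/jobs/{job_id}/run/"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_trigger_payload("non_native_pull_request_id", 42, git_sha="abc123")
request = build_request("https://YOUR_ACCESS_URL", 1, 2, "YOUR_TOKEN", payload)
# To actually send it: urllib.request.urlopen(request)
```

Building the payload separately from sending it makes the two required parts easy to validate in a pipeline before any network call is made. Because the temporary schema is not deleted automatically in this setup, a custom pipeline would also need its own cleanup step.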

Semantic validations in CI (Team, Enterprise)

Automatically test your semantic nodes (metrics, semantic models, and saved queries) during code reviews by adding warehouse validation checks in your CI job, guaranteeing that any code changes made to dbt models don't break these metrics.

To do this, add the command dbt sl validate --select state:modified+ in the CI job. This ensures the validation of modified semantic nodes and their downstream dependencies.

Semantic validations in CI workflow

Benefits

  • Testing semantic nodes in a CI job supports deferral and selection of semantic nodes.
  • It allows you to catch issues early in the development process and deliver high-quality data to your end users.
  • Semantic validation executes an explain query in the data warehouse for semantic nodes to ensure the generated SQL will execute.
  • For semantic nodes and models that aren't downstream of modified models, dbt Cloud defers to the production models.

Set up semantic validations in your CI job

To learn how to set this up, refer to the following steps:

  1. Navigate to the Job settings page and click Edit.
  2. Add the dbt sl validate --select state:modified+ command under Commands in the Execution settings section. The command uses state selection and deferral to run validation on any semantic nodes downstream of model changes. To reduce job times, we recommend only running CI on modified semantic models.
  3. Click Save to save your changes.

There are additional commands and use cases described in the next section, such as validating all semantic nodes, validating specific semantic nodes, and so on.

Validate semantic nodes downstream of model changes in your CI job.

Use cases

Use or combine different selectors or commands to validate semantic nodes in your CI job. Semantic validations in CI supports the following use cases:

  • Semantic nodes that are modified or affected by downstream modified nodes
  • Select specific semantic nodes
  • Select all semantic nodes

Troubleshooting

  • Unable to trigger a CI job with GitLab
  • CI jobs aren't triggering occasionally when opening a PR using the Azure DevOps (ADO) integration
  • Temporary schemas aren't dropping
  • Error messages that refer to schemas from previous PRs
  • Production job runs failing at the 'Clone Git Repository' step
  • CI job not triggering for Virtual Private dbt users
  • PR status for CI job stays in 'pending' in Azure DevOps after job run finishes