Robert van Timmeren

Databricks on Azure with Bicep — What the Docs Don't Tell You

Automating end-to-end Databricks workspace deployments with Bicep: networking, identity, secrets, and the gotchas that cost you time when you learn them the hard way.


Deploying Databricks manually through the Azure portal takes twenty minutes. Deploying it repeatably, securely, and identically across dev/staging/prod environments via IaC is a completely different problem. This post covers what I’ve learned doing it properly with Bicep.

Why Bicep over Terraform for Azure?

Short answer: it depends on your org. At RIVM I used both. Bicep has no state file to manage and maps more directly to ARM, which means fewer surprises when Azure releases a new resource. Terraform’s provider ecosystem and plan/apply workflow often make it the better production choice when you need cross-cloud resource management or your team already knows HCL.

For greenfield Azure-only projects, I usually start with Bicep. You can always generate Terraform from it later.

The deployment architecture

A production Databricks platform on Azure needs more than just a Microsoft.Databricks/workspaces resource. The full set:

├── Resource Group
│   ├── Databricks Workspace (Premium)
│   ├── Managed Resource Group (created by Databricks)
│   │   ├── VNet + subnets (host/container)
│   │   ├── NSGs
│   │   └── Storage account (DBFS root)
│   ├── Azure Data Lake Storage Gen2 (external metastore/Unity Catalog)
│   ├── Key Vault (secrets, token management)
│   ├── Managed Identity (for workspace → ADLS2 access)
│   └── Role assignments
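
Wired together in a top-level main.bicep, this looks roughly like the following — a sketch, with hypothetical module file names, parameters, and outputs:

```bicep
// main.bicep — hypothetical module layout for the resource tree above
param env string
param location string = resourceGroup().location

module network 'modules/network.bicep' = {
  name: 'network'
  params: { env: env, location: location }
}

module storage 'modules/storage.bicep' = {
  name: 'storage'
  params: { env: env, location: location }
}

module keyVault 'modules/keyvault.bicep' = {
  name: 'keyVault'
  params: { env: env, location: location }
}

module workspace 'modules/workspace.bicep' = {
  name: 'workspace'
  params: {
    env: env
    location: location
    vnetId: network.outputs.vnetId
    publicSubnetName: network.outputs.publicSubnetName
    privateSubnetName: network.outputs.privateSubnetName
  }
}
```

The role assignments live wherever the identity and storage modules meet — I keep them in the storage module so the scope is local to the resource being granted.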

The workspace Bicep module

resource databricksWorkspace 'Microsoft.Databricks/workspaces@2023-02-01' = {
  name: workspaceName
  location: location
  sku: {
    name: 'premium'  // Unity Catalog requires premium
  }
  properties: {
    managedResourceGroupId: managedRgId
    parameters: {
      customVirtualNetworkId: {
        value: vnet.id
      }
      customPublicSubnetName: {
        value: publicSubnet.name
      }
      customPrivateSubnetName: {
        value: privateSubnet.name
      }
      enableNoPublicIp: {
        value: true  // secure cluster connectivity
      }
    }
  }
}
⚠️ Managed resource group naming

The managed resource group name must be unique within your subscription and cannot be changed after deployment. Use a deterministic naming convention like rg-dbw-managed-{env}-{workspaceName} from the start — renaming it means tearing down and redeploying the workspace.
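
In Bicep, a deterministic version of this is a one-liner — a sketch, assuming env and workspaceName parameters:

```bicep
param env string
param workspaceName string

// Deterministic managed resource group name; stable across redeployments
var managedRgName = 'rg-dbw-managed-${env}-${workspaceName}'
var managedRgId = subscriptionResourceId('Microsoft.Resources/resourceGroups', managedRgName)
```

Pass managedRgId into the workspace resource's managedResourceGroupId property; Databricks creates the group itself.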

Networking: the part nobody explains clearly

Databricks in Azure uses a VNet injection model. Your workspace attaches to two subnets you control:

  • Public subnet — control plane traffic (Databricks service → your clusters)
  • Private subnet — data plane traffic (cluster ↔ cluster, cluster → storage)

Both subnets need specific delegations and NSG rules. The Databricks service will add its own rules at deployment time but requires the NSG to already exist and be attached.

resource hostSubnetNsg 'Microsoft.Network/networkSecurityGroups@2023-05-01' = {
  name: 'nsg-dbw-host'
  location: location
  properties: {
    securityRules: [] // Databricks adds required rules automatically
  }
}

resource hostSubnet 'Microsoft.Network/virtualNetworks/subnets@2023-05-01' = {
  parent: vnet
  name: 'snet-dbw-host'
  properties: {
    addressPrefix: hostSubnetCidr
    networkSecurityGroup: { id: hostSubnetNsg.id }
    delegations: [
      {
        name: 'databricks-delegation'
        properties: {
          serviceName: 'Microsoft.Databricks/workspaces'
        }
      }
    ]
  }
}

If you skip the delegation or attach the NSG after the fact, the deployment will succeed but clusters will fail to start — with an error message that points nowhere near the actual problem.
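
The container (private) subnet needs the same NSG-plus-delegation treatment; a sketch mirroring the host subnet above, with hypothetical names:

```bicep
resource containerSubnetNsg 'Microsoft.Network/networkSecurityGroups@2023-05-01' = {
  name: 'nsg-dbw-container'
  location: location
  properties: {
    securityRules: [] // Databricks adds required rules automatically
  }
}

resource containerSubnet 'Microsoft.Network/virtualNetworks/subnets@2023-05-01' = {
  parent: vnet
  name: 'snet-dbw-container'
  properties: {
    addressPrefix: containerSubnetCidr
    networkSecurityGroup: { id: containerSubnetNsg.id }
    delegations: [
      {
        name: 'databricks-delegation'
        properties: {
          serviceName: 'Microsoft.Databricks/workspaces'
        }
      }
    ]
  }
}
```

One caveat: Azure rejects concurrent subnet operations on the same VNet, so deploying both subnets in parallel can fail intermittently — a dependsOn on the host subnet serializes them.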

Identity and secrets

Use a managed identity for the workspace to access ADLS2. Avoid storing access keys anywhere.

resource managedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'id-dbw-${env}'
  location: location
}

// Grant Storage Blob Data Contributor on the storage account
var storageBlobDataContributorRoleId = subscriptionResourceId(
  'Microsoft.Authorization/roleDefinitions',
  'ba92f5b4-2d11-453d-a403-e96b0029c9fe' // built-in Storage Blob Data Contributor
)

resource storageRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: storageAccount
  name: guid(storageAccount.id, managedIdentity.id, storageBlobDataContributorRoleId)
  properties: {
    roleDefinitionId: storageBlobDataContributorRoleId
    principalId: managedIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}

Then in your Key Vault, store the workspace URL and relevant tokens. The Databricks provider for Terraform (and the Databricks CLI) can authenticate using Azure CLI credentials — no secrets needed in CI/CD if you set up federated identity on your service principal.
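
In Azure DevOps that looks roughly like this — a sketch, assuming a service connection configured with workload identity federation (the connection name and workspace URL below are hypothetical):

```yaml
- task: AzureCLI@2
  inputs:
    azureSubscription: 'sc-databricks'  # hypothetical federated service connection
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      export DATABRICKS_HOST='https://adb-1234567890123456.7.azuredatabricks.net'  # hypothetical
      export DATABRICKS_AUTH_TYPE=azure-cli  # reuse the task's az login; no stored secrets
      databricks current-user me
```

The Databricks CLI picks up the Azure CLI session from the task, so nothing sensitive ever lands in pipeline variables.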

💡 Unity Catalog gotcha

Unity Catalog requires a storage account with hierarchical namespace (ADLS Gen2) for the metastore root. This is a separate storage account from your data containers — the metastore root is managed by Unity Catalog itself. You cannot use an existing storage account that already has data in it.
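
In Bicep the detail that matters is the isHnsEnabled flag — without it you get plain blob storage, not ADLS Gen2. A sketch with hypothetical names:

```bicep
resource metastoreStorage 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: 'stdbwmeta${env}' // hypothetical; must be globally unique, lowercase, max 24 chars
  location: location
  sku: { name: 'Standard_LRS' }
  kind: 'StorageV2'
  properties: {
    isHnsEnabled: true // hierarchical namespace = ADLS Gen2, required by Unity Catalog
    minimumTlsVersion: 'TLS1_2'
  }
}
```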

CI/CD pipeline structure

# azure-pipelines.yml (simplified)
stages:
  - stage: validate
    jobs:
      - job: bicep_lint
        steps:
          - script: az bicep build --file main.bicep
          - script: |
              az deployment group validate \
                --resource-group $(RG_NAME) \
                --template-file main.bicep \
                --parameters @params.$(ENV).json

  - stage: deploy
    dependsOn: validate
    jobs:
      - job: deploy_infra
        steps:
          - script: |
              az deployment group create \
                --resource-group $(RG_NAME) \
                --template-file main.bicep \
                --parameters @params.$(ENV).json \
                --mode Incremental
      - job: configure_workspace
        dependsOn: deploy_infra
        steps:
          - script: databricks workspace configure ...

The --mode Incremental flag is critical. Complete mode will delete resources not in the template — including things Databricks created in the managed resource group.

What I’d do differently

  • Start with Unity Catalog from day one. Retrofitting it onto an existing workspace with data is painful.
  • Parameterise subnet CIDRs in the Bicep module. Hard-coding them causes conflicts when you try to peer VNets.
  • Put workspace URL and cluster IDs in Key Vault immediately, not in pipeline variables. It makes secret rotation and multi-environment setups much cleaner.

The IaC for a workspace is maybe 300 lines of Bicep. The configuration inside the workspace (clusters, policies, permissions, catalogs, schemas) is another problem entirely — and probably worth a separate post.

Astro describes itself as “the web framework for content-driven websites.” After building this blog with it, I’d add: it’s the framework that actually lets you use any UI library without the framework fighting you.

Let me walk you through the concepts that matter.

The Island Architecture

Astro’s core idea is partial hydration — also called the Island Architecture. Instead of shipping a massive JavaScript bundle that hydrates the whole page (like a traditional React SPA), Astro only hydrates the components that actually need to be interactive.

---
// These components only ship JS where a client:* directive says so
import Counter from './Counter.tsx';
import HeavyChart from './HeavyChart.tsx';
---

<h1>This heading is pure HTML. Zero JS.</h1>

<!-- This component ships JS to the browser -->
<Counter client:load />

<!-- This one is only hydrated when visible in the viewport -->
<HeavyChart client:visible />

The client:* directives give you surgical control over hydration:

Directive — When it hydrates
client:load — Immediately on page load
client:idle — When the browser is idle
client:visible — When scrolled into the viewport
client:media — When a media query matches
client:only — Client-side only (no SSR)
💡 Default: Zero JS

If you don’t add a client:* directive to a component, it renders as static HTML only. No JavaScript is shipped to the browser at all.
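
The two directives that take a value deserve a quick illustration (component names hypothetical):

```astro
<!-- Hydrates only when the media query matches -->
<MobileNav client:media="(max-width: 768px)" />

<!-- Skips server rendering entirely; you must name the framework -->
<LeafletMap client:only="react" />
```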

Content Collections

Content Collections are Astro’s way of managing structured content. You define a schema, and Astro validates your frontmatter at build time.

// src/content/config.ts
import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string(),
    description: z.string(),
    pubDate: z.coerce.date(),
    tags: z.array(z.string()).default([]),
    draft: z.boolean().default(false),
  }),
});

export const collections = { blog };

Then query your content anywhere:

import { getCollection } from 'astro:content';

const posts = await getCollection('blog', ({ data }) => !data.draft);
const sorted = posts.sort(
  (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf()
);

The TypeScript types are fully inferred from the Zod schema. You get autocomplete and type errors if frontmatter is missing or wrong.

Routing

Astro uses file-based routing inside src/pages/. The filename becomes the URL:

src/pages/
├── index.astro       →  /
├── about.astro       →  /about
└── blog/
    ├── index.astro   →  /blog
    └── [...slug].astro →  /blog/[any-slug]

Dynamic routes export a getStaticPaths function:

---
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  return posts.map((post) => ({
    params: { slug: post.slug },
    props: post,
  }));
}

const post = Astro.props;
const { Content } = await post.render();
---

<article>
  <h1>{post.data.title}</h1>
  <Content />
</article>

Layouts

Astro layouts are just .astro components that wrap a <slot />. They’re great for page-level structure:

---
// src/layouts/BaseLayout.astro
interface Props {
  title: string;
}
const { title } = Astro.props;
---

<!doctype html>
<html lang="en">
  <head>
    <title>{title}</title>
  </head>
  <body>
    <header>...</header>
    <main>
      <slot />
    </main>
    <footer>...</footer>
  </body>
</html>

Should you use Astro?

Yes, if:

  • You’re building a blog, docs site, or marketing site
  • You want to mix UI frameworks (React + Svelte + vanilla JS — all in the same project)
  • You care about Core Web Vitals and shipping minimal JavaScript

Maybe not, if:

  • You need a heavily interactive app (think Figma, not a blog)
  • Your whole team is already deep in Next.js and there’s no reason to switch

Start with the docs

The Astro docs are genuinely excellent. The tutorial alone will get you from zero to deployed blog in an afternoon.

Astro has become my default choice for content-focused projects. I hope this gives you enough of an overview to decide if it’s right for yours.

Robert van Timmeren

Senior Data & Cloud Engineer — Databricks, Azure, Python.