Databricks on Azure with Bicep — What the Docs Don't Tell You
Automating end-to-end Databricks workspace deployments with Bicep: networking, identity, secrets, and the gotchas that cost you time when you learn them the hard way.
Deploying Databricks manually through the Azure portal takes twenty minutes. Deploying it repeatably, securely, and identically across dev/staging/prod environments via IaC is a completely different problem. This post covers what I’ve learned doing it properly with Bicep.
Why Bicep over Terraform for Azure?
Short answer: it depends on your org. At RIVM I used both. Bicep has no state file to manage and maps more directly to ARM, which means fewer surprises when Azure releases a new resource. Terraform’s provider ecosystem and plan/apply workflow often make it the better production choice when you need cross-cloud resource management or your team already knows HCL.
For greenfield Azure-only projects, I usually start with Bicep. You can always migrate later; Azure Export for Terraform (aztfexport) can generate Terraform configuration from the deployed resources.
The deployment architecture
A production Databricks platform on Azure needs more than just a Microsoft.Databricks/workspaces resource. The full set:
├── Resource Group
│   ├── Databricks Workspace (Premium)
│   ├── Managed Resource Group (created by Databricks)
│   │   ├── VNet + subnets (host/container)
│   │   ├── NSGs
│   │   └── Storage account (DBFS root)
│   ├── Azure Data Lake Storage Gen2 (external metastore/Unity Catalog)
│   ├── Key Vault (secrets, token management)
│   ├── Managed Identity (for workspace → ADLS2 access)
│   └── Role assignments
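Wiring these together, a main.bicep ends up as a handful of module calls. A rough sketch; the module file names, parameters, and outputs here are assumptions, not a prescribed layout:

// main.bicep (sketch): module and parameter names are illustrative
param env string
param workspaceName string
param location string = resourceGroup().location

module network 'network.bicep' = {
  name: 'network'
  params: {
    env: env
    location: location
  }
}

module identity 'identity.bicep' = {
  name: 'identity'
  params: {
    env: env
    location: location
  }
}

module workspace 'workspace.bicep' = {
  name: 'workspace'
  params: {
    env: env
    location: location
    workspaceName: workspaceName
    vnetId: network.outputs.vnetId
    publicSubnetName: network.outputs.publicSubnetName
    privateSubnetName: network.outputs.privateSubnetName
  }
}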
The workspace Bicep module
resource databricksWorkspace 'Microsoft.Databricks/workspaces@2023-02-01' = {
  name: workspaceName
  location: location
  sku: {
    name: 'premium' // Unity Catalog requires premium
  }
  properties: {
    managedResourceGroupId: managedRgId
    parameters: {
      customVirtualNetworkId: {
        value: vnet.id
      }
      customPublicSubnetName: {
        value: publicSubnet.name
      }
      customPrivateSubnetName: {
        value: privateSubnet.name
      }
      enableNoPublicIp: {
        value: true // secure cluster connectivity
      }
    }
  }
}
The managed resource group name must be unique within your subscription and cannot be changed after deployment. Use a deterministic naming convention like rg-dbw-managed-{env}-{workspaceName} from the start; renaming it later means tearing down and redeploying the workspace.
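With that convention, the ID can be derived deterministically in the template. A sketch, assuming env and workspaceName are parameters of the module:

// Sketch: deterministic managed resource group ID
var managedRgName = 'rg-dbw-managed-${env}-${workspaceName}'
var managedRgId = subscriptionResourceId('Microsoft.Resources/resourceGroups', managedRgName)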
Networking: the part nobody explains clearly
Databricks in Azure uses a VNet injection model. Your workspace attaches to two subnets you control:
- Public subnet — control plane traffic (Databricks service → your clusters)
- Private subnet — data plane traffic (cluster ↔ cluster, cluster → storage)
Both subnets need specific delegations and NSG rules. The Databricks service will add its own rules at deployment time but requires the NSG to already exist and be attached.
resource hostSubnetNsg 'Microsoft.Network/networkSecurityGroups@2023-05-01' = {
  name: 'nsg-dbw-host'
  location: location
  properties: {
    securityRules: [] // Databricks adds required rules automatically
  }
}

resource hostSubnet 'Microsoft.Network/virtualNetworks/subnets@2023-05-01' = {
  parent: vnet
  name: 'snet-dbw-host'
  properties: {
    addressPrefix: hostSubnetCidr
    networkSecurityGroup: { id: hostSubnetNsg.id }
    delegations: [
      {
        name: 'databricks-delegation'
        properties: {
          serviceName: 'Microsoft.Databricks/workspaces'
        }
      }
    ]
  }
}
If you skip the delegation or attach the NSG after the fact, the deployment will succeed but clusters will fail to start — with an error message that points nowhere near the actual problem.
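For reference, a sketch of the matching container (private) subnet. It mirrors the host subnet with the same delegation; sharing the host NSG is an assumption here, a dedicated NSG works just as well. containerSubnetCidr is an assumed parameter:

// Sketch: container (private) subnet, mirroring the host subnet
resource containerSubnet 'Microsoft.Network/virtualNetworks/subnets@2023-05-01' = {
  parent: vnet
  name: 'snet-dbw-container'
  properties: {
    addressPrefix: containerSubnetCidr
    networkSecurityGroup: { id: hostSubnetNsg.id } // shared NSG (assumption)
    delegations: [
      {
        name: 'databricks-delegation'
        properties: {
          serviceName: 'Microsoft.Databricks/workspaces'
        }
      }
    ]
  }
}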
Identity and secrets
Use a managed identity for the workspace to access ADLS2. Avoid storing access keys anywhere.
resource managedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'id-dbw-${env}'
  location: location
}

// Grant Storage Blob Data Contributor on the storage account
resource storageRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: storageAccount
  name: guid(storageAccount.id, managedIdentity.id, storageBlobDataContributorRoleId)
  properties: {
    roleDefinitionId: storageBlobDataContributorRoleId
    principalId: managedIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}
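The storageBlobDataContributorRoleId referenced above isn't defined in the snippet. It's the built-in role's well-known definition ID, declared once as a variable:

// Built-in role: Storage Blob Data Contributor
var storageBlobDataContributorRoleId = subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')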
Then in your Key Vault, store the workspace URL and relevant tokens. The Databricks provider for Terraform (and the Databricks CLI) can authenticate using Azure CLI credentials — no secrets needed in CI/CD if you set up federated identity on your service principal.
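As a sketch, the workspace URL can be written to Key Vault from the same deployment. keyVault is an assumed existing resource reference, and the API version may differ in your setup:

// Sketch: persist the workspace URL for downstream pipelines
resource workspaceUrlSecret 'Microsoft.KeyVault/vaults/secrets@2023-02-01' = {
  parent: keyVault
  name: 'databricks-workspace-url'
  properties: {
    value: 'https://${databricksWorkspace.properties.workspaceUrl}'
  }
}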
Unity Catalog requires a storage account with hierarchical namespace (ADLS Gen2) for the metastore root. Keep it separate from your data containers: the metastore root is managed by Unity Catalog itself, and Databricks recommends a dedicated account (or at least a dedicated container) rather than one that already holds data.
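A minimal sketch of that account; the one setting that actually makes it ADLS Gen2 is isHnsEnabled. metastoreStorageName is an assumed parameter:

// Sketch: dedicated ADLS Gen2 account for the Unity Catalog metastore root
resource metastoreStorage 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: metastoreStorageName // assumed parameter
  location: location
  kind: 'StorageV2'
  sku: {
    name: 'Standard_LRS'
  }
  properties: {
    isHnsEnabled: true // hierarchical namespace = ADLS Gen2
    minimumTlsVersion: 'TLS1_2'
  }
}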
CI/CD pipeline structure
# azure-pipelines.yml (simplified)
stages:
  - stage: validate
    jobs:
      - job: bicep_lint
        steps:
          - script: az bicep build --file main.bicep
          - script: |
              az deployment group validate \
                --resource-group $(RG_NAME) \
                --template-file main.bicep \
                --parameters @params.$(ENV).json
  - stage: deploy
    dependsOn: validate
    jobs:
      - job: deploy_infra
        steps:
          - script: |
              az deployment group create \
                --resource-group $(RG_NAME) \
                --template-file main.bicep \
                --parameters @params.$(ENV).json \
                --mode Incremental
      - job: configure_workspace
        dependsOn: deploy_infra
        steps:
          - script: databricks configure ...
The --mode Incremental flag is critical. Complete mode deletes anything in the target resource group that isn't in the template, and removing the workspace resource takes its managed resource group, and everything Databricks created inside it, along with it.
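A cheap addition to the validate stage is a what-if step, which prints the resource diff before anything is deployed; a sketch using the same variables as above:

- script: |
    az deployment group what-if \
      --resource-group $(RG_NAME) \
      --template-file main.bicep \
      --parameters @params.$(ENV).json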
What I’d do differently
- Start with Unity Catalog from day one. Retrofitting it onto an existing workspace with data is painful.
- Parameterise subnet CIDRs in the Bicep module (see the sketch after this list). Hard-coding them causes conflicts when you try to peer VNets.
- Put workspace URL and cluster IDs in Key Vault immediately, not in pipeline variables. It makes secret rotation and multi-environment setups much cleaner.
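The CIDR parameterisation from the second point is only a couple of lines in the module (illustrative parameter names):

@description('Address space for the Databricks host (public) subnet')
param hostSubnetCidr string

@description('Address space for the Databricks container (private) subnet')
param containerSubnetCidr string

Each params.{env}.json then carries its own non-overlapping ranges.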
The IaC for a workspace is maybe 300 lines of Bicep. The configuration inside the workspace (clusters, policies, permissions, catalogs, schemas) is another problem entirely — and probably worth a separate post.
Astro describes itself as “the web framework for content-driven websites.” After building this blog with it, I’d add: it’s the framework that actually lets you use any UI library without the framework fighting you.
Let me walk you through the concepts that matter.
The Island Architecture
Astro’s core idea is partial hydration — also called the Island Architecture. Instead of shipping a massive JavaScript bundle that hydrates the whole page (like a traditional React SPA), Astro only hydrates the components that actually need to be interactive.
---
// Imported components render to static HTML at build time by default
import Counter from './Counter.tsx';
import HeavyChart from './HeavyChart.tsx';
---
<h1>This heading is pure HTML. Zero JS.</h1>

<!-- This component ships JS to the browser -->
<Counter client:load />

<!-- This one is only hydrated when visible in the viewport -->
<HeavyChart client:visible />
The client:* directives give you surgical control over hydration:
| Directive | When it hydrates |
|---|---|
| client:load | Immediately on page load |
| client:idle | When the browser is idle |
| client:visible | When scrolled into the viewport |
| client:media | When a media query matches |
| client:only | Client-side only, no server render (requires a framework value, e.g. client:only="react") |
If you don’t add a client:* directive to a component, it renders as static HTML only. No JavaScript is shipped to the browser at all.
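For completeness, the Counter the example imports could be as small as this (hypothetical component; any framework component works the same way):

// Counter.tsx (illustrative)
import { useState } from 'react';

export default function Counter() {
  const [count, setCount] = useState(0);
  return (
    <button onClick={() => setCount(count + 1)}>
      Clicked {count} times
    </button>
  );
}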
Content Collections
Content Collections are Astro’s way of managing structured content. You define a schema, and Astro validates your frontmatter at build time.
// src/content/config.ts
import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string(),
    description: z.string(),
    pubDate: z.coerce.date(),
    tags: z.array(z.string()).default([]),
    draft: z.boolean().default(false),
  }),
});

export const collections = { blog };
Then query your content anywhere:
import { getCollection } from 'astro:content';

const posts = await getCollection('blog', ({ data }) => !data.draft);
const sorted = posts.sort(
  (a, b) => b.data.pubDate.valueOf() - a.data.pubDate.valueOf()
);
The TypeScript types are fully inferred from the Zod schema. You get autocomplete and type errors if frontmatter is missing or wrong.
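The same inference is available as a named type when passing entries between components, via CollectionEntry:

import type { CollectionEntry } from 'astro:content';

// 'blog' is the collection key exported from src/content/config.ts
type BlogPost = CollectionEntry<'blog'>;

function formatTitle(post: BlogPost): string {
  return `${post.data.title} (${post.data.pubDate.toDateString()})`;
}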
Routing
Astro uses file-based routing inside src/pages/. The filename becomes the URL:
src/pages/
├── index.astro → /
├── about.astro → /about
└── blog/
    ├── index.astro → /blog
    └── [...slug].astro → /blog/[any-slug]
Dynamic routes export a getStaticPaths function:
---
import { getCollection } from 'astro:content';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  return posts.map((post) => ({
    params: { slug: post.slug },
    props: post,
  }));
}

const post = Astro.props;
const { Content } = await post.render();
---
<article>
  <h1>{post.data.title}</h1>
  <Content />
</article>
Layouts
Astro layouts are just .astro components that wrap a <slot />. They’re great for page-level structure:
---
// src/layouts/BaseLayout.astro
interface Props {
  title: string;
}

const { title } = Astro.props;
---
<!doctype html>
<html lang="en">
  <head>
    <title>{title}</title>
  </head>
  <body>
    <header>...</header>
    <main>
      <slot />
    </main>
    <footer>...</footer>
  </body>
</html>
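A page consumes it like any other component; whatever sits between the tags lands in the layout's slot:

---
// src/pages/about.astro (illustrative)
import BaseLayout from '../layouts/BaseLayout.astro';
---
<BaseLayout title="About">
  <p>This paragraph is rendered inside the layout's slot.</p>
</BaseLayout>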
Should you use Astro?
Yes, if:
- You’re building a blog, docs site, or marketing site
- You want to mix UI frameworks (React + Svelte + vanilla JS — all in the same project)
- You care about Core Web Vitals and shipping minimal JavaScript
Maybe not, if:
- You need a heavily interactive app (think Figma, not a blog)
- Your whole team is already deep in Next.js and there’s no reason to switch
The Astro docs are genuinely excellent. The tutorial alone will get you from zero to deployed blog in an afternoon.
Astro has become my default choice for content-focused projects. I hope this gives you enough of an overview to decide if it’s right for yours.