Serverless data platform.
Strictly for developers.

Build AI and data applications with serverless Python functions and data branches. Turn weeks of infrastructure into a few lines of code.

No credit card required.

Branch

Import

Run

Merge

import bauplan

client = bauplan.Client()

# Create a new branch
branch = client.create_branch(
   branch="dev_branch",
   from_ref="main"
)
print(f'Created branch "{branch.name}"')
# List tables in the branch
for table in client.get_tables(ref=branch):
   print(f"{table.namespace:<12} {table.name:<12} {table.kind}")

Create sandboxes instantly without data duplication

Zero copy data lake branches

See what you can build when infrastructure becomes Python code.

Branch your Data

Git for Data

Data Version Control

Version control your data lake. Work with familiar operations—branch, commit, and merge—to track changes and collaborate confidently.

Instant Zero-Copy

Spin up development environments with your data in seconds. Create branches without duplicating data, saving both time and storage costs.

Safe and Sandboxed Experiments

Test and iterate freely on production data in safe, isolated branches, so you move faster while production stays intact.

Launch development environments in seconds without data duplication, saving time and storage.

Use Git-like workflows for your data lake—branch, checkout, and merge seamlessly.

Keep your production environment safe. Collaborate in fully isolated, sandboxed environments.
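
To make the zero-copy idea concrete, here is a minimal toy sketch in plain Python. It is not bauplan's implementation, and every name in it (`Catalog`, `write`, `create_branch`, `read`, `merge`) is hypothetical. It assumes a branch is just a set of pointers from table names to immutable snapshots, so creating a branch copies pointers rather than data.

```python
# Toy model of zero-copy branching. NOT bauplan's implementation;
# all class and method names here are hypothetical.

class Catalog:
    def __init__(self):
        # snapshot id -> immutable table contents (stored exactly once)
        self.snapshots = {}
        # branch name -> {table name -> snapshot id}
        self.branches = {"main": {}}
        self._next_id = 0

    def write(self, branch, table, rows):
        """Store rows as a new snapshot and point the branch's table at it."""
        snapshot_id = self._next_id
        self._next_id += 1
        self.snapshots[snapshot_id] = tuple(rows)
        self.branches[branch][table] = snapshot_id

    def create_branch(self, name, from_ref="main"):
        """Copy only the table -> snapshot pointers, never the data."""
        self.branches[name] = dict(self.branches[from_ref])

    def read(self, branch, table):
        return self.snapshots[self.branches[branch][table]]

    def merge(self, source, into="main"):
        """Naive merge: the source branch's pointers win."""
        self.branches[into].update(self.branches[source])


catalog = Catalog()
catalog.write("main", "orders", [1, 2, 3])

catalog.create_branch("dev_branch", from_ref="main")
snapshots_after_branch = len(catalog.snapshots)  # branching stored no new data

catalog.write("dev_branch", "orders", [1, 2, 3, 4])
main_before_merge = catalog.read("main", "orders")  # main is still untouched

catalog.merge("dev_branch", into="main")
main_after_merge = catalog.read("main", "orders")
```

In this sketch, writes on `dev_branch` create new snapshots and leave `main` pointing at the old one until the merge, which is the property that makes experiments on production data safe.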

import bauplan

client = bauplan.Client()

# Create a new branch
branch = client.create_branch(
   branch="dev_branch",
   from_ref="main"
)
print(f'Created branch "{branch.name}"')

# List tables in the branch
for table in client.get_tables(ref=branch):
   print(f"{table.name:<30}")

import bauplan

# Define Python env with package versions
@bauplan.model()
@bauplan.python(pip={'pandas': '2.2.0'})
def clean_data(
   # Input model reference
   data=bauplan.Model('my_data')
):
   import pandas as pd

   # Your data transformation logic here
   ...

   # Return the transformed data
   return data

Cloud development, simpler than local

Serverless Functions

Eliminate compute management. Run Python workloads seamlessly in the cloud with automatic scaling—no cluster configuration required.

Pure Python

Code in the language you already know. Build and test data applications directly in your IDE without learning specialized frameworks or DSLs.

No Infrastructure

Define environments with simple decorators and let Bauplan handle containers, dependencies, and resource management.

Run Python workloads in the cloud with automatic scaling—no cluster setup needed.

Build and test in your IDE without new frameworks or DSLs.

Focus on models—Bauplan handles containers, dependencies, and resources.
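
As a rough illustration of how a decorator can carry environment requirements alongside a function, here is a self-contained sketch in plain Python. The names `python_env` and `to_requirements` are hypothetical and are not part of the bauplan SDK; the sketch only shows the general pattern of attaching package pins to a function so that a platform could read them and build the matching environment.

```python
import functools


def python_env(pip=None):
    """Hypothetical decorator: record the packages a function needs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)
        # Attach the declared packages so tooling can inspect them later.
        wrapper.requirements = dict(pip or {})
        return wrapper
    return decorator


def to_requirements(func):
    """Render the recorded packages as pinned requirement strings."""
    return sorted(
        f"{name}=={version}" for name, version in func.requirements.items()
    )


@python_env(pip={"pandas": "2.2.0"})
def clean_data(rows):
    # Stand-in transformation: drop missing values.
    return [row for row in rows if row is not None]


requirements = to_requirements(clean_data)
cleaned = clean_data([1, None, 2])
```

The function body stays ordinary Python, while the environment lives in metadata next to it. That separation is what lets the code run unchanged in an IDE and in the cloud.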

Develop with no Ops

Code-Driven Data Automation

Robust deployment

Deploy with confidence. Merge validated changes into your data lake and integrate seamlessly with CI/CD pipelines.

Full reproducibility

Track and replicate pipelines deterministically. With one API call, see exactly which code produced which data, who ran it, and when.

Effortless Integration

Connect your data ecosystem with one simple SDK. Integrate with visualization and orchestration tools—no special connectors needed.

Deploy with confidence—integrate validated data into production seamlessly.

Prevent issues early with embedded data quality checks.

Connect to tools and platforms with a single line of code.
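
One way to picture an embedded data quality check is as a gate in front of the merge. The sketch below is plain Python under stated assumptions: `expect_no_nulls` and `merge_if_valid` are hypothetical helpers, not bauplan functions, and tables are modeled as lists of dictionaries.

```python
def expect_no_nulls(rows, column):
    """Return True when no row has a missing value in `column`."""
    return all(row.get(column) is not None for row in rows)


def merge_if_valid(branch_rows, main_rows, checks):
    """Hypothetical gate: publish branch rows only when every check passes."""
    failed = [name for name, check in checks.items() if not check(branch_rows)]
    if failed:
        # Leave main untouched and report which checks failed.
        return main_rows, failed
    return list(branch_rows), []


main = [{"id": 1, "amount": 10.0}]
good = main + [{"id": 2, "amount": 5.5}]
bad = main + [{"id": 3, "amount": None}]

checks = {"amount_not_null": lambda rows: expect_no_nulls(rows, "amount")}

published, failures = merge_if_valid(good, main, checks)
rejected, rejected_failures = merge_if_valid(bad, main, checks)
```

Because the check runs against the branch, a failing dataset never reaches the production table; it stays isolated until the problem is fixed.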

Automate your Workflows

import bauplan

client = bauplan.Client()

# placeholder names for this example: replace with your own
dev_branch = 'dev_branch'
table_name = 'my_table'

# create a zero-copy branch of your data lake
client.create_branch(dev_branch, from_ref='main')
# create an Iceberg table and import data into it
client.create_table(table_name, dev_branch)
client.import_data(table_name, dev_branch)
# run a pipeline end-to-end in a branch
client.run('./my_project_dir', dev_branch)
# merge the new tables into the main data lake
client.merge_branch(dev_branch, into_branch='main')

print('So Long, and Thanks for All the Fish')

FAQ

Don't see an answer to your question? Check our docs.

How do you keep my data secure?

What if I already have Databricks or Snowflake?

What does Bauplan replace in my AWS data stack?

Do I need to learn an entirely new data framework?

What does Git for data mean?

Try bauplan for free

Create your sandbox and start building.
