Build the data layer for AI systems
Code-native platform for versioned pipelines on object storage with zero infrastructure management. Simple for developers, robust for systems.
Branch
Import
Run
Merge
import bauplan

client = bauplan.Client()

# Create a new branch
branch = client.create_branch(
    branch="dev_branch",
    from_ref="main"
)
print(f'Created branch "{branch.name}"')

# List tables in the branch
for table in client.get_tables(ref=branch):
    print(f"{table.namespace:<12} {table.name:<12} {table.kind}")
Create sandboxes instantly without data duplication
Zero copy data lake branches


Everything you need to build reliable data applications. Nothing you don’t.




Branch your Data
Like Git for your data systems
Git for data
Reproducibility, by design
Know exactly what code produced what data, when, and why with Git-style commits. Everything is versioned, traceable, and auditable by default.
Instant branching for dev and prod
Instant data branching
Create isolated branches in seconds—zero copy, zero wait. Power experiments and Write-Audit-Publish workflows in production.
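One way to picture why a zero-copy branch can be created in seconds is the toy model below. It is a sketch under stated assumptions, not Bauplan's implementation: the `Snapshot` and `Catalog` classes are hypothetical, and a branch is modeled as nothing more than a map from table names to immutable snapshots.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """An immutable version of a table (hypothetical stand-in for data files)."""
    rows: tuple


@dataclass
class Catalog:
    """Toy catalog: each branch maps table names to snapshots."""
    branches: dict = field(default_factory=lambda: {"main": {}})

    def create_branch(self, name: str, from_ref: str = "main") -> None:
        # Copy only the name -> snapshot pointers; the snapshots are shared.
        self.branches[name] = dict(self.branches[from_ref])

    def write(self, branch: str, table: str, rows) -> None:
        # A write creates a new snapshot and repoints this branch only.
        self.branches[branch][table] = Snapshot(tuple(rows))

    def merge(self, source: str, into: str = "main") -> None:
        # Publishing is a pointer update: the target adopts the source's snapshots.
        self.branches[into].update(self.branches[source])


catalog = Catalog()
catalog.write("main", "orders", [1, 2, 3])
catalog.create_branch("dev_branch", from_ref="main")

# Both branches reference the very same snapshot object: nothing was copied.
shared = catalog.branches["dev_branch"]["orders"] is catalog.branches["main"]["orders"]

# Writing on the dev branch leaves main untouched until the merge.
catalog.write("dev_branch", "orders", [1, 2, 3, 4])
main_before_merge = catalog.branches["main"]["orders"].rows
catalog.merge("dev_branch", into="main")
main_after_merge = catalog.branches["main"]["orders"].rows
```

In this sketch, branch creation costs one small dictionary copy regardless of table size, which is the intuition behind "zero copy, zero wait".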
Safe, composable data operations
Run, test, and validate changes in complete isolation. Merge confidently, automate quality gates, and revert at any time.
Launch development environments in seconds without data duplication, saving time and storage.
Use Git-like workflows for your data lake—branch, checkout, and merge seamlessly.
Keep your production environment safe. Collaborate in fully isolated, sandboxed environments.
import bauplan

@bauplan.model()
# Define Python env with package versions
@bauplan.python(pip={'pandas': '2.2.0'})
def clean_data(
    # Input model reference
    data=bauplan.Model('my_data')
):
    import pandas as pd

    # Your data transformation logic here
    ...
    # Return the transformed table; the input is returned unchanged as a placeholder
    return data
Build applications not platforms
No infrastructure
Serverless Functions
Bauplan handles the tasks that usually require a platform team—packaging, scaling, execution, and environment isolation—so your developers can focus on writing Python.
Just serverless functions
Write modular functions in plain Python or SQL. Bauplan handles execution, scaling, and table I/O automatically—no config files, containers, or DAGs.
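To see how plain functions can stand in for a hand-written DAG, consider the minimal sketch below. It is an illustration only, not Bauplan's engine: the `model` decorator, the `Ref` marker, and `run_pipeline` are hypothetical names, and tables are plain lists of dictionaries. The execution order is inferred from each function's parameter defaults.

```python
import inspect
from graphlib import TopologicalSorter

REGISTRY = {}  # model name -> function (hypothetical registry)


class Ref:
    """Marks a parameter as a reference to an upstream table or model."""

    def __init__(self, name: str):
        self.name = name


def model(fn):
    """Register a function as a pipeline step, keyed by its name."""
    REGISTRY[fn.__name__] = fn
    return fn


@model
def clean_orders(orders=Ref("raw_orders")):
    return [o for o in orders if o["amount"] > 0]


@model
def order_total(clean=Ref("clean_orders")):
    return [{"total": sum(o["amount"] for o in clean)}]


def run_pipeline(sources: dict) -> dict:
    """Infer dependencies from parameter defaults, then run in topological order."""
    deps = {}
    for name, fn in REGISTRY.items():
        params = inspect.signature(fn).parameters.values()
        deps[name] = {p.default.name for p in params if isinstance(p.default, Ref)}

    tables = dict(sources)
    for name in TopologicalSorter(deps).static_order():
        if name in REGISTRY:  # source tables have no function to run
            fn = REGISTRY[name]
            kwargs = {
                p.name: tables[p.default.name]
                for p in inspect.signature(fn).parameters.values()
                if isinstance(p.default, Ref)
            }
            tables[name] = fn(**kwargs)
    return tables


result = run_pipeline({"raw_orders": [{"amount": 5}, {"amount": -1}, {"amount": 7}]})
```

Because each step declares its inputs in its own signature, adding a step means writing one more function; nothing else needs to be wired up.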
One interface, zero setup
Build and run everything from your IDE with a simple SDK. No Dockerfiles, no local hacks, no divergence between dev and prod.
Run Python workloads in the cloud with automatic scaling—no cluster setup needed.
Build and test in your IDE without new frameworks or DSLs.
Focus on models—Bauplan handles containers, dependencies, and resources.
Develop with no Ops
The code-first platform for the AI era
Code-driven data automation
A programmable data platform
Write modular, testable functions for every step of your data workflow. Version everything—tables, logic, environments—just like software.
Ship with confidence from commit to CI/CD
From commit to CI/CD
Validate before merge. Every run is tied to a commit, so results are deterministic, traceable, and rollback-ready.
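Tying every run to a commit can be pictured with a content-derived identifier, as in the sketch below. This is an assumption-laden illustration rather than Bauplan's actual commit format: `commit_id`, `record_run`, and the `history` log are hypothetical, and the snapshot names are made up. The point is that identical code, inputs, and environment always yield the same identifier.

```python
import hashlib
import json


def commit_id(code: str, input_snapshots: dict, environment: dict) -> str:
    """Derive an id from everything that determines a run's output."""
    payload = json.dumps(
        {"code": code, "inputs": input_snapshots, "env": environment},
        sort_keys=True,  # stable key order so equal inputs hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


history = []  # append-only log of runs (hypothetical)


def record_run(code, input_snapshots, environment, output_table):
    """Log which commit produced which output table."""
    cid = commit_id(code, input_snapshots, environment)
    history.append({"commit": cid, "output": output_table})
    return cid


query = "SELECT * FROM orders WHERE amount > 0"
env = {"pandas": "2.2.0"}

first = record_run(query, {"orders": "snap-1"}, env, "clean_orders")
again = record_run(query, {"orders": "snap-1"}, env, "clean_orders")
changed = record_run(query, {"orders": "snap-2"}, env, "clean_orders")
```

Here `first` and `again` are equal, while `changed` differs because the input snapshot changed, so the log can answer which code and data produced a given table.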
Built for developers. Ready for agents
For developers and agents
Stop deploying notebooks with hidden state. Use typed APIs and versioned logic that are easy to script or automate, whether by a team or by a model.
Deploy with confidence—integrate validated data into production seamlessly.
Prevent issues early with embedded data quality checks.
Connect to tools and platforms with a single line of code.
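The embedded data quality checks mentioned above can be sketched as a gate that publishes data only when every expectation passes. The helper names (`no_nulls`, `unique`, `publish_if_valid`) are hypothetical and operate on plain lists of dictionaries; a real pipeline would run equivalent checks against tables on a branch before merging.

```python
def no_nulls(rows, column):
    """Expectation: every row has a non-null value in `column`."""
    return all(row.get(column) is not None for row in rows)


def unique(rows, column):
    """Expectation: values in `column` are not repeated."""
    values = [row[column] for row in rows]
    return len(values) == len(set(values))


def publish_if_valid(rows, checks, publish):
    """Run every check; call `publish` only when all of them pass."""
    failures = [name for name, check in checks.items() if not check(rows)]
    if not failures:
        publish(rows)
    return failures


published = []  # stands in for the production table
checks = {
    "id_not_null": lambda rows: no_nulls(rows, "id"),
    "id_unique": lambda rows: unique(rows, "id"),
}

good = [{"id": 1}, {"id": 2}]
bad = [{"id": 1}, {"id": 1}, {"id": None}]

good_failures = publish_if_valid(good, checks, published.extend)
bad_failures = publish_if_valid(bad, checks, published.extend)
```

Only the `good` batch reaches `published`; the `bad` batch is held back and the names of the failed checks are returned for inspection.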
Automate your Workflows
import bauplan

client = bauplan.Client()

# placeholder names; replace with your own branch and table
dev_branch = 'dev_branch'
table_name = 'my_table'

# create a dev data branch
client.create_branch(dev_branch, from_ref='main')

# import data into tables
client.import_data(table_name, dev_branch)

# run a pipeline in dev
client.run('./my_project_dir', dev_branch)

# merge the new tables into the main data lake
client.merge_branch(dev_branch, into_branch='main')
Python, Go, serverless, data lakes, Iceberg and, more than anything, superb DevEx.
Hello Bauplan
Bauplan is a serverless data platform that treats pipelines, models, and tables like software.
Ciro Greco
Announcing Bauplan's seed round
Announcing $7.5M seed round led by Innovation Endeavors
Bauplan Team
Mediaset: EMR+Airflow → Bauplan+Temporal
Faster feedback, less infra: near real-time analytics at EU broadcaster
Jacopo Tagliabue and Marco Reni
RAG application with Bauplan and Pinecone
End-to-end RAG system for a conversational service agent
Ciro Greco
Data as software and AI for the 99%
Build simple, robust data apps with software engineering principles.
Ciro Greco




FAQ
Don't see an answer to your question? Check our docs.
How do you keep my data secure?
What if I already have Databricks or Snowflake?
What does Bauplan replace in my AWS data stack?
Do I need to learn an entirely new data framework?
What does Git for Data mean?