Serverless data platform.
Strictly for developers.
Build AI and data applications with serverless Python functions and data branches. Turn weeks of infrastructure into a few lines of code.
No credit card required.
Branch
Import
Run
Merge
    import bauplan

    client = bauplan.Client()

    # Create a new branch
    branch = client.create_branch(
        branch="dev_branch",
        from_ref="main"
    )
    print(f'Created branch "{branch.name}"')

    # List tables in the branch
    for table in client.get_tables(ref=branch):
        print(f"{table.namespace:<12} {table.name:<12} {table.kind}")
Create sandboxes instantly without data duplication

Built with Bauplan
See what you can build when infrastructure becomes Python code.
Git for Data
Data Version Control
Version control your data lake. Work with familiar operations—branch, commit, and merge—to track changes and collaborate confidently.
Instant Zero-Copy
Spin up development environments with your data in seconds. Create branches without duplicating data, saving both time and storage costs.
Safe and Sandboxed Experiments
Test and iterate freely on production data in safe, isolated branches, so you can move faster while preserving the integrity of production.
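To make the zero-copy idea concrete, here is a minimal pure-Python sketch. It is an illustration under simplifying assumptions, not Bauplan's implementation: the `Catalog` and `Snapshot` classes and the file names are hypothetical. It assumes a branch is just a set of pointers to immutable snapshots, so creating one copies references rather than data files.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """An immutable set of data files backing one version of a table."""
    files: tuple


@dataclass
class Catalog:
    """Toy catalog (hypothetical): each branch maps table names to snapshots."""
    branches: dict = field(default_factory=lambda: {"main": {}})

    def create_branch(self, name, from_ref="main"):
        # Copy only the table -> snapshot pointers, never the files themselves.
        self.branches[name] = dict(self.branches[from_ref])

    def write(self, branch, table, files):
        # A write creates a new snapshot that is visible only on this branch.
        self.branches[branch][table] = Snapshot(tuple(files))

    def merge(self, source, into="main"):
        # Naive merge: the target branch adopts the source branch's pointers.
        self.branches[into].update(self.branches[source])


catalog = Catalog()
catalog.write("main", "orders", ["orders-0001.parquet"])
catalog.create_branch("dev_branch", from_ref="main")

# Both branches point at the very same snapshot object: nothing was copied.
shared = catalog.branches["dev_branch"]["orders"] is catalog.branches["main"]["orders"]

# Changes on the branch leave main untouched until the merge.
catalog.write("dev_branch", "orders", ["orders-0001.parquet", "orders-0002.parquet"])
main_before_merge = catalog.branches["main"]["orders"].files
catalog.merge("dev_branch", into="main")
main_after_merge = catalog.branches["main"]["orders"].files
```

In this toy model a branch costs one dictionary copy regardless of table size, which is why sandboxes can be created in seconds. A real catalog would also need conflict detection at merge time, which this sketch leaves out.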
    import bauplan

    client = bauplan.Client()

    # Create a new branch
    branch = client.create_branch(
        branch="dev_branch",
        from_ref="main"
    )
    print(f'Created branch "{branch.name}"')

    # List tables in the branch
    for table in client.get_tables(ref=branch):
        print(f"{table.name:<30}")
    import bauplan

    @bauplan.model()
    # Define Python env with package versions
    @bauplan.python(pip={'pandas': '2.2.0'})
    def clean_data(
        # Input model reference
        data=bauplan.Model('my_data')
    ):
        import pandas as pd

        # Your data transformation logic here
        ...
        # Return the transformed table (the input, unchanged, as a placeholder)
        return data
Cloud development, simpler than local
Serverless Functions
Eliminate compute management. Run Python workloads seamlessly in the cloud with automatic scaling—no cluster configuration required.
Pure Python
Code in the language you already know. Build and test data applications directly in your IDE without learning specialized frameworks or DSLs.
No Infrastructure
Define environments with simple decorators and let Bauplan handle containers, dependencies, and resource management.
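As a rough illustration of how a decorator can carry an environment definition, the sketch below attaches package pins to a function as metadata. The `python_env` decorator is a hypothetical stand-in written for this example, not Bauplan's `@bauplan.python`. It assumes a platform could read that metadata later to build a matching container.

```python
import functools


def python_env(pip=None):
    """Hypothetical decorator: record the packages a function needs."""
    requirements = dict(pip or {})

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)

        # A platform could read this metadata to build a matching container.
        wrapper.requirements = requirements
        return wrapper

    return decorator


@python_env(pip={"pandas": "2.2.0"})
def clean_data(rows):
    # Stand-in transformation: drop rows that contain missing values.
    return [row for row in rows if None not in row.values()]


# Render the recorded environment as pip-style pins.
pins = [f"{name}=={version}" for name, version in clean_data.requirements.items()]
cleaned = clean_data([{"id": 1, "qty": 2}, {"id": 2, "qty": None}])
```

Because the requirements live next to the function, the code and its environment are versioned together instead of drifting apart in a separate Dockerfile or requirements file.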
Code-Driven Data Automation
Robust deployment
Deploy with confidence. Merge validated changes into your data lake and integrate seamlessly with CI/CD pipelines.
Full reproducibility
Track and replicate pipelines deterministically. Know exactly which code produced which data, by whom, and when, all in one API call.
Effortless Integration
Connect your data ecosystem with one simple SDK. Integrate with visualization and orchestration tools—no special connectors needed.
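The reproducibility claim above (which code produced which data, by whom, and when) can be pictured as one audit record per run. The sketch below is an assumption-laden illustration: `RunRecord` and `record_run` are hypothetical names, and it assumes runs are identified by hashing the pipeline source. It does not describe how Bauplan stores this information.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical audit entry linking code, data version, author, and time."""
    code_hash: str
    data_ref: str
    author: str
    started_at: datetime


def record_run(source_code, data_ref, author):
    # Hash the pipeline source so identical code always maps to the same id.
    code_hash = hashlib.sha256(source_code.encode("utf-8")).hexdigest()
    return RunRecord(code_hash, data_ref, author, datetime.now(timezone.utc))


pipeline_source = "def clean_data(rows):\n    return rows\n"
first = record_run(pipeline_source, data_ref="dev_branch", author="alice")
second = record_run(pipeline_source, data_ref="dev_branch", author="alice")
edited = record_run(pipeline_source + "# tweak\n", data_ref="dev_branch", author="alice")
```

Two runs of the same source share a `code_hash`, while any edit produces a different one, so a record like this is enough to tell whether two outputs came from the same code.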
    import bauplan

    client = bauplan.Client()

    # Placeholder names for illustration; substitute your own
    dev_branch = 'dev_branch'
    table_name = 'my_table'

    # create a zero-copy branch of your data lake
    client.create_branch(dev_branch, from_ref='main')

    # create an Iceberg table and import data into it
    client.create_table(table_name, dev_branch)
    client.import_data(table_name, dev_branch)

    # run a pipeline end-to-end in a branch
    client.run('./my_project_dir', dev_branch)

    # merge the new tables into the main data lake
    client.merge_branch(dev_branch, into_branch='main')
    print('So Long, and Thanks for All the Fish')
Latest from Our Blog
Python, Go, serverless, data lakes, Iceberg, and, above all, a superb developer experience.

RAG application with Bauplan and Pinecone
End-to-end RAG system for a conversational service agent
Ciro Greco

Data as software and AI for the 99%
Build simple, robust data apps with software engineering principles.
Ciro Greco

Optimizing Cloud OLAP with DuckDB & Iceberg
Building a Serverless Lakehouse: Lessons from Spare Parts
Nathan LeClaire

Recommender systems with MongoDB
Full-stack recommender for training & MongoDB Atlas for real-time inference.
Ciro Greco
FAQ
Don't see an answer to your question? Check our docs.
How do you keep my data secure?
What if I already have Databricks or Snowflake?
What does Bauplan replace in my AWS data stack?
Do I need to learn an entirely new data framework?
What does Git for data mean?