Welcome to our blog. Here we share insights and scientific research on handling the data stack.

Reference

Dec 20, 2024

Written by Ciro Greco

Full-stack recommender system with Bauplan for data preparation and training, and MongoDB Atlas for real-time inference.

Read more

Research

Dec 2, 2024

Written by Jacopo Tagliabue, Tyler Caraza-Harter and Ciro Greco

Paper presented at WoSC10 2024. In collaboration with the University of Wisconsin.

Read more

Engineering

Nov 22, 2024

Written by Nathan LeClaire and Ciro Greco

Making the experience of running data workflows in the cloud indistinguishable from running them locally.

Read more

Research

Nov 12, 2024

Written by Jacopo Tagliabue, Ryan Curtin, and Ciro Greco

Read more

Reference

Oct 21, 2024

Written by Jacopo Tagliabue and Chris White (CTO, Prefect)

A reference implementation of the Write-Audit-Publish (WAP) pattern with Bauplan and Prefect 3.0.

Read more

Engineering

Sep 18, 2024

Written by Ciro Greco

Find the right balance between cost control and fast startup time for your Spark clusters.

Read more

Research

Jun 9, 2024

Written by Jacopo Tagliabue and Ciro Greco

Paper presented at SIGMOD/PODS 2024. Awarded best paper at DEEM@SIGMOD.

Read more

Open Source

Apr 11, 2024

Written by Ciro Greco

An open-source implementation of WAP using Apache Iceberg, Lambdas, and Nessie, all running entirely in Python.

Read more

Engineering

Sep 6, 2023

Written by Ciro Greco and Jacopo Tagliabue

The greatest invention since sliced Virtual Machines.

Read more

Research

Aug 10, 2023

Written by Jacopo Tagliabue, Ciro Greco, and Luca Bigon

Read more

Open Source

Jun 4, 2023

Written by Ciro Greco

An open-source implementation of a Data Lake with DuckDB and AWS Lambdas.

Read more

Try bauplan
