From BI to AI: Why Your Lakehouse Needs Both Lance and Iceberg

To Nha Notes | Dec. 22, 2025, 4:58 p.m.

As AI workloads become mainstream, the traditional lakehouse architecture, built primarily for BI and analytics, is being stretched to its limits. A new pattern is emerging: dual-format lakehouses that use both Apache Iceberg and Lance.

The Modern Lakehouse Challenge

Apache Iceberg has been the backbone of analytics lakehouses, providing transactional guarantees and schema evolution at scale. But AI/ML workloads bring different requirements: vector embeddings, multimodal data (images, audio, video), and fundamentally different access patterns.

This is where Lance enters the picture—a columnar format purpose-built for AI/ML workloads.

Key Differences

Iceberg excels at:

  • Traditional OLAP analytics with partition-based optimization
  • Mature ecosystem with deep compute engine integrations
  • Centralized observability through catalog-aware operations

Lance shines for:

  • Fast random access to individual rows (the Lance project reports 100x faster than Parquet), which AI workloads lean on heavily
  • Native multimodal data storage (no external blob references needed)
  • Zero-cost schema evolution—adding columns doesn't require full table rewrites
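
To make the last two Lance points a bit more concrete, here is a minimal toy model in plain Python. It is not Lance's API or on-disk format; `ToyColumnarTable`, `add_column`, `take`, and `write_log` are hypothetical names invented for this sketch. It only illustrates the general idea: if each column is stored separately, adding a column means writing one new piece of data rather than rewriting the table, and rows can be fetched by position while touching only the columns you ask for.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumnarTable:
    """Toy model only: each column lives in its own immutable 'file' (a tuple)."""

    num_rows: int
    files: dict = field(default_factory=dict)      # column name -> stored values
    write_log: list = field(default_factory=list)  # names of files written, in order

    def add_column(self, name, values):
        """Add a column by writing one new file; existing files are untouched."""
        values = tuple(values)
        if name in self.files:
            raise ValueError(f"column {name!r} already exists")
        if len(values) != self.num_rows:
            raise ValueError("new column must have one value per existing row")
        self.files[name] = values
        self.write_log.append(name)

    def take(self, indices, columns):
        """Random access: fetch arbitrary rows by position, reading only `columns`."""
        return [{c: self.files[c][i] for c in columns} for i in indices]


table = ToyColumnarTable(num_rows=3)
table.add_column("id", [10, 11, 12])
table.add_column("caption", ["cat", "dog", "bird"])
caption_file = table.files["caption"]  # keep a handle to check it is not rewritten

# Later, an ML job backfills embeddings as a brand-new column.
table.add_column("embedding", [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

# Fetch two rows out of order, reading only the columns we need.
rows = table.take([2, 0], columns=["id", "embedding"])
print(table.write_log)
print(rows)
```

The write log grows by exactly one entry when the embedding column is added, and the existing caption data is the same object as before. A real format has to handle fragments, versions, and deletes on top of this, so treat the sketch as intuition rather than a description of how Lance is implemented.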

The Unified Approach

Companies like Netflix are now adopting both formats: Iceberg for BI workloads, Lance for AI and multimodal data. This dual-format strategy lets organizations leverage the strengths of each without compromise.

The key insight? It's not about choosing one over the other—it's about using the right tool for each workload while maintaining interoperability at the compute layer.
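
As a rough sketch of what "the right tool for each workload" might look like in practice, the snippet below routes a workload type to a table format. Everything here is an assumption made for illustration: the `Workload` categories, the `FORMAT_FOR_WORKLOAD` mapping, and the `choose_format` helper are hypothetical and simply restate the strengths listed above, not a policy from Netflix or from either project.

```python
from enum import Enum


class Workload(Enum):
    """Hypothetical workload categories, taken from the lists in this post."""

    BI_REPORTING = "bi_reporting"
    OLAP_AGGREGATION = "olap_aggregation"
    VECTOR_SEARCH = "vector_search"
    MULTIMODAL_TRAINING = "multimodal_training"


# Assumed routing table: BI/OLAP goes to Iceberg, AI/multimodal goes to Lance.
FORMAT_FOR_WORKLOAD = {
    Workload.BI_REPORTING: "iceberg",
    Workload.OLAP_AGGREGATION: "iceberg",
    Workload.VECTOR_SEARCH: "lance",
    Workload.MULTIMODAL_TRAINING: "lance",
}


def choose_format(workload: Workload) -> str:
    """Return the table format this sketch assigns to a workload."""
    return FORMAT_FOR_WORKLOAD[workload]


for w in Workload:
    print(f"{w.value}: {choose_format(w)}")
```

In a real deployment this decision would more likely live in catalog metadata or pipeline configuration than in application code, but the shape is the same: one lakehouse, two formats, chosen per workload.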


Want to dive deeper? Read the full technical analysis: From BI to AI: A Modern Lakehouse Stack with Lance and Iceberg