Audience · updated 2026-05-17

Apache Iceberg vs Delta Lake vs Apache Hudi in 2026: the user audiences compared

The lakehouse audience exists because warehouse economics broke at the high end. Companies whose Snowflake bills crossed a threshold, whose Databricks compute costs outgrew the simpler use cases, or whose data volume made closed formats untenable started routing the harder workloads through open table formats on object storage. Apache Iceberg, Delta Lake, and Apache Hudi each found a different slice of that audience. Vendors of compute engines, catalog services, and metadata management tools reach the lakehouse audience through Sponsored Challenges scoped to the operational problem that the buyer's chosen table format is meant to solve.

By DataDriven Partners Editorial · Researched against open-source project surfaces and observed buyer patterns · Last reviewed 2026-05-17 · 13 min read

Frequently asked

Which open table format is dominant in 2026?

Apache Iceberg. Cross-vendor support from Snowflake, BigQuery, Databricks, Dremio, Starburst, and most catalog services. Delta Lake remains dominant inside the Databricks ecosystem; Apache Hudi has the strongest merge-on-read support but smaller cross-vendor adoption.

How should I scope a Sponsored Challenge for lakehouse practitioners?

Scope to an operational problem specific to the open table format the customer runs. Examples include Iceberg partition evolution, Delta Lake protocol upgrades, Hudi merge-on-read semantics, and cross-engine catalog interoperability. The vendor partners with DataDriven Partners editorial to scope the problem against the format the vendor wants to reach.

Should I support Iceberg, Delta, and Hudi all equally?

Probably not in your first product version. Pick the format your customers actually use (typically Iceberg in 2026), implement deeply, add others when there is real demand. The Sponsored Challenge scope can be per format across multiple placement quarters.

Why does catalog choice matter so much?

The catalog service holds the metadata that determines which compute engines can read which tables. Catalog choice gates interoperability across engines, governance policies, and metadata portability. Vendors who skip catalog support in their marketing lose evaluations.

What is Subsurface?

The annual lakehouse-specific conference run by Dremio. Largest in-person lakehouse practitioner event in 2026. Tiered sponsorship and editorial speaking slots; reinforces the Sponsored Challenge running concurrently.

How does this audience differ from warehouse practitioners?

Lakehouse practitioners chose open table formats on object storage over closed warehouses. The choice is architectural; cost, vendor lock-in avoidance, data volume, or interoperability requirements drove it. Many practitioners run hybrid architectures (warehouse for easy cases, lakehouse for hard ones).

What about the Tabular acquisition by Databricks?

Tabular's acquisition by Databricks in 2024 changed the Iceberg ecosystem dynamics. Databricks now has both Delta Lake and a significant Iceberg position; the practical effect is that Databricks lakehouse marketing increasingly covers both formats. Plan for cross-format expectations.

How do I reach the Hudi subaudience?

The Hudi audience is smaller but technically concentrated. The Apache Hudi mailing list and the Onehouse community channels are the venues. Sponsored Challenges scoped to merge-on-read semantics or incremental-table patterns reach the audience inside their evaluation frame.

What evaluation criteria does the lakehouse audience apply?

Specification fidelity (does this implement the spec correctly), catalog interoperability (does this work with our catalog), honest open-versus-closed positioning (vendors who add quiet lock-in get caught), operational maturity at scale, and integration breadth across the heterogeneous lakehouse stack.

How long are lakehouse tool evaluation cycles?

Longer than warehouse-adjacent tool evaluations because the audience is making an architectural decision that affects multiple downstream systems. Six-month cycles are typical at Series B and later companies; cycles can extend longer at enterprises.

The three open table formats and who chose each

Apache Iceberg has become the dominant open table format in 2026 with cross-vendor support from Snowflake, BigQuery, Databricks, Dremio, Starburst, and most catalog services. The audience that chose Iceberg typically did so because they wanted cross-engine interoperability and vendor neutrality; the project originated at Netflix and is governed by Apache. The Iceberg user subaudience is the largest slice of the broader lakehouse audience in 2026.

Delta Lake remains dominant inside the Databricks ecosystem and has strong Linux Foundation governance. The audience that chose Delta typically did so because they were already running Databricks at significant scale; Delta Lake's tight integration with the Databricks platform was the deciding factor. Cross-vendor Delta support is thinner than Iceberg's; the audience is overwhelmingly Databricks-flavored.

Apache Hudi has the strongest support for merge-on-read and incremental-table semantics but smaller cross-vendor adoption than Iceberg or Delta. The audience that chose Hudi typically did so because they have specific incremental-table requirements (CDC ingestion, upsert-heavy workloads) that the other formats handle awkwardly. Onehouse is the commercial entity around Hudi; the community is smaller but technically concentrated.
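For readers less familiar with why upsert-heavy workloads favor merge-on-read, the idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Hudi's implementation or API: the class name `MergeOnReadTable` and its methods are hypothetical, and real tables store base files and log files on object storage instead of in-memory dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class MergeOnReadTable:
    """Toy model: a base of records plus an append-only log of changes."""
    base: dict = field(default_factory=dict)   # key -> row, as of last compaction
    log: list = field(default_factory=list)    # (op, key, row) entries, oldest first

    def upsert(self, key, row):
        # Writes are cheap: append to the log, never rewrite the base.
        self.log.append(("upsert", key, row))

    def delete(self, key):
        self.log.append(("delete", key, None))

    def read(self):
        # Reads pay the merge cost: replay the log over a copy of the base.
        merged = dict(self.base)
        for op, key, row in self.log:
            if op == "upsert":
                merged[key] = row
            else:
                merged.pop(key, None)
        return merged

    def compact(self):
        # Compaction folds the log into a new base so later reads are cheap again.
        self.base = self.read()
        self.log = []


table = MergeOnReadTable(base={1: {"status": "new"}, 2: {"status": "new"}})
table.upsert(1, {"status": "shipped"})   # CDC-style update
table.upsert(3, {"status": "new"})       # CDC-style insert
table.delete(2)                          # CDC-style delete
snapshot = table.read()
```

The trade-off the sketch makes visible is the one the Hudi subaudience cares about: write cost stays proportional to the change, while read cost grows with the log until compaction runs.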

Why specification fidelity is the evaluation gate

The lakehouse audience reads upstream project documentation in surprising depth. The Iceberg spec, the Delta Lake protocol, and the Hudi specification are all reference material the audience returns to. Vendors whose products integrate with these formats are evaluated on specification fidelity, not on feature lists alone. A vendor that claims Iceberg support but implements snapshot isolation incorrectly gets caught on the first proof-of-concept; the audience tests against the spec the day they sign up.
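The kind of check a proof-of-concept runs can be illustrated with a toy model of snapshot isolation under optimistic concurrency. The sketch below is an assumption-laden simplification, not any format's actual commit protocol: `ToyTable` and `CommitConflict` are hypothetical names, snapshots are in-memory sets of file names, and any stale commit fails outright, whereas real implementations typically refresh metadata and retry.

```python
class CommitConflict(Exception):
    """Raised when a writer's base snapshot is no longer the table's current one."""


class ToyTable:
    def __init__(self):
        self.snapshots = [frozenset()]   # snapshot id == index in this list

    @property
    def current_id(self):
        return len(self.snapshots) - 1

    def read(self, snapshot_id):
        # Readers pin a snapshot id and never see later commits.
        return self.snapshots[snapshot_id]

    def commit(self, base_id, added_files):
        # Compare-and-swap: succeed only if nobody committed since base_id.
        if base_id != self.current_id:
            raise CommitConflict(f"base {base_id} is stale; current is {self.current_id}")
        self.snapshots.append(self.snapshots[base_id] | frozenset(added_files))
        return self.current_id


table = ToyTable()
reader_view = table.current_id            # a long-running reader pins snapshot 0
writer_a = writer_b = table.current_id    # two writers start from the same base

table.commit(writer_a, {"a.parquet"})     # first writer wins
try:
    table.commit(writer_b, {"b.parquet"})
    conflict_detected = False
except CommitConflict:
    conflict_detected = True              # second writer must refresh and retry

retry_id = table.commit(table.current_id, {"b.parquet"})
```

An integration that let the second commit silently overwrite the first, or let the pinned reader observe `a.parquet`, would fail exactly the test this audience runs first.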

The implication for vendor marketing is that overstated spec coverage is corrosive. Vendors who claim full Iceberg support on top of a shallow implementation lose evaluations and lose trust durably. Vendors who scope honestly (deep Iceberg support, Delta Lake support next, Hudi support later if there is real demand) earn the durable trust this audience reserves for engineering-honest vendors.

The catalog question, which is bigger than vendors expect

The catalog service holds the metadata that determines which query engines can read which tables. In 2026 the main catalog options are Tabular (Iceberg-focused, now part of Databricks following the 2024 acquisition), Snowflake Polaris, Databricks Unity Catalog, and the open-source Apache Iceberg REST catalog spec that multiple vendors implement.

The choice is consequential. Catalog choice determines what compute engines can interoperate, what governance policies apply, and what metadata travels with the data. Vendors of compute or query tools need to address catalog support explicitly; vendors of catalog services need to interoperate with the engines the audience already runs. Vendors who pretend the catalog question is a minor implementation detail get caught in evaluation; the audience knows the catalog is the linchpin and asks the catalog question first.
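Why the catalog gates interoperability can be made concrete with a small lookup. The sketch below is illustrative only: the engine names, protocol labels, and table names are hypothetical placeholders and describe no real product's support matrix.

```python
# Hypothetical inventory: which catalog protocols each engine can speak.
ENGINE_CATALOG_SUPPORT = {
    "engine_a": {"rest", "hive"},
    "engine_b": {"rest"},
    "engine_c": {"proprietary_x"},
}

# Hypothetical registry: which catalog protocol exposes each table.
TABLE_CATALOG = {
    "sales.orders": "rest",
    "ml.features": "proprietary_x",
}


def engines_for(table, table_catalog=TABLE_CATALOG, support=ENGINE_CATALOG_SUPPORT):
    """Return the sorted engines that can reach `table` through its catalog."""
    protocol = table_catalog[table]
    return sorted(e for e, protocols in support.items() if protocol in protocols)


def interop_gaps(table_catalog=TABLE_CATALOG, support=ENGINE_CATALOG_SUPPORT):
    """Map each table to the sorted engines that cannot read it."""
    return {
        t: sorted(set(support) - set(engines_for(t, table_catalog, support)))
        for t in table_catalog
    }


readers = engines_for("sales.orders")
gaps = interop_gaps()
```

The data files for both tables could sit in the same bucket in the same format; in this model only the catalog entry decides which engines can find them, which is why the audience asks the catalog question first.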

Why a Sponsored Challenge reaches the lakehouse audience cleanly

A Sponsored Challenge on DataDriven.io scoped to an open-table-format operational problem reaches the lakehouse audience inside the architectural evaluation frame. Partition evolution under concurrent writes against an Iceberg dataset. Snapshot isolation testing across multiple writers. Schema evolution semantics validation. Catalog interoperability across query engines. Each of these problem shapes is something a lakehouse engineer evaluates real products against; the placement reaches them in the same mode they apply at work.
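To give a feel for one of these problem shapes, here is a toy sketch of partition evolution with transform-derived partitions. It is a simplified model built on assumptions, not Iceberg's API: the `DataFile` class, the `TRANSFORMS` table, and the file names are hypothetical, and concurrency is left out entirely. The point it illustrates is that files written before a spec change keep their old layout, so scan planning has to evaluate each file against the spec it was written under.

```python
from dataclasses import dataclass
from datetime import date


# Partition transforms derive the partition value from a column, so queries
# filter on the column itself (the idea behind hidden partitioning).
TRANSFORMS = {
    "month": lambda d: (d.year, d.month),
    "day": lambda d: (d.year, d.month, d.day),
}


@dataclass(frozen=True)
class DataFile:
    path: str
    spec: str          # transform the file was written under
    partition: tuple   # transform(event_date) shared by every row in the file


def write(path, event_date, spec):
    return DataFile(path, spec, TRANSFORMS[spec](event_date))


def plan_scan(files, wanted):
    """Keep files whose partition could contain rows for the date `wanted`."""
    return [f.path for f in files if TRANSFORMS[f.spec](wanted) == f.partition]


files = [
    write("old-2026-04.parquet", date(2026, 4, 10), "month"),   # before evolution
    write("old-2026-05.parquet", date(2026, 5, 2), "month"),
    # The spec evolves from month to day; existing files are not rewritten.
    write("new-2026-05-16.parquet", date(2026, 5, 16), "day"),
    write("new-2026-05-17.parquet", date(2026, 5, 17), "day"),
]

selected = plan_scan(files, date(2026, 5, 17))
```

A planner that applied only the newest spec would either skip the month-partitioned May file or fail on it; a challenge built around this shape tests whether a product gets the mixed-spec case right.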

The mechanics: the engineer browses the challenge catalog, selects a problem on partition evolution against an Iceberg dataset, attempts the solution for twenty to forty minutes, and clicks through the UTM-tagged closing CTA to the vendor's documentation on the technique. The engineer leaves with a working operational mental model of the vendor's product against the open table format they actually use; the placement reaches the lakehouse audience through architectural depth, not through marketing copy.
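The UTM-tagged closing CTA in that flow is an ordinary URL with campaign parameters appended. A minimal sketch using Python's standard library follows; the documentation URL, the parameter values, and the helper name `utm_tag` are all hypothetical and only illustrate the mechanism.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def utm_tag(url, source, medium, campaign, content=None):
    """Append UTM parameters to `url`, preserving any existing query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query.update({"utm_source": source, "utm_medium": medium, "utm_campaign": campaign})
    if content:
        query["utm_content"] = content
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical CTA target and campaign naming.
cta_url = utm_tag(
    "https://docs.example.com/iceberg/partition-evolution?lang=en",
    source="datadriven",
    medium="sponsored_challenge",
    campaign="iceberg_partition_evolution_2026q2",
)
```

Keeping the format name and the problem shape in the campaign value lets later attribution reports separate, say, an Iceberg placement from a Delta Lake one.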

Lakehouse vocabulary

The terms that come up in lakehouse-targeted placement scoping.

Lakehouse: An architecture that puts data in open table formats on object storage and routes different workloads to different compute engines. Distinct from a data warehouse (vendor-managed storage and compute, closed format) and a data lake (object storage, often raw files without table semantics).
Open table format: A specification for organizing files in object storage so they behave like tables (snapshot isolation, schema evolution, partition pruning). Apache Iceberg, Delta Lake, and Apache Hudi are the three primary options in 2026.
Apache Iceberg: The dominant open table format in 2026. Originated at Netflix, governed by Apache. Cross-vendor support from Snowflake, BigQuery, Databricks, Dremio, Trino, Starburst, and most catalog services.
Delta Lake: Databricks-origin open table format, now governed by the Linux Foundation. Strongest support inside the Databricks ecosystem; cross-vendor support thinner than Iceberg's.
Apache Hudi: Uber-origin open table format with the strongest merge-on-read and incremental table semantics. Smaller cross-vendor adoption than Iceberg or Delta but distinct technical advantages for specific workloads.
Catalog service: The metadata layer that tracks which tables exist, where their files live, and what schemas they have. Determines compute-engine interoperability. Tabular, Snowflake Polaris, Databricks Unity Catalog, and Apache Iceberg REST catalog are the main options.
Sponsored Challenge scoped to lakehouse format: A placement on DataDriven.io scoped to the open-table-format problems the audience evaluates real products against. Partition evolution, schema evolution, catalog interoperability, snapshot isolation. Reaches the lakehouse audience inside the architectural evaluation frame.

What this page documents

Apache Iceberg has become the dominant open table format for lakehouse workloads in 2026, with Delta Lake (Databricks-aligned) and Apache Hudi (Uber-originated, smaller) holding meaningful shares. Snowflake, BigQuery, Databricks, Dremio, and Trino all support Iceberg natively as of 2026.

Apache Iceberg project, vendor public positioning · 2026-05 · Open-source project momentum

The lakehouse audience evaluates vendors on specification fidelity (does this implement the Iceberg or Delta protocol correctly), catalog interoperability (does this work with the catalog we run), and honest open-versus-closed positioning (vendors who quietly add lock-in get caught fast).

Industry pattern; audience evaluation framing · 2026-05 · Evaluation-criteria scoping

A Sponsored Challenge on DataDriven.io scoped to an open-table-format problem (partition evolution under concurrent writes, schema evolution semantics, catalog interoperability across query engines) reaches the lakehouse audience inside the architectural evaluation frame.

DataDriven Partners placement scoping · 2026-05 · Placement-audience alignment framing

Catalog services (Tabular, Snowflake Polaris, Databricks Unity Catalog, Apache Iceberg REST catalog) have become a central decision point. Catalog choice determines metadata interoperability across query engines; vendors of compute or query tools must address catalog support explicitly.

Industry consensus on lakehouse architecture · 2026-05 · Architectural decision framing

Subsurface (Dremio's lakehouse conference) is the largest lakehouse-specific in-person event in 2026. Databricks Data + AI Summit covers the Delta Lake slice. Vendor-run Slacks (Tabular, Onehouse, Dremio, Starburst) cover the daily engagement layer.

Public conference and community surfaces · 2026-05 · Venue scope cross-reference

How vendor scope should match format scope

The lakehouse market consolidation around Iceberg, Delta, and Hudi means vendors of compute, catalog, query, and metadata tools need to scope their support honestly. A vendor that claims to support all three formats but only one is production-grade gets caught fast; the audience tests integrations on day one. The honest scope: pick the format the customer base runs (typically Iceberg in 2026), implement deeply, add the others when there is real demand. The Sponsored Challenge scope follows: a vendor with deep Iceberg support scopes the placement to an Iceberg problem; a vendor with deep Delta Lake support scopes to a Delta problem.

The Tabular acquisition by Databricks in 2024 changed the Iceberg ecosystem dynamics meaningfully. Databricks now has both Delta Lake and a significant Iceberg position; the practical effect is that Databricks lakehouse marketing increasingly covers both formats. Vendors targeting Databricks-ecosystem lakehouse practitioners should plan for cross-format integration, because the audience increasingly expects both.

Catalog support in Sponsored Challenge scoping

The catalog question surfaces directly in Sponsored Challenge scoping. A challenge on Iceberg partition evolution implicitly assumes a catalog; the dataset has to live somewhere; the catalog determines what query engines can read it. Vendors scoping a placement around their product's catalog support are scoping exactly the question the audience asks first during evaluation. The placement is the audience's first hands-on experience with the vendor's catalog integration; the closing CTA points to documentation on the catalog story.

Vendors of catalog services have a particularly clean Sponsored Challenge story. The placement can be scoped to a catalog interoperability problem across multiple query engines; the engineer attempting the challenge experiences the catalog's cross-engine behavior directly. Vendors of compute engines have the inverse story: the placement can be scoped to a workload that depends on catalog metadata, with the engineer experiencing the compute engine's catalog integration through the challenge.

The three open table formats compared for vendor placement scoping

How vendor positioning should differ by subaudience.

Format	Origin and governance	Sponsored Challenge problem shapes that fit	Where the subaudience lives
Apache Iceberg	Netflix-origin, Apache-governed, cross-vendor	Partition evolution, snapshot isolation, hidden partitioning	Apache Iceberg mailing list, Tabular Slack, Subsurface
Delta Lake	Databricks-origin, Linux Foundation-governed	Delta protocol upgrades, multi-writer behavior	Databricks community, Databricks Summit
Apache Hudi	Uber-origin, Apache-governed	Merge-on-read semantics, incremental table patterns	Apache Hudi mailing list, Onehouse community

The same vendor running lakehouse-adjacent tools can scope different Sponsored Challenges to different subaudiences across multiple placement quarters. The placement scope matches the format the customer runs.

One specific situation: a Series A query engine vendor's lakehouse playbook

A Series A query engine vendor targeting lakehouse workloads has a clean playbook. Year one focus: ship deep Apache Iceberg support (full spec compliance, not partial); add Delta Lake support next; leave Hudi support for year two if there is real demand. Scope a Sponsored Challenge to an Iceberg-specific problem (partition evolution under concurrent writes against a realistic dataset, snapshot isolation testing, hidden-partitioning correctness). Participate substantively on the Apache Iceberg user mailing list with named vendor engineers.

Sponsor Subsurface at a mid-tier with speaking-slot pursuit on an Iceberg topic. Build presence in the Tabular Slack and the Dremio community Slack with disclosed-affiliation engineers. Address the catalog question explicitly in documentation: name which catalogs the engine integrates with and how. Pair the Sponsored Challenge with a Brand Slot on lakehouse-relevant topic pages during the placement quarter for repeated brand exposure.

The combination reaches the lakehouse audience through the venues they actually read and the placement format that matches their architectural evaluation frame. Pipeline conversion is measured through multi-touch attribution; the Sponsored Challenge consistently appears in first-touch position for lakehouse customers who closed during the placement window.
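For readers who want the attribution terms pinned down, the sketch below computes first-touch counts and an even-split (linear) multi-touch credit. The touch log is invented purely for illustration and is not campaign data; the account names, channel labels, and function names are hypothetical, and real attribution models vary in how they weight touches.

```python
from collections import Counter
from datetime import date

# Hypothetical touch log: (account, channel, date). Not real campaign data.
touches = [
    ("acme", "sponsored_challenge", date(2026, 1, 12)),
    ("acme", "conference_booth", date(2026, 2, 3)),
    ("acme", "sales_email", date(2026, 3, 20)),
    ("globex", "mailing_list", date(2026, 1, 5)),
    ("globex", "sponsored_challenge", date(2026, 1, 30)),
]
closed_accounts = {"acme", "globex"}


def first_touch_counts(touches, closed):
    """Count, per channel, how many closed accounts it touched first."""
    first = {}
    for account, channel, _when in sorted(touches, key=lambda t: t[2]):
        if account in closed:
            first.setdefault(account, channel)   # keep only the earliest touch
    return Counter(first.values())


def linear_credit(touches, closed):
    """Split one unit of credit evenly across each closed account's touches."""
    per_account = {}
    for account, channel, _when in touches:
        if account in closed:
            per_account.setdefault(account, []).append(channel)
    credit = Counter()
    for channels in per_account.values():
        for channel in channels:
            credit[channel] += 1 / len(channels)
    return credit


first = first_touch_counts(touches, closed_accounts)
linear = linear_credit(touches, closed_accounts)
```

In this invented log the challenge is first touch for one of the two accounts but earns the largest linear credit, which shows why the two views should be reported side by side and not treated as interchangeable.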

What does not work for this audience

Three patterns waste vendor effort on lakehouse practitioners. Format-agnostic positioning that does not name Iceberg, Delta, or Hudi explicitly reads as either ignorance or hedging. Overstated cross-format coverage, where the vendor claims to support all three formats but only one is production-grade, gets caught on the first proof-of-concept. Catalog hand-waving, where the vendor's product depends on catalog choice but the marketing pretends it does not, fails because the audience asks the catalog question first and rules out vendors who do not have a clear answer.

The Sponsored Challenge scoping helps with each of these. A placement scoped to a specific open-table-format problem names the format and the problem directly; the closing CTA points to documentation the engineer can validate against; the editorial collaboration during scoping forces the vendor to be honest about which formats the product handles well and which it handles less well.

The long arc on lakehouse architecture decisions

Lakehouse practitioners are an architecture-driven audience, and architectural choices have multi-year half-lives. Vendors who establish presence in this audience in 2026 are positioning for buying decisions that play out through 2028 and beyond. Year-one investment in upstream contribution, conference presence, product depth, and Sponsored Challenge placements compounds into trust that survives the architectural shifts that will come. Iceberg may not be the dominant format in 2030; the vendors who showed up in the Iceberg community in 2026 are positioned to follow the audience wherever it goes next.

Architecture

The lakehouse audience is defined by an architectural decision, not a job title. The decision converges on a similar set of evaluation criteria for tooling (specification fidelity, catalog interoperability, honest open-versus-closed positioning). The Sponsored Challenge format adapts cleanly to these criteria when scoped to an open-table-format operational problem; the placement reaches the audience inside the frame they evaluate in.

DataDriven Partners audience-scoping framework, Architectural-decision framing · 2026-05-17

Sources cited

Apache Iceberg project · Apache Software Foundation · 2026
Delta Lake project · Linux Foundation · 2026
Apache Hudi project · Apache Software Foundation · 2026
Subsurface conference · Dremio · 2026

Reach lakehouse practitioners inside the architectural evaluation frame.

A Sponsored Challenge scoped to an open-table-format operational problem against an Iceberg or Delta Lake dataset reaches the lakehouse audience in evaluation mode. Apply with your operational story and the founder will scope the placement.
