Dataiku Review

End-to-end data science and AI platform for teams

Dataiku is a collaborative data science and machine learning platform for building and deploying AI projects.

Dataiku · Founded 2013 · Contact for pricing · Free Plan · Free Trial · Machine Learning Platforms · AI Analytics · AI Data Tools

AI Panel Score

7.8/10

6 AI reviews

About Dataiku

Dataiku is an end-to-end data science and AI platform that covers the entire machine learning lifecycle, from raw data ingestion through model deployment and ongoing governance. It provides a centralized workspace where data professionals can collaborate on projects, share datasets, build pipelines, and operationalize models at scale. The platform is used by enterprises across industries including finance, retail, healthcare, and manufacturing.

The platform accommodates users of varying technical skill levels. Data scientists can write code in Python, R, or SQL, while less technical users can work through a visual, point-and-click interface for data preparation and model building using AutoML capabilities. This dual approach allows organizations to involve both technical and business-facing team members in the same projects without requiring everyone to write code.

Dataiku connects to a wide range of data sources and infrastructure, including cloud storage systems, databases, Hadoop clusters, Spark, Snowflake, and major cloud providers such as AWS, Azure, and Google Cloud. It supports model deployment to REST API endpoints, batch scoring, and streaming pipelines, enabling teams to move from experimentation to production within the same tool.

On the governance side, Dataiku includes features for model monitoring, explainability, audit trails, and access controls, which are relevant for organizations operating under regulatory requirements or internal AI governance policies. These capabilities position the platform within the MLOps and responsible AI segments of the market.

Dataiku competes with platforms such as Databricks, SAS Viya, Alteryx, and cloud-native ML services from AWS and Azure. It is typically sold to mid-sized and large enterprises, with pricing available through direct sales. A free edition called Dataiku Free is available for individual users with limited features.

Features

AI

  • Agentic AI

    Allows users to build and deploy AI agents grounded in enterprise data, pipelines, and models, with governed execution and scalable impact.

  • LLM Mesh

    Connects large language models to enterprise data and pipelines, enabling teams to build LLM-powered applications and research tools within the platform.

  • Machine Learning Scaling

    Breaks down organizational silos to allow more teams to build, reuse, and operationalize ML models within shared enterprise standards.

  • RAG Chatbots

    Enables construction of Retrieval-Augmented Generation chatbots that can be deployed to automate workflows and generate new revenue streams.

Analytics

  • AI Performance & Cost Tracking

    Tracks performance, cost, and risk across all AI systems within the platform to provide visibility into enterprise AI investments.

Automation

  • AI-Assisted Governed Pipelines

    Moves analysts off spreadsheets and legacy desktop tools by automating pipeline creation with AI assistance while preserving institutional knowledge.

  • Orchestration

    Connects data, AI, and applications to design and automate how enterprise workflows and AI systems run end-to-end.

Collaboration

  • Collaborative Environment

    Unites domain experts, data scientists, and engineers in a single platform where all user types can build and self-serve AI together.

Core

  • Data Preparation

    Provides tools for sourcing, cleaning, and preparing data as part of the end-to-end AI and analytics pipeline.

  • Model Deployment & Operationalization

    Enables teams to turn isolated models into shared, production-ready ML that can be reused and operationalized within enterprise standards.

Security

  • Unified Governance

    Applies consistent governance across all AI systems inside and outside Dataiku, including unified visibility, cost controls, and audit-ready oversight.

Pricing Plans

Free Trial

Contact sales

Try Dataiku's platform for AI success with a free trial, suitable for individuals exploring the platform

  • Access to core Dataiku platform features
  • Data preparation and analytics
  • Model building and deployment
  • Agentic AI capabilities

Enterprise (Popular)

Contact sales

Full-scale enterprise AI platform for large organizations requiring governance, orchestration, and unified AI management

  • People: domain experts self-serve AI, experts build and deploy faster
  • Orchestration: connect data, AI, and applications across the enterprise
  • Governance: track performance, cost, and risk across all AI systems
  • Agentic AI grounded in your data, pipelines, and models
  • Unified visibility, cost controls, and audit-ready oversight
  • Enterprise ML with shared production-ready models

AI Panel Reviews

The Decision Maker

Strategic bet, vendor viability, timing, adoption approval
8.2/10

Morgan Stanley and Citigroup hired for the IPO — twelve-year-old Paris-born Dataiku is preparing to go public.

Dataiku is the twelve-year-old Paris-born data-science platform now at $342.5M ARR and lining up a 2026 US IPO. The vendor-survival question is answered; the buying call is whether LLM Mesh and unified governance beat Databricks for your specific MLOps stack.

Morgan Stanley and Citigroup are running the IPO process. That's the signal — Dataiku is preparing to be a public company in 2026, not just survive as a private one. Twelve years out of Paris, $342.5M ARR by September 2025, Series F closed.

LLM Mesh is the differentiated bet against Databricks — one governed routing layer for every model the enterprise calls, not a per-team experiment. AutoML for analysts, code notebooks for scientists, and a single audit trail underneath. Alteryx never built the model-deployment layer; Dataiku did.

However, contact-sales pricing means procurement runs 6-9 months and the Free edition is a single-user toy. Databricks owns the data-engineering-first buyer; SAS Viya owns the regulated-bank installed base. Run a 90-day pilot with one analytics team before the enterprise commit, then lock pricing before the IPO repricing.

Competitive Positioning: 8.0

Peer-used at enterprise scale; LLM Mesh and unified governance differentiate against Databricks, Alteryx, and SAS Viya.

Reputation Risk: 8.5

Used by Novartis and named enterprises across finance, retail, healthcare, and manufacturing, which makes it a defensible board pick.

Speed to Value: 7.0

Contact-sales pricing and enterprise rollout push the first production model 6-9 months out; Free edition is single-user only.

Strategic Fit: 8.0

End-to-end MLOps with LLM Mesh, governance, and dual code/no-code modes fits enterprises consolidating AI investments.

Vendor Viability: 9.0

Series F, $852M raised across 12 years, $342.5M ARR, and Morgan Stanley plus Citigroup hired for a 2026 IPO.

Pros

  • LLM Mesh provides one governed routing layer for enterprise LLM calls instead of per-team experiments.
  • $342.5M ARR by September 2025 plus Morgan Stanley and Citigroup hired for IPO prep — vendor survival is settled.
  • Dual-mode interface lets data scientists write Python while analysts use AutoML in the same project.
  • Unified governance with audit trails, model monitoring, and cost tracking suits regulated industries.

Cons

  • Contact-sales pricing means 6-9 month procurement cycles before the first production model ships.
  • The Free edition is single-user and limited — no real path to evaluate team workflows without a sales call.
  • Databricks owns the data-engineering-first buyer; switching costs against an existing lakehouse are high.

Right for

Mid-sized and large enterprises who need governed AI across mixed code and no-code teams.

Avoid if

Solo data scientists who want a self-serve cloud notebook without procurement.

The Domain Strategist

Craft and strategy in the product's domain — adapts identity per category, same lens
8.2/10

Twelve years building the AI platform CDOs actually defend in board reviews, now hedging against Databricks gravity.

Dataiku was founded in Paris in 2013 and is backed by Tiger Global, which led a $400M Series E that valued the company at $4.6 billion in 2021. LLM Mesh and Unified Governance give a Chief AI Officer the audit posture to scale agentic AI, but Databricks owns the lakehouse layer underneath.

The mandate for enterprise AI changed once governance moved to the board agenda — CDOs now buy platforms that survive a regulator walkthrough, not notebooks that won a hackathon. Dataiku has been building toward that posture since 2013, well before the agentic wave forced everyone to retrofit.

The product surface holds up — LLM Mesh routes enterprise data into external model providers under a governed layer, and Unified Governance covers cost, audit trail, and access across all AI inside Dataiku and beyond it. Tiger Global led a $400M Series E in 2021 at a $4.6 billion valuation; cumulative raise is $1.04 billion across nine rounds.

However, the architectural ceiling sits at the substrate. Databricks owns the lakehouse the models actually train on, so a three-year Dataiku bet means living above someone else's data plane.

Category Positioning: 8.1

Clear leader in MLOps and responsible AI alongside Databricks, SAS Viya, and Alteryx.

Domain Fit: 8.4

Twelve years of CDO-shaped workflow with dual code and visual paths matches how data teams actually staff.

Integration Surface: 8.2

Native connectors to Snowflake, Spark, Hadoop, and AWS, Azure, GCP cover the enterprise stack.

Long-term Implications: 7.8

Sits above the lakehouse, so three-year strategy depends on a substrate Dataiku does not own.

Strategic Depth: 8.3

LLM Mesh plus Unified Governance show best-in-class craft for enterprise AI orchestration.

Pros

  • LLM Mesh provides governed routing into external model providers grounded on enterprise data.
  • Unified Governance covers cost, audit, and risk across all AI systems inside Dataiku and beyond it.
  • $1.04 billion raised across nine rounds funds the agentic pivot without forced-sale pressure.
  • Visual and code workflows let domain experts and Python data scientists ship in one project.

Cons

  • Pricing is contact-sales only, binding any three-year procurement call to opaque renewal leverage.
  • Sits above the lakehouse layer where Databricks owns the underlying data plane.
  • Dataiku Free is feature-limited and not a real onboarding path for enterprise teams.

Right for

Chief AI Officers who need governed agentic AI on enterprise data.

Avoid if

Solo data scientists who want a notebook to ship one model.

The Finance Lead

Money, total cost of ownership, contracts, procurement math
6.8/10

Series F closed at $3.7B in December 2022 — a 20% markdown, but pricing stays contact-sales.

Wellington led Dataiku's $200M Series F at $3.7B in December 2022, down from $4.6B. Expect roughly $25K/year for 10 seats and $150K for 100 — no public floor, no published overage.

Wellington Management led the $200M Series F in December 2022 at $3.7B. Down from the $4.6B Series E in August 2021. Roughly a 20% markdown. Founded 2013. Runway is not the question; the discovery call is.

PriceLevel pegs the median at $26K per year. A 10-user license lands near $25K. 100 users runs $150K. No published per-seat anchor, no overage rate, no SSO line item. Procurement starts blind.
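The arithmetic behind those figures is easy to check. The short Python sketch below takes the numbers quoted in this review (third-party estimates, not list prices) and derives the implied per-seat cost and the Series E to Series F markdown. The function and variable names are illustrative only.

```python
# Figures quoted in this review: third-party estimates, not vendor list prices.
quotes = {10: 25_000, 100: 150_000}  # seats -> approximate annual cost in USD


def per_seat(seats: int, annual_cost: int) -> float:
    """Implied annual cost per seat for a quoted bundle."""
    return annual_cost / seats


def markdown_pct(previous: float, current: float) -> float:
    """Percentage drop from a previous valuation to the current one."""
    return (previous - current) / previous * 100


implied = {seats: per_seat(seats, cost) for seats, cost in quotes.items()}
series_f_markdown = markdown_pct(4.6, 3.7)  # valuations in $B

for seats, cost in implied.items():
    print(f"{seats} seats: ${cost:,.0f} per seat per year")
print(f"Series E to Series F markdown: {series_f_markdown:.1f}%")
```

On these estimates the implied per-seat price falls from about $2,500 at 10 seats to about $1,500 at 100, which suggests volume discounting, though without a published rate card that remains an inference.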

Databricks competes on Lakehouse depth and notebook breadth. Alteryx undercuts on desktop analyst workflows. Dataiku's LLM Mesh and Unified Governance pitch the audit-ready story enterprise legal teams ask for. The tradeoff is opacity — the platform breadth is real, but you can't model year-3 TCO without a sales call.

Billing & Procurement: 6.5

Enterprise invoicing with named-account sales: predictable, but heavy onboarding for first-time buyers.

Contract Flexibility: 6.0

No public MSA terms; enterprise sales-led model implies multi-year commits and auto-renewal language.

Pricing Transparency: 4.5

Contact-sales only with no public tiers, per-seat anchor, or overage rate.

ROI Clarity: 7.0

AI Performance & Cost Tracking surfaces per-project spend, supporting measurable ROI for governed deployments.

Total Cost of Ownership: 6.5

Third-party data suggests $25K for 10 seats and $150K for 100, but no published metering for compute or add-ons.

Pros

  • Wellington Management led the $200M Series F at $3.7B in December 2022, so vendor runway risk is low.
  • LLM Mesh and Unified Governance give regulated industries audit-ready oversight across enterprise AI.
  • Visual AutoML plus Python, R, and SQL serves both business analysts and data scientists in one workspace.
  • Connectors to Snowflake, AWS, Azure, GCP, Spark, and Hadoop avoid single-cloud lock-in.

Cons

  • No public pricing — every tier requires a sales discovery call before any number lands.
  • PriceLevel data points to roughly $25K for 10 seats and $150K for 100, so six-figure year-2 commits are typical.
  • Series F at $3.7B was a 20% markdown from the $4.6B Series E, so growth math has reset.
  • Databricks and Alteryx compete on adjacent workloads with clearer per-unit pricing models.

Right for

Enterprise data teams who need governed MLOps at scale.

Avoid if

SMB buyers who need published per-seat pricing.

The Domain Practitioner

Daily hands-on reality in the product's domain — adapts identity per category, same lens
8.0/10

Visual Flow plus Python recipes scale a team, but Free edition's three-user cap kills real pilot work.

The Flow canvas and Python recipes let coders and analysts share one project without forking the pipeline. The catch is Free edition's 3-user ceiling, which forces a sales call before a squad of five can even pilot.

The Flow canvas is where the daily work lives — a CSV lands, visual recipes handle prep, a Python recipe slots where logic gets thorny, and that DAG renders for whoever opens it next. Databricks notebooks force a code-first mindset; the Flow lets a SQL-heavy analyst commit alongside a scikit-learn engineer without rewriting the other's work.
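To make the DAG idea concrete, here is a minimal Python sketch of a pipeline of recipes resolved into a run order. It uses only the standard library and does not touch Dataiku's API; the recipe names are hypothetical and chosen to mirror the CSV, visual prep, and Python steps described above.

```python
from graphlib import TopologicalSorter

# Hypothetical recipes: each key maps to the set of recipes it depends on.
flow = {
    "load_csv": set(),
    "visual_prepare": {"load_csv"},
    "python_feature_logic": {"visual_prepare"},
    "sql_aggregate": {"visual_prepare"},
    "train_model": {"python_feature_logic", "sql_aggregate"},
}

# A topological order runs every recipe after all of its inputs.
order = list(TopologicalSorter(flow).static_order())
print(order)
```

The Python and SQL steps both hang off the same prepared dataset, which is the property that lets an analyst and an engineer work on separate branches of one pipeline.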

LLM Mesh routes prompts through governed connectors to OpenAI, Anthropic, or a self-hosted endpoint with the same audit trail. That matters when a compliance review asks which model touched PII. AutoML scaffolds a baseline in minutes, then exports the generated Python — no black box once you want to tune.
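The pattern of a governed routing layer can be sketched in a few lines of Python. This is not Dataiku's LLM Mesh API: the class, method, and provider names below are all hypothetical, and the "providers" are stand-in functions rather than real vendor SDK calls. The point is only that when every call goes through one place, the question "which model touched PII?" becomes a query over one log.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class AuditedRouter:
    """Hypothetical routing layer: every model call passes through here and is logged."""
    providers: dict[str, Callable[[str], str]]
    audit_log: list[dict] = field(default_factory=list)

    def complete(self, provider: str, prompt: str, user: str,
                 contains_pii: bool = False) -> str:
        if provider not in self.providers:
            raise KeyError(f"provider {provider!r} is not registered")
        response = self.providers[provider](prompt)
        # Record who called which provider, and whether PII was flagged.
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "provider": provider,
            "contains_pii": contains_pii,
            "prompt_chars": len(prompt),
        })
        return response

    def providers_that_saw_pii(self) -> set[str]:
        return {e["provider"] for e in self.audit_log if e["contains_pii"]}


# Stand-in providers; a real deployment would call vendor SDKs here.
router = AuditedRouter(providers={
    "hosted-a": lambda p: f"[hosted-a] {p.upper()}",
    "self-hosted": lambda p: f"[self-hosted] {p[::-1]}",
})
router.complete("hosted-a", "summarise q3 churn", user="analyst1")
router.complete("self-hosted", "patient record 123", user="analyst2", contains_pii=True)
print(router.providers_that_saw_pii())
```

In this sketch the PII flag is supplied by the caller; a production layer would more plausibly detect or classify sensitive content itself.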

However, Free edition caps at 3 users per project, so any squad of five hits a contact-sales wall before validating fit. There is no list pricing on Enterprise. The vendor itself looks durable (founded 2013 in Paris, $1.04B raised through a 2022 Series F), but procurement still owns the timeline.

Day-3 Reality: 7.8

Dual visual-plus-code Flow holds up past the demo, though the canvas gets dense on projects with many recipes.

Documentation Practitioner-Fit: 8.0

doc.dataiku.com is dense, versioned, and code-example heavy, written for builders, not just buyers.

Friction Surface: 7.2

Free edition 3-user cap and contact-sales-only Enterprise pricing add procurement drag before practitioners can validate fit.

Power-User Depth: 8.4

Python, R, and SQL recipes plus plugins and custom code recipes scale from AutoML baseline to bespoke production pipelines.

Workflow Integration: 8.2

Native connectors to Snowflake, Spark, AWS, Azure, and GCP fit the common enterprise data stack without middleware.

Pros

  • Flow canvas lets coders and analysts contribute to the same DAG without forking the project.
  • LLM Mesh routes governed prompts to OpenAI, Anthropic, or self-hosted endpoints under one audit trail.
  • AutoML exports the generated Python, so tuning a baseline model isn't a black box.
  • Native connectors to Snowflake, Spark, AWS, Azure, and GCP cover the common enterprise stack.

Cons

  • Free edition caps at 3 users per project, blocking real squad-sized pilots without a sales call.
  • Enterprise pricing is contact-sales only, so procurement timelines stall fit-validation work.
  • The Flow UI gets visually dense fast on projects with 30+ recipes.

Right for

Data teams who blend Python coders and SQL-fluent analysts on the same project.

Avoid if

Solo data scientists who want a notebook-first tool without enterprise scaffolding.

The Power User

Daily human experience, onboarding, polish, learning curve, reliability
7.7/10

Dataiku's Free edition caps at three users — the rest is enterprise sales calls and visual recipes

The LLM Mesh routes enterprise data into modern models while the visual recipe canvas keeps analysts and data scientists in one project. The catch is contact-sales pricing, a Free tier capped at three users, and a learning curve heavy enough to feel like homework on day one.

Free edition tops out at three users. That's the first thing you notice trying to start. The hosted Free Trial maxes at two collaborators, the downloadable Free at three, everything beyond is contact-sales. Founded in Paris in 2013, $1.04B raised across nine rounds, CapitalG anchored the 2019 unicorn moment.

The LLM Mesh is the 2026 headline piece — a routing layer between enterprise data and whichever model is in fashion. Underneath, the visual recipe canvas does the boring data-prep work, and Python, R, and SQL drop into the same flow. Analysts and a data scientist can sit in one project.

But the platform is heavy. Versus Databricks for code-first teams or Alteryx for pure analyst flow, Dataiku is the both-at-once bet, and the first hour feels like homework. Month three is when governance starts paying off — if you survived month one.

Daily Polish: 7.6

Mature visual recipe canvas and a decade of UX iteration, though the marketing site feels heavier than the product.

Learning Curve: 6.8

Dual code-plus-visual interface is powerful but the first hour reads like coursework rather than welcome.

Mobile Parity: 7.5

Enterprise ML platforms are desktop-first by nature; neutral score for category.

Onboarding Experience: 6.8

Free Trial spins up in two minutes but anything past three users routes you to sales.

Reliability Feel: 8.2

Twelve-year-old platform with audit trails, governance, and enterprise customers across regulated industries.

Pros

  • LLM Mesh provides a governed routing layer between enterprise data and external language models.
  • Visual recipe canvas and code in Python, R, or SQL live in the same project flow.
  • Twelve-year-old company with $1.04B raised and CapitalG-backed unicorn status, so vendor risk is low.
  • Built-in governance, audit trails, and cost tracking for regulated finance and healthcare buyers.

Cons

  • Free edition caps at three users and everything past that is contact-sales pricing.
  • Heavy platform with a real learning curve in the first hour of use.
  • Overkill for solo practitioners or small analytics teams who just want a notebook.

Right for

Mid-size to large enterprises who want analysts and data scientists collaborating in one platform.

Avoid if

Solo practitioners who want to start without contacting a sales team.

The Skeptic

Contrarian. Watch-outs, deal-breakers, broken promises, category patterns
7.8/10

Series F at $3.7B in 2022, IPO whispers in 2026 — the standalone story still has runway.

Dataiku closed a $200M Series F at a $3.7B valuation in December 2022 and is reportedly prepping a 2026 IPO with Morgan Stanley and Citi. The LLM Mesh and 2013 Paris-vintage team are real differentiators, but contact-only pricing against Databricks lakehouse gravity makes the buyer math slow.

Dataiku filed quiet IPO prep with Morgan Stanley and Citi for H1 2026. That's the lens. Series F closed December 2022 at a $3.7B valuation, roughly $350M ARR by 2025. Not flailing. Not flying either.

Founded 2013 in Paris by Florian Douetteau and three co-founders. LLM Mesh is the real differentiator — a governance layer between enterprise data and external LLMs, not yet-another-AutoML wrapper. Snowflake, Spark, and the big-three cloud connectors are named. The 2013 vintage matters: this team shipped before the generative-AI gold rush.

The catch is the buyer profile. Against Databricks lakehouse gravity and Azure ML bundled pricing, contact-only sales cycles get long. The exit story is only partly portable: Python and SQL code travels and models export, but the visual recipes don't, and that's where mid-tier analysts live.

Competitive Differentiation: 7.5

LLM Mesh and the unified code-plus-visual workspace are genuinely different from Databricks' lakehouse-first and Alteryx's desktop-roots positioning.

Exit Portability: 7.5

Python, R, SQL code and trained models export cleanly, but visual recipes and pipeline metadata stay inside the platform.

Long-term Viability: 8.0

13-year independent operator with Wellington-led Series F and Morgan Stanley/Citi IPO underwriters appointed: strong public signals.

Marketing Honesty: 7.0

The "Platform for AI Success" tagline is buzzy, but the pillar breakdown (People, Orchestration, Governance) maps to shipped features.

Track Record Match: 8.0

2013 founding, $1B raised over nine rounds, ~$350M ARR: this matches the survivor pattern, not the burn-and-vanish one.

Pros

  • Founded 2013 in Paris and still independent 13 years later with $1B raised across nine rounds.
  • LLM Mesh provides a governance and routing layer for external LLM integration that lakehouse-first competitors do not ship.
  • Visual point-and-click interface and code paths (Python, R, SQL) let mixed-skill teams collaborate in one workspace.
  • Named connectors to Snowflake, Spark, AWS, Azure, and Google Cloud — not vaporware integration claims.

Cons

  • Contact-only pricing on the Enterprise tier buries ROI math before sales gets involved.
  • Visual recipes and pipeline metadata do not export cleanly — exit story is partial for analyst-built work.
  • Crowded enterprise ML category with Databricks lakehouse gravity and Azure ML bundled pricing pulling budget.

Right for

Mid-to-large enterprises who need governed end-to-end ML on hybrid data.

Avoid if

Solo analysts who want a self-serve credit-card AutoML tool.

Buyer Questions

Common questions answered by our AI research team

Integration

Does Dataiku's LLM Mesh support integration with external large language model providers, or is it limited to models built within the platform?

The content mentions Dataiku's LLM Mesh in the context of Novartis using it to 'revolutionize healthcare market research,' but does not specify whether it supports integration with external LLM providers or is limited to models built within the platform.

Security

What governance controls does Dataiku provide for monitoring AI agent execution costs and audit trails across enterprise deployments?

Dataiku's governance capabilities include 'unified visibility, cost controls, and audit-ready oversight' applied 'across all AI inside Dataiku and beyond it.' The platform allows users to 'see everything, control cost, and mitigate risk' with consistent governance across AI systems, including agent execution.

Setup

How does Dataiku enable domain experts and non-data-scientists to self-serve AI without requiring engineering support during setup?

The content states that Dataiku's People pillar enables 'domain experts to self-serve AI' while 'experts build and deploy faster,' but does not provide specific details about the setup process or whether engineering support is required during onboarding.

Features

Can Dataiku's pipelines connect to legacy analytics tools and replace spreadsheet-based workflows without disrupting existing analyst processes?

The content states that Dataiku helps 'move analysts off spreadsheets and legacy desktops without disruption,' with 'AI-assisted, governed pipelines' that 'preserve institutional knowledge and deliver trusted insights at enterprise scale.' This suggests existing analyst workflows can be modernized without disrupting current processes.

Product Information

  • Company

    Dataiku
  • Founded

    2013
  • Pricing

    Contact for pricing
  • Free Trial

    Available
  • Free Plan

    Available

Platforms

Web, Mac, Windows, Linux

About Dataiku

Dataiku is a New York-based enterprise AI and machine learning platform that unifies data preparation, modeling, and deployment for data science teams.
