Dataiku review: Data science fit for the enterprise


Dataiku Data Science Studio (DSS) is a platform that tries to span the needs of data scientists, data engineers, business analysts, and AI consumers. It mostly succeeds. In addition, Dataiku DSS tries to span the machine learning process from end to end, i.e. from data preparation through MLOps and application support. Again, it mostly succeeds.

The Dataiku DSS user interface is a combination of graphical elements, notebooks, and code, as we’ll see later on in the review. As a user, you often have a choice of how you’d like to proceed, and you’re usually not locked into your initial choice, given that graphical choices can generate editable notebooks and scripts.

During my initial discussion with Dataiku, their senior product marketing manager asked me point blank whether I preferred a GUI or writing code for data science. I said “I usually wind up writing code, but I’ll use a GUI whenever it’s faster and easier.” This met with approval: Many of their customers have the same pragmatic attitude.

Dataiku competes with pretty much every data science and machine learning platform, but also partners with several of them, including Microsoft Azure, Databricks, AWS, and Google Cloud. I consider KNIME similar to DSS in its use of flow diagrams, and at least half a dozen platforms similar to DSS in their use of Jupyter notebooks, including the four partners I mentioned. DSS is similar to DataRobot, H2O.ai, and others in its implementation of AutoML.

Dataiku DSS features

Dataiku says that its key capabilities are data preparation, visualization, machine learning, DataOps, MLOps, analytic apps, collaboration, governance, explainability, and architecture. It supports additional capabilities through plug-ins.

Dataiku data preparation features a visual flow in which users build data pipelines from datasets, recipes that join and transform those datasets, code, and reusable plug-in elements.
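To make the dataset-and-recipe idea concrete, here is a minimal sketch in plain Python of what a join recipe followed by a transform recipe amounts to conceptually. This is not the Dataiku API: the function names `join_recipe` and `prepare_recipe`, the column names, and the sample rows are all hypothetical, chosen only for illustration.

```python
# Hypothetical illustration in plain Python; none of this is Dataiku's API.
# A "dataset" here is a list of dicts, and a "recipe" is a function that
# takes one or more datasets and returns a new dataset.

def join_recipe(left, right, key):
    """Left-join two datasets on a shared key column."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]

def prepare_recipe(rows):
    """Transform step: tidy the name column and add a derived total."""
    return [
        {
            **row,
            "name": row["name"].strip().title(),
            "total": row["qty"] * row["unit_price"],
        }
        for row in rows
    ]

# Made-up input datasets (every order has a matching customer).
customers = [
    {"customer_id": 1, "name": "  ada lovelace "},
    {"customer_id": 2, "name": "alan turing"},
]
orders = [
    {"customer_id": 1, "qty": 3, "unit_price": 2.5},
    {"customer_id": 2, "qty": 1, "unit_price": 10.0},
]

# The "flow": orders + customers -> join -> prepare -> output dataset.
joined = join_recipe(orders, customers, "customer_id")
prepared = prepare_recipe(joined)

for row in prepared:
    print(row)
```

In a visual flow, each of these functions would correspond to a node between datasets, which is why a graphical step and a hand-written code step can sit side by side in the same pipeline.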

Copyright © 2021 IDG Communications, Inc.


