Andreea Munteanu
on 17 April 2024
MLflow is an open source platform for managing machine learning workflows. It was launched back in 2018 and has grown in popularity ever since, reaching 10 million users in November 2022. AI enthusiasts and professionals had long struggled with experiment tracking, model management and code reproducibility, so when MLflow launched, it addressed pressing problems in the market. MLflow is lightweight and able to run on an average-priced machine, but it also integrates with more complex tools, making it well suited to running AI at scale.
A short history
Since MLflow was first released in June 2018, the community behind it has run a recurring survey to better understand user needs and ensure the roadmap addresses real-life challenges. About a year after the launch, MLflow 1.0 was released, introducing features such as improved metric visualisations, metric X coordinates, improved search functionality and HDFS support. It also offered Python, Java, R, and REST API stability.
MLflow 2.0 landed in November 2022, when the product also celebrated 10 million users. This version incorporates extensive community feedback to simplify data science workflows and deliver innovative, first-class tools for MLOps. Features and improvements include extensions to MLflow Recipes (formerly MLflow Pipelines) such as AutoML, hyperparameter tuning, and classification support, as well as improved integrations with the ML ecosystem, a revamped MLflow Tracking UI, a refresh of core APIs across MLflow’s platform components, and much more.
In September 2023, Canonical released Charmed MLflow, a distribution of the upstream project.
Why use MLflow?
MLflow is often considered the most popular ML platform. It enables users to perform different activities, including:
- Reproducing results: ML projects usually start with simple plans but quickly grow, resulting in an overwhelming number of experiments. Manual, non-automated tracking makes it easy to miss finer details. ML pipelines are fragile, and even a single missing element can throw off the results. The inability to reproduce results and code is one of the top challenges for ML teams.
- Easy to get started: MLflow can be easily deployed and does not require heavy hardware to run. It is suitable for beginners who are looking for a solution to better visualise and manage their models. For example, this video shows how Charmed MLflow can be installed in less than 5 minutes.
- Environment agnostic: The flexibility of MLflow across libraries and languages is possible because it can be accessed through a REST API and Command Line Interface (CLI). Python, R, and Java APIs are also available for convenience (see the short sketch after this list).
- Integrations: While MLflow is popular in itself, it does not work in a silo. It integrates seamlessly with leading open source tools and frameworks such as Spark, Kubeflow, PyTorch or TensorFlow.
- Works anywhere: MLflow runs on any environment, including hybrid or multi-cloud scenarios, and on any Kubernetes.
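To illustrate that flexibility, the minimal sketch below talks to an MLflow tracking server through its REST API directly, with no MLflow client library involved. It assumes a server is reachable at localhost:5000 (for example, one started locally); the experiment name is made up for the example, and any HTTP client in any language could make the same call.

```python
import requests

# Assumed address of a running MLflow tracking server
TRACKING_SERVER = "http://localhost:5000"

# Create an experiment through the REST API rather than a language-specific client
response = requests.post(
    f"{TRACKING_SERVER}/api/2.0/mlflow/experiments/create",
    json={"name": "rest-api-demo"},
)
response.raise_for_status()
print("Created experiment with id:", response.json()["experiment_id"])
```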
MLflow components
MLflow is an end-to-end platform for managing the machine learning lifecycle. It has four primary components:
MLflow Tracking
MLflow Tracking enables you to track experiments, with the primary goal of comparing results and the parameters used. It is crucial when it comes to measuring performance, as well as reproducing results. Tracked data includes parameters, hyperparameters, metrics and other artefacts, which can be stored on local systems or remote servers.
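As a rough illustration, the snippet below logs parameters, a metric and an artefact with the Python tracking API. The experiment name, values and file are invented for the example; by default everything is recorded to a local ./mlruns directory.

```python
import mlflow

# Group this work under a named experiment (created if it does not exist)
mlflow.set_experiment("demo-experiment")

with mlflow.start_run(run_name="baseline"):
    # Hyperparameters used for this run
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("n_estimators", 100)

    # Metrics measured during or after training
    mlflow.log_metric("accuracy", 0.93)

    # Any file (plots, datasets, notes) can be stored as an artefact
    with open("notes.txt", "w") as f:
        f.write("baseline run with default features")
    mlflow.log_artifact("notes.txt")
```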
MLflow Models
MLflow Models provide professionals with different formats for packaging their models. This gives flexibility in where models can be used, as well as the format in which they will be consumed. It encourages portability across platforms and simplifies the management of the machine learning models.
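For instance, the sketch below logs a scikit-learn model in MLflow's packaging format and loads it back through the framework-agnostic `pyfunc` flavour. The dataset and model choice are purely illustrative, and scikit-learn is assumed to be installed alongside MLflow.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Package the model in MLflow's format, together with its environment details
with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, artifact_path="model")

# Load it back through the generic pyfunc interface, regardless of framework
loaded = mlflow.pyfunc.load_model(f"runs:/{run.info.run_id}/model")
print(loaded.predict(X[:5]))
```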
MLflow Projects
Machine learning projects are packaged using MLflow Projects, which ensures reusability, reproducibility and portability. A project is a directory that gives structure to the ML initiative. It contains a descriptor file that defines the project structure and all of its dependencies. The more complex a project is, the more dependencies it has, and dependencies bring risks when it comes to version compatibility as well as upgrades.
MLflow Projects are especially useful when running ML at scale, where there are larger teams and multiple models being built at the same time. They enable collaboration between team members who want to work jointly on a project, transfer knowledge between one another, or move work to production environments.
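As a sketch of how this looks in practice, a packaged project can be launched programmatically from its directory or a Git URL; MLflow reads the MLproject descriptor, resolves the declared environment and runs the chosen entry point. The URI below points at the upstream example project and the parameter value is illustrative; running it also assumes the environment tooling the project declares (for example conda) is available.

```python
import mlflow

# Run a packaged project directly from a Git repository
submitted = mlflow.run(
    uri="https://github.com/mlflow/mlflow-example",
    entry_point="main",
    parameters={"alpha": 0.5},
)
print("Run finished with status:", submitted.get_status())
```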
MLflow Model Registry
MLflow Model Registry provides a centralised place where ML models are stored. It simplifies model management throughout the full lifecycle, including how a model transitions between different stages. It includes capabilities such as versioning and annotating, and provides APIs and a UI.
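A minimal sketch of registering and annotating a model from a finished run is shown below. The run ID is a placeholder for a real identifier from the tracking server, and the model name and description are assumptions for the example.

```python
import mlflow
from mlflow import MlflowClient

# Register the model artefact produced by an earlier run under a central name.
# "<run_id>" is a placeholder for a real run identifier.
result = mlflow.register_model("runs:/<run_id>/model", "churn-classifier")

# Each registration creates a new version that can be annotated and queried
client = MlflowClient()
client.update_model_version(
    name="churn-classifier",
    version=result.version,
    description="Baseline model trained on the March dataset",
)
```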
Key concepts of MLflow
MLflow is built around two key concepts: runs and experiments.
- In MLflow, each execution of your ML model code is referred to as a run. All runs are associated with an experiment.
- An MLflow experiment is the primary unit of organisation for MLflow runs. It influences how runs are organised, accessed and maintained. An experiment has multiple runs, and it enables you to efficiently go through those runs and perform activities such as visualisation, search and comparison. In addition, experiments let you download run artefacts and metadata for analysis in other tools.
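To make the relationship between runs and experiments concrete, the sketch below creates a few runs under one experiment and then compares them with `mlflow.search_runs`. The experiment name, parameter values and metric are invented for the example.

```python
import mlflow

# All runs below are grouped under the same experiment
mlflow.set_experiment("learning-rate-sweep")

for lr in [0.001, 0.01, 0.1]:
    with mlflow.start_run():
        mlflow.log_param("learning_rate", lr)
        mlflow.log_metric("val_loss", 1.0 / (1 + lr))  # placeholder metric

# Retrieve every run in the experiment as a DataFrame for comparison
runs = mlflow.search_runs(experiment_names=["learning-rate-sweep"])
print(runs[["run_id", "params.learning_rate", "metrics.val_loss"]])
```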
Kubeflow vs MLflow
Both Kubeflow and MLflow are open source solutions designed for the machine learning landscape. They have received massive support from industry leaders, and are driven by a thriving community whose contributions are making a difference in the development of the projects. The main purpose of both Kubeflow and MLflow is to create a collaborative environment for data scientists and machine learning engineers, and enable teams to develop and deploy machine learning models in a scalable, portable and reproducible manner.
However, comparing Kubeflow and MLflow is like comparing apples to oranges. From the very beginning, they were designed for different purposes. The projects evolved over time and now have overlapping features. But most importantly, they have different strengths. On the one hand, Kubeflow is proficient when it comes to machine learning workflow automation, using pipelines, as well as model development. On the other hand, MLflow is great for experiment tracking and model registry. From a user perspective, MLflow requires fewer resources and is easier for beginners to deploy and use, whereas Kubeflow is a heavier solution, ideal for scaling up machine learning projects.
Read more about Kubeflow vs. MLflow on our blog.
Charmed MLflow vs the upstream project
Charmed MLflow is Canonical’s distribution of the upstream project. It is part of Canonical’s growing MLOps portfolio. It has all the features of the upstream project, to which we add enterprise-grade capabilities such as:
- Simplified deployment: the time to deployment is less than 5 minutes, and users can also upgrade their tools seamlessly.
- Simplified upgrades using our guides.
- Automated security scanning: The bundle is scanned at a regular cadence.
- Security patching: Charmed MLflow follows Canonical’s process and procedure for security patching. Vulnerabilities are prioritised based on severity, the presence of patches in the upstream project, and the risk of exploitation.
- Maintained images: All Charmed MLflow images are actively maintained.
- Comprehensive testing: Charmed MLflow is thoroughly tested on multiple platforms, including public cloud, local workstations, on-premises deployments, and various CNCF-compliant Kubernetes distributions.