How might we design a platform that reduces complexity around machine learning experiment management at Bloomberg?
Danielle Shoshani | Neha Chopade | Chi Huang | Norman Kang | Amy Lu
Jan 2020 - Present
Bloomberg has tasked our interdisciplinary team with designing a solution for machine learning engineers at the organization within a timeframe of 28 weeks. Applying machine learning at scale has created friction around experiment management, consuming significant time, money, and resources, and driving the need to manage some of this complexity.
Our goal is to design a platform for managing machine learning experiments that enhances both reproducibility and knowledge sharing, simplifying the model training process, raising the success rate of experiments, and enabling the delivery of increasingly sophisticated products powered by machine learning.
As the product designer, I am responsible for defining the project’s product specifications and coordinating within our team and with our external client, Bloomberg, to develop a product that improves machine learning experiment management.
I also established and implemented our design system across the product ecosystem, which takes the form of a centralized machine learning management platform and a project website.
As the team’s design lead, I also provided design mentorship to team members, overseeing their growth and learning in design processes.
Machine learning was a new domain for all of the team members when we started this project. To understand the type of work our users did, we conducted in-depth primary and secondary research before meeting our stakeholders at Bloomberg, including literature reviews and interviews with data scientists, machine learning engineers, and product managers.
With a shared ML vocabulary in place, we moved forward with generative research techniques, not only to understand and visualize our stakeholders’ biggest pain points but also to identify the insights that could bring them the most value.
Gain an in-depth understanding of the machine learning domain through primary research, secondary research, and online courses in machine learning.
Map out a master workflow and identify where different teams diverge from it.
Pinpoint and quantify pain points in order to target areas where we could have the most impact for product managers and ML engineers.
Test out our assumptions and reframe the direction of the project through storyboarding and visual storytelling.
We acquired domain knowledge through literature reviews and interviews with Carnegie Mellon University faculty and model users. To understand the competitive landscape, we also did an extensive competitive analysis of the top four machine learning experiment management tools.
Current State Analysis
We then analyzed the software Bloomberg engineers currently use, whose main function is to run ML experiments by connecting data and models with GPUs, then displaying the results in logs.
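To make that workflow concrete, here is a minimal, hypothetical sketch of what a single run on such a tool looks like (PyTorch is our assumption, not a detail of Bloomberg’s internal tooling): a dataset and model are connected, training runs on a GPU, and the results survive only as lines in a flat log file.

# Hypothetical sketch of a run on an existing experiment tool:
# connect data and model, train on a GPU, write results to a log.
import logging
import torch

logging.basicConfig(filename="experiment_042.log", level=logging.INFO)

def run_experiment(model, train_loader, epochs=5, lr=1e-3):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for epoch in range(epochs):
        total_loss = 0.0
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        # Results live only in this flat log file, which is exactly
        # where the tracking and discoverability pain points begin.
        logging.info("epoch=%d loss=%.4f lr=%g",
                     epoch, total_loss / len(train_loader), lr)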
This diagram gives a synthesized analysis of the pain points and opportunities identified through remote interviews and other research methods, such as love letter/breakup letter exercises and surveys.
The top four identified pain points provide a starting point for our design space.
an in-depth look at Bloomberg engineers’ machine learning experiment process
We conducted remote interviews with Bloomberg employees to gain a deeper understanding of the ML engineers’ workflows and mental models. We landed on visual storytelling in an effort to create a more tangible, shared understanding of our users’ machine learning processes.
Using Mural, we created a template for our interviews, which allowed users to run through the workflow of a recent experiment they were working on while the interviewer captured and reflected feedback in real time.
With the provided sticky notes and emojis, they could add various tools and software at different stages to narrate their own workflow story. We conducted 20+ interviews using this mixed-method approach, and the responses were valuable in drawing out critical insights to inform design decisions.
Ineffective tracking leads to further issues in documentation and discoverability.
Manual tracking occurring in dispersed locations creates inconsistencies in documentation, which hinders discoverability and, in turn, decreases collaboration.
The machine learning workflow comprises three interdependent components: data, code, and results, all of which rely on effective tracking.
To solve tracking, we need to tackle inconsistencies across data, code, and results by creating a centralized platform where teams can run experiments effectively and collaborate, speeding up the development process.
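To illustrate what such centralized tracking might capture, the sketch below shows a hypothetical run record that ties the three interdependent components together in one shared registry; the names used (ExperimentRun, log_run, runs.jsonl) are our own illustration, not an actual Bloomberg API.

# Hypothetical single experiment record linking data, code, and results.
from dataclasses import dataclass, field, asdict
import json
import time

@dataclass
class ExperimentRun:
    name: str
    dataset_version: str                          # data:    which snapshot was used
    code_commit: str                              # code:    which commit produced the model
    hyperparameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)   # results: what the run achieved
    timestamp: float = field(default_factory=time.time)

def log_run(run: ExperimentRun, registry_path: str = "runs.jsonl") -> None:
    # One shared, append-only registry instead of scattered notes and scripts.
    with open(registry_path, "a") as f:
        f.write(json.dumps(asdict(run)) + "\n")

log_run(ExperimentRun(
    name="sentiment-baseline",
    dataset_version="news-2020-03-v2",
    code_commit="a1b2c3d",
    hyperparameters={"lr": 1e-3, "epochs": 5},
    metrics={"val_accuracy": 0.87},
))

Because every run is recorded in the same place with the same fields, results become comparable and discoverable across teams rather than locked in individual notebooks.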
Because of system limitations, machine learning engineers resort to developing their own workarounds to overcome workflow challenges.
Current workarounds, such as tracking datasets across spreadsheet rows and columns and writing custom script files, present immense opportunities for streamlining and integration.
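For context, a workaround of this kind might look like the hypothetical snippet below: a custom script that appends each run’s settings and results as a new row in a personal CSV. It works for one engineer but is hard for a team to discover or keep consistent, which is the gap a centralized platform can close.

# Hypothetical example of the ad-hoc workarounds engineers build today:
# each run becomes a row (columns = hyperparameters and results) in a
# personal CSV that lives outside any shared system.
import csv
import os

def track_run(csv_path, learning_rate, batch_size, epochs, val_accuracy):
    file_exists = os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        if not file_exists:
            writer.writerow(["learning_rate", "batch_size", "epochs", "val_accuracy"])
        writer.writerow([learning_rate, batch_size, epochs, val_accuracy])

# Every engineer ends up with their own file, columns, and naming conventions.
track_run("my_experiments.csv", 1e-3, 32, 5, 0.87)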