How might we design a platform that reduces complexity around machine learning experiment management at Bloomberg?
Danielle Shoshani | Neha Chopade | Chi Huang | Norman Kang | Amy Lu
Jan 2020 - Present
Bloomberg has tasked our interdisciplinary team with designing a solution for machine learning engineers at the organization within a timeframe of 28 weeks. Applying machine learning at scale has created friction around experiment management, consuming significant time, money, and resources, and driving the need to manage some of this complexity.
Our goal is to design a platform for managing machine learning experiments that enhances both reproducibility and knowledge sharing, simplifying the model training process, raising the success rate of experiments, and enabling the delivery of increasingly sophisticated products powered by machine learning.
As the product designer, I am responsible for defining the project’s product specifications and coordinating within our team and with our external client, Bloomberg, to develop a product that improves machine learning experiment management.
I also established and implemented our design system across the product ecosystem, which takes the form of a centralized machine learning management platform and a project website.
As the team’s design lead, I also provided design mentorship to team members, overseeing their growth and learning in design processes.
Machine learning was a new domain for all of the team members when we started this project. To understand the type of work our users did, we conducted in-depth primary and secondary research before meeting our stakeholders at Bloomberg, including literature reviews and interviews with data scientists, machine learning engineers, and product managers.
With a shared ML vocabulary in place, we moved forward with generative research techniques, not only to understand and visualize our stakeholders’ biggest pain points but also to identify the insights that could bring them the most value.
Gain an in-depth understanding of the machine learning domain through primary research, secondary research, and online courses in machine learning.
Map out a master workflow and identify where different teams diverge from it.
Pinpoint and quantify pain points in order to target areas where we could have the most impact for product managers and ML engineers.
Test out our assumptions and reframe the direction of the project through storyboarding and visual storytelling.
We acquired domain knowledge through literature reviews and interviews with Carnegie Mellon University faculty and model users. To understand the competitive landscape, we also did an extensive competitive analysis of the top four machine learning experiment management tools.
Current State Analysis
We then analyzed the software Bloomberg engineers currently use, whose main function is to run ML experiments by connecting data and models with GPUs, then displaying the results in logs.
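To make that workflow concrete, here is a minimal, hypothetical sketch of what a single run on such a tool looks like (PyTorch is our assumption, not a detail of Bloomberg’s internal tooling): a dataset and model are connected, training runs on a GPU, and the results survive only as lines in a flat log file.

# Hypothetical sketch of a run on an existing experiment tool:
# connect data and model, train on a GPU, write results to a log.
import logging
import torch

logging.basicConfig(filename="experiment_042.log", level=logging.INFO)

def run_experiment(model, train_loader, epochs=5, lr=1e-3):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for epoch in range(epochs):
        total_loss = 0.0
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        # Results live only in this flat log file, which is exactly
        # where the tracking and discoverability pain points begin.
        logging.info("epoch=%d loss=%.4f lr=%g",
                     epoch, total_loss / len(train_loader), lr)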
This diagram gives a synthesized analysis of the pain points and opportunities identified through remote interviews and other research methods, such as love letter/breakup letter exercises and surveys.
The top four identified pain points provide a starting point for our design space.
an in-depth look at Bloomberg engineers’ machine learning experiment process
We conducted remote interviews with Bloomberg employees to gain a deeper understanding of the ML engineers’ workflows and mental models. We landed on visual storytelling in an effort to create a more tangible, shared understanding of our users’ machine learning processes.
Using Mural, we created a template for our interviews, which allowed users to run through the workflow of a recent experiment they were working on while the interviewer captured and reflected feedback in real time.
With the provided sticky notes and emojis, they could add various tools and software at different stages to narrate their own workflow story. We conducted 20+ interviews using this mixed-method approach, and the responses were valuable in drawing out critical insights to inform design decisions.
Ineffective tracking leads to further issues in documentation and discoverability.
Manual tracking occurring in dispersed locations creates inconsistencies in documentation, which hinders discoverability and, in turn, decreases collaboration.
The machine learning workflow comprises three interdependent components: data, code, and results, all of which rely on effective tracking.
To solve tracking, we need to tackle inconsistencies across data, code, and results by creating a centralized platform where teams can run experiments effectively and collaborate, speeding up the development process.
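To illustrate what such centralized tracking might capture, the sketch below shows a hypothetical run record that ties the three interdependent components together in one shared registry; the names used (ExperimentRun, log_run, runs.jsonl) are our own illustration, not an actual Bloomberg API.

# Hypothetical single experiment record linking data, code, and results.
from dataclasses import dataclass, field, asdict
import json
import time

@dataclass
class ExperimentRun:
    name: str
    dataset_version: str                          # data:    which snapshot was used
    code_commit: str                              # code:    which commit produced the model
    hyperparameters: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)   # results: what the run achieved
    timestamp: float = field(default_factory=time.time)

def log_run(run: ExperimentRun, registry_path: str = "runs.jsonl") -> None:
    # One shared, append-only registry instead of scattered notes and scripts.
    with open(registry_path, "a") as f:
        f.write(json.dumps(asdict(run)) + "\n")

log_run(ExperimentRun(
    name="sentiment-baseline",
    dataset_version="news-2020-03-v2",
    code_commit="a1b2c3d",
    hyperparameters={"lr": 1e-3, "epochs": 5},
    metrics={"val_accuracy": 0.87},
))

Because every run is recorded in the same place with the same fields, results become comparable and discoverable across teams rather than locked in individual notebooks.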
Because of system limitations, machine learning engineers resort to developing their own workarounds to overcome workflow challenges.
Current workarounds, such as tracking datasets across spreadsheet rows and columns and writing custom script files, present immense opportunities for streamlining and integration.
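For context, a workaround of this kind might look like the hypothetical snippet below: a custom script that appends each run’s settings and results as a new row in a personal CSV. It works for one engineer but is hard for a team to discover or keep consistent, which is the gap a centralized platform can close.

# Hypothetical example of the ad-hoc workarounds engineers build today:
# each run becomes a row (columns = hyperparameters and results) in a
# personal CSV that lives outside any shared system.
import csv
import os

def track_run(csv_path, learning_rate, batch_size, epochs, val_accuracy):
    file_exists = os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        if not file_exists:
            writer.writerow(["learning_rate", "batch_size", "epochs", "val_accuracy"])
        writer.writerow([learning_rate, batch_size, epochs, val_accuracy])

# Every engineer ends up with their own file, columns, and naming conventions.
track_run("my_experiments.csv", 1e-3, 32, 5, 0.87)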