Seven takeaways from the industry’s biggest event on machine learning observability

Arize:Observe, an annual summit focused on machine learning (ML) observability, wrapped up last week in front of an audience of more than 1,000 technical leaders and practitioners. Now available for on-demand streaming, the event featured several tracks and conversations with speakers from Etsy, Kaggle, Opendoor, Spotify, Uber and many more. Here are some highlights and quotes from the top sessions.

Scaling an ML platform is all about the customer

In the panel “Scaling Your ML Practice,” Coinbase Engineering Director Chintan Turakhia makes it clear: “A platform is not a platform for platform’s sake.” His advice to teams building from the ground up: “Don’t talk about the ML platform first; talk about what problems you solve for your customers and how you can improve your business with ML… ML for the sake of ML is wonderful, and entire worlds like Kaggle are built for it, but problem solving comes first.” The “core customer” is often an internal one, he says.

Machine learning infrastructure is more complex than software infrastructure

In the “ML Platforms: Assemble!” panel, Smitha Shyam, Uber’s Director of Engineering, draws an important distinction between machine learning infrastructure and software and data infrastructure.

“There’s a misconception that ML is all about algorithms,” she says. “In reality, machine learning is about data, systems, and models. So the infrastructure needed to support machine learning, from initial design to deployment to ongoing maintenance, is very large and complex. The systems these models operate in also depend on the underlying layers of data. Seemingly innocuous changes in the data layer, for example, can completely change a model’s outcome. Unlike software engineering, ML work doesn’t end once you test your model and put it into production: model predictions change as data changes, market conditions shift, you have seasonality, and your surrounding systems and business assumptions change. You have to take all of this into account as you build out the ML platform infrastructure.”

As a result, ML infrastructure is “a set of things that includes your software infrastructure, your data infrastructure, and then things that are unique to your modeling.”
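The point above about innocuous data changes silently shifting model outcomes is exactly what drift monitoring is meant to catch. As a minimal illustration (not something shown at the event), here is a sketch of the population stability index (PSI), a common way to compare a production feature’s distribution against its training-time baseline; the thresholds in the comments are rules of thumb, not standards.

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a live sample.
    Values above roughly 0.2 are often treated as meaningful drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Floor the proportions to avoid log(0) on empty bins.
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)  # training-time feature distribution
shifted = rng.normal(0.5, 1.0, 10_000)   # production feature after drift

print(psi(baseline, baseline[:5000]))  # near zero: no drift
print(psi(baseline, shifted))          # clearly elevated
```

In a real platform this check would run continuously per feature, with alerts wired to the thresholds the team chooses.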

Diversity is table stakes

Sean Ramirez, PhD, Head of Data Science at Shelf Engine, where women hold 50% of all leadership positions, is quick to note the many advantages of his company’s diversity. “I think the commitment to diversity and inclusion at Shelf Engine is important in many ways,” he says. “First, it affects the accuracy and objectivity of our data science models. Second, it shapes the development of our products. And finally, it affects the quality and retention of our team.”

Tulsi Doshi, Product Lead for Responsible AI and Human-Centered Technology at Google, adds that it’s important not to overlook the global dimensions of diversity. “A lot of what we talk about in the press today is very Western – we’re talking about the failure patterns that are relevant to U.S. society – but I think a lot of those concerns around fairness, around systemic racism and bias, are really very different when you go to different regions,” she says.

AI ethics goes beyond compliance or explainability

According to a wide range of speakers, having an enterprise AI ethics strategy is also important. Bahar Sateli, senior manager of AI and analytics at PwC, notes: “Ethical AI is not an add-on to your data science practice; it’s not a luxury to bolt onto your operations; it’s something that should be there from day one.”

For Reid Blackman, Founder and CEO of Virtue Consultants, it is also something that starts at the top. “One of the reasons we don’t see enough AI ethics in practice is a lack of buy-in from senior leadership,” he says. Ultimately, AI ethics should be “woven through how you think about financial incentives for employees, how you think about roles and responsibilities,” he adds.

For many, new approaches are needed to manage AI ethical risk. “We can’t ignore the fact that models make mistakes, and we need to be thoughtful and accountable about this,” notes Google’s Tulsi Doshi. “But we can also do a lot to prevent potential harms if we are careful about the metrics we develop and are really intentional about slicing those metrics in different ways, developing a series of metrics to measure different types of outcomes.” She warns against over-relying on interpretability or transparency alone: “I don’t think any of these by itself is the way to solve AI ethics issues, just as a single metric doesn’t work… these things work in concert together.”
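Doshi’s point about slicing metrics is easy to make concrete. The following sketch (my illustration, not code from the talk; the `region` attribute and the logged records are hypothetical) shows how an aggregate accuracy number can look healthy while one slice is badly underserved:

```python
from collections import defaultdict

def sliced_accuracy(records, slice_key):
    """Accuracy broken out by a slice attribute, plus the overall figure.
    An aggregate metric can look fine while one slice performs poorly."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        group = r[slice_key]
        totals[group] += 1
        hits[group] += int(r["pred"] == r["label"])
    overall = sum(hits.values()) / sum(totals.values())
    return overall, {g: hits[g] / totals[g] for g in totals}

# Hypothetical prediction log; "region" is the slice attribute.
log = (
    [{"region": "US", "pred": 1, "label": 1}] * 90
    + [{"region": "US", "pred": 1, "label": 0}] * 10
    + [{"region": "IN", "pred": 0, "label": 1}] * 6
    + [{"region": "IN", "pred": 1, "label": 1}] * 4
)
overall, by_region = sliced_accuracy(log, "region")
print(overall)    # healthy-looking aggregate (~0.85)
print(by_region)  # the IN slice tells a very different story (0.4)
```

The same slicing applies to any metric: false-positive rate, calibration error, and so on, across whichever attributes matter for the use case.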

The data-centric AI revolution increases the need for full-stack observability

Diego Oppenheimer, executive vice president at DataRobot, notes in the “Preparing Yourself for a Data-Centric World” panel that the worlds of citizen data scientists and specialized data science teams have some things in common. “Practices are changing, but the part that’s consistent – and that’s interesting – is that as use cases multiply and more people get involved in building machine learning models and applying ML to those use cases, security, scale, governance, and understanding what’s going on – observability across the whole stack – becomes even more important as you broaden out. It’s just a bad thing if you don’t know what’s going on,” he notes.

Michael Del Balso, CEO and co-founder of Tecton, also notes the importance of visibility across the ML lifecycle. “Teams that build really high-quality ML applications operate well across the whole ML flywheel,” he explains. “It’s not just the training phase, it’s not just the serving phase – they’re also thinking about how do I get my data back from my application into my training set? They excel at every part of this cycle and… iterate much faster.”
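The flywheel Del Balso describes hinges on closing the loop from serving back to training. As a toy sketch (my own illustration; the class and method names are invented for the example), a prediction store can log each prediction under a request id, attach the ground-truth outcome when it later arrives, and emit only the completed pairs as new training examples:

```python
from datetime import datetime, timezone

class PredictionLog:
    """Toy feedback store: log each prediction with an id, attach the
    ground-truth outcome when it arrives, and yield training examples."""

    def __init__(self):
        self.rows = {}

    def log_prediction(self, request_id, features, prediction):
        self.rows[request_id] = {
            "features": features,
            "prediction": prediction,
            "label": None,  # unknown until the outcome comes back
            "ts": datetime.now(timezone.utc).isoformat(),
        }

    def log_outcome(self, request_id, label):
        if request_id in self.rows:
            self.rows[request_id]["label"] = label

    def training_examples(self):
        # Only rows whose outcome has arrived are usable for retraining.
        return [
            (r["features"], r["label"])
            for r in self.rows.values()
            if r["label"] is not None
        ]

store = PredictionLog()
store.log_prediction("r1", {"x": 1.0}, 1)
store.log_prediction("r2", {"x": 2.0}, 0)
store.log_outcome("r1", 1)  # outcome for r2 has not arrived yet
print(store.training_examples())  # only the completed r1 pair
```

Production systems do this with streams and feature stores rather than an in-memory dict, but the join of predictions to delayed outcomes is the essential step.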

The machine learning infrastructure space is maturing

Many speakers marvel at how far the sector has come in a short period of time. As Josh Baer, product lead for the machine learning platform at Spotify, notes: “When we started, there weren’t many solutions out there to meet our needs, and we had our own build-versus-buy decision, so we had to build a lot of our core components ourselves.”

Anthony Goldbloom, CEO and co-founder of Kaggle, agrees: “Some of the tooling – including Arize – is actually maturing to help with deploying models and making sure they do what they’re supposed to do.”

🔮 The future: multimodal machine learning

In the panel “The Place of Embeddings and Visualization in Modern ML Systems,” Leland McInnes, creator of UMAP and a researcher at the Tutte Institute for Mathematics and Computing, talks about what excites him about the future. On the more theoretical side, McInnes notes that “there is a lot of work on sheaves and cellular sheaves, which is a very mathematical but surprisingly useful thing,” with “many approaches to graph neural networks” starting to show up in the literature.

Within UMAP specifically, McInnes says that parametric UMAP deserves wider use and more attention. He is also “very interested in how to align different UMAP models. There is aligned UMAP, which can align data so you can map relationships from one dataset to another, but what if I start with just two arbitrary datasets – say, word vectors from French and word vectors from English, and no dictionary? How do you create a UMAP layout that ties them together so I can embed both? There are ways to do this,” he says, with “Gromov-Wasserstein distance” as a key search term for those interested in learning more.

Kaggle’s Goldbloom is equally excited about this space. “Some of the possibilities around multimodal ML are an area of excitement,” he says, especially “multimodal learning. Say you’re trying to do speech recognition where you’re listening to what is being said – what if you could feed in camera input at the same time to read lips?”


With global corporate investment in AI systems expected to grow to $200 billion, this is an exciting period for the industry. It is also a critical time for ML teams to learn best practices from peers and invest in their ML platforms – including ML observability with ML performance tracing – to navigate a world where model issues directly affect business results.
