Artificial intelligence (AI) promises to transform almost every business on the planet. That’s why most business leaders are asking how to successfully take AI into production.
Many get stuck trying to decipher which applications are realistic for the business, hold up over time as the business changes, and put the least strain on their teams. But once a model is in production, one of the leading indicators of an AI project’s success is the ongoing model monitoring practices put in place around it.
The best teams use three main strategies for monitoring AI models:
1. Performance Shift Control
Measuring shifts in the performance of AI models requires two layers of metric analysis: health and business statistics. Most machine learning (ML) teams focus solely on model health metrics. These include metrics used during training, such as precision and recall, as well as operational metrics, such as CPU usage, memory, and network I/O. While these statistics are necessary, they are insufficient on their own. To ensure AI models have real-world impact, ML teams also need to monitor trends and fluctuations in the product and business metrics that are directly impacted by AI.
For example, YouTube uses AI to recommend a personalized set of videos to each user based on several factors: watch history, number of sessions, user engagement, and more. And when these models don’t perform well, users spend less time watching videos in the app.
To increase performance visibility, teams need to build a single, unified dashboard that highlights model health metrics next to key product and business statistics. This visibility also helps ML Ops teams effectively resolve issues as they arise.
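The idea behind a unified dashboard can be reduced to a simple pattern: track health and business metrics side by side against a baseline, and alert on whichever one shifts. The sketch below is a minimal, hypothetical illustration (the metric names, baseline values, and 10% tolerance are assumptions for the example, not real YouTube figures):

```python
from dataclasses import dataclass, asdict

@dataclass
class ModelSnapshot:
    # Model health metrics (hypothetical values from recent traffic)
    precision: float
    recall: float
    p99_latency_ms: float
    # Business metrics directly influenced by the model (also hypothetical)
    avg_watch_minutes: float
    sessions_per_user: float

def health_check(snapshot: ModelSnapshot, baseline: ModelSnapshot,
                 tolerance: float = 0.10) -> dict:
    """Flag any metric that has shifted more than `tolerance`
    (10% by default) relative to the baseline period."""
    alerts = {}
    for name, current in asdict(snapshot).items():
        base = asdict(baseline)[name]
        if base and abs(current - base) / abs(base) > tolerance:
            alerts[name] = {"baseline": base, "current": current}
    return alerts

baseline = ModelSnapshot(0.91, 0.88, 120.0, 42.0, 3.1)
today = ModelSnapshot(0.90, 0.87, 125.0, 33.5, 3.0)
# Health metrics look stable, but avg_watch_minutes has dropped ~20%:
# exactly the kind of business-metric regression that health metrics alone miss.
print(health_check(today, baseline))
```

Note that in this scenario, every model health metric is within tolerance; only the business metric trips the alert, which is why monitoring both layers on one dashboard matters.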
2. Outlier Detection
Models can sometimes yield an outcome that is significantly outside the normal range of results – we call this an outlier. Outliers can disrupt business results and often have major negative consequences if they go unnoticed.
For example, Uber uses AI to dynamically determine the price of each ride, including surge pricing. This is based on several factors, such as rider demand and driver availability in an area. Consider a scenario where a concert ends and attendees all request rides at the same time. The spike in demand could lead the model to raise the price of a ride to 100 times the normal fare. Riders will never want to pay 100 times the usual price to book a ride, and an outlier like this can have a significant impact on consumer trust.
Monitoring can help companies weigh the benefits of AI predictions against their need for predictable outcomes. Automated alerts can help ML Ops teams detect outliers in real time, giving them a chance to react before damage occurs. In addition, ML Ops teams should invest in tooling to manually override the model’s output.
In our example above, detecting the outlier in the pricing model can alert the team and help them take corrective action, such as capping the price spike before riders notice. It also helps the ML team collect valuable data for retraining the model to avoid this in the future.
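A basic version of this alert-and-override pattern can be sketched with a z-score test plus a hard cap. This is an illustrative simplification (the threshold, the cap of 5x, and the history values are assumptions, not Uber's actual method):

```python
import statistics

def detect_outlier(value: float, history: list[float],
                   z_threshold: float = 3.0) -> bool:
    """Return True when `value` lies more than `z_threshold` standard
    deviations from the historical mean (a simple z-score test)."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > z_threshold

def safe_multiplier(predicted: float, history: list[float],
                    cap: float = 5.0) -> float:
    """Apply the model's surge multiplier, but alert on outliers and
    cap them (a stand-in for a manual-override guardrail)."""
    if detect_outlier(predicted, history):
        print(f"ALERT: predicted multiplier {predicted} is an outlier")
        return min(predicted, cap)
    return predicted

recent = [1.0, 1.2, 1.1, 1.5, 1.3, 1.0, 1.4, 1.2]  # hypothetical recent multipliers
print(safe_multiplier(1.3, recent))    # within normal range, passes through
print(safe_multiplier(100.0, recent))  # outlier: alert fires, capped at 5.0
```

Real systems would use more robust statistics (rolling windows, seasonality-aware baselines), but the shape is the same: detect, alert, and bound the damage while a human reviews.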
3. Data Drift Tracking
Drift refers to a model’s performance deteriorating over time once it is in production. Because AI models are often trained on a limited set of data, they perform well initially, while real-world production data is still very similar to the training data. But over time, actual production data changes due to a variety of factors, such as user behavior, geography, and time of year.
Consider a conversational AI bot that solves customer support issues. As we launch this bot for different customers, we may find that users request support in very different ways. For example, a user requesting support from a bank may speak more formally, while a user on a retail website may speak more casually. This change in language patterns relative to the training data can cause the bot’s performance to deteriorate over time.
To ensure that models remain effective, the best ML teams track drift in feature distributions, that is, how the embeddings of production data diverge from those of the training data. A major change in distribution indicates that we need to retrain our models to achieve optimal performance. Ideally, data drift should be checked at least every six months, and for high-volume applications it may need to happen every few weeks. Failure to do so may cause significant inaccuracies and impair the overall reliability of the model.
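One common way to quantify distribution shift of a single feature (or embedding dimension) is the Population Stability Index (PSI). The sketch below is a minimal stdlib-only version; the binning scheme, smoothing constant, and the rule-of-thumb threshold of 0.2 are conventional choices, not something specific to any one team's pipeline:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a training-time feature
    distribution (`expected`) and a production one (`actual`).
    Rule of thumb: PSI > 0.2 suggests significant drift."""
    lo = min(expected + actual)
    hi = max(expected + actual)
    width = (hi - lo) / bins or 1.0

    def hist(vals: list[float]) -> list[float]:
        counts = [0] * bins
        for v in vals:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # Smooth empty bins so the logarithm below is always defined
        return [(c + 1e-4) / len(vals) for c in counts]

    e, a = hist(expected), hist(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [0.1 * i for i in range(100)]           # roughly uniform on [0, 10)
prod_ok = [0.1 * i + 0.05 for i in range(100)]  # nearly identical distribution
prod_shifted = [0.2 * i for i in range(100)]    # range has doubled: drift

print(psi(train, prod_ok) < 0.2)       # stable, no retraining signal
print(psi(train, prod_shifted) > 0.2)  # drifted, retraining warranted
```

In practice this check would run per feature (or per embedding dimension) on a schedule, with the six-month-to-few-weeks cadence described above, and a PSI breach would trigger a retraining job.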
A structured approach to success
AI is not a panacea for business transformation, nor a false promise of improvement. Like any other technology, it holds tremendous promise, given the right strategy.
AI cannot simply be developed, deployed, and then left on its own without proper attention. Truly transformative AI implementations take a structured approach that includes careful monitoring, testing, and incremental improvement over time. Companies that don’t have the time or resources to follow this approach will find themselves caught in a never-ending game of catch-up.
Rahul Kayala is chief product manager at Moveworks.