IBM chief scientist advocates for building AI factories



John Thomas, IBM’s leading engineer and chief data scientist, says that for organizations to truly embrace AI, they need to adopt a factory model that automates the model building process as much as possible.

Just as a traditional factory would reliably build physical products at scale and at high speed, an AI factory would allow companies to build and scale reliable AI models quickly.

VentureBeat sat down with Thomas to better understand how an AI factory actually works.

VentureBeat: What are the biggest challenges organizations face with AI today?

John Thomas: There are a few recurring themes that we have encountered. Almost all of the large clients we work with have a data science team. They already have some kind of data science project going on. But many of these projects are just experiments. They don’t make it into production, and even if they do, it takes forever to get things from concept to production.

When we start to look at what is happening and why, there are a number of different causes. Sometimes it’s a mismatch between what the business expects and what the data science team is building. Sometimes it’s the model: it’s great in development, but organizations struggle to get it through the model validation and risk management processes and get it approved for deployment to production. Sometimes it’s about what happens after you go into production. These are all challenges outside of the model building part itself.

Not enough attention has been paid to these different stages of the life cycle. This is what we see over and over again, even with some of the most advanced data science teams. They are very good at using the algorithms, libraries, and frameworks to build the model, but when it comes to deployment, management, monitoring, and alignment to ongoing business impact, that seems problematic.

VentureBeat: How is this resolved?

Thomas: Software development went through this phase a long time ago. App developers only wrote code, and it was hard to get everything into production. You needed a structured approach, and DevOps arrived. It’s the same kind of mindset, but now in the world of AI and machine learning. A physical factory has a set of processes, a set of best practices, and people with certain skills to produce goods on a large scale and at high speed. You need a similar construct. You need people, process, and technology.

If you look at the different stages of the life cycle, the first part, planning and scoping, is a major step. IBM uses design thinking to unravel all aspects of the project in a very structured way. The next step is data mining, and the third step is model building. This is when you start to examine reliability and determine if the data is biased. Any reliability challenges should be addressed as part of the model building stage itself. The next step is validation and deployment, where we define best practices. A validation team, separate from the model development team, should come in to run the validation performance measurements, verify fairness, verify model explanations, produce reports, and ensure that certain criteria or thresholds defined by the company are met. The last step is ongoing monitoring and management. This is where you set up guardrails to check the continued performance of the model. Once you set that up, it’s like a physical factory.
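
As a rough illustration of the validation step described above, the following Python sketch shows a gate that compares a model's measured metrics against company-defined thresholds. It is not IBM's tooling: the metric names, the threshold values, the sample data, and the function names are all hypothetical and chosen only to show the shape of such a check.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """A company-defined limit for one metric (hypothetical structure)."""
    name: str
    limit: float
    higher_is_better: bool = True


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def positive_rate_gap(predictions, groups):
    """Largest gap in positive-prediction rate between groups.

    A deliberately simple fairness proxy, assumed here for illustration.
    """
    rates = {}
    for g in set(groups):
        members = [p for p, grp in zip(predictions, groups) if grp == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())


def validate(metrics, thresholds):
    """Return (approved, failures) for the measured metrics."""
    failures = []
    for t in thresholds:
        value = metrics[t.name]
        ok = value >= t.limit if t.higher_is_better else value <= t.limit
        if not ok:
            failures.append(f"{t.name}={value:.2f} violates limit {t.limit:.2f}")
    return (not failures), failures


# Made-up validation data: binary predictions, true labels, group membership.
predictions = [1, 0, 1, 1, 0, 1, 0, 0]
labels = [1, 0, 1, 0, 0, 1, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

metrics = {
    "accuracy": accuracy(predictions, labels),
    "positive_rate_gap": positive_rate_gap(predictions, groups),
}
# Hypothetical company thresholds.
thresholds = [
    Threshold("accuracy", 0.70),
    Threshold("positive_rate_gap", 0.20, higher_is_better=False),
]

approved, failures = validate(metrics, thresholds)
print("approved:", approved)
for failure in failures:
    print("  ", failure)
```

In this made-up case the model clears the accuracy threshold but fails the fairness check, so the gate does not approve it. A separate validation team could run a check of this kind and attach the failure list to its report.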

VentureBeat: Who is responsible for building this AI factory?

Thomas: It’s not usually the data science team because they don’t want to be in the middle of it all. What we have seen is that it is the stakeholders. Each line of business has its own data science team that takes care of a bunch of models as part of a star structure. The person who cares about consistency and scale across these lines of business is the one who advocates for the creation of a factory. They will involve people from different departments in the factory. IBM helps them set up the factory.

It’s not like everything has to go through this. This is not what we are saying. We say departments have the freedom to innovate, but they follow the same guidelines. They go through the same design process to define the scope and create the action plan. They follow the same governance model. They have complete freedom in the algorithms and frameworks they use.

VentureBeat: Where do machine learning operations (MLOps) and DevOps fit within this factory?

Thomas: I didn’t use the term MLOps because there is so much more to it beyond MLOps: understanding reliability, bias, fairness, explanations, and so on. The very nature of AI and ML is that it is a probabilistic, not a deterministic, paradigm. It’s not something a typical application development paradigm has to deal with.

VentureBeat: Should the AI factory and the software development factory merge?

Thomas: At a very high level, there are similar constructs, but there are some unique challenges in the world of AI. I don’t think AI factories and software development factories will all become one thing. There will be similar constructs and similar paradigms, but unique challenges must be addressed in unique ways.

VentureBeat: A factory involves automation. How will the data engineering process be automated?

Thomas: I don’t think we’re about to automate everything. We want to automate manual, tedious and boring tasks as much as possible. If you are working with a very large data set with hundreds or thousands of features, this is quite boring, manual, and labor intensive work. You want to focus on automation as much as possible. Creating a pipeline for model deployment should be automated, but with a human in the loop. It’s about making sure that domain experts are used in the right way throughout the various stages of the lifecycle while automating some of the more mundane tasks. This is the reality of where we are.
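
To make the idea of an automated deployment pipeline with a human in the loop concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not a description of any IBM product: the step names, the `run_pipeline` function, and the `approve` callback are hypothetical, and the steps are placeholders that do no real work.

```python
def run_pipeline(steps, approve):
    """Run the automated steps in order, then ask a human before deploying.

    steps:   list of (name, callable) pairs; each callable is one automated task.
    approve: callable that receives the log so far and returns True or False.
             In practice this would wait for a domain expert's sign-off.
    """
    log = []
    for name, step in steps:
        step()  # automated, repeatable work
        log.append(f"done: {name}")
    # Human-in-the-loop gate: nothing is deployed without explicit approval.
    if approve(log):
        log.append("deployed")
    else:
        log.append("held for review")
    return log


# Placeholder steps standing in for the tedious, automatable tasks.
steps = [
    ("prepare features", lambda: None),
    ("train model", lambda: None),
    ("run validation checks", lambda: None),
]

# Stubbed reviewers for the example; a real one would be a person.
approved_log = run_pipeline(steps, approve=lambda log: True)
rejected_log = run_pipeline(steps, approve=lambda log: False)

print(approved_log[-1])
print(rejected_log[-1])
```

The design choice shown is that the automated steps never decide on deployment themselves. The final decision is passed to a reviewer, which keeps the domain expert involved while the mundane steps run unattended.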

VentureBeat: We hear about the democratization of AI all the time, where end users are going to build their own little AI frameworks. How does this fit into a factory model?

Thomas: We go through the different steps from the start. Before even a single line of Python code is written, you need the business owner to be part of the scoping and planning stage itself. Often the data science team is chasing data science metrics: “My model is great because look at the precision.” But how this translates to the company’s actual key performance indicator (KPI) is not very clear. Sometimes it does not translate at all. It’s important to understand up front how your model relates to business KPIs before a single line of code is written. You need to make the business part of this lifecycle.

VentureBeat: Many end users are told they don’t need a data science team to build an AI model as part of the democratization argument. Where is the line between the two?

Thomas: There are tools that lower the barrier to entry for sure, but at some point you need the subject matter expert and the data science person. Unless the business people and the data scientists are working hand in hand, you can’t go the last mile. You can’t get something to go into production. The idea that you can just throw data into a magic box to produce AI is not true.

