End To End Machine Learning Life Cycle
The steps needed to produce a good ML model are well known. Going from data collection to building a model and deploying it in an application may seem easy, but there are certain steps we need to follow to build and deploy an actionable ML model.
The horizontal steps are common to every ML solution, while the vertical ones are use case specific; even so, the overall flow can be used as a framework for many different problems. Important steps are often missed simply because we skip over them.
Every machine learning solution starts with a business need; if there is no business need, there is no solution. Nowadays AI, ML, DL and data science have become buzzwords used heavily across industries. Rather than focusing on buzzwords, we should focus on solutions: if a problem can be solved with a rule-based program, we shouldn't choose machine learning for that use case. Below are some scenarios where machine learning or AI is a good fit.
- Automation / augmentation.
- Where hand-coding the logic is difficult or almost impossible.
- An environment where experimentation is facilitated.
- Where decisions have to be scaled.
- Where it is critical for the business to be responsive.
- Where outcomes are previously known from industry experts or reliable sources.
- Where the cost of failure doesn't create business impact or reputational damage.
Steps of ML Life Cycle
Problem Definition
Before we start solving the problem, we should understand it properly. Machine learning is not only about preprocessing, model building and deployment; we should understand the problem first.
Business Understanding
In this step the business problem should be written down in plain English: what is the need, and what would a solution look like? Understand the objectives from a business point of view, then define the success criteria, i.e. the conditions under which the business problem is considered solved. This step determines whether the ML project succeeds or fails.
Data Understanding
After getting a clear picture of the business and the problem statement, we should move towards the data. In this phase we should understand the data touch points in the context of the business process: what data exactly do we need to build the solution, and which attributes are likely to matter most based on domain intuition. Then we should think about where the data originates, how it is collected, how it is processed, where it is stored and how it flows downstream. Also check whether labels are present, and understand late-arriving labels so you can define your validation and data selection accordingly.
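As a concrete illustration, here is a minimal sketch (assuming a pandas DataFrame with hypothetical `event_time`, `label_time` and `label` columns) of how label availability and label arrival lag can be checked:

```python
import pandas as pd

# Minimal sketch: check label availability and label arrival lag on a
# hypothetical events table with columns event_time, label_time, label.
events = pd.DataFrame({
    "event_time": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "label_time": pd.to_datetime(["2024-01-08", None, "2024-01-12"]),
    "label": [1, None, 0],
})

missing_labels = events["label"].isna().mean()
label_lag_days = (events["label_time"] - events["event_time"]).dt.days

print(f"Fraction of events without labels: {missing_labels:.0%}")
print(f"Median label arrival lag: {label_lag_days.median()} days")
# A long lag means recent records cannot be used for training or validation yet.
```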
SLA Understanding
SLA stands for service level agreement: the commitments a business makes to its customers. For example, if you work for a bank, the banking app may have to handle 2000 transactions per second, so the ML-based solution must satisfy that need and be responsive enough to meet the target. We should discuss the integration touch points of the model with the business process, and understand the SLA and scalability needs.
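To make the SLA discussion concrete, here is a small sketch that measures single-prediction latency against an assumed latency budget; the model, the random data and the 5 ms budget are illustrative, not part of any real SLA:

```python
import time
import numpy as np
from sklearn.linear_model import LogisticRegression

# Sketch: check p99 single-prediction latency against an assumed SLA budget.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)
model = LogisticRegression(max_iter=1000).fit(X, y)

SLA_BUDGET_MS = 5.0  # hypothetical per-prediction budget
latencies = []
for row in X[:200]:
    start = time.perf_counter()
    model.predict(row.reshape(1, -1))
    latencies.append((time.perf_counter() - start) * 1000)

p99 = float(np.percentile(latencies, 99))
print(f"p99 latency: {p99:.2f} ms (budget {SLA_BUDGET_MS} ms)")
```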
Data Collection
Then comes the data collection phase. Your data may reside in public repositories, on the internet, in databases, in a data lake or data warehouse, or in static files like Excel. You should build a pipeline for the flow of data into your system.
The ultimate goal of the business should be to drive action, and that action may be driven by opinion or by data; both have pros and cons. The data may be spread across disparate enterprise systems or a partner network: operational systems, external sources like social media, unstructured files like PDF, text, image and video, or IoT devices.
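As an illustration only, the sketch below shows the shape such a pipeline might take: pulling data from a static file and an operational database, joining it and landing it in a common format. The file names, database, table and column names are hypothetical placeholders.

```python
import sqlite3
import pandas as pd

# Sketch of a small ingestion pipeline; all paths and names are placeholders.

# Static file (e.g. an Excel/CSV export handed over by a business team).
transactions = pd.read_csv("exports/transactions.csv")

# Operational database (sqlite used here as a stand-in for the real source system).
with sqlite3.connect("operations.db") as conn:
    customers = pd.read_sql("SELECT customer_id, segment FROM customers", conn)

# Join and land the result in a common, analytics-friendly format.
dataset = transactions.merge(customers, on="customer_id", how="left")
dataset.to_parquet("landing/dataset.parquet", index=False)
```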
According to the functional principles of data architecture, you should follow a process like:
- Centralize data assets across the enterprise.
- Govern, enrich and secure data assets.
- Catalog data to make it searchable; enrich it with both technical and business metadata.
- Organize the data into common formats as much as possible.
- Democratize and secure the data; give access to those who need it.
All the data can be stored in a data warehouse or a data lake; each suits different use cases.
The next step is data onboarding, or data ingestion. There are three different ways:
- Batch: High-volume nightly loads or intraday loads. For various reasons the organization may not allow you to connect directly to the operational system; instead, it delivers large amounts of data at regular intervals.
- Real time: This comes in three kinds. Real-time, high-velocity ingestion, where data flows into the system as it is continuously generated and screened. Near real time, where data arrives with a lag of only seconds or minutes. The third category is streaming analytics, as in a credit card fraud detection system: the system collects the data, executes the model, and only after the result is returned can the transaction proceed.
- API: Data can also be collected from API endpoints, for example endpoints exposed by IoT devices, partners or service providers. The collection can be continuous or on-demand push/pull, as in the sketch below.
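Here is a minimal sketch of an on-demand API pull using the Python requests library; the URL, token and JSON layout are assumptions for illustration only:

```python
import requests
import pandas as pd

# Sketch: on-demand pull from a partner or IoT API endpoint.
# The URL, token and response structure are hypothetical.
API_URL = "https://api.example.com/v1/sensor-readings"
headers = {"Authorization": "Bearer <token>"}

response = requests.get(API_URL, headers=headers, params={"since": "2024-01-01"})
response.raise_for_status()

readings = pd.DataFrame(response.json()["readings"])
readings.to_parquet("landing/sensor_readings.parquet", index=False)
```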
Most importantly, data collected this way has horizontal capability rather than vertical capability: once collected, it can serve many different use cases (horizontal), whereas vertically scoped data can be used only for one scenario, albeit with much deeper analytics.
Data Analysis
Torture your data, it will confess everything!
Insights are drawn from the data in this phase. Data can be analyzed for various purposes: descriptive analysis to see the big picture, inferential analysis to understand the distributions, and exploratory data analysis (EDA) to understand the data in depth. This step helps with data cleaning, yields data transformation rules for data engineering, and can indicate potential features to be created. To get more out of it, build a dashboard to present your findings, and make the dashboard representative: the design should differ for different audiences.
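A minimal first-pass EDA in pandas might look like the sketch below; a public scikit-learn dataset stands in for the business data discussed above:

```python
import pandas as pd
from sklearn.datasets import load_diabetes

# Sketch of a first EDA pass; the diabetes dataset is only a stand-in.
data = load_diabetes(as_frame=True)
df = data.frame

print(df.describe())                      # descriptive statistics (big picture)
print(df.isna().mean())                   # missing-value rate per column
print(df.corr()["target"].sort_values())  # correlation with the target,
                                          # a first hint at potential features
```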
Model Training / Validation
This phase is an important step in ML solution design. Using the findings from data analysis, various feature engineering techniques are applied to create features. After feature engineering, feature selection techniques are used to keep the most useful ones; if the feature set is still large, consider feature extraction or dimensionality reduction. Create a baseline model, validate it against test data and inspect the evaluation metrics to see how well it performs. Experiment with other algorithms to find the top models, do hyper-parameter optimization to improve performance, and store the models and selected features. Cross-validate your model to check for over-fitting.
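The sketch below illustrates this flow with scikit-learn: a baseline model, cross-validated scoring, then a small hyper-parameter search. The dataset and parameter grid are placeholders, not recommendations:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# A public dataset stands in for the prepared feature set.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline model with a cross-validated score (a first guard against over-fitting).
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print("baseline CV accuracy:", cross_val_score(baseline, X_train, y_train, cv=5).mean())

# Candidate model with a small, illustrative hyper-parameter grid.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 5]},
    cv=5,
)
search.fit(X_train, y_train)
print("best params:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))
```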
Model Explanation
Your model shouldn't be a black box; it should be explainable. Businesses don't run on model scores, they run on KPIs. Model explanation can be local or global, and it should reveal the feature importance that will feed into the business KPIs.
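One common way to get a global explanation is permutation importance, sketched below with scikit-learn; local explanations could use tools such as SHAP or LIME, which are not shown here:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Sketch of a global explanation: permutation importance measures how much the
# score drops when each feature is shuffled, i.e. how much the model relies on it.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=42
)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=42)

# Print the five most important features.
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[idx]}: {result.importances_mean[idx]:.3f}")
```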
Model Monitoring and Alerting
There is no guarantee that your model will keep working on real-world data forever; after some time it may fail, so we need to keep track of its performance by monitoring the model. The first step is identifying the characteristics of the model that need to be monitored after deployment. Track performance over time against baseline evaluation metrics, identify and alert when model or data drift occurs, and track the distributions of key (or all) model features against their baseline distributions. Finally, when drift is detected, kick off a model recalibration cycle with newly observed data.
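A very simple drift check, sketched below, compares a feature's live distribution against its training baseline with a two-sample Kolmogorov-Smirnov test; the synthetic data stands in for feature logs from production:

```python
import numpy as np
from scipy.stats import ks_2samp

# Sketch of a drift check on one feature. In practice both samples would come
# from your feature store or prediction logs; here they are synthetic.
rng = np.random.default_rng(42)
baseline = rng.normal(loc=0.0, scale=1.0, size=5000)  # training-time distribution
live = rng.normal(loc=0.4, scale=1.0, size=5000)       # shifted production distribution

stat, p_value = ks_2samp(baseline, live)
if p_value < 0.01:
    print(f"Drift detected (KS={stat:.3f}, p={p_value:.1e}) -> trigger recalibration")
else:
    print("No significant drift detected")
```

In practice such a check would run on a schedule for every key feature, with the alert wired into the recalibration pipeline described above.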
Building and monitoring the solution is not sufficient. It is better to also collect data about the solution itself, such as how the system benefits the business, and present that.