Workflow Element Store

  1. APIs and Data Feeds
  2. Databases - SQL
  3. Data Collaboration and Partnerships
  4. Mobile Applications or IoT Applications
  5. Web Scraping
  6. Databases - NoSQL
  7. Flat files
  8. Public Datasets
  9. Experiments (DoE)
  10. Surveys and Questionnaires
  11. Feedback Data
  1. ETL/ELT pipeline
  2. Azure Synapse
  3. GCP Dataflow
  4. AWS Redshift
  5. RDBMS
  6. MS SQL Server
  7. AWS Kinesis
  8. Azure Data Factory (ADF)
  9. PostgreSQL
  10. Oracle DB
  11. AWS Glue
  12. Google Cloud Storage (GCS)
  13. AWS RDS
  14. AWS S3
  15. GCP Data Fusion
  16. Apache Kafka
  17. GCP BigQuery
  18. Azure Stream Analytics
  19. MongoDB
  20. Azure Blob Storage
  21. MySQL
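
The ETL/ELT pipeline element in the list above can be illustrated with a minimal sketch. It assumes a hypothetical flat-file source (`RAW_CSV`) and uses an in-memory SQLite database as a stand-in for any of the relational targets listed; the table name `orders` and its columns are invented for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file content standing in for a real source system.
RAW_CSV = """order_id,amount,currency
1,10.50,usd
2,,usd
3,7.25,eur
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and normalise types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), float(row["amount"]), row["currency"].upper())
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


# In-memory SQLite is an assumption; a real pipeline would target a managed store.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, amount, currency FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

In an ELT variant the raw rows would be loaded first and the transformation would run inside the target store; the three functions stay the same, only their order changes.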
  1. Feature Selection
  2. Augmentation
  3. Dealing with Outliers
  4. Handling Time-Series Data
  5. Textual Feature Extraction
  6. Handling Imbalanced Classes
  7. Time-Based Features
  8. Feature Extraction from Images
  9. Binning / Discretization
  10. AutoEDA libraries
  11. Data Partitioning - Train, Validation, & Test
  12. Handling Noisy Data
  13. Dimensionality Reduction
  14. Auto-Preprocessing libraries
  15. Domain-Specific Feature Engineering
  16. Data Transformations
  17. Data Scaling and Normalization
  18. Handling Missing Data
  19. Annotation
  20. Interaction Features
  21. Polynomial Features
  22. Handling Categorical Data
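
Three of the preprocessing elements listed above (handling missing data, data scaling, and train/validation/test partitioning) can be sketched with the standard library alone. The function names, the 70/15/15 split, and the sample values are assumptions chosen for illustration; they are not prescribed by the list.

```python
import random
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def partition(rows, train=0.7, validation=0.15, seed=0):
    """Shuffle and split rows into train, validation, and test subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


# Hypothetical feature column with two missing entries.
raw = [4.0, None, 10.0, 6.0, None, 8.0, 2.0, 12.0, 14.0, 16.0]
filled = impute_median(raw)
scaled = min_max_scale(filled)
train_set, val_set, test_set = partition(scaled)
print(len(train_set), len(val_set), len(test_set))
```

For brevity the sketch imputes and scales before splitting. In a real pipeline the median and the min/max would normally be computed on the training subset only and then applied to the validation and test subsets, so that no information leaks across the split.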
  1. Transfer Learning
  2. Cross-Validation
  3. Regular Monitoring and Logging
  4. Data Augmentation
  5. Regression Analysis
  6. Network Analytics / Geospatial Analytics
  7. Learning Rate Scheduling
  8. Clustering
  9. Recommendation Engine
  10. Model Comparison
  11. Multiclass Classification Techniques
  12. Performance Visualization
  13. Black-Box Neural Network Models
  14. Binary Classification Techniques
  15. Association Rules
  16. External Validation
  17. Word Embeddings
  18. Ensemble Techniques
  19. AutoML
  20. Regularization Techniques
  21. Early Stopping
  22. Weight Initialization
  23. Forecasting Techniques
  24. Batch Normalization
  25. Hyperparameter Tuning
  26. Batch Size Selection
  27. Model Interpretability
  28. Reinforcement Learning
  29. Natural Language Processing
  30. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  31. Evaluation Metrics
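
Cross-validation, one of the modelling elements listed above, can be sketched as follows. This is a minimal pure-Python illustration, not the scikit-learn implementation: the helper names `k_fold_indices` and `cross_validate` are hypothetical, and a mean predictor scored with mean absolute error stands in for a real model and metric.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def cross_validate(ys, k):
    """Score a mean-predictor baseline; returns one MAE value per fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(ys), k):
        # "Fit" on the training fold: the model is just the training mean.
        prediction = sum(ys[i] for i in train_idx) / len(train_idx)
        # Evaluate on the held-out fold with mean absolute error.
        mae = sum(abs(ys[i] - prediction) for i in val_idx) / len(val_idx)
        scores.append(mae)
    return scores


folds = list(k_fold_indices(10, 3))
fold_scores = cross_validate([3.0, 5.0, 4.0, 6.0, 5.0, 7.0], k=3)
print([len(val) for _, val in folds], fold_scores)
```

The folds here are contiguous and unshuffled. For time-ordered data (see Forecasting Techniques in the list) that ordering matters, and a forward-chaining split is usually preferred over shuffled folds.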
  1. Code repository
  2. Databases
  3. Data preprocessing pipeline models
  4. Data warehouse
  5. Model registry
  1. Model Health Monitoring
  2. Data Drift Monitoring
  3. Prediction Logging
  4. Model Drift
  5. Model Serialization
  6. FastAPI
  7. Performance Metrics
  8. Flask
  9. Cloud Deployment
  10. Bias and Fairness Assessment
  11. Edge Deployment
  12. Containerization
  13. Streamlit
  14. Serverless Computing
  15. Concept Drift Detection
  16. Feedback Collection
  17. Alerting and Notification
  18. Model Versioning
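
Data drift monitoring, listed above, needs some measure of how far live inputs have moved from the training data. The list does not name a method; the sketch below assumes the Population Stability Index (PSI) as one common choice. The function name, the bin count, and the sample data are illustrative assumptions.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if constant.
    width = (hi - lo) / n_bins or 1.0

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp into the edge bins
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical samples: training-time values versus shifted live values.
reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
print(psi_same, psi_shifted)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth an alert. That threshold is a convention rather than something specified here, and it should be tuned per feature.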
ML Workflow Intermediate - Architecture
  • Element belongs to the model
  • Element does not belong to the model
Training Pipeline
Data Collection

Inference API

API Stream

Web crawler

Python

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline
Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference

Inference pickle
Inference Joblib
Streamlit
Inference API
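
The hand-off between the training pipeline and the inference pipeline through a pickle file can be sketched as below. The "model" is a hypothetical mean forecaster stored as a plain dictionary of fitted parameters, and the file name `model.pkl` is an assumption; a real project would serialize its actual estimator.

```python
import os
import pickle
import tempfile


def fit_mean_model(values):
    """Training side: fit a toy forecaster and return its parameters."""
    return {
        "model_type": "mean_forecaster",  # hypothetical identifier
        "version": 1,
        "mean": sum(values) / len(values),
    }


def predict(model, horizon):
    """Inference side: forecast the fitted mean for each future step."""
    return [model["mean"]] * horizon


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model.pkl")

    # Training pipeline: fit and serialize the model artifact.
    with open(model_path, "wb") as fh:
        pickle.dump(fit_mean_model([10.0, 12.0, 14.0]), fh)

    # Inference pipeline: load the artifact and produce a forecast.
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)

forecast = predict(restored, horizon=3)
print(forecast)
```

Joblib offers a similar file-based interface through `joblib.dump` and `joblib.load` and is often preferred for objects that hold large NumPy arrays. In either case, only load artifacts from a trusted source, since unpickling can execute arbitrary code. A Streamlit page or an inference API would wrap the load-and-predict half of this sketch.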