Surgical Robot Fleet — Predictive Maintenance MLOps

Overview

At Intuitive Surgical, I designed and deployed a production MLOps system for predictive maintenance of the da Vinci surgical robot fleet, a globally distributed set of high-value medical devices where unplanned downtime directly affects patient care.

Business Impact

25% reduction in unplanned downtime across the global surgical robot fleet
$700,000+ in annual cost savings from prevented emergency service dispatches and extended component life
Shifted maintenance posture from reactive → condition-based → predictive

MLOps Architecture

Data Pipeline

Real-time telemetry ingestion from robot IoT sensors (vibration, temperature, motor current, error codes)
PySpark-based feature engineering at scale
Delta Lake on Azure for reliable data versioning
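
Illustrative sketch (feature engineering): the kind of per-robot rolling statistics that such a pipeline might derive from raw telemetry. This is a plain-Python stand-in, not the production PySpark code; in PySpark the same idea would typically be expressed with a window partitioned by robot and ordered by time. The function name, the field names, the window size, and the sample readings are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


def rolling_features(readings, window=3):
    """Rolling mean and population std of vibration per robot.

    `readings` is an iterable of (robot_id, timestamp, vibration) tuples.
    The tuple layout and the window size are hypothetical.
    """
    # One bounded buffer per robot; old readings fall out automatically.
    buffers = defaultdict(lambda: deque(maxlen=window))
    features = []
    # Sort by robot, then time, so each buffer sees readings in order.
    for robot_id, ts, vibration in sorted(readings, key=lambda r: (r[0], r[1])):
        buf = buffers[robot_id]
        buf.append(vibration)
        features.append(
            {
                "robot_id": robot_id,
                "ts": ts,
                "vib_mean": mean(buf),
                "vib_std": pstdev(buf),  # 0.0 while only one reading is buffered
            }
        )
    return features


# Made-up sample telemetry, for demonstration only.
sample = [
    ("robot-a", 1, 1.0),
    ("robot-a", 2, 2.0),
    ("robot-a", 3, 3.0),
    ("robot-a", 4, 10.0),
    ("robot-b", 1, 5.0),
]
rows = rolling_features(sample, window=3)
```

A sudden jump in the rolling mean or standard deviation for one robot (as in the fourth "robot-a" reading above) is the sort of signal a downstream failure model could consume.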

Model Layer

Failure prediction models: XGBoost and an LSTM (PyTorch) for temporal patterns
Multi-label classification across failure modes (mechanical, electrical, software)
MLflow experiment tracking, model registry, and artifact versioning
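
Illustrative sketch (multi-label targets): one way to encode the three failure modes as multi-hot label vectors and score predictions per mode. Only the mode names come from the list above; the helper names and the toy labels are hypothetical, and the F1 computation is a generic textbook version rather than the project's evaluation code.

```python
# Failure modes as listed above; the ordering here is an arbitrary choice.
FAILURE_MODES = ("mechanical", "electrical", "software")


def encode(modes):
    """Turn a set of failure-mode names into a multi-hot vector."""
    return [1 if m in modes else 0 for m in FAILURE_MODES]


def per_label_f1(y_true, y_pred):
    """Compute F1 separately for each failure mode.

    Uses F1 = 2*TP / (2*TP + FP + FN), returning 0.0 when the
    denominator is zero (no positives and no predicted positives).
    """
    scores = {}
    for j, mode in enumerate(FAILURE_MODES):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        scores[mode] = (2 * tp / denom) if denom else 0.0
    return scores


# Toy labels, invented for demonstration. A sample may carry several
# modes at once, or none.
y_true = [
    encode({"mechanical"}),
    encode({"mechanical", "electrical"}),
    encode(set()),
    encode({"software"}),
]
y_pred = [
    encode({"mechanical"}),
    encode({"electrical"}),
    encode({"software"}),
    encode({"software"}),
]
scores = per_label_f1(y_true, y_pred)
```

Scoring each mode separately keeps a weak label (for example, rare electrical failures) from being hidden behind a strong aggregate number.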

CI/CD for ML

GitHub Actions pipelines: data validation → training → evaluation → staging → production promotion
Automated retraining triggers on data drift detection
Quality gates: F1, AUC, and calibration thresholds enforced before any model promotion
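
Illustrative sketch (quality gate): how a promotion check on F1, AUC, and calibration could look as a step in such a pipeline. The threshold values, the metric key names, and the function names are assumptions for illustration; the actual thresholds used in production are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateThresholds:
    # All three values are hypothetical placeholders.
    min_f1: float = 0.80
    min_auc: float = 0.90
    max_calibration_error: float = 0.05


def evaluate_gate(metrics, thresholds=GateThresholds()):
    """Return the list of failed checks; an empty list means the gate passes.

    `metrics` is a dict with keys "f1", "auc", and "calibration_error"
    (key names assumed for this sketch).
    """
    failures = []
    if metrics["f1"] < thresholds.min_f1:
        failures.append("f1")
    if metrics["auc"] < thresholds.min_auc:
        failures.append("auc")
    if metrics["calibration_error"] > thresholds.max_calibration_error:
        failures.append("calibration_error")
    return failures


def may_promote(metrics, thresholds=GateThresholds()):
    """True only when every quality check passes."""
    return not evaluate_gate(metrics, thresholds)


# Invented candidate metrics, for demonstration only.
good_candidate = {"f1": 0.86, "auc": 0.93, "calibration_error": 0.03}
bad_candidate = {"f1": 0.86, "auc": 0.88, "calibration_error": 0.07}
```

In a CI job, a non-empty result from `evaluate_gate` would typically translate into a non-zero exit code so that the promotion stage never runs.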

Monitoring

Evidently for feature drift and data quality monitoring
Azure ML Monitor for production model performance tracking
PagerDuty alerting for threshold violations
Weekly monitoring reports auto-generated and distributed to engineering leadership
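
Illustrative sketch (drift-based retraining trigger): a small Population Stability Index (PSI) check between a reference window and a current window of one feature. This is a generic stand-in written in plain Python, not Evidently's API and not necessarily the drift test used in production. The bin count, the 0.25 threshold (a commonly cited rule of thumb for significant shift), and all names are assumptions.

```python
import math


def psi(reference, current, bins=4, eps=1e-6):
    """Population Stability Index of `current` against `reference`.

    Bins are equal-width over the reference range; values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate reference: every value lands in bin 0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def should_retrain(reference, current, threshold=0.25):
    """Hypothetical trigger: request retraining when PSI exceeds the threshold."""
    return psi(reference, current) > threshold


# Synthetic data, for demonstration only: the current window is the
# reference window shifted upward by half its range.
reference_window = [float(i) for i in range(100)]
shifted_window = [v + 50.0 for v in reference_window]
```

With these synthetic windows, `should_retrain(reference_window, shifted_window)` reports drift, while comparing the reference window with itself yields a PSI of zero and no trigger.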

Tech Stack

Python · PySpark · MLflow · Azure ML · Delta Lake · XGBoost · PyTorch/LSTM · Evidently · GitHub Actions · Docker