

■ Global customer success & care ownership. Managed 300+ tickets per quarter and enforced a formal escalation playbook, driving median first-response time below one hour and recording 0% logo churn across all EMEA enterprise accounts. ■ Drove enterprise adoption. Guided 200+ Mobileye engineers through onboarding, workflow building, and proof-of-value workstreams that delivered a $1.5M, three-year growth agreement. ■ Developed strategic integrations. Authored and maintained a reference repo for a full-fledged SageMaker MLOps pipeline, from experiment tracking to production monitoring, now used by Fortune 500 R&D teams. ■ Partnered with enterprise clients to design end-to-end evaluation pipelines using Comet Opik. Defined taxonomies (hallucination, grounding, refusal/stance, PII/toxicity, compliance), sampling strategies, human-in-the-loop review, and CI quality gates wired into Azure DevOps/GitHub Actions. ■ Implemented Azure OpenAI, Azure AI Search (formerly Cognitive Search), AKS/Functions, Event Hubs, and Cosmos DB/Redis integrations. Delivered secure VNET patterns, key rotation, cost dashboards, and blue/green rollouts for LLM services.
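The CI quality gates described above can be sketched as a simple threshold check over per-category evaluation results. This is a minimal illustration, not the actual Comet Opik client or pipeline config; the category names and thresholds are hypothetical.

```python
def passes_quality_gate(results, thresholds):
    """Decide whether an evaluation run may proceed through CI.

    results:    per-category failure rates, e.g. {"hallucination": 0.02}
    thresholds: maximum allowed failure rate per category (hypothetical values).

    Returns (ok, failures) where `failures` maps each violating category
    to its observed rate, so the CI job can report exactly what blocked it.
    """
    failures = {
        category: rate
        for category, rate in results.items()
        if rate > thresholds.get(category, 0.0)
    }
    return (not failures, failures)
```

In a CI step, a non-empty `failures` dict would fail the build and surface the offending categories in the job log.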
■ Implemented a deep-learning model-serving platform using TorchServe on Azure Kubernetes Service, cutting development time by two months. ■ Fine-tuned TorchServe parameters through load testing of the image-embedder model, increasing throughput by 40% and reducing response time by 60%. ■ Designed a scalable microservice architecture with an automated CI/CD flow, reducing development and deployment effort by 80% for future services. ■ Diagnosed slow response times in the similar-items service using profiling; optimization reduced response times by 60%. ■ Led Elasticsearch tuning efforts, reducing worst-case initial response times from 11 to 3.5 seconds.
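The load-testing loop behind the TorchServe tuning can be sketched as a small harness that fires concurrent requests at an endpoint and reports throughput and latency percentiles. This is an illustrative stand-in (the request counts and the `call` hook are assumptions), not the tooling actually used.

```python
import concurrent.futures
import statistics
import time

def run_load_test(call, num_requests=100, concurrency=8):
    """Fire `call` (a zero-arg function hitting the inference endpoint)
    concurrently and summarize throughput and latency.

    Comparing these numbers across TorchServe settings (worker counts,
    batch size) is how parameter sweeps like the one above are evaluated.
    """
    def timed():
        t0 = time.perf_counter()
        call()
        return time.perf_counter() - t0

    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed(), range(num_requests)))
    elapsed = time.perf_counter() - start

    return {
        "throughput_rps": num_requests / elapsed,
        "p50_s": statistics.median(latencies),
        # last of 19 cut points from n=20 is the 95th percentile
        "p95_s": statistics.quantiles(latencies, n=20)[-1],
    }
```

Swapping `call` for a real HTTP request against the serving endpoint turns this into a quick smoke-level benchmark.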
■ Created and took full ownership of a scalable scraping architecture that keeps pace with e-commerce marketplace page changes without code changes, reducing development and maintenance time by 70%. ■ Built the dynamic scraper's UI in React from the ground up. ■ Improved scraping velocity, increasing pages scraped per hour by 60%. ■ Shipped ~2,000 lines of code, pushed 80+ commits, and resolved 30+ bugs.
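One common way to achieve "page changes without code changes" is to move selectors into data, so a marketplace redesign means editing config rather than scraper logic. The sketch below assumes this pattern with hypothetical marketplace names and selectors; the real architecture may differ.

```python
# Hypothetical per-marketplace extraction rules: a layout change on the
# marketplace side becomes a config edit here, not a code change.
MARKETPLACE_RULES = {
    "example_market": {
        "title": "h1.product-title",
        "price": "span.price-value",
    },
}

def extract_fields(soup, marketplace, rules=MARKETPLACE_RULES):
    """Extract configured fields from a parsed page.

    `soup` is any BeautifulSoup-style object exposing `select_one(selector)`;
    missing elements yield None rather than raising, so partial page
    changes degrade gracefully instead of crashing the scraper.
    """
    out = {}
    for field, selector in rules[marketplace].items():
        node = soup.select_one(selector)
        out[field] = node.get_text(strip=True) if node else None
    return out
```

New marketplaces or redesigned pages are then onboarded by adding an entry to `MARKETPLACE_RULES`.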
■ Built a "model-as-a-service" solution on Azure AKS. Implemented a Kubernetes + Flask stack and GitLab CI pipelines that turn data-science notebooks into production endpoints in under an hour (down from days). ■ Replaced a legacy C++ image-tagging model with a TensorFlow/Keras implementation served on GPUs; precision ↑ 11% and codebase ↓ 80% LOC. Integrated the new API directly into an Erlang backend after learning the language. ■ Created a scalable training pipeline by deploying Apache Airflow on AWS EKS with shared DAGs stored on EFS, eliminating laptop-based runs and cutting experiment turnaround time by ~60%. ■ Documented end-to-end SOPs for researchers and backend engineers, standardizing research-to-deploy hand-offs and minimizing integration friction.
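The core of the "model-as-a-service" pattern is wrapping an arbitrary `predict` callable behind an HTTP endpoint. The production stack used Flask on AKS; the dependency-free sketch below uses the standard library's `http.server` to show the same shape, with the request/response schema being an assumption.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def make_handler(predict):
    """Wrap a predict(features) -> result callable as an HTTP POST handler.

    A notebook's model becomes an endpoint by passing its predict function
    here; the handler decodes JSON input and returns a JSON prediction.
    """
    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            body = json.dumps(
                {"prediction": predict(payload.get("features", []))}
            ).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep request logging quiet

    return Handler
```

In the real setup, containerizing such a wrapper and deploying it via CI is what compresses notebook-to-endpoint time to under an hour.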
■ Upgraded the device-insight API. Re-architected an asyncio + SQLAlchemy Python service that serves device metadata to analytics teams, cutting average query latency by ~45% and doubling the daily query rate. ■ Built a high-volume device-classification pipeline. Worked with data scientists to scale a pandas prototype into a PySpark workflow on AWS EMR, processing 10M+ devices per run and reducing ETL costs by ~30%. ■ Supported real-time forecastability. Ran a Kafka streaming job on Kubernetes that scores tens of thousands of new devices per second, returning results to Armis's core platform with sub-second latency. ■ Scaled for performance. Optimized Spark joins, partitioning, and cluster autoscaling, cutting end-to-end runtime from hours to under 25 minutes. ■ Worked across departments. Acted as the intermediary between data science, backend, and DevOps, ensuring smooth research-to-production hand-offs and writing extensive runbooks for future maintenance.
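The latency gains in an asyncio service of this kind typically come from issuing independent lookups concurrently instead of sequentially, under a bounded concurrency limit to avoid exhausting database connections. The sketch below abstracts the data access behind a `fetch_one` callable (the real service used SQLAlchemy's async session; names and limits here are assumptions).

```python
import asyncio

async def fetch_metadata(device_ids, fetch_one, concurrency=10):
    """Fetch metadata for many devices concurrently.

    `fetch_one` is an async callable taking a device id; the semaphore
    caps in-flight requests so the backing database is not overwhelmed.
    Returns a dict mapping each device id to its metadata.
    """
    sem = asyncio.Semaphore(concurrency)

    async def bounded(device_id):
        async with sem:
            return await fetch_one(device_id)

    results = await asyncio.gather(*(bounded(d) for d in device_ids))
    return dict(zip(device_ids, results))
```

Replacing an N-query sequential loop with this gather-based fan-out is the kind of change that cuts average latency on batch lookups.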