At the Ellison Institute of Technology (EIT), we’re on a mission to translate scientific discovery into real-world impact. We bring together visionary scientists, technologists, policy makers, and entrepreneurs to tackle humanity’s greatest challenges in four transformative areas:
- Health, Medical Science & Generative Biology
- Food Security & Sustainable Agriculture
- Climate Change & Managing CO₂
- Artificial Intelligence & Robotics
This is ambitious work - work that demands curiosity, courage, and a relentless drive to make a difference. At EIT, you’ll join a community built on excellence, innovation, tenacity, trust, and collaboration, where bold ideas become real-world breakthroughs. Together, we push boundaries, embrace complexity, and create solutions that scale ideas from lab to society. Explore more at www.eit.org
Requirements
Our MLOps team
Join our MLOps team to build the cloud and compute foundation that enables scientific breakthroughs. Deliver reliable, secure platforms and self-service guardrails that accelerate experimentation and turn ideas into results—faster, at scale, and with confidence.
Day-to-day, you might:
- Architect, build, and operate our cloud platform, moving infrastructure beyond the initial setup to deliver resilient compute, network, and storage, including full-sized GPU clusters
- Drive the implementation of highly structured, auditable delivery pipelines (CI/CD/GitOps) to enforce automated, repeatable infrastructure changes
- Design and deploy automated governance and security controls using Policy-as-Code (specifically Kyverno and YAML) to ensure strong isolation, protect data, and meet internal audit standards
- Establish the foundational monitoring, alerting, and telemetry framework required for robust operations, defining clear SLOs, and setting the course for future SRE work
- Partner with Research and Data teams to build self-service capabilities that efficiently support diverse workloads, from Python notebooks to distributed clusters
What makes you a great fit:
- Proven experience in platform engineering, with a demonstrable track record of architecting and automating operational processes
- A highly proactive attitude and a passion for introducing and automating operational structure
- Expertise with at least one major cloud provider (OCI, AWS, GCP, or Azure)
- Proficiency with Terraform for declarative, large-scale infrastructure provisioning
- Confidence operating and managing large-scale, resilient Kubernetes clusters
- Proficiency in at least one major language for system-level tools (e.g. Python, Go, or Java) with some scripting experience
It would also be great if you had:
- Familiarity with modern Policy-as-Code tooling
- Experience supporting ML and Data Engineering workloads
Benefits
We offer the following salary and benefits:
Enhanced holiday pay
Pension
Life Assurance
Income Protection
Private Medical Insurance
Hospital Cash Plan
Therapy Services
Perkbox
Electric Car Scheme
--
Why work for EIT:
At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged, and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact!