Job Description

The Company

Dexcom Corporation (NASDAQ: DXCM) is a pioneer and global leader in continuous glucose monitoring (CGM). Dexcom began as a small company with a big dream: To forever change how diabetes is managed. To unlock information and insights that drive better health outcomes. Here we are 25 years later, having pioneered an industry. And we're just getting started. We are broadening our vision beyond diabetes to empower people to take control of health. That means personalized, actionable insights aimed at solving important health challenges. To continue what we've started: Improving human health.

We are driven by thousands of ambitious, passionate people worldwide who are willing to fight like warriors to earn the trust of our customers by listening, serving with integrity, thinking big, and being dependable. We've already changed millions of lives and we're ready to change millions more. Our future ambition is to become a leading consumer health technology company while continuing to develop solutions for serious health conditions. We'll get there by constantly reinventing unique biosensing-technology experiences. Though we've come a long way from our small company days, our dreams are bigger than ever. The opportunity to improve health on a global scale stands before us.

Position Summary: Dexcom is looking for an experienced, software-centric Site Reliability Engineer 2 to join our R&D Platform team. In this role, you will be a key driver in building and evolving the resilient cloud infrastructure that supports life-changing medical technologies. You will bridge the gap between traditional SRE and AI-Native Engineering, scaling distributed systems and implementing agentic workflows that ensure our platforms remain secure, highly available, and 10X-ready.

As a mid-level member of the team, you will focus on systemic reliability through code. You will tackle architectural challenges related to low-latency data streaming and high-concurrency environments. This is an opportunity for a seasoned engineer to move the needle on Agentic SDLC, building self-healing systems that replace manual operational intervention with intelligent, software-driven solutions.

Where You Come In

Agentic Architecture & OPAL: Take ownership of portions of the OPAL (Operations Performed by Agentic Layers) initiative. Design and deploy standardized AI agents and MCP (Model Context Protocol) servers to automate complex SDLC and operational tasks.
Observability Engineering: Design and refine the observability stack to provide deep insights into distributed tracing and system performance, using data-driven analysis to predict and prevent outages.
Cloud & Infrastructure Ownership: Architect and provision software-defined, scalable infrastructure on GCP. You will lead infrastructure projects from design to deployment with minimal supervision.
Orchestration Mastery: Optimize Kubernetes scheduler behavior and resource utilization patterns. Implement advanced traffic management and service mesh configurations to improve microservices orchestration.
Advanced Incident Management: Lead root-cause analysis for complex distributed systems disruptions. Develop long-term programmatic fixes and automated recovery patterns to eliminate entire classes of failure.
Internal Tooling Development: Build internal software services and agentic layers that treat infrastructure as a software product, abstracting away complexity for our development teams.
Mentorship & Review: Actively lead design reviews and facilitate blameless post-mortems. Mentor junior engineers in reliability-first design and modern systems programming practices.

What Makes You Successful

Systems Engineering & Logic: Advanced understanding of data structures, algorithms, and software design patterns. Proven proficiency in a systems language (Go strongly preferred, or Python) with experience writing concurrent, high-performance code.
AI-Native Mindset: Demonstrated experience or deep interest in Agentic SDLC, including the programmatic integration of LLMs (e.g., Gemini) into engineering workflows.
Systems Internals: First-principles understanding of Linux internals (cgroups, namespaces, I/O) and advanced networking (BGP, Load Balancing, HTTP/3, gRPC).
Methodical Architecture: You view infrastructure through the lens of software engineering, prioritizing modularity, testability, and self-healing capabilities.
Analytical Leadership: Ability to articulate complex technical challenges to stakeholders and drive consensus on architectural decisions.

Experience and Education

Education: Bachelor’s degree in Computer Science or a related engineering field.
Experience: 2–5 years of professional experience in SRE, Distributed Systems, or Software Engineering.
Proven Track Record: Experience managing production workloads in Kubernetes and Terraform/Pulumi at scale.
AI/Agentic Skills: Hands-on experience integrating AI agents, building connectors, or automating workflows via LLM APIs is a significant advantage.
Certifications: CKA (Certified Kubernetes Administrator) or Google Cloud Professional Cloud Architect is highly preferred.

To all Staffing and Recruiting Agencies: Our Careers Site is only for individuals seeking a job at Dexcom. Only authorized staffing and recruiting agencies may use this site to submit profiles, applications, or resumes on specific requisitions. Dexcom does not accept unsolicited resumes or applications from agencies. Please do not forward resumes to the Talent Acquisition team, Dexcom employees, or any other company location. Dexcom is not responsible for any fees related to unsolicited resumes/applications.