Site Reliability Engineer – Operations

RealPage

Job Summary


Tech Stacks
Python, Linux, Azure, ELK, IIS, CI, Windows Server, AWS, PowerShell

Job Description

Overview

The SRE Ops Engineer reports to the Sr. Director of Reliability Engineering and is responsible for ensuring product stability, operational excellence, and a strong customer experience across critical platforms, with a primary focus on Windows‑based environments. This role partners closely with Engineering, CloudOps, InfoSec, and QA to reduce incidents, improve system reliability, and drive operational rigor through automation, monitoring, and incident management.

Responsibilities

Primary Responsibilities

  • Manage and support Windows‑based production environments, including IIS, Windows Services, Active Directory, and related infrastructure
  • Build, maintain, and enhance monitoring, alerting, and observability frameworks using ELK or equivalent platforms
  • Lead incident response, troubleshooting, and root cause analysis (RCA) for customer‑impacting issues
  • Improve system reliability by reducing critical incidents and driving down Mean Time to Resolution (MTTR)
  • Develop and maintain automation using scripting tools such as PowerShell, Python, or similar technologies
  • Support high‑availability, high‑performance production systems and participate in on‑call rotations
  • Collaborate with cross‑functional teams to ensure platform stability, security, and reliability
  • Contribute to platform upgrades, patching, modernization initiatives, and operational best practices
  • Create and maintain runbooks, operational standards, and documentation

Qualifications

Required Knowledge & Skills

  • 5+ years of experience in Windows Server environments, including IIS and Windows Services
  • 5+ years of experience with monitoring and observability tools (ELK stack or equivalent)
  • Strong experience with incident management, troubleshooting, and root cause analysis
  • Hands‑on experience with automation and scripting (PowerShell, Python, etc.)
  • Working knowledge of Linux systems for basic administration and troubleshooting
  • Strong understanding of system performance, scalability, and operational best practices
  • Experience supporting production systems with high availability requirements
  • Familiarity with cloud platforms (AWS, GCP, Azure) is a plus
  • Exposure to CI/CD tools and DevOps practices
  • Strong communication and collaboration skills, with an ownership mindset
  • Ability to operate effectively in a fast‑paced, production‑focused environment
