Site Reliability Engineering Manager

First American

Job Description

The Role:


An SRE Manager is ultimately responsible for system reliability, developer productivity, and reducing time to market by reducing the technical debt of the services the SRE team supports. We seek managers who are passionate about site reliability to influence and drive the strategic SRE mission.

As a Site Reliability Engineering Manager working on critical services, your mission will be to ensure our services are fast, highly available, scalable, and able to withstand unprecedented increases in load. The Site Reliability Engineering Manager will be at the heart of solving production problems, with a scope that runs from the kernel to the application. The position requires the flexibility to take a holistic approach to troubleshooting and the ability to delve deeply into technical details. The Site Reliability Engineering Manager is co-located with the various application development teams, which ensures they acquire the domain knowledge needed to troubleshoot and repair an outage effectively. The Site Reliability Engineering Manager will build automation tools for system health and production acceptance tests to validate production changes, and will ensure the system is well instrumented and highly fault tolerant.


Key Leadership Responsibilities:


  • Engage, influence, and evangelize SRE practices with development, operational and product groups to align technology service/solution delivery.
  • Drive quality accountability within the organization with well-defined processes, metrics, and goals for process quality. This includes leading effective postmortems and ensuring action items are followed up.
  • Manage the availability, latency, scalability, and efficiency of our applications by instilling reliability engineering into the development life cycle, with a focus on fault-tolerant approaches.
  • Drive capacity planning, performance analysis, instrumentation, and other non-functional systems requirements.
  • Define and report progress on strategic initiatives and project-level tasks to all stakeholders, including senior executives and clients, using communication approaches suited to each audience.
  • Implement metrics-driven processes to ensure service quality targets are met.

Key skills:


  • Expert knowledge in all aspects of designing, developing, and managing large real-time systems.
  • Project and process management
  • Prior successful experience as a systems performance or site/systems reliability engineer.
  • Mastery of fault-tolerant approaches in large-scale distributed environments and high-performance systems.
  • Demonstrated experience working in large, complex systems environments.
  • AWS cloud experience is mandatory.
  • Experience in Infrastructure-as-Code using Terraform.
  • Experience securing AWS workloads and applying AWS security practices is a strong plus.
  • Experience in Site Reliability Engineering (preferred) and in automating repetitive tasks using Python, PowerShell, etc.
  • Experience delivering complex solutions using common languages and tools such as C#, JS, TypeScript, YAML, Terraform, and PHP.
  • Extensive experience with configuring and monitoring via tools such as DataDog, ELK, Splunk, AppDynamics, etc.
  • Experience collaborating across multiple functional and/or technical teams to deliver an Agile-based project.
  • Demonstrated growth mindset and enthusiasm for learning new technologies quickly and applying that knowledge to practical business problems.
  • Ability to communicate with team members and partners to work through technical solutions.
  • Demonstrated knowledge of fundamental cloud security (e.g., Identity and Access Management, firewalls, etc.).

Qualifications, Knowledge, and Experience:


  • The successful candidate will possess an outstanding record of professional experience and will thrive in an environment that demands accountability. He/She must possess significant technology management and product development experience. He/She must also have strong planning, organizational, and communication skills, and be a key driver in helping the team understand the big picture.
  • Proven leader of technology solutions in a high-volume transaction environment.
  • Accomplished leader with 5+ years managing regional and global areas.
  • Have excellent time management, communication, decision-making, presentation, and organizational skills.
  • Maintain excellent written and verbal communications with clients, employees, and management chain, including status reports, project plans, presentations, etc.
  • Ability to lead across functions and motivate a matrix staff.


Our teams thrive in a culture of openness, creativity, leadership, customer-centricity, and people growth. Visit www.firstam.co.in to learn more about the work we do.

