Hi, we have a couple of openings for Observability / Site Reliability Engineers.
Interested candidates can send their profiles by email to [email protected]
Work location: Bangalore (Hybrid)
Key skills:
- Observability / SRE
- Dynatrace Enhanced skills (Required)
- Splunk Enhanced skills (Required)
- IaC
- Microsoft Azure / Google Cloud Platform (GCP)
Detailed JD:
- Partner with application developers and solution architects to ensure services are built for scale and performance.
- Lead the definition of service-level objectives, agreements, and indicators (SLOs, SLAs, and SLIs) for the underlying service in collaboration with Application Development, Product, and Business Owners.
- Design and develop scripts, software, and tools that improve the reliability of production systems; this includes fixing issues, responding to incidents, and taking on-call responsibilities.
- Improve the overall resilience of systems and provide visibility into the health and performance of services across all applications and infrastructure.
- Improve service performance metrics such as latency, page load speed, and ETL, and help proactively identify performance issues across the system.
- Implement monitoring solutions and create dashboards and alerts based on the four golden signals of SRE (latency, traffic, errors, and saturation), providing a single source for determining the overall performance and availability of the supported services.
- Write, update, and use documentation, including runbooks and playbooks.
- Automate work, including infrastructure needs, testing, failover solutions, failure mitigation, and more.
- Use Chaos Engineering to test what you build under real-world conditions.
- Spread information across DevOps and business teams, encouraging a blameless culture focused on workflow visibility and collaboration.
- Perform root-cause analysis of complex problems involving multiple parties, networks, hardware, and software that relate to scaling and performance.
- Serve as technical owner to ensure delivery of SRE initiatives.
- Perform deliverable reviews and coach the team in SRE areas of expertise.
- Provide continuous competitive and best-practices research, leverage industry resources and market trends, and liaise with internal stakeholders.