We are looking for a Cloud/SRE Engineer for the leading cloud-based business process automation provider that delivers flexible, managed software solutions.
Total Experience: 8+ years
Employment Type: Permanent
Notice Period: Immediate to 1 month only
Working Model: Work from home (Only Pune-based candidates)
Technical Skills: Docker, Kubernetes, AWS, Linux, Scripting (Python, shell)
Shift Timings: Rotational Shift as per requirement (Work From Home)
What you'll do:
- Day-to-day management of alerts, checking systems, and escalating issues as necessary
- Be part of a team that provides 24x7 on-call support for critical SaaS events
- Be available in case of emergencies when team members are not available or need help
- Documentation of issues and remediation steps
- Proactively create appropriate monitors in the EKS/K8s ecosystem
- Deploy to EKS/K8s cluster using Terraform and Helm
- Learn and maintain existing infrastructure running under Docker Swarm
- Improve existing infrastructure health by implementing checks and scripts to correct known issues
- Maintenance and development of deployment code
- Automating tasks that are currently executed manually
- Implement/integrate new technologies in our Cloud Infrastructure
- Collaborate with other teams and departments to provide the highest level of support and assistance
- Apply a real customer focus when planning deployments/updates, keeping the customer front of mind and considering the impact on them before making changes
- Work closely on solutions with Support, Customer Success, Migration, and Professional Services teams to provide best-in-class SaaS service to our customers
- Perform RCA and take necessary corrective actions to prevent the recurrence of issues
- Create and assign alert-related actions to the appropriate team after the investigation
- Handle support requests for environment-specific actions
- Identify and provide automation requirements to improve RCA
What you'll need:
- Hands-on experience as an AWS Cloud Engineer
- Working knowledge of EKS/Terraform/Helm
- Working experience with Docker and Docker Swarm (optional)
- Good understanding of AWS IAM roles and policies
- Experience logging and monitoring AWS resources using CloudWatch Logs
- Experience working with Linux environment
- Proficient in Bash and/or Python scripting
- A strong understanding of web technologies such as REST APIs
- Working experience with monitoring solutions such as Grafana and Prometheus
- Excellent oral and written communication skills
- Customer-facing communication skills to effectively explain issues and RCAs to customers
- Experience in Product/Application Support for SaaS-based products
Bonus Skills: Certified AWS Solutions Architect, Working knowledge of Bitbucket Pipelines