Site Reliability Engineer (Junior)

Cloud Mile Inc.

Job Summary


Years of Experience
Information not provided

Tech Stacks
Python, Prometheus, Grafana, ELK, CI, Swarm, Google Cloud, play, Go, Docker, Terraform, AWS

Job Description

We are seeking a highly skilled Site Reliability Engineer (SRE) to join our team and play a critical role in delivering exceptional managed services to our clients. As a key member of our engineering team, you will be responsible for designing, implementing, and maintaining robust and scalable infrastructure solutions on Google Cloud Platform (GCP) and Amazon Web Services (AWS).


Key Responsibilities:

  • Design, build, and maintain highly available, scalable, and performant platform components and shared services.
  • Define, monitor, and report on key Service Level Indicators (SLIs) and Service Level Objectives (SLOs) relevant to platform health and customer experience.
  • Identify and eliminate single points of failure across the infrastructure.
  • Participate in a rotating on-call schedule to provide after-hours support and incident response as needed.
  • Implement and improve monitoring, logging, and alerting systems to gain deep visibility into platform health, resource utilization, and potential issues, including those triggered by customer activity.
  • Automate repetitive operational tasks ("toil") related to platform management, provisioning, scaling, and healing.
  • Develop and maintain Infrastructure as Code (IaC) and CI/CD pipelines to manage the platform infrastructure consistently and reliably.
  • Participate in incident response, troubleshooting, and resolution efforts for platform issues.
  • Collaborate with Customer Success teams to diagnose and resolve complex platform issues that may be related to customer-specific configurations or usage.
  • Contribute to the architectural design and evolution of the platform, focusing on resilience, multi-tenancy best practices, and supportability under varying customer loads.
  • Perform capacity planning to ensure the platform can handle anticipated customer growth and usage patterns.


Qualifications:

  • At least 2 years of experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role supporting production systems.
  • Experience working with cloud platforms (GCP, AWS, Alibaba, Azure).
  • Strong understanding of monitoring, logging, and alerting principles and tools (Prometheus, Grafana, ELK Stack, Datadog).
  • Proficiency in Infrastructure as Code (IaC) tools (Terraform, CloudFormation, Pulumi).
  • Solid scripting and automation skills (Python, Go, Bash).
  • Experience with containerization and orchestration technologies (Docker, Swarm, Kubernetes).
  • Familiarity with CI/CD pipelines and practices.
  • Understanding of networking fundamentals, databases, and distributed systems.
  • Experience participating in on-call rotations.
  • Excellent problem-solving and troubleshooting skills.
  • Strong communication and collaboration skills.
  • A passion for automation and continuous improvement.
  • A proactive approach to problem identification and resolution: don't wait around, grab it and fix it.
  • A learning attitude.
  • Excellent communication skills in English and Bahasa Indonesia.


Preferred Qualifications:

  • Certifications in GCP or AWS.
  • Experience working directly with customer-facing teams.
  • Experience defining and tracking customer-facing SLOs.
  • Experience providing self-service tooling or observability insights to customers.
  • Experience with cloud cost optimization strategies.
