Job Title: Cloud Engineer
Location: Bangalore
Department: Technology
Reports To: VP of Engineering
Position Overview
Tookitaki, a global leader in AI-driven regulatory compliance solutions, is seeking a Cloud Engineer to design, optimize, and automate scalable cloud infrastructure. With our growing platform and global presence, we need a dynamic engineer to enhance our cloud architecture, ensure cost-effective operations, and drive efficiency in deployment and incident management. This role is critical to advancing our mission of providing reliable, scalable, and high-performance infrastructure that supports the development of our cutting-edge solutions.
Roles & responsibilities:
1. Cloud Architecture Design
- Design scalable and reliable cloud solutions using AWS and GCP.
- Ensure architectures meet performance, security, and scalability requirements.
2. Resource Optimization
- Analyze and optimize cloud resource allocation to minimize costs.
- Implement cost-effective solutions for infrastructure management.
3. CI/CD Automation
- Automate CI/CD pipelines to streamline deployment processes.
- Manage and enhance deployment workflows to reduce lead times.
4. Infrastructure Development
- Develop and maintain core data processing and service layers.
- Ensure seamless integration of services to support scalable operations.
5. Incident Management
- Act as an escalation point for production incidents, ensuring swift resolution.
- Proactively monitor systems and address issues to maintain uptime.
6. Security & Compliance
- Implement and maintain standards for PCI-DSS compliance, ensuring secure data handling across cloud infrastructure.
- Conduct compromise assessments to proactively identify vulnerabilities and respond to potential breaches.
- Lead patch management cycles to ensure system integrity and minimize security risks.
7. Disaster Recovery & Testing
- Plan and execute Disaster Recovery (DR) drills, ensuring business continuity and data protection in alignment with recovery time objectives (RTOs) and recovery point objectives (RPOs).
OKRs
- Design and implement a scalable cloud architecture ensuring 99.9% uptime.
- Reduce cloud operational costs by 20% within six months.
- Automate 90% of deployment processes, reducing deployment time by 50%.
- Ensure average incident resolution time is less than 30 minutes for critical production issues.
- Develop a comprehensive monitoring and alerting system within three months.
Requirements
Education
- Required: Bachelor’s degree in Computer Science, Engineering, or a related field.
- Preferred: Master’s degree in Cloud Computing, Infrastructure Engineering, or equivalent.
Experience
- 5 to 8 years of experience in cloud engineering or a similar role.
- Proven expertise in managing and optimizing cloud platforms (AWS/GCP).
Technical Expertise
- Proficiency in cloud platforms like AWS and GCP.
- Hands-on experience with Infrastructure as Code tools (Terraform, CloudFormation).
- Strong understanding of CI/CD tools (Jenkins, GitLab CI/CD).
- Experience with containerization (Docker) and orchestration (Kubernetes).
- Expertise in monitoring and logging tools (Prometheus, Grafana, ELK stack).
- Proficiency in programming languages like Python, Go, or Java.
Soft Skills
- Strong problem-solving and analytical abilities.
- Excellent communication skills to collaborate with cross-functional teams.
- Ability to work in a fast-paced, dynamic environment.
Key Competencies
- Infrastructure Expertise: Deep understanding of cloud platforms and scalable architecture design.
- Automation and Optimization: Ability to streamline processes and improve cost efficiency.
- Collaboration: Work effectively with engineering and operational teams.
- Ownership: Take full responsibility for the cloud infrastructure and its performance.
- Adaptability: Thrive in dynamic and rapidly changing environments.
Success Metrics
- Maintain 99.9% uptime across all cloud infrastructure.
- Achieve a 20% reduction in cloud costs within six months.
- Complete automation of CI/CD pipelines, reducing deployment lead times by 50%.
- Resolve critical production incidents within 30 minutes on average.
- Implement a robust monitoring system with minimal false alerts.