About Hytech
Hytech is a leading management consulting firm headquartered in Australia and Singapore, specialising in digital transformation for fintech and financial services organisations. We deliver end-to-end consulting services and provide robust middle- and back-office solutions that enable our clients to optimise operations, enhance efficiency, and stay ahead in a fast-evolving digital landscape.
With more than 2,000 professionals worldwide, Hytech has a strong and growing international presence, with offices across Australia, Singapore, Malaysia, Taiwan, the Philippines, Thailand, Morocco, Cyprus, Dubai, and beyond.
Responsibilities
(Business Continuity & High Availability Architecture)
- Define, implement, and operate SRE practices, including SLA/SLO/SLI design, availability, connectivity, and disaster recovery strategies
- Lead architecture design and execution for high availability, high concurrency, and large-scale systems (e.g., microservices, service mesh, multi-active/multi-region)
- Drive system observability, security compliance, and cost optimization (e.g., cost allocation and governance)
- Design resilient architectures for mission-critical systems with high availability, elasticity, and fault tolerance
(Observability, Monitoring & Reliability Engineering)
- Build observability platforms using tools such as Datadog, Prometheus, OpenTelemetry, logging systems, and alerting platforms (e.g., Flashcat, Nightingale)
- Implement full-stack monitoring across applications, infrastructure, and business metrics to enable precise issue detection
- Establish proactive monitoring systems with alerting, anomaly detection, and automated remediation capabilities
- Lead incident management (P1/P2), including rapid recovery, root cause analysis (RCA), and continuous improvement mechanisms
(Platform Engineering & Efficiency Optimization)
- Plan and implement platform engineering strategies to improve scalability, availability, and performance
- Build standardized platforms for system reliability, observability, and security while optimizing cost efficiency
- Design and optimize CI/CD pipelines (e.g., GitHub Actions, Jenkins, ArgoCD, Helm) to improve delivery speed and quality
- Establish standards for containerization, middleware, and deployment processes, ensuring scalability, reliability, and high availability
- Resolve system bottlenecks through capacity planning, performance tuning, and reliability improvements
(Technology Leadership & Collaboration)
- Collaborate closely with business and engineering teams to embed reliability, observability, scalability, and security into system design
- Lead the definition and implementation of technical standards, security baselines, and quality control mechanisms
- Drive adoption of best practices, tooling standardization, and engineering efficiency improvements
Key Requirements
- 5+ years of experience in SRE, DevOps, Platform Engineering, or related roles
- Proven experience designing and operating high-availability, large-scale systems
- Cloud platforms: AWS (EC2, EKS, IAM, S3, VPC, NLB/ALB, RDS, ElastiCache), or equivalent (Azure/GCP)
- Infrastructure as Code: Terraform / CloudFormation
- CI/CD & automation: Jenkins, GitHub Actions, ArgoCD, CodeBuild, Helm
- Containerization: Docker, Kubernetes (K8s)
- Observability: Metrics, Logs, Traces (e.g., Prometheus, OpenTelemetry, Datadog)
- Strong systems thinking and analytical problem-solving skills
- Excellent cross-functional collaboration and communication skills
- Self-driven, with a strong sense of ownership and a continuous-improvement mindset
(Nice to Have)
- Experience in fintech, payments, or high-security environments
- Experience with high-concurrency, low-latency system design
- Experience with AI-driven operations (AIOps) or automation
- Relevant certifications (e.g., AWS, CKA, CKS)
- Experience with large-scale systems or international project delivery
What We Offer
- Easy access to public transportation (LRT & KTM).
- Transportation allowance.
- Corporate insurance coverage, including dental, optical, and outpatient claims.
- Gym and fitness claims.
- Ongoing training and development opportunities.
- Exposure to exciting projects that support career growth and professional development.