Site Reliability Engineer [Linux]

TDCX

Job Description

#BeMore


Do you aspire to a rewarding career that lets you do more and achieve more? Unleash your full potential at work with TDCX, an award-winning and fast-growing BPO company.

Work with the world’s most loved brands and be with awesome, diverse people. Be home, belong, and start your journey to #BeMore!


Top Reasons to work with TDCX


• Attractive remuneration, great perks, and performance incentives

• Comprehensive medical, insurance, or social security coverage

• World-class workspaces

• Engaging activities and recognition programs

• Strong learning and development plans for your career growth

• Positive culture for you to #BeMore at work

• Easy-to-locate area with direct access to public transport

• Flexible working arrangements

• Be coached and mentored by experts in your field

• Join a global company, winner of hundreds of industry awards


Role Summary


Site Reliability Engineering (SRE) combines software and systems engineering with the art of machine learning to build and run large-scale, massively distributed, and fault-tolerant systems. You will have the opportunity to sharpen your expertise in coding, performance analysis, and large-scale system design while making a tangible impact on the future of TikTok’s Infrastructure services and AML systems.


What is your mission?


• Design, build, and maintain highly available, scalable, and fault-tolerant systems. Collaborate with software engineering teams to ensure applications are designed with reliability and performance in mind.

• Develop and maintain automation procedures to maximize system efficiency, minimize human intervention, and optimize routine tasks.

• Monitor and analyze system performance to identify and address bottlenecks before they impact users. Ensure the infrastructure can handle rapid growth in web traffic and ML data processing.

• Participate in 24/7 on-call rotations (including scheduled shifts and holidays). Practice sustainable on-call response, conduct root-cause analysis, and lead blameless post-mortems to prevent recurrence.

• Define service-level indicators and objectives (SLIs/SLOs) in support of SLAs, implement monitoring tools, and set up automated alerting and metrics to track system health and performance.

• Implement and maintain security best practices and ensure all systems meet regulatory requirements.
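
As an illustration of the SLI/SLO item above, here is a minimal Python sketch of how an availability SLI and the remaining error budget might be computed. The function names, the request counts, and the 99.9% target are assumptions chosen for the example; they do not describe TDCX's or its client's actual tooling.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that were good; treated as 1.0 when there was no traffic."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget still unspent (negative once the budget is overspent)."""
    allowed_failure = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    if allowed_failure <= 0:
        raise ValueError("slo_target must be below 1.0")
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 requests, 500 of them failed.
    sli = availability_sli(good_events=999_500, total_events=1_000_000)
    remaining = error_budget_remaining(sli, slo_target=0.999)
    print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.1%}")
```

In a setup like this, an alert would typically fire when the remaining budget drops quickly or falls below a chosen threshold, rather than on every individual failure.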


Who are we looking for?


Minimum Qualifications:

• Education: Bachelor’s or Master’s degree in Computer Science, Information Technology, Computer Engineering, or a related field.

• Experience: 3+ years of experience as a Site Reliability Engineer, Systems Engineer, or Software Engineer.

• Coding: Proficient in at least one high-level programming language (e.g., Python, Go, C++, or Java) and shell scripting. Strong understanding of data structures and algorithms.

• Systems: Strong understanding of Linux operating systems and open-source technologies, plus a solid understanding of network architecture.

• Databases: Competent knowledge of relational database systems and database modeling.


Preferred Qualifications:


• Experience with containers and container orchestration platforms such as Docker and Kubernetes.

• Proficiency in or exposure to machine learning frameworks such as TensorFlow, PyTorch, MXNet, or PaddlePaddle.

• Hands-on experience with monitoring tools and methodologies (e.g., Prometheus, Grafana).

• Soft Skills: Strategic thinking, exceptional communication, and the ability to collaborate effectively with cross-functional teams in a fast-paced environment.


Who is TDCX?

TDCX (NYSE: TDCX) provides transformative digital CX solutions, enabling world-leading and disruptive brands to acquire new customers, build customer loyalty, and protect their online communities.


TDCX helps clients, including many of the world’s best brands, achieve their customer experience aspirations by harnessing technology, human intelligence, and our global footprint. We serve clients in fintech, gaming, technology, home sharing and travel, digital advertising and social media, streaming, and e-commerce. Our expertise and strong footprint in Asia have made us a trusted partner for clients, particularly high-growth, new economy companies looking to tap the region’s growth potential.


We pride ourselves on discovering and employing the best professionals to join us as we transform the outsourced CX industry. Our commitment to #BeMore for our people, our clients, and our community has led to us winning hundreds of industry awards, including being one of the best companies to work for in Asia.


From our first-rate workspaces, above-industry-average compensation packages, and career opportunities to our workplace perks, find out what else is in store when you embark on a career with TDCX.

