HPC System Engineer, System, NSCC


Agency for Science, Technology and Research (A*STAR)


Job Summary


Salary
S$5,300 - S$8,100 / Monthly

Years of Experience
At least 3 years

Tech Stacks
Python, Perl, Linux, UNIX

Job Description

Job Summary
The HPC System Engineer will design, optimize, and maintain HPC system architecture, including compute, interconnect, and storage components. This role involves advanced performance tuning, resource planning, and technology evaluation to ensure scalability, reliability, and security of NSCC's supercomputing infrastructure.

Roles and Responsibilities

System Engineering & Optimization:

  • Evaluate HPC system architecture, including compute, interconnect, and storage components.
  • Collaborate with HPC System Administrators to ensure system reliability and performance.
  • Assist in performance tuning and root-cause analysis for complex system-level issues.
  • Develop and maintain utility tools for system diagnostics and performance profiling.

Resource & Workload Management:

  • Configure and optimize job schedulers (e.g., Slurm, PBS Pro) to maximize resource utilization and throughput.
  • Develop and enforce policies for resource allocation and workload prioritization.

Design & Planning:

  • Assess future computational requirements and contribute to HPC system architecture design.
  • Evaluate emerging technologies (processors, accelerators, interconnects, storage solutions, programming models).

Compliance & Risk Management:

  • Define and implement security policies in collaboration with administrators.
  • Conduct regular security checks and ensure compliance with organizational standards.

Collaboration & Documentation:

  • Work closely with Middleware and Storage Engineers to ensure system compatibility.
  • Document system architecture, configurations, and engineering decisions.

Qualifications:

  • Degree in Computer Science, Engineering, IT, or another relevant area.
  • At least 3 years of experience in managing HPC systems.
  • Highly proficient in UNIX/Linux environments and the command-line interface (CLI).
  • Experience with cluster management software (xCAT, BCM, PHPC, HPCM).
  • Experience with job scheduling and workload management software (Slurm or PBS Pro).
  • Strong knowledge of HPC storage principles and experience in managing parallel file systems (Lustre, GPFS, BeeGFS).
  • Strong knowledge of RDMA-based interconnects (InfiniBand, RoCE).
  • Understanding of basic network protocols such as DHCP, DNS, TFTP, and SMTP.
  • Good knowledge of scripting languages such as Python, Bash, or Perl.
  • Demonstrated ability to analyse complex issues and develop effective solutions.
