Job Summary


Salary
S$6,962 - S$12,567 / month (estimated)

Job Type
Permanent

Seniority
Senior

Years of Experience
At least 8 years

Tech Stacks
ETL
Oracle
Google Cloud
Analytics
Spark Streaming
Apache Beam
Message Queue
Fabric
CI
HDFS
HortonWorks
Cloudera
Snowflake
HBase
Rust
Databricks
Apache
Presto
Azure
Hive
Spark
YARN
NoSQL
Flink
Airflow
Kafka
SQL
PostgreSQL
Scala
Redis
Hadoop
MySQL
Go
Python
AWS
Java

Job Description


  • Design, develop, and automate large-scale, high-performance distributed data processing systems (batch and/or real-time streaming) that meet both functional and non-functional requirements
  • Deliver high-level and detailed designs to ensure that the solution meets business requirements and aligns with the data architecture principles and technology stacks
  • Partner with business domain experts, data scientists, and solution designers to identify relevant data assets, domain data models, and data solutions. Collaborate with product data engineers to coordinate backlog feature development of data pipeline patterns and capabilities
  • Own and lead data engineering projects and data pipeline delivery with reliable, efficient, testable, and maintainable artifacts, including ingesting and processing data from a large number and variety of data sources
  • Build, optimize, and contribute to shared data engineering frameworks, tooling, data products, and standards to improve the productivity and quality of output for data engineers
  • Design and build scalable data APIs to host operational data and data lake assets in a Data Mesh / Data Fabric architecture
  • Drive modern data platform operations using DataOps; ensure data quality and monitor the data systems. Also support the data science MLOps platform
  • Drive and deliver industry-standard DevOps (CI/CD) best practices; automate development and release management
  • Understand data security standards and use data security guidelines and tools to apply and adhere to the required data controls across the data platform, data pipelines, applications, and access endpoints
  • Support and contribute to data engineering product and data pipeline documentation, as well as development guidelines and standards for data pipeline, data model, and layer design

We are committed to a safe and healthy environment for our employees and customers and will require all prospective employees to be fully vaccinated.

Requirements:
  • Minimum of 8 years of experience in data engineering, data lake infrastructure, data warehousing, data analytics tools, or related areas, designing and developing end-to-end scalable data pipelines and data products
  • Experience building and operating large, robust distributed data lakes (multiple PBs) and deploying high-performance, reliable systems with monitoring and logging practices
  • Experience designing and building data products and pipelines using some of the most scalable and resilient open-source big data technologies: Spark, Delta Lake, Kafka, Flink, Airflow, Presto, and related distributed data processing frameworks
  • Build and deploy high-performance, modern data engineering and automation frameworks using programming languages such as Scala or Python, and automate big data workflows such as ingestion, aggregation, and ETL processing
  • Good understanding of data modeling and high-end design, and of data engineering / software engineering best practices, including error handling and logging, system monitoring, fault-tolerant pipelines, data quality, and ensuring a deterministic pipeline with DataOps
  • Excellent experience using ANSI SQL with relational databases such as Postgres, MySQL, and Oracle, and knowledge of advanced SQL for distributed analytics
  • Experience working with telco data warehouses and/or data lake engines such as Databricks SQL, Snowflake, etc.
  • Proficiency in programming languages like Scala, Python, Java, Go, or Rust, or scripting languages like Bash
  • Experience with cloud systems like AWS, Azure, or Google Cloud Platform
    ◦ Cloud data engineering experience in at least one cloud (Azure, AWS, GCP)
    ◦ Experience with Databricks (Cloud Data Lakehouse)
  • Experience with the Hadoop stack: HDFS, YARN, Hive, HBase, Cloudera, Hortonworks
  • Experience with NoSQL databases
  • Experience with event streaming platforms and message queues like Kafka, Pulsar, RabbitMQ, Redis-MQ
    ◦ Event processing systems – Kafka Streams, KSQL, Spark Streaming, Apache Flink, Apache Beam, etc.
  • Build and deploy using CI/CD toolkits