Data Engineer - ETL


Algoworks


Job Summary



Tech Stacks
SQL, Spark, Azure, Git, Databricks, PySpark, CI/CD, Azure Synapse Analytics, Azure Data Factory, ETL, Delta Lake

Job Description

Data Engineer - ETL

Location: India, Noida (Hybrid)

Experience: 6-8 Years


Algoworks

www.algoworks.com


About the Company


We are a global team of engineers, architects, designers, researchers, operators, and innovators who share a passion for achieving client goals. Our engineering services help businesses thrive at the intersection of technology and people. From the latest AI implementations to legacy platform migrations and everything in between, our services span the enterprise technology spectrum. Our world-class experience transformation playbook elevates digital success and increases ROI with a relentless focus on the human experience. Our customer base includes Fortune 500 companies around the globe. We've got the skills and insights, and we're also fun to work with. Our global team spans a diverse cultural spectrum, with a wide range of interests, enabling us to bring personality and depth to every engagement.




Role Overview


We are seeking a Data Engineer - ETL to design, build, and optimize scalable data pipelines using Azure cloud technologies.

This role focuses on developing robust data ingestion and transformation pipelines, implementing Delta Lake-based data architectures, and enabling high-quality curated datasets for downstream analytics and reporting. The ideal candidate will have strong expertise in PySpark, Azure Databricks, and Azure Data Factory, along with a deep understanding of data performance optimization and engineering best practices.


Key Responsibilities


• Pipeline Development

- Build and maintain scalable data pipelines using Azure Databricks and Azure Data Factory.

- Implement ingestion and transformation logic across Bronze (raw) and Silver (cleaned) layers.

- Support batch and incremental data processing patterns.
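Incremental batch ingestion into a Bronze table is often implemented with Databricks' `COPY INTO`, which keeps track of files it has already loaded so reruns are idempotent. A minimal sketch (the table name, storage account, container, and path below are illustrative, not from this posting):

```sql
-- Incrementally load newly arrived files from an ADLS landing zone
-- into a Bronze Delta table. COPY INTO skips files loaded previously.
COPY INTO bronze.events
FROM 'abfss://landing@mystorageaccount.dfs.core.windows.net/events/'
FILEFORMAT = JSON
COPY_OPTIONS ('mergeSchema' = 'true');
```

Scheduling this statement from an Azure Data Factory or Databricks Workflows trigger gives a simple incremental batch pattern without hand-written file bookkeeping.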

• Curated Layer & Data Processing

- Implement hydration, merge, and upsert logic using Delta Lake.

- Ensure curated datasets meet business requirements and data quality standards.

- Handle late-arriving data and incremental updates efficiently.
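Upsert and late-arrival handling of this kind is typically expressed with Delta Lake's `MERGE INTO`. A hedged sketch, with hypothetical table and column names (`silver.customers`, `customer_id`, `updated_at` are illustrative):

```sql
-- Upsert an incremental Bronze extract into a Silver table.
-- The updated_at guard keeps late-arriving, stale records from
-- overwriting fresher data already in the target.
MERGE INTO silver.customers AS tgt
USING bronze.customers_increment AS src
  ON tgt.customer_id = src.customer_id
WHEN MATCHED AND src.updated_at > tgt.updated_at THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  INSERT *;
```

The same statement covers both the initial hydration (every row falls into the `NOT MATCHED` branch) and subsequent incremental merges.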

• Performance & Storage Optimization

- Optimize Delta Lake tables for performance and cost efficiency.

- Select and tune appropriate storage formats (Parquet, Delta).

- Apply partitioning, compaction, and file sizing strategies.

- Tune Spark jobs for large-scale distributed data processing.
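Partitioning and compaction on Delta tables are commonly applied as below; a sketch with illustrative table and column names, assuming a low-cardinality date column is the dominant filter:

```sql
-- Partition by a low-cardinality date column so queries can prune files.
CREATE TABLE silver.events (
  event_id    STRING,
  customer_id STRING,
  event_date  DATE
)
USING DELTA
PARTITIONED BY (event_date);

-- Compact small files and co-locate rows that are frequently filtered
-- together, reducing the number of files each query must scan.
OPTIMIZE silver.events ZORDER BY (customer_id);

-- Reclaim storage from files no longer referenced by the table
-- (subject to the configured retention period, 7 days by default).
VACUUM silver.events;
```

Running `OPTIMIZE` on a schedule after incremental loads is a common way to control the small-file problem that batch upserts tend to create.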

• Downstream Collaboration & Data Enablement

- Collaborate with DWH and BI teams to support downstream data consumption.

- Provide optimized datasets for Synapse and reporting workloads.

- Support data validation, reconciliation, and consistency across Gold layer outputs.

• Engineering Best Practices

- Implement CI/CD practices for data pipelines and workflows.

- Follow coding standards, documentation, and version control practices.

- Support production troubleshooting, monitoring, and performance tuning.


Required Skills & Qualifications


· Bachelor’s degree in Computer Science, Engineering, or a related field.

· 6–8 years of experience in data engineering.

Strong expertise in:

· PySpark and distributed data processing

· Azure Databricks (hands-on development and optimization)

· Azure Data Factory for pipeline orchestration

· Deep knowledge of Delta Lake (merge, upsert, optimization techniques).

· Strong SQL skills for data transformation and validation.

· Experience handling large datasets in distributed environments.

· Strong understanding of storage optimization (Parquet, Delta).

Tools & Practices

· Experience with Git and version control systems.

· Familiarity with CI/CD pipelines for data workflows.

· Understanding of data quality checks and validation techniques.

· Experience working in Agile/Scrum delivery models.


Nice to Have Skills


· Experience supporting Synapse Dedicated SQL Pool.

· Exposure to streaming or near real-time data pipelines.

· Familiarity with data governance or metadata management tools.


Soft Skills & Collaboration


· Strong analytical and problem-solving skills.

· Ability to work independently on complex data pipelines.

· Good communication and collaboration skills.

· Proactive and ownership-driven mindset.


Desired Attributes


· Strong attention to data quality and performance.

· Continuous learning mindset for evolving cloud/data technologies.

· Ability to work in fast-paced, data-intensive environments.


Interview Process

2 to 3 Rounds of Discussion

