Senior Data Engineer


Idexcel



Job Description for Senior Data Engineer

Experience: 4 to 8 years

Required Skills: AWS, Python, PySpark, Databricks

Notice Period: Immediate to 15 days


Databricks (Spark)

· Develop scalable ETL/ELT pipelines using PySpark (RDD/DataFrame APIs), Delta Lake, Auto Loader (cloudFiles), and Structured Streaming.

· Optimize jobs: partitioning, bucketing, Z-Ordering, OPTIMIZE + VACUUM, broadcast joins, AQE, checkpointing.

· Manage Unity Catalog: catalogs/schemas/tables, data lineage, permissions, secrets, tokens, and cluster policies.

· CI/CD for Databricks assets: notebooks, Jobs, Repos, MLflow artifacts.

· Build Medallion Architecture (Bronze/Silver/Gold) with Delta Live Tables (DLT) and expectations for data quality.

· Event-driven ingestion: Kafka/Kinesis → Databricks Structured Streaming.
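The Bronze/Silver/Gold medallion flow described above can be sketched as follows; plain Python lists and dicts stand in for PySpark DataFrames (in Databricks the same shape would use spark.readStream, Delta tables, and DLT expectations), and all field names are illustrative:

```python
# Minimal sketch of a Bronze -> Silver -> Gold medallion flow.
# Plain Python stands in for PySpark DataFrames; field names are hypothetical.

def to_silver(bronze_rows):
    """Clean raw (Bronze) events: drop malformed rows, normalize types."""
    silver = []
    for row in bronze_rows:
        if row.get("user_id") is None or row.get("amount") is None:
            continue  # DLT expectations would quarantine these instead of dropping
        silver.append({"user_id": row["user_id"], "amount": float(row["amount"])})
    return silver

def to_gold(silver_rows):
    """Aggregate cleaned (Silver) rows into a Gold-level summary per user."""
    totals = {}
    for row in silver_rows:
        totals[row["user_id"]] = totals.get(row["user_id"], 0.0) + row["amount"]
    return totals

bronze = [
    {"user_id": "u1", "amount": "10.5"},
    {"user_id": None, "amount": "3.0"},   # malformed: filtered out in Silver
    {"user_id": "u1", "amount": "4.5"},
    {"user_id": "u2", "amount": "7.0"},
]
gold = to_gold(to_silver(bronze))
# gold -> {"u1": 15.0, "u2": 7.0}
```

The same layering applies whether the source is batch files via Auto Loader or a Kafka/Kinesis stream; only the read side changes.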

Snowflake (DW & ELT)

· Model and implement star/snowflake schemas, data marts, and secure views.

· Performance tuning: clustering keys, micro-partitions, result caching, warehouse sizing, query profile analysis.

· Implement Task/Stream patterns for CDC; external tables for data lakes (S3); Snowpipe for near-real-time ingestion.

· Python/Snowpark for transformations and UDFs; SQL best practices (CTEs, window functions).

· Security: Row Level Security (RLS), Column Masking, OAuth/SCIM, network policies, data sharing (reader accounts).
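The Task/Stream CDC pattern above boils down to merging change records into a target table. A pure-Python sketch of that merge semantics (record shapes and names are illustrative, not Snowflake APIs; in Snowflake this would be a MERGE driven by a stream's METADATA$ACTION column):

```python
# Pure-Python sketch of the Stream -> MERGE CDC pattern:
# apply change records, as a CDC stream would expose them, to a target table.

def apply_cdc(target, changes):
    """Merge CDC change rows into `target` (a dict keyed by order_id)."""
    for change in changes:
        key = change["order_id"]
        if change["action"] == "DELETE":
            target.pop(key, None)
        else:  # INSERT or UPDATE: upsert the latest version of the row
            target[key] = {"status": change["status"]}
    return target

target = {"o1": {"status": "new"}}
changes = [
    {"order_id": "o1", "action": "UPDATE", "status": "shipped"},
    {"order_id": "o2", "action": "INSERT", "status": "new"},
    {"order_id": "o1", "action": "DELETE", "status": None},
]
final = apply_cdc(target, changes)
# final -> {"o2": {"status": "new"}}
```

In production the changes would arrive via Snowpipe or a stream, and a scheduled Task would run the merge only when SYSTEM$STREAM_HAS_DATA reports pending rows.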


AWS Data Engineering

· Storage & compute: S3 (lifecycle, encryption, partitioning), EMR (if needed), Lambda, Glue (ETL/Schema registry), Athena, Kinesis (Data Streams/Firehose), RDS/Aurora, Step Functions.

· Orchestration: MWAA/Airflow or Step Functions (error handling, retries, backfills, SLA alerts).

· Infra-as-code: Terraform/CloudFormation for reproducible environments (Databricks workspace, IAM, S3, networking).

· Security/compliance: IAM least privilege, KMS, VPC endpoints/private links, Secrets Manager, CloudTrail/CloudWatch, GuardDuty.

· Observability: CloudWatch metrics/logs, structured logging, Datadog/Prometheus (optional), cost monitoring (tags/budgets).
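The error-handling/retry behavior expected from orchestration can be sketched as a retry-with-exponential-backoff helper. Airflow (`retries`/`retry_delay`) and Step Functions (`Retry` blocks) express this declaratively; the hand-rolled version below is only an illustration:

```python
import time

def run_with_retries(task, max_attempts=3, base_delay=0.01):
    """Run `task`, retrying on failure with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the failure for SLA alerting
            time.sleep(base_delay * 2 ** (attempt - 1))

# A task that fails twice before succeeding, to exercise the retry path.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky)
# result -> "ok" on the third attempt
```

Backfills follow the same idea at a coarser grain: re-run the failed partition or date range, not the whole pipeline.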

Data Quality, Governance & Security

· Implement unit/integration tests for pipelines (e.g., pytest + Great Expectations + DLT expectations).

· Data contracts and schema evolution; monitor SLA/SLO; DQ dashboards (missingness, drift, freshness, completeness).

· PII handling: tokenization/pseudonymization, field-level encryption, adherence to KYB/KYC data-flow requirements; audit trails.

· Cataloging & lineage through Unity Catalog and/or OpenLineage/Purview (if applicable).
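Two of the DQ-dashboard metrics named above, missingness and freshness, can be sketched in plain Python. In practice these would be Great Expectations suites or DLT expectations; the functions and thresholds here are illustrative:

```python
from datetime import datetime, timedelta, timezone

def missingness(rows, column):
    """Fraction of rows where `column` is null or absent."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)

def is_fresh(latest_ts, max_age, now=None):
    """True if the newest record is within the freshness SLA window."""
    now = now or datetime.now(timezone.utc)
    return now - latest_ts <= max_age

rows = [{"email": "a@x.com"}, {"email": None}, {"email": "c@x.com"}, {}]
rate = missingness(rows, "email")  # 2 of 4 rows missing -> 0.5

fresh = is_fresh(
    latest_ts=datetime(2024, 1, 1, tzinfo=timezone.utc),
    max_age=timedelta(hours=1),
    now=datetime(2024, 1, 1, 0, 30, tzinfo=timezone.utc),
)  # newest record is 30 minutes old against a 1-hour SLA -> True
```

A DQ dashboard is essentially these checks computed per table per run, trended over time, with alerts when a threshold (e.g. missingness above 1%) is breached.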

DevOps & CI/CD

· Git workflows (branching, PR reviews), Databricks CLI/Terraform modules for jobs/clusters/UC, Snowflake DevOps (object versioning via schemachange or SQL-based migration).

· Automated testing in pipelines; feature flags, canary releases for data jobs; rollback strategies.
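A canary release for a data job usually means routing a small, stable fraction of partition keys to the new job version. A hash-based sketch (the percentage and key names are illustrative):

```python
import hashlib

def use_canary(partition_key, canary_percent=10):
    """Stable bucket assignment: True -> route to the new (canary) job version.

    Hashing makes the split deterministic across runs, so the same partitions
    always land in the canary until the rollout percentage changes.
    """
    digest = hashlib.sha256(partition_key.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < canary_percent

routing = {k: use_canary(k) for k in ("2024-01-01", "2024-01-02", "2024-01-03")}
```

Rollback is then a config change (canary_percent back to 0) rather than a redeploy, which is what makes the pattern safe for data jobs.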

Client-Facing PoCs & Delivery

· Rapid PoC builds: define clear success metrics, benchmark cost/performance, and produce a transition plan to production.

· Present architectural decisions, trade-offs (Spark vs Snowflake ELT), and cost projections (Databricks DBU, Snowflake credits, storage egress).

· Produce runbooks, operational playbooks, and knowledge transfer documents for client teams.

Required Technical Skillset


· Databricks: PySpark, Delta Lake, Auto Loader, DLT, Jobs, Unity Catalog, MLflow basics.

· Snowflake: SQL, Snowpipe, Tasks/Streams, Snowpark (Python), warehouse sizing, performance tuning, security policies.

· Python: proficiency with core DE libraries (pandas, pyarrow, pytest), robust error handling, typing, and packaging.

· Orchestration: Airflow DAGs (Sensors, Operators, XCom), Step Functions state machines.

· Streaming & CDC: Kafka/Kinesis, Debezium (nice-to-have), CDC patterns to Delta/Snowflake.

· AWS: S3, Glue, Lambda, Kinesis, IAM/KMS, VPC, CloudWatch; Terraform/CloudFormation.

· Data Modeling: 3NF/dimensional modeling, slowly changing dimensions (SCD Type 2), and surrogate vs. natural key trade-offs.

· Security & Compliance: encryption at rest/in transit, tokenization, key rotation, audit logging, governance controls.

· Performance & Cost: Spark job tuning, Snowflake warehouse right-sizing, partitioning/clustering, object storage best practices.
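The SCD Type 2 pattern named in the data-modeling bullet can be sketched as: expire the current row and append a new version when a tracked attribute changes. Column names and the dimension layout below are illustrative (in Databricks or Snowflake this would be a MERGE against the dimension table):

```python
from datetime import date

def scd2_upsert(dim_rows, natural_key, new_attrs, effective):
    """Apply one SCD Type 2 change to a list of dimension rows."""
    for row in dim_rows:
        if row["key"] == natural_key and row["end_date"] is None:
            if row["attrs"] == new_attrs:
                return dim_rows  # no attribute change: nothing to do
            row["end_date"] = effective  # expire the current version
            break
    dim_rows.append({"key": natural_key, "attrs": new_attrs,
                     "start_date": effective, "end_date": None})
    return dim_rows

dim = [{"key": "c1", "attrs": {"city": "Pune"},
        "start_date": date(2023, 1, 1), "end_date": None}]
scd2_upsert(dim, "c1", {"city": "Mumbai"}, date(2024, 6, 1))
# dim now holds the Pune row closed at 2024-06-01 plus a current Mumbai row
```

A surrogate key would normally be assigned to each appended version, keeping the natural key stable across versions of the same entity.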


Nice-to-Have:

· dbt (Snowflake) with tests & exposures; Great Expectations.

· Databricks SQL Warehouses and BI connectivity; Photon engine awareness.

· Lakehouse Federation (UC external locations); Delta Sharing; Iceberg experience.

· Kafka Connect/Debezium, NiFi or MuleSoft (for data integrations).

· Experience in financial services.

· Exposure to ISO/IEC 27001 controls in data platforms.

Education & Certifications

· Bachelor’s/Master’s in CS/IT/EE or related.

· Certifications (plus): Databricks Data Engineer Associate/Professional, Snowflake SnowPro Core/Advanced, AWS Solutions Architect/Big Data/DP.

