What You’ll Do
If you desire to be part of something special, to be part of a winning team, to be part of a fun team (winning is fun), we are looking to hire a Sr. Data Engineer in Pune, India. At Eaton, making our work exciting, engaging, and meaningful; ensuring safety, health, and wellness; and being a model of inclusion & diversity are already embedded in who we are: they are in our values, part of our vision, and our clearly defined aspirational goals. This is an exciting role:
- The Senior Data Engineer is a pivotal role within the Finance Data Hub and the Enterprise Data platform, focused on establishing standards, building frameworks, and elevating engineering capabilities across the data organization.
- Rather than solely delivering features, this position defines and codifies principles and practices for building, testing, deploying, monitoring, and governing data pipelines in a modern DataOps and data mesh environment.
- The impact extends beyond functional pipelines, creating a reusable foundation that empowers every data team member to deliver high-quality data products efficiently and confidently.
- The ideal candidate brings deep technical expertise, a platform engineering mindset, and strong leadership to drive adoption of new standards, with a forward-looking approach to GenAI-augmented data engineering.
- This role is directly accountable for establishing, documenting, and driving adoption of six foundational engineering frameworks that will define the data engineering operating model for the Finance Data Hub, codifying leading practices across the full data lifecycle:
- DataOps framework for CI/CD-driven pipeline deployment, automated unit testing, and environment promotion
- Data Quality framework with data contract testing, schema validation, and anomaly detection
- Data Observability standard for end-to-end lineage tracking, freshness monitoring, and incident response
- Data Modeling standard aligned to medallion or dimensional patterns with naming conventions and style guides
- Data Governance and Access Control framework covering classification, masking, and role-based access
- Pipeline Design Pattern library of reusable, idempotent, and testable ELT/ETL templates (a minimal sketch follows this list)
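Here is that minimal sketch: one way a reusable, idempotent ELT template in such a library might look. The `EltStep` class, its hook names, and the watermark-keyed overwrite strategy are illustrative assumptions, not an established Eaton pattern.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable

Row = dict[str, Any]

@dataclass
class EltStep:
    """One reusable ELT step. Idempotent because run() fully
    overwrites the target partition keyed by the watermark."""
    name: str
    extract: Callable[[str], Iterable[Row]]          # watermark -> raw rows
    transform: Callable[[Iterable[Row]], list[Row]]  # pure, unit-testable
    load: Callable[[str, list[Row]], None]           # overwrite one partition

    def run(self, watermark: str) -> int:
        rows = self.transform(self.extract(watermark))
        self.load(watermark, rows)  # replace, never append, so reruns are safe
        return len(rows)
```

Because each run overwrites the partition keyed by its watermark, a failed load can simply be re-run, and each hook can be unit-tested in isolation.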
Qualifications
Requirements:
- BE/M.Tech in Electrical/Electronics/Computer Science
- 10+ years of relevant experience
- End-to-end delivery of production data pipelines at enterprise scale: ingestion, transformation, orchestration, and serving layers. Strong SQL and Python proficiency
- Experience with both batch and streaming paradigms
- Technical leadership in a cross-functional environment — setting standards, mentoring engineers, conducting design reviews, and influencing engineering direction without necessarily holding a direct management title
- Deep hands-on Snowflake expertise: data sharing, zero-copy cloning, dynamic tables, streams and tasks, RBAC design, row access policies, dynamic masking, warehouse sizing, and query optimization. Snowflake certification is a strong plus
- Proficient with GitHub for version control, pull request workflows, and GitHub Actions for CI/CD automation. Experience designing branching strategies and automated test/deploy pipelines for data workloads
- Hands-on experience building transformation assets (models, tests, macros, packages, sources, and exposures) in dbt or similar tooling. Coalesce experience or familiarity is an advantage. Understanding of DAG-based transformation orchestration
- Has built or adopted reusable automated unit testing frameworks for data pipelines or transformation models. Understands test pyramid concepts in a data context: unit, integration, and contract tests (see the first sketch after this list)
- Has designed and implemented RLS frameworks at the platform layer (e.g., Snowflake row access policies). Understands the intersection of data governance policy and platform enforcement
- Has implemented data quality monitoring frameworks and observability instrumentation in production environments
- Strong grasp of medallion architecture (Bronze/Silver/Gold), dimensional modeling (star schema, SCD types), and modern lakehouse/warehouse modeling patterns. Has published or enforced modeling standards (see the second sketch after this list)
- Has led or meaningfully contributed to a data engineering modernization initiative — re-platforming, cycle time reduction, or adoption of modern tooling. Can articulate before/after outcomes with metrics
- Has experimented with or productionized GenAI tools to enhance data engineering workflows — AI code assistants, LLM-powered documentation, natural language querying, or AI-driven anomaly analysis.
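Two brief sketches to illustrate the expectations above; both are hypothetical examples, not prescribed implementations. First, a pytest-style unit test at the base of the test pyramid, exercising pure transformation logic against in-memory rows (`normalize_currency` is invented for illustration):

```python
def normalize_currency(rows):
    """Uppercase ISO currency codes and drop rows without an amount."""
    return [
        {**r, "currency": r["currency"].upper()}
        for r in rows
        if r.get("amount") is not None
    ]

def test_normalize_currency_uppercases_and_filters():
    rows = [
        {"amount": 10.0, "currency": "usd"},
        {"amount": None, "currency": "eur"},
    ]
    assert normalize_currency(rows) == [{"amount": 10.0, "currency": "USD"}]
```

Second, a Type 2 slowly changing dimension expressed as a small merge routine; the column names (`key`, `attrs`, `valid_from`, `valid_to`, `is_current`) are illustrative:

```python
from datetime import date

def scd2_merge(dim_rows, incoming, today=None):
    """Type 2 SCD merge on in-memory dicts: expire the current version
    of a changed record and append the new version."""
    today = today or date.today().isoformat()
    current = {r["key"]: r for r in dim_rows if r["is_current"]}
    for new in incoming:
        old = current.get(new["key"])
        if old and old["attrs"] == new["attrs"]:
            continue                      # unchanged: keep the current row
        if old:                           # changed: close out the old version
            old["valid_to"] = today
            old["is_current"] = False
        dim_rows.append({
            "key": new["key"], "attrs": new["attrs"],
            "valid_from": today, "valid_to": None, "is_current": True,
        })
    return dim_rows
```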
Skills
- Framework Authorship & Adoption Leadership
Design, document, and version-control all six engineering frameworks in a central standards repository (GitHub), ensuring they are discoverable, living documents with clear change governance.
Conduct framework enablement sessions, workshops, and pair-programming to drive active adoption — not just publication — across the engineering team.
Define conformance criteria and lightweight review checkpoints so that new pipeline work is assessed against framework standards before promotion to production.
Act as the technical authority and tiebreaker on engineering design decisions — establishing consistent patterns while preserving pragmatic flexibility where needed.
- DataOps & CI/CD Pipeline Engineering
Design and implement CI/CD pipelines for data engineering workloads using GitHub Actions or equivalent — covering lint, unit test, schema validation, and environment promotion stages.
Establish automated unit testing patterns — including test coverage standards and coverage reporting.
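As a flavor of the schema-validation stage, a CI gate can be a short script that fails the build when the columns a model actually produced drift from its declared contract; the manifest format, paths, and `EXPECTED` contract below are assumptions for illustration.

```python
import json
import sys

# Declared contract for one model (hypothetical example).
EXPECTED = {"order_id": "NUMBER", "amount": "FLOAT", "currency": "VARCHAR"}

def main(manifest_path: str) -> int:
    """Exit nonzero on drift so the CI job (e.g., a GitHub Actions step) fails."""
    with open(manifest_path) as f:
        actual = json.load(f)  # e.g. {"order_id": "NUMBER", ...}
    missing = EXPECTED.keys() - actual.keys()
    drifted = {c for c in EXPECTED.keys() & actual.keys()
               if EXPECTED[c] != actual[c]}
    if missing or drifted:
        print(f"schema gate failed: missing={missing} drifted={drifted}")
        return 1
    print("schema gate passed")
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1]))
```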
- Data Quality & Observability Engineering
Implement data contract frameworks at ingestion, transformation, and consumption boundaries — defining schemas, SLOs, and acceptable value ranges as code.
Build reusable data quality monitoring templates — parameterizable and composable across data products.
Instrument pipelines with observability metadata: lineage, runtime metrics, freshness timestamps, and row count deltas — surfaced into operational dashboards.
Design and test the incident response workflow for data quality breaches: automated alerting, quarantine patterns, stakeholder notification, and self-healing logic where feasible.
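One way to keep such monitoring templates parameterizable and composable is to express each check as data. A minimal sketch, assuming in-memory rows; the check factory and thresholds are illustrative:

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]

@dataclass
class QualityCheck:
    """A reusable, parameterized data quality check."""
    name: str
    predicate: Callable[[list[Row]], bool]

def null_rate_below(column: str, threshold: float) -> QualityCheck:
    """Factory: build a null-rate check for any column and threshold."""
    def check(rows: list[Row]) -> bool:
        if not rows:
            return True
        nulls = sum(1 for r in rows if r.get(column) is None)
        return nulls / len(rows) < threshold
    return QualityCheck(f"null_rate[{column}]<{threshold}", check)

def run_checks(rows: list[Row], checks: list[QualityCheck]) -> list[str]:
    """Return names of failed checks, e.g. to drive alerting or quarantine."""
    return [c.name for c in checks if not c.predicate(rows)]
```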
- Snowflake Platform & Access Control Engineering
Design and implement scalable RBAC models in Snowflake — covering functional roles, object ownership hierarchies, and data product consumer roles.
Build row-level security (RLS) frameworks using Snowflake row access policies — creating reusable, metadata-driven policy templates that can be applied consistently across Finance data products.
Define and implement dynamic data masking policies aligned to the data classification taxonomy — ensuring sensitive financial data is protected at the platform layer, not just the application layer.
Govern Snowflake resource utilization: warehouse sizing standards, query optimization guidelines, and cost attribution tagging by domain or product.
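The metadata-driven policy templates mentioned above could be rendered from a small spec, as in this sketch that emits Snowflake `CREATE ROW ACCESS POLICY` and `ALTER TABLE ... ADD ROW ACCESS POLICY` DDL; the `governance.entitlements` mapping table and all object names are hypothetical, not a prescribed Eaton design.

```python
# Render Snowflake row access policy DDL from metadata so the same
# template is applied consistently across Finance data products.
POLICY_TEMPLATE = """\
CREATE OR REPLACE ROW ACCESS POLICY {db}.{schema}.{name}
AS ({column} VARCHAR) RETURNS BOOLEAN ->
  EXISTS (
    SELECT 1 FROM governance.entitlements e
    WHERE e.role_name = CURRENT_ROLE()
      AND e.{column} = {column}
  );
"""

def render_policy(spec: dict) -> list[str]:
    """Render the DDL to create and attach one table's row access policy."""
    create = POLICY_TEMPLATE.format(**spec)
    attach = (f"ALTER TABLE {spec['db']}.{spec['schema']}.{spec['table']} "
              f"ADD ROW ACCESS POLICY {spec['db']}.{spec['schema']}.{spec['name']} "
              f"ON ({spec['column']});")
    return [create, attach]

# Example: restrict Finance rows by legal entity.
print("\n".join(render_policy({
    "db": "FIN", "schema": "GOV", "name": "RAP_LEGAL_ENTITY",
    "table": "GL_BALANCES", "column": "LEGAL_ENTITY",
})))
```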
- GenAI-Augmented Data Engineering
Champion the exploration and adoption of GenAI tooling to amplify data engineering productivity — including AI-assisted SQL/Python code generation, automated documentation, and intelligent pipeline debugging.
Prototype and evaluate LLM-powered data engineering assistants: natural language to SQL interfaces, automated data contract generation, and AI-driven anomaly root cause analysis.
Define guardrails and governance standards for GenAI use in data engineering workflows — covering code review requirements, hallucination risk in data contexts, and audit traceability.
Share findings and tooling recommendations with the wider data engineering community through internal demos, documentation, and engineering blog posts.
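One lightweight guardrail of the kind described above is a static screen on LLM-generated SQL before execution, with an audit record of every decision. A naive sketch (a production version would use a real SQL parser; the blocklist and audit format are illustrative):

```python
import hashlib
import json
import time

BLOCKED = {"INSERT", "UPDATE", "DELETE", "MERGE", "DROP",
           "ALTER", "CREATE", "GRANT", "TRUNCATE"}

def screen_generated_sql(sql: str, audit_log: list) -> bool:
    """Allow read-only statements only; log a hash of every candidate
    statement so GenAI usage stays audit-traceable."""
    tokens = sql.upper().split()
    allowed = bool(tokens) and tokens[0] in ("SELECT", "WITH") \
        and not BLOCKED.intersection(tokens)
    audit_log.append(json.dumps({
        "ts": time.time(),
        "sql_sha256": hashlib.sha256(sql.encode()).hexdigest(),
        "allowed": allowed,
    }))
    return allowed

audit: list = []
assert screen_generated_sql("SELECT amount FROM fin.orders", audit)
assert not screen_generated_sql("DROP TABLE fin.orders", audit)
```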
- Modernization & Delivery Velocity
Identify and eliminate sources of engineering friction — legacy patterns, manual deployment steps, inconsistent environments — and replace with automated, standards-driven equivalents.
Measure and report on delivery cycle time improvements attributable to framework adoption: pipeline build time, time to production, defect escape rate, and time to recovery.
Lead or contribute to data engineering modernization initiatives: migrating legacy ETL workloads, re-platforming to Snowflake, and adopting modern orchestration patterns.
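Reporting on those metrics can start from simple deployment event records; the record shape and metric definitions in this sketch are assumptions for illustration.

```python
from statistics import median

def cycle_time_days(deploys):
    """Median days from first commit to production deploy."""
    return median(d["deployed_day"] - d["first_commit_day"] for d in deploys)

def defect_escape_rate(deploys):
    """Share of deploys that later caused a production incident."""
    return sum(d["caused_incident"] for d in deploys) / len(deploys)

deploys = [
    {"first_commit_day": 0, "deployed_day": 4, "caused_incident": False},
    {"first_commit_day": 2, "deployed_day": 9, "caused_incident": True},
]
print(cycle_time_days(deploys), defect_escape_rate(deploys))  # 5.5 0.5
```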