What You’ll Do
If you desire to be part of something special, to be part of a winning team, to be part of a fun team (winning is fun), we are looking to hire a Sr. Data Engineer in Pune, India. At Eaton, making our work exciting, engaging, and meaningful; ensuring safety, health, and wellness; and being a model of inclusion & diversity are already embedded in who we are: they are in our values, part of our vision, and our clearly defined aspirational goals. This is an exciting role:
- The Senior Data Engineer is a pivotal role within the Finance Data Hub and the Enterprise Data platform, focused on establishing standards, building frameworks, and elevating engineering capabilities across the data organization.
- Rather than solely delivering features, this position defines and codifies principles and practices for building, testing, deploying, monitoring, and governing data pipelines in a modern DataOps and data mesh environment.
- The impact extends beyond functional pipelines, creating a reusable foundation that empowers every data team member to deliver high-quality data products efficiently and confidently.
- The ideal candidate brings deep technical expertise, a platform engineering mindset, and strong leadership to drive adoption of new standards, with a forward-looking approach to GenAI-augmented data engineering.
- This role is directly accountable for establishing, documenting, and driving adoption of six foundational engineering frameworks that will define the data engineering operating model for the Finance Data Hub, codifying leading practices across the full data lifecycle:
- DataOps framework for CI/CD-driven pipeline deployment, automated unit testing, and environment promotion
- Data Quality framework with data contract testing, schema validation, and anomaly detection
- Data Observability standard for end-to-end lineage tracking, freshness monitoring, and incident response
- Data Modeling standard aligned to medallion or dimensional patterns with naming conventions and style guides
- Data Governance and Access Control framework covering classification, masking, and role-based access
- Pipeline Design Pattern library of reusable, idempotent, and testable ELT/ETL templates (a minimal sketch follows this list)
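Here is that minimal sketch: one way a reusable, idempotent ELT template in such a library might look. The `EltStep` class, its hook names, and the watermark-keyed overwrite strategy are illustrative assumptions, not an established Eaton pattern.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable

Row = dict[str, Any]

@dataclass
class EltStep:
    """One reusable ELT step. Idempotent because run() fully
    overwrites the target partition keyed by the watermark."""
    name: str
    extract: Callable[[str], Iterable[Row]]          # watermark -> raw rows
    transform: Callable[[Iterable[Row]], list[Row]]  # pure, unit-testable
    load: Callable[[str, list[Row]], None]           # overwrite one partition

    def run(self, watermark: str) -> int:
        rows = self.transform(self.extract(watermark))
        self.load(watermark, rows)  # replace, never append, so reruns are safe
        return len(rows)
```

Because each run overwrites the partition keyed by its watermark, a failed load can simply be re-run, and each hook can be unit-tested in isolation.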
Qualifications
Requirements:
- BE/M.Tech in Electrical/Electronics/Computer Science
- 10+ years of relevant experience
- End-to-end delivery of production data pipelines at enterprise scale: ingestion, transformation, orchestration, and serving layers. Strong SQL and Python proficiency
- Experience with both batch and streaming paradigms
- Technical leadership in a cross-functional environment — setting standards, mentoring engineers, conducting design reviews, and influencing engineering direction without necessarily holding a direct management title
- Deep hands-on Snowflake expertise: data sharing, zero-copy cloning, dynamic tables, streams and tasks, RBAC design, row access policies, dynamic masking, warehouse sizing, and query optimization. Snowflake certification is a strong plus
- Proficient with GitHub for version control, pull request workflows, and GitHub Actions for CI/CD automation. Experience designing branching strategies and automated test/deploy pipelines for data workloads
- Hands-on experience building transformation assets (models, tests, macros, packages, sources, and exposures) in dbt or similar tooling. Coalesce experience or familiarity is an advantage. Understanding of DAG-based transformation orchestration
- Has built or adopted reusable automated unit testing frameworks for data pipelines or transformation models. Understands test pyramid concepts in a data context: unit, integration, and contract tests (see the first sketch after this list)
- Has designed and implemented RLS frameworks at the platform layer (e.g., Snowflake row access policies). Understands the intersection of data governance policy and platform enforcement
- Has implemented data quality monitoring frameworks and observability instrumentation in production environments
- Strong grasp of medallion architecture (Bronze/Silver/Gold), dimensional modeling (star schema, SCD types), and modern lakehouse/warehouse modeling patterns. Has published or enforced modeling standards (see the second sketch after this list)
- Has led or meaningfully contributed to a data engineering modernization initiative — re-platforming, cycle time reduction, or adoption of modern tooling. Can articulate before/after outcomes with metrics
- Has experimented with or productionized GenAI tools to enhance data engineering workflows — AI code assistants, LLM-powered documentation, natural language querying, or AI-driven anomaly analysis.
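Two brief sketches to illustrate the expectations above; both are hypothetical examples, not prescribed implementations. First, a pytest-style unit test at the base of the test pyramid, exercising pure transformation logic against in-memory rows (`normalize_currency` is invented for illustration):

```python
def normalize_currency(rows):
    """Uppercase ISO currency codes and drop rows without an amount."""
    return [
        {**r, "currency": r["currency"].upper()}
        for r in rows
        if r.get("amount") is not None
    ]

def test_normalize_currency_uppercases_and_filters():
    rows = [
        {"amount": 10.0, "currency": "usd"},
        {"amount": None, "currency": "eur"},
    ]
    assert normalize_currency(rows) == [{"amount": 10.0, "currency": "USD"}]
```

Second, a Type 2 slowly changing dimension expressed as a small merge routine; the column names (`key`, `attrs`, `valid_from`, `valid_to`, `is_current`) are illustrative:

```python
from datetime import date

def scd2_merge(dim_rows, incoming, today=None):
    """Type 2 SCD merge on in-memory dicts: expire the current version
    of a changed record and append the new version."""
    today = today or date.today().isoformat()
    current = {r["key"]: r for r in dim_rows if r["is_current"]}
    for new in incoming:
        old = current.get(new["key"])
        if old and old["attrs"] == new["attrs"]:
            continue                      # unchanged: keep the current row
        if old:                           # changed: close out the old version
            old["valid_to"] = today
            old["is_current"] = False
        dim_rows.append({
            "key": new["key"], "attrs": new["attrs"],
            "valid_from": today, "valid_to": None, "is_current": True,
        })
    return dim_rows
```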
Skills
- Framework Authorship & Adoption Leadership
Design, document, and version-control all six engineering frameworks in a central standards repository (GitHub), ensuring they are discoverable, living documents with clear change governance.
Conduct framework enablement sessions, workshops, and pair-programming to drive active adoption — not just publication — across the engineering team.
Define conformance criteria and lightweight review checkpoints so that new pipeline work is assessed against framework standards before promotion to production.
Act as the technical authority and tiebreaker on engineering design decisions — establishing consistent patterns while preserving pragmatic flexibility where needed.
- DataOps & CI/CD Pipeline Engineering
Design and implement CI/CD pipelines for data engineering workloads using GitHub Actions or equivalent — covering lint, unit test, schema validation, and environment promotion stages.
Establish automated unit testing patterns — including test coverage standards and coverage reporting.
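As a flavor of the schema-validation stage, a CI gate can be a short script that fails the build when the columns a model actually produced drift from its declared contract; the manifest format, paths, and `EXPECTED` contract below are assumptions for illustration.

```python
import json
import sys

# Declared contract for one model (hypothetical example).
EXPECTED = {"order_id": "NUMBER", "amount": "FLOAT", "currency": "VARCHAR"}

def main(manifest_path: str) -> int:
    """Exit nonzero on drift so the CI job (e.g., a GitHub Actions step) fails."""
    with open(manifest_path) as f:
        actual = json.load(f)  # e.g. {"order_id": "NUMBER", ...}
    missing = EXPECTED.keys() - actual.keys()
    drifted = {c for c in EXPECTED.keys() & actual.keys()
               if EXPECTED[c] != actual[c]}
    if missing or drifted:
        print(f"schema gate failed: missing={missing} drifted={drifted}")
        return 1
    print("schema gate passed")
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv[1]))
```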
- Data Quality & Observability Engineering
Implement data contract frameworks at ingestion, transformation, and consumption boundaries — defining schemas, SLOs, and acceptable value ranges as code.
Build reusable data quality monitoring templates — parameterizable and composable across data products.
Instrument pipelines with observability metadata: lineage, runtime metrics, freshness timestamps, and row count deltas — surfaced into operational dashboards.
Design and test the incident response workflow for data quality breaches: automated alerting, quarantine patterns, stakeholder notification, and self-healing logic where feasible.
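One way to keep such monitoring templates parameterizable and composable is to express each check as data. A minimal sketch, assuming in-memory rows; the check factory and thresholds are illustrative:

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]

@dataclass
class QualityCheck:
    """A reusable, parameterized data quality check."""
    name: str
    predicate: Callable[[list[Row]], bool]

def null_rate_below(column: str, threshold: float) -> QualityCheck:
    """Factory: build a null-rate check for any column and threshold."""
    def check(rows: list[Row]) -> bool:
        if not rows:
            return True
        nulls = sum(1 for r in rows if r.get(column) is None)
        return nulls / len(rows) < threshold
    return QualityCheck(f"null_rate[{column}]<{threshold}", check)

def run_checks(rows: list[Row], checks: list[QualityCheck]) -> list[str]:
    """Return names of failed checks, e.g. to drive alerting or quarantine."""
    return [c.name for c in checks if not c.predicate(rows)]
```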
- Snowflake Platform & Access Control Engineering
Design and implement scalable RBAC models in Snowflake — covering functional roles, object ownership hierarchies, and data product consumer roles.
Build row-level security (RLS) frameworks using Snowflake row access policies — creating reusable, metadata-driven policy templates that can be applied consistently across Finance data products.
Define and implement dynamic data masking policies aligned to the data classification taxonomy — ensuring sensitive financial data is protected at the platform layer, not just the application layer.
Govern Snowflake resource utilization: warehouse sizing standards, query optimization guidelines, and cost attribution tagging by domain or product.
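The metadata-driven policy templates mentioned above could be rendered from a small spec, as in this sketch that emits Snowflake `CREATE ROW ACCESS POLICY` and `ALTER TABLE ... ADD ROW ACCESS POLICY` DDL; the `governance.entitlements` mapping table and all object names are hypothetical, not a prescribed Eaton design.

```python
# Render Snowflake row access policy DDL from metadata so the same
# template is applied consistently across Finance data products.
POLICY_TEMPLATE = """\
CREATE OR REPLACE ROW ACCESS POLICY {db}.{schema}.{name}
AS ({column} VARCHAR) RETURNS BOOLEAN ->
  EXISTS (
    SELECT 1 FROM governance.entitlements e
    WHERE e.role_name = CURRENT_ROLE()
      AND e.{column} = {column}
  );
"""

def render_policy(spec: dict) -> list[str]:
    """Render the DDL to create and attach one table's row access policy."""
    create = POLICY_TEMPLATE.format(**spec)
    attach = (f"ALTER TABLE {spec['db']}.{spec['schema']}.{spec['table']} "
              f"ADD ROW ACCESS POLICY {spec['db']}.{spec['schema']}.{spec['name']} "
              f"ON ({spec['column']});")
    return [create, attach]

# Example: restrict Finance rows by legal entity.
print("\n".join(render_policy({
    "db": "FIN", "schema": "GOV", "name": "RAP_LEGAL_ENTITY",
    "table": "GL_BALANCES", "column": "LEGAL_ENTITY",
})))
```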
- GenAI-Augmented Data Engineering
Champion the exploration and adoption of GenAI tooling to amplify data engineering productivity — including AI-assisted SQL/Python code generation, automated documentation, and intelligent pipeline debugging.
Prototype and evaluate LLM-powered data engineering assistants: natural language to SQL interfaces, automated data contract generation, and AI-driven anomaly root cause analysis.
Define guardrails and governance standards for GenAI use in data engineering workflows — covering code review requirements, hallucination risk in data contexts, and audit traceability.
Share findings and tooling recommendations with the wider data engineering community through internal demos, documentation, and engineering blog posts.
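One lightweight guardrail of the kind described above is a static screen on LLM-generated SQL before execution, with an audit record of every decision. A naive sketch (a production version would use a real SQL parser; the blocklist and audit format are illustrative):

```python
import hashlib
import json
import time

BLOCKED = {"INSERT", "UPDATE", "DELETE", "MERGE", "DROP",
           "ALTER", "CREATE", "GRANT", "TRUNCATE"}

def screen_generated_sql(sql: str, audit_log: list) -> bool:
    """Allow read-only statements only; log a hash of every candidate
    statement so GenAI usage stays audit-traceable."""
    tokens = sql.upper().split()
    allowed = bool(tokens) and tokens[0] in ("SELECT", "WITH") \
        and not BLOCKED.intersection(tokens)
    audit_log.append(json.dumps({
        "ts": time.time(),
        "sql_sha256": hashlib.sha256(sql.encode()).hexdigest(),
        "allowed": allowed,
    }))
    return allowed

audit: list = []
assert screen_generated_sql("SELECT amount FROM fin.orders", audit)
assert not screen_generated_sql("DROP TABLE fin.orders", audit)
```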
- Modernization & Delivery Velocity
Identify and eliminate sources of engineering friction — legacy patterns, manual deployment steps, inconsistent environments — and replace with automated, standards-driven equivalents.
Measure and report on delivery cycle time improvements attributable to framework adoption: pipeline build time, time to production, defect escape rate, and time to recovery.
Lead or contribute to data engineering modernization initiatives: migrating legacy ETL workloads, re-platforming to Snowflake, and adopting modern orchestration patterns.
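Reporting on those metrics can start from simple deployment event records; the record shape and metric definitions in this sketch are assumptions for illustration.

```python
from statistics import median

def cycle_time_days(deploys):
    """Median days from first commit to production deploy."""
    return median(d["deployed_day"] - d["first_commit_day"] for d in deploys)

def defect_escape_rate(deploys):
    """Share of deploys that later caused a production incident."""
    return sum(d["caused_incident"] for d in deploys) / len(deploys)

deploys = [
    {"first_commit_day": 0, "deployed_day": 4, "caused_incident": False},
    {"first_commit_day": 2, "deployed_day": 9, "caused_incident": True},
]
print(cycle_time_days(deploys), defect_escape_rate(deploys))  # 5.5 0.5
```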