Job Purpose and Impact
As a Senior Data Engineer in Cargill's Data Function, you will design, build and deliver high-performance, data-centric solutions using the comprehensive data capabilities of the Cargill Data Platform. You will build data structures and pipelines to collect, curate and enable data for consumption.
Key Accountabilities
- Collaborate with Business / Application / Process owners and Product teams to define requirements and design data products.
- Participate in the architecture decision-making process.
- Develop data products utilizing cloud-based technologies and ensure they are designed and built to be robust, scalable and sustainable.
- Perform data modeling and prepare data in databases for use in various analytics tools, and configure and develop data pipelines to move and optimize data assets.
- Apply a product mindset and treat data as an asset.
- Provide technical support through all phases of the product life cycle.
- Build prototypes to test new concepts and contribute ideas for reusable frameworks, components and data products.
- Help drive the adoption of new technologies and best practices within the Data Engineering team and be a role model and mentor for data engineers.
- Independently handle complex issues with minimal supervision, while escalating only the most complex issues to appropriate staff.
Qualifications
Minimum Qualifications
- 4+ years of experience in data integration, with working proficiency in SQL and NoSQL databases.
- 4+ years of experience programming in Scala / Python / PySpark / Java, etc.
- 4+ years of experience working with Hadoop or other cloud data platforms (e.g., Snowflake).
- Experience building CI/CD pipelines and Unix scripting.
- Demonstrated ability to quickly learn new and open-source technologies to stay current in data engineering.
- Experience in developing software using agile methodologies such as Scrum/Kanban.
- Bachelor's degree in a related field or equivalent experience.
Preferred Qualifications
- Experience in building batch and streaming pipelines using Sqoop, Kafka, Pulsar and/or Spark.
- Experience in storage using HDFS / AWS S3 / Azure ADLS, etc.
- Experience in orchestration and scheduling using Oozie / Airflow / AWS Glue, etc.
- Experience in data transformations using PySpark / dbt, etc.
- Experience contributing to open-source projects using collaboration tools such as GitHub.