Overview:
We are seeking a Cloud Specialist (Data-to-AI Modernization) who understands that the path to impactful Artificial Intelligence begins with a modern, high-quality data foundation.
This role is for those who believe "Data is the fuel for AI." You will be responsible for re-architecting legacy environments into cloud-native ecosystems, applying rigorous business logic to data handling, and ensuring that every pipeline you build is "AI-ready." You will guide clients from raw data migration to the deployment of agentic and generative AI solutions on Google Cloud.
Responsibilities
1. Data Modernization & Migration
- Legacy-to-Cloud Transition: Support the re-architecting and migration of legacy SQL/NoSQL databases and ETL processes to Google Cloud (BigQuery, Cloud SQL).
- Architecture Design: Implement scalable Medallion architectures (Bronze/Silver/Gold) and data lakehouses to ensure data is structured for both reporting and AI consumption.
- Pipeline Development: Design, build, and maintain scalable batch and streaming data pipelines (ETL/ELT), leveraging modern orchestration tools and integration patterns to ensure seamless data flow from diverse sources to AI-ready sinks.
2. Engineering Excellence & Business Logic
- Advanced Modeling: Apply established data warehouse modeling approaches (Kimball dimensional modeling, Inmon) to ensure data structures reflect actual business processes, making them intuitive for AI models and analysts alike.
- Automation & Lifecycle Management (DataOps): Implement automated deployment workflows and infrastructure management practices to ensure high reliability, environment consistency, and rapid iteration across the entire data-to-AI lifecycle.
- Optimization: Continuously monitor and tune query performance and storage costs to ensure a lean, efficient data environment.
3. Bridging Data to AI
- AI-Ready Data Preparation: Prepare data for Generative AI use cases, including vectorization, feature engineering, and the creation of Feature Stores for MLOps.
- GenAI Implementation: Assist in deploying AI solutions such as Gemini-powered agents, Vertex AI Search, and RAG (Retrieval-Augmented Generation) workflows.
4. Data Governance & Quality
- Reliability: Develop and implement robust data quality frameworks and validation protocols to monitor data health, ensuring that trustworthy, high-fidelity data is available for downstream AI/ML consumption.
- Security & Compliance: Ensure all data solutions adhere to global compliance standards and internal security protocols, including encryption at rest and in transit, granular IAM roles, and robust data masking techniques.
- Documentation: Maintain clear, professional documentation of data lineage, schemas, and system architecture to ensure transparency, auditability, and ease of maintenance.
Qualifications
Technical Requirements:
- Education: Bachelor’s degree in Computer Science, Data Science, Engineering, or a related field.
- Programming Mastery: Advanced-level SQL proficiency and strong Python skills, with the ability to translate business logic into robust code.
- Cloud Data Expertise: Proficiency with cloud data services, including BigQuery, Cloud Dataflow, and data lakes, along with proven experience in modern ETL/ELT tools and patterns.
- Professional Experience: Mid-to-senior level, with 3–5+ years of hands-on experience architecting and implementing complex data pipelines on GCP or equivalent cloud platforms.
Soft Skills & Leadership:
- Strategic Thinking: Ability to translate abstract business requirements into technical specifications.
- Communication: Ability to explain complex technical concepts (like data lineage or vector embeddings) to non-technical stakeholders.
- Mentorship: Experience guiding junior engineers and conducting rigorous code reviews.
Preferred Qualifications:
- Google Cloud Certifications: Professional Data Engineer or Professional Machine Learning Engineer.
- AI Experience: Familiarity with LLM data preparation workflows and Vertex AI.