Machine Learning Engineer

NeoITO

Tech Stacks
Python PostgreSQL Linux Prometheus PyTorch CI LoRA API Docker Amazon S3

Job Description

AI / ML Engineer - SLM & RAG Specialist


Location: Trivandrum (Kerala)

Company: NeoITO

Experience: 5+ Years


About the Role


NeoITO is hiring an AI / ML Engineer to build and own an AI-powered Proposal & RFP generation system designed to transform meeting notes into structured, client-ready proposals within minutes.


You will be responsible for designing and managing the core AI layer, including the inference engine, RAG pipeline, embedding models, and compliance validation system.

You will collaborate closely with backend (Node.js) and frontend (React) engineers to deliver a production-ready AI system within a defined delivery timeline.



Key Responsibilities

Model Deployment & Inference

  • Deploy and manage Small Language Models (SLMs) on on-premise GPU infrastructure.
  • Configure and optimize LLM inference pipelines using frameworks such as vLLM or HuggingFace Transformers.
  • Implement token streaming, continuous batching, and optimized sampling strategies for reliable text generation.
  • Apply quantization techniques (GPTQ/AWQ) to reduce GPU memory footprint while maintaining inference performance.
  • Monitor GPU health and performance metrics, including VRAM usage, latency, and throughput.

Retrieval-Augmented Generation (RAG)

  • Design and implement RAG pipelines to enable context-aware proposal generation.
  • Build text chunking pipelines and generate embeddings using sentence-transformer models.
  • Store and retrieve vector embeddings using PostgreSQL with pgvector.
  • Implement semantic similarity search to retrieve relevant historical proposal data.
  • Continuously evaluate and optimize retrieval quality and performance.
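
For candidates who want a concrete picture of the chunk, embed, and retrieve flow described above, here is a minimal sketch. It is not NeoITO's implementation: the `embed` function is a toy word-count stand-in for a sentence-transformer model, the in-memory list stands in for a pgvector table, and all function names and parameters (`chunk_text`, `retrieve`, `max_words`, `overlap`, `top_k`) are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    history = [
        "mobile app development timeline and milestones",
        "payment terms and invoice schedule",
        "cloud hosting costs",
    ]
    print(retrieve("invoice payment schedule", history, top_k=1))
```

In a production pipeline, the similarity ranking would typically be pushed down into the vector store rather than computed in application code.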

AI-Driven Proposal Generation

  • Design structured pipelines to generate multi-section proposals, including:
      • Executive Summary
      • Project Scope
      • Technical Approach
      • Implementation Timeline
      • Investment Summary
      • Risk Mitigation
  • Create section-specific prompts and templates for high-quality generation.
  • Implement real-time streaming responses to backend services.
  • Support partial regeneration of sections for iterative proposal refinement.

AI Quality, Validation & Compliance

  • Develop a validation engine to ensure generated content meets compliance and quality standards.
  • Implement rule-based checks, including:
      • Client name verification
      • Budget reference validation
      • Section completeness
      • Sensitive data detection
  • Support an optional AI-based review layer for deeper quality checks.
  • Deliver structured feedback and annotations for use within editing workflows.
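
To illustrate what rule-based checks of this kind could look like, here is a minimal sketch that returns structured findings. It rests on several assumptions that are not part of the role description: the section names are taken from the proposal outline in this posting, the budget rule flags any dollar amount above an agreed ceiling, and sensitive data detection is reduced to spotting email addresses. The `Finding` type, the `validate` function, and both regular expressions are hypothetical.

```python
import re
from dataclasses import dataclass

REQUIRED_SECTIONS = [
    "Executive Summary",
    "Project Scope",
    "Technical Approach",
    "Implementation Timeline",
    "Investment Summary",
    "Risk Mitigation",
]

# Hypothetical patterns, chosen only for illustration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
AMOUNT_RE = re.compile(r"\$\s?(\d[\d,]*)")


@dataclass
class Finding:
    rule: str
    message: str


def validate(sections: dict[str, str], client_name: str, budget: int) -> list[Finding]:
    """Run the rule-based checks and return structured findings."""
    findings: list[Finding] = []
    full_text = "\n".join(sections.values())

    # Section completeness: every required section must exist and be non-empty.
    for name in REQUIRED_SECTIONS:
        if not sections.get(name, "").strip():
            findings.append(Finding("section_completeness", f"Missing or empty section: {name}"))

    # Client name verification: the client must be mentioned somewhere.
    if client_name.lower() not in full_text.lower():
        findings.append(Finding("client_name", f"Client name not found: {client_name}"))

    # Budget reference validation: no quoted amount may exceed the agreed budget.
    for raw in AMOUNT_RE.findall(full_text):
        amount = int(raw.replace(",", ""))
        if amount > budget:
            findings.append(Finding("budget_reference", f"Amount ${amount:,} exceeds budget ${budget:,}"))

    # Sensitive data detection: flag email addresses (a deliberately narrow example).
    for email in EMAIL_RE.findall(full_text):
        findings.append(Finding("sensitive_data", f"Email address found: {email}"))

    return findings
```

Returning a list of findings rather than a single pass/fail flag keeps each annotation addressable, which suits the editing workflows mentioned above.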

Prompt Engineering & Model Optimization

  • Design and maintain structured prompts for classification, generation, and validation tasks.
  • Conduct iterative prompt optimization to improve accuracy, tone, and consistency.
  • Maintain prompt versioning and regression testing frameworks.
  • Evaluate output quality through structured human evaluation metrics.

Fine-Tuning & Model Improvement

  • Lead fine-tuning initiatives to improve model performance over time.
  • Prepare and curate training datasets from finalized proposals.
  • Implement LoRA / QLoRA fine-tuning strategies for efficient model updates.
  • Track experiments and model versions using tools such as MLflow.

Collaboration & Engineering Practices

  • Expose AI capabilities via FastAPI services consumed by backend applications.
  • Collaborate with backend teams on job orchestration, queue processing, and event streaming.
  • Implement unit tests and quality checks for ML pipelines.
  • Contribute to containerized deployment environments using Docker.
  • Support CI/CD pipelines with automated testing and linting workflows.

Required Skills & Experience


Large Language Models & AI Systems

  • Hands-on experience with LLMs or SLMs
  • Experience deploying models using vLLM, HuggingFace Transformers, or similar frameworks
  • Knowledge of quantization techniques and inference optimization

RAG & Vector Search

  • Experience building Retrieval-Augmented Generation pipelines
  • Knowledge of vector databases such as pgvector, FAISS, or similar
  • Familiarity with embedding models and semantic search

Programming & Frameworks

  • Strong Python development experience
  • Experience with FastAPI, Pydantic, and PyTorch
  • Knowledge of libraries such as sentence-transformers, LangChain, or LlamaIndex

Infrastructure & GPU Systems

  • Experience working with GPU-based model deployment
  • Familiarity with CUDA environments and GPU monitoring
  • Experience deploying applications with Docker on Linux environments

Databases & Storage

  • Experience with PostgreSQL
  • Familiarity with vector extensions or vector search databases
  • Knowledge of object storage solutions such as S3 or MinIO

MLOps & Model Lifecycle

  • Experience with LoRA / QLoRA fine-tuning
  • Familiarity with experiment tracking tools
  • Knowledge of dataset preparation and model evaluation

Nice to Have

  • Experience working with Meta Llama models
  • Familiarity with document generation systems
  • Experience with queue-based ML pipelines
  • Exposure to secure enterprise environments requiring strict data governance
  • Knowledge of observability tools such as Prometheus


In this role, you will:

  • Deliver a fully functional AI proposal generation system running entirely on-premise
  • Achieve high-quality, structured proposal outputs
  • Ensure stable performance under concurrent usage
  • Establish a foundation for continuous model improvement through fine-tuning


Tech Stack


Primary Language: Python

API Framework: FastAPI

LLM Inference: vLLM / Transformers

Embedding Models: Sentence Transformers

Vector Database: PostgreSQL + pgvector

GPU Infrastructure: NVIDIA GPU environments

Containerization: Docker

Monitoring: Prometheus

Testing: Pytest

