Job Description

Job Title: QA Engineer – Generative AI Systems

Role Description: We are building production-grade Generative AI applications, including LLM-powered systems, AI agents, multi-agent orchestration, Retrieval-Augmented Generation (RAG) pipelines, and AI-driven APIs.

We are looking for a QA Engineer with 2–3 years of experience who is passionate about testing complex AI and web applications. You will ensure our AI products are robust, reliable, and production-ready by designing and executing tests that capture edge cases, inconsistent outputs, and integration challenges unique to AI systems.

This role is ideal for someone curious, analytical, and hands-on with testing AI, APIs, and web applications.

Location: [Hybrid / Onsite]

Relevant Work Experience: 2-4 Years

Work Location: Pune

Key Responsibilities:

AI & System Testing

Design and execute test cases for LLM-driven apps, AI agents, and RAG pipelines.
Perform black-box, functional, regression, integration, and exploratory testing.
Validate API responses for correctness, consistency, latency, and error handling.
Test prompts across variations, ambiguous inputs, and adversarial cases.
Evaluate multi-agent workflows, tool usage, and agent-to-agent interactions.
Detect logical inconsistencies, hallucinations, and unsafe outputs.
Conduct performance and stress tests for APIs and AI pipelines.
Collaborate with developers and AI engineers to reproduce and resolve issues.
Maintain clear QA documentation, bug reports, and test coverage records.

Test Design & Automation

Build structured test cases for:
Prompt variations and context limits
Multi-turn conversations
Tool or function-calling workflows
Structured output validation (JSON/schema)
Develop automated regression suites for APIs and workflows.
Support creation of evaluation benchmarks and red-teaming frameworks.

API & Integration Testing

Validate AI APIs for performance, error handling, streaming responses, and token efficiency.
Ensure seamless integration across microservices, AI pipelines, and web interfaces.
Test compatibility across model providers and environments.

Safety, Security & Quality Monitoring

Test for prompt injection and data leakage vulnerabilities.
Evaluate bias, fairness, and adherence to AI governance standards.
Define measurable QA metrics and track AI reliability over time.

Required Qualifications

2–3 years of QA experience, ideally with web applications and APIs.
Hands-on experience testing AI/ML-powered applications (chatbots, NLP systems, LLMs).

Strong knowledge of:

REST APIs & JSON validation
Automated testing frameworks
Load and performance testing
Experience with Python or similar scripting languages.
Understanding of prompt engineering and LLM behavior.
Strong analytical, investigative, and problem-solving skills.

Preferred Qualifications

Exposure to RAG pipelines, vector databases, or agent frameworks.
Experience with AI evaluation frameworks, A/B testing, and model benchmarking.
Knowledge of AI safety, red-teaming, and adversarial testing.
Familiarity with OpenAI or other LLM APIs.
Understanding of CI/CD pipelines and release processes.
Experience in fast-paced product or startup environments.

Why This Role Is Unique

Test non-deterministic AI systems where the same input may yield different outputs.
Design QA strategies for complex, multi-agent AI workflows.
Play a key role in shaping the reliability and safety of next-generation AI systems.

Success Metrics