Job Title: QA Engineer – Generative AI Systems
Role Description: We are building production-grade Generative AI applications, including LLM-powered systems, AI agents, multi-agent orchestration, Retrieval-Augmented Generation (RAG) pipelines, and AI-driven APIs.
We are looking for a QA Engineer with 2–3 years of experience who is passionate about testing complex AI and web applications. You will ensure our AI products are robust, reliable, and production-ready by designing and executing tests that capture edge cases, inconsistent outputs, and integration challenges unique to AI systems.
This role is ideal for someone curious, analytical, and hands-on with testing AI, APIs, and web applications.
Location: [Hybrid / Onsite]
Relevant Work Experience: 2-4 Years
Work Location: Pune
Key Responsibilities:
AI & System Testing
- Design and execute test cases for LLM-driven apps, AI agents, and RAG pipelines.
- Perform black-box, functional, regression, integration, and exploratory testing.
- Validate API responses for correctness, consistency, latency, and error handling.
- Test prompts across variations, ambiguous inputs, and adversarial cases.
- Evaluate multi-agent workflows, tool usage, and agent-to-agent interactions.
- Detect logical inconsistencies, hallucinations, and unsafe outputs.
- Conduct performance and stress tests for APIs and AI pipelines.
- Collaborate with developers and AI engineers to reproduce and resolve issues.
- Maintain clear QA documentation, bug reports, and test coverage records.
Test Design & Automation
- Build structured test cases for:
- Prompt variations and context limits
- Multi-turn conversations
- Tool or function-calling workflows
- Structured output validation (JSON/schema)
- Develop automated regression suites for APIs and workflows.
- Support creation of evaluation benchmarks and red-teaming frameworks.
API & Integration Testing
- Validate AI APIs for performance, error handling, streaming responses, and token efficiency.
- Ensure seamless integration across microservices, AI pipelines, and web interfaces.
- Test compatibility across model providers and environments.
Safety, Security & Quality Monitoring
- Test for prompt injection and data leakage vulnerabilities.
- Evaluate bias, fairness, and adherence to AI governance standards.
- Define measurable QA metrics and track AI reliability over time.
Required Qualifications
- 2–3 years of QA experience, ideally with web applications and APIs.
- Hands-on experience testing AI/ML-powered applications (chatbots, NLP systems, LLMs).
Strong knowledge of:
- REST APIs & JSON validation
- Automated testing frameworks
- Load and performance testing
- Experience with Python or similar scripting languages.
- Understanding of prompt engineering and LLM behavior.
- Strong analytical, investigative, and problem-solving skills.
Preferred Qualifications
- Exposure to RAG pipelines, vector databases, or agent frameworks.
- Experience with AI evaluation frameworks, A/B testing, and model benchmarking.
- Knowledge of AI safety, red-teaming, and adversarial testing.
- Familiarity with OpenAI or other LLM APIs.
- Understanding of CI/CD pipelines and release processes.
- Experience in fast-paced product or startup environments.
Why This Role Is Unique
- Test non-deterministic AI systems where the same input may yield different outputs.
- Design QA strategies for complex, multi-agent AI workflows.
- Play a key role in shaping the reliability and safety of next-generation AI systems.
Success Metrics
- Reduced production AI failures and hallucinations.
- Stable evaluation benchmarks and measurable QA improvements.
- High API reliability and minimal incident rates.
- Well-documented system limitations and edge cases.
Why Join Us
- Work on cutting-edge Generative AI products.
- Influence the QA layer for multi-agent and LLM-based systems.
- Collaborate with AI engineers, researchers, and product leaders.
- Build frameworks for testing AI systems that "think."
If you are passionate about breaking complex AI systems before users do, this is the role for you.