AI Tools for QA Engineers
Notes on how I use AI tools in my day-to-day testing and automation work. This isn't about AI testing AI — it's about using LLMs as a force multiplier in a QA automation workflow.
At Optilogic I've led efforts to adopt and standardize AI tooling across the QA team, including Claude Code, GitHub Copilot, and local LLMs via Ollama.
Tools I Use
| Tool | Type | Best For |
|---|---|---|
| Claude Code | Agentic CLI (Anthropic) | Complex multi-file tasks, refactoring, code review |
| GitHub Copilot | IDE autocomplete + chat | In-editor suggestions, quick generation |
| Ollama + Local LLM | Self-hosted | Sensitive codebases, offline use, experimentation |
Claude Code
Claude Code is an agentic CLI tool from Anthropic. Unlike Copilot, it can read the full codebase, run commands, edit multiple files, and reason across a whole task — not just complete the next line.
What I use it for in QA:
- Scaffolding test suites — Give it a feature spec or API schema, ask it to generate a full Pytest + Playwright test file with fixtures and page objects.
- Refactoring legacy tests — Point it at an old Selenium Java suite and have it convert patterns to modern equivalents.
- Generating test data — Ask it to write factory functions or seed scripts based on a data model.
- Writing documentation — First drafts of onboarding docs, README files, and inline comments.
- Code review — Ask it to review a PR diff for test coverage gaps, flaky test patterns, or missing assertions.
Tips:
- Give it context upfront — tell it your framework, language, and conventions before asking it to generate code.
- Review everything it writes. It's fast but not infallible, especially on domain-specific logic.
- Use it for the boring parts: boilerplate, repetitive patterns, and first drafts. You still own the quality.
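One low-effort way to give context upfront is a `CLAUDE.md` file at the repo root, which Claude Code pulls into its context automatically. The contents below are illustrative, not from a real project:

```markdown
# CLAUDE.md
- Test framework: pytest + pytest-playwright
- Page objects live in tests/pages/, shared fixtures in tests/conftest.py
- Prefer Playwright's auto-waiting `expect()` assertions; never `time.sleep`
- One behavior per test, arrange/act/assert structure
```

Once this exists, generated code tends to match your conventions without re-explaining them in every prompt.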
GitHub Copilot
Copilot lives in the IDE and shines at in-context autocomplete and quick generation. It's faster than Claude Code for single-file work where you already know what you want.
What I use it for:
- Completing repetitive test methods (it learns your patterns within the file)
- Generating docstrings and comments
- Quick one-off scripts
- Inline chat for "what does this do" questions
Tips:
- It works best when the file already has examples to learn from. Write one test well and it will pattern-match the rest.
- Use the chat panel for explanations; use autocomplete for generation.
- Be skeptical of its assertions and imports — verify they're correct for your specific version of a library.
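The "write one test well" tip in practice: seed the file with one clearly structured test and Copilot tends to autocomplete sibling tests in the same shape. `apply_discount` is a toy stand-in for real application code, not anything from an actual project:

```python
def apply_discount(price: float, percent: float) -> float:
    """Toy function under test (stands in for real app code)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_discount_applies_ten_percent():
    # Arrange
    price = 50.00
    # Act
    total = apply_discount(price, 10)
    # Assert
    assert total == 45.00

# With the arrange/act/assert seed above in the file, Copilot will
# usually suggest the next cases (zero percent, invalid percent, etc.)
# following the same naming and structure.
```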
Ollama + Local LLMs
Ollama lets you run LLMs locally. Useful when:
- You're working with sensitive internal code you can't send to a cloud API
- You want to experiment with models (Mistral, Llama, Qwen, etc.) without API costs
- You need offline access
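Scripting against a local model is straightforward because Ollama serves a REST API on `localhost:11434`. A minimal sketch, assuming the server is running (`ollama serve`) and the model has been pulled — nothing here leaves the machine:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> dict:
    """Payload for Ollama's /api/generate endpoint.

    stream=False asks for a single JSON response instead of a token stream.
    """
    return {"model": model, "prompt": prompt, "stream": False}


def ask_local_model(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return its text reply."""
    data = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Handy for batch jobs like summarizing a directory of test failures without sending logs to a cloud API.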
Local models are weaker than Claude or GPT-4 for complex reasoning, but for code explanation, simple generation, and documentation they're useful and free after the initial setup.
Models worth trying for code work:
- Qwen2.5-Coder — strong on code tasks
- Mistral — good general-purpose, fast
- Llama 3.2 — solid reasoning, Meta's open model
Practical AI Workflow for Test Automation
This is roughly how I integrate AI into a testing task:
1. Understand the feature first — read the ticket, talk to a dev, look at the code. AI is faster when you already know what to ask.
2. Use Claude Code to scaffold — generate the skeleton of the test file, page objects, and fixtures.
3. Fill in with Copilot — let it autocomplete the repetitive parts as you work through the file.
4. Review and run — always run the tests. AI-generated tests can be syntactically correct but logically wrong.
5. Iterate — if a test is flaky or the assertions are weak, ask Claude Code to review it.
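The review-and-iterate step mostly catches weak assertions. Here is the kind of thing I'd ask Claude Code to flag, next to a tightened version — `get_active_users` is a toy stand-in, not real application code:

```python
def get_active_users(users):
    """Toy function under test."""
    return [u for u in users if u["active"]]


def test_active_users_weak():
    result = get_active_users([{"name": "a", "active": True}])
    assert result  # weak: passes for ANY non-empty result


def test_active_users_strong():
    users = [
        {"name": "a", "active": True},
        {"name": "b", "active": False},
    ]
    result = get_active_users(users)
    # exact expectation: inactive users are filtered out, order preserved
    assert [u["name"] for u in result] == ["a"]
```

The weak version would keep passing if the filter silently broke and returned every user; the strong version would not.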
What AI Won't Do For You
- It won't understand your product. You still need domain knowledge to write meaningful tests.
- It won't catch logic bugs in the code under test — it can only test what you tell it to test.
- It won't keep tests maintainable. Over-generated test suites become a maintenance burden.
- It hallucinates APIs and method signatures — always verify against real docs.
The goal is to spend less time on boilerplate and more time on thinking about what actually needs to be tested.