
2 Oct 2025

Technical

10 min read

Pen Testing with AI: A Shortcut or a Skill Multiplier?


It starts like this: a WAF (Web Application Firewall), freshly updated with the latest ruleset, is blocking every known XSS payload. Traditional pen testers might spend 30 to 40 minutes crafting variations, testing combinations, and hitting dead ends. But what if a working bypass could be found in minutes?

This is happening right now in security testing labs across the world, where artificial intelligence is rewriting the rules of penetration testing. Yet, the very tools making pen testing faster might be undermining the expertise needed to wield them effectively.

We believe effective pen testing is still rooted in attention, dedication, and deep knowledge. AI can accelerate results, but it cannot replace judgment. We talked with Avinash Kumar Thapa, Strategic Security Leader and former VP of Technical Services at Chaleit, to unpack how AI tools like ChatGPT are changing the work and what still matters most.

Context is everything

AI won't magically solve problems. "If you're going to ask a generic question, it's going to provide you with generic answers," Avinash points out.

In a client engagement, Avinash and a colleague encountered that stubborn WAF. Every publicly available payload failed. "We tried a lot of things manually, finding the correct payloads, structuring them," Avinash recalls. "It didn't work. Then we thought, okay, let's use AI."

But ChatGPT and other AI models won’t simply hand over working payloads on request. They have built-in restrictions. The breakthrough came through careful contextualisation.

"We provided the context: this is what I'm trying. We tried these alphabets. These are being blocked. These are being processed. This particular event handler is being processed," he explains. By mapping exactly what the WAF accepted and rejected, they guided the AI toward a solution that didn't exist in any public database.

The result was a working XSS bypass for the latest WAF ruleset, crafted in minutes rather than hours.
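To make that contextualisation step concrete, here is a minimal sketch of the kind of probing that produces the accepted-versus-rejected map described above. The target URL, parameter name, and block indicator are hypothetical assumptions; you would only run something like this against systems you are authorised to test, and it is the resulting pattern (which fragments are blocked or processed), not raw client data, that you would describe to the model.

```python
# Minimal sketch: map which payload fragments a WAF blocks vs. processes,
# so the pattern (not client data) can be described to an AI model.
# Assumes a hypothetical, authorised test target and that blocked requests
# return HTTP 403 -- adjust the block indicator for the WAF under test.
import requests

TARGET = "https://staging.example.test/search"   # hypothetical, authorised target
PARAM = "q"                                      # hypothetical reflected parameter

# Fragments to classify: characters, tag names, event handlers, etc.
PROBES = ["<", ">", '"', "'", "script", "img", "svg", "onerror", "onload", "onfocus"]

def is_blocked(resp):
    # Assumption: this particular WAF signals a block with a 403 status.
    return resp.status_code == 403

results = {}
for probe in PROBES:
    resp = requests.get(TARGET, params={PARAM: probe}, timeout=10)
    results[probe] = "blocked" if is_blocked(resp) else "processed"

for probe, outcome in sorted(results.items()):
    print(f"{probe!r}: {outcome}")
```

The output of a loop like this is exactly the kind of context Avinash describes feeding to the model: which characters and event handlers get through, and which do not.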

If you’d like a deeper understanding of why context is essential in pen testing, read our explanation of modern penetration testing methodology.

AI raises the bar

We see a parallel here with aviation. Modern aircraft can execute automatic landings using CAT III systems, with three autopilots constantly checking each other. Surely this makes flying easier? Actually, the opposite is true. Pilots need even higher skill levels because when intervention becomes necessary, it must be instant, precise, and informed by deep understanding.

The same principle applies to AI-assisted pen testing. Without foundational knowledge, you can’t craft the queries that yield meaningful results. Without experience, you can’t validate whether the AI’s suggestions make sense.

In another case, Avinash used AI to help bypass AMSI (Antimalware Scan Interface) protections in PowerShell. A known bypass had already been blacklisted. So instead, he extracted the assembly code and asked the model to rewrite it with different registers and operands — same function, new form.

"It tweaked it completely, gave me the entire payload. I ran it, and it worked. I was able to continue with my assessment."

But again: AI didn’t solve the problem. Experience did.

This kind of sophisticated bypass work has helped Chaleit achieve a 90%+ success rate in EDR-protected environments — not through brute force, but through intelligent application of both human expertise and AI assistance.

"Someone new who's emerging in this technology, they won't be able to do that," Avinash says. "People with higher experience, they're going to rely on AI to do things smartly, but still they're going to use their own manual effort."

Confidentiality erosion

We estimate that roughly 70% of pen testers under pressure paste client data directly into public LLMs: HTTP headers, parameter names, internal IP addresses, all of it potentially absorbed into global training data.

"You have to be very accountable from your side at what kind of data you are presenting to your AI models," Avinash emphasises.

During the WAF bypass work, they never mentioned the specific application or client. They presented patterns, not identifying information.
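As an illustration of "patterns, not identifying information", here is a rough sketch of the kind of redaction pass you might run over request data before describing it to a public model. The regexes and placeholder labels are our own assumptions, not an exhaustive scrubber; anything that slips past a pattern still leaks, so a human review of the output remains essential.

```python
# Rough sketch: redact obvious identifiers from an HTTP request snippet
# before using it as context for a public LLM. The patterns below are
# illustrative assumptions, not a complete scrubber -- review the output
# manually before sharing anything outside your environment.
import re

REDACTIONS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),                   # IPv4 addresses
    (re.compile(r"\b[\w.-]+\.(?:com|net|org|io|internal)\b"), "<HOST>"),    # hostnames
    (re.compile(r"(?im)^(Authorization|Cookie):.*$"), r"\1: <REDACTED>"),   # credentials
    (re.compile(r"\b[\w.+-]+@[\w.-]+\b"), "<EMAIL>"),                       # email addresses
]

def redact(snippet: str) -> str:
    for pattern, replacement in REDACTIONS:
        snippet = pattern.sub(replacement, snippet)
    return snippet

raw = """GET /api/v2/users?id=1042 HTTP/1.1
Host: portal.clientname.com
Cookie: session=eyJhbGciOi...
X-Forwarded-For: 10.20.30.40"""

print(redact(raw))
```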

The pressure is real. Manual payload creation can take 30 to 40 minutes. AI can deliver results in seconds. When you're working within a five-day assessment window, that time saving becomes almost irresistible. Yet each careless paste potentially exposes client infrastructure to the world.

Some organisations are responding drastically. We’re seeing companies shutting down bug bounty programs because AI-generated submissions create an unmanageable flood. The tools meant to improve security are creating new vulnerabilities and operational nightmares.

Ironically, AI might also be part of the solution. Used well, it can help prioritise, triage, and even auto-decline irrelevant submissions. But again, context matters. Ethical use matters. Responsible testing still requires a human in the loop.
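As a sketch of what "human in the loop" can look like in triage, the snippet below only buckets submissions for reviewer attention; nothing is declined without a person confirming it. The scoring heuristic, field names, and thresholds are assumptions for illustration; in practice the score might come from an LLM classifier rather than keyword checks.

```python
# Sketch: AI-assisted triage that prioritises bug bounty submissions for
# human review rather than auto-declining them. The scoring function is a
# placeholder heuristic (field names and thresholds are assumptions); the
# final decision on every report stays with a human reviewer.
from dataclasses import dataclass

@dataclass
class Submission:
    title: str
    body: str
    reporter_reputation: float  # 0.0 - 1.0

def triage_score(sub: Submission) -> float:
    """Placeholder heuristic; a real pipeline might call an LLM classifier here."""
    score = sub.reporter_reputation
    if "proof of concept" in sub.body.lower():
        score += 0.3
    if len(sub.body) < 200:        # very short reports are often low signal
        score -= 0.2
    return max(0.0, min(1.0, score))

def bucket(sub: Submission) -> str:
    score = triage_score(sub)
    if score >= 0.7:
        return "review-first"        # likely valid, route to a senior reviewer
    if score >= 0.3:
        return "review-normal"       # standard queue
    return "review-low-priority"     # still reviewed by a human, never auto-declined

reports = [
    Submission("XSS in search", "Steps to reproduce and proof of concept attached ...", 0.8),
    Submission("Site is hackable", "pls fix", 0.1),
]
for r in reports:
    print(r.title, "->", bucket(r))
```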

Attention, dedication, and knowledge

There are some things AI can’t replace.

Through years of hands-on testing, Avinash has identified three essential ingredients for effective penetration testing: attention, dedication, and knowledge. Ironically, these are precisely what AI threatens to erode.

"In order to research one particular payload, I remember back in 2010, 2011, I used to spend countless hours," he said. "Now, AI is doing it for you entirely in a matter of seconds. So your investment to learn, to create, it's no longer there."

The attention span crisis compounds the problem. "With the rise of TikTok and Instagram Reels, your attention span is not more than 30 seconds," Avinash notes. AI only accelerates the slide toward shortcuts over understanding.

Yet those who've built their expertise through years of manual work can now leverage AI as a force multiplier. "By checking the responses, by checking the timing, I used to be able to get an instant answer — is it vulnerable or not?" he explains. "That kind of judgment you won't get if you're completely relying on AI."

Key takeaways

  1. Confidentiality is at risk. Careless use of public LLMs with client data creates silent breaches.

  2. Skills erosion is real. The next generation may lose critical problem-solving abilities without foundational learning.

  3. AI can save time and improve outcomes — if you have the knowledge to guide it.

  4. Context is key. Success with AI-assisted pen testing requires precise queries built on deep technical understanding.

  5. Attention, dedication, and knowledge are essential and can’t be replaced by AI.

At Chaleit, we don't just rely on tools to do our job. We combine smart technology with smarter people. Whether you're testing a new app or protecting a complex infrastructure, we bring experience, context, and ethical clarity to every engagement.

Smart technology meets smarter people

Work with pen testers who understand both the promise and perils of AI-assisted security testing.

Contact us

About this article

Series:

Penetration Testing Decoded

Topics:

  • Technical

Related Insights

  • Strategy: AI, Quietly Everywhere: A Guide to Building AI Security Frameworks

  • Technical: AI Security Testing: New Attack Vectors and Strategies in Application Security

  • Strategy: Penetration Testing 3.0: Intelligence-Led Security Validation

  • Strategy: How to Buy Penetration Testing That Works: A Smart Buyer's Perspective
