What are some limitations of red teaming as a method for testing GenAI tools/LLMs?

3.9k viewscircle icon4 Comments
Sort by:
CISO/CPO & Adjunct Law Professor in Finance (non-banking)2 years ago

Red teaming against an AI tool is challenging because opposed to static defenses and known repeatable responses LLMs have variability in their outputs. 

Each output is derived from calculating the distance between embeddings (words converted to numbers) and applying a framework of rules to the level one output. For example, the word "analysis" is usually close to the word "of" followed by a noun, that is a first level output. The first level out should be put through additional processing to ensure it makes sense in the specific (or as lawyers say instant) context. The framework could evaluate the word in from of the word analysis and also ensure the sentence doesn't end with "of". If the word before analysis is "lengthy" then "analysis" can end the sentence.

The way AI tools are usually constructed there are certain word groupings that "break" the tool for graphic output and perhaps there are derivations that break text only models. Inserting improper text is possible due to the challenges with edit checks for AI prompts. The improper text may be assimilated to corrupt the system.  Imagine MS Excel learning from the data which is input and a person telling excel that 2+2 =5 repeatedly. Eventually it would be corrupted, but the corruption might not be obvious on every use.

Even if text systems aren’t corrupted multi-modal (audio, video, image, input/output) tools are the future therefore just breaking images can hobble the system. Like any other tool there are attackers and defenders, and it is likely that the Nightshade poison pill will be nullified in the future.

MIT paper reference below.  https://arxiv.org/pdf/2310.13828.pdf

The difficulty of red teaming doesn’t allow companies breathing room though since improper outputs can run afoul of upcoming AI laws or current laws.  Internal compliance teams and certainly lawyers will be anxious for IT teams to check potential AI results.

Lightbulb on1 circle icon1 Reply
no title2 years ago

Nvidia has released Nemo Guardrails which can address the point to some extent but largely I agree what you mentioned and its very challenging to to Red Team specifically against GenAI<br>

Chair and Professor, Startup CTO in Education2 years ago

What is red teaming?

1 Reply
no title2 years ago

Red team is the group pretending to be an attacker, blue team is defense and purple is a combination.<br>

Content you might like

Determine product vision and roadmap13%

Orchestrate AI agents and tools to deliver software autonomously40%

Build AI/ML powered solutions for end users53%

Ground AI models with RAG and other techniques33%

Design guardrails and guidelines for ethical and secure use of AI60%

Build and manage robust AI pipelines and automate deployment20%

Scale and automate common AI capabilities and engineering tools 33%

Co-develop software solutions directly with business and customer teams20%

Design solution architecture 27%

Deploy and monitor AI models13%

Something else – share in comments

View Results

Very confident14%

Confident – there could be some shadow AI but I doubt it49%

Sort of confident – some shadow AI, but aware of the important stuff28%

Not confident – still trying to determine extent of GenAI use8%

View Results