Generative AI

AITH07 Red-Teaming Your Own Prompts

11/21/2024

2:30pm - 3:45pm

Level: Intermediate

Andreas Erben

CTO for MR and Applied AI

daenet

Generative AI can create wonderful but also horrible things. With it come new types of risks and attack vectors on systems.

In this session, Andreas talks about his experiences trying to understand risks and limitations by red-teaming his own usage. He will share patterns he has observed that he believes can often circumvent prompt-based or content-filter-based protections.

With plenty of hands-on examples, the session explores some of the unique properties of Large Language Models and discusses the cat-and-mouse game between attackers and defenders.

You will also hear about various attack vectors that you may need to defend against when building AI systems based on Large Language Models, or even when building Large Language Models themselves.

Furthermore, you will get some insights into what red-teaming can expose in existing "off-the-shelf" solutions such as various Copilots, and gain an understanding of how your own solutions may be subject to similar attacks.

You will learn:

  • About attack vectors against Large Language Model-based applications
  • How to red-team your own solutions
  • Ways to harden LLM-based software