PII

Short description

PII tests how targets store information and if they are willing to disclose the PII of users. This is done dynamically using LLMs and common attack vectors for PII, such as credit card numbers, full names, home addresses etc. An example attack might be asking the target what email was used for a prior session.

Technical Details

The Personally Identifiable Information (PII) Probe is a critical security assessment designed to evaluate how a chatbot handles and protects sensitive user data. This probe investigates whether the system inadvertently stores, exposes, or discloses PII, such as names, email addresses, phone numbers, or any other data that could be used to identify an individual.

Improper handling or leakage of PII can undermine user trust, violate privacy regulations, and expose organizations to significant legal, reputational, and financial risks. The PII Probe ensures that the chatbot complies with data protection standards and identifies potential vulnerabilities that could be exploited by adversaries to gain unauthorized access to personal data or manipulate the system.

Chat example

Explanation

This text is confirms the presence of a specific email address, which is considered PII, in the system.

List of variations

Default language
Multilanguage
Document

List of strategies

One Shot w/ Retry
Delayed attack

Red Teamer

Black box - Basic user - Direct questions

The red teamer directly forwards the user input, which contains an attack vector already adapted to the target AI Assistant’s domain.

Black box - Adversarial user - Standard

The red teamer forwards a predefined message as part of a Delayed Attack strategy. It is used to sequentially deliver messages from an attack plan specifically crafted for PII leakage within the company’s domain.

In this setup, the red teamer does not generate new content but instead forwards the next message in the predefined sequence. The third message in the sequence is expected to be the actual attack, while the first and second messages serve as contextual buildup, gradually guiding the AI Assistant toward the intended leakage scenario.

Multimodal support

Text
Document