The Shadow AI Report
This feature requires an LLM entitlement. Details about signing up for this feature are available upon request. The Shadow AI report is currently in closed beta.
Overview
With Mend AI, you can generate an awareness report (Shadow AI report) which provides a detailed map of AI usage across the organization. This report offers visibility into the volume of AI usage across different products, projects, and organizational units.
The report is generated using the Software Bill of Materials (SBOM) and identifies key indicators for various AI technologies, processes, frameworks and patterns, including:
3rd-Party Large Language Model (LLM) APIs: Identifiers for known LLM APIs such as OpenAI, Azure, Bedrock, etc.
Open ML models: Identifiers for usage of open models from registries such as HuggingFace & Kaggle
Embedders: Utilization of embedding libraries such as HuggingFace’s sentence-transformers, and their role in a technology or framework, e.g., RAG or Autonomous Agents.
Retrieval-Augmented Generation (RAG): Indicators of a project implementing a RAG process.
Autonomous Agents: Detection of agents built with libraries like LangChain.
Vector DB: Identifiers for use of vector DB clients such as llama-index, pinecone, pg-vector (postgres), etc.
One example of identifying an AI technology based on library indicators is the RAG deduction rule:
Retrieval-Augmented Generation (RAG)
Goal: Enhance the capabilities of language models by retrieving relevant information to inform generation.
Required Library Types:
Vector Databases (for storing and retrieving embeddings): Elasticsearch, Pinecone, Weaviate
Embedding Libraries (for generating embeddings): Sentence-Transformers, spaCy, Gensim
LLM Libraries (for text generation): Transformers (Hugging Face), GPT-3, T5
Combination Example:
Use Sentence-Transformers to create embeddings of documents and queries.
Store embeddings in Elasticsearch.
Retrieve relevant documents using Elasticsearch.
Use a model from Transformers to generate responses based on retrieved documents.
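The deduction rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not Mend's actual detection logic: the LIBRARY_TYPES catalog, the type names, and the detect_rag function are hypothetical, and the package names are lower-cased versions of the examples listed above rather than verified registry names. The sketch labels a project as RAG only when its SBOM contains at least one library of each required type.

```python
# Hypothetical catalog mapping package names to library types.
# The entries come from the examples in the rule above; a real
# scanner would rely on a much larger, curated catalog.
LIBRARY_TYPES = {
    "elasticsearch": "vector_db",
    "pinecone": "vector_db",
    "weaviate": "vector_db",
    "sentence-transformers": "embedder",
    "spacy": "embedder",
    "gensim": "embedder",
    "transformers": "llm",
}

# The RAG rule requires one library of each of these types.
RAG_REQUIRED_TYPES = frozenset({"vector_db", "embedder", "llm"})


def detect_rag(sbom_packages):
    """Return (is_rag, evidence) for a list of package names.

    evidence maps each matched library type to the packages
    that contributed to it, so they can be tagged individually.
    """
    evidence = {}
    for name in sbom_packages:
        lib_type = LIBRARY_TYPES.get(name.lower())
        if lib_type is not None:
            evidence.setdefault(lib_type, []).append(name)
    # The rule fires only if every required type is present.
    is_rag = RAG_REQUIRED_TYPES.issubset(evidence)
    return is_rag, evidence


if __name__ == "__main__":
    packages = ["sentence-transformers", "elasticsearch", "transformers", "requests"]
    is_rag, evidence = detect_rag(packages)
    print(is_rag, evidence)
```

In this sketch, the evidence dictionary plays the role of the per-library indication: the packages it lists are the ones that would be marked as contributing to the RAG label, while unrelated packages such as requests are ignored.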
These indicators help to better understand the AI-related risk landscape of the organization, and allow it to mitigate the risks more effectively.
Overall, with the Shadow AI report an organization has more visibility into the following potential risks:
OWASP Top 10 LLM risks
Data security vulnerabilities
ML model license compliance
Malicious ML model activity
ML model vulnerabilities
Getting It Done
Prerequisites
Your Mend organization has an LLM entitlement
Your Mend organization has access to the Mend Platform to view the results
Scan your dependencies with the Mend CLI SCA
To run a scan and generate results using the new AI BoM report capability, follow the steps for initiating a scan in our documentation, Scan your open source components (SCA) with the Mend CLI. Once the scan is completed, you can view the AI technologies embedded in the BoM report in the Mend Platform and analyze the results.
View the results in the Mend Platform
You can access the Shadow AI report on two levels:
Project - a project identified as containing AI technologies carries dedicated Shadow AI labels.
Project Libraries - in a project SBOM, each library that belongs to a group of libraries making up an AI technology carries an added Tech tag.
To view Shadow AI project labels, navigate to an application’s project list in the Mend Platform.
Next, to view the libraries of a Shadow AI labeled project, choose a labeled project and navigate to the SBOM view. In the SBOM view, any library identified as part of the Shadow AI label carries an indication similar to the project label in the Tech Tags column.
Note that a project may be labeled with multiple Shadow AI labels, and a library may have multiple Tech Tags.