
Shadow Hugging Face Models in Code

Overview

Mend AI uses regex-based detection to identify shadow open-source AI models referenced in code. It extracts direct and nearby references to models in widely used Hugging Face APIs, including pipelines, transformers, and diffusers.
This approach prioritizes accuracy, minimizing false positives and reducing overall noise.
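To make the idea of regex-based extraction concrete, here is a minimal sketch of how a direct model reference could be pulled out of source text. The patterns and the function name `find_direct_model_refs` are hypothetical and written only for illustration; they are not Mend AI's actual detection rules.

```python
import re

# Hypothetical patterns for illustration; not Mend AI's actual rules.
# A string literal passed as model=... inside a pipeline(...) call.
PIPELINE_MODEL = re.compile(
    r"""pipeline\s*\([^)]*?\bmodel\s*=\s*(['"])(?P<model>[^'"]+)\1"""
)
# A string literal passed as the first argument of .from_pretrained(...).
FROM_PRETRAINED = re.compile(
    r"""\.from_pretrained\s*\(\s*(['"])(?P<model>[^'"]+)\1"""
)


def find_direct_model_refs(source: str) -> list[str]:
    """Return model names that appear as string literals in the calls above."""
    found = []
    for pattern in (PIPELINE_MODEL, FROM_PRETRAINED):
        for match in pattern.finditer(source):
            found.append(match.group("model"))
    return found


sample = '''
from transformers import pipeline
from diffusers import StableDiffusionPipeline
model = pipeline('text-classification', model='bert-base-uncased')
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
'''

print(find_direct_model_refs(sample))
# ['bert-base-uncased', 'CompVis/stable-diffusion-v1-4']
```

Anchoring each pattern to a specific API call, rather than matching any string that looks like a model name, is one plausible way to keep false positives low.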

Furthermore, Mend AI detects references to the top 100 most commonly used licenses and gated models in codebases.

Limitations

Model references that are built through string concatenation or complex variable assignments are not currently supported.
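The snippet below illustrates the kind of code this limitation likely refers to. The variable names and the `MODELS` dictionary are hypothetical, and exactly which patterns go undetected is an assumption based on the description above.

```python
# Hypothetical examples of model names assembled at runtime.
org = "CompVis"
repo = "stable-diffusion-v1-4"

# String concatenation: the full model ID never appears as a single literal.
concatenated_id = org + "/" + repo

# Complex assignment: the name is looked up from a nested data structure.
MODELS = {"classifier": {"name": "bert-base-uncased"}}
task = "classifier"
looked_up_id = MODELS[task]["name"]

print(concatenated_id)  # CompVis/stable-diffusion-v1-4
print(looked_up_id)     # bert-base-uncased
```

Keeping each model identifier as a single string literal, assigned directly to a variable or passed directly to the API, should give detection the best chance of finding it.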

Getting it done

Detected Hugging Face models are automatically added to the AI Inventory, where their origin, engine, and additional details are available.

Code examples

  1. Direct API Call with Model Name:

CODE
from transformers import pipeline
model = pipeline('text-classification', model='bert-base-uncased')

'bert-base-uncased' will be indicated as the detected model.

  2. Model Name Stored in a Variable:

CODE
from transformers import pipeline
model_name = 'bert-base-uncased'
model = pipeline('text-classification', model=model_name)

'bert-base-uncased' will be extracted from the variable assignment and indicated as the detected model.

  3. Model Name Passed to a Function:

CODE
from transformers import pipeline
def load_model(name):
    return pipeline('text-classification', model=name)
model_instance = load_model('bert-base-uncased')

'bert-base-uncased' will be extracted from the function call and indicated as the detected model.

  4. Diffusers:

CODE
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

"CompVis/stable-diffusion-v1-4" will be extracted and indicated as the detected model.

  5. Variable Passed to a Function with Intermediate Assignment:

CODE
from transformers import AutoModel
# Full Hugging Face model ID, including the organization prefix
model_name = "EleutherAI/gpt-neo-1.3B"
def initialize_model(name):
    model = AutoModel.from_pretrained(name)
    return model
ai_model = initialize_model(model_name)

"EleutherAI/gpt-neo-1.3B" will be extracted from the intermediate variable and indicated as the detected model.
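The indirect cases above, where a model name is stored in a variable or passed through a function, can be approximated with a second lookup step. The following is a minimal sketch, assuming simple one-parameter functions and plain `name = 'literal'` assignments. All patterns and helper names (`NAME_USE`, `FUNC_DEF`, `literal_assigned_to`, `resolve_indirect_refs`) are hypothetical and do not represent Mend AI's implementation.

```python
import re

# Hypothetical patterns for illustration; not Mend AI's actual rules.
# A bare identifier used as model=<name> or as the first argument
# of .from_pretrained(<name>).
NAME_USE = re.compile(
    r"(?:\bmodel\s*=\s*|\.from_pretrained\s*\(\s*)(?P<name>[A-Za-z_]\w*)\s*[,)]"
)
# A function definition with exactly one parameter.
FUNC_DEF = re.compile(r"def\s+(?P<func>\w+)\s*\(\s*(?P<param>\w+)\s*\)\s*:")


def literal_assigned_to(var, source, end):
    """Return the literal from the last `var = '...'` line before offset `end`."""
    assignment = re.compile(
        rf"""^[ \t]*{re.escape(var)}\s*=\s*(['"])(?P<value>[^'"]+)\1[ \t]*$""",
        re.MULTILINE,
    )
    matches = list(assignment.finditer(source, 0, end))
    return matches[-1].group("value") if matches else None


def resolve_indirect_refs(source):
    """Resolve model names that reach the API via a variable or a function argument."""
    param_to_func = {
        d.group("param"): d.group("func") for d in FUNC_DEF.finditer(source)
    }
    found = []
    for use in NAME_USE.finditer(source):
        name = use.group("name")
        if name not in param_to_func:
            # Case: the model name is stored in a variable.
            value = literal_assigned_to(name, source, use.start())
            if value:
                found.append(value)
            continue
        # Case: the name is a function parameter, so inspect calls to that
        # function (the lookbehind skips the definition itself).
        call = re.compile(
            rf"""(?<!def )\b{re.escape(param_to_func[name])}\s*\(\s*"""
            r"""(?:(['"])(?P<lit>[^'"]+)\1|(?P<var>[A-Za-z_]\w*))\s*\)"""
        )
        for c in call.finditer(source):
            value = c.group("lit") or literal_assigned_to(
                c.group("var"), source, c.start()
            )
            if value:
                found.append(value)
    return found


sample = '''
from transformers import pipeline
model_name = 'bert-base-uncased'
def load_model(name):
    return pipeline('text-classification', model=name)
model_instance = load_model(model_name)
'''

print(resolve_indirect_refs(sample))
# ['bert-base-uncased']
```

A lookup like this only follows names that are assigned a single string literal, which is consistent with the limitation on string concatenation and complex assignments.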

 
