Shadow Hugging Face Models in Code

Overview

Mend AI detects shadow open-source AI models referenced in code by extracting direct and nearby model references from calls to widely used Hugging Face APIs, including pipelines, transformers, and diffusers.
Regex-based detection prioritizes accuracy, minimizing false positives and reducing overall noise.
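
To make the idea of regex-based extraction concrete, here is a minimal sketch. The pattern and the `find_model_names` helper are hypothetical illustrations, not Mend AI's actual detection rules. It assumes the model name appears as a quoted string in a `model=` argument or as the first argument of `.from_pretrained(...)`.

```python
import re

# Hypothetical pattern, not Mend AI's actual rule: matches a quoted string
# passed as model=... or as the first argument of .from_pretrained(...).
MODEL_REF = re.compile(
    r"""(?:\bmodel\s*=\s*|\.from_pretrained\(\s*)(['"])([\w.\-]+(?:/[\w.\-]+)?)\1"""
)


def find_model_names(source: str) -> list[str]:
    """Return model names that appear as string literals in `source`."""
    return [match.group(2) for match in MODEL_REF.finditer(source)]


# The sample is scanned as plain text; nothing in it is imported or executed.
sample = '''
from transformers import pipeline
from diffusers import StableDiffusionPipeline
clf = pipeline('text-classification', model='bert-base-uncased')
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
clf2 = pipeline('text-classification', model=model_name)
'''

print(find_model_names(sample))
```

In this sketch the last call is skipped because `model_name` is a variable rather than a string literal, which is the kind of reference a purely pattern-based approach cannot resolve on its own.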

Limitations

Model references built through string concatenation or complex variable assignments are not currently supported.
Model names must be specified explicitly within the API calls (see examples at the bottom of this page).
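
As an illustration of the limitations above, the following sketch shows two ways a model name might be built indirectly. The variable names are hypothetical, and treating a dictionary lookup as a "complex variable assignment" is an assumption about where the boundary lies. The Hugging Face calls appear only in comments, so nothing is downloaded.

```python
# Hypothetical examples of references that fall under the limitations above.
# No Hugging Face call is executed here; the calls appear only in comments.

# 1. Model name assembled by string concatenation.
org = "CompVis"
repo = "stable-diffusion-v1-4"
concatenated_id = org + "/" + repo
# StableDiffusionPipeline.from_pretrained(concatenated_id)

# 2. Model name reached through a more complex assignment (assumed example:
#    a dictionary lookup instead of a plain string assignment).
models_by_task = {"text-classification": "bert-base-uncased"}
looked_up_id = models_by_task["text-classification"]
# pipeline("text-classification", model=looked_up_id)

print(concatenated_id)
print(looked_up_id)
```

In both cases the full model name never appears as a single string literal next to the API call, which is why such references may go undetected.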

Getting it done

Detected Hugging Face models are automatically added to the AI Model Inventory, where their origin, engine, and additional details are available.

Code examples

Direct API call with model name (pattern-based):

CODE

from transformers import pipeline
model = pipeline('text-classification', model='bert-base-uncased')

"bert-base-uncased" will be extracted and indicated as the detected model.

Diffusers (pattern-based):

CODE

from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

"CompVis/stable-diffusion-v1-4" will be extracted and indicated as the detected model.

Model name stored in a variable (AI-based):

CODE

from transformers import pipeline
model_name = 'bert-base-uncased'
model = pipeline('text-classification', model=model_name)

"bert-base-uncased" will be extracted and indicated as the detected model.
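
This page does not describe how the AI-based detection works. To show what resolving a variable back to its model name involves, the sketch below uses a different, purely static technique: Python's standard `ast` module. The `resolve_model_kwargs` helper is hypothetical, is not Mend AI's method, and only handles simple `name = "literal"` assignments.

```python
import ast


def resolve_model_kwargs(source: str) -> list[str]:
    """Resolve `model=<name>` keyword arguments to string literals where possible."""
    tree = ast.parse(source)

    # Record simple assignments of the form `name = "literal"`.
    literals = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target, value = node.targets[0], node.value
            if (
                isinstance(target, ast.Name)
                and isinstance(value, ast.Constant)
                and isinstance(value.value, str)
            ):
                literals[target.id] = value.value

    # Collect each `model=` keyword argument that is a literal or a known variable.
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg != "model":
                    continue
                if isinstance(kw.value, ast.Constant) and isinstance(kw.value.value, str):
                    found.append(kw.value.value)
                elif isinstance(kw.value, ast.Name) and kw.value.id in literals:
                    found.append(literals[kw.value.id])
    return found


# The sample is only parsed, never executed, so transformers is not required.
sample = '''
from transformers import pipeline
model_name = 'bert-base-uncased'
model = pipeline('text-classification', model=model_name)
'''

print(resolve_model_kwargs(sample))
```

This sketch does not follow values through function parameters, concatenation, or reassignment.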

Model name passed to a function (AI-based):

CODE

from transformers import pipeline
def load_model(name):
    return pipeline('text-classification', model=name)
model_instance = load_model('bert-base-uncased')

"bert-base-uncased" will be extracted and indicated as the detected model.