This guide walks through enabling the reachability cache for the Mend self-hosted repository integration. It covers both supported caching options:
- Mend-hosted cache (simplest, no infrastructure required).
- Customer-hosted cache using your own S3 or S3-compatible bucket.
1. Background
When SCA Reachability is enabled, Mend analyzes each library to determine which vulnerable code is actually reachable from your application. As part of this analysis, Mend generates a "dot file" dependency graph for each library (for example, each .jar). The first time a library is encountered, its graph is calculated on the fly and then stored in a cache. On subsequent scans, the cached graph is reused instead of being recalculated.
The benefit of the cache is performance: reachability scans complete faster because previously analyzed libraries do not need to be reprocessed. The cache is shared across scans, so the more you scan, the greater the benefit.
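The compute-on-miss behavior described above can be sketched in a few lines of Python. This is an illustration only, not Mend's implementation: `GraphCache`, `get_or_compute`, and `fake_build_graph` are hypothetical names, and a plain dictionary stands in for the real cache backend.

```python
from typing import Callable, Dict


class GraphCache:
    """Toy compute-on-miss cache keyed by library identifier (hypothetical)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}  # stand-in for the real cache backend
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, library: str, build_graph: Callable[[str], str]) -> str:
        if library in self._store:
            self.hits += 1  # later scans: reuse the stored graph
            return self._store[library]
        self.misses += 1  # first encounter: compute the graph, then store it
        graph = build_graph(library)
        self._store[library] = graph
        return graph


def fake_build_graph(library: str) -> str:
    # Placeholder for the expensive per-library analysis step.
    return f'digraph "{library}" {{}}'
```

Calling `get_or_compute` twice for the same library runs `fake_build_graph` only once, which is the saving the cache provides across scans.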
You have two ways to provide that cache, described below. You only need to configure one of them.
2. Prerequisites (common to both options)
These apply regardless of which caching option you choose.
2.1 Reachability must be enabled
Reachability must be turned on in your scan settings, for example:
"scanSettings": {
  "enableReachability": true
}
Reachability (and therefore the cache) currently applies to the supported reachability languages, including Java, JavaScript, and Python.
2.2 The orchestrator must be enabled
The scanner orchestrator must be active:
MEND_SCA_ORCHESTRATOR_ENABLED=true
2.3 Environment variables go on the scanner pods
All caching-related environment variables must be set on the scanner pods specifically, not the controller or any other component. The chart's key for injecting environment variables varies by version (commonly scanner.extraEnv or scanner.env); the examples below use scanner.extraEnv, so map it to whatever your chart exposes.
3. Option A: Mend-hosted cache (recommended to start)
With this option, Mend hosts and manages the cache for you. There is no bucket to provision, no credentials to manage, and no IAM policy to configure. This is the fastest way to get the cache working and is a good way to validate reachability end to end before deciding whether to host your own cache.
Configuration
Set the following on the scanner pods:
scanner:
  extraEnv:
    - name: MEND_SCA_ORCHESTRATOR_ENABLED
      value: "true"
    - name: MEND_SCA_REACHABILITY_CACHE
      value: "true"
That is the entire configuration for the Mend-hosted option. Verify the cache is active per Section 5.
Note: Do not combine this with the customer-hosted variables in Section 4. Choose one option.
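Because the two options are mutually exclusive, it can help to check a rendered environment before deploying. The Python sketch below is a hypothetical pre-flight check, not part of the scanner or the chart: `detect_cache_mode` is an invented helper that takes the scanner pod's environment as a dictionary and reports which option it appears to select. It assumes the customer-hosted option can be recognized by the presence of any `REACHABILITY_AWS_S3_*` variable.

```python
from typing import Mapping

# Assumption: every customer-hosted variable shares this name prefix.
CUSTOMER_PREFIX = "REACHABILITY_AWS_S3_"


def detect_cache_mode(env: Mapping[str, str]) -> str:
    """Return 'mend-hosted', 'customer-hosted', or 'none'; raise on invalid input."""
    if env.get("MEND_SCA_ORCHESTRATOR_ENABLED", "").lower() != "true":
        raise ValueError("MEND_SCA_ORCHESTRATOR_ENABLED must be set to true")
    mend_hosted = env.get("MEND_SCA_REACHABILITY_CACHE", "").lower() == "true"
    customer_hosted = any(name.startswith(CUSTOMER_PREFIX) for name in env)
    if mend_hosted and customer_hosted:
        raise ValueError("Choose one option: Mend-hosted or customer-hosted, not both")
    if mend_hosted:
        return "mend-hosted"
    if customer_hosted:
        return "customer-hosted"
    return "none"
```

For the configuration shown above, `detect_cache_mode` returns `"mend-hosted"`. Adding any customer-hosted variable to the same environment raises an error instead of silently picking one.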
4. Option B: Customer-hosted cache (your own S3-compatible bucket)
With this option, the cache data is stored in a bucket that you own and control. This keeps the cached graphs within your own environment. The scanner supports both AWS S3 and other S3-compatible object stores (for example, MinIO, or any vendor that exposes an S3-compatible API).
4.1 Configuration variables
Set the following on the scanner pods. The bucket and prefix have defaults but should be set explicitly for clarity.
| Variable | Required | Default | Description |
|---|---|---|---|
| REACHABILITY_AWS_S3_ENDPOINT | Optional | | The S3 endpoint URL. Leave at default for AWS S3 (us-east-1), set to a regional AWS endpoint, or set to your S3-compatible server URL. |
| REACHABILITY_AWS_S3_REGION | Optional | | The bucket region. |
| REACHABILITY_AWS_S3_BUCKET | Optional | | The bucket name to use for the cache. |
| REACHABILITY_AWS_S3_KEY_PREFIX | Optional | | The key prefix (logical folder) under which cache objects are stored. |
| REACHABILITY_AWS_S3_ACCESS_KEY | Required* | (none) | Access key for the bucket. |
| REACHABILITY_AWS_S3_SECRET_KEY | Required* | (none) | Secret key for the bucket. |
* See the authentication note in Section 4.2.
4.2 Important: authentication method
The scanner accepts an access key and secret key for the bucket. Configure authentication using these two variables:
REACHABILITY_AWS_S3_ACCESS_KEY=<your-access-key>
REACHABILITY_AWS_S3_SECRET_KEY=<your-secret-key>
Two points to be aware of:
- Use long-lived keys. The scanner does not provide a field for a session token, so temporary/session credentials (for example, AWS keys beginning with ASIA) cannot be used. On AWS, use a long-lived key (beginning with AKIA) issued to a dedicated IAM user. On other S3-compatible systems, use a standard (non-expiring) access/secret key pair created for a service account.
- IAM role-based authentication is currently not working for the cache. On AWS, attaching an IAM role to the instance and omitting the keys is not sufficient at this time. A problem with the IAM role-based authentication method has been identified and Mend R&D is currently investigating it. Until it is resolved, please configure the access key and secret key as above.
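On AWS, the key-ID prefix is enough to tell the two credential types apart before deploying. The helper below is a minimal sketch with a hypothetical name (`classify_aws_access_key`); it applies only to AWS-issued key IDs, since other S3-compatible systems use their own key formats. The key IDs in the usage note are placeholders, not real credentials.

```python
def classify_aws_access_key(access_key_id: str) -> str:
    """Classify an AWS access key ID by its prefix (AWS only)."""
    if access_key_id.startswith("AKIA"):
        return "long-lived"  # usable with the scanner's two key variables
    if access_key_id.startswith("ASIA"):
        return "temporary"  # needs a session token, which the scanner cannot accept
    return "unknown"  # not an AWS-style key ID, for example a MinIO key
```

For example, `classify_aws_access_key("AKIAIOSFODNN7EXAMPLE")` returns `"long-lived"`, while a key ID starting with `ASIA` returns `"temporary"` and should be replaced before it is placed in the Secret.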
4.3 Using AWS S3
- Create a dedicated IAM user for the scanner (for example, mend-reachability-cache). Do not use a personal user's credentials.
- Attach a least-privilege policy to that user. Replace YOUR-BUCKET-NAME and the prefix as needed:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "ListAllBuckets",
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "*"
          },
          {
            "Sid": "ListTargetBucket",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::YOUR-BUCKET-NAME"
          },
          {
            "Sid": "ReadWriteCacheObjects",
            "Effect": "Allow",
            "Action": [
              "s3:GetObject",
              "s3:PutObject",
              "s3:DeleteObject"
            ],
            "Resource": "arn:aws:s3:::YOUR-BUCKET-NAME/dot-file-cache/*"
          }
        ]
      }

  Why s3:ListAllMyBuckets is included: when using an access key and secret key, the scanner validates the S3 client at startup by listing buckets at the account level. This action cannot be scoped to a single bucket (its resource must be *), but it only exposes bucket names, not object contents. Without it, the scanner reports an authorization error similar to:

      S3 client is not valid: User: ... is not authorized to perform: s3:ListAllMyBuckets ...

  Note: The s3:ListAllMyBuckets requirement is tied to the current access-key authentication path. It may change in a future scanner release. For now it should be included in the policy.
- Generate an access key for the IAM user and record the access key ID (AKIA...) and secret (shown only once).
- Create the bucket (for example reachability-cache) in your chosen region if it does not already exist.
- Configure the scanner environment. Non-secret values inline; the keys referenced from a Secret (see Section 4.5):

      scanner:
        extraEnv:
          - name: MEND_SCA_ORCHESTRATOR_ENABLED
            value: "true"
          - name: REACHABILITY_AWS_S3_REGION
            value: "us-east-2"
          - name: REACHABILITY_AWS_S3_BUCKET
            value: "reachability-cache"
          - name: REACHABILITY_AWS_S3_KEY_PREFIX
            value: "dot-file-cache"
          - name: REACHABILITY_AWS_S3_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                name: reachability-cache-creds
                key: accessKey
          - name: REACHABILITY_AWS_S3_SECRET_KEY
            valueFrom:
              secretKeyRef:
                name: reachability-cache-creds
                key: secretKey
          # REACHABILITY_AWS_S3_ENDPOINT can be omitted for AWS S3;
          # the region above is sufficient for routing.
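If you manage several buckets or prefixes, the policy in this section can be generated rather than edited by hand. The sketch below rebuilds the same three statements from a bucket name and prefix. `build_cache_policy` and `render_policy` are hypothetical helper names; the statement contents are copied from the policy above.

```python
import json


def build_cache_policy(bucket: str, prefix: str = "dot-file-cache") -> dict:
    """Build the least-privilege policy from this section for one bucket and prefix."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Account-level listing, used by the scanner's client validation.
            {"Sid": "ListAllBuckets", "Effect": "Allow",
             "Action": "s3:ListAllMyBuckets", "Resource": "*"},
            # Listing is granted on the bucket itself.
            {"Sid": "ListTargetBucket", "Effect": "Allow",
             "Action": "s3:ListBucket", "Resource": bucket_arn},
            # Object access is limited to keys under the cache prefix.
            {"Sid": "ReadWriteCacheObjects", "Effect": "Allow",
             "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
             "Resource": f"{bucket_arn}/{prefix.strip('/')}/*"},
        ],
    }


def render_policy(bucket: str, prefix: str = "dot-file-cache") -> str:
    """Return the policy as indented JSON text."""
    return json.dumps(build_cache_policy(bucket, prefix), indent=2)
```

`render_policy("reachability-cache")` returns JSON text that can be pasted into the IAM console. Keep the prefix argument in step with REACHABILITY_AWS_S3_KEY_PREFIX so the object statement covers the keys the scanner actually writes.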
4.4 Using other S3-compatible storage (for example, MinIO)
The configuration is the same, with two differences: you point the endpoint at your own server, and you create the access policy in your storage system rather than in AWS IAM.
- Create the bucket on your S3-compatible system (for example, reachability-cache).
- Create a service account / user with an access key and secret key, and grant it read/write on the bucket and prefix. Many S3-compatible systems (including MinIO) accept the same policy JSON syntax shown in Section 4.3, so the policy above can typically be reused, including the s3:ListAllMyBuckets statement that the scanner uses for client validation.
- Configure the scanner environment, pointing the endpoint at your server (keys referenced from a Secret, see Section 4.5):

      scanner:
        extraEnv:
          - name: MEND_SCA_ORCHESTRATOR_ENABLED
            value: "true"
          - name: REACHABILITY_AWS_S3_ENDPOINT
            value: "https://minio.your-domain.internal:9000"
          - name: REACHABILITY_AWS_S3_REGION
            value: "us-east-1"
          - name: REACHABILITY_AWS_S3_BUCKET
            value: "reachability-cache"
          - name: REACHABILITY_AWS_S3_KEY_PREFIX
            value: "dot-file-cache"
          - name: REACHABILITY_AWS_S3_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                name: reachability-cache-creds
                key: accessKey
          - name: REACHABILITY_AWS_S3_SECRET_KEY
            valueFrom:
              secretKeyRef:
                name: reachability-cache-creds
                key: secretKey

Notes:
- REACHABILITY_AWS_S3_ENDPOINT is required here and must point at your S3-compatible server (scheme, host, and port).
- REACHABILITY_AWS_S3_REGION can be set to whatever your system expects; many S3-compatible systems accept us-east-1.
- If the endpoint uses a privately-signed certificate, ensure the scanner pod trusts the issuing CA.
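A malformed endpoint value is easy to catch before it reaches the cluster. The following sketch uses Python's standard `urllib.parse` to check that a URL carries a scheme, host, and port. `check_endpoint` is a hypothetical helper, and it treats a missing port as a problem only because the notes above ask for one; relax that check if your server listens on the scheme's default port.

```python
from typing import List
from urllib.parse import urlsplit


def check_endpoint(url: str) -> List[str]:
    """Return a list of problems found in an S3-compatible endpoint URL."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    if not parts.hostname:
        problems.append("missing host")
    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        port = None
    if port is None:
        problems.append("missing or invalid port")
    return problems
```

An empty list means the value has all three parts. For example, `check_endpoint("https://minio.your-domain.internal:9000")` returns `[]`, while the same URL without `:9000` reports the missing port.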
4.5 The credentials Secret
The scanner env examples above reference a Secret named reachability-cache-creds with keys accessKey and secretKey. Create it through whatever mechanism you normally use (plain Secret, External Secrets, Sealed Secrets, Vault, etc.); the example assumes:
kubectl create secret generic reachability-cache-creds -n <namespace> \
--from-literal=accessKey=<access-key> \
--from-literal=secretKey=<secret-key>
Adjust the Secret name and keys in the secretKeyRef blocks if you prefer different names.
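If you manage Secrets declaratively rather than with `kubectl create`, an equivalent manifest can be generated. The sketch below builds a standard Kubernetes `Opaque` Secret as a dictionary, with the base64-encoded `data` values that Kubernetes expects. `build_secret_manifest` is a hypothetical helper, and the values in the usage note are placeholders; do not commit real keys to source control.

```python
import base64
import json


def build_secret_manifest(name: str, namespace: str,
                          access_key: str, secret_key: str) -> dict:
    """Build a Secret manifest with the accessKey/secretKey entries used above."""

    def b64(value: str) -> str:
        # Kubernetes stores Secret 'data' values base64-encoded.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {"accessKey": b64(access_key), "secretKey": b64(secret_key)},
    }


def render_secret(manifest: dict) -> str:
    """Return the manifest as JSON text, which kubectl apply -f accepts."""
    return json.dumps(manifest, indent=2)
```

For example, `render_secret(build_secret_manifest("reachability-cache-creds", "<namespace>", "<access-key>", "<secret-key>"))` yields a manifest whose name and keys match the secretKeyRef blocks in Sections 4.3 and 4.4.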
5. Verifying that the cache is working
By default, the scanner logs do not show much detail about caching. To confirm the cache is active, enable the diagnostic logging variable on the scanner pods:
scanner:
  extraEnv:
    - name: MEND_SCA_REACHABILITY_DEBUG_LOG
      value: "true"
Then run a reachability scan on a repository in a supported language (Java, JavaScript, or Python) and check the scanner logs. With the debug variable enabled, you should see log lines describing cache activity; filtering on reachability, cache, or s3 helps surface them.
What you should expect to see. The examples below are trimmed and have context IDs redacted for readability. The log lines differ slightly between the two options: the initialization and lookup lines indicate which backend is in use, while the "cache hit" line is the same in both cases.
Option A (Mend-hosted cache). Three kinds of lines confirm the cache is working:
- Initialization, indicating the scanner is using Mend's cloud storage service and is not configuring a direct S3 client:

      wss-scanner | ... [Reachability] DEBUG c.m.p.c.s.a.repositories.dot.CloudS3 -- Mend cloud storage service enabled, skipping S3 configuration

- Cache lookups / reads, served over the Mend cloud storage service:

      wss-scanner | ... [Reachability] DEBUG c.m.p.c.s.a.repositories.dot.CloudS3 -- Cloud storage GET request => status code 200 OK

- Cache hits, where an existing graph is read back instead of being recalculated:

      wss-scanner | ... [Reachability] DEBUG c.m.p.c.converters.InputConverter -- Downloaded graph from S3 cache for boom-boom-0.4.2.tgz-0.4.2
      wss-scanner | ... [Reachability] DEBUG c.m.p.c.converters.InputConverter -- Downloaded graph from S3 cache for debug-debug-2.2.0.tgz-2.2.0
Option B (customer-hosted cache). The same three kinds of lines, but the initialization and lookup lines reference your own S3-compatible bucket:
- Client initialization, confirming the scanner built an S3 client using the credentials you provided:

      wss-scanner | ... [Reachability] DEBUG c.m.p.c.s.a.repositories.dot.CloudS3 -- Building S3 client with credentials provided

- Cache lookups, one per library, checking whether a cached graph already exists under the prefix:

      wss-scanner | ... [Reachability] DEBUG c.m.p.c.s.a.repositories.dot.CloudS3 -- MinIO S3 object status: dot-file-cache/js/v1.2/lodash-lodash-4.17.21.tgz-4.17.21_<hash>.dot
      wss-scanner | ... [Reachability] DEBUG c.m.p.c.s.a.repositories.dot.CloudS3 -- MinIO S3 object status: dot-file-cache/js/v1.2/handlebars-handlebars-1.0.2-beta.tgz-1.0.2-beta_<hash>.dot

- Cache hits, identical in form to Option A:

      wss-scanner | ... [Reachability] DEBUG c.m.p.c.converters.InputConverter -- Downloaded graph from S3 cache for handlebars-handlebars-1.0.2-beta.tgz-1.0.2-beta
The full log lines also include a [CTX=...;SCAN_CTX=...;SCAN_ID=...] prefix and scanner metadata (agent, agent_version, engine, and so on), omitted here. For the customer-hosted option, the MinIO S3 object status wording reflects a cache lookup in this example, and the object key always sits under your configured REACHABILITY_AWS_S3_KEY_PREFIX (here, dot-file-cache/). In both options, seeing lookups/reads followed by downloads confirms the cache is being read; on a library's first encounter there is no cached graph, so the scanner computes it and uploads it for reuse on later scans.
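To avoid reading the logs by eye, the three kinds of lines can be counted with a short script. The sketch below matches on message fragments copied from the sample lines in this section. `MARKERS` and `summarize_cache_logs` are hypothetical names, and the fragments are assumptions that may need adjusting if your scanner version or storage backend words these messages differently.

```python
from collections import Counter
from typing import Iterable

# Message fragments taken from the sample log lines in this section.
MARKERS = {
    "init_mend_hosted": "Mend cloud storage service enabled, skipping S3 configuration",
    "init_customer_hosted": "Building S3 client with credentials provided",
    "lookup_mend_hosted": "Cloud storage GET request",
    "lookup_customer_hosted": "S3 object status:",
    "cache_hit": "Downloaded graph from S3 cache for",
}


def summarize_cache_logs(lines: Iterable[str]) -> Counter:
    """Count initialization, lookup, and cache-hit lines in scanner log output."""
    counts: Counter = Counter()
    for line in lines:
        for kind, fragment in MARKERS.items():
            if fragment in line:
                counts[kind] += 1
                break  # each line is counted under at most one kind
    return counts
```

Feed it the scanner log lines (for example, the lines of a saved log file). A non-zero `cache_hit` count alongside one of the `init_*` counts indicates that graphs are being read back from the cache.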
For the customer-hosted option, also confirm objects accumulate in your bucket under the configured prefix, which proves the scanner is writing to it:
aws s3 ls s3://reachability-cache/dot-file-cache/ --region us-east-2
This diagnostic variable is intended for initial validation. Leave it on or remove it afterward per your logging preferences.
6. Troubleshooting
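| Symptom | Likely cause | Resolution |
|---|---|---|
| No cache-related lines in the scanner logs at all | Diagnostic logging not enabled | Set MEND_SCA_REACHABILITY_DEBUG_LOG to "true" on the scanner pods (Section 5). |
| Variables appear to be ignored | Set on the wrong pods, or the chart's env key was mismatched | Confirm they land on the scanner pods and that the env key matches what your chart exposes (Section 2.3). |
| S3 client is not valid: User: ... is not authorized to perform: s3:ListAllMyBuckets ... | Policy missing the account-level list action | Add the s3:ListAllMyBuckets statement to the policy (Section 4.3). |
| | Object actions not granted, or prefix mismatch | Ensure the policy allows s3:GetObject, s3:PutObject, and s3:DeleteObject under the configured prefix, and that REACHABILITY_AWS_S3_KEY_PREFIX matches the prefix in the policy (Section 4.3). |
| | Temporary (session) keys used, or wrong/inactive keys | Use long-lived keys (AKIA...) that are active (Section 4.2). |
| Cache appears inactive despite an IAM role being attached | Known issue with IAM role-based authentication, under investigation by Mend R&D | Use access key and secret key instead (Section 4.2). |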