Novamind AI/ML Security Pipeline

Baseline SAST + quality scanning on an ML pipeline, with the gaps named honestly

Three-stage ML pipeline - data, training, serving API. Bandit and Pylint run on every push, with findings committed back to the repo as artifacts. Not full AI threat coverage. A floor that most ML teams do not have yet.

Year: 2026
Role: DevSecOps for ML
Stack: Python, GitHub Actions
Coverage: SAST + quality

ML Pipeline, Bandit, Pylint, SAST, GitHub Actions, Evidence Committed

github.com/gocko1004/novamind-ai-security-pipeline

High-level design
The ML pipeline, the data flow, and where each scanner lives.
Three-stage ML pipeline on a single GitHub repo. Numbered arrows trace the commit-to-prediction path, 1 through 8. E is the evidence loop. Full step names in the legend.
Components
Actor: ML Engineer, commits to main
Raw dataset: local / object store
External (github.com): GitHub repo, source of truth
Actions runner: security-scan.yml, on every push
security-reports/: committed back, diffable history
ML pipeline (Python 3.11 / scikit-learn):
  data_pipeline.py: ingest + clean (Bandit)
  train.py: fit model (Bandit, Pylint)
  api.py: serving endpoint (Bandit)
  model.pkl: pickle artefact, not scanned (pickle deserialisation / supply risk)
Consumer: Client, HTTP /predict

Legend
1. git push
2. trigger
3. SAST scan (Bandit / Pylint)
4. raw data in
5. processed dataset
6. save pickle
7. load_model()
8. predict
E. commit scan artefacts
What is covered
Bandit SAST on all three Python stages → Pylint quality on train.py → both reports committed to security-reports/ on every push. Diffable, not ephemeral.
What is not covered yet
The pickle artefact is the largest blind spot. Model scanning, adversarial input tests on /predict, training-data poisoning checks - next iteration. The case study says so out loud.
A floor, not a ceiling. SAST on every stage is the baseline most ML teams still skip.
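Committed reports are only useful if someone reads them. As a minimal sketch of how the JSON evidence could be summarised, the snippet below counts findings by severity and by file. It assumes the report follows Bandit's JSON layout (a top-level "results" list whose entries carry "filename", "test_id", "line_number" and "issue_severity"). The sample findings are invented for illustration and are not taken from the repo.

```python
from collections import Counter

# Illustrative data only: shaped like the "results" list in Bandit's JSON
# output, but these three findings are invented for the example.
sample_report = {
    "results": [
        {"filename": "api.py", "line_number": 14, "test_id": "B301",
         "issue_severity": "MEDIUM", "issue_confidence": "HIGH"},
        {"filename": "api.py", "line_number": 3, "test_id": "B403",
         "issue_severity": "LOW", "issue_confidence": "HIGH"},
        {"filename": "train.py", "line_number": 2, "test_id": "B403",
         "issue_severity": "LOW", "issue_confidence": "HIGH"},
    ]
}


def severity_counts(report: dict) -> Counter:
    """Count findings per severity level in a parsed Bandit JSON report."""
    return Counter(f["issue_severity"] for f in report.get("results", []))


def findings_per_file(report: dict) -> Counter:
    """Count findings per scanned file, to see which stage is noisiest."""
    return Counter(f["filename"] for f in report.get("results", []))


print(severity_counts(sample_report))
print(findings_per_file(sample_report))
```

To run it against real evidence, replace sample_report with json.load() on whichever report file the workflow writes into security-reports/. The exact file name is not stated here, so it is left out.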

What is not here yet

Model-artifact scanning for pickle-based payloads. Training-data poisoning detection. Adversarial input testing against the serving API. Garak on the endpoint.

These belong in the next iteration. I am not claiming the current version replaces them - I am claiming it is the starting floor most ML teams have not built.
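To make the pickle blind spot concrete, here is a rough sketch of what a first model-artifact check could look like. It is not part of the current pipeline. It walks the pickle opcode stream with the standard library's pickletools.genops, lists the globals the file would import, and flags any outside an allowlist, all without loading the file. The allowlist, the helper names and the Payload class are hypothetical. The STACK_GLOBAL handling is a heuristic that looks at the last two string constants, so it can miss names that the pickler fetched from its memo.

```python
import pickle
import pickletools

# Hypothetical allowlist: top-level modules a scikit-learn model is
# expected to reference. Adjust to the real model before relying on it.
ALLOWED_MODULES = {"sklearn", "numpy", "scipy", "collections"}
STRING_OPCODES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def imported_globals(data: bytes) -> list:
    """List (module, name) pairs a pickle would import, without loading it."""
    found = []
    strings = []  # string constants seen so far, in stream order
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            strings.append(arg)
        elif opcode.name == "GLOBAL":
            # Older protocols: the argument is "module name" in one string.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name were pushed as strings just before.
            found.append((strings[-2], strings[-1]))
    return found


def disallowed_globals(data: bytes) -> list:
    """Return imports whose top-level module is not on the allowlist."""
    return [
        (module, name)
        for module, name in imported_globals(data)
        if module.split(".")[0] not in ALLOWED_MODULES
    ]


class Payload:
    """Stand-in for a tampered model: unpickling it would call eval()."""

    def __reduce__(self):
        return (eval, ("1 + 1",))


# Serialising does not execute anything; only pickle.loads() would.
tampered = pickle.dumps(Payload())
benign = pickle.dumps({"weights": [0.1, 0.2]})

print(disallowed_globals(tampered))
print(disallowed_globals(benign))
```

The tampered bytes should be reported as importing eval from builtins, while the plain dictionary imports nothing. A check like this is a tripwire, not a replacement for a dedicated model scanner or for signing the artefact.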

What this shows about me

I apply DevSecOps to ML, not just web apps.

The same pipeline discipline, different attack surface.

I commit the evidence.

security-reports/ gets the scan output on every push. Diffable history.

I know the difference between SAST and AI security.

I do not conflate them. The case study says so.

I ship baselines, not perfect systems.

A floor others can build on beats a ceiling nobody reaches.

Outcome

A working scan baseline on an ML pipeline, with evidence shipped as repo artifacts. The case study names exactly what the next layer of AI-specific controls needs to cover.
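Because the reports live in the repo, two revisions of the same report can be compared directly. The sketch below shows one possible way to do that: it reduces each Bandit-style report to a set of (file, test id, line) keys and takes set differences. The sample reports are invented, and keying on line numbers is a simplifying assumption, since an unrelated edit that shifts lines will show up as one resolved and one new finding.

```python
def finding_keys(report: dict) -> set:
    """Reduce a parsed Bandit-style report to comparable finding keys."""
    return {
        (f["filename"], f["test_id"], f["line_number"])
        for f in report.get("results", [])
    }


def new_findings(previous: dict, current: dict) -> list:
    """Findings present in the current report but not in the previous one."""
    return sorted(finding_keys(current) - finding_keys(previous))


def resolved_findings(previous: dict, current: dict) -> list:
    """Findings present in the previous report but gone from the current one."""
    return sorted(finding_keys(previous) - finding_keys(current))


# Invented example data standing in for two committed report revisions.
previous_report = {
    "results": [
        {"filename": "api.py", "test_id": "B301", "line_number": 14},
        {"filename": "train.py", "test_id": "B403", "line_number": 2},
    ]
}
current_report = {
    "results": [
        {"filename": "api.py", "test_id": "B301", "line_number": 14},
        {"filename": "data_pipeline.py", "test_id": "B608", "line_number": 40},
    ]
}

print("new:", new_findings(previous_report, current_report))
print("resolved:", resolved_findings(previous_report, current_report))
```

A step like this could fail a build only when new findings appear, so the existing backlog does not block every push.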

ML stages covered: 3 (data, train, serve)
Scanners: Bandit + Pylint
Evidence artifacts: JSON + text committed
Next layer: Model + adversarial testing
Honest framing: Baseline, not ceiling
Coursework origin: SecureAI hands-on lab
