Building Production-Grade Secrets Scanning: From 5 Keywords to 700+ Patterns
I just wrapped up a major security upgrade that's been bugging me for weeks. My memstack-pro repo had this embarrassingly basic secrets scanning setup — just 5 regex patterns checking for obvious stuff like api_key and password. Meanwhile, I had a manual secrets-scanner skill that could catch 14+ patterns but required running it by hand. Time to fix this properly.
The Problem
Here's what I was working with:
- Pre-push hook: 5 generic keywords via regex (pretty much useless)
- Manual scanner: 14+ patterns but advisory-only
- Zero pre-commit protection
- No entropy analysis or service-specific patterns
Basically, if someone committed their AWS keys or a JWT, my "security" would miss it completely unless the line also happened to contain one of those five keywords.
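To make the gap concrete, here's a minimal sketch of why keyword matching falls short. The keyword list is a hypothetical stand-in (the post only names api_key and password; the other three are assumed), and the AWS pattern is the widely documented access key ID shape, not necessarily what the real scanner uses.

```python
import re

# Hypothetical stand-in for the old five-keyword hook. Only api_key and
# password come from the post; the other three keywords are assumptions.
KEYWORD_RE = re.compile(r"(api_key|password|secret|token|private_key)", re.IGNORECASE)

# Service-specific pattern: AWS access key IDs are "AKIA" followed by
# 16 uppercase letters or digits.
AWS_KEY_ID_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# A hardcoded credential with no telltale keyword on the line.
# The value is the placeholder key ID used in AWS documentation.
line = 'client = connect("AKIAIOSFODNN7EXAMPLE")'

print(bool(KEYWORD_RE.search(line)))     # False: the keyword check misses it
print(bool(AWS_KEY_ID_RE.search(line)))  # True: the format-specific pattern catches it
```

A keyword check only fires when the variable name gives the secret away. A format-specific pattern matches the credential itself, which is what a large pattern library buys you.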
The Upgrade
I decided to implement production-grade scanning that covers 700+ credential formats. The tricky part was making it work seamlessly across different development environments without breaking existing workflows.
Two-Layer Protection
I built a dual-layer approach:
Pre-commit scanning — catches secrets at staging time before they enter git history. This runs on git commit and scans only the staged files, so it's fast.
Pre-push scanning — final safety net that scans across multiple commits. If something slipped through or was committed directly, this catches it before it hits the remote.
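The two layers differ mainly in what they ask git for. The sketch below shows one way to express the two scopes, assuming the pre-push layer compares against the branch's upstream (`@{u}`); the function names are hypothetical and the real hooks are bash, not Python.

```python
import subprocess

def staged_files_cmd() -> list[str]:
    """Pre-commit scope: names of files added, copied, or modified in the index."""
    return ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"]

def unpushed_commits_cmd(upstream: str = "@{u}") -> list[str]:
    """Pre-push scope: commits reachable from HEAD but not from the upstream.

    Assumption: the branch tracks an upstream. A brand-new branch would need
    a different range.
    """
    return ["git", "rev-list", f"{upstream}..HEAD"]

def run_lines(cmd: list[str]) -> list[str]:
    """Run a command and return its non-empty stdout lines."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return [line for line in result.stdout.splitlines() if line]

if __name__ == "__main__":
    # Inside a git repository this prints the files the pre-commit layer would scan.
    print(run_lines(staged_files_cmd()))
```

Keeping the pre-commit scope to staged files is what makes it fast. The pre-push scope is wider because it has to cover every commit that has not reached the remote yet.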
Smart Binary Discovery
Here's where it gets interesting. The scanning tool I'm using doesn't always land on the system PATH (especially with Windows package managers). So I built a discovery cascade that tries multiple locations:
- Check if it's on PATH
- Try WinGet install location
- Try Scoop location
- Try Go install paths
If none of those work, it silently falls back to the original regex patterns. No broken builds, no setup friction.
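The cascade can be sketched roughly like this. The directories are the usual defaults for WinGet, Scoop, and Go on a typical machine, which is an assumption about the setup, and `find_scanner` and `candidate_dirs` are hypothetical names. The scanner's binary name is passed in as a parameter.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional

def candidate_dirs(env: Mapping[str, str] = os.environ,
                   home: Optional[Path] = None) -> list[Path]:
    """Fallback directories to try, in order, when the binary is not on PATH.

    These are common default locations (assumed, not taken from the post).
    """
    home = home or Path.home()
    dirs: list[Path] = []
    local_app_data = env.get("LOCALAPPDATA")
    if local_app_data:
        # WinGet places links for portable packages here by default.
        dirs.append(Path(local_app_data) / "Microsoft" / "WinGet" / "Links")
    dirs.append(home / "scoop" / "shims")   # Scoop shims
    gopath = env.get("GOPATH")
    if gopath:
        # Simplification: treats GOPATH as a single directory.
        dirs.append(Path(gopath) / "bin")
    dirs.append(home / "go" / "bin")        # default GOPATH when unset
    return dirs

def find_scanner(name: str, dirs: list[Path]) -> Optional[str]:
    """Return the scanner's path, or None so the caller can fall back to regex."""
    on_path = shutil.which(name)
    if on_path:
        return on_path
    for directory in dirs:
        for filename in (name, name + ".exe"):
            candidate = directory / filename
            if candidate.is_file():
                return str(candidate)
    return None
```

A `None` result is the signal to run the original regex patterns instead, so a missing binary degrades coverage rather than failing the commit.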
Configuration That Makes Sense
I added a project-level config that excludes the obvious false positives:
- Test fixtures and example files
- Development diary entries
- Node modules and git internals
No point scanning fake credentials in test data.
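As a rough illustration of what those exclusions amount to, here's a sketch that skips any file living under an excluded directory. The directory names are hypothetical placeholders; the actual config format and paths depend on the scanner and the project layout.

```python
from pathlib import PurePosixPath

# Hypothetical directory names standing in for the project's real exclusions.
EXCLUDED_DIRS = {"node_modules", ".git", "fixtures", "examples", "diary"}

def is_excluded(path: str) -> bool:
    """True if any parent directory of the file is on the exclusion list."""
    # Normalize Windows separators so the same rule works on both platforms.
    parts = PurePosixPath(path.replace("\\", "/")).parts
    return any(part in EXCLUDED_DIRS for part in parts[:-1])

files = ["src/app.py", "tests/fixtures/fake_keys.json", "node_modules/lodash/index.js"]
print([f for f in files if not is_excluded(f)])  # ['src/app.py']
```

Matching on directory components rather than substrings avoids accidentally skipping a real source file whose name merely contains "examples".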
Implementation Details
The hooks are pretty straightforward bash scripts that integrate with my existing memstack framework. They're registered in settings.json as PreToolUse entries, so they fire automatically on the right git commands.
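The routing logic behind "fire on the right git commands" might look something like the sketch below. It assumes the hook is handed the shell command as a string, and `hook_for_command` is a hypothetical name. The real hooks are bash scripts and their matching logic isn't shown in this post.

```python
import shlex
from typing import Optional

# Which scanning layer each git subcommand should trigger.
LAYER_BY_SUBCOMMAND = {"commit": "pre-commit", "push": "pre-push"}

def hook_for_command(command: str) -> Optional[str]:
    """Map a shell command to the scanning layer it should trigger, if any.

    Limitation of this sketch: it only recognizes commands that start with
    `git <subcommand>`; forms like `git -C dir commit` or `cd x && git push`
    are not handled.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: treat as not a git command.
        return None
    if len(tokens) < 2 or tokens[0] != "git":
        return None
    return LAYER_BY_SUBCOMMAND.get(tokens[1])

print(hook_for_command('git commit -m "add login"'))  # pre-commit
print(hook_for_command("git push origin main"))       # pre-push
print(hook_for_command("git status"))                 # None
```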
I also updated my secrets-scanner skill documentation to reflect the new automated capabilities, while keeping it vendor-neutral (no tool names, just capability descriptions).
Why This Matters
Secrets in git history are a nightmare. Once they're pushed, rotating them is expensive and embarrassing. With detection for 700+ credential formats running automatically, I can code fast without worrying about accidentally committing API keys, database URLs, or auth tokens.
The fallback strategy also means this works for anyone using the framework, regardless of what they have installed. Production-grade security shouldn't require a PhD in DevOps to set up.
What's Next
I might add entropy-based detection for custom secrets that don't match known patterns. CI/CD integration could be useful too. But for now, this gives me confidence that the scanning is real protection, not security theater.
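For a sense of what entropy-based detection would involve, here's a minimal sketch using Shannon entropy over a token's characters. The length and threshold values are illustrative assumptions, not tuned numbers, and `looks_random` is a hypothetical helper.

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Shannon entropy of the string's character distribution, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(s).values())

def looks_random(token: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    """Flag long tokens whose characters are spread out enough to look generated.

    min_length and threshold are assumed values for illustration only.
    """
    return len(token) >= min_length and shannon_entropy(token) >= threshold

print(shannon_entropy("aaaa"))  # 0.0
print(shannon_entropy("abcd"))  # 2.0
```

Repetitive or word-like strings score low, while generated keys tend to score high, which is why this can catch secrets that match no known format. The tradeoff is false positives on hashes and minified assets, so it usually needs the same exclusions as the pattern scan.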
Time to ship some code without paranoia.