From Blog to Video: Building StreamStack's API Server with n8n Integration
Just wrapped up a solid coding session on StreamStack, and I'm pretty excited about what we got done today. We're getting really close to having a full blog-to-video pipeline that can take written content and automatically turn it into engaging videos.
The Big Migration: Raw HTTP to Express
I finally bit the bullet and migrated our server from the raw Node.js http module to Express. I know, I know - probably should've started with Express from day one, but sometimes you just start with what's simple and refactor later, right?
The new Express setup is much cleaner with proper middleware for security headers (helmet), CORS handling, and JSON parsing. I also improved the auth middleware to support both Authorization: Bearer and x-api-key headers, which gives us more flexibility for different integration scenarios.
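A minimal sketch of how the dual-header check could work, written as a framework-free helper so it is easy to test. The names `extractApiKey` and `isAuthorized` are hypothetical, and it assumes header names arrive lowercased, as Node's HTTP layer delivers them.

```typescript
// Hypothetical helper: header names are assumed lowercased (Node's default).
type HeaderMap = Record<string, string | string[] | undefined>;

function extractApiKey(headers: HeaderMap): string | null {
  // A header value can be a string or an array of strings; take the first.
  const first = (v: string | string[] | undefined): string | undefined =>
    Array.isArray(v) ? v[0] : v;

  // Prefer "Authorization: Bearer <key>" when it is present and well formed.
  const auth = first(headers["authorization"]);
  if (auth) {
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    if (match) return match[1];
  }

  // Otherwise fall back to the x-api-key header.
  const apiKey = first(headers["x-api-key"]);
  return apiKey && apiKey.trim() !== "" ? apiKey.trim() : null;
}

function isAuthorized(headers: HeaderMap, expectedKey: string): boolean {
  const key = extractApiKey(headers);
  return key !== null && key === expectedKey;
}
```

In an Express middleware this would be called with `req.headers`, responding with a 401 when `isAuthorized` returns false.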
Blog-to-Video Magic
The real meat of today's work was adding the ability to generate video scripts directly from blog post content. I added a new generateScriptFromContent() function that takes a blog post title, content, and niche, then uses Claude Haiku to transform it into a video script format.
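To give a rough idea of the input side, here is a sketch of how the three inputs might be assembled into a prompt before the model call. The `buildScriptPrompt` name, the prompt wording, and the 6000-character cap are all assumptions for illustration. The actual `generateScriptFromContent()` implementation and its call to Claude Haiku are not shown.

```typescript
// Hypothetical prompt builder; wording and length cap are illustrative only.
interface BlogPost {
  title: string;
  content: string;
  niche: string;
}

function buildScriptPrompt(post: BlogPost, maxContentChars = 6000): string {
  // Cap very long posts so the prompt stays small and cheap to process.
  const body =
    post.content.length > maxContentChars
      ? post.content.slice(0, maxContentChars) + "\n[truncated]"
      : post.content;

  return [
    `You are writing a short video script for the "${post.niche}" niche.`,
    "Turn the blog post below into a sequence of short narrated scenes.",
    "",
    `Title: ${post.title}`,
    "",
    body,
  ].join("\n");
}
```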
I went with Claude Haiku over the more powerful models because for short video scripts, the speed and cost savings are worth the slight quality tradeoff. Plus, it's still plenty good for this use case.
EpsteinScan Visual Theme
We're working on a specific theme for EpsteinScan content, so I implemented a sleek dark theme with:
- Pure black background (#000000)
- Orange text (#c2693a) for headlines
- EPSTEINSCAN.ORG watermark in the bottom right
- Orange bottom border bar on every scene
- Playfair Display font (replacing Inter) for that serious, investigative journalism vibe
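The list above could be captured in a single theme object along these lines. The `SceneTheme` shape and field names are hypothetical, and reusing the headline orange for the bottom bar is an assumption, since only "orange" is specified for it.

```typescript
// Hypothetical theme shape; only the values come from the list above.
interface SceneTheme {
  background: string;
  headlineColor: string;
  fontFamily: string;
  watermark: { text: string; position: "bottom-right" };
  bottomBarColor: string;
}

const epsteinScanTheme: SceneTheme = {
  background: "#000000",
  headlineColor: "#c2693a",
  fontFamily: "Playfair Display",
  watermark: { text: "EPSTEINSCAN.ORG", position: "bottom-right" },
  // Assumption: the bar uses the same orange as the headlines.
  bottomBarColor: "#c2693a",
};
```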
Voice Emotion Tags
One cool feature I added is emotion tag parsing for the text-to-speech generation. Now you can add tags like [whispers], [excited], [serious], or [dramatic pause] in your script, and they'll automatically map to different ElevenLabs voice settings. It's a small touch but really helps make the videos feel more dynamic.
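Here is a minimal sketch of the tag-parsing idea. It splits a script into segments and attaches a settings preset to each one. The `stability`, `similarity_boost`, and `style` fields mirror ElevenLabs voice settings, but every numeric value below is a made-up placeholder, and `parseEmotionTags` is a hypothetical name, not the project's actual function.

```typescript
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
  style: number;
}

// Placeholder values for illustration; real presets would be tuned by ear.
const DEFAULT_SETTINGS: VoiceSettings = { stability: 0.5, similarity_boost: 0.75, style: 0 };

const EMOTION_PRESETS: Record<string, Partial<VoiceSettings>> = {
  whispers: { stability: 0.8, style: 0.1 },
  excited: { stability: 0.3, style: 0.7 },
  serious: { stability: 0.7, style: 0.2 },
  "dramatic pause": { stability: 0.6, style: 0.4 },
};

interface Segment {
  text: string;
  settings: VoiceSettings;
}

function parseEmotionTags(script: string): Segment[] {
  const segments: Segment[] = [];
  let current: VoiceSettings = DEFAULT_SETTINGS;

  // Splitting on a capture group leaves tag names at the odd indices.
  const parts = script.split(/\[([^\]]+)\]/);
  parts.forEach((part, i) => {
    if (i % 2 === 1) {
      const preset = EMOTION_PRESETS[part.trim().toLowerCase()];
      // Unknown tags are dropped and leave the current settings unchanged.
      if (preset) current = { ...DEFAULT_SETTINGS, ...preset };
    } else if (part.trim() !== "") {
      segments.push({ text: part.trim(), settings: current });
    }
  });
  return segments;
}
```

In this sketch a tag applies to all text that follows it until the next recognized tag, so each segment can be sent to the TTS API as its own request with its own settings.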
Audio Quality Improvements
I also tackled audio normalization using ffmpeg's loudnorm filter. Each TTS scene gets normalized to an integrated loudness of -14 LUFS with a -3 dBTP true-peak ceiling, which should give us much more consistent audio levels across different scenes. No more jarring volume changes between sentences.
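For reference, a single-pass loudnorm invocation with those targets can be built like this. `I` and `TP` are the filter's integrated-loudness and true-peak options. The `buildLoudnormArgs` helper and the file names are hypothetical, and whether the project runs one pass or two is not something this sketch assumes.

```typescript
// Hypothetical helper that builds ffmpeg arguments for one-pass normalization.
function buildLoudnormArgs(
  input: string,
  output: string,
  targetLufs = -14,
  truePeakDb = -3,
): string[] {
  return [
    "-y", // overwrite the output file if it already exists
    "-i", input,
    "-af", `loudnorm=I=${targetLufs}:TP=${truePeakDb}`,
    output,
  ];
}
```

The resulting array can be passed straight to `child_process.spawn("ffmpeg", args)`, which avoids shell-quoting problems with file names.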
Railway Deployment Prep
Got everything ready for Railway deployment with proper config files, environment variable examples, and health check endpoints. The server runs on port 3001 locally but will respect Railway's PORT environment variable in production.
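The port fallback boils down to something like the sketch below. `resolvePort` is a hypothetical helper name, and the validation of bad values is an extra safeguard that the real server may not include.

```typescript
// Use the platform-provided PORT when it is a valid port number, else 3001.
function resolvePort(env: Record<string, string | undefined>, fallback = 3001): number {
  const parsed = Number.parseInt(env.PORT ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 && parsed < 65536 ? parsed : fallback;
}
```

With Express this would typically be used as `app.listen(resolvePort(process.env))`.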
What's Next
I'm really looking forward to getting this deployed and testing the full pipeline. The plan is to set up an n8n workflow that can automatically trigger video generation whenever new blog content is published.
Once we get persistent job storage figured out (currently using an in-memory Map that gets wiped on restart), this should be a pretty robust system for automated content creation.
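One way to make that swap painless later is to hide the Map behind a small interface, so a database-backed store can replace it without touching the route handlers. This is a sketch under assumed names (`Job`, `JobStore`, `InMemoryJobStore`) and assumed status values, not the current code.

```typescript
type JobStatus = "queued" | "processing" | "done" | "failed";

interface Job {
  id: string;
  status: JobStatus;
  createdAt: number;
}

// Hypothetical interface; a persistent implementation could satisfy it later.
interface JobStore {
  create(id: string): Job;
  get(id: string): Job | undefined;
  update(id: string, status: JobStatus): Job | undefined;
}

// Map-backed store: everything is lost when the process restarts.
class InMemoryJobStore implements JobStore {
  private jobs = new Map<string, Job>();

  create(id: string): Job {
    const job: Job = { id, status: "queued", createdAt: Date.now() };
    this.jobs.set(id, job);
    return job;
  }

  get(id: string): Job | undefined {
    return this.jobs.get(id);
  }

  update(id: string, status: JobStatus): Job | undefined {
    const job = this.jobs.get(id);
    if (!job) return undefined;
    const updated: Job = { ...job, status };
    this.jobs.set(id, updated);
    return updated;
  }
}
```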
The code is all committed and ready to go - just need to get those environment variables configured on Railway and we'll be live!