Smarter Image Generation: Gemini Descriptions, Regeneration, and Model Research
Three refinements to the automated blog image pipeline: a two-stage prompt pipeline that produces better images, frontmatter metadata tracking, and an on-demand regeneration flag. Plus a deep dive into every image generation model available on Cloudflare's free tier.
The Problem With Raw Blog Prompts
Previously, the image generation pipeline sent the entire blog post as the prompt to Cloudflare's Flux model. This had two issues:
- Prompt overload: FLUX.1 Schnell accepts prompts up to 2,048 characters, but blog posts routinely exceed that. The model silently truncated the input, often keeping only the frontmatter and opening paragraph.
- Unfocused imagery: Even when the prompt fit, a wall of Markdown about CI pipelines and YAML workflows isn't a great image prompt. The model struggled to extract a coherent visual concept.
Solution: Two-Stage Prompt Pipeline
The new flow introduces Gemini as a "creative director" that reads the blog post and distills it into a focused image description:
Blog Post → Gemini (text model) → Concise Image Description → Cloudflare FLUX → Image
A new PromptDescriber type abstracts this step, making it injectable and testable:
type PromptDescriber = (content: string) => Promise<string>
The describeImageWithGemini function sends the blog content to gemini-3.1-flash-lite-preview (configurable via the PROMPT_DESCRIBER_MODEL env var). Its system prompt asks for a concise visual description of under 150 words and for an image that contains no text. The result is a tight, evocative image prompt that Flux can work with.
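As a rough illustration of the describer step, here is a minimal sketch of how the Gemini call could be assembled. The post doesn't show the real describeImageWithGemini, so the system prompt wording, the helper names (resolveDescriberModel, buildGeminiRequest, extractDescription, makeDescriber), and the injected send function are all assumptions. The request shape follows the public generateContent REST endpoint; the actual implementation may well use an SDK instead.

```typescript
// Sketch only: helper names and prompt wording are hypothetical, not the real implementation.
type PromptDescriber = (content: string) => Promise<string>;

const DEFAULT_DESCRIBER_MODEL = "gemini-3.1-flash-lite-preview";

// Hypothetical system prompt paraphrasing the constraints described in the post.
const SYSTEM_PROMPT =
  "Describe a single image that captures this blog post. " +
  "Use fewer than 150 words and do not include any text in the image.";

// Pick the model, letting PROMPT_DESCRIBER_MODEL override the default.
const resolveDescriberModel = (env: Record<string, string | undefined>): string =>
  env.PROMPT_DESCRIBER_MODEL?.trim() || DEFAULT_DESCRIBER_MODEL;

interface GeminiRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the generateContent request without sending it, so it can be unit tested.
const buildGeminiRequest = (apiKey: string, model: string, content: string): GeminiRequest => ({
  url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
  headers: { "Content-Type": "application/json", "x-goog-api-key": apiKey },
  body: JSON.stringify({
    system_instruction: { parts: [{ text: SYSTEM_PROMPT }] },
    contents: [{ parts: [{ text: content }] }],
  }),
});

// Pull the first text part out of a generateContent response.
const extractDescription = (response: unknown): string => {
  const r = response as { candidates?: { content?: { parts?: { text?: string }[] } }[] };
  const text = r.candidates?.[0]?.content?.parts?.[0]?.text;
  if (!text) throw new Error("Gemini response contained no text");
  return text.trim();
};

// The network call is injected (assumption), so tests can pass a fake sender.
type Send = (request: GeminiRequest) => Promise<unknown>;

const makeDescriber = (apiKey: string, model: string, send: Send): PromptDescriber =>
  async (content) => extractDescription(await send(buildGeminiRequest(apiKey, model, content)));
```

Keeping request construction and response parsing pure means only the send function touches the network, which is one way to get the mockability the PromptDescriber type is meant to provide.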
When GEMINI_API_KEY is available, the describer is automatically created by resolveImageProvider and threaded through to processNote and backfillImages. When the key is unavailable, the fallback buildImagePrompt strips frontmatter, social media embeds, and Markdown syntax, then truncates the result to fit the 2,048-character input window.
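The fallback path can be sketched roughly as follows. The function names cleanContentForPrompt and buildImagePrompt come from the post, but the regular expressions here are guesses at what "cleaning" involves (in particular, embeds are treated simply as HTML tags), and resolvePrompt is a hypothetical helper that shows how the describer and the fallback might be chosen between.

```typescript
// Sketch only: the cleaning rules are assumptions, not the pipeline's real patterns.
type PromptDescriber = (content: string) => Promise<string>;

const MAX_PROMPT_CHARS = 2048;

// Remove a leading YAML frontmatter block delimited by --- lines.
const stripFrontmatter = (markdown: string): string =>
  markdown.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "");

const cleanContentForPrompt = (markdown: string): string =>
  stripFrontmatter(markdown)
    .replace(/`{3}[\s\S]*?`{3}/g, " ")          // fenced code blocks
    .replace(/<[^>]+>/g, " ")                    // HTML tags, e.g. embed markup
    .replace(/!\[\[[^\]]*\]\]/g, " ")            // Obsidian image embeds
    .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")   // links and images: keep the label
    .replace(/[#*>`|\[\]]/g, " ")                // leftover Markdown punctuation
    .replace(/\s+/g, " ")                        // collapse whitespace
    .trim();

// Clean first, then truncate, so the 2,048 characters are spent on prose.
const buildImagePrompt = (markdown: string): string =>
  cleanContentForPrompt(markdown).slice(0, MAX_PROMPT_CHARS);

// Hypothetical selector: use the describer when one was created, else fall back.
const resolvePrompt = (markdown: string, describer?: PromptDescriber): Promise<string> =>
  describer ? describer(markdown) : Promise.resolve(buildImagePrompt(markdown));
```

Cleaning before truncating matters here: truncating the raw post first would spend most of the character budget on frontmatter, which is the failure mode the old pipeline had.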
Frontmatter Metadata
Every generated image now stamps three properties into the post's frontmatter:
| Property | Value | Purpose |
|---|---|---|
| image_date | ISO 8601 timestamp | When the image was generated |
| image_model | Model identifier (e.g., @cf/black-forest-labs/flux-1-schnell) | Reproducibility |
| image_prompt | The prompt sent to the image model | Debugging and prompt iteration |
A new updateFrontmatterFields function handles this generically: it creates, updates, or preserves existing frontmatter fields, with proper YAML quoting for values containing special characters like @, :, or quotes. It is built with functional programming patterns: immutable values, reduce, and map instead of imperative loops.
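To make the create/update/preserve behaviour concrete, here is a minimal sketch in the same map/reduce style. It is not the project's actual code: the quoting rule (anything that isn't a plain identifier-like word gets JSON-style double quotes, which YAML accepts), the FrontmatterValue type, and the helpers formatLine and topLevelKey are assumptions made for illustration.

```typescript
// Sketch only: quoting rules and helper names are assumptions.
type FrontmatterValue = string | boolean;

const RESERVED = /^(true|false|null|yes|no|on|off)$/i;
const PLAIN = /^[A-Za-z][A-Za-z0-9_./-]*$/;

// Leave simple words bare; double-quote anything YAML could misread (@, :, quotes, newlines).
const quoteYamlValue = (value: string): string =>
  PLAIN.test(value) && !RESERVED.test(value) ? value : JSON.stringify(value);

const formatLine = (key: string, value: FrontmatterValue): string =>
  `${key}: ${typeof value === "boolean" ? String(value) : quoteYamlValue(value)}`;

// Returns the key of a top-level "key: value" line, or undefined for anything else.
const topLevelKey = (line: string): string | undefined => {
  const m = /^([A-Za-z0-9_-]+)\s*:/.exec(line);
  return m ? m[1] : undefined;
};

const updateFrontmatterFields = (
  markdown: string,
  fields: Record<string, FrontmatterValue>,
): string => {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  const existing = match ? match[1].split(/\r?\n/) : [];
  const body = match ? markdown.slice(match[0].length) : markdown;
  const wanted = Object.keys(fields);

  // Update keys that already exist; every other line is preserved as-is.
  const updated = existing.map((line) => {
    const key = topLevelKey(line);
    return key !== undefined && wanted.indexOf(key) !== -1 ? formatLine(key, fields[key]) : line;
  });

  // Append keys that were not present yet.
  const present = existing.map(topLevelKey);
  const lines = wanted.reduce(
    (acc, key) => (present.indexOf(key) !== -1 ? acc : acc.concat([formatLine(key, fields[key])])),
    updated,
  );

  return `---\n${lines.join("\n")}\n---\n${body}`;
};
```

With this sketch, a post that has no frontmatter gets a new block created for it, and a model identifier such as @cf/black-forest-labs/flux-1-schnell is written in double quotes because of the leading @.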
On-Demand Image Regeneration
Sometimes a generated image doesn't capture the right mood. The new regenerate_image frontmatter property gives manual control:
---
title: My Blog Post
regenerate_image: true
---
When the pipeline encounters regenerate_image: true, it:
- Removes the existing image embed from the post
- Deletes the old attachment file from disk
- Sets regenerate_image: false to prevent infinite loops
- Proceeds with normal generation (describe → generate → embed → metadata)
This works in both the single-file generate-blog-image.ts and the batch backfill-blog-images.ts workflows. The backfill loop explicitly checks for the flag before skipping posts that already have images.
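The flag check and the embed removal might look something like the sketch below. The names shouldRegenerateImage and removeImageEmbed appear in the test list, but their signatures are not shown in the post, so the attachmentPrefix parameter, the returned removedFiles list, and the regular expressions are assumptions. Deleting the file and resetting the flag are left to the caller.

```typescript
// Sketch only: signatures and patterns are assumptions, not the real implementation.

// True only when the frontmatter has a top-level "regenerate_image: true" line.
const shouldRegenerateImage = (markdown: string): boolean => {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  return match !== null && /^regenerate_image:\s*true\s*$/m.test(match[1]);
};

interface EmbedRemoval {
  content: string;        // the post without the generated image embed
  removedFiles: string[]; // embedded paths the caller should delete from disk
}

// Remove Obsidian embeds like ![[attachments/cover.jpg]] whose path starts with the
// given prefix (hypothetical parameter), then collapse the blank lines left behind.
const removeImageEmbed = (markdown: string, attachmentPrefix: string): EmbedRemoval => {
  const embed = /!\[\[([^\]|]+)(?:\|[^\]]*)?\]\]\n?/g;
  const removedFiles: string[] = [];
  const content = markdown
    .replace(embed, (whole: string, file: string) => {
      if (file.indexOf(attachmentPrefix) !== 0) return whole; // leave unrelated embeds alone
      removedFiles.push(file);
      return "";
    })
    .replace(/\n{3,}/g, "\n\n");
  return { content, removedFiles };
};
```

Returning the removed paths rather than deleting inside the function keeps the text transformation free of filesystem access, so it can be tested with plain strings.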
Cloudflare Workers AI Image Models: Free Tier Analysis
Here's every text-to-image model available on Cloudflare Workers AI as of March 2026, sorted by quality tier and annotated with pricing and constraints. All models share the 10,000 neurons/day free allocation (resets at 00:00 UTC).
Tier 1: Premium Partner Models
These produce the highest quality output but consume significantly more neurons per image.
| Model | Neurons/Image (1024×1024) | Free Images/Day | API Format | Notes |
|---|---|---|---|---|
| @cf/black-forest-labs/flux-2-klein-9b | ~1,546 | ~6 | Multipart | Flux 2 distilled 9B, best quality, editing support |
| @cf/black-forest-labs/flux-2-dev | ~1,125 (20 steps) | ~8 | Multipart | Flux 2 full model, highly realistic, multi-reference |
| @cf/leonardo/lucid-origin | ~660+ | ~15 | Partner | Leonardo's most prompt-responsive model, great text rendering |
| @cf/leonardo/phoenix-1.0 | ~550+ | ~18 | Partner | Strong prompt adherence and coherent text |
Verdict: Impressive quality, but 6 to 18 images/day is too constrained for batch backfill. The multipart API format also requires code changes.
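The "Free Images/Day" column is just the daily allocation divided by the per-image cost, rounded down. A small sketch of that arithmetic, using the approximate neuron figures from the table above (the function and constant names are ours, for illustration only):

```typescript
// Daily free allocation shared by all Workers AI models, per the analysis above.
const FREE_NEURONS_PER_DAY = 10000;

// Whole images that fit in one day's free allocation.
const freeImagesPerDay = (neuronsPerImage: number): number =>
  Math.floor(FREE_NEURONS_PER_DAY / neuronsPerImage);

// Approximate Tier 1 costs from the table (neurons per 1024×1024 image).
const tier1Costs: Record<string, number> = {
  "@cf/black-forest-labs/flux-2-klein-9b": 1546,
  "@cf/black-forest-labs/flux-2-dev": 1125,
  "@cf/leonardo/lucid-origin": 660,
  "@cf/leonardo/phoenix-1.0": 550,
};

const tier1Budget = Object.keys(tier1Costs).map((model) => ({
  model,
  imagesPerDay: freeImagesPerDay(tier1Costs[model]),
}));
// imagesPerDay comes out as 6, 8, 15, and 18 respectively.
```

Because the neuron figures are approximate (and the Leonardo ones are lower bounds), the results should be read as upper estimates rather than guarantees.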
Tier 2: Best Value: FLUX.1 Schnell
| Model | Neurons/Image (512×512, 4 steps) | Free Images/Day | API Format | Notes |
|---|---|---|---|---|
| @cf/black-forest-labs/flux-1-schnell | ~43 | ~230 | JSON | 12B params, 4 steps, fast, excellent prompt adherence |
This is our current default and the sweet spot. Key advantages:
- Simple JSON API: { prompt, steps } → base64 JPEG. No SDK, no multipart hassle.
- Excellent prompt adherence: Handles complex scenes and compositional prompts well.
- ~230 free images/day: More than enough for daily backfill of all blog series.
- Fast: 4 diffusion steps, near-instant generation.
- Prompt limit: 2,048 characters; our Gemini-described prompts fit easily.
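For reference, the JSON call can be sketched as below. This is not the pipeline's actual generator: the helper names (buildFluxRequest, extractBase64Image) are hypothetical, and the URL and response envelope are written from our understanding of the Workers AI REST API, so treat them as assumptions to verify against Cloudflare's documentation.

```typescript
// Sketch only: helper names are hypothetical; endpoint and envelope are assumed
// to follow the Workers AI REST API.
const FLUX_SCHNELL = "@cf/black-forest-labs/flux-1-schnell";
const MAX_PROMPT_CHARS = 2048;

interface FluxRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the POST request for one image; the prompt is clamped to the model's limit.
const buildFluxRequest = (
  accountId: string,
  apiToken: string,
  prompt: string,
  steps = 4,
): FluxRequest => ({
  url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${FLUX_SCHNELL}`,
  headers: { Authorization: `Bearer ${apiToken}`, "Content-Type": "application/json" },
  body: JSON.stringify({ prompt: prompt.slice(0, MAX_PROMPT_CHARS), steps }),
});

// The REST API wraps model output in an envelope; the image is a base64 string.
const extractBase64Image = (response: unknown): string => {
  const r = response as { success?: boolean; result?: { image?: string } };
  if (!r.success || !r.result?.image) throw new Error("Workers AI returned no image");
  return r.result.image;
};
```

Sending the request with any HTTP client and decoding the returned base64 string yields the JPEG bytes to write into the attachments folder.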
Tier 3: Budget Alternatives
| Model | Neurons/Image | Free Images/Day | API Format | Notes |
|---|---|---|---|---|
| @cf/black-forest-labs/flux-2-klein-4b | ~31 (512×512) | ~320 | Multipart | Distilled 4B, fast but multipart API |
| @cf/stabilityai/stable-diffusion-xl-base-1.0 | Near-zero | Very high | JSON | SDXL, good ecosystem, weaker prompt adherence |
| @cf/bytedance/stable-diffusion-xl-lightning | Low | High | JSON | SDXL Lightning, ultra-fast variant |
| @cf/lykon/dreamshaper-8-lcm | Low | High | JSON | Artistic/stylized, LoRA-based |
Notes: SDXL models are free or nearly free but produce lower quality output: worse hands, weaker text rendering, less compositional accuracy. Flux-2-Klein-4b is promising but requires the multipart API.
Model Recommendation
Stick with @cf/black-forest-labs/flux-1-schnell. It offers the best balance of:
- Image quality (Flux 1 architecture, 12B parameters)
- Cost efficiency (~43 neurons per image, about 230 images/day)
- API simplicity (JSON, no SDK, no multipart)
- Proven reliability (already in production)
The Tier 1 models produce higher quality output, but their multipart or Partner API formats and higher neuron cost make them impractical for our free-tier batch workflow today. If Cloudflare eventually offers Flux 2 with a JSON API, switching would be a one-line default change.
Testing
135 tests pass (up from 92). New test coverage includes:
- cleanContentForPrompt: Frontmatter stripping, social media embed removal, markdown syntax cleanup, code block removal
- buildImagePrompt: Frontmatter stripping, truncation to 2048 chars, short content passthrough
- extractFrontmatterValue: String extraction, quoted values, missing keys, no-frontmatter case
- shouldRegenerateImage: True/false/absent/no-frontmatter cases
- removeImageEmbed: Obsidian wiki syntax removal, prefix handling, newline collapse
- quoteYamlValue: Simple values, colons, at-signs, internal quotes, newlines
- updateFrontmatterFields: Add, update, create frontmatter, YAML quoting, boolean toggling
- processNote with regeneration: Full regeneration flow, metadata insertion, describer integration
- backfillImages with regeneration: Regeneration flag detection, describer passthrough
- resolveImageProvider with describer: Describer creation with Gemini key, absence without
- DEFAULT_DESCRIBER_MODEL: Verifies constant matches gemini-3.1-flash-lite-preview
All 43 new tests plus all 92 existing tests pass.
Key Design Decisions
- Injectable PromptDescriber: Follows the same pattern as ImageGenerator, making the Gemini description step mockable and testable without API calls
- Automatic describer creation: resolveImageProvider creates the describer when GEMINI_API_KEY is available, regardless of which image provider is selected
- regenerate_image as frontmatter flag: Simple, declarative, and compatible with Obsidian's editing workflow; just flip a boolean to regenerate
- Metadata in frontmatter: image_date, image_model, and image_prompt create a complete audit trail for every generated image
- FLUX.1 Schnell stays default: Research confirms it's the best available model for our constraints (free tier, JSON API, batch backfill)
- Single default model constant: DEFAULT_DESCRIBER_MODEL is gemini-3.1-flash-lite-preview (same as other Gemini text tasks), overridable via the PROMPT_DESCRIBER_MODEL env var
- Functional updateFrontmatterFields: Refactored from imperative loops to an immutable reduce/map pattern
- Smart prompt cleaning: buildImagePrompt strips frontmatter, social media embeds, and markdown syntax, then truncates to fit the 2,048-character Cloudflare input window