WP Agentic Writer: Image Model Recommendations by Preset

Executive Summary

Your question: "Which image model should we use? We need to match the prompt style to avoid wasting money on bad images."

The answer: Different image models have different "prompt languages" and reasoning capabilities. Choose the right model for your preset, then generate prompts specifically for that model's strengths.

Recommended models (by preset):

  • Budget: FLUX.2 [klein] 4B (fast, cheap, handles simple prompts well)
  • Balanced: Riverflow V2 Max Preview (excellent prompt adherence, context-aware)
  • Premium: FLUX.2 [max] (frontier quality, best photorealism + complex prompts)

Table of Contents

  1. Image Model Comparison Matrix
  2. Budget Preset: FLUX.2 [klein] Recommendation
  3. Balanced Preset: Riverflow V2 Max Recommendation
  4. Premium Preset: FLUX.2 [max] Recommendation
  5. Prompt Styles by Model
  6. Implementation: Adjust Prompts per Model
  7. Cost Efficiency Recommendations
  8. Summary: Model Recommendations by Preset

Image Model Comparison Matrix

| Aspect | FLUX.2 [klein] | Riverflow V2 Max | FLUX.2 [max] |
|---|---|---|---|
| Architecture | 4B parameters (lightweight) | Medium (optimized) | 32B parameters (frontier) |
| Best at | Speed, cost, diagrams, simple scenes | Photorealism, prompt understanding, details | Complex scenes, photorealism, consistency |
| Prompt complexity | Simple (2-3 sentences) | Medium (detailed) | Complex (very detailed, technical specs) |
| Text in images | Poor | Decent | Excellent |
| Multi-reference | No | No | Yes (up to 10 images) |
| Cost/image | $0.014-0.042 | $0.03 flat | $0.07-0.21 |
| Speed | Fast | Medium | Slower (higher quality) |
| Photorealism | Good | Very good | Excellent |
| Consistency | Good | Very good | Excellent |
| Prompt adherence | Good (simple prompts) | Excellent | Excellent (complex prompts) |
| For blog articles | Hero images, diagrams | Professional photos, diagrams | Flagship/hero images |
| Failure modes | Complex scenes, text | Rare | Rare |
| When to use | Budget tier, speed matters | Balanced tier, default | Premium tier, quality paramount |

Budget Preset: FLUX.2 [klein] Recommendation

Why FLUX.2 [klein]?

✅ Cheapest: $0.014/MP (first megapixel)
✅ Fastest: Generates in ~3 seconds
✅ Good enough: Handles simple prompts very well
✅ Diagrams: Excellent for technical diagrams, dashboards
✅ Illustrations: Good for minimalist, illustrated styles
❌ Not ideal: Complex photorealistic scenes, text in images
❌ Not ideal: Multiple objects with precise spatial relationships

Price: $0.014 (first MP) + $0.001 (each additional MP)

  • Standard 1024×576 (16:9): ~$0.015-0.020
  • Square 512×512 (1:1): ~$0.011-0.015
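
The "first megapixel plus each additional megapixel" pricing above can be turned into a quick estimate. The sketch below is a minimal illustration, not part of the plugin: `waw_estimate_image_cost()` is a hypothetical name, and it assumes the provider bills whole megapixels rounded up, with 1 MP = 1,000,000 pixels. Actual invoices may differ, so check the provider's billing rules before relying on it.

```php
<?php
/**
 * Hypothetical helper: estimate the cost of one image under
 * "first megapixel + each additional megapixel" pricing.
 *
 * Assumption: megapixels are billed in whole units, rounded up,
 * and 1 MP = 1,000,000 pixels.
 */
function waw_estimate_image_cost(
    int $width,
    int $height,
    float $first_mp_price,
    float $extra_mp_price
): float {
    // Round the pixel count up to whole megapixels, never below one.
    $megapixels = max( 1, (int) ceil( ( $width * $height ) / 1000000 ) );

    return $first_mp_price + ( $megapixels - 1 ) * $extra_mp_price;
}

// FLUX.2 [klein] rates quoted above: $0.014 first MP, $0.001 per additional MP.
$standard = waw_estimate_image_cost( 1024, 576, 0.014, 0.001 );
printf( "1024x576 estimate: \$%.3f\n", $standard );
```

Passing the rates as parameters keeps the helper usable for any model that follows the same pricing shape.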

Prompt Style for FLUX.2 [klein]

Template (SIMPLE, 2-3 sentences MAX):

[Subject/Dashboard/Diagram name], [key elements], [style], [colors]

Good prompts for klein:

1. HERO IMAGE (Dashboard)
"N8n workflow automation dashboard showing colorful nodes 
and connections on blue background, minimalist modern interface, 
professional SaaS design"

2. DIAGRAM (Simple technical)
"Workflow architecture diagram showing trigger, action, condition 
components with arrows, clean lines, blue and purple palette, 
technical illustration style"

3. ILLUSTRATION (Minimalist)
"Minimalist illustration of automation concept with interconnected 
gears and nodes, flat design, blues and greens, modern tech aesthetic"

4. SCREENSHOT (Simple)
"N8n interface showing workflow execution panel with status 
indicators, clean layout, professional dashboard view"

BAD prompts for klein (will waste money):

✗ "A hyper-detailed photorealistic photo of a developer working 
with N8n, cinematic lighting with volumetric fog, 4K quality, 
shot on RED camera with lut grading..."
↳ TOO COMPLEX: klein will fail or produce mediocre results

✗ "Dashboard showing text 'AUTOMATION HUB' in neon letters, 
very detailed typography, complex sci-fi design..."
↳ TEXT RENDERING: klein struggles with readable text

✗ "Intricate 3D render of a neural network architecture with 
thousands of nodes, each labeled with precise colors..."
↳ IMPOSSIBLE: klein can't handle this complexity

Implementation for Budget Preset

// Budget preset prompt generation
$prompt_klein = "N8n workflow dashboard screenshot, showing " .
    "colorful workflow nodes and connections on blue background, " .
    "minimalist professional interface";

// Keep it short and simple
$image_spec = [
    'model' => 'black-forest-labs/flux.2-klein',
    'prompt' => $prompt_klein,  // 2-3 sentences
    'size' => '1024x576',  // Standard blog size
    'guidance_scale' => 3.0,  // Default for klein
];
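
Instead of hand-writing the string, the klein template can be assembled from its parts. This is a minimal sketch under stated assumptions: `waw_build_klein_prompt()` is a hypothetical helper, not an existing plugin function, and it simply joins the non-empty template slots with commas.

```php
<?php
/**
 * Hypothetical helper: assemble a FLUX.2 [klein] prompt from the
 * "[subject], [key elements], [style], [colors]" template.
 * Empty parts are skipped so the prompt stays short.
 */
function waw_build_klein_prompt(
    string $subject,
    string $key_elements,
    string $style = '',
    string $colors = ''
): string {
    $parts = array_filter(
        array_map( 'trim', [ $subject, $key_elements, $style, $colors ] ),
        static fn ( string $part ): bool => $part !== ''
    );

    return implode( ', ', $parts );
}

$image_spec = [
    'model'  => 'black-forest-labs/flux.2-klein',
    'prompt' => waw_build_klein_prompt(
        'N8n workflow dashboard screenshot',
        'colorful workflow nodes and connections on blue background',
        'minimalist professional interface'
    ),
    'size'   => '1024x576',
];

echo $image_spec['prompt'], "\n";
```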

Balanced Preset: Riverflow V2 Max Recommendation

Why Riverflow V2 Max?

✅ Excellent prompt adherence: Understands nuanced instructions
✅ Photorealistic: Produces professional, polished images
✅ Context-aware: Handles detailed specifications well
✅ Details: Sharp textures, consistent lighting, realistic materials
✅ Speed: Reasonable (~8-15 seconds)
✅ Cost: Flat $0.03/image regardless of size (predictable)
❌ Slightly more expensive: $0.03 vs FLUX.2 klein's $0.014
❌ No multi-reference: Can't maintain consistency across multiple images

Price: $0.03 flat per image (regardless of size)

  • Any size: Consistent $0.03

Prompt Style for Riverflow V2 Max

Template (DETAILED but CONCISE, 3-4 sentences):

[Subject details], [action/context], [style/mood], [lighting], [technical specs]

Good prompts for Riverflow:

1. HERO IMAGE (Professional dashboard)
"N8n automation dashboard interface displaying real-time workflow 
execution with colorful nodes and connections. Clean minimalist design 
with blue accent colors, modern SaaS aesthetic. Professional product 
photography style with studio lighting, sharp details, clean layout"

2. DIAGRAM (Technical with style)
"Technical architecture diagram visualizing workflow components: 
trigger module, conditional routing, action nodes. Components connected 
with clean lines and arrows. Flat design style, blue and purple color 
palette, professional technical illustration, clear readability"

3. PROFESSIONAL PHOTO (Realistic context)
"A developer's laptop screen showing N8n automation workflows with 
detailed node visualization. Warm office lighting with subtle desk lamp, 
shallow depth of field, professional product photography, clear screen 
details visible, modern workspace setup"

4. INFOGRAPHIC (Educational)
"Educational infographic showing 'How N8n Automation Works' with 
step-by-step visual flow. Icons and diagrams arranged in logical sequence, 
minimalist design language, blue and grey colors, professional presentation 
style, clean typography and spacing"

LESS EFFECTIVE for Riverflow:

✗ "Hyper-detailed 8K photorealistic RAW file of a futuristic 
neural network with quantum computing effects..."
↳ OVERKILL: Riverflow is excellent but not designed for sci-fi/fantasy

✗ "Complex scene with 47 different UI elements, each precisely 
positioned with specific pixel values..."
↳ TOO PRESCRIPTIVE: Riverflow works best with conceptual direction, 
not pixel-perfect specs

✗ "Neon text glowing in the dark, cinematic with fog..."
↳ LESS IDEAL: Not Riverflow's strongest point (use FLUX.2 max for this)

Implementation for Balanced Preset

// Balanced preset prompt generation
$prompt_riverflow = "N8n automation dashboard interface displaying " .
    "real-time workflow execution with colorful nodes and connections. " .
    "Clean minimalist design with blue accent colors, modern SaaS aesthetic. " .
    "Professional product photography style with studio lighting, " .
    "sharp details, clean layout";

$image_spec = [
    'model' => 'sourceful/riverflow-v2-max',
    'prompt' => $prompt_riverflow,  // 3-4 sentences, detailed
    'size' => '1024x576',  // Standard blog size
    'guidance_scale' => 3.5,  // Moderate adherence
];

Premium Preset: FLUX.2 [max] Recommendation

Why FLUX.2 [max]?

✅ Frontier quality: Best-in-class photorealism and detail
✅ Complex prompts: Handles intricate specifications excellently
✅ Text rendering: Can generate readable text in images
✅ Multi-reference: Maintains consistency across up to 10 reference images
✅ Photorealism: Superior material properties, lighting, spatial logic
✅ Professional: Production-grade results for flagship content
❌ Expensive: $0.07-0.21 per image (highest cost)
❌ Slower: Takes ~20-30 seconds (but worth it for quality)

Price: $0.07 (first MP) + $0.03 (each additional MP)

  • Standard 1024×576 (16:9): ~$0.20-0.25
  • High res 2048×2048 (~4 MP): ~$0.60-0.80

Prompt Style for FLUX.2 [max]

Template (VERY DETAILED, 4-6 sentences with technical specs):

[Technical foundation], [main subject + action], [environment/context], 
[lighting + mood], [style + aesthetics], [technical specifications]

Good prompts for FLUX.2 [max]:

1. HERO IMAGE (Flagship dashboard)
"Professional product photography of N8n automation dashboard interface 
on a modern laptop screen. The dashboard displays a real-time workflow 
with multiple interconnected nodes in blue, purple, and teal colors. 
The scene is set in a minimalist tech office with warm tungsten lighting 
creating soft shadows on black marble desk. Shot with shallow depth of field, 
sharp focus on the screen, bokeh background. Commercial photography style 
with crisp details, accurate color reproduction, professional white-balance. 
4K resolution with film-grain aesthetics"

2. CINEMATIC DIAGRAM (Complex technical)
"Technical architecture diagram for N8n automation system rendered in 
3D isometric perspective. The diagram shows trigger events (red nodes), 
conditional logic (blue nodes), and action outputs (green nodes) connected 
with flowing animated pathways. Modern tech aesthetic with gradient backgrounds, 
volumetric lighting effects, subtle motion blur on connector lines. 
Professional technical illustration merged with cinematic rendering. 
Rendered with physically-based materials, global illumination, and 
ambient occlusion for depth. 3D product visualization style"

3. HERO ILLUSTRATION (Complex artistic)
"Conceptual illustration of automation workflow represented as flowing 
water streams. Multiple streams of different colors merge and split, 
representing workflow logic and data routing. Streams flow through modern 
architectural elements (nodes, connections). Watercolor painting style 
blended with digital rendering. Cool color palette with blues and teals, 
warm accent lights. Wide-angle perspective showing expansive workflow. 
Ethereal, professional, educational aesthetic"

4. LIFESTYLE + PRODUCT (Complex scene)
"A product lifestyle photograph showing N8n dashboard on a developer's 
laptop alongside coffee, notebook, and desk setup in a modern home office. 
Natural morning sunlight streams through large windows creating warm golden 
hour lighting. The scene includes shallow depth of field with sharp focus on 
the laptop screen showing the N8n interface. Modern Scandinavian aesthetic, 
minimalist desk setup with wood and metal surfaces. Shot on full-frame camera 
with 35mm lens, warm color grading, professional lifestyle photography, 
authentic and aspirational atmosphere"

ADVANCED: Using FLUX.2 [max]'s strengths:

JSON Prompting (FLUX.2 max supports structured prompts):
{
  "scene": "Professional product photography",
  "subject": "N8n dashboard on laptop",
  "environment": "Modern tech office, minimalist desk",
  "lighting": "Warm tungsten, side-lighting, soft shadows",
  "style": "Commercial product photography",
  "color_palette": ["#003D9B", "#6F42C1", "#17A2B8"],
  "technical_specs": "4K, shallow DOF, f/2.8, 85mm lens",
  "mood": "Professional, modern, trustworthy"
}
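
One way to produce such a structured prompt from PHP is to build an array and serialize it. This is a minimal sketch: `waw_build_json_prompt()` is a hypothetical helper, and sending the JSON document as the ordinary prompt string is an assumption to verify against the provider's documentation.

```php
<?php
/**
 * Hypothetical helper: serialize a structured prompt for FLUX.2 [max].
 * Assumption: the JSON document is sent as the ordinary prompt string.
 */
function waw_build_json_prompt( array $fields ): string {
    // Drop empty fields so the model only sees meaningful keys.
    $fields = array_filter(
        $fields,
        static fn ( $value ): bool => $value !== '' && $value !== [] && $value !== null
    );

    // JSON_THROW_ON_ERROR turns encoding failures into exceptions
    // instead of silently returning false.
    return json_encode(
        $fields,
        JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR
    );
}

$prompt = waw_build_json_prompt( [
    'scene'         => 'Professional product photography',
    'subject'       => 'N8n dashboard on laptop',
    'lighting'      => 'Warm tungsten, side-lighting, soft shadows',
    'color_palette' => [ '#003D9B', '#6F42C1', '#17A2B8' ],
    'mood'          => '', // empty, so it is omitted from the output
] );

echo $prompt, "\n";
```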

Implementation for Premium Preset

// Premium preset prompt generation
$prompt_flux_max = "Professional product photography of N8n automation " .
    "dashboard displayed on a modern developer's laptop screen. " .
    "The dashboard shows real-time workflow execution with colorful " .
    "interconnected nodes. Scene set in minimalist tech office with " .
    "warm tungsten studio lighting creating soft shadows on black marble " .
    "desk surface. Shallow depth of field with sharp focus on screen, " .
    "bokeh background. Commercial photography style with crisp details, " .
    "accurate colors, white-balanced. 4K resolution";

$image_spec = [
    'model' => 'black-forest-labs/flux.2-max',
    'prompt' => $prompt_flux_max,  // 4-6 sentences, very detailed
    'size' => '1024x576',  // Can also do 2048×2048 for premium
    'guidance_scale' => 4.0,  // High adherence for complex prompts
];

Prompt Styles by Model

Quick Reference: How to Structure Prompts

| Model | Length | Complexity | Style | Strength |
|---|---|---|---|---|
| FLUX.2 [klein] | 1-2 sentences | Simple | Functional description | Speed + cost |
| Riverflow V2 Max | 3-4 sentences | Medium | Detailed but concise | Photorealism + clarity |
| FLUX.2 [max] | 4-6 sentences | Complex | Very detailed with specs | Quality + complexity handling |
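
The quick reference maps naturally onto a lookup keyed by preset. The following is a sketch only: `waw_get_image_model_for_preset()` and the preset keys (`budget`, `balanced`, `premium`) are hypothetical names, while the model IDs and sentence ranges are copied from this document. Falling back to the balanced model for unknown presets is an assumption that mirrors the "Balanced tier, default" note in the comparison matrix.

```php
<?php
/**
 * Hypothetical lookup: map a preset to its image model and prompt budget.
 * Model IDs and sentence ranges are copied from this document.
 */
function waw_get_image_model_for_preset( string $preset ): array {
    $presets = [
        'budget'   => [
            'model'         => 'black-forest-labs/flux.2-klein',
            'prompt_length' => '1-2 sentences',
        ],
        'balanced' => [
            'model'         => 'sourceful/riverflow-v2-max',
            'prompt_length' => '3-4 sentences',
        ],
        'premium'  => [
            'model'         => 'black-forest-labs/flux.2-max',
            'prompt_length' => '4-6 sentences',
        ],
    ];

    // Assumption: unknown presets fall back to the balanced default.
    return $presets[ strtolower( $preset ) ] ?? $presets['balanced'];
}

$choice = waw_get_image_model_for_preset( 'premium' );
echo $choice['model'], ' (', $choice['prompt_length'], ")\n";
```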

Model-Specific Prompt Tips

FLUX.2 [klein] Tips

✓ Front-load the main subject (klein prioritizes early tokens)
✓ Use simple adjectives: "minimalist", "clean", "blue"
✓ Avoid: "volumetric fog", "subsurface scattering", "ray-traced"
✓ Best for: Diagrams, simple dashboards, minimalist illustrations
✓ Template: [Main subject], [2 key details], [style]
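
These tips can be checked before any money is spent. The sketch below is a hypothetical pre-flight check (`waw_lint_klein_prompt()` is not an existing plugin function). It flags the terms listed above and counts sentences with a rough regular expression. The default limit of two sentences follows the quick-reference table and can be changed by the caller.

```php
<?php
/**
 * Hypothetical check: flag prompts that are likely too complex for
 * FLUX.2 [klein]. Returns a list of warnings (empty when none apply).
 */
function waw_lint_klein_prompt( string $prompt, int $max_sentences = 2 ): array {
    $warnings = [];

    // Terms the tips above say to avoid with klein.
    $avoid = [ 'volumetric fog', 'subsurface scattering', 'ray-traced' ];
    foreach ( $avoid as $term ) {
        if ( stripos( $prompt, $term ) !== false ) {
            $warnings[] = sprintf( 'Avoid "%s" with klein', $term );
        }
    }

    // Rough sentence count: split on ., ! or ? followed by whitespace or end of text.
    $sentences = preg_split( '/[.!?]+(?:\s+|$)/', trim( $prompt ), -1, PREG_SPLIT_NO_EMPTY );
    if ( count( $sentences ) > $max_sentences ) {
        $warnings[] = sprintf(
            'Prompt has %d sentences; keep it to %d or fewer',
            count( $sentences ),
            $max_sentences
        );
    }

    return $warnings;
}

$warnings = waw_lint_klein_prompt(
    'A photo of a developer working with N8n, cinematic lighting with volumetric fog'
);
foreach ( $warnings as $warning ) {
    echo $warning, "\n";
}
```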

Riverflow V2 Max Tips

✓ Include context and environment details
✓ Specify lighting style: "studio lighting", "golden hour"
✓ Use photography terms: "shallow DOF", "bokeh", "35mm lens"
✓ Can include moderate technical specs
✓ Best for: Professional photos, detailed diagrams, product shots
✓ Template: [Subject + context], [details], [lighting], [style]

FLUX.2 [max] Tips

✓ Can use very specific technical vocabulary
✓ Specify exact materials and properties
✓ Include color codes (HEX) for brand accuracy
✓ Can describe complex spatial relationships
✓ Use JSON prompting for highest precision
✓ Front-load important elements (tokens matter)
✓ Best for: Hero images, complex scenes, flagship content
✓ Template: [Technical specs], [subject], [environment], [lighting], [style], [resolution/format]

Implementation: Adjust Prompts per Model

Phase 1: Update Prompt Generation System

Modify the prompt generation agent to output model-specific prompts based on selected image model:

<?php
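/**
 * Update: modify generate_image_prompts() to output model-specific prompts.
 * Both methods below use self:: and must live inside the plugin's existing
 * prompt generator class.
 */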

public static function generate_image_prompts( 
    $article_markdown, 
    $placement_data, 
    $writing_model,
    $image_model,  // NEW PARAMETER
    $style_preference = 'minimalist'
) {
    // Select system prompt based on image model
    $system_prompt = self::get_prompt_generation_system_prompt_for_model(
        $image_model  // Adjusts prompt style based on model
    );
    
    // wp_json_encode() is WordPress's wrapper around json_encode().
    $user_input = wp_json_encode([
        'article' => $article_markdown,
        'placement_points' => $placement_data['image_placement_points'],
        'style_preference' => $style_preference,
        'image_count' => $placement_data['recommended_image_count'],
        'target_image_model' => $image_model,  // NEW: Tell agent which model
    ]);
    
    // ... rest of API call
}

/**
 * Return system prompt customized for image model
 */
private static function get_prompt_generation_system_prompt_for_model( $image_model ) {
    // Long strings are concatenated so no stray newlines or indentation
    // end up inside the generated system prompt.
    $model_configs = [
        'black-forest-labs/flux.2-klein' => [
            'name' => 'FLUX.2 [klein]',
            'prompt_length' => '1-2 sentences',
            'complexity' => 'simple',
            'guidance' => 'Keep prompts short and simple. Focus on main subject, ' .
                'key details, and style. Avoid complex scenes or technical specifications.',
            'template' => 'Subject, key elements, style, color palette',
        ],
        'sourceful/riverflow-v2-max' => [
            'name' => 'Riverflow V2 Max',
            'prompt_length' => '3-4 sentences',
            'complexity' => 'medium-detailed',
            'guidance' => 'Include context, environment details, lighting style, ' .
                'and photographic specifications. Model excels at photorealism.',
            'template' => 'Subject + context, environment details, lighting style, ' .
                'photography style, technical specs',
        ],
        'black-forest-labs/flux.2-max' => [
            'name' => 'FLUX.2 [max]',
            'prompt_length' => '4-6 sentences',
            'complexity' => 'very-detailed-technical',
            'guidance' => 'Use detailed technical vocabulary. Include exact materials, ' .
                'color codes (HEX), spatial relationships, and specifications. ' .
                'Can use JSON prompting for maximum precision.',
            'template' => 'Technical foundation, main subject + action, environment, ' .
                'lighting + mood, style + aesthetics, technical specifications',
        ],
    ];
    
    $config = $model_configs[ $image_model ] ?? $model_configs['sourceful/riverflow-v2-max'];
    
    return <<<PROMPT
System Prompt: Image Prompt Generator for {$config['name']}
═══════════════════════════════════════════════════════════════

You are an Image Prompt Engineer specializing in {$config['name']}.

Target Model: {$config['name']}
Prompt Length: {$config['prompt_length']}
Complexity Level: {$config['complexity']}

Your job: Create precise, cost-efficient prompts optimized for {$config['name']}.

{$config['guidance']}

Prompt Template for {$config['name']}:
{$config['template']}

Generate prompts that exploit {$config['name']}'s strengths and avoid its weaknesses.

[Rest of standard prompt generation instructions...]
PROMPT;
}

Phase 2: Update UI to Show Image Model Info

// In image review modal, show user:
// "Image Model: Riverflow V2 Max Preview
//  Cost per image: $0.03
//  Strength: Photorealism + detailed specifications"
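
A small lookup can produce those three lines for the modal. This is a sketch under stated assumptions: `waw_image_model_info_lines()` is a hypothetical helper, and the labels, costs, and strengths are copied from this document. Inside WordPress the output would normally be escaped with `esc_html()`; the standalone example uses `htmlspecialchars()` so it runs without WordPress loaded.

```php
<?php
/**
 * Hypothetical helper: build the info lines for the image review modal.
 * Labels, costs, and strengths are taken from this document.
 */
function waw_image_model_info_lines( string $model_id ): array {
    $info = [
        'black-forest-labs/flux.2-klein' => [ 'FLUX.2 [klein]', '$0.014-0.042', 'Speed + cost' ],
        'sourceful/riverflow-v2-max'     => [ 'Riverflow V2 Max Preview', '$0.03', 'Photorealism + detailed specifications' ],
        'black-forest-labs/flux.2-max'   => [ 'FLUX.2 [max]', '$0.07-0.21', 'Quality + complexity handling' ],
    ];

    // Unknown models: show only the raw ID rather than guessing details.
    if ( ! isset( $info[ $model_id ] ) ) {
        return [ 'Image Model: ' . $model_id ];
    }

    [ $label, $cost, $strength ] = $info[ $model_id ];

    return [
        'Image Model: ' . $label,
        'Cost per image: ' . $cost,
        'Strength: ' . $strength,
    ];
}

foreach ( waw_image_model_info_lines( 'sourceful/riverflow-v2-max' ) as $line ) {
    echo htmlspecialchars( $line, ENT_QUOTES, 'UTF-8' ), "\n";
}
```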

Phase 3: Testing Prompts Before Full Generation

// Optional: Generate 1 test image first, show user, ask "Continue with this style?"
// Example: with Riverflow at $0.03/image and four images planned, the test image
// costs $0.03 and avoids $0.09 of wasted generations if the user rejects the style
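
The test-first flow can be sketched as a function that generates one image, asks for approval, and only then generates the rest. Everything here is hypothetical: `waw_generate_with_test_image()` is not an existing plugin function, and the `$generate` and `$confirm` callables stand in for the real image API call and the UI prompt.

```php
<?php
/**
 * Hypothetical flow: generate one test image, ask for approval,
 * and only then generate the remaining images.
 *
 * $generate and $confirm are injected callables standing in for the
 * real image API call and the "Continue with this style?" prompt.
 */
function waw_generate_with_test_image( array $prompts, callable $generate, callable $confirm ): array {
    if ( empty( $prompts ) ) {
        return [ 'approved' => true, 'images' => [], 'skipped' => 0 ];
    }

    $first = $generate( array_shift( $prompts ) );

    if ( ! $confirm( $first ) ) {
        // Rejected: stop before paying for the remaining images.
        return [ 'approved' => false, 'images' => [ $first ], 'skipped' => count( $prompts ) ];
    }

    $images = [ $first ];
    foreach ( $prompts as $prompt ) {
        $images[] = $generate( $prompt );
    }

    return [ 'approved' => true, 'images' => $images, 'skipped' => 0 ];
}

$prompts  = [ 'hero', 'diagram 1', 'diagram 2', 'photo' ];
$generate = static fn ( string $prompt ): string => "image for {$prompt}"; // stand-in for the API call
$confirm  = static fn ( string $image ): bool => false;                    // user rejects the style

$result = waw_generate_with_test_image( $prompts, $generate, $confirm );

// At a flat $0.03 per image, the skipped images are the money saved.
printf( "Skipped %d images, saved \$%.2f\n", $result['skipped'], $result['skipped'] * 0.03 );
```

Injecting the two callables keeps the flow testable without calling a paid API.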

Cost Efficiency Recommendations

Pick the Right Model First

Never:

  • Use FLUX.2 [max] for simple diagrams (expensive, wastes quality)
  • Use FLUX.2 [klein] for complex photorealistic scenes (will fail)
  • Generate with wrong model, get bad result, regenerate with right model (double cost)

Always:

  • Match model to task complexity
  • Match prompt style to model capabilities
  • Test prompt with budget model first if unsure

Cost per Article by Model

| Scenario | Model | Cost | Quality |
|---|---|---|---|
| Hero + 2 diagrams | FLUX.2 klein | $0.045-0.060 | Good |
| Hero + professional photo | Riverflow V2 Max | $0.06 | Excellent |
| Flagship hero image | FLUX.2 max | $0.20-0.25 | Frontier |
| Flagship with klein | FLUX.2 klein | $0.020 + regenerate | Waste |
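
The first two rows are simple multiplication of a per-image price by the image count. A minimal sketch of that arithmetic follows. `waw_article_cost_range()` is a hypothetical helper, and the per-image figures are the ones quoted earlier in this document for 1024×576 images.

```php
<?php
/**
 * Hypothetical helper: total cost range for an article, given the
 * per-image cost range and the number of images.
 */
function waw_article_cost_range( float $min_per_image, float $max_per_image, int $image_count ): array {
    return [
        'min' => round( $min_per_image * $image_count, 3 ),
        'max' => round( $max_per_image * $image_count, 3 ),
    ];
}

// Hero + 2 diagrams on FLUX.2 [klein] at roughly $0.015-0.020 per image.
$klein = waw_article_cost_range( 0.015, 0.020, 3 );

// Hero + professional photo on Riverflow V2 Max at a flat $0.03 per image.
$riverflow = waw_article_cost_range( 0.03, 0.03, 2 );

printf( "klein: \$%.3f-%.3f\n", $klein['min'], $klein['max'] );
printf( "Riverflow: \$%.2f\n", $riverflow['min'] );
```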

Summary: Model Recommendations by Preset

🟢 Budget Preset

Image Model: FLUX.2 [klein] 4B
Cost: $0.014-0.042/image
Prompt style: Simple, 1-2 sentences, functional
Best for: Diagrams, dashboards, minimalist illustrations
Template: [Subject], [key elements], [style]

🟡 Balanced Preset

Image Model: Riverflow V2 Max Preview
Cost: $0.03/image flat
Prompt style: Detailed, 3-4 sentences, photorealistic
Best for: Professional photos, infographics, product shots
Template: [Subject + context], [details], [lighting], [style]

🔴 Premium Preset

Image Model: FLUX.2 [max]
Cost: $0.07-0.21/image
Prompt style: Very detailed, 4-6 sentences, technical specs
Best for: Flagship images, complex scenes, hero content
Template: [Tech foundation], [subject], [environment], [lighting], [style], [specs]


Document version: 1.0
Date: January 27, 2026
Status: Ready for Implementation