# WP Agentic Writer: Recommended Best Flow for Images (Cost-Optimized)
## The Challenge You Asked About
**Your question:**
> "After article generation, how do we get image placement with alt by writing agent, then generate recommended images? Need to be cost-efficient with image prompts."
**The answer:** Use the **writing agent itself** to analyze placement and generate prompts (a tiny cost), then show the user a preview before spending anything on image generation.
---
## Table of Contents
1. [Recommended Best Flow (Option A - SAFEST)](#recommended-best-flow-option-a---safest)
2. [Alternative Flows (B & C)](#alternative-flows-b--c)
3. [Your Configuration (from screenshot)](#your-configuration-from-screenshot)
4. [Cost Breakdown](#cost-breakdown)
5. [Implementation Priority](#implementation-priority)
---
## Recommended Best Flow (Option A - SAFEST)
This is the flow I recommend for **maximum cost control + quality** based on your plugin's design.
### Step-by-Step
```
┌──────────────────────────────────────────────────────────┐
│ USER ACTION: Generate Article │
│ (Using Writing Model: Claude 3.5 Sonnet from preset) │
└─────────────────┬────────────────────────────────────────┘
                  ↓
┌──────────────────────────────────────────────────────────┐
│ PLUGIN AUTOMATIC (Backend) │
├──────────────────────────────────────────────────────────┤
│ Step 1: ANALYZE PLACEMENT │
│ • Model: Same Writing Model (Claude 3.5 Sonnet) │
│ • Input: Full article markdown │
│ • Output: JSON with placement points │
│ • Cost: $0.0008 (tiny token call) │
│ │
│ Step 2: GENERATE IMAGE PROMPTS │
│ • Model: Same Writing Model │
│ • Input: Article + placement points │
│ • Output: 3 image specs (prompt + alt + placement) │
│ • Cost: $0.0015 (tiny token call) │
│ │
│ Status: "Analyzing images..." → "Ready to review" │
└─────────────────┬────────────────────────────────────────┘
                  ↓
┌──────────────────────────────────────────────────────────┐
│ MODAL: IMAGE PREVIEW (User Review - $0 cost) │
├──────────────────────────────────────────────────────────┤
│ │
│ "3 images planned for your article" │
│ │
│ ╔════════════════════════════════════════════════════╗ │
│ ║ IMAGE 1: HERO (After Introduction) ║ │
│ ║ ║ │
│ ║ Placement: After intro, before "Getting Started" ║ │
│ ║ Type: Hero/Dashboard ║ │
│ ║ ║ │
│ ║ Prompt (EDITABLE): ║ │
│ ║ "N8n workflow automation dashboard screenshot, ║ │
│ ║ showing colorful nodes on blue background, ║ │
│ ║ modern minimalist SaaS interface" ║ │
│ ║ ║ │
│ ║ Alt Text: "N8n automation dashboard with nodes" ║ │
│ ║ ║ │
│ ║ [Edit Prompt ✎] [Generate $0.03] [Skip] ║ │
│ ╚════════════════════════════════════════════════════╝ │
│ │
│ ╔════════════════════════════════════════════════════╗ │
│ ║ IMAGE 2: DIAGRAM (After Section 1) ║ │
│ ║ ║ │
│ ║ Placement: After "Understanding Workflows" ║ │
│ ║ Type: Technical Diagram ║ │
│ ║ ║ │
│ ║ Prompt (EDITABLE): ║ │
│ ║ "Workflow architecture diagram showing trigger, ║ │
│ ║ condition, action components with arrows, ║ │
│ ║ technical line-art style, blue palette" ║ │
│ ║ ║ │
│ ║ Alt Text: "Workflow trigger-condition-action flow" ║ │
│ ║ ║ │
│ ║ [Edit Prompt ✎] [Generate $0.03] [Skip] ║ │
│ ╚════════════════════════════════════════════════════╝ │
│ │
│ ╔════════════════════════════════════════════════════╗ │
│ ║ IMAGE 3: SCREENSHOT (Before Conclusion) ║ │
│ ║ ║ │
│ ║ Placement: Before "Conclusion" ║ │
│ ║ Type: Product Screenshot ║ │
│ ║ ║ │
│ ║ Prompt (EDITABLE): ║ │
│ ║ "N8n real-time monitoring dashboard showing ║ │
│ ║ workflow execution logs, status indicators, ║ │
│ ║ professional SaaS product design" ║ │
│ ║ ║ │
│ ║ Alt Text: "N8n real-time monitoring interface" ║ │
│ ║ ║ │
│ ║ [Edit Prompt ✎] [Generate $0.03] [Skip] ║ │
│ ╚════════════════════════════════════════════════════╝ │
│ │
│ ───────────────────────────────────────────────────── │
│ Cost Estimate: Individual generation │
│ • Generate all 3: $0.09-0.21 (based on image tier) │
│ • Generate 2: $0.06-0.14 │
│ • Generate 1: $0.03-0.07 │
│ │
│ [Generate All 3] [Generate Selected] [Skip Images] │
│ [Cancel] │
└──────────────────┬───────────────────────────────────────┘
                   ↓
USER CHOOSES (examples):
• Click [Generate All 3] → All images generated now
• Click [Generate] on Image 1 only → Hero only
• Edit Image 1 prompt, then [Generate] → Custom prompt
• Click [Skip Images] → No images, save cost
                   ↓
┌──────────────────────────────────────────────────────────┐
│ AUTOMATIC IMAGE INSERTION │
├──────────────────────────────────────────────────────────┤
│ For each generated image: │
│ 1. Download image from FLUX.2/image model │
│ 2. Upload to WordPress media library │
│ 3. Insert into article at placement point │
│ 4. Add alt text automatically │
│ │
│ Status: "Inserting images..." → "Done!" │
└─────────────────┬────────────────────────────────────────┘
                  ↓
┌──────────────────────────────────────────────────────────┐
│ FINAL RESULT: Article with Images │
├──────────────────────────────────────────────────────────┤
│ │
│ # Getting Started with N8n Automation │
│ │
│ Introduction paragraph... │
│ │
│ ![N8n automation dashboard with nodes](image1.jpg) │
│ │
│ ## Getting Started │
│ Content... │
│ │
│ ## Understanding Workflows │
│ Content... │
│ │
│ ![Workflow trigger-condition-action flow](image2.jpg) │
│ │
│ ## Advanced Monitoring │
│ Content... │
│ │
│ ![N8n real-time monitoring interface](image3.jpg) │
│ │
│ [Preview in Gutenberg] [Publish] [Download MD] │
└──────────────────────────────────────────────────────────┘
```
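
Steps 1 and 2 above hand the review modal a JSON list of image specs. The sketch below shows one way to validate that output before displaying it. The function name `parse_image_specs` and the field names (`placement`, `type`, `prompt`, `alt`) are assumptions for illustration, not part of the plugin.

```php
<?php
// Hypothetical helper: validate the JSON returned by the prompt-generation call.
// Field names are assumed; adjust them to whatever schema the model is asked for.
function parse_image_specs(string $json, int $max_images = 3): array
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new InvalidArgumentException('Model output is not a JSON array.');
    }

    $specs = [];
    foreach ($data as $item) {
        if (!is_array($item)) {
            continue;
        }
        // Skip entries with a missing or empty field rather than guessing values.
        foreach (['placement', 'type', 'prompt', 'alt'] as $field) {
            if (!isset($item[$field]) || !is_string($item[$field]) || trim($item[$field]) === '') {
                continue 2;
            }
        }
        $specs[] = [
            'placement' => trim($item['placement']),
            'type'      => trim($item['type']),
            'prompt'    => trim($item['prompt']),
            'alt'       => trim($item['alt']),
        ];
        if (count($specs) >= $max_images) {
            break; // The flow plans at most three images per article.
        }
    }
    return $specs;
}
```

Dropping incomplete entries keeps the modal from offering a [Generate] button for a spec that has no usable prompt or alt text.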
### Key Features of Option A
- **Cost control:** The user sees the cost before spending
- **Quality control:** Prompts can be edited before generation
- **Flexibility:** Generate 0, 1, 2, or 3 images
- **User review:** Users know exactly what images they'll get
- **Selective generation:** Generate only what matters
- **Smart placement:** Analyzed by the writing agent (best understanding of the article)
- **Efficient prompts:** Precise, contextual, no trial-and-error
### Costs with Option A
| Scenario | Analysis | Prompts | Images | Total |
|----------|----------|---------|--------|-------|
| User generates all 3 | $0.0008 | $0.0015 | $0.09-0.21 | $0.092-0.212 |
| User generates 2 | $0.0008 | $0.0015 | $0.06-0.14 | $0.062-0.142 |
| User generates 1 (hero) | $0.0008 | $0.0015 | $0.03-0.07 | $0.032-0.072 |
| User skips images | $0.0008 | $0.0015 | $0 | $0.0023 |
**Best case:** The user generates only the hero image = **$0.032-0.072/article** (vs $0.21-0.70 with trial-and-error)
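
The totals in the table are the fixed analysis and prompt overhead plus a per-image range multiplied by the number of images generated. A minimal sketch of that arithmetic follows; the function name `option_a_cost_range` is hypothetical, and the default per-image range of $0.03 to $0.07 is the one used in the preview modal.

```php
<?php
// Fixed overhead per article, taken from the table above (USD).
const ANALYSIS_COST = 0.0008;
const PROMPT_COST   = 0.0015;

// Hypothetical helper: returns [min, max] cost in USD for generating $images images.
function option_a_cost_range(int $images, float $min_per_image = 0.03, float $max_per_image = 0.07): array
{
    $overhead = ANALYSIS_COST + PROMPT_COST;
    return [
        round($overhead + $images * $min_per_image, 4),
        round($overhead + $images * $max_per_image, 4),
    ];
}

// option_a_cost_range(3) gives [0.0923, 0.2123], shown as $0.092-0.212 in the table.
```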
---
## Alternative Flows (B & C)
### Option B: Automatic Full Generation (FASTEST)
```
Article generated
        ↓
Plugin automatically generates ALL images without review
        ↓
"Article + images ready!" (1-2 minutes total)
```
- **Pros:** One-click, minimal user interaction
- **Cons:** Always costs the full image budget (no user control)
- **Cost:** Full $0.12-0.35 (analysis + all images always generated)
- **Use when:** The user has an unlimited budget OR you offer it as a "premium fast mode"
---
### Option C: Smart Selective with Recommendations (BALANCED)
```
Similar to Option A, but plugin recommends:
- "Hero image has best impact/cost ratio" [Generate hero]
- "Diagrams help understanding" [Generate diagram?]
- "Screenshot is optional" [Generate?]
```
- **Pros:** Guides the user toward cost-effective choices
- **Cons:** Slightly more UI complexity
- **Cost:** User-controlled (guided)
- **Use when:** You want to educate users about cost-benefit tradeoffs
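
The guidance in Option C can be expressed as a small lookup from image type to a recommendation level. This is only a sketch: the function name `recommend_images`, the level names, and the `preselected` flag are assumptions, and the type keys mirror the three image types in the preview modal.

```php
<?php
// Hypothetical helper: attach Option C guidance to each image spec.
function recommend_images(array $specs): array
{
    // Assumed mapping, based on the three recommendations listed above.
    $levels = [
        'hero'       => 'recommended', // "best impact/cost ratio"
        'diagram'    => 'suggested',   // "help understanding"
        'screenshot' => 'optional',
    ];

    foreach ($specs as &$spec) {
        $type = strtolower($spec['type'] ?? '');
        $spec['recommendation'] = $levels[$type] ?? 'optional';
        // Only the recommended image is ticked by default in the modal.
        $spec['preselected'] = $spec['recommendation'] === 'recommended';
    }
    unset($spec); // Break the reference left over from the loop.

    return $specs;
}
```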
---
## Your Configuration (from screenshot)
Based on your current model configuration:
```
Chat Model: Google: Gemini 2.5 Flash
Clarity Model: Google: Gemini 2.5 Flash
Planning Model: Google: Gemini 2.5 Flash
Writing Model: Anthropic: Claude 3.5 Sonnet
Refinement Model: Anthropic: Claude 3.5 Sonnet
Image Model: GPT-4o (or FLUX.2 from preset)
```
### Recommended Implementation
```php
// Option A implementation (safest, recommended).
// The helper functions called here are the ones planned in Phase 1.

// 1. After article generation, analyze placement automatically.
$placement_data = analyze_article_for_images(
    $article,
    'anthropic/claude-3.5-sonnet' // Use the same writing model
);

// 2. Generate prompts.
$image_specs = generate_image_prompts(
    $article,
    $placement_data,
    'anthropic/claude-3.5-sonnet' // Same model
);

// 3. Show the review UI (don't generate images yet).
show_image_review_modal($image_specs);

// 4. The user clicks [Generate All] or an individual [Generate].
// 5. Only then call image generation.
// Cost so far: $0.0023 (tiny)
// User controls image generation cost: $0.03-0.21
```
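
Steps 4 and 5 are where money is actually spent, so it helps to make the "only selected images" rule explicit. The sketch below assumes a hypothetical `generate_selected_images` function that receives the image generator as a callable, which keeps the selection logic independent of any particular image API.

```php
<?php
// Hypothetical helper: call the image generator only for the specs the user selected.
// $generate is any callable that takes a prompt string and returns an image URL.
function generate_selected_images(array $image_specs, array $selected, callable $generate): array
{
    $results = [];
    foreach ($image_specs as $index => $spec) {
        if (!in_array($index, $selected, true)) {
            continue; // Skipped images trigger no API call and cost nothing.
        }
        $results[$index] = [
            'url' => $generate($spec['prompt']),
            'alt' => $spec['alt'],
        ];
    }
    return $results;
}
```

Passing an empty selection returns an empty array without invoking the generator, which corresponds to the [Skip Images] path.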
---
## Cost Breakdown
### Analysis + Prompt Generation (Automatic, Non-Optional)
| Task | Tokens In | Tokens Out | Cost |
|------|-----------|------------|------|
| Placement analysis | 2,000 | 800 | $0.0008 |
| Prompt generation | 3,000 | 1,000 | $0.0015 |
| **Total** | **5,000** | **1,800** | **$0.0023** |
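**These two calls reuse the writing model already configured for article generation. They are billed as separate calls, which is why the totals below list them alongside the text cost.**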
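
For reference, the cost of a single model call is usually computed from the token counts and the provider's per-million-token rates. The helper below is a sketch with a hypothetical name, and the rates in the usage comment are placeholders rather than any provider's real pricing; the dollar figures in the table depend on the actual rates and are not reproduced here.

```php
<?php
// Hypothetical helper: cost in USD of one model call.
// Rates are expressed in USD per one million tokens.
function token_call_cost(int $tokens_in, int $tokens_out, float $usd_per_million_in, float $usd_per_million_out): float
{
    return ($tokens_in / 1000000) * $usd_per_million_in
         + ($tokens_out / 1000000) * $usd_per_million_out;
}

// Placeholder rates only: token_call_cost(2000, 800, 1.0, 2.0)
// = 0.002 + 0.0016 = 0.0036 USD.
```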
### Image Generation (User-Controlled)
**Per image (based on model tier):**
| Image Model | Cost/Image | 3 Images |
|------------|-----------|----------|
| FLUX.2 klein (Budget) | $0.03-0.05 | $0.09-0.15 |
| Riverflow/FLUX.2 Pro (Balanced) | $0.06-0.10 | $0.18-0.30 |
| FLUX.2 max (Premium) | $0.07-0.21 | $0.21-0.63 |
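
The tier table can also be used to cap how many images fit a per-article image budget. The sketch below is an assumption about how such a cap might work: the tier keys, the function name `max_images_within_budget`, and the choice to plan against the upper end of each range are all illustrative.

```php
<?php
// Per-image cost ranges in USD, copied from the tier table above.
const IMAGE_TIERS = [
    'budget'   => ['min' => 0.03, 'max' => 0.05],
    'balanced' => ['min' => 0.06, 'max' => 0.10],
    'premium'  => ['min' => 0.07, 'max' => 0.21],
];

// Hypothetical helper: how many of the planned images fit within the budget.
function max_images_within_budget(float $budget_usd, string $tier, int $planned = 3): int
{
    if (!array_key_exists($tier, IMAGE_TIERS)) {
        throw new InvalidArgumentException("Unknown tier: $tier");
    }
    // Plan against the upper bound so the estimate never undershoots real spend.
    $worst_case = IMAGE_TIERS[$tier]['max'];
    // The small epsilon guards against floating-point error at exact multiples.
    $affordable = (int) floor(($budget_usd + 1e-9) / $worst_case);

    return max(0, min($planned, $affordable));
}
```

With a $0.10 image budget this allows two images on the budget tier and none on the premium tier, since a single premium image can cost up to $0.21.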
### Total Article Cost
| Scenario | Text | Analysis | Prompts | Images | Total |
|----------|------|----------|---------|--------|-------|
| Article only | $0.03-0.07 | $0.0008 | $0.0015 | $0 | **$0.032-0.072** |
| Article + 1 hero | $0.03-0.07 | $0.0008 | $0.0015 | $0.03-0.21 | **$0.062-0.282** |
| Article + 2 images | $0.03-0.07 | $0.0008 | $0.0015 | $0.06-0.42 | **$0.092-0.492** |
| Article + 3 images | $0.03-0.07 | $0.0008 | $0.0015 | $0.09-0.63 | **$0.122-0.702** |
---
## Implementation Priority
### Phase 1: Core Logic (3-4 hours)
```php
analyze_article_for_images() // Identify placements
generate_image_prompts() // Create specs
generate_image_from_prompt() // Call image model
insert_images_into_article() // Embed in markdown
```
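
As an illustration of the last Phase 1 function, the sketch below inserts a markdown image tag before a named heading. It is not the plugin's implementation: the function name `insert_images_into_markdown` and the `before_heading` field are assumptions, and it treats every placement as "before heading X" (for example, before "Getting Started" for the hero image).

```php
<?php
// Hypothetical helper: insert each image before the heading named in its spec.
// Each $images entry is assumed to have 'alt', 'url', and 'before_heading' keys.
function insert_images_into_markdown(string $markdown, array $images): string
{
    $lines = explode("\n", $markdown);

    foreach ($images as $image) {
        // Strip square brackets from the alt text so the image tag stays well-formed.
        $alt = str_replace(['[', ']'], '', $image['alt']);
        $tag = sprintf('![%s](%s)', $alt, $image['url']);
        $inserted = false;

        foreach ($lines as $i => $line) {
            // Match a markdown heading whose text equals the target heading.
            if (preg_match('/^#{1,6}\s+(.*)$/', $line, $m) && trim($m[1]) === $image['before_heading']) {
                array_splice($lines, $i, 0, [$tag, '']);
                $inserted = true;
                break;
            }
        }

        if (!$inserted) {
            // Fall back to appending at the end when the heading is not found.
            $lines[] = '';
            $lines[] = $tag;
        }
    }

    return implode("\n", $lines);
}
```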
### Phase 2: User Interface (4-5 hours)
```
Image review modal UI // Show 3 specs
[Generate] button per image // Individual generation
[Generate All] button // Batch generation
[Edit Prompt] capability // Let users customize
Cost calculator display // Show estimated cost
```
### Phase 3: Polish (2-3 hours)
```
Image preview before insertion // Show user the image
Error handling + retry logic // Handle failures
Success notifications // Feedback
Progress indicators // "Generating image 2/3..."
```
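
The retry and progress items in Phase 3 could look roughly like the sketch below. The function names, the linear backoff, and the assumption that the generator signals failure by throwing a `RuntimeException` are all illustrative choices, not the plugin's actual design.

```php
<?php
// Hypothetical helper: retry a failing image generation call a limited number of times.
// $generate is any callable that takes a prompt and returns an image URL,
// and is assumed to throw RuntimeException on failure.
function generate_with_retry(callable $generate, string $prompt, int $max_attempts = 3, int $base_delay_ms = 500)
{
    $last_error = null;
    for ($attempt = 1; $attempt <= $max_attempts; $attempt++) {
        try {
            return $generate($prompt);
        } catch (RuntimeException $e) {
            $last_error = $e;
            if ($attempt < $max_attempts && $base_delay_ms > 0) {
                // Linear backoff: wait a little longer after each failure.
                usleep($base_delay_ms * 1000 * $attempt);
            }
        }
    }
    throw new RuntimeException(
        "Image generation failed after $max_attempts attempts.",
        0,
        $last_error
    );
}

// Hypothetical helper: status text for the progress indicator.
function progress_label(int $current, int $total): string
{
    return sprintf('Generating image %d/%d...', $current, $total);
}
```

Capping the attempts matters here because every retry that reaches the image model may be billed, so unbounded retries would undercut the cost control that Option A is built around.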
---
## Why Option A is Best for Your Plugin
1. **User controls costs** → They see preview before spending
2. **Respects budgets** → Budget tier users generate 1 image
3. **Quality focus** → Users can edit prompts if needed
4. **Flexible** → Some users skip images entirely (saves costs)
5. **Educational** → Users learn what good prompts look like
6. **Smart prompts** → Using writing agent (best context understanding)
---
## Summary: Recommended Best Flow
```
AUTOMATIC (Backend):
1. Analyze article for placement → $0.0008
2. Generate image specs/prompts → $0.0015
3. Show user preview modal → $0 (free review)
MANUAL (User Selects):
4. User clicks [Generate] on images → User controls cost
5. Plugin inserts into article → Automatic
RESULT:
- Article + images ready for Gutenberg
- User spent only what they wanted
- Total cost: $0.032-0.702 (user-controlled)
- Quality: High (smart placement + customizable prompts)
```
---
**Document version:** 1.0
**Date:** January 27, 2026
**Status:** Ready for Implementation