Add AI hybrid generation workflow note
AI_HYBRID_GENERATION_WORKFLOW.md
# AI Hybrid Generation Workflow

## Goal

Allow admins to generate either:

- a single AI question
- or multiple AI questions in one run

without losing control over the quality of each generated item.

The system should support both precision workflows and exploration workflows.

## Core Principle

A generation request and the items it produces must be treated as separate things.

That means:

1. One admin action creates a **generation run**
2. One generation run can produce one or many **generated variants**
3. Each generated variant remains an individually reviewable item

This is the cleanest way to support both single and bulk generation.

## Why This Is Better

Admins do not always have the same intent.

### Precision mode

The admin wants:

- one strong output
- high control
- easy review

This is best served by single generation.

### Exploration mode

The admin wants:

- multiple candidates
- idea exploration
- later curation

This is best served by bulk generation.

A rigid one-size-fits-all generation flow is worse for both modes.

## Recommended Model

### Parent / Basis Question

The canonical source or promoted basis item.

### Generation Run

Represents one AI request.

Suggested fields:

- parent question id
- source question version id
- target difficulty
- requested count
- model
- prompt version
- created by
- created at
- optional operator notes

### Generated Variant

Each output item from the generation run.

Suggested fields:

- generation run id
- parent question id
- source version id
- difficulty
- status
- stem
- options
- answer
- explanation
- review notes
- reviewer
- reviewed at

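As a rough illustration, the two records could be modeled as plain dataclasses. This is only a sketch: the field types, the extra `id` on the run, the `draft` default, and all sample values are assumptions added for illustration, not part of the proposal.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class GenerationRun:
    """One AI request, created by a single admin action."""
    id: str  # assumed: a run needs some identifier for variants to point at
    parent_question_id: str
    source_question_version_id: str
    target_difficulty: str
    requested_count: int
    model: str
    prompt_version: str
    created_by: str
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    operator_notes: Optional[str] = None


@dataclass
class GeneratedVariant:
    """One output item of a run; it carries its own review state."""
    generation_run_id: str
    parent_question_id: str
    source_version_id: str
    difficulty: str
    stem: str
    options: list[str]
    answer: str
    explanation: str
    status: str = "draft"  # assumed default for freshly generated items
    review_notes: Optional[str] = None
    reviewer: Optional[str] = None
    reviewed_at: Optional[datetime] = None


# Sample values below are made up.
run = GenerationRun(
    id="run-1",
    parent_question_id="q-42",
    source_question_version_id="q-42-v3",
    target_difficulty="medium",
    requested_count=2,
    model="example-model",
    prompt_version="v1",
    created_by="admin-7",
)
variant = GeneratedVariant(
    generation_run_id=run.id,
    parent_question_id=run.parent_question_id,
    source_version_id=run.source_question_version_id,
    difficulty=run.target_difficulty,
    stem="What is 2 + 2?",
    options=["3", "4", "5"],
    answer="4",
    explanation="Basic addition.",
)
```

Keeping the lineage fields on the variant as well as on the run means a single variant can be traced back to its parent and source version without loading the run first.
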
## Required Lifecycle

Each generated item must be individually manageable.

Suggested statuses:

- `draft`
- `approved`
- `rejected`
- `archived`
- `stale`

This is required even when a run generates many items at once.

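The statuses above map naturally onto an enum. The note lists statuses only and does not say which moves between them are legal, so the transition table in this sketch is a guess made purely for illustration.

```python
from enum import Enum


class VariantStatus(str, Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    REJECTED = "rejected"
    ARCHIVED = "archived"
    STALE = "stale"


# Hypothetical transition table: which moves are legal is an assumption,
# not something the workflow note specifies.
ALLOWED_TRANSITIONS = {
    VariantStatus.DRAFT: {
        VariantStatus.APPROVED,
        VariantStatus.REJECTED,
        VariantStatus.ARCHIVED,
        VariantStatus.STALE,
    },
    VariantStatus.APPROVED: {VariantStatus.ARCHIVED, VariantStatus.STALE},
    VariantStatus.REJECTED: {VariantStatus.ARCHIVED},
    VariantStatus.STALE: {VariantStatus.ARCHIVED},
    VariantStatus.ARCHIVED: set(),
}


def can_transition(current: VariantStatus, target: VariantStatus) -> bool:
    """Return True if a single variant may move from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

Because the check takes one variant's status at a time, it applies unchanged whether the run produced one item or many.
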
## UX Principle

Do not treat bulk output as one indivisible package.

Bulk generation should be:

- one producer action
- many independently reviewable outputs

This means the admin can:

- approve 2 items
- reject 1 item
- archive 1 item
- regenerate only one item

from the same generation run.

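The mixed outcome described above can be sketched with a run of five variants. The variant ids, the dictionary layout, and the choice to model regeneration as "replace the content and return the item to `draft`" are all assumptions for illustration.

```python
from collections import Counter

# One run's variants keyed by a hypothetical variant id; all start as drafts.
variants = {
    vid: {"stem": f"Question text {vid}", "status": "draft"}
    for vid in ("v1", "v2", "v3", "v4", "v5")
}


def set_status(variants: dict, variant_id: str, status: str) -> None:
    """Change one variant's status; its siblings in the run are untouched."""
    variants[variant_id]["status"] = status


def regenerate(variants: dict, variant_id: str, new_stem: str) -> None:
    """Replace one variant's content and send it back to review as a draft."""
    variants[variant_id] = {"stem": new_stem, "status": "draft"}


set_status(variants, "v1", "approved")
set_status(variants, "v2", "approved")
set_status(variants, "v3", "rejected")
set_status(variants, "v4", "archived")
regenerate(variants, "v5", "Regenerated question text")

summary = Counter(v["status"] for v in variants.values())
print(dict(summary))
# {'approved': 2, 'rejected': 1, 'archived': 1, 'draft': 1}
```
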
## Recommended Admin UX

Inside the parent question page:

### Generation Form

- target difficulty
- model
- count
- optional notes or style instructions
- generate button

### Guidance Text

The system should guide, not over-restrict.

Recommended copy:

- “You can generate one or many variants in one run.”
- “Recommended: 1–3 variants per run for better consistency and easier review.”
- “Larger runs may reduce cost per item but increase overlap, correlated mistakes, and review effort.”

### Result View

After generation, show each item separately with actions:

- approve
- reject
- archive
- edit
- regenerate this item
- compare with parent

## Recommendation vs Restriction

The product should not hard-limit normal admin workflows at very low counts such as 2 or 3.

Instead:

- provide recommendation text in the UI
- allow single and bulk generation
- preserve admin control

However, the backend should still apply a technical safety ceiling.

Example:

- no UX hard limit at 2 or 3
- backend safety limit at something like 20 or 50

This is not a workflow restriction. It is protection against abuse and runaway cost.

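A minimal sketch of that split between soft recommendation and hard ceiling follows. The cap value of 20 is one of the two example figures above, picked arbitrarily, and the function and exception names are hypothetical.

```python
MAX_VARIANTS_PER_RUN = 20  # assumed; the note only says "something like 20 or 50"
RECOMMENDED_MAX = 3        # upper end of the recommended 1–3 range


class CountRejected(ValueError):
    """Raised when a requested count cannot be accepted at all."""


def validate_requested_count(count: int) -> list[str]:
    """Return UI warnings for an accepted count; raise only at the hard cap."""
    if not isinstance(count, int) or count < 1:
        raise CountRejected("count must be a positive integer")
    if count > MAX_VARIANTS_PER_RUN:
        raise CountRejected(
            f"count {count} exceeds the safety cap of {MAX_VARIANTS_PER_RUN}"
        )
    warnings = []
    if count > RECOMMENDED_MAX:
        # Soft guidance only: the request still goes through.
        warnings.append(
            "Recommended: 1–3 variants per run for better consistency "
            "and easier review."
        )
    return warnings
```

A count of 8 is accepted with a warning, while a count above the cap is rejected outright, which keeps the ceiling out of the normal admin workflow.
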
## Recommended Count Guidance

### 1 item

Best for:

- high quality
- careful review
- final production-ready generation

### 2–3 items

Best default.

Good balance of:

- cost
- quality
- review effort

### 4–8 items

Useful for exploration.

Tradeoff:

- more candidate variety
- heavier review burden
- higher chance of repeated structure and correlated errors

### More than 8 items

Should still be allowed if product policy permits, but treated as exploration mode.

The UI should warn that:

- review effort increases
- quality consistency may drop
- variants may become repetitive

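The tiers above could drive the hint shown next to the count field. The sketch below maps a count to a tier and its notes; the tier labels and the function name are made up for illustration, while the note text is taken from the lists above.

```python
from typing import NamedTuple


class Guidance(NamedTuple):
    tier: str               # hypothetical tier label
    notes: tuple[str, ...]  # hints the UI could show for this count


def count_guidance(count: int) -> Guidance:
    """Map a requested count to the guidance tier described above."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return Guidance(
            "precision",
            ("high quality", "careful review", "final production-ready generation"),
        )
    if count <= 3:
        return Guidance(
            "default",
            ("good balance of cost, quality, and review effort",),
        )
    if count <= 8:
        return Guidance(
            "exploration",
            (
                "more candidate variety",
                "heavier review burden",
                "higher chance of repeated structure and correlated errors",
            ),
        )
    return Guidance(
        "exploration_large",
        (
            "review effort increases",
            "quality consistency may drop",
            "variants may become repetitive",
        ),
    )
```
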
## Cost and Quality Insight

Bulk generation can reduce cost per item because:

- parent context is sent once
- prompt overhead is amortized

But quality risk increases because:

- errors can repeat across all outputs in a run
- structure can become too similar
- weaker prompts can produce multiple low-quality siblings

So the best product design is:

- permit bulk
- recommend lower counts
- review outputs individually

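The amortization argument can be made concrete with a simple model: shared tokens (parent context plus prompt overhead) are paid once per run, and each item adds its own output tokens. The token figures below are invented for illustration and are not measurements; the model also assumes every item costs the same to produce.

```python
def tokens_per_item(shared_tokens: int, per_item_tokens: int, count: int) -> float:
    """Average tokens per generated item when shared context is sent once."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return shared_tokens / count + per_item_tokens


# Illustrative numbers only: 1200 shared tokens, 300 tokens per item.
for n in (1, 3, 8):
    print(n, tokens_per_item(1200, 300, n))
# 1 1500.0
# 3 700.0
# 8 450.0
```

Under these assumed numbers, going from 1 to 3 items saves 800 tokens per item, while going from 3 to 8 saves only another 250. The savings flatten as the count grows, whereas the quality risks listed above do not, which fits the recommendation to prefer lower counts.
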
## Recommended Policy

1. Allow both single and bulk generation
2. Keep generated items reviewable one-by-one
3. Store lineage through generation run metadata
4. Provide UI recommendations instead of rigid, low hard caps
5. Enforce only a high backend safety cap
6. Keep the parent question as the operational center of the workflow

## Product Direction

The ideal flow is:

1. Admin opens the parent question
2. Admin chooses difficulty, model, and count
3. System creates one generation run
4. System creates one or many generated child variants
5. Admin reviews each child separately
6. Admin approves, rejects, archives, or regenerates per item

This gives:

- low friction
- high control
- strong auditability
- better quality governance
- flexibility for different admin intents
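
Steps 3 to 6 of the flow can be sketched end to end as follows. Everything here is hypothetical: the function names, the dictionary layout, and the `fake_generate` stub that stands in for the real model call, which the note does not describe.

```python
import uuid
from typing import Callable

# Signature assumed for the model call: (parent_id, difficulty, count) -> outputs
Generator = Callable[[str, str, int], list[dict]]


def create_run_with_variants(
    parent_id: str,
    difficulty: str,
    model: str,
    count: int,
    created_by: str,
    generate: Generator,
) -> tuple[dict, list[dict]]:
    """Create one generation run, then one draft variant per model output."""
    run = {
        "id": str(uuid.uuid4()),
        "parent_question_id": parent_id,
        "target_difficulty": difficulty,
        "model": model,
        "requested_count": count,
        "created_by": created_by,
    }
    outputs = generate(parent_id, difficulty, count)
    variants = [
        {
            "id": f"{run['id']}-{i}",
            "generation_run_id": run["id"],  # lineage back to the run
            "parent_question_id": parent_id,
            "difficulty": difficulty,
            "status": "draft",               # every child starts unreviewed
            **output,
        }
        for i, output in enumerate(outputs, start=1)
    ]
    return run, variants


def fake_generate(parent_id: str, difficulty: str, count: int) -> list[dict]:
    """Stub standing in for the AI call; returns placeholder content."""
    return [
        {"stem": f"Variant {i} of {parent_id}", "answer": "A"}
        for i in range(1, count + 1)
    ]


run, variants = create_run_with_variants(
    "q-42", "hard", "example-model", 3, "admin-7", fake_generate
)
variants[0]["status"] = "approved"  # reviewing one child leaves the others alone
```
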