# AI Hybrid Generation Workflow

## Goal

Allow admins to generate either:

- a single AI question
- or multiple AI questions in one run

without losing control over the quality of each generated item.

The system should support both precision workflows and exploration workflows.

## Core Principle

A generation request and the items it produces must be treated as separate entities.

That means:

1. One admin action creates a **generation run**
2. One generation run can produce one or many **generated variants**
3. Each generated variant remains an individually reviewable item

This is the cleanest way to support both single and bulk generation: single generation is simply a run with a requested count of 1, so both modes share one model.

## Why This Is Better

Admins do not always have the same intent.

### Precision mode

The admin wants:

- one strong output
- high control
- easy review

This is best served by single generation.

### Exploration mode

The admin wants:

- multiple candidates
- idea exploration
- later curation

This is best served by bulk generation.

A rigid, one-size-fits-all generation flow serves neither mode well.

## Recommended Model

### Parent / Basis Question

The canonical source question or promoted basis item that variants are generated from.

### Generation Run

Represents one AI request.

Suggested fields:

- parent question id
- source question version id
- target difficulty
- requested count
- model
- prompt version
- created by
- created at
- optional operator notes

### Generated Variant

Each output item from the generation run.

Suggested fields:

- generation run id
- parent question id
- source version id
- difficulty
- status
- stem
- options
- answer
- explanation
- review notes
- reviewer
- reviewed at

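
The suggested fields above could be sketched as two Python dataclasses. This is a minimal illustration, not a schema: the `id` fields, the concrete types (plain strings for ids and difficulty, a list of strings for options), and the `new_variant` helper are assumptions added for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class GenerationRun:
    """One AI request made by an admin."""
    id: str  # assumed primary key, not in the suggested field list
    parent_question_id: str
    source_question_version_id: str
    target_difficulty: str
    requested_count: int
    model: str
    prompt_version: str
    created_by: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    operator_notes: Optional[str] = None


@dataclass
class GeneratedVariant:
    """One output item of a run; it carries its own review state."""
    id: str  # assumed primary key, not in the suggested field list
    generation_run_id: str
    parent_question_id: str
    source_version_id: str
    difficulty: str
    stem: str
    options: list[str]
    answer: str
    explanation: str
    status: str = "draft"
    review_notes: Optional[str] = None
    reviewer: Optional[str] = None
    reviewed_at: Optional[datetime] = None


def new_variant(run: GenerationRun, variant_id: str, stem: str,
                options: list[str], answer: str, explanation: str) -> GeneratedVariant:
    """Hypothetical helper: create a draft variant that inherits lineage from its run."""
    return GeneratedVariant(
        id=variant_id,
        generation_run_id=run.id,
        parent_question_id=run.parent_question_id,
        source_version_id=run.source_question_version_id,
        difficulty=run.target_difficulty,
        stem=stem,
        options=options,
        answer=answer,
        explanation=explanation,
    )
```

Copying the lineage fields from the run onto each variant keeps every item traceable even when it is reviewed, edited, or archived on its own.
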
## Required Lifecycle

Each generated item must be individually manageable.

Suggested statuses:

- `draft`
- `approved`
- `rejected`
- `archived`
- `stale`

This is required even when a run generates many items at once.

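
One possible way to encode these statuses is an enum plus a transition table. The status names come from the list above; the allowed transitions are an assumption made for this sketch, since the text does not say which moves are legal or what exactly makes an item `stale`.

```python
from enum import Enum


class VariantStatus(str, Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    REJECTED = "rejected"
    ARCHIVED = "archived"
    STALE = "stale"


S = VariantStatus

# Assumed transition table: only the status names are taken from the document.
ALLOWED_TRANSITIONS = {
    S.DRAFT: {S.APPROVED, S.REJECTED, S.ARCHIVED, S.STALE},
    S.APPROVED: {S.ARCHIVED, S.STALE},
    S.REJECTED: {S.ARCHIVED},
    S.STALE: {S.ARCHIVED},
    S.ARCHIVED: set(),
}


def transition(current: VariantStatus, target: VariantStatus) -> VariantStatus:
    """Return the new status, or raise if the move is not in the assumed table."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move variant from {current.value} to {target.value}")
    return target
```

Because the function takes one variant's status at a time, a run of many items never needs a run-level status to be reviewed.
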
## UX Principle

Do not treat bulk output as one indivisible package.

Bulk generation should be:

- one producer action
- many independently reviewable outputs

This means the admin can:

- approve 2 items
- reject 1 item
- archive 1 item
- regenerate only one item

from the same generation run.

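
To illustrate mixed outcomes within a single run, the sketch below applies a separate decision to each variant. The `apply_decisions` function, the action names, and the variant ids are hypothetical; regeneration is not modeled here, so the fifth item simply stays in `draft`.

```python
from __future__ import annotations

# Assumed mapping from a review action to the resulting status.
ACTION_TO_STATUS = {"approve": "approved", "reject": "rejected", "archive": "archived"}


def apply_decisions(statuses: dict[str, str], decisions: dict[str, str]) -> dict[str, str]:
    """Return a copy of the statuses with each decision applied to its own variant only."""
    updated = dict(statuses)
    for variant_id, action in decisions.items():
        if variant_id not in updated:
            raise KeyError(f"unknown variant: {variant_id}")
        updated[variant_id] = ACTION_TO_STATUS[action]
    return updated


# Five variants produced by one run, all starting as drafts.
run_variants = {f"var-{i}": "draft" for i in range(1, 6)}

# Approve 2, reject 1, archive 1; var-5 is left untouched.
decisions = {"var-1": "approve", "var-2": "approve", "var-3": "reject", "var-4": "archive"}
result = apply_decisions(run_variants, decisions)
```
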
## Recommended Admin UX

Inside the parent question page:

### Generation Form

- target difficulty
- model
- count
- optional notes or style instructions
- generate button

### Guidance Text

The system should guide, not over-restrict.

Recommended copy:

- “You can generate one or many variants in one run.”
- “Recommended: 1–3 variants per run for better consistency and easier review.”
- “Larger runs may reduce cost per item but increase overlap, correlated mistakes, and review effort.”

### Result View

After generation, show each item separately with actions:

- approve
- reject
- archive
- edit
- regenerate this item
- compare with parent

## Recommendation vs Restriction

The product should not impose a hard limit on normal admin workflows at very low counts such as 2 or 3.

Instead:

- provide recommendation text in the UI
- allow single and bulk generation
- preserve admin control

However, the backend should still apply a technical safety ceiling.

Example:

- no UX hard limit at 2 or 3
- backend safety limit at something like 20 or 50

This is not a workflow restriction. It is protection against abuse and runaway cost.

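
A backend check along these lines might look like the following sketch. The cap of 20 is one of the two example values from the text, chosen arbitrarily; `enforce_safety_cap` and `SafetyCapExceeded` are hypothetical names.

```python
BACKEND_SAFETY_CAP = 20  # assumed value; the text only says "something like 20 or 50"


class SafetyCapExceeded(ValueError):
    """Raised when a request exceeds the backend ceiling."""


def enforce_safety_cap(requested_count: int, cap: int = BACKEND_SAFETY_CAP) -> int:
    """Accept any count from 1 up to the cap; reject only what exceeds the ceiling."""
    if requested_count < 1:
        raise ValueError("requested_count must be at least 1")
    if requested_count > cap:
        raise SafetyCapExceeded(
            f"requested_count {requested_count} exceeds safety cap {cap}"
        )
    return requested_count
```

Note that counts of 2, 3, or 10 all pass unchanged: the recommendation toward small runs lives in the UI copy, not in this check.
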
## Recommended Count Guidance

### 1 item

Best for:

- high quality
- careful review
- final production-ready generation

### 2–3 items

Best default.

Good balance of:

- cost
- quality
- review effort

### 4–8 items

Useful for exploration.

Tradeoff:

- more candidate variety
- heavier review burden
- higher chance of repeated structure and correlated errors

### More than 8 items

Should still be allowed if product policy permits, but treated as exploration mode.

The UI should warn that:

- review effort increases
- quality consistency may drop
- variants may become repetitive

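
The tiers above could drive the guidance text shown next to the count field. The following is a sketch under that assumption; the tier labels (`precision`, `default`, `exploration`, `large-exploration`) and the `guidance_for` function are hypothetical, while the notes are taken from the lists above.

```python
from __future__ import annotations


def guidance_for(count: int) -> tuple[str, list[str]]:
    """Map a requested count to an advisory tier and the notes to display."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return "precision", [
            "high quality",
            "careful review",
            "final production-ready generation",
        ]
    if count <= 3:
        return "default", ["good balance of cost, quality, and review effort"]
    if count <= 8:
        return "exploration", [
            "more candidate variety",
            "heavier review burden",
            "higher chance of repeated structure and correlated errors",
        ]
    return "large-exploration", [
        "review effort increases",
        "quality consistency may drop",
        "variants may become repetitive",
    ]
```

The function only returns advice; it never rejects a count, which keeps it separate from any backend ceiling.
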
## Cost and Quality Insight

Bulk generation can reduce cost per item because:

- parent context is sent once
- prompt overhead is amortized

But quality risk increases because:

- errors can repeat across all outputs in a run
- structure can become too similar
- a weak prompt can produce several low-quality siblings at once

So the best product design is:

- permit bulk
- recommend lower counts
- review outputs individually

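
The amortization argument can be made concrete with a small calculation. The token figures below are purely illustrative, not measurements, and the sketch counts tokens only; it ignores any price difference between input and output tokens.

```python
def tokens_per_item(shared_input_tokens: int, output_tokens_per_item: int, count: int) -> float:
    """Shared input (parent context plus prompt) is paid once and split across the run."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return shared_input_tokens / count + output_tokens_per_item


# Hypothetical figures: 1200 shared input tokens, 300 output tokens per variant.
for n in (1, 3, 8):
    print(n, tokens_per_item(1200, 300, n))
# prints:
# 1 1500.0
# 3 700.0
# 8 450.0
```

The savings flatten quickly as the count grows, since only the shared part shrinks, which fits the recommendation to prefer lower counts.
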
## Recommended Policy

1. Allow both single and bulk generation
2. Keep generated items reviewable one by one
3. Store lineage through generation run metadata
4. Provide UI recommendations instead of rigid low hard caps
5. Enforce only a high backend safety cap
6. Keep the parent question as the operational center of the workflow

## Product Direction

The ideal flow is:

1. Admin opens parent question
2. Admin chooses difficulty, model, and count
3. System creates one generation run
4. System creates one or many generated child variants
5. Admin reviews each child separately
6. Admin approves, rejects, archives, or regenerates per item

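
The steps above can be sketched end to end as follows. Everything here is a simplified assumption: the `Run` and `Variant` classes carry only a few fields, `start_run` and `review` are hypothetical helpers, and `fake_generate` stands in for the real model call.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Variant:
    id: str
    run_id: str
    stem: str
    status: str = "draft"


@dataclass
class Run:
    id: str
    parent_question_id: str
    difficulty: str
    model: str
    requested_count: int
    variants: list[Variant] = field(default_factory=list)


def start_run(run_id: str, parent_question_id: str, difficulty: str, model: str,
              count: int, generate: Callable[[str, str, int], list[str]]) -> Run:
    """Steps 3 and 4: create one run, then one child variant per generated stem."""
    run = Run(run_id, parent_question_id, difficulty, model, count)
    stems = generate(parent_question_id, difficulty, count)
    run.variants = [
        Variant(id=f"{run_id}-v{i}", run_id=run_id, stem=stem)
        for i, stem in enumerate(stems, start=1)
    ]
    return run


def review(run: Run, variant_id: str, status: str) -> Variant:
    """Steps 5 and 6: change the status of exactly one child."""
    for variant in run.variants:
        if variant.id == variant_id:
            variant.status = status
            return variant
    raise KeyError(variant_id)


def fake_generate(parent_question_id: str, difficulty: str, count: int) -> list[str]:
    """Stand-in for the model call; returns placeholder stems."""
    return [f"{difficulty} variant {i} of {parent_question_id}" for i in range(1, count + 1)]


run = start_run("run-1", "q-1", "hard", "example-model", 3, fake_generate)
review(run, "run-1-v2", "approved")
```

After the last call, only the second child is approved; its siblings remain drafts awaiting their own review.
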
This gives:

- low friction
- high control
- strong auditability
- better quality governance
- flexibility for different admin intents