Agentic Context Management Strategy
Date: January 25, 2026
Version: 0.1.3+
Purpose: AI-powered context management for multilingual, intelligent user experience
🎯 Core Philosophy: Let AI Handle AI Context
The Problem with Hardcoded Solutions
Previous Approach (FLAWED):
// ❌ English-only, brittle, not scalable
if (content.includes('outline') || content.includes('structure')) {
  return 'create_outline';
}
Issues:
- ❌ Only works in English
- ❌ Breaks in Indonesian, Arabic, Chinese, etc.
- ❌ Misses nuanced intent
- ❌ Requires constant maintenance
- ❌ Goes against "agentic" philosophy
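To make the language problem concrete, here is a minimal sketch of how such a keyword check behaves. The function name `detectIntentByKeyword`, the lowercase normalization, and the `continue_chat` fallback are assumptions added for illustration; only the two keywords come from the snippet above.

```javascript
// Hypothetical reconstruction of the keyword approach, for illustration only.
function detectIntentByKeyword(content) {
  const text = content.toLowerCase();
  if (text.includes('outline') || text.includes('structure')) {
    return 'create_outline';
  }
  // Assumed fallback: anything unmatched is treated as ordinary chat.
  return 'continue_chat';
}

// English request: matched by the keyword list.
const english = detectIntentByKeyword('Please create an outline for this');

// Indonesian request with the same meaning ("kerangka" = outline): missed.
const indonesian = detectIntentByKeyword('Tolong buatkan kerangka artikelnya');

console.log(english);    // create_outline
console.log(indonesian); // continue_chat
```

Covering the second case would mean adding keywords for every supported language, which is the maintenance burden listed above.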
Agentic Principle
"If AI is smart enough to write articles, it's smart enough to manage its own context."
New Approach:
- ✅ Use AI to summarize chat history
- ✅ Use AI to detect user intent
- ✅ Language-agnostic (works in any language)
- ✅ Adapts to context automatically
- ✅ True "agentic" experience
💰 Cost Analysis: AI-Powered Context Management
Your Cost Reference
Action: meta_description
Model: deepseek-chat-v3-032
Tokens: 510
Cost: $0.0001
This is EXTREMELY cheap! Let's use this model for context operations.
Proposed Actions
1. Action: summarize_context
Purpose: Condense long chat history into key points
Input:
{
  "action": "summarize_context",
  "chat_history": [
    {"role": "user", "content": "Saya ingin menulis tentang keamanan WordPress"},
    {"role": "assistant", "content": "[Long response in Indonesian...]"},
    {"role": "user", "content": "Fokus pada plugin vulnerabilities saja"},
    {"role": "assistant", "content": "[Detailed plugin security response...]"}
  ]
}
Prompt:
Summarize this conversation into key points that capture the user's intent and requirements.
Focus on:
- Main topic
- Specific focus areas
- Rejected/excluded topics
- User preferences (tone, audience, etc.)
Keep the summary concise (max 200 words) but preserve critical context.
Write in the same language as the conversation.
Output format:
TOPIC: [main topic]
FOCUS: [what to include]
EXCLUDE: [what to avoid]
PREFERENCES: [any specific requirements]
Expected Output:
TOPIC: WordPress security
FOCUS: Plugin vulnerabilities only
EXCLUDE: Performance optimization, backup strategies (user rejected these)
PREFERENCES: Technical audience, detailed explanations
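Because the summary follows a fixed label format, the client could turn it into a structured object before reuse. The sketch below is one possible approach; `parseSummary` is a hypothetical helper, and it assumes the model keeps the four English labels even when the summary text itself is in another language.

```javascript
// Field labels come from the output format in the summarization prompt.
const SUMMARY_FIELDS = ['TOPIC', 'FOCUS', 'EXCLUDE', 'PREFERENCES'];

// parseSummary is a hypothetical helper, not part of the plugin.
function parseSummary(summaryText) {
  const result = { topic: '', focus: '', exclude: '', preferences: '' };
  let currentField = null;

  for (const rawLine of summaryText.split('\n')) {
    const line = rawLine.trim();
    const match = line.match(/^([A-Z]+):\s*(.*)$/);

    if (match && SUMMARY_FIELDS.includes(match[1])) {
      currentField = match[1].toLowerCase();
      result[currentField] = match[2];
    } else if (currentField && line) {
      // A wrapped line belongs to the field that came before it.
      result[currentField] += ' ' + line;
    }
  }
  return result;
}

const parsedSummary = parseSummary(
  'TOPIC: WordPress security\n' +
  'FOCUS: Plugin vulnerabilities only\n' +
  'EXCLUDE: Performance optimization, backup strategies (user rejected these)\n' +
  'PREFERENCES: Technical audience, detailed explanations'
);
console.log(parsedSummary.focus); // Plugin vulnerabilities only
```

If the model omits a label, the corresponding field stays an empty string, so callers can detect a malformed summary and fall back to the full history.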
Cost Estimate:
- Input: 4,000 tokens (long chat history)
- Output: 100 tokens (summary)
- Model: deepseek-chat-v3-032
- Cost: ~$0.0001 per summarization
When to Use:
- Chat history > 6 messages
- Before generating outline
- Before executing article
2. Action: detect_intent
Purpose: Understand what user wants to do next
Input:
{
  "action": "detect_intent",
  "last_message": "Baiklah, sekarang buatkan outline-nya",
  "has_plan": false,
  "current_mode": "chat"
}
Prompt:
Based on the user's message, determine their intent. Choose ONE:
1. "create_outline" - User wants to create an article outline/structure
2. "start_writing" - User wants to write the full article
3. "refine_content" - User wants to improve existing content
4. "continue_chat" - User wants to continue discussing/exploring
5. "clarify" - User is asking questions or needs clarification
Consider:
- The user's explicit request
- Whether they have an outline already (has_plan: {has_plan})
- Current mode (current_mode: {current_mode})
Respond with ONLY the intent code (e.g., "create_outline").
Expected Output:
create_outline
Cost Estimate:
- Input: 100 tokens (last message + context)
- Output: 5 tokens (intent code)
- Model: deepseek-chat-v3-032
- Cost: ~$0.00002 per detection
When to Use:
- After every user message in Chat mode
- To show contextual action buttons
- To auto-suggest next steps
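The prompt asks for the intent code only, but a model may still wrap it in quotes or add punctuation. A minimal sketch of a client-side guard follows; `normalizeIntent` and the choice of `continue_chat` as the safe fallback are assumptions, while the five codes come from the prompt above.

```javascript
// The five intent codes listed in the detect_intent prompt.
const ALLOWED_INTENTS = [
  'create_outline',
  'start_writing',
  'refine_content',
  'continue_chat',
  'clarify',
];

// Hypothetical helper: clean a raw model reply and validate it.
function normalizeIntent(rawReply) {
  const cleaned = String(rawReply ?? '')
    .trim()
    .toLowerCase()
    // Strip leading quotes and trailing quotes, periods, or whitespace.
    .replace(/^["'`\s]+|["'`.\s]+$/g, '');

  // Anything outside the known list is treated as plain conversation.
  return ALLOWED_INTENTS.includes(cleaned) ? cleaned : 'continue_chat';
}

console.log(normalizeIntent('"create_outline"'));      // create_outline
console.log(normalizeIntent('Maybe an outline next?')); // continue_chat
```

Falling back to `continue_chat` means an unexpected reply shows no action button rather than a wrong one.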
📊 Cost Comparison: Full History vs AI-Powered
Scenario: 5 Agent + 4 Human Messages
| Approach | Input Tokens | Output Tokens | Cost per Request | Quality | Language Support |
|---|---|---|---|---|---|
| Full History | 4,365 | 0 | $0.013 (Claude) | ✅ Best | ✅ All |
| AI Summarization | 100 (summary) | 0 | $0.003 (Claude) + $0.0001 (summary) | ✅ Good | ✅ All |
| Hardcoded Pruning | 1,800 | 0 | $0.005 (Claude) | ⚠️ Fair | ❌ English only |
| No Context | 0 | 0 | $0.000 | ❌ Poor | ✅ All |
Cost Breakdown for 100 Articles/Month
Full History:
- Planning: 100 × $0.00033 = $0.033
- Execution: 100 × $0.013 = $1.30
- Total: $1.33/month
AI Summarization:
- Summarization: 100 × $0.0001 = $0.01
- Planning: 100 × $0.00008 = $0.008
- Execution: 100 × $0.003 = $0.30
- Total: $0.32/month
- Savings: $1.01/month (76% reduction)
Intent Detection:
- Per message: $0.00002
- Average 10 messages per article: 100 × 10 × $0.00002 = $0.02
- Total: $0.02/month (negligible)
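The monthly figures above can be reproduced with a few lines of arithmetic. This is only a worked check of the numbers in this section; the variable and function names are illustrative, and the per-request costs are the document's own estimates.

```javascript
const ARTICLES_PER_MONTH = 100;

// Per-article costs taken from the breakdown above (USD).
const fullHistory = { planning: 0.00033, execution: 0.013 };
const aiSummarization = { summarization: 0.0001, planning: 0.00008, execution: 0.003 };

// Sum the per-article components and scale to a month.
function monthlyCost(perArticleCosts, articles) {
  const perArticle = Object.values(perArticleCosts).reduce((sum, c) => sum + c, 0);
  return perArticle * articles;
}

const fullTotal = monthlyCost(fullHistory, ARTICLES_PER_MONTH);        // about 1.333
const summaryTotal = monthlyCost(aiSummarization, ARTICLES_PER_MONTH); // about 0.318
const savings = fullTotal - summaryTotal;                              // about 1.015
const savingsPct = (savings / fullTotal) * 100;

// Intent detection: 10 messages per article at $0.00002 each.
const intentTotal = 0.00002 * 10 * ARTICLES_PER_MONTH;

console.log(fullTotal.toFixed(2));    // 1.33
console.log(summaryTotal.toFixed(2)); // 0.32
console.log(savingsPct.toFixed(0));   // 76
console.log(intentTotal.toFixed(2));  // 0.02
```

The $1.01 savings quoted above is the difference of the two rounded totals ($1.33 and $0.32); the unrounded difference is about $1.015.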
🎯 The Big Picture: Agentic Experience
User Journey Analysis
Current Flow (Fragmented):
1. User opens editor
2. User manually switches to Chat mode
3. User types message
4. Agent responds
5. User types more
6. Agent responds
7. User manually switches to Planning mode
8. User types "create outline"
9. Outline generated
10. User manually clicks "Start Writing"
11. Article generated
Problems:
- Too many manual mode switches
- User must know when to switch
- No guidance on next steps
- Friction in workflow
Proposed Agentic Flow (Seamless):
1. User opens editor (any mode)
2. User types: "Saya ingin menulis tentang keamanan WordPress"
3. Agent responds with suggestions
4. User types: "Fokus pada plugin vulnerabilities"
5. Agent responds with refined ideas
6. 💡 UI shows: [📝 Ready to create outline?] (AI-detected intent)
7. User clicks button (or types "yes" or "buatkan outline")
8. ✨ AI summarizes chat history (0.1 seconds)
9. Outline generated with clean context
10. 💡 UI shows: [✍️ Start Writing] (auto-suggested)
11. User clicks
12. Article generated
Improvements:
- ✅ No manual mode switching needed
- ✅ AI suggests next steps proactively
- ✅ Context automatically optimized
- ✅ Smooth, guided experience
- ✅ Works in any language
🔧 Implementation Design
Backend: New Actions
/**
 * Handle context summarization request.
 *
 * @param WP_REST_Request $request REST request.
 * @return WP_REST_Response|WP_Error Response.
 */
public function handle_summarize_context( $request ) {
    $params       = $request->get_json_params();
    $chat_history = $params['chatHistory'] ?? array();

    // Short history (4 messages or fewer, per the decision matrix): send it in full.
    // The is_array() check also guards count() against a malformed payload.
    if ( ! is_array( $chat_history ) || count( $chat_history ) <= 4 ) {
        return new WP_REST_Response(
            array(
                'summary'          => '',
                'use_full_history' => true,
            ),
            200
        );
    }

    // Build summarization prompt
    $history_text = '';
    foreach ( $chat_history as $msg ) {
        $role          = ucfirst( $msg['role'] ?? 'Unknown' );
        $content       = $msg['content'] ?? '';
        $history_text .= "{$role}: {$content}\n\n";
    }

    // Prompt lines stay flush-left so no indentation leaks into the string.
    $prompt = "Summarize this conversation into key points that capture the user's intent and requirements.
Focus on:
- Main topic
- Specific focus areas
- Rejected/excluded topics
- User preferences (tone, audience, etc.)
Keep the summary concise (max 200 words) but preserve critical context.
Write in the same language as the conversation.
Output format:
TOPIC: [main topic]
FOCUS: [what to include]
EXCLUDE: [what to avoid]
PREFERENCES: [any specific requirements]
Conversation:
{$history_text}";

    $provider = WP_Agentic_Writer_OpenRouter_Provider::get_instance();
    $messages = array(
        array(
            'role'    => 'user',
            'content' => $prompt,
        ),
    );

    // Use cheap model for summarization
    $response = $provider->chat( $messages, array(), 'summarize' );
    if ( is_wp_error( $response ) ) {
        return $response;
    }

    // Track cost
    do_action(
        'wp_aw_after_api_request',
        $params['postId'] ?? 0,
        $response['model'] ?? '',
        'summarize_context',
        $response['input_tokens'] ?? 0,
        $response['output_tokens'] ?? 0,
        $response['cost'] ?? 0
    );

    return new WP_REST_Response(
        array(
            'summary'          => $response['content'] ?? '',
            'use_full_history' => false,
            'cost'             => $response['cost'] ?? 0,
        ),
        200
    );
}
/**
 * Handle intent detection request.
 *
 * @param WP_REST_Request $request REST request.
 * @return WP_REST_Response|WP_Error Response.
 */
public function handle_detect_intent( $request ) {
    $params       = $request->get_json_params();
    $last_message = $params['lastMessage'] ?? '';
    $has_plan     = $params['hasPlan'] ?? false;
    $current_mode = $params['currentMode'] ?? 'chat';

    if ( empty( $last_message ) ) {
        return new WP_REST_Response(
            array( 'intent' => 'continue_chat' ),
            200
        );
    }

    // Prompt lines stay flush-left so no indentation leaks into the string.
    $prompt = "Based on the user's message, determine their intent. Choose ONE:
1. \"create_outline\" - User wants to create an article outline/structure
2. \"start_writing\" - User wants to write the full article
3. \"refine_content\" - User wants to improve existing content
4. \"continue_chat\" - User wants to continue discussing/exploring
5. \"clarify\" - User is asking questions or needs clarification
Consider:
- The user's explicit request
- Whether they have an outline already (has_plan: " . ( $has_plan ? 'true' : 'false' ) . ")
- Current mode (current_mode: {$current_mode})
User's message: \"{$last_message}\"
Respond with ONLY the intent code (e.g., \"create_outline\").";

    $provider = WP_Agentic_Writer_OpenRouter_Provider::get_instance();
    $messages = array(
        array(
            'role'    => 'user',
            'content' => $prompt,
        ),
    );

    $response = $provider->chat( $messages, array(), 'intent_detection' );
    if ( is_wp_error( $response ) ) {
        return $response;
    }

    // Track cost
    do_action(
        'wp_aw_after_api_request',
        $params['postId'] ?? 0,
        $response['model'] ?? '',
        'detect_intent',
        $response['input_tokens'] ?? 0,
        $response['output_tokens'] ?? 0,
        $response['cost'] ?? 0
    );

    // The model may wrap the code in quotes or add a period; strip those, then
    // fall back to 'continue_chat' for anything outside the five known codes.
    $allowed_intents = array( 'create_outline', 'start_writing', 'refine_content', 'continue_chat', 'clarify' );
    $intent          = strtolower( trim( $response['content'] ?? '', " \t\n\r\0\x0B\"'`." ) );
    if ( ! in_array( $intent, $allowed_intents, true ) ) {
        $intent = 'continue_chat';
    }

    return new WP_REST_Response(
        array(
            'intent' => $intent,
            'cost'   => $response['cost'] ?? 0,
        ),
        200
    );
}
Frontend: Agentic UX
// Auto-detect intent after each message
const handleMessageSent = async (userMessage) => {
  // Send message to chat
  const chatResponse = await sendChatMessage(userMessage);

  // Detect intent once the chat response has arrived
  const intentResponse = await fetch('/wp-json/wp-agentic-writer/v1/detect-intent', {
    method: 'POST',
    // A JSON content type is needed for get_json_params() to parse the body
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      lastMessage: userMessage,
      hasPlan: !!currentPlan,
      currentMode: agentMode,
      postId: postId
    })
  });
  const { intent } = await intentResponse.json();

  // Show contextual action based on intent
  setDetectedIntent(intent);
  showContextualAction(intent);
};
// Render contextual action buttons
const renderContextualAction = () => {
  if (!detectedIntent) return null;

  switch (detectedIntent) {
    case 'create_outline':
      return (
        <div className="contextual-action">
          <p>💡 Ready to create an outline?</p>
          <button onClick={handleCreateOutlineWithSummary} className="primary">
            📝 Create Outline
          </button>
        </div>
      );
    case 'start_writing':
      if (!currentPlan) {
        return (
          <div className="contextual-action">
            <p>⚠️ You need an outline first</p>
            <button onClick={handleCreateOutlineWithSummary}>
              📝 Create Outline First
            </button>
          </div>
        );
      }
      return (
        <div className="contextual-action">
          <p>💡 Ready to write the article?</p>
          <button onClick={handleStartWriting} className="primary">
            ✍️ Start Writing
          </button>
        </div>
      );
    case 'refine_content':
      return (
        <div className="contextual-action">
          <p>💡 Use @block to refine specific sections</p>
        </div>
      );
    default:
      return null;
  }
};
// Create outline with AI summarization
const handleCreateOutlineWithSummary = async () => {
  setIsLoading(true);

  // Step 1: Summarize chat history if needed
  let contextToSend = messages;
  if (messages.length > 6) {
    showStatus('Optimizing context...');
    const summaryResponse = await fetch('/wp-json/wp-agentic-writer/v1/summarize-context', {
      method: 'POST',
      // A JSON content type is needed for get_json_params() to parse the body
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        chatHistory: messages,
        postId: postId
      })
    });
    const { summary, use_full_history, cost } = await summaryResponse.json();

    if (!use_full_history && summary) {
      // Use summarized context
      contextToSend = [
        {
          role: 'system',
          content: `Context Summary:\n${summary}`
        },
        ...messages.slice(-2) // Keep last exchange
      ];
      console.log('Context optimized. Cost:', cost);
    }
  }

  // Step 2: Generate outline with optimized context
  showStatus('Creating outline...');
  const outlineResponse = await fetch('/wp-json/wp-agentic-writer/v1/generate-plan', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      topic: extractTopic(messages),
      chatHistory: contextToSend,
      postId: postId,
      postConfig: postConfig,
      stream: true
    })
  });
  // Handle streaming response...
};
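The context-assembly step in the handler above (summary as a system message, plus the last exchange verbatim) could be pulled out into a pure function so it can be unit tested without network calls. The sketch below is one way to do that; `buildOutlineContext` and its `keepLast` parameter are hypothetical names, not part of the existing code.

```javascript
// Hypothetical pure helper mirroring the context assembly in the handler.
function buildOutlineContext(messages, summary, keepLast = 2) {
  // No summary available: fall back to the full history.
  if (!summary) return messages;

  // slice(-0) would return the whole array, so handle keepLast <= 0 explicitly.
  const recent = keepLast > 0 ? messages.slice(-keepLast) : [];

  return [
    { role: 'system', content: `Context Summary:\n${summary}` },
    ...recent,
  ];
}

// Example: 8 messages collapse to a summary plus the last exchange.
const sampleHistory = Array.from({ length: 8 }, (_, i) => ({
  role: i % 2 === 0 ? 'user' : 'assistant',
  content: `message ${i + 1}`,
}));
const optimized = buildOutlineContext(sampleHistory, 'TOPIC: WordPress security');
console.log(optimized.length); // 3
```

Keeping this logic separate from the `fetch` calls also makes it easier to change how many recent messages are preserved.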
🎨 UX Enhancements
1. Contextual Action Cards
┌─────────────────────────────────────────────────────┐
│ Agent: "I can help you create a comprehensive │
│ outline for WordPress plugin security..." │
├─────────────────────────────────────────────────────┤
│ 💡 Detected Intent: Create Outline │
│ │
│ [📝 Create Outline] [💬 Continue Discussing] │
│ │
│ 💰 Context will be optimized (~$0.0001) │
└─────────────────────────────────────────────────────┘
2. Context Optimization Indicator
┌─────────────────────────────────────────────────────┐
│ ⚡ Optimizing context... │
│ • 9 messages → Summary (200 words) │
│ • Token reduction: 4,365 → 450 (90%) │
│ • Cost: $0.0001 │
│ ✓ Done in 0.2s │
└─────────────────────────────────────────────────────┘
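The percentage shown in the indicator is a simple ratio of token counts. A minimal sketch of that calculation follows, using the figures from the mockup; `describeOptimization` and the shape of its return value are assumptions for illustration.

```javascript
// Hypothetical helper that computes the numbers shown in the indicator.
function describeOptimization(messageCount, tokensBefore, tokensAfter) {
  // Guard against division by zero when there is no history yet.
  const reductionPct = tokensBefore > 0
    ? Math.round((1 - tokensAfter / tokensBefore) * 100)
    : 0;

  return {
    reductionPct,
    label: `${messageCount} messages, tokens ${tokensBefore} → ${tokensAfter} (${reductionPct}%)`,
  };
}

const indicator = describeOptimization(9, 4365, 450);
console.log(indicator.reductionPct); // 90
```

The exact ratio for 4,365 → 450 is about 89.7%, which rounds to the 90% shown in the mockup.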
3. Smart Mode Transitions
User in Chat mode types: "buatkan outline-nya"
┌─────────────────────────────────────────────────────┐
│ 💡 Switching to Planning mode... │
│ • Detected intent: Create outline │
│ • Optimizing 7 messages of context │
│ • Generating outline... │
└─────────────────────────────────────────────────────┘
[Outline appears]
┌─────────────────────────────────────────────────────┐
│ ✨ Outline ready! │
│ │
│ Next step: │
│ [✍️ Start Writing Article] │
└─────────────────────────────────────────────────────┘
📊 Decision Matrix: When to Use What?
| Situation | Recommended Approach | Reason |
|---|---|---|
| Chat history ≤ 4 messages | Send full history | Short enough, no optimization needed |
| Chat history 5-8 messages | AI summarization | Balance cost and quality |
| Chat history > 8 messages | AI summarization + last 2 | Keep recent context verbatim |
| User switches modes | Detect intent | Guide user to next action |
| Before outline generation | Summarize context | Clean, focused input |
| Before article execution | Use plan (no chat history) | Plan already has all context |
| Block refinement | No chat history | Block content is sufficient |
| User types "/reset" | Clear all context | Fresh start |
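The matrix can be read as a small decision function. The sketch below encodes its rows directly; the function name, the `task` values, and the returned strategy labels are all hypothetical and chosen only to mirror the table.

```javascript
// Hypothetical encoding of the decision matrix above.
// task: 'reset' | 'execute_article' | 'refine_block' | 'chat' | 'outline'
function chooseContextStrategy({ task, messageCount }) {
  if (task === 'reset') return 'clear_context';         // User types "/reset"
  if (task === 'execute_article') return 'plan_only';   // Plan already has all context
  if (task === 'refine_block') return 'block_only';     // Block content is sufficient

  if (messageCount <= 4) return 'full_history';         // Short enough as-is
  if (messageCount <= 8) return 'summary';              // 5-8 messages
  return 'summary_plus_last_2';                         // More than 8 messages
}

console.log(chooseContextStrategy({ task: 'outline', messageCount: 9 }));
// summary_plus_last_2
```

Task-specific rows are checked first because they apply regardless of how long the chat history is.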
🎯 Recommendation: Hybrid Intelligent Approach
The Winning Strategy
Combine AI-powered + Smart Defaults:
- Default Behavior (No User Action)
  - Chat history ≤ 4 messages → Send full history
  - Chat history > 4 messages → Auto-summarize with AI
  - Cost: ~$0.0001 per summarization (negligible)
- Intent Detection (Automatic)
  - After every user message → Detect intent
  - Show contextual action buttons
  - Cost: ~$0.00002 per detection (negligible)
- User Control (Optional)
  - Settings: "Context Mode" → Auto/Full/Minimal
  - "/reset" command → Clear context
  - Manual selection UI (advanced users)
Why This Works
✅ Language-Agnostic
- Works in English, Indonesian, Arabic, Chinese, etc.
- No hardcoded keywords
✅ Cost-Effective
- 76% reduction in context-related cost vs full history ($0.32 vs $1.33 per 100 articles)
- Total context-related cost including intent detection: ~$0.34/month for 100 articles
- ROI: Better quality + Lower cost
✅ True Agentic Experience
- AI manages its own context
- Proactive suggestions
- Seamless workflow
- No manual mode switching
✅ User-Friendly
- Automatic by default
- Optional manual control
- Transparent (shows what's happening)
- Fast (summarization takes 0.1-0.3s)
🔧 Implementation Plan
Phase 1: Core Infrastructure (Week 1)
Backend:
- Add /summarize-context endpoint
- Add /detect-intent endpoint
- Add summarize and intent_detection operation types to cost tracking
- Update OpenRouter provider to support these actions
Frontend:
- Add handleSummarizeContext() function
- Add handleDetectIntent() function
- Add context optimization indicator component
Testing:
- Test summarization in English, Indonesian, Arabic
- Test intent detection in multiple languages
- Verify cost tracking
Phase 2: UX Integration (Week 2)
Frontend:
- Add contextual action cards
- Auto-detect intent after each message
- Show "Optimizing context..." status
- Add smart mode transitions
Settings:
- Add "Context Mode" setting (Auto/Full/Minimal)
- Add context optimization toggle
- Add cost estimates in settings
Testing:
- Test full user journey (chat → outline → write)
- Test in multiple languages
- Verify smooth transitions
Phase 3: Advanced Features (Week 3)
Features:
- Add /reset command
- Add manual context selection UI (optional)
- Add context analytics (token usage, cost breakdown)
- Add context caching (reuse summaries)
Optimization:
- Implement smart caching for summaries
- Add context relevance scoring
- Optimize prompt templates
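As a rough idea of what summary caching could look like, the sketch below memoizes summaries by the content of the chat history, so an unchanged conversation does not trigger a second summarization request. `createSummaryCache` and the injected `summarize` function are hypothetical; a real version would also need an eviction policy, since this map grows without bound.

```javascript
// Hypothetical in-memory cache keyed by the serialized chat history.
function createSummaryCache(summarize) {
  const cache = new Map();

  return function getSummary(messages) {
    // Only role and content affect the summary, so only they form the key.
    const key = JSON.stringify(messages.map((m) => [m.role, m.content]));

    if (!cache.has(key)) {
      // If summarize returns a promise, the promise itself is cached,
      // so concurrent callers share one in-flight request.
      cache.set(key, summarize(messages));
    }
    return cache.get(key);
  };
}

// Usage with a stand-in summarizer that counts how often it is called.
let summarizeCalls = 0;
const getSummary = createSummaryCache((messages) => {
  summarizeCalls += 1;
  return `summary of ${messages.length} messages`;
});

const chat = [
  { role: 'user', content: 'Saya ingin menulis tentang keamanan WordPress' },
  { role: 'assistant', content: '[Long response in Indonesian...]' },
];
getSummary(chat);
getSummary([...chat]); // same content, served from the cache
console.log(summarizeCalls); // 1
```

Adding a new message changes the key, so the next call produces a fresh summary.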
Documentation:
- Update user guide
- Add context management section
- Document cost implications
💰 Final Cost Analysis
Per Article (Average)
| Component | Unit Cost × Frequency | Cost per Article |
|---|---|---|
| Intent detection | $0.00002 × 10 messages | $0.0002 |
| Context summarization | $0.0001 × 1 time | $0.0001 |
| Planning (with summary) | $0.003 | $0.003 |
| Execution (no history) | $0.50-$2.00 | $1.00 avg |
| Total per article | | ≈ $1.0033 |
Compared to Full History:
- Full history approach: $1.013 per article
- AI-powered approach: $1.0033 per article
- Savings: $0.01 per article (negligible)
But wait - the real benefit:
- ✅ Better quality (clean, focused context)
- ✅ Language-agnostic (works everywhere)
- ✅ Better UX (proactive suggestions)
- ✅ Scalable (no hardcoded rules)
🎯 Answer to Your Question
"Send all chat history vs AI summarization?"
Answer: AI Summarization is Better
Reasons:
- Cost is Nearly Identical
  - Full history: $1.013/article
  - AI summary: $1.0033/article
  - Difference: $0.01 (1% savings)
- Quality is Better
  - Summary removes contradicted ideas
  - Summary focuses on final intent
  - Summary prevents pollution
  - AI explicitly told what to focus on
- Language Support
  - Full history: Works in all languages ✅
  - AI summary: Works in all languages ✅
  - Hardcoded: Only English ❌
- Agentic Experience
  - AI managing AI context = true agentic
  - Proactive intent detection
  - Seamless workflow
  - No user friction
- Scalability
  - No hardcoded rules to maintain
  - Adapts to new languages automatically
  - Handles edge cases gracefully
The Plan:
- ✅ Implement AI summarization (not hardcoded)
- ✅ Implement AI intent detection (not hardcoded)
- ✅ Make it automatic (no user action needed)
- ✅ Add user controls (optional override)
- ✅ Track costs transparently (show user what's happening)
Status: 🚀 READY TO IMPLEMENT
Approach: AI-Powered (Agentic)
Cost Impact: Negligible (~$0.34/month in context-related cost for 100 articles, versus $1.33 with full history)
Quality Impact: Significant improvement
UX Impact: Seamless, guided experience