Agentic Context Management Strategy

Date: January 25, 2026
Version: 0.1.3+
Purpose: AI-powered context management for multilingual, intelligent user experience

🎯 Core Philosophy: Let AI Handle AI Context

The Problem with Hardcoded Solutions

Previous Approach (FLAWED):

// ❌ English-only, brittle, not scalable
if (content.includes('outline') || content.includes('structure')) {
    return 'create_outline';
}

Issues:

❌ Only works in English
❌ Breaks in Indonesian, Arabic, Chinese, etc.
❌ Misses nuanced intent
❌ Requires constant maintenance
❌ Goes against "agentic" philosophy
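
To make the failure mode concrete, here is a minimal sketch of the keyword check applied to an Indonesian request. The function name `detectIntentByKeyword`, the `continue_chat` fallback, and the sample sentences are illustrative assumptions, not code from the plugin.

```javascript
// Hypothetical reproduction of the hardcoded approach shown above.
function detectIntentByKeyword(content) {
    const text = content.toLowerCase();
    if (text.includes('outline') || text.includes('structure')) {
        return 'create_outline';
    }
    // Assumed fallback when no keyword matches.
    return 'continue_chat';
}

// English request: the keyword matches.
console.log(detectIntentByKeyword('Please create an outline'));
// Indonesian request with the same meaning ("kerangka" = outline): no match.
console.log(detectIntentByKeyword('Tolong buatkan kerangka artikelnya'));
```

The second call falls through to the fallback even though the user asked for the same thing, which is the gap the AI-based detection is meant to close.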

Agentic Principle

"If AI is smart enough to write articles, it's smart enough to manage its own context."

New Approach:

✅ Use AI to summarize chat history
✅ Use AI to detect user intent
✅ Language-agnostic (works in any language)
✅ Adapts to context automatically
✅ True "agentic" experience

💰 Cost Analysis: AI-Powered Context Management

Your Cost Reference

Action: meta_description
Model: deepseek-chat-v3-032
Tokens: 510
Cost: $0.0001

This is EXTREMELY cheap! Let's use this model for context operations.

Proposed Actions

1. Action: `summarize_context`

Purpose: Condense long chat history into key points

Input:

{
    "action": "summarize_context",
    "chat_history": [
        {"role": "user", "content": "Saya ingin menulis tentang keamanan WordPress"},
        {"role": "assistant", "content": "[Long response in Indonesian...]"},
        {"role": "user", "content": "Fokus pada plugin vulnerabilities saja"},
        {"role": "assistant", "content": "[Detailed plugin security response...]"}
    ]
}

Prompt:

Summarize this conversation into key points that capture the user's intent and requirements. 
Focus on:
- Main topic
- Specific focus areas
- Rejected/excluded topics
- User preferences (tone, audience, etc.)

Keep the summary concise (max 200 words) but preserve critical context.
Write in the same language as the conversation.

Output format:
TOPIC: [main topic]
FOCUS: [what to include]
EXCLUDE: [what to avoid]
PREFERENCES: [any specific requirements]

Expected Output:

TOPIC: WordPress security
FOCUS: Plugin vulnerabilities only
EXCLUDE: Performance optimization, backup strategies (user rejected these)
PREFERENCES: Technical audience, detailed explanations
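
Because the output format is line-based, the summary can be turned into a structured object before it is reused. The sketch below is one possible parser; `parseContextSummary` is a hypothetical helper, and it assumes the model keeps the four English labels even when the values are written in another language.

```javascript
// Parse the "LABEL: value" lines produced by the summarization prompt.
// Assumption: labels stay in English (TOPIC, FOCUS, EXCLUDE, PREFERENCES).
function parseContextSummary(summaryText) {
    const fields = { topic: '', focus: '', exclude: '', preferences: '' };
    let current = null;
    for (const line of summaryText.split(/\r?\n/)) {
        const match = line.match(/^\s*(TOPIC|FOCUS|EXCLUDE|PREFERENCES)\s*:\s*(.*)$/i);
        if (match) {
            current = match[1].toLowerCase();
            fields[current] = match[2].trim();
        } else if (current && line.trim()) {
            // Continuation line: append it to the field being read.
            fields[current] += ' ' + line.trim();
        }
    }
    return fields;
}

const example = parseContextSummary(
    'TOPIC: WordPress security\nFOCUS: Plugin vulnerabilities only'
);
console.log(example.topic); // WordPress security
```

Fields the model omits stay as empty strings, so callers can check for missing sections without extra guards.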

Cost Estimate:

Input: 4,000 tokens (long chat history)
Output: 100 tokens (summary)
Model: deepseek-chat-v3-032
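Cost: ~$0.0001 per summarization (optimistic: the 510-token reference call above already cost $0.0001, so a 4,000-token input will likely cost several times that; verify against tracked usage)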

When to Use:

Chat history > 4 messages
Before generating outline
Before executing article

2. Action: `detect_intent`

Purpose: Understand what user wants to do next

Input:

{
    "action": "detect_intent",
    "last_message": "Baiklah, sekarang buatkan outline-nya",
    "has_plan": false,
    "current_mode": "chat"
}

Prompt:

Based on the user's message, determine their intent. Choose ONE:

1. "create_outline" - User wants to create an article outline/structure
2. "start_writing" - User wants to write the full article
3. "refine_content" - User wants to improve existing content
4. "continue_chat" - User wants to continue discussing/exploring
5. "clarify" - User is asking questions or needs clarification

Consider:
- The user's explicit request
- Whether they have an outline already (has_plan: {has_plan})
- Current mode (current_mode: {current_mode})

Respond with ONLY the intent code (e.g., "create_outline").

Expected Output:

create_outline
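
Models do not always return the bare code: the reply may come back quoted, capitalized, or with trailing punctuation. A minimal normalization sketch follows; `normalizeIntent` and `VALID_INTENTS` are hypothetical names, and falling back to `continue_chat` is an assumption chosen because it triggers no UI action.

```javascript
// The five intent codes listed in the prompt above.
const VALID_INTENTS = [
    'create_outline',
    'start_writing',
    'refine_content',
    'continue_chat',
    'clarify',
];

// Strip whitespace, quotes, and trailing punctuation, then validate.
function normalizeIntent(rawReply) {
    const cleaned = String(rawReply ?? '')
        .trim()
        .toLowerCase()
        .replace(/^["'`\s]+|["'`.\s]+$/g, '');
    return VALID_INTENTS.includes(cleaned) ? cleaned : 'continue_chat';
}

console.log(normalizeIntent('"create_outline"')); // create_outline
console.log(normalizeIntent('I think they want an outline')); // continue_chat
```

Validating against a fixed list keeps free-form model output from reaching the UI switch statement.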

Cost Estimate:

Input: 100 tokens (last message + context)
Output: 5 tokens (intent code)
Model: deepseek-chat-v3-032
Cost: ~$0.00002 per detection

When to Use:

After every user message in Chat mode
To show contextual action buttons
To auto-suggest next steps

📊 Cost Comparison: Full History vs AI-Powered

Scenario: 5 Agent + 4 Human Messages

Approach	Input Tokens	Cost per Request	Quality	Language Support
Full History	4,365	$0.013 (Claude)	✅ Best	✅ All
AI Summarization	100 (summary)	$0.003 (Claude) + $0.0001 (summary)	✅ Good	✅ All
Hardcoded Pruning	1,800	$0.005 (Claude)	⚠️ Fair	❌ English only
No Context	0	$0.000	❌ Poor	✅ All

Cost Breakdown for 100 Articles/Month

Full History:

Planning: 100 × $0.00033 = $0.033
Execution: 100 × $0.013 = $1.30
Total: $1.33/month

AI Summarization:

Summarization: 100 × $0.0001 = $0.01
Planning: 100 × $0.00008 = $0.008
Execution: 100 × $0.003 = $0.30
Total: $0.32/month
Savings: $1.01/month (76% reduction)

Intent Detection:

Per message: $0.00002
Average 10 messages per article: 100 × 10 × $0.00002 = $0.02
Total: $0.02/month (negligible)
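
The monthly totals above can be reproduced with a few lines of arithmetic. This sketch only re-computes the figures already listed; the `COSTS` object and `monthlyCost` function are hypothetical names, and the per-call costs are copied from the breakdown rather than measured.

```javascript
// Per-call costs in USD, copied from the breakdown above.
const COSTS = {
    fullHistory: { planning: 0.00033, execution: 0.013 },
    summarized: { summarization: 0.0001, planning: 0.00008, execution: 0.003 },
    intentDetection: 0.00002,
};

// Sum the per-call costs of one approach and scale by article count.
function monthlyCost(perArticleCosts, articleCount) {
    const perArticle = Object.values(perArticleCosts).reduce((sum, c) => sum + c, 0);
    return perArticle * articleCount;
}

const articleCount = 100;
const messagesPerArticle = 10;

const fullTotal = monthlyCost(COSTS.fullHistory, articleCount);
const summarizedTotal = monthlyCost(COSTS.summarized, articleCount);
const intentTotal = COSTS.intentDetection * messagesPerArticle * articleCount;
const savingsPct = ((fullTotal - summarizedTotal) / fullTotal) * 100;

console.log(`Full history:     $${fullTotal.toFixed(2)}`);       // $1.33
console.log(`AI summarization: $${summarizedTotal.toFixed(2)}`); // $0.32
console.log(`Intent detection: $${intentTotal.toFixed(2)}`);     // $0.02
console.log(`Savings:          ${savingsPct.toFixed(0)}%`);      // 76%
```

Changing `articleCount` or `messagesPerArticle` shows how the totals scale for other usage levels.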

🎯 The Big Picture: Agentic Experience

User Journey Analysis

Current Flow (Fragmented):

1. User opens editor
2. User manually switches to Chat mode
3. User types message
4. Agent responds
5. User types more
6. Agent responds
7. User manually switches to Planning mode
8. User types "create outline"
9. Outline generated
10. User manually clicks "Start Writing"
11. Article generated

Problems:

Too many manual mode switches
User must know when to switch
No guidance on next steps
Friction in workflow

Proposed Agentic Flow (Seamless):

1. User opens editor (any mode)
2. User types: "Saya ingin menulis tentang keamanan WordPress"
3. Agent responds with suggestions
4. User types: "Fokus pada plugin vulnerabilities"
5. Agent responds with refined ideas
6. 💡 UI shows: [📝 Ready to create outline?] (AI-detected intent)
7. User clicks button (or types "yes" or "buatkan outline")
8. ✨ AI summarizes chat history (0.1 seconds)
9. Outline generated with clean context
10. 💡 UI shows: [✍️ Start Writing] (auto-suggested)
11. User clicks
12. Article generated

Improvements:

✅ No manual mode switching needed
✅ AI suggests next steps proactively
✅ Context automatically optimized
✅ Smooth, guided experience
✅ Works in any language

🔧 Implementation Design

Backend: New Actions

/**
 * Handle context summarization request.
 *
 * @param WP_REST_Request $request REST request.
 * @return WP_REST_Response|WP_Error Response.
 */
public function handle_summarize_context( $request ) {
    $params = $request->get_json_params();
    $chat_history = $params['chatHistory'] ?? array();
    
    if ( ! is_array( $chat_history ) || count( $chat_history ) <= 4 ) {
        // Short history (4 messages or fewer): send it in full instead of summarizing
        return new WP_REST_Response(
            array(
                'summary' => '',
                'use_full_history' => true,
            ),
            200
        );
    }
    
    // Build summarization prompt
    $history_text = '';
    foreach ( $chat_history as $msg ) {
        $role = ucfirst( $msg['role'] ?? 'Unknown' );
        $content = $msg['content'] ?? '';
        $history_text .= "{$role}: {$content}\n\n";
    }
    
    $prompt = "Summarize this conversation into key points that capture the user's intent and requirements.

Focus on:
- Main topic
- Specific focus areas
- Rejected/excluded topics
- User preferences (tone, audience, etc.)

Keep the summary concise (max 200 words) but preserve critical context.
Write in the same language as the conversation.

Output format:
TOPIC: [main topic]
FOCUS: [what to include]
EXCLUDE: [what to avoid]
PREFERENCES: [any specific requirements]

Conversation:
{$history_text}";
    
    $provider = WP_Agentic_Writer_OpenRouter_Provider::get_instance();
    $messages = array(
        array(
            'role' => 'user',
            'content' => $prompt,
        ),
    );
    
    // Use cheap model for summarization
    $response = $provider->chat( $messages, array(), 'summarize' );
    
    if ( is_wp_error( $response ) ) {
        return $response;
    }
    
    // Track cost
    do_action(
        'wp_aw_after_api_request',
        $params['postId'] ?? 0,
        $response['model'] ?? '',
        'summarize_context',
        $response['input_tokens'] ?? 0,
        $response['output_tokens'] ?? 0,
        $response['cost'] ?? 0
    );
    
    return new WP_REST_Response(
        array(
            'summary' => $response['content'] ?? '',
            'use_full_history' => false,
            'cost' => $response['cost'] ?? 0,
        ),
        200
    );
}

/**
 * Handle intent detection request.
 *
 * @param WP_REST_Request $request REST request.
 * @return WP_REST_Response|WP_Error Response.
 */
public function handle_detect_intent( $request ) {
    $params = $request->get_json_params();
    $last_message = $params['lastMessage'] ?? '';
    $has_plan = $params['hasPlan'] ?? false;
    $current_mode = $params['currentMode'] ?? 'chat';
    
    if ( empty( $last_message ) ) {
        return new WP_REST_Response(
            array( 'intent' => 'continue_chat' ),
            200
        );
    }
    
    $prompt = "Based on the user's message, determine their intent. Choose ONE:

1. \"create_outline\" - User wants to create an article outline/structure
2. \"start_writing\" - User wants to write the full article
3. \"refine_content\" - User wants to improve existing content
4. \"continue_chat\" - User wants to continue discussing/exploring
5. \"clarify\" - User is asking questions or needs clarification

Consider:
- The user's explicit request
- Whether they have an outline already (has_plan: " . ( $has_plan ? 'true' : 'false' ) . ")
- Current mode (current_mode: {$current_mode})

User's message: \"{$last_message}\"

Respond with ONLY the intent code (e.g., \"create_outline\").";
    
    $provider = WP_Agentic_Writer_OpenRouter_Provider::get_instance();
    $messages = array(
        array(
            'role' => 'user',
            'content' => $prompt,
        ),
    );
    
    $response = $provider->chat( $messages, array(), 'intent_detection' );
    
    if ( is_wp_error( $response ) ) {
        return $response;
    }
    
    // Track cost
    do_action(
        'wp_aw_after_api_request',
        $params['postId'] ?? 0,
        $response['model'] ?? '',
        'detect_intent',
        $response['input_tokens'] ?? 0,
        $response['output_tokens'] ?? 0,
        $response['cost'] ?? 0
    );
    
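    // Normalize the reply: the model may wrap the code in quotes or add punctuation.
    $intent = strtolower( trim( $response['content'] ?? '', " \t\n\r\0\x0B\"'`." ) );

    // Only accept the intent codes listed in the prompt; fall back to continue_chat.
    $valid_intents = array( 'create_outline', 'start_writing', 'refine_content', 'continue_chat', 'clarify' );
    if ( ! in_array( $intent, $valid_intents, true ) ) {
        $intent = 'continue_chat';
    }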
    
    return new WP_REST_Response(
        array(
            'intent' => $intent,
            'cost' => $response['cost'] ?? 0,
        ),
        200
    );
}

Frontend: Agentic UX

// Auto-detect intent after each message
const handleMessageSent = async (userMessage) => {
    // Send message to chat
    const chatResponse = await sendChatMessage(userMessage);
    
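    // Detect intent after the chat reply; a failure here must not break the chat
    try {
        const intentResponse = await fetch('/wp-json/wp-agentic-writer/v1/detect-intent', {
            method: 'POST',
            // WordPress only parses the body as JSON when this header is set.
            // Authenticated routes also need an X-WP-Nonce header.
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                lastMessage: userMessage,
                hasPlan: !!currentPlan,
                currentMode: agentMode,
                postId: postId
            })
        });

        if (!intentResponse.ok) {
            throw new Error(`Intent detection failed: HTTP ${intentResponse.status}`);
        }

        const { intent } = await intentResponse.json();

        // Show contextual action based on intent
        setDetectedIntent(intent);
        showContextualAction(intent);
    } catch (error) {
        console.warn(error);
    }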
};

// Render contextual action buttons
const renderContextualAction = () => {
    if (!detectedIntent) return null;
    
    switch (detectedIntent) {
        case 'create_outline':
            return (
                <div className="contextual-action">
                    <p>💡 Ready to create an outline?</p>
                    <button onClick={handleCreateOutlineWithSummary} className="primary">
                        📝 Create Outline
                    </button>
                </div>
            );
        
        case 'start_writing':
            if (!currentPlan) {
                return (
                    <div className="contextual-action">
                        <p>⚠️ You need an outline first</p>
                        <button onClick={handleCreateOutlineWithSummary}>
                            📝 Create Outline First
                        </button>
                    </div>
                );
            }
            return (
                <div className="contextual-action">
                    <p>💡 Ready to write the article?</p>
                    <button onClick={handleStartWriting} className="primary">
                        ✍️ Start Writing
                    </button>
                </div>
            );
        
        case 'refine_content':
            return (
                <div className="contextual-action">
                    <p>💡 Use @block to refine specific sections</p>
                </div>
            );
        
        default:
            return null;
    }
};

// Create outline with AI summarization
const handleCreateOutlineWithSummary = async () => {
    setIsLoading(true);
    
    // Step 1: Summarize chat history if needed
    let contextToSend = messages;
    
    if (messages.length > 4) {
        showStatus('Optimizing context...');
        
        const summaryResponse = await fetch('/wp-json/wp-agentic-writer/v1/summarize-context', {
            method: 'POST',
            // Required so WordPress parses the body as JSON
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                chatHistory: messages,
                postId: postId
            })
        });
        
        const { summary, use_full_history, cost } = await summaryResponse.json();
        
        if (!use_full_history && summary) {
            // Use summarized context
            contextToSend = [
                {
                    role: 'system',
                    content: `Context Summary:\n${summary}`
                },
                ...messages.slice(-2) // Keep last exchange
            ];
            
            console.log('Context optimized. Cost:', cost);
        }
    }
    
    // Step 2: Generate outline with optimized context
    showStatus('Creating outline...');
    
    const outlineResponse = await fetch('/wp-json/wp-agentic-writer/v1/generate-plan', {
        method: 'POST',
        // Required so WordPress parses the body as JSON
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            topic: extractTopic(messages),
            chatHistory: contextToSend,
            postId: postId,
            postConfig: postConfig,
            stream: true
        })
    });
    
    // Handle streaming response...
};
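
The context-selection step inside `handleCreateOutlineWithSummary` can also be pulled out into a pure function, which makes the cutoff easy to unit-test without network calls. This is a sketch only: `buildOutlineContext` and its `threshold` and `keepLast` options are hypothetical names, and the cutoff is passed in rather than hardcoded.

```javascript
// Decide which chat context to send with the outline request.
// Returns the full history when it is short or no summary is available.
function buildOutlineContext(messages, summary, { threshold, keepLast = 2 }) {
    if (messages.length <= threshold || !summary) {
        return messages;
    }
    // slice(-0) would return everything, so guard the zero case explicitly.
    const recent = keepLast > 0 ? messages.slice(-keepLast) : [];
    return [
        { role: 'system', content: `Context Summary:\n${summary}` },
        ...recent,
    ];
}

// Nine alternating messages, as in the cost scenario.
const sampleHistory = Array.from({ length: 9 }, (_, i) => ({
    role: i % 2 === 0 ? 'user' : 'assistant',
    content: `message ${i + 1}`,
}));

const optimized = buildOutlineContext(sampleHistory, 'TOPIC: WordPress security', { threshold: 4 });
console.log(optimized.length); // 3: the summary plus the last two messages
```

Keeping the last exchange verbatim preserves the user's most recent wording, which a summary might paraphrase away.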

🎨 UX Enhancements

1. Contextual Action Cards

┌─────────────────────────────────────────────────────┐
│  Agent: "I can help you create a comprehensive      │
│  outline for WordPress plugin security..."          │
├─────────────────────────────────────────────────────┤
│  💡 Detected Intent: Create Outline                 │
│                                                      │
│  [📝 Create Outline]  [💬 Continue Discussing]      │
│                                                      │
│  💰 Context will be optimized (~$0.0001)            │
└─────────────────────────────────────────────────────┘

2. Context Optimization Indicator

┌─────────────────────────────────────────────────────┐
│  ⚡ Optimizing context...                           │
│  • 9 messages → Summary (200 words)                 │
│  • Token reduction: 4,365 → 450 (90%)               │
│  • Cost: $0.0001                                    │
│  ✓ Done in 0.2s                                     │
└─────────────────────────────────────────────────────┘
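
The reduction line in the indicator is simple arithmetic on the token counts before and after summarization. A minimal sketch, assuming a hypothetical `describeTokenReduction` helper that receives both counts from the cost-tracking data:

```javascript
// Format the "Token reduction" line shown in the indicator.
function describeTokenReduction(tokensBefore, tokensAfter) {
    // Avoid dividing by zero when there was no history to begin with.
    const pct = tokensBefore > 0
        ? Math.round((1 - tokensAfter / tokensBefore) * 100)
        : 0;
    const fmt = (n) => n.toLocaleString('en-US');
    return `Token reduction: ${fmt(tokensBefore)} → ${fmt(tokensAfter)} (${pct}%)`;
}

console.log(describeTokenReduction(4365, 450));
// Token reduction: 4,365 → 450 (90%)
```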

3. Smart Mode Transitions

User in Chat mode types: "buatkan outline-nya"

┌─────────────────────────────────────────────────────┐
│  💡 Switching to Planning mode...                   │
│  • Detected intent: Create outline                  │
│  • Optimizing 7 messages of context                 │
│  • Generating outline...                            │
└─────────────────────────────────────────────────────┘

[Outline appears]

┌─────────────────────────────────────────────────────┐
│  ✨ Outline ready!                                   │
│                                                      │
│  Next step:                                          │
│  [✍️ Start Writing Article]                         │
└─────────────────────────────────────────────────────┘

📊 Decision Matrix: When to Use What?

Situation	Recommended Approach	Reason
Chat history ≤ 4 messages	Send full history	Short enough, no optimization needed
Chat history 5-8 messages	AI summarization	Balance cost and quality
Chat history > 8 messages	AI summarization + last 2	Keep recent context verbatim
User switches modes	Detect intent	Guide user to next action
Before outline generation	Summarize context	Clean, focused input
Before article execution	Use plan (no chat history)	Plan already has all context
Block refinement	No chat history	Block content is sufficient
User types "/reset"	Clear all context	Fresh start
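
The matrix translates directly into a small decision function. The sketch below is one way to encode it; `chooseContextStrategy`, the stage names (`chat`, `execution`, `block_refinement`), and the returned strategy labels are all hypothetical and would need to match whatever the plugin uses internally.

```javascript
// Map a situation from the decision matrix to a context strategy.
function chooseContextStrategy({ stage, messageCount, userInput = '' }) {
    if (userInput.trim() === '/reset') return 'clear_context';
    if (stage === 'execution') return 'plan_only';          // plan already has all context
    if (stage === 'block_refinement') return 'no_history';  // block content is sufficient
    if (messageCount <= 4) return 'full_history';
    if (messageCount <= 8) return 'summary';
    return 'summary_plus_last_two';
}

console.log(chooseContextStrategy({ stage: 'chat', messageCount: 3 }));  // full_history
console.log(chooseContextStrategy({ stage: 'chat', messageCount: 9 }));  // summary_plus_last_two
console.log(chooseContextStrategy({ stage: 'execution', messageCount: 9 })); // plan_only
```

Checking the stage before the message count means a long chat never leaks into article execution or block refinement.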

🎯 Recommendation: Hybrid Intelligent Approach

The Winning Strategy

Combine AI-powered + Smart Defaults:

1. Default Behavior (No User Action)
   - Chat history ≤ 4 messages → Send full history
   - Chat history > 4 messages → Auto-summarize with AI
   - Cost: ~$0.0001 per summarization (negligible)
2. Intent Detection (Automatic)
   - After every user message → Detect intent
   - Show contextual action buttons
   - Cost: ~$0.00002 per detection (negligible)
3. User Control (Optional)
   - Settings: "Context Mode" → Auto/Full/Minimal
   - "/reset" command → Clear context
   - Manual selection UI (advanced users)

Why This Works

✅ Language-Agnostic

Works in English, Indonesian, Arabic, Chinese, etc.
No hardcoded keywords

✅ Cost-Effective

76% cost reduction vs full history
Total: ~$0.34/month for 100 articles ($0.32 with summarization + $0.02 for intent detection), versus $1.33 with full history
ROI: Better quality + Lower cost

✅ True Agentic Experience

AI manages its own context
Proactive suggestions
Seamless workflow
No manual mode switching

✅ User-Friendly

Automatic by default
Optional manual control
Transparent (shows what's happening)
Fast (summarization targets 0.1-0.3s; actual latency depends on the model and provider)

🔧 Implementation Plan

Phase 1: Core Infrastructure (Week 1)

Backend:

Add /summarize-context endpoint
Add /detect-intent endpoint
Add summarize and intent_detection operation types to cost tracking
Update OpenRouter provider to support these actions

Frontend:

Add handleSummarizeContext() function
Add handleDetectIntent() function
Add context optimization indicator component

Testing:

Test summarization in English, Indonesian, Arabic
Test intent detection in multiple languages
Verify cost tracking

Phase 2: UX Integration (Week 2)

Frontend:

Add contextual action cards
Auto-detect intent after each message
Show "Optimizing context..." status
Add smart mode transitions

Settings:

Add "Context Mode" setting (Auto/Full/Minimal)
Add context optimization toggle
Add cost estimates in settings

Testing:

Test full user journey (chat → outline → write)
Test in multiple languages
Verify smooth transitions

Phase 3: Advanced Features (Week 3)

Features:

Add /reset command
Add manual context selection UI (optional)
Add context analytics (token usage, cost breakdown)
Add context caching (reuse summaries)

Optimization:

Implement smart caching for summaries
Add context relevance scoring
Optimize prompt templates
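
One simple way to reuse summaries is to cache them by the exact message list, so an unchanged conversation never triggers a second summarization call. The sketch below is an in-memory illustration only: `createSummaryCache` is a hypothetical helper, the injected `summarize` function stands in for the `/summarize-context` request, and a real implementation would also need eviction and possibly persistence.

```javascript
// Wrap a summarize function with a cache keyed by message roles and contents.
function createSummaryCache(summarize) {
    const cache = new Map();
    return function getSummary(messages) {
        const key = JSON.stringify(messages.map(({ role, content }) => [role, content]));
        if (!cache.has(key)) {
            // Store the promise so concurrent callers share one request.
            const pending = Promise.resolve(summarize(messages)).catch((error) => {
                cache.delete(key); // do not keep failed requests in the cache
                throw error;
            });
            cache.set(key, pending);
        }
        return cache.get(key);
    };
}

// Usage with a stub in place of the real API call.
let apiCalls = 0;
const getSummary = createSummaryCache(async () => {
    apiCalls += 1;
    return 'TOPIC: WordPress security';
});

const conversation = [{ role: 'user', content: 'Fokus pada plugin vulnerabilities saja' }];
getSummary(conversation);
getSummary(conversation); // served from the cache
console.log(apiCalls); // 1
```

Any new message changes the key, so the cache never returns a summary that is missing recent context.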

Documentation:

Update user guide
Add context management section
Document cost implications

💰 Final Cost Analysis

Per Article (Average)

Component	Calculation	Cost per article
Intent detection	$0.00002 × 10 messages	= $0.0002
Context summarization	$0.0001 × 1 time	= $0.0001
Planning (with summary)	$0.003	= $0.003
Execution (no history)	$0.50-$2.00	= $1.00 avg
Total per article		≈ $1.0033

Compared to Full History:

Full history approach: $1.013 per article
AI-powered approach: $1.0033 per article
Savings: $0.01 per article (negligible)

But wait - the real benefit:

✅ Better quality (clean, focused context)
✅ Language-agnostic (works everywhere)
✅ Better UX (proactive suggestions)
✅ Scalable (no hardcoded rules)

🎯 Answer to Your Question

"Send all chat history vs AI summarization?"

Answer: AI Summarization is Better

Reasons:

1. Cost is Nearly Identical
   - Full history: $1.013/article
   - AI summary: $1.0033/article
   - Difference: $0.01 (1% savings)
2. Quality is Better
   - Summary removes contradicted ideas
   - Summary focuses on final intent
   - Summary prevents pollution
   - AI explicitly told what to focus on
3. Language Support
   - Full history: Works in all languages ✅
   - AI summary: Works in all languages ✅
   - Hardcoded: Only English ❌
4. Agentic Experience
   - AI managing AI context = true agentic
   - Proactive intent detection
   - Seamless workflow
   - No user friction
5. Scalability
   - No hardcoded rules to maintain
   - Adapts to new languages automatically
   - Handles edge cases gracefully

The Plan:

✅ Implement AI summarization (not hardcoded)
✅ Implement AI intent detection (not hardcoded)
✅ Make it automatic (no user action needed)
✅ Add user controls (optional override)
✅ Track costs transparently (show user what's happening)

Status: 🚀 READY TO IMPLEMENT
Approach: AI-Powered (Agentic)
Cost Impact: Negligible (~$0.34/month for 100 articles, versus $1.33 with full history)
Quality Impact: Significant improvement
UX Impact: Seamless, guided experience
