wp-agentic-writer/AGENTIC_CONTEXT_STRATEGY.md
2026-01-28 00:26:00 +07:00


Agentic Context Management Strategy

Date: January 25, 2026
Version: 0.1.3+
Purpose: AI-powered context management for multilingual, intelligent user experience


🎯 Core Philosophy: Let AI Handle AI Context

The Problem with Hardcoded Solutions

Previous Approach (FLAWED):

// ❌ English-only, brittle, not scalable
if (content.includes('outline') || content.includes('structure')) {
    return 'create_outline';
}

Issues:

  • Only works in English
  • Breaks in Indonesian, Arabic, Chinese, etc.
  • Misses nuanced intent
  • Requires constant maintenance
  • Goes against "agentic" philosophy

Agentic Principle

"If AI is smart enough to write articles, it's smart enough to manage its own context."

New Approach:

  • Use AI to summarize chat history
  • Use AI to detect user intent
  • Language-agnostic (works in any language)
  • Adapts to context automatically
  • True "agentic" experience

💰 Cost Analysis: AI-Powered Context Management

Your Cost Reference

Action: meta_description
Model: deepseek-chat-v3-032
Tokens: 510
Cost: $0.0001

This is EXTREMELY cheap! Let's use this model for context operations.

Proposed Actions

1. Action: summarize_context

Purpose: Condense long chat history into key points

Input:

{
    "action": "summarize_context",
    "chat_history": [
        {"role": "user", "content": "Saya ingin menulis tentang keamanan WordPress"},
        {"role": "assistant", "content": "[Long response in Indonesian...]"},
        {"role": "user", "content": "Fokus pada plugin vulnerabilities saja"},
        {"role": "assistant", "content": "[Detailed plugin security response...]"}
    ]
}

Prompt:

Summarize this conversation into key points that capture the user's intent and requirements. 
Focus on:
- Main topic
- Specific focus areas
- Rejected/excluded topics
- User preferences (tone, audience, etc.)

Keep the summary concise (max 200 words) but preserve critical context.
Write in the same language as the conversation.

Output format:
TOPIC: [main topic]
FOCUS: [what to include]
EXCLUDE: [what to avoid]
PREFERENCES: [any specific requirements]

Expected Output:

TOPIC: WordPress security
FOCUS: Plugin vulnerabilities only
EXCLUDE: Performance optimization, backup strategies (user rejected these)
PREFERENCES: Technical audience, detailed explanations
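Because the summary uses fixed labels, the caller can turn it into structured data before passing it on. The sketch below is one possible way to do that; `parseSummary` and the lowercase field names are hypothetical, not part of the plugin.

```javascript
// Hypothetical helper: parse the labelled summary text into an object.
// The four labels come from the "Output format" section of the prompt.
const SUMMARY_FIELDS = ['TOPIC', 'FOCUS', 'EXCLUDE', 'PREFERENCES'];

function parseSummary(text) {
    const result = { topic: '', focus: '', exclude: '', preferences: '' };
    let current = null;

    for (const rawLine of String(text).split(/\r?\n/)) {
        const line = rawLine.trim();
        if (!line) continue;

        const match = line.match(/^([A-Z]+):\s*(.*)$/);
        if (match && SUMMARY_FIELDS.includes(match[1])) {
            // A new labelled field starts on this line.
            current = match[1].toLowerCase();
            result[current] = match[2];
        } else if (current) {
            // Continuation line: the model wrapped a field over several lines.
            result[current] = (result[current] + ' ' + line).trim();
        }
    }
    return result;
}

const parsed = parseSummary(
    'TOPIC: WordPress security\n' +
    'FOCUS: Plugin vulnerabilities only\n' +
    'EXCLUDE: Performance optimization, backup strategies (user rejected these)\n' +
    'PREFERENCES: Technical audience, detailed explanations'
);
console.log(parsed.topic); // prints "WordPress security"
```

Fields the model omits stay as empty strings, so downstream code can treat a missing EXCLUDE line the same as "nothing excluded".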

Cost Estimate:

  • Input: 4,000 tokens (long chat history)
  • Output: 100 tokens (summary)
  • Model: deepseek-chat-v3-032
  • Cost: ~$0.0001 per summarization
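As a sanity check, the estimate can be derived from the single reference data point above (510 tokens for $0.0001). The sketch assumes cost scales linearly with total tokens and that input and output tokens are priced the same; both are simplifications, and `estimateCostUsd` is a hypothetical helper.

```javascript
// Reference data point from "Your Cost Reference": 510 tokens cost $0.0001.
const REFERENCE = { tokens: 510, costUsd: 0.0001 };

// Assumption: cost is linear in total tokens (input + output priced equally).
function estimateCostUsd(inputTokens, outputTokens) {
    const perToken = REFERENCE.costUsd / REFERENCE.tokens;
    return (inputTokens + outputTokens) * perToken;
}

const summarizeCost = estimateCostUsd(4000, 100); // long chat history + summary
const intentCost = estimateCostUsd(100, 5);       // one short message + intent code

console.log(summarizeCost.toFixed(5)); // prints "0.00080"
console.log(intentCost.toFixed(5));    // prints "0.00002"
```

Under that linear assumption, a 4,100-token summarization lands nearer $0.0008 than $0.0001. That is still well under a tenth of a cent per call, so the conclusion that the operation is cheap holds either way.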

When to Use:

  • Chat history > 4 messages (shorter histories are sent in full)
  • Before generating outline
  • Before executing article

2. Action: detect_intent

Purpose: Understand what user wants to do next

Input:

{
    "action": "detect_intent",
    "last_message": "Baiklah, sekarang buatkan outline-nya",
    "has_plan": false,
    "current_mode": "chat"
}

Prompt:

Based on the user's message, determine their intent. Choose ONE:

1. "create_outline" - User wants to create an article outline/structure
2. "start_writing" - User wants to write the full article
3. "refine_content" - User wants to improve existing content
4. "continue_chat" - User wants to continue discussing/exploring
5. "clarify" - User is asking questions or needs clarification

Consider:
- The user's explicit request
- Whether they have an outline already (has_plan: {has_plan})
- Current mode (current_mode: {current_mode})

Respond with ONLY the intent code (e.g., "create_outline").

Expected Output:

create_outline

Cost Estimate:

  • Input: 100 tokens (last message + context)
  • Output: 5 tokens (intent code)
  • Model: deepseek-chat-v3-032
  • Cost: ~$0.00002 per detection

When to Use:

  • After every user message in Chat mode
  • To show contextual action buttons
  • To auto-suggest next steps
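Models do not always follow "respond with ONLY the intent code" exactly; replies may come back quoted, capitalized, or wrapped in a sentence. A minimal client-side guard, assuming the five intent codes listed in the prompt and `continue_chat` as the safe default, might look like this (`normalizeIntent` is a hypothetical name):

```javascript
// The five intent codes defined in the detect_intent prompt.
const KNOWN_INTENTS = [
    'create_outline',
    'start_writing',
    'refine_content',
    'continue_chat',
    'clarify',
];

// Hypothetical helper: clean up the raw model reply and fall back to
// "continue_chat" when it is not one of the known codes.
function normalizeIntent(raw) {
    const cleaned = String(raw ?? '')
        .trim()
        .toLowerCase()
        .replace(/^["'`\s]+|["'`.\s]+$/g, ''); // strip wrapping quotes and a trailing period
    return KNOWN_INTENTS.includes(cleaned) ? cleaned : 'continue_chat';
}

console.log(normalizeIntent('"create_outline"'));   // prints "create_outline"
console.log(normalizeIntent('Sure! An outline.'));  // prints "continue_chat"
```

Falling back to `continue_chat` means an unparseable reply simply shows no action button rather than a wrong one.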

📊 Cost Comparison: Full History vs AI-Powered

Scenario: 5 Agent + 4 Human Messages

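| Approach | Input Tokens | Output Tokens | Cost per Request | Quality | Language Support |
|----------|--------------|---------------|------------------|---------|------------------|
| Full History | 4,365 | 0 | $0.013 (Claude) | Best | All |
| AI Summarization | 100 (summary) | 0 | $0.003 (Claude) + $0.0001 (summary) | Good | All |
| Hardcoded Pruning | 1,800 | 0 | $0.005 (Claude) | ⚠️ Fair | English only |
| No Context | 0 | 0 | $0.000 | Poor | All |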

Cost Breakdown for 100 Articles/Month

Full History:

  • Planning: 100 × $0.00033 = $0.033
  • Execution: 100 × $0.013 = $1.30
  • Total: $1.33/month

AI Summarization:

  • Summarization: 100 × $0.0001 = $0.01
  • Planning: 100 × $0.00008 = $0.008
  • Execution: 100 × $0.003 = $0.30
  • Total: $0.32/month
  • Savings: $1.01/month (76% reduction)

Intent Detection:

  • Per message: $0.00002
  • Average 10 messages per article: 100 × 10 × $0.00002 = $0.02
  • Total: $0.02/month (negligible)
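The monthly figures above can be reproduced with a few lines of arithmetic. This sketch only re-adds the per-article numbers quoted in the breakdown; the variable names are illustrative.

```javascript
const ARTICLES_PER_MONTH = 100;

// Per-article costs in USD, copied from the breakdown above.
const perArticle = {
    fullHistory: { planning: 0.00033, execution: 0.013 },
    aiSummary: { summarization: 0.0001, planning: 0.00008, execution: 0.003 },
};
const intentPerMessage = 0.00002;
const messagesPerArticle = 10;

// Sum the per-article components and scale to a month.
function monthlyTotal(costs, articles) {
    const perArticleSum = Object.values(costs).reduce((sum, c) => sum + c, 0);
    return perArticleSum * articles;
}

const full = monthlyTotal(perArticle.fullHistory, ARTICLES_PER_MONTH);
const summarized = monthlyTotal(perArticle.aiSummary, ARTICLES_PER_MONTH);
const intent = intentPerMessage * messagesPerArticle * ARTICLES_PER_MONTH;
const aiTotal = summarized + intent;
const savingsPct = ((full - summarized) / full) * 100;

console.log(
    full.toFixed(2),
    summarized.toFixed(2),
    intent.toFixed(2),
    aiTotal.toFixed(2),
    savingsPct.toFixed(0)
); // prints: 1.33 0.32 0.02 0.34 76
```

The $0.34 value is the AI-powered approach's whole monthly context cost (summarized context plus intent detection), to be compared against $1.33 for full history.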

🎯 The Big Picture: Agentic Experience

User Journey Analysis

Current Flow (Fragmented):

1. User opens editor
2. User manually switches to Chat mode
3. User types message
4. Agent responds
5. User types more
6. Agent responds
7. User manually switches to Planning mode
8. User types "create outline"
9. Outline generated
10. User manually clicks "Start Writing"
11. Article generated

Problems:

  • Too many manual mode switches
  • User must know when to switch
  • No guidance on next steps
  • Friction in workflow

Proposed Agentic Flow (Seamless):

1. User opens editor (any mode)
2. User types: "Saya ingin menulis tentang keamanan WordPress"
3. Agent responds with suggestions
4. User types: "Fokus pada plugin vulnerabilities"
5. Agent responds with refined ideas
6. 💡 UI shows: [📝 Ready to create outline?] (AI-detected intent)
7. User clicks button (or types "yes" or "buatkan outline")
8. ✨ AI summarizes chat history (0.1 seconds)
9. Outline generated with clean context
10. 💡 UI shows: [✍️ Start Writing] (auto-suggested)
11. User clicks
12. Article generated

Improvements:

  • No manual mode switching needed
  • AI suggests next steps proactively
  • Context automatically optimized
  • Smooth, guided experience
  • Works in any language

🔧 Implementation Design

Backend: New Actions

/**
 * Handle context summarization request.
 *
 * @param WP_REST_Request $request REST request.
 * @return WP_REST_Response|WP_Error Response.
 */
public function handle_summarize_context( $request ) {
    $params = $request->get_json_params();
    $chat_history = $params['chatHistory'] ?? array();
    
    if ( ! is_array( $chat_history ) || count( $chat_history ) <= 4 ) {
        // Short history (4 messages or fewer): send it in full, no summarization needed
        return new WP_REST_Response(
            array(
                'summary' => '',
                'use_full_history' => true,
            ),
            200
        );
    }
    
    // Build summarization prompt
    $history_text = '';
    foreach ( $chat_history as $msg ) {
        $role = ucfirst( $msg['role'] ?? 'Unknown' );
        $content = $msg['content'] ?? '';
        $history_text .= "{$role}: {$content}\n\n";
    }
    
    $prompt = "Summarize this conversation into key points that capture the user's intent and requirements.

Focus on:
- Main topic
- Specific focus areas
- Rejected/excluded topics
- User preferences (tone, audience, etc.)

Keep the summary concise (max 200 words) but preserve critical context.
Write in the same language as the conversation.

Output format:
TOPIC: [main topic]
FOCUS: [what to include]
EXCLUDE: [what to avoid]
PREFERENCES: [any specific requirements]

Conversation:
{$history_text}";
    
    $provider = WP_Agentic_Writer_OpenRouter_Provider::get_instance();
    $messages = array(
        array(
            'role' => 'user',
            'content' => $prompt,
        ),
    );
    
    // Use cheap model for summarization
    $response = $provider->chat( $messages, array(), 'summarize' );
    
    if ( is_wp_error( $response ) ) {
        return $response;
    }
    
    // Track cost
    do_action(
        'wp_aw_after_api_request',
        $params['postId'] ?? 0,
        $response['model'] ?? '',
        'summarize_context',
        $response['input_tokens'] ?? 0,
        $response['output_tokens'] ?? 0,
        $response['cost'] ?? 0
    );
    
    return new WP_REST_Response(
        array(
            'summary' => $response['content'] ?? '',
            'use_full_history' => false,
            'cost' => $response['cost'] ?? 0,
        ),
        200
    );
}

/**
 * Handle intent detection request.
 *
 * @param WP_REST_Request $request REST request.
 * @return WP_REST_Response|WP_Error Response.
 */
public function handle_detect_intent( $request ) {
    $params = $request->get_json_params();
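    // Sanitize client-supplied values before interpolating them into the prompt
    $last_message = sanitize_textarea_field( $params['lastMessage'] ?? '' );
    $has_plan     = ! empty( $params['hasPlan'] );
    $current_mode = sanitize_key( $params['currentMode'] ?? 'chat' );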
    
    if ( empty( $last_message ) ) {
        return new WP_REST_Response(
            array( 'intent' => 'continue_chat' ),
            200
        );
    }
    
    $prompt = "Based on the user's message, determine their intent. Choose ONE:

1. \"create_outline\" - User wants to create an article outline/structure
2. \"start_writing\" - User wants to write the full article
3. \"refine_content\" - User wants to improve existing content
4. \"continue_chat\" - User wants to continue discussing/exploring
5. \"clarify\" - User is asking questions or needs clarification

Consider:
- The user's explicit request
- Whether they have an outline already (has_plan: " . ( $has_plan ? 'true' : 'false' ) . ")
- Current mode (current_mode: {$current_mode})

User's message: \"{$last_message}\"

Respond with ONLY the intent code (e.g., \"create_outline\").";
    
    $provider = WP_Agentic_Writer_OpenRouter_Provider::get_instance();
    $messages = array(
        array(
            'role' => 'user',
            'content' => $prompt,
        ),
    );
    
    $response = $provider->chat( $messages, array(), 'intent_detection' );
    
    if ( is_wp_error( $response ) ) {
        return $response;
    }
    
    // Track cost
    do_action(
        'wp_aw_after_api_request',
        $params['postId'] ?? 0,
        $response['model'] ?? '',
        'detect_intent',
        $response['input_tokens'] ?? 0,
        $response['output_tokens'] ?? 0,
        $response['cost'] ?? 0
    );
    
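    // Normalize the reply (models often add quotes or a trailing period) and
    // fall back to a safe default when it is not one of the known intent codes
    $allowed_intents = array( 'create_outline', 'start_writing', 'refine_content', 'continue_chat', 'clarify' );
    $intent          = strtolower( trim( $response['content'] ?? '', " \t\n\r\0\x0B\"'`." ) );
    if ( ! in_array( $intent, $allowed_intents, true ) ) {
        $intent = 'continue_chat';
    }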
    
    return new WP_REST_Response(
        array(
            'intent' => $intent,
            'cost' => $response['cost'] ?? 0,
        ),
        200
    );
}

Frontend: Agentic UX

// Auto-detect intent after each message
const handleMessageSent = async (userMessage) => {
    // Send message to chat
    const chatResponse = await sendChatMessage(userMessage);
    
    // Detect intent in background
    const intentResponse = await fetch('/wp-json/wp-agentic-writer/v1/detect-intent', {
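        method: 'POST',
        // get_json_params() only parses bodies declared as JSON
        headers: { 'Content-Type': 'application/json' },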
        body: JSON.stringify({
            lastMessage: userMessage,
            hasPlan: !!currentPlan,
            currentMode: agentMode,
            postId: postId
        })
    });
    
    const { intent } = await intentResponse.json();
    
    // Show contextual action based on intent
    setDetectedIntent(intent);
    showContextualAction(intent);
};

// Render contextual action buttons
const renderContextualAction = () => {
    if (!detectedIntent) return null;
    
    switch (detectedIntent) {
        case 'create_outline':
            return (
                <div className="contextual-action">
                    <p>💡 Ready to create an outline?</p>
                    <button onClick={handleCreateOutlineWithSummary} className="primary">
                        📝 Create Outline
                    </button>
                </div>
            );
        
        case 'start_writing':
            if (!currentPlan) {
                return (
                    <div className="contextual-action">
                        <p>⚠️ You need an outline first</p>
                        <button onClick={handleCreateOutlineWithSummary}>
                            📝 Create Outline First
                        </button>
                    </div>
                );
            }
            return (
                <div className="contextual-action">
                    <p>💡 Ready to write the article?</p>
                    <button onClick={handleStartWriting} className="primary">
                        ✍️ Start Writing
                    </button>
                </div>
            );
        
        case 'refine_content':
            return (
                <div className="contextual-action">
                    <p>💡 Use @block to refine specific sections</p>
                </div>
            );
        
        default:
            return null;
    }
};

// Create outline with AI summarization
const handleCreateOutlineWithSummary = async () => {
    setIsLoading(true);
    
    // Step 1: Summarize chat history if needed
    let contextToSend = messages;
    
    if (messages.length > 4) { // histories of 4 messages or fewer are sent in full
        showStatus('Optimizing context...');
        
        const summaryResponse = await fetch('/wp-json/wp-agentic-writer/v1/summarize-context', {
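            method: 'POST',
            // get_json_params() only parses bodies declared as JSON
            headers: { 'Content-Type': 'application/json' },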
            body: JSON.stringify({
                chatHistory: messages,
                postId: postId
            })
        });
        
        const { summary, use_full_history, cost } = await summaryResponse.json();
        
        if (!use_full_history && summary) {
            // Use summarized context
            contextToSend = [
                {
                    role: 'system',
                    content: `Context Summary:\n${summary}`
                },
                ...messages.slice(-2) // Keep last exchange
            ];
            
            console.log('Context optimized. Cost:', cost);
        }
    }
    
    // Step 2: Generate outline with optimized context
    showStatus('Creating outline...');
    
    const outlineResponse = await fetch('/wp-json/wp-agentic-writer/v1/generate-plan', {
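        method: 'POST',
        // get_json_params() only parses bodies declared as JSON
        headers: { 'Content-Type': 'application/json' },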
        body: JSON.stringify({
            topic: extractTopic(messages),
            chatHistory: contextToSend,
            postId: postId,
            postConfig: postConfig,
            stream: true
        })
    });
    
    // Handle streaming response...
};
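The three REST calls above share the same request shape, so a small wrapper could keep them consistent. The sketch below is an assumption-laden example rather than the plugin's actual code: `buildJsonRequest` and `postJson` are hypothetical names, and the nonce is assumed to be supplied by the page (WordPress cookie authentication reads it from the `X-WP-Nonce` header).

```javascript
// Hypothetical helper: build fetch options for a JSON POST to the REST API.
function buildJsonRequest(payload, nonce) {
    // The JSON content type is what lets get_json_params() parse the body.
    const headers = { 'Content-Type': 'application/json' };
    if (nonce) {
        headers['X-WP-Nonce'] = nonce; // only needed for cookie-authenticated requests
    }
    return { method: 'POST', headers, body: JSON.stringify(payload) };
}

// Hypothetical wrapper: POST JSON and reject on non-2xx responses.
// fetchImpl is injectable so the function can be exercised without a network.
async function postJson(url, payload, { nonce, fetchImpl = fetch } = {}) {
    const response = await fetchImpl(url, buildJsonRequest(payload, nonce));
    if (!response.ok) {
        throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
}

const exampleRequest = buildJsonRequest({ lastMessage: 'buatkan outline-nya' }, 'abc123');
console.log(exampleRequest.headers['Content-Type']); // prints "application/json"
```

With a wrapper like this, `handleMessageSent` could call `postJson('/wp-json/wp-agentic-writer/v1/detect-intent', {...})` inside a `try`/`catch` and treat any failure as `continue_chat`.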

🎨 UX Enhancements

1. Contextual Action Cards

┌─────────────────────────────────────────────────────┐
│  Agent: "I can help you create a comprehensive      │
│  outline for WordPress plugin security..."          │
├─────────────────────────────────────────────────────┤
│  💡 Detected Intent: Create Outline                 │
│                                                      │
│  [📝 Create Outline]  [💬 Continue Discussing]      │
│                                                      │
│  💰 Context will be optimized (~$0.0001)            │
└─────────────────────────────────────────────────────┘

2. Context Optimization Indicator

┌─────────────────────────────────────────────────────┐
│  ⚡ Optimizing context...                           │
│  • 9 messages → Summary (200 words)                 │
│  • Token reduction: 4,365 → 450 (90%)               │
│  • Cost: $0.0001                                    │
│  ✓ Done in 0.2s                                     │
└─────────────────────────────────────────────────────┘

3. Smart Mode Transitions

User in Chat mode types: "buatkan outline-nya"

┌─────────────────────────────────────────────────────┐
│  💡 Switching to Planning mode...                   │
│  • Detected intent: Create outline                  │
│  • Optimizing 7 messages of context                 │
│  • Generating outline...                            │
└─────────────────────────────────────────────────────┘

[Outline appears]

┌─────────────────────────────────────────────────────┐
│  ✨ Outline ready!                                   │
│                                                      │
│  Next step:                                          │
│  [✍️ Start Writing Article]                         │
└─────────────────────────────────────────────────────┘

📊 Decision Matrix: When to Use What?

| Situation | Recommended Approach | Reason |
|-----------|----------------------|--------|
| Chat history ≤ 4 messages | Send full history | Short enough, no optimization needed |
| Chat history 5-8 messages | AI summarization | Balance cost and quality |
| Chat history > 8 messages | AI summarization + last 2 | Keep recent context verbatim |
| User switches modes | Detect intent | Guide user to next action |
| Before outline generation | Summarize context | Clean, focused input |
| Before article execution | Use plan (no chat history) | Plan already has all context |
| Block refinement | No chat history | Block content is sufficient |
| User types "/reset" | Clear all context | Fresh start |
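The matrix translates fairly directly into a lookup function. In the sketch below, the stage names (`outline`, `execution`, `block_refinement`) and the returned strategy labels are hypothetical identifiers chosen for illustration; only the thresholds come from the matrix.

```javascript
// Hypothetical helper: pick a context strategy from the decision matrix.
function chooseContextStrategy({ messageCount, stage = 'outline', resetRequested = false }) {
    if (resetRequested) {
        return { strategy: 'clear', keepLast: 0 }; // "/reset": fresh start
    }
    if (stage === 'execution') {
        return { strategy: 'plan_only', keepLast: 0 }; // plan already has all context
    }
    if (stage === 'block_refinement') {
        return { strategy: 'none', keepLast: 0 }; // block content is sufficient
    }
    if (messageCount <= 4) {
        return { strategy: 'full_history', keepLast: messageCount };
    }
    if (messageCount <= 8) {
        return { strategy: 'summarize', keepLast: 0 };
    }
    // Long histories: summary plus the last two messages verbatim.
    return { strategy: 'summarize', keepLast: 2 };
}

console.log(chooseContextStrategy({ messageCount: 9 }));
// prints { strategy: 'summarize', keepLast: 2 }
```

Keeping the thresholds in one function makes it harder for the frontend and backend to drift apart on when summarization kicks in.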

🎯 Recommendation: Hybrid Intelligent Approach

The Winning Strategy

Combine AI-powered + Smart Defaults:

  1. Default Behavior (No User Action)

    • Chat history ≤ 4 messages → Send full history
    • Chat history > 4 messages → Auto-summarize with AI
    • Cost: ~$0.0001 per summarization (negligible)
  2. Intent Detection (Automatic)

    • After every user message → Detect intent
    • Show contextual action buttons
    • Cost: ~$0.00002 per detection (negligible)
  3. User Control (Optional)

    • Settings: "Context Mode" → Auto/Full/Minimal
    • "/reset" command → Clear context
    • Manual selection UI (advanced users)

Why This Works

Language-Agnostic

  • Works in English, Indonesian, Arabic, Chinese, etc.
  • No hardcoded keywords

Cost-Effective

  • 76% cost reduction vs full history
  • Total context cost: ~$0.34/month for 100 articles ($0.32 summarized context + $0.02 intent detection), down from $1.33 with full history
  • ROI: Better quality + Lower cost

True Agentic Experience

  • AI manages its own context
  • Proactive suggestions
  • Seamless workflow
  • No manual mode switching

User-Friendly

  • Automatic by default
  • Optional manual control
  • Transparent (shows what's happening)
  • Fast (summarization is a single short API call; 0.1-0.3s is the target, not yet a measured figure)

🔧 Implementation Plan

Phase 1: Core Infrastructure (Week 1)

Backend:

  • Add /summarize-context endpoint
  • Add /detect-intent endpoint
  • Add summarize and intent_detection operation types to cost tracking
  • Update OpenRouter provider to support these actions

Frontend:

  • Add handleSummarizeContext() function
  • Add handleDetectIntent() function
  • Add context optimization indicator component

Testing:

  • Test summarization in English, Indonesian, Arabic
  • Test intent detection in multiple languages
  • Verify cost tracking

Phase 2: UX Integration (Week 2)

Frontend:

  • Add contextual action cards
  • Auto-detect intent after each message
  • Show "Optimizing context..." status
  • Add smart mode transitions

Settings:

  • Add "Context Mode" setting (Auto/Full/Minimal)
  • Add context optimization toggle
  • Add cost estimates in settings

Testing:

  • Test full user journey (chat → outline → write)
  • Test in multiple languages
  • Verify smooth transitions

Phase 3: Advanced Features (Week 3)

Features:

  • Add /reset command
  • Add manual context selection UI (optional)
  • Add context analytics (token usage, cost breakdown)
  • Add context caching (reuse summaries)

Optimization:

  • Implement smart caching for summaries
  • Add context relevance scoring
  • Optimize prompt templates
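Summary caching can be as simple as memoizing on the message list. The sketch below is one possible in-memory approach, not a committed design: `createSummaryCache` is a hypothetical name, and the key is the serialized messages themselves (a hash could be substituted to save memory).

```javascript
// Hypothetical helper: wrap a summarize function so identical chat histories
// reuse the earlier result instead of triggering another API call.
function createSummaryCache(summarize) {
    const cache = new Map();

    return function getSummary(messages) {
        // Key on role + content only, so unrelated metadata does not bust the cache.
        const key = JSON.stringify(messages.map((m) => [m.role, m.content]));

        if (!cache.has(key)) {
            // Store the promise so concurrent callers share one request;
            // drop it again on failure so a later call can retry.
            const pending = Promise.resolve(summarize(messages)).catch((err) => {
                cache.delete(key);
                throw err;
            });
            cache.set(key, pending);
        }
        return cache.get(key);
    };
}

// Usage with a stand-in summarizer (a real one would call /summarize-context).
let apiCalls = 0;
const getSummary = createSummaryCache(async (messages) => {
    apiCalls += 1;
    return `TOPIC: ${messages[0].content}`;
});
const history = [{ role: 'user', content: 'WordPress security' }];
getSummary(history);
getSummary(history);
console.log(apiCalls); // prints 1
```

Any new message changes the key, so the cache only helps when the same history is summarized more than once, for example when the user regenerates an outline without chatting further.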

Documentation:

  • Update user guide
  • Add context management section
  • Document cost implications

💰 Final Cost Analysis

Per Article (Average)

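| Component | Unit Cost | Frequency | Subtotal |
|-----------|-----------|-----------|----------|
| Intent detection | $0.00002 | × 10 messages | $0.0002 |
| Context summarization | $0.0001 | × 1 time | $0.0001 |
| Planning (with summary) | $0.003 | × 1 | $0.003 |
| Execution (no history) | $0.50-$2.00 | × 1 | $1.00 avg |
| Total per article | | | ≈ $1.0033 |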

Compared to Full History:

  • Full history approach: $1.013 per article
  • AI-powered approach: $1.0033 per article
  • Savings: $0.01 per article (negligible)

But wait - the real benefit:

  • Better quality (clean, focused context)
  • Language-agnostic (works everywhere)
  • Better UX (proactive suggestions)
  • Scalable (no hardcoded rules)

🎯 Answer to Your Question

"Send all chat history vs AI summarization?"

Answer: AI Summarization is Better

Reasons:

  1. Cost is Nearly Identical

    • Full history: $1.013/article
    • AI summary: $1.0033/article
    • Difference: $0.01 (1% savings)
  2. Quality is Better

    • Summary removes contradicted ideas
    • Summary focuses on final intent
    • Summary keeps abandoned ideas from polluting the prompt
    • AI explicitly told what to focus on
  3. Language Support

    • Full history: Works in all languages
    • AI summary: Works in all languages
    • Hardcoded: Only English
  4. Agentic Experience

    • AI managing AI context = true agentic
    • Proactive intent detection
    • Seamless workflow
    • No user friction
  5. Scalability

    • No hardcoded rules to maintain
    • Adapts to new languages automatically
    • Handles edge cases gracefully

The Plan:

  1. Implement AI summarization (not hardcoded)
  2. Implement AI intent detection (not hardcoded)
  3. Make it automatic (no user action needed)
  4. Add user controls (optional override)
  5. Track costs transparently (show user what's happening)

Status: 🚀 READY TO IMPLEMENT
Approach: AI-Powered (Agentic)
Cost Impact: Negligible (~$0.34/month in context costs for 100 articles, versus $1.33 with full history)
Quality Impact: Significant improvement
UX Impact: Seamless, guided experience