Files
wp-agentic-writer/docs/features/hybrid-local-cloud-ai-provider-b09890.md

15 KiB

Local Backend + Codex Provider System with Cloud Fallback

Implement a provider system allowing text generation via Local Backend (Claude CLI proxy) and Codex API, while keeping image generation on OpenRouter's cloud API.

Architecture Overview (Based on local-backend-feature.md)

Current State:

  • Plugin uses WP_Agentic_Writer_OpenRouter_Provider for all AI tasks
  • All requests go to cloud APIs (OpenRouter)
  • Costs per token, rate limits apply
  • 23+ files directly call provider singleton

New State:

  • Local Backend: User runs Node.js proxy on their machine (Claude CLI)
  • Codex Provider: Direct integration with OpenAI Codex API
  • OpenRouter: Fallback + image generation only
  • Provider Manager: Routes tasks to appropriate provider

Flow:

WordPress Plugin → Provider Manager → Local Backend (http://user-ip:8080)
                                   → Codex API (https://api.openai.com)
                                   → OpenRouter (images + fallback)

Provider Architecture

1. Provider Interface (Common Contract)

interface WP_Agentic_Writer_AI_Provider_Interface {
    public function chat($messages, $options, $type);
    public function chat_stream($messages, $options, $type, $callback);
    public function generate_image($prompt, $model, $options);
    public function is_configured();
    public function test_connection();
    public function supports_task_type($type);
}

2. Provider Manager (Router)

class WP_Agentic_Writer_Provider_Manager {
    public static function get_provider_for_task($type) {
        $settings = get_option('wp_agentic_writer_settings');
        $task_providers = $settings['task_providers'] ?? [];
        
        $provider_name = $task_providers[$type] ?? 'openrouter';
        
        switch ($provider_name) {
            case 'local_backend':
                return new WP_Agentic_Writer_Local_Backend_Provider();
            case 'codex':
                return new WP_Agentic_Writer_Codex_Provider();
            case 'openrouter':
            default:
                return WP_Agentic_Writer_OpenRouter_Provider::get_instance();
        }
    }
}

3. Provider Implementations

A. Local Backend Provider (Primary for text tasks)

  • File: includes/class-local-backend-provider.php
  • Endpoint: http://192.168.x.x:8080 (user's machine)
  • Protocol: HTTP POST to /v1/messages (OpenAI-compatible)
  • Backend: Node.js proxy → Claude CLI
  • Supports: chat, clarity, planning, writing, refinement
  • Cost: $0 (uses user's Claude CLI + Z.ai/Anthropic)

B. Codex Provider (Alternative text provider)

  • File: includes/class-codex-provider.php
  • Endpoint: https://api.openai.com/v1/chat/completions
  • Protocol: Standard OpenAI API
  • Supports: chat, clarity, planning, writing, refinement
  • Cost: Per OpenAI pricing

C. OpenRouter Provider (Existing, for images + fallback)

  • File: includes/class-openrouter-provider.php (existing)
  • Endpoint: https://openrouter.ai/api/v1/chat/completions
  • Supports: ALL task types (fallback when local unavailable)
  • Primary use: image generation only in hybrid mode

Configuration Strategy

Settings Structure

'wp_agentic_writer_settings' => [
    // Provider routing
    'provider_mode' => 'hybrid', // 'cloud', 'local', 'hybrid'
    'task_providers' => [
        'chat' => 'local_backend',
        'clarity' => 'local_backend',
        'planning' => 'local_backend', 
        'writing' => 'local_backend',
        'refinement' => 'codex',  // Or local_backend
        'image' => 'openrouter'   // Always OpenRouter
    ],
    
    // Local Backend settings
    'local_backend_url' => 'http://192.168.1.105:8080',
    'local_backend_key' => 'dummy',
    'local_backend_model' => 'claude-via-cli',
    'local_backend_enabled' => true,
    
    // Codex settings
    'codex_api_key' => 'sk-...',
    'codex_model' => 'gpt-4',
    'codex_enabled' => true,
    
    // OpenRouter (existing)
    'openrouter_api_key' => 'sk-or-...',
    'image_model' => 'black-forest-labs/flux-1.1-pro',
]

Optimal Hybrid Setup:

chat       → Local Backend (free, private, fast)
clarity    → Local Backend (free, fast)
planning   → Local Backend (free, fast)
writing    → Local Backend (free, unlimited)
refinement → Codex (cloud quality when needed)
image      → OpenRouter (only option for FLUX/Recraft)

Benefits:

  • 80%+ requests via Local Backend = $0 cost
  • Privacy for all text content
  • Codex as quality alternative
  • Images via best models (OpenRouter)

Implementation Components

1. Local Backend Package (Separate Distribution)

Package: agentic-writer-local-backend.zip

Contents:

agentic-writer-local-backend/
├── claude-proxy.js          # Node.js HTTP server
├── start-proxy.sh           # Launch with IP detection
├── stop-proxy.sh            # Clean shutdown
├── test-connection.sh       # Verify proxy works
├── get-local-ip.sh          # Find machine IP
├── package.json             # Express dependency
├── README.md                # Setup guide
└── TROUBLESHOOTING.md       # Common issues

Proxy Server (claude-proxy.js):

  • Spawns user's Claude CLI for each request
  • OpenAI-compatible /v1/messages endpoint
  • Health check /ping endpoint
  • Binds to 0.0.0.0:8080 for LAN access
  • Logs requests for debugging

User Flow:

  1. Download ZIP from plugin settings
  2. Extract and run ./start-proxy.sh
  3. Copy displayed Base URL (e.g., http://192.168.1.105:8080)
  4. Paste into plugin settings
  5. Test connection → generate content

2. Plugin Integration Files

New Files:

includes/class-local-backend-provider.php
includes/class-codex-provider.php
includes/class-provider-manager.php
includes/interface-ai-provider.php
views/settings/tab-local-backend.php
admin/js/test-local-backend.js
downloads/agentic-writer-local-backend.zip

Modified Files:

includes/class-openrouter-provider.php
  → Implement WP_Agentic_Writer_AI_Provider_Interface
  → No behavior changes

includes/class-gutenberg-sidebar.php
  → Replace: WP_Agentic_Writer_OpenRouter_Provider::get_instance()
  → With: WP_Agentic_Writer_Provider_Manager::get_provider_for_task($type)

+ 20 other files with provider calls

3. Settings UI

New Tab: "Local Backend"

  • Download local backend package
  • Base URL input
  • API Key input (dummy)
  • Model selector
  • "Test Connection" button
  • Connection status indicator
  • Troubleshooting guide

Per-Task Routing (Advanced):

  • Simple mode: Enable/Disable Local Backend (uses for all text)
  • Advanced mode: Task routing matrix

4. Migration & Backwards Compatibility

Phase 1: Abstraction (Non-Breaking)

  • Create interface-ai-provider.php
  • Create class-provider-manager.php
  • OpenRouter implements interface
  • All calls route through manager → defaults to OpenRouter
  • 100% backwards compatible, no settings changes

Phase 2: Local Backend Provider

  • Implement class-local-backend-provider.php
  • Create proxy package (claude-proxy.js + scripts)
  • Add "Local Backend" settings tab
  • Implement connection test handler
  • Test with user's local setup

Phase 3: Codex Provider

  • Implement class-codex-provider.php
  • Add Codex API key to settings
  • Add Codex as task routing option
  • Test Codex integration

Phase 4: Update All Provider Calls

  • Update 23+ files to use Provider Manager
  • Test all task types (chat, clarity, planning, writing, refinement, image)
  • Ensure streaming works with all providers
  • Verify cost tracking

Key Technical Decisions

Local Backend Protocol

Why OpenAI-compatible format:

  • Plugin already uses message-based format
  • Easy to proxy to Claude CLI
  • Future-proof for other local models

Request Format:

POST http://192.168.1.105:8080/v1/messages
{
  "messages": [
    {"role": "user", "content": "Write about AI"}
  ]
}

Response Format:

{
  "id": "local-1234567890",
  "object": "chat.completion",
  "model": "claude-local",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Article content..."
    },
    "finish_reason": "stop"
  }]
}

Codex Integration

Direct API Calls:

  • Use OpenAI PHP library or wp_remote_post
  • Standard chat completions endpoint
  • Same format as OpenRouter

Why Codex:

  • High quality for coding/technical content
  • Alternative to Local Backend
  • Cloud-based when user's machine offline

Cost Tracking Integration

Challenge: Local Backend = $0, Codex/OpenRouter = cost

Solution:

// Provider returns cost data
$result = $provider->chat($messages, $options, $type);
$cost = $result['cost'] ?? 0;

if ($cost > 0 && $post_id > 0) {
    do_action('wp_aw_after_api_request',
        $post_id,
        $result['model'] ?? 'unknown',
        $type,
        $result['input_tokens'] ?? 0,
        $result['output_tokens'] ?? 0,
        $cost
    );
}

Dashboard Display:

Session Cost: $0.15
  - Local Backend: 12 requests (free)
  - Codex: 3 requests ($0.10)
  - OpenRouter: 2 images ($0.05)

Today: $2.40
Month: $45.00

Error Handling & Fallbacks

Local Backend Unreachable

$local_provider = new WP_Agentic_Writer_Local_Backend_Provider();

if (!$local_provider->is_available()) {
    // Fallback to OpenRouter
    error_log('Local Backend unavailable, using OpenRouter fallback');
    return WP_Agentic_Writer_OpenRouter_Provider::get_instance();
}

Admin Notice: "⚠️ Local Backend unreachable. Using OpenRouter fallback. Check proxy: ./start-proxy.sh"

Connection Test Results

✅ Connected! Proxy responding correctly.
❌ Connection timeout. Is proxy running? Check: ps aux | grep claude-proxy
❌ Connection refused. Start proxy: ./start-proxy.sh
❌ Wrong IP. Find correct IP: ./get-local-ip.sh
❌ Claude CLI not responding. Test: echo "test" | claude

UI/UX Considerations

Settings Page Flow

  1. Tab: Local Backend

    • Big download button for proxy package
    • Prerequisites checklist
    • Base URL input (pre-filled from clipboard?)
    • Test connection button
    • Status: 🟢 Connected / 🔴 Offline
  2. Tab: Providers

    • Simple mode: "Use Local Backend" toggle
    • Advanced mode: Task routing matrix
    • Provider status indicators
  3. Tab: Models (existing)

    • Add Codex models
    • Show provider per model

Sidebar Indicators

Provider Badge:

🏠 Local (Free)
🔗 Codex ($0.02)
☁️ OpenRouter ($0.05)

Connection Status:

🟢 Local Backend: Connected
🔴 Local Backend: Offline (using OpenRouter)

Testing Strategy

Test Cases:

  1. Cloud-only mode (existing behavior)
  2. Local-only mode (Ollama for all text)
  3. Hybrid mode (recommended config)
  4. Fallback when Ollama unavailable
  5. Streaming works with both providers
  6. Cost tracking accurate
  7. Model selection per provider

Performance Implications

Local Backend:

  • Latency: ~50-200ms LAN vs ~500-2000ms cloud
  • Throughput: Limited by Claude CLI (~20-30 tokens/sec)
  • Concurrency: One request at a time (spawn per request)
  • Quality: Same as cloud Claude (uses same models)

Codex:

  • Latency: Standard OpenAI API latency
  • Quality: High for technical/coding content
  • Cost: Per-token pricing

OpenRouter:

  • Image Generation: Only option for FLUX/Recraft
  • Fallback: When local backend offline
  • Cost: Per-token pricing

Deployment Scenarios

Scenario 1: Local Development (User's Machine)

Setup:

  • WordPress on Local by Flywheel (bricks.local)
  • Node.js proxy on same machine (localhost:8080)
  • Claude CLI configured with Z.ai

Config:

Local Backend URL: http://localhost:8080
All text tasks: Local Backend
Images: OpenRouter
Cost: ~$0 for text, ~$0.05/image

Scenario 2: Local Dev + Cloud Production

Dev:

  • Use Local Backend for free development
  • Test with real Claude quality

Production:

  • Auto-switch to OpenRouter when local unavailable
  • Seamless fallback

Scenario 3: Agency with Shared Local Backend

Setup:

  • One machine runs proxy on LAN
  • Multiple WordPress sites connect to it
  • All sites share one Z.ai account

Config:

Local Backend URL: http://192.168.1.50:8080
Cost: Free for entire team

Implementation Phases

Phase 1: Core Infrastructure (Week 1)

  • Create provider interface
  • Create provider manager
  • OpenRouter implements interface
  • Update 3-5 files to use manager (test)
  • Verify backwards compatibility

Phase 2: Local Backend Package (Week 1)

  • Create claude-proxy.js with /v1/messages endpoint
  • Create startup/shutdown scripts
  • Test with actual Claude CLI
  • Package as ZIP
  • Write README with setup guide

Phase 3: Local Backend Provider (Week 2)

  • Implement class-local-backend-provider.php
  • Add settings tab UI
  • Implement connection test
  • Add ZIP download from settings
  • Test end-to-end flow

Phase 4: Codex Provider (Week 2)

  • Implement class-codex-provider.php
  • Add Codex API key to settings
  • Test Codex integration
  • Add to task routing options

Phase 5: Full Rollout (Week 3)

  • Update all 23+ files to use provider manager
  • Test all task types
  • Verify streaming works
  • Test cost tracking
  • Documentation

Phase 6: Polish (Week 3)

  • Connection status widget
  • Auto-fallback logic
  • Error messages with actionable guidance
  • Video tutorial
  • Troubleshooting guide

Implementation Estimate

Phase 1 (Infrastructure): 4-5 hours

  • Provider interface, manager, OpenRouter refactor
  • Test with 3-5 files

Phase 2 (Local Backend Package): 6-8 hours

  • Node.js proxy development
  • Scripts (start, stop, test)
  • ZIP packaging
  • Documentation

Phase 3 (Local Backend Integration): 8-10 hours

  • Provider class
  • Settings UI
  • Connection test
  • End-to-end testing

Phase 4 (Codex): 4-6 hours

  • Provider implementation
  • Settings integration
  • Testing

Phase 5 (Full Rollout): 8-10 hours

  • Update 23+ files
  • Test all scenarios
  • Cost tracking
  • Documentation

Phase 6 (Polish): 4-6 hours

  • UI improvements
  • Error handling
  • Video tutorial
  • Troubleshooting docs

Total: 34-45 hours (~1-1.5 weeks)

Success Criteria

User can download local backend package User can start proxy on their machine Plugin connects to local backend successfully All text tasks work via local backend ($0 cost) Images work via OpenRouter Codex works as alternative provider Automatic fallback to OpenRouter when local offline Cost tracking shows local = $0, cloud = actual cost Streaming works with all providers 100% backwards compatible (defaults to OpenRouter)

Ready to Implement

This plan matches local-backend-feature.md requirements:

  • Claude CLI proxy via Node.js
  • HTTP-based local backend
  • Codex integration
  • OpenRouter for images
  • Provider abstraction system
  • Fallback logic
  • Complete UI/UX flow

Confirm to proceed with implementation.