feat: consolidate docs, backend/session infra, and settings updates
This commit is contained in:
556
docs/features/hybrid-local-cloud-ai-provider-b09890.md
Normal file
556
docs/features/hybrid-local-cloud-ai-provider-b09890.md
Normal file
@@ -0,0 +1,556 @@
|
||||
# Local Backend + Codex Provider System with Cloud Fallback
|
||||
|
||||
Implement a provider system allowing text generation via Local Backend (Claude CLI proxy) and Codex API, while keeping image generation on OpenRouter's cloud API.
|
||||
|
||||
## Architecture Overview (Based on local-backend-feature.md)
|
||||
|
||||
**Current State:**
|
||||
- Plugin uses `WP_Agentic_Writer_OpenRouter_Provider` for all AI tasks
|
||||
- All requests go to cloud APIs (OpenRouter)
|
||||
- Costs per token, rate limits apply
|
||||
- 23+ files directly call provider singleton
|
||||
|
||||
**New State:**
|
||||
- **Local Backend**: User runs Node.js proxy on their machine (Claude CLI)
|
||||
- **Codex Provider**: Direct integration with OpenAI Codex API
|
||||
- **OpenRouter**: Fallback + image generation only
|
||||
- **Provider Manager**: Routes tasks to appropriate provider
|
||||
|
||||
**Flow:**
|
||||
```
|
||||
WordPress Plugin → Provider Manager → Local Backend (http://user-ip:8080)
|
||||
→ Codex API (https://api.openai.com)
|
||||
→ OpenRouter (images + fallback)
|
||||
```
|
||||
|
||||
## Provider Architecture
|
||||
|
||||
### 1. Provider Interface (Common Contract)
|
||||
|
||||
```php
|
||||
interface WP_Agentic_Writer_AI_Provider_Interface {
|
||||
public function chat($messages, $options, $type);
|
||||
public function chat_stream($messages, $options, $type, $callback);
|
||||
public function generate_image($prompt, $model, $options);
|
||||
public function is_configured();
|
||||
public function test_connection();
|
||||
public function supports_task_type($type);
|
||||
}
|
||||
```
|
||||
|
||||
### 2. Provider Manager (Router)
|
||||
|
||||
```php
|
||||
class WP_Agentic_Writer_Provider_Manager {
|
||||
public static function get_provider_for_task($type) {
|
||||
$settings = get_option('wp_agentic_writer_settings');
|
||||
$task_providers = $settings['task_providers'] ?? [];
|
||||
|
||||
$provider_name = $task_providers[$type] ?? 'openrouter';
|
||||
|
||||
switch ($provider_name) {
|
||||
case 'local_backend':
|
||||
return new WP_Agentic_Writer_Local_Backend_Provider();
|
||||
case 'codex':
|
||||
return new WP_Agentic_Writer_Codex_Provider();
|
||||
case 'openrouter':
|
||||
default:
|
||||
return WP_Agentic_Writer_OpenRouter_Provider::get_instance();
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 3. Provider Implementations
|
||||
|
||||
**A. Local Backend Provider** (Primary for text tasks)
|
||||
- **File**: `includes/class-local-backend-provider.php`
|
||||
- **Endpoint**: `http://192.168.x.x:8080` (user's machine)
|
||||
- **Protocol**: HTTP POST to `/v1/messages` (OpenAI-compatible)
|
||||
- **Backend**: Node.js proxy → Claude CLI
|
||||
- **Supports**: `chat`, `clarity`, `planning`, `writing`, `refinement`
|
||||
- **Cost**: $0 (uses user's Claude CLI + Z.ai/Anthropic)
|
||||
|
||||
**B. Codex Provider** (Alternative text provider)
|
||||
- **File**: `includes/class-codex-provider.php`
|
||||
- **Endpoint**: `https://api.openai.com/v1/chat/completions`
|
||||
- **Protocol**: Standard OpenAI API
|
||||
- **Supports**: `chat`, `clarity`, `planning`, `writing`, `refinement`
|
||||
- **Cost**: Per OpenAI pricing
|
||||
|
||||
**C. OpenRouter Provider** (Existing, for images + fallback)
|
||||
- **File**: `includes/class-openrouter-provider.php` (existing)
|
||||
- **Endpoint**: `https://openrouter.ai/api/v1/chat/completions`
|
||||
- **Supports**: ALL task types (fallback when local unavailable)
|
||||
- **Primary use**: `image` generation only in hybrid mode
|
||||
|
||||
### Configuration Strategy
|
||||
|
||||
#### Settings Structure
|
||||
|
||||
```php
|
||||
'wp_agentic_writer_settings' => [
|
||||
// Provider routing
|
||||
'provider_mode' => 'hybrid', // 'cloud', 'local', 'hybrid'
|
||||
'task_providers' => [
|
||||
'chat' => 'local_backend',
|
||||
'clarity' => 'local_backend',
|
||||
'planning' => 'local_backend',
|
||||
'writing' => 'local_backend',
|
||||
'refinement' => 'codex', // Or local_backend
|
||||
'image' => 'openrouter' // Always OpenRouter
|
||||
],
|
||||
|
||||
// Local Backend settings
|
||||
'local_backend_url' => 'http://192.168.1.105:8080',
|
||||
'local_backend_key' => 'dummy',
|
||||
'local_backend_model' => 'claude-via-cli',
|
||||
'local_backend_enabled' => true,
|
||||
|
||||
// Codex settings
|
||||
'codex_api_key' => 'sk-...',
|
||||
'codex_model' => 'gpt-4',
|
||||
'codex_enabled' => true,
|
||||
|
||||
// OpenRouter (existing)
|
||||
'openrouter_api_key' => 'sk-or-...',
|
||||
'image_model' => 'black-forest-labs/flux-1.1-pro',
|
||||
]
|
||||
```
|
||||
|
||||
#### Recommended Configuration
|
||||
|
||||
**Optimal Hybrid Setup:**
|
||||
```
|
||||
chat → Local Backend (free, private, fast)
|
||||
clarity → Local Backend (free, fast)
|
||||
planning → Local Backend (free, fast)
|
||||
writing → Local Backend (free, unlimited)
|
||||
refinement → Codex (cloud quality when needed)
|
||||
image → OpenRouter (only option for FLUX/Recraft)
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- 80%+ requests via Local Backend = $0 cost
|
||||
- Privacy for all text content
|
||||
- Codex as quality alternative
|
||||
- Images via best models (OpenRouter)
|
||||
|
||||
## Implementation Components
|
||||
|
||||
### 1. Local Backend Package (Separate Distribution)
|
||||
|
||||
**Package:** `agentic-writer-local-backend.zip`
|
||||
|
||||
**Contents:**
|
||||
```
|
||||
agentic-writer-local-backend/
|
||||
├── claude-proxy.js # Node.js HTTP server
|
||||
├── start-proxy.sh # Launch with IP detection
|
||||
├── stop-proxy.sh # Clean shutdown
|
||||
├── test-connection.sh # Verify proxy works
|
||||
├── get-local-ip.sh # Find machine IP
|
||||
├── package.json # Express dependency
|
||||
├── README.md # Setup guide
|
||||
└── TROUBLESHOOTING.md # Common issues
|
||||
```
|
||||
|
||||
**Proxy Server (`claude-proxy.js`):**
|
||||
- Spawns user's Claude CLI for each request
|
||||
- OpenAI-compatible `/v1/messages` endpoint
|
||||
- Health check `/ping` endpoint
|
||||
- Binds to `0.0.0.0:8080` for LAN access
|
||||
- Logs requests for debugging
|
||||
|
||||
**User Flow:**
|
||||
1. Download ZIP from plugin settings
|
||||
2. Extract and run `./start-proxy.sh`
|
||||
3. Copy displayed Base URL (e.g., `http://192.168.1.105:8080`)
|
||||
4. Paste into plugin settings
|
||||
5. Test connection → generate content
|
||||
|
||||
### 2. Plugin Integration Files
|
||||
|
||||
**New Files:**
|
||||
```
|
||||
includes/class-local-backend-provider.php
|
||||
includes/class-codex-provider.php
|
||||
includes/class-provider-manager.php
|
||||
includes/interface-ai-provider.php
|
||||
views/settings/tab-local-backend.php
|
||||
admin/js/test-local-backend.js
|
||||
downloads/agentic-writer-local-backend.zip
|
||||
```
|
||||
|
||||
**Modified Files:**
|
||||
```
|
||||
includes/class-openrouter-provider.php
|
||||
→ Implement WP_Agentic_Writer_AI_Provider_Interface
|
||||
→ No behavior changes
|
||||
|
||||
includes/class-gutenberg-sidebar.php
|
||||
→ Replace: WP_Agentic_Writer_OpenRouter_Provider::get_instance()
|
||||
→ With: WP_Agentic_Writer_Provider_Manager::get_provider_for_task($type)
|
||||
|
||||
+ 20 other files with provider calls
|
||||
```
|
||||
|
||||
### 3. Settings UI
|
||||
|
||||
**New Tab:** "Local Backend"
|
||||
- Download local backend package
|
||||
- Base URL input
|
||||
- API Key input (dummy)
|
||||
- Model selector
|
||||
- "Test Connection" button
|
||||
- Connection status indicator
|
||||
- Troubleshooting guide
|
||||
|
||||
**Per-Task Routing (Advanced):**
|
||||
- Simple mode: Enable/Disable Local Backend (uses for all text)
|
||||
- Advanced mode: Task routing matrix
|
||||
|
||||
### 4. Migration & Backwards Compatibility
|
||||
|
||||
**Phase 1: Abstraction (Non-Breaking)**
|
||||
- Create `interface-ai-provider.php`
|
||||
- Create `class-provider-manager.php`
|
||||
- OpenRouter implements interface
|
||||
- All calls route through manager → defaults to OpenRouter
|
||||
- **100% backwards compatible, no settings changes**
|
||||
|
||||
**Phase 2: Local Backend Provider**
|
||||
- Implement `class-local-backend-provider.php`
|
||||
- Create proxy package (claude-proxy.js + scripts)
|
||||
- Add "Local Backend" settings tab
|
||||
- Implement connection test handler
|
||||
- Test with user's local setup
|
||||
|
||||
**Phase 3: Codex Provider**
|
||||
- Implement `class-codex-provider.php`
|
||||
- Add Codex API key to settings
|
||||
- Add Codex as task routing option
|
||||
- Test Codex integration
|
||||
|
||||
**Phase 4: Update All Provider Calls**
|
||||
- Update 23+ files to use Provider Manager
|
||||
- Test all task types (chat, clarity, planning, writing, refinement, image)
|
||||
- Ensure streaming works with all providers
|
||||
- Verify cost tracking
|
||||
|
||||
## Key Technical Decisions
|
||||
|
||||
### Local Backend Protocol
|
||||
|
||||
**Why OpenAI-compatible format:**
|
||||
- Plugin already uses message-based format
|
||||
- Easy to proxy to Claude CLI
|
||||
- Future-proof for other local models
|
||||
|
||||
**Request Format:**
|
||||
```json
|
||||
POST http://192.168.1.105:8080/v1/messages
|
||||
{
|
||||
"messages": [
|
||||
{"role": "user", "content": "Write about AI"}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
**Response Format:**
|
||||
```json
|
||||
{
|
||||
"id": "local-1234567890",
|
||||
"object": "chat.completion",
|
||||
"model": "claude-local",
|
||||
"choices": [{
|
||||
"message": {
|
||||
"role": "assistant",
|
||||
"content": "Article content..."
|
||||
},
|
||||
"finish_reason": "stop"
|
||||
}]
|
||||
}
|
||||
```
|
||||
|
||||
### Codex Integration
|
||||
|
||||
**Direct API Calls:**
|
||||
- Use OpenAI PHP library or `wp_remote_post`
|
||||
- Standard chat completions endpoint
|
||||
- Same format as OpenRouter
|
||||
|
||||
**Why Codex:**
|
||||
- High quality for coding/technical content
|
||||
- Alternative to Local Backend
|
||||
- Cloud-based when user's machine offline
|
||||
|
||||
## Cost Tracking Integration
|
||||
|
||||
**Challenge:** Local Backend = $0, Codex/OpenRouter = cost
|
||||
|
||||
**Solution:**
|
||||
```php
|
||||
// Provider returns cost data
|
||||
$result = $provider->chat($messages, $options, $type);
|
||||
$cost = $result['cost'] ?? 0;
|
||||
|
||||
if ($cost > 0 && $post_id > 0) {
|
||||
do_action('wp_aw_after_api_request',
|
||||
$post_id,
|
||||
$result['model'] ?? 'unknown',
|
||||
$type,
|
||||
$result['input_tokens'] ?? 0,
|
||||
$result['output_tokens'] ?? 0,
|
||||
$cost
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
**Dashboard Display:**
|
||||
```
|
||||
Session Cost: $0.15
|
||||
- Local Backend: 12 requests (free)
|
||||
- Codex: 3 requests ($0.10)
|
||||
- OpenRouter: 2 images ($0.05)
|
||||
|
||||
Today: $2.40
|
||||
Month: $45.00
|
||||
```
|
||||
|
||||
## Error Handling & Fallbacks
|
||||
|
||||
### Local Backend Unreachable
|
||||
|
||||
```php
|
||||
$local_provider = new WP_Agentic_Writer_Local_Backend_Provider();
|
||||
|
||||
if (!$local_provider->is_available()) {
|
||||
// Fallback to OpenRouter
|
||||
error_log('Local Backend unavailable, using OpenRouter fallback');
|
||||
return WP_Agentic_Writer_OpenRouter_Provider::get_instance();
|
||||
}
|
||||
```
|
||||
|
||||
**Admin Notice:**
|
||||
"⚠️ Local Backend unreachable. Using OpenRouter fallback. Check proxy: `./start-proxy.sh`"
|
||||
|
||||
### Connection Test Results
|
||||
|
||||
```
|
||||
✅ Connected! Proxy responding correctly.
|
||||
❌ Connection timeout. Is proxy running? Check: ps aux | grep claude-proxy
|
||||
❌ Connection refused. Start proxy: ./start-proxy.sh
|
||||
❌ Wrong IP. Find correct IP: ./get-local-ip.sh
|
||||
❌ Claude CLI not responding. Test: echo "test" | claude
|
||||
```
|
||||
|
||||
## UI/UX Considerations
|
||||
|
||||
### Settings Page Flow
|
||||
|
||||
1. **Tab: Local Backend**
|
||||
- Big download button for proxy package
|
||||
- Prerequisites checklist
|
||||
- Base URL input (pre-filled from clipboard?)
|
||||
- Test connection button
|
||||
- Status: 🟢 Connected / 🔴 Offline
|
||||
|
||||
2. **Tab: Providers**
|
||||
- Simple mode: "Use Local Backend" toggle
|
||||
- Advanced mode: Task routing matrix
|
||||
- Provider status indicators
|
||||
|
||||
3. **Tab: Models** (existing)
|
||||
- Add Codex models
|
||||
- Show provider per model
|
||||
|
||||
### Sidebar Indicators
|
||||
|
||||
**Provider Badge:**
|
||||
```
|
||||
🏠 Local (Free)
|
||||
🔗 Codex ($0.02)
|
||||
☁️ OpenRouter ($0.05)
|
||||
```
|
||||
|
||||
**Connection Status:**
|
||||
```
|
||||
🟢 Local Backend: Connected
|
||||
🔴 Local Backend: Offline (using OpenRouter)
|
||||
```
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
**Test Cases:**
|
||||
1. Cloud-only mode (existing behavior)
|
||||
2. Local-only mode (Ollama for all text)
|
||||
3. Hybrid mode (recommended config)
|
||||
4. Fallback when Ollama unavailable
|
||||
5. Streaming works with both providers
|
||||
6. Cost tracking accurate
|
||||
7. Model selection per provider
|
||||
|
||||
## Performance Implications
|
||||
|
||||
**Local Backend:**
|
||||
- **Latency**: ~50-200ms LAN vs ~500-2000ms cloud
|
||||
- **Throughput**: Limited by Claude CLI (~20-30 tokens/sec)
|
||||
- **Concurrency**: One request at a time (spawn per request)
|
||||
- **Quality**: Same as cloud Claude (uses same models)
|
||||
|
||||
**Codex:**
|
||||
- **Latency**: Standard OpenAI API latency
|
||||
- **Quality**: High for technical/coding content
|
||||
- **Cost**: Per-token pricing
|
||||
|
||||
**OpenRouter:**
|
||||
- **Image Generation**: Only option for FLUX/Recraft
|
||||
- **Fallback**: When local backend offline
|
||||
- **Cost**: Per-token pricing
|
||||
|
||||
## Deployment Scenarios
|
||||
|
||||
### Scenario 1: Local Development (User's Machine)
|
||||
|
||||
**Setup:**
|
||||
- WordPress on Local by Flywheel (bricks.local)
|
||||
- Node.js proxy on same machine (localhost:8080)
|
||||
- Claude CLI configured with Z.ai
|
||||
|
||||
**Config:**
|
||||
```
|
||||
Local Backend URL: http://localhost:8080
|
||||
All text tasks: Local Backend
|
||||
Images: OpenRouter
|
||||
Cost: ~$0 for text, ~$0.05/image
|
||||
```
|
||||
|
||||
### Scenario 2: Local Dev + Cloud Production
|
||||
|
||||
**Dev:**
|
||||
- Use Local Backend for free development
|
||||
- Test with real Claude quality
|
||||
|
||||
**Production:**
|
||||
- Auto-switch to OpenRouter when local unavailable
|
||||
- Seamless fallback
|
||||
|
||||
### Scenario 3: Agency with Shared Local Backend
|
||||
|
||||
**Setup:**
|
||||
- One machine runs proxy on LAN
|
||||
- Multiple WordPress sites connect to it
|
||||
- All sites share one Z.ai account
|
||||
|
||||
**Config:**
|
||||
```
|
||||
Local Backend URL: http://192.168.1.50:8080
|
||||
Cost: Free for entire team
|
||||
```
|
||||
|
||||
## Implementation Phases
|
||||
|
||||
### Phase 1: Core Infrastructure (Week 1)
|
||||
- [ ] Create provider interface
|
||||
- [ ] Create provider manager
|
||||
- [ ] OpenRouter implements interface
|
||||
- [ ] Update 3-5 files to use manager (test)
|
||||
- [ ] Verify backwards compatibility
|
||||
|
||||
### Phase 2: Local Backend Package (Week 1)
|
||||
- [ ] Create `claude-proxy.js` with `/v1/messages` endpoint
|
||||
- [ ] Create startup/shutdown scripts
|
||||
- [ ] Test with actual Claude CLI
|
||||
- [ ] Package as ZIP
|
||||
- [ ] Write README with setup guide
|
||||
|
||||
### Phase 3: Local Backend Provider (Week 2)
|
||||
- [ ] Implement `class-local-backend-provider.php`
|
||||
- [ ] Add settings tab UI
|
||||
- [ ] Implement connection test
|
||||
- [ ] Add ZIP download from settings
|
||||
- [ ] Test end-to-end flow
|
||||
|
||||
### Phase 4: Codex Provider (Week 2)
|
||||
- [ ] Implement `class-codex-provider.php`
|
||||
- [ ] Add Codex API key to settings
|
||||
- [ ] Test Codex integration
|
||||
- [ ] Add to task routing options
|
||||
|
||||
### Phase 5: Full Rollout (Week 3)
|
||||
- [ ] Update all 23+ files to use provider manager
|
||||
- [ ] Test all task types
|
||||
- [ ] Verify streaming works
|
||||
- [ ] Test cost tracking
|
||||
- [ ] Documentation
|
||||
|
||||
### Phase 6: Polish (Week 3)
|
||||
- [ ] Connection status widget
|
||||
- [ ] Auto-fallback logic
|
||||
- [ ] Error messages with actionable guidance
|
||||
- [ ] Video tutorial
|
||||
- [ ] Troubleshooting guide
|
||||
|
||||
## Implementation Estimate
|
||||
|
||||
**Phase 1 (Infrastructure):** 4-5 hours
|
||||
- Provider interface, manager, OpenRouter refactor
|
||||
- Test with 3-5 files
|
||||
|
||||
**Phase 2 (Local Backend Package):** 6-8 hours
|
||||
- Node.js proxy development
|
||||
- Scripts (start, stop, test)
|
||||
- ZIP packaging
|
||||
- Documentation
|
||||
|
||||
**Phase 3 (Local Backend Integration):** 8-10 hours
|
||||
- Provider class
|
||||
- Settings UI
|
||||
- Connection test
|
||||
- End-to-end testing
|
||||
|
||||
**Phase 4 (Codex):** 4-6 hours
|
||||
- Provider implementation
|
||||
- Settings integration
|
||||
- Testing
|
||||
|
||||
**Phase 5 (Full Rollout):** 8-10 hours
|
||||
- Update 23+ files
|
||||
- Test all scenarios
|
||||
- Cost tracking
|
||||
- Documentation
|
||||
|
||||
**Phase 6 (Polish):** 4-6 hours
|
||||
- UI improvements
|
||||
- Error handling
|
||||
- Video tutorial
|
||||
- Troubleshooting docs
|
||||
|
||||
**Total:** 34-45 hours (~1-1.5 weeks)
|
||||
|
||||
## Success Criteria
|
||||
|
||||
✅ User can download local backend package
|
||||
✅ User can start proxy on their machine
|
||||
✅ Plugin connects to local backend successfully
|
||||
✅ All text tasks work via local backend ($0 cost)
|
||||
✅ Images work via OpenRouter
|
||||
✅ Codex works as alternative provider
|
||||
✅ Automatic fallback to OpenRouter when local offline
|
||||
✅ Cost tracking shows local = $0, cloud = actual cost
|
||||
✅ Streaming works with all providers
|
||||
✅ 100% backwards compatible (defaults to OpenRouter)
|
||||
|
||||
## Ready to Implement
|
||||
|
||||
This plan matches `local-backend-feature.md` requirements:
|
||||
- ✅ Claude CLI proxy via Node.js
|
||||
- ✅ HTTP-based local backend
|
||||
- ✅ Codex integration
|
||||
- ✅ OpenRouter for images
|
||||
- ✅ Provider abstraction system
|
||||
- ✅ Fallback logic
|
||||
- ✅ Complete UI/UX flow
|
||||
|
||||
Confirm to proceed with implementation.
|
||||
Reference in New Issue
Block a user