WP Agentic Writer Plugin Audit Report
Status: COMPLETE / SUPERSEDED
Completion marker date: 2026-05-24
Follow-up trace audit: docs/architecture/PLUGIN_AUDIT_FOLLOWUP_2026-05-24.md
This report is retained as the historical baseline. Its implementation has been traced in the 2026-05-24 follow-up audit, so remaining work should be tracked from the follow-up report instead of reopening duplicate jobs from this file.
Audit date: 2026-05-22
Plugin version observed: 0.1.3
Scope: UI, UX, admin settings, Gutenberg sidebar workflow, conversation context/history, cost tracking, provider/model routing, image generation, local backend, security, data lifecycle, maintainability.
Executive Summary
WP Agentic Writer has a strong product direction: a plan-first writing assistant inside Gutenberg with chat, planning, writing, refinement, research, image suggestions, SEO/GEO helpers, provider routing, local backend support, and cost visibility. The problem is not lack of ambition. The problem is that too many responsibilities are packed into a few files without stable contracts between state, persistence, providers, and UI.
The highest risk pattern is this: the plugin now has two overlapping persistence models for conversation history. Older post meta storage (_wpaw_chat_history, _wpaw_plan, _wpaw_memory) still exists, while newer session storage (wpaw_conversations) was added but is not reliably migrated or permission-scoped. That creates the exact failure mode you described: fixing one flow can silently break another because different screens and endpoints read from different truth sources.
Overall readiness assessment: beta/prototype with several production blockers. Syntax checks pass for key PHP and JS files, but there are serious runtime, migration, security, and model-cache defects.
Critical Findings
P0: Conversation Table Migration Is Not Wired
The new conversation manager expects a wpaw_conversations table, but activation only creates cost and image tables. wpaw_run_migrations() exists but is not called. Worse, the general DB version is set to 1.1.0, while the conversation migration only runs when the stored version is < 0.1.4. Because 1.1.0 is already greater than 0.1.4, sites can be marked upgraded without the conversation table ever being created.
Evidence:
- wp-agentic-writer.php:136-180 creates default options, custom models, cost table, and image tables, but not conversations.
- wp-agentic-writer.php:219-231 updates wpaw_db_version to 1.1.0.
- includes/class-conversation-migration.php:17-44 defines conversation table creation.
- includes/class-conversation-migration.php:64-69 defines migration runner, but it is not hooked or called.
Impact:
- New chat/session UX can fail with DB insert/read errors on clean installs or upgraded installs.
- Fixing frontend session behavior may appear broken because the database contract is missing.
Recommendation:
- Split DB versions per table/domain, for example wpaw_cost_db_version, wpaw_image_db_version, wpaw_conversation_db_version.
- Call conversation migrations on activation and on plugins_loaded idempotently.
- Add a visible admin health check that verifies required tables exist.
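The per-domain versioning idea can be sketched as follows. This is a minimal illustration in JavaScript of the logic only, not the plugin's PHP code: the options object stands in for the WordPress options table, and runMigrations, compareVersions, and the migration map are hypothetical names.

```javascript
// Compare dotted versions numerically, e.g. "0.1.4" vs "1.1.0".
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// `options` is a plain object standing in for stored options (assumption).
// `migrations` maps a domain name to its target version and an idempotent
// "create or upgrade table" step.
function runMigrations(options, migrations) {
  const applied = [];
  for (const [domain, { target, up }] of Object.entries(migrations)) {
    const key = `wpaw_${domain}_db_version`;
    const current = options[key] || '0.0.0';
    if (compareVersions(current, target) < 0) {
      up();                  // must be safe to run more than once
      options[key] = target; // only this domain is marked as upgraded
      applied.push(domain);
    }
  }
  return applied;
}
```

Because each domain has its own version key, a cost table already at 1.1.0 no longer hides a conversation table that was never created, and calling the runner on every load is harmless once versions match.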
P0: OpenRouter Model Cache Has Conflicting Shapes
get_cached_models() stores the full OpenRouter model objects in transient wpaw_openrouter_models. validate_model_availability() uses the same transient key but expects a flat list of model IDs. If the settings page has already cached full model objects, streaming and image validation will reject valid models because in_array($model_id, $available_models, true) compares a string to arrays.
Evidence:
- includes/class-openrouter-provider.php:105-172 caches full model objects under wpaw_openrouter_models.
- includes/class-openrouter-provider.php:239-255 reads the same transient key as if it contains IDs.
- Validation is used before streaming and image generation at includes/class-openrouter-provider.php:548 and includes/class-openrouter-provider.php:755.
Impact:
- Valid models can fail as "not available".
- Refreshing the model list in settings can break generation.
- This creates a brittle A/B loop: model UI fixes can break streaming/image execution.
Recommendation:
- Use separate cache keys, e.g. wpaw_openrouter_model_objects and wpaw_openrouter_model_ids.
- Normalize model validation to accept both canonical IDs and suffix variants like :online, without poisoning the settings model cache.
- Add a regression test around cached full model objects plus streaming validation.
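A minimal sketch of the two-key cache and suffix-tolerant validation, written in JavaScript to show the logic rather than the plugin's PHP. The Map stands in for WordPress transients, and cacheModels, canonicalModelId, isModelAvailable, and the suffix list are assumptions for illustration.

```javascript
const OBJECTS_KEY = 'wpaw_openrouter_model_objects';
const IDS_KEY = 'wpaw_openrouter_model_ids';

// Assumed list of variant suffixes that map back to a base model ID.
const VARIANT_SUFFIXES = [':online'];

// Write both shapes from one API response so each reader gets the shape it expects.
function cacheModels(cache, modelObjects) {
  cache.set(OBJECTS_KEY, modelObjects);
  cache.set(IDS_KEY, modelObjects.map((m) => m.id));
}

// Strip a known variant suffix; unknown suffixes are left untouched.
function canonicalModelId(modelId) {
  for (const suffix of VARIANT_SUFFIXES) {
    if (modelId.endsWith(suffix)) return modelId.slice(0, -suffix.length);
  }
  return modelId;
}

// Validation only ever reads the flat ID list, never the object cache.
function isModelAvailable(cache, modelId) {
  const ids = cache.get(IDS_KEY) || [];
  return ids.includes(modelId) || ids.includes(canonicalModelId(modelId));
}
```

With this split, refreshing the settings model list cannot change what the validator sees, which is the regression the third recommendation asks to test.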
P0: PHP Requirement Is 7.4 But Code Uses PHP 8 Functions
The plugin header declares PHP 7.4 support, but the provider streaming parsers call str_starts_with(), which was introduced in PHP 8.0 and does not exist on PHP 7.4.
Evidence:
- wp-agentic-writer.php:13-14 declares Requires PHP: 7.4.
- includes/class-openrouter-provider.php:642, includes/class-local-backend-provider.php:208, and includes/class-codex-provider.php:207 call str_starts_with().
Impact:
- Fatal errors on PHP 7.4 sites when streaming code paths load.
Recommendation:
- Either raise Requires PHP to 8.0+ or replace with 0 === strpos($line, 'data: ').
P0: Conversation Endpoints Lack Per-Session Ownership Checks
REST permission is only current_user_can('edit_posts'). The conversation handlers read, update, delete, and overwrite messages by session_id without checking that the session belongs to the current user or that the user can edit the linked post.
Evidence:
- includes/class-gutenberg-sidebar.php:847-849 grants all REST routes to anyone who can edit posts.
- includes/class-gutenberg-sidebar.php:6977-6991 returns any session by session_id.
- includes/class-gutenberg-sidebar.php:7001-7033 updates any session by session_id.
- includes/class-gutenberg-sidebar.php:7043-7060 deletes any session by session_id.
- includes/class-gutenberg-sidebar.php:7070-7098 overwrites messages for any session by session_id.
Impact:
- Any editor-level user who obtains or guesses a session ID can read or modify another user's conversation.
- Stored article prompts, SEO keywords, unpublished plans, and drafts can leak.
Recommendation:
- Add Conversation_Manager::current_user_can_access($session_id) and enforce it on all session routes.
- For linked post sessions, also require current_user_can('edit_post', $post_id).
- Increase session IDs to a stronger token, e.g. wp_generate_uuid4() or bin2hex(random_bytes(16)).
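The ownership rule can be sketched as below. This is an illustrative JavaScript model of the check, not the plugin's implementation: the session and user shapes, canAccessSession, and getSessionForUser are hypothetical, and answering 404 for both "missing" and "not yours" is an assumed design choice to avoid confirming that a session ID exists.

```javascript
// Assumed shapes:
//   session = { id, userId, postId }   (postId may be null for unlinked sessions)
//   user    = { id, editablePostIds: Set<number> }
function canAccessSession(session, user) {
  if (!session || !user) return false;
  if (session.userId !== user.id) return false; // must own the session
  if (session.postId != null && !user.editablePostIds.has(session.postId)) {
    return false; // linked post sessions also require edit rights on that post
  }
  return true;
}

// Every session route (read, update, delete, save messages) would go through
// one gate like this instead of looking the session up by ID directly.
function getSessionForUser(sessions, sessionId, user) {
  const session = sessions.get(sessionId);
  if (!canAccessSession(session, user)) return { status: 404, body: null };
  return { status: 200, body: session };
}
```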
High Priority Findings
P1: Two Context Stores Compete Instead Of Cooperating
Current code keeps post meta chat history and new session messages at the same time.
Evidence:
- handle_chat_request() updates post meta chat history at includes/class-gutenberg-sidebar.php:924-930 and later.
- Frontend saves every message array to /conversations/{session_id}/messages at assets/js/sidebar.js:287-318.
- Frontend initializes sessions through /conversations/post/{postId} and /conversations?uncompleted=1 at assets/js/sidebar.js:192-267.
Impact:
- Chat mode, planning mode, writing mode, and resume mode can see different histories.
- Clearing context deletes post meta but not necessarily the session messages.
- "Continue conversation" can restore messages while _wpaw_plan or _wpaw_memory remains stale.
Recommendation:
- Pick one source of truth for conversational history. Prefer wpaw_conversations for messages and context, with post meta only storing the current plan and lightweight indexes.
- Define a single context assembly service used by chat, plan, write, refine, SEO, and image flows.
- Make "clear context" clear both the active session messages/context and legacy post meta during migration.
P1: Provider Routing Falls Back Silently To OpenRouter
If a configured local backend is unreachable or unsupported, provider manager silently falls back to OpenRouter.
Evidence:
- includes/class-provider-manager.php:33-45 returns OpenRouter fallback if selected provider is not configured or local connection test fails.
Impact:
- A user choosing local/private/free generation may unknowingly send prompts to OpenRouter.
- Cost expectations and privacy expectations can be violated.
- Debugging provider behavior becomes confusing because UI selection is not guaranteed execution.
Recommendation:
- Make fallback behavior explicit and configurable: "fail closed" vs "fallback to OpenRouter".
- Return provider metadata in each API response so the UI can show the actual provider used.
- Add a preflight provider health state in settings and sidebar.
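One way to make the fallback explicit is a small resolver that returns the provider actually used alongside the one requested. The sketch below is JavaScript for illustration only; resolveProvider, the policy names fail_closed and fallback_openrouter, and the health map are assumptions, not existing plugin identifiers.

```javascript
// `health` maps provider name -> boolean from a preflight check (assumption).
// `policy` is either 'fail_closed' or 'fallback_openrouter' (hypothetical names).
function resolveProvider(selected, health, policy) {
  if (health[selected] === true) {
    return { ok: true, provider: selected, requested: selected, fellBack: false };
  }
  const mayFallBack =
    policy === 'fallback_openrouter' &&
    selected !== 'openrouter' &&
    health.openrouter === true;
  if (mayFallBack) {
    // The caller can surface `fellBack` so the UI shows the real provider.
    return { ok: true, provider: 'openrouter', requested: selected, fellBack: true };
  }
  return {
    ok: false,
    provider: null,
    requested: selected,
    fellBack: false,
    error: `Provider "${selected}" is unavailable and no permitted fallback is reachable.`,
  };
}
```

Under fail_closed, a user who chose the local backend gets an error rather than a prompt quietly sent to OpenRouter.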
P1: Cost Tracking Setting Does Not Stop Tracking Or Enforce Budget
cost_tracking_enabled controls parts of the frontend display, but the backend cost hook always writes records. Monthly budget is display-only and does not prevent expensive calls.
Evidence:
- Cost tracker always registers the hook at includes/class-cost-tracker.php:42-44.
- add_request() inserts every event without checking settings at includes/class-cost-tracker.php:58-75.
- Frontend skips fetching if disabled at assets/js/sidebar.js:501-505, but backend still records.
Impact:
- The setting name implies disabling tracking, but data is still stored.
- Budget UI can be misleading because it is not a guardrail.
Recommendation:
- Decide whether the setting means "hide UI" or "do not store usage"; rename or implement accordingly.
- Add optional soft and hard budget policies before provider calls.
- Track actual provider, request ID, session ID, and failure state for reconciliation.
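The soft and hard budget policies could look roughly like this preflight check, run before any provider call. It is a JavaScript sketch under assumptions: checkBudget and its parameter names are hypothetical, and amounts are treated as plain numbers in one currency.

```javascript
// policy: 'soft' warns but allows the call; 'hard' blocks it (assumed names).
function checkBudget({ spent, estimate, monthlyBudget, policy }) {
  // No budget configured: nothing to enforce.
  if (!(monthlyBudget > 0)) return { allow: true, warning: null };

  const projected = spent + estimate;
  if (projected <= monthlyBudget) return { allow: true, warning: null };

  const warning =
    `Projected spend ${projected.toFixed(2)} exceeds budget ${monthlyBudget.toFixed(2)}.`;
  return { allow: policy !== 'hard', warning };
}
```

The same result object can feed the UI, so a soft policy shows the warning while still sending the request.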
P1: API Route Contracts Are Too Loose
Most REST routes accept raw JSON and manually read fields. Routes do not declare args schemas or sanitize/validate centrally.
Evidence:
- Routes are registered without args schemas beginning at includes/class-gutenberg-sidebar.php:287-365.
- Handler code manually reads arbitrary payloads, e.g. handle_chat_request() at includes/class-gutenberg-sidebar.php:858-914.
Impact:
- Small frontend changes can break backend assumptions.
- Security review becomes harder because validation is spread across handlers.
- No machine-readable contract exists for tests.
Recommendation:
- Add route args definitions for all simple endpoints.
- Introduce request DTO/helper methods for complex generation/refinement requests.
- Add contract tests for each endpoint with valid, missing, malformed, and unauthorized payloads.
P1: Main Backend Class Is Too Large To Change Safely
includes/class-gutenberg-sidebar.php is roughly 7,200 lines and owns asset enqueueing, route registration, request validation, prompt assembly, streaming, SEO, GEO, research, image routes, conversation routes, and persistence.
Impact:
- Any change has a large blast radius.
- Prompt changes, UI changes, and persistence changes are tangled.
- This directly contributes to "fix A, lose B" cycles.
Recommendation:
- Split by ownership:
  - Rest_Routes registers routes only.
  - Context_Service assembles messages/context/history.
  - Workflow_Service handles planning/writing/refinement state.
  - Provider_Service wraps provider selection and fallback.
  - Cost_Service handles usage policies.
  - Conversation_Rest_Controller, Image_Rest_Controller, Seo_Rest_Controller.
Medium Priority Findings
P2: Admin Settings Depend On External CDNs
The settings page enqueues Bootstrap and Select2 from CDN.
Evidence:
- includes/class-settings-v2.php:67-75 loads CDN CSS/JS.
Impact:
- Settings UI can break offline or in restricted admin environments.
- Supply-chain and privacy expectations are weaker for a plugin admin page.
Recommendation:
- Bundle vendor assets locally or use WordPress-native components where possible.
P2: Uninstall Is Incomplete And Duplicated
There is both register_uninstall_hook() in the main plugin file and an uninstall.php. Cleanup differs between them and neither fully cleans new data.
Evidence:
- Main uninstall deletes settings and cost/image tables at wp-agentic-writer.php:259-267.
- uninstall.php deletes settings, _wpaw_plan, and cost table only.
- Neither path deletes wp_agentic_writer_custom_models, wpaw_db_version, wpaw_conversations, _wpaw_chat_history, _wpaw_memory, _wpaw_post_config, _wpaw_detected_language, writing state meta, or image-related post meta.
Impact:
- Reinstall behavior is unpredictable.
- Old settings and tables can affect fresh testing.
Recommendation:
- Use one uninstall path.
- Add a documented "delete all data on uninstall" option.
- Clean all plugin options, transients, tables, upload temp files, scheduled events, and post meta.
P2: Image Generation Is Partially Integrated
The image manager has tables, recommendations, variants, commit flow, and temp cleanup, but cost tracking and error handling are incomplete.
Risks:
- Image generation costs are not consistently inserted into the cost tracking table.
- Temp files are written with file_put_contents() without checking result or validating MIME/content length.
- Committed variants use media_handle_sideload() from the temp path, so failure modes can delete/move temp files unexpectedly.
Recommendation:
- Add wp_aw_after_api_request events for image generation.
- Validate downloaded image type and size before writing.
- Add image state transitions: pending -> generating -> temp_ready -> committed -> failed.
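The image state transitions can be enforced with a small transition table. This JavaScript sketch assumes that failed is reachable from any non-terminal state and that committed and failed are terminal; the report only lists the state names, so those edges and the transitionImage helper are assumptions.

```javascript
// Allowed next states for each image state (edges are assumed).
const IMAGE_TRANSITIONS = {
  pending: ['generating', 'failed'],
  generating: ['temp_ready', 'failed'],
  temp_ready: ['committed', 'failed'],
  committed: [],
  failed: [],
};

// Returns a new record in the next state, or throws on an illegal move.
function transitionImage(record, next) {
  const allowed = IMAGE_TRANSITIONS[record.state] || [];
  if (!allowed.includes(next)) {
    throw new Error(`Illegal image transition: ${record.state} -> ${next}`);
  }
  return { ...record, state: next };
}
```

Rejecting illegal moves makes problems such as committing a variant whose temp file was never written show up as an explicit error instead of a missing file later.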
P2: Settings Defaults And Model Labels Are Inconsistent
Defaults differ across activation, settings V2, OpenRouter provider, settings fallback, and UI copy.
Examples:
- Activation uses execution_model but current code uses writing_model.
- Activation default planning model is google/gemini-2.0-flash-exp, while settings/provider defaults use google/gemini-2.5-flash.
- Refinement defaults vary between Haiku and Sonnet.
Impact:
- Fresh install, upgraded install, and settings save can select different models.
- Model bugs are hard to reproduce because initial state depends on install path.
Recommendation:
- Create a single model preset registry in PHP and expose it to JS.
- Run one migration that maps execution_model to writing_model and removes stale defaults.
- Add "current saved model is unavailable" UI with fallback choice.
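The settings migration might be shaped like the sketch below. It is JavaScript for illustration; migrateModelSettings is a hypothetical helper, and the preset and stale-default lists are passed in because the report does not fix a canonical value for every task.

```javascript
// saved:         settings as stored today
// presets:       the single registry of defaults (key -> model ID)
// staleDefaults: old default model IDs that should be replaced
function migrateModelSettings(saved, presets, staleDefaults) {
  const next = { ...saved };

  // Rename the legacy key, keeping an explicit writing_model if one exists.
  if (next.execution_model !== undefined) {
    if (next.writing_model === undefined) next.writing_model = next.execution_model;
    delete next.execution_model;
  }

  // Fill gaps and replace values that are only there as an outdated default.
  for (const [key, preset] of Object.entries(presets)) {
    if (next[key] === undefined || staleDefaults.includes(next[key])) {
      next[key] = preset;
    }
  }
  return next;
}
```

Running it twice yields the same result, so it is safe to call from both activation and upgrade paths.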
P2: Debug Logging Is Too Noisy For Production
Several error_log() and console.log() calls are unconditional or reveal request behavior and settings.
Examples:
- Asset enqueue logs at includes/class-gutenberg-sidebar.php:73-74.
- Provider routing logs at includes/class-provider-manager.php:28.
- Streaming provider settings logs at includes/class-gutenberg-sidebar.php:3041-3042.
- Frontend session logs at assets/js/sidebar.js:5119-5130.
Impact:
- Logs can expose topics, model choices, local backend status, and partial AI responses.
- Debug noise hides real defects.
Recommendation:
- Add wpaw_debug_log() gated behind WP_DEBUG && SCRIPT_DEBUG or a plugin debug setting.
- Never log API keys, full prompts, full responses, or private drafts by default.
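A gated logger with redaction could be as small as the following JavaScript sketch for the frontend side. The debugLog name, the sink parameter, and the list of redacted keys are assumptions; a PHP wpaw_debug_log() would follow the same two rules (check the flag first, redact sensitive keys).

```javascript
// Keys whose values are never written to logs (assumed list).
const REDACTED_KEYS = ['api_key', 'prompt', 'response', 'content'];

// enabled: result of the debug gate; sink: where lines go, e.g. console.log.
function debugLog(enabled, sink, message, context = {}) {
  if (!enabled) return false; // no output at all in production

  const safe = {};
  for (const [key, value] of Object.entries(context)) {
    safe[key] = REDACTED_KEYS.includes(key) ? '[redacted]' : value;
  }
  sink(`[wpaw] ${message} ${JSON.stringify(safe)}`);
  return true;
}
```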
UI/UX Assessment
What Works
- The product concept is coherent: chat -> clarify -> plan -> write -> refine.
- Gutenberg-side integration is stronger than a typical "AI text box" plugin.
- @mentions and block toolbar actions are a strong foundation for an IDE-like writing workflow.
- The admin settings V2 layout gives a clearer mental model for model selection, local backend, cost analytics, and docs.
UX Gaps
- The sidebar has too many implicit modes. Users can be in chat, planning, writing, sessions list, welcome screen, empty writing state, cost tab, SEO tab, and clarification mode, but those states do not share a single state machine.
- "Writing mode" can behave like discussion-only in some paths, while actual writing requires a plan. This is easy to misunderstand.
- Context status is not transparent enough. Users cannot easily see "what the agent remembers", "which session is active", "which provider will run", or "what will be sent".
- Cost UI shows spend, but not clear preflight estimates or post-call reconciliation by provider.
- There is no review/accept/reject safety layer for high-impact article edits. Generated blocks can be inserted directly.
Recommended UX Direction
Replace mode ambiguity with a visible workflow state:
- Context: topic, keyword, language, audience, source material.
- Plan: outline draft, editable sections, approve plan.
- Write: section-by-section generation with pause/resume.
- Review: diff, SEO/GEO checks, image recommendations.
- Publish assist: metadata, schema, final checklist.
Each state should expose the active provider, cost estimate, context source, and next best action.
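The visible workflow state can be backed by an explicit state machine. The sketch below is JavaScript; the state names follow the workflow_state values suggested elsewhere in this report, while the allowed edges, the plan_approved flag, and the advance helper are assumptions for illustration.

```javascript
// Allowed moves between workflow states (edges are assumed).
const WORKFLOW = {
  context: ['planning'],
  planning: ['context', 'writing'],
  writing: ['planning', 'review'],
  review: ['writing', 'done'],
  done: [],
};

// Returns the updated session, or the unchanged session plus a reason.
function advance(session, next) {
  const allowed = WORKFLOW[session.workflow_state] || [];
  if (!allowed.includes(next)) {
    return {
      ok: false,
      reason: `Cannot move from ${session.workflow_state} to ${next}`,
      session,
    };
  }
  // Writing always needs an approved plan, which removes the
  // "writing mode that only discusses" ambiguity.
  if (next === 'writing' && !session.plan_approved) {
    return { ok: false, reason: 'Writing requires an approved plan', session };
  }
  return { ok: true, reason: null, session: { ...session, workflow_state: next } };
}
```

The reason string gives the sidebar something concrete to show as the next best action when a move is refused.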
System Architecture Assessment
Current Shape
flowchart TD
UI["assets/js/sidebar.js"]
Routes["class-gutenberg-sidebar.php"]
OR["OpenRouter Provider"]
Local["Local Backend Provider"]
Codex["Codex Provider"]
Cost["Cost Tracker"]
Meta["Post Meta"]
Conv["wpaw_conversations"]
Images["Image Manager"]
UI --> Routes
Routes --> OR
Routes --> Local
Routes --> Codex
Routes --> Cost
Routes --> Meta
Routes --> Conv
Routes --> Images
UI --> Conv
UI --> Meta
The core issue is that both UI and backend understand too much about everything. The architecture needs boundaries more than it needs new features.
Target Shape
flowchart TD
UI["Sidebar UI"]
REST["REST Controllers"]
Workflow["Workflow Service"]
Context["Context Service"]
Provider["Provider Gateway"]
Cost["Cost Policy + Ledger"]
Store["Conversation + Post State Store"]
UI --> REST
REST --> Workflow
Workflow --> Context
Workflow --> Provider
Workflow --> Cost
Context --> Store
Cost --> Store
The important change is that every generation path asks the same Context_Service for context and the same Provider_Gateway for provider execution. That gives you one place to fix context bugs and one place to fix provider/cost bugs.
Context And History Audit
Current context layers:
- Frontend React state: immediate but volatile.
- localStorage: agent mode only.
- Post meta: _wpaw_chat_history, _wpaw_plan, _wpaw_memory, _wpaw_post_config, _wpaw_detected_language, writing state.
Key gaps:
- Session context field exists but frontend mostly saves messages, not a normalized workflow context.
- Post-linked and uncompleted sessions are mixed into the same UI without a clear transition.
- Auto-save of every messages array can overwrite richer backend state with stale frontend state.
- There is no schema/version for message objects, so plan cards, timeline entries, assistant messages, and system info live in the same array.
Recommended contract:
{
"session_id": "uuid",
"post_id": 123,
"workflow_state": "context|planning|writing|review|done",
"messages": [],
"context_summary": "",
"plan_id": "uuid",
"active_provider": "openrouter|local_backend|codex",
"cost_session_id": "uuid",
"updated_at": "datetime"
}
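With updated_at in the contract, the stale-overwrite gap noted above could be closed by an optimistic concurrency check on save. The JavaScript sketch below assumes the client echoes back the updated_at it last loaded; saveMessages, base_updated_at, and the now field are hypothetical names, not part of the existing endpoints.

```javascript
// stored:   the session as persisted on the server
// incoming: { messages, base_updated_at, now } sent by the client, where
//           base_updated_at is the updated_at value the client last saw
function saveMessages(stored, incoming) {
  if (incoming.base_updated_at !== stored.updated_at) {
    // The server copy changed since the client loaded it: refuse the write
    // and return the current state so the client can merge or reload.
    return { saved: false, conflict: true, session: stored };
  }
  return {
    saved: true,
    conflict: false,
    session: { ...stored, messages: incoming.messages, updated_at: incoming.now },
  };
}
```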
Cost Tracking Audit
Current strengths:
- Central cost hook exists.
- Sidebar and settings cost views exist.
- Cost log grouping by post is useful.
Current gaps:
- No session ID in cost records.
- No provider column.
- No request status or error records.
- No distinction between estimated and actual cost.
- No hard budget stop.
- Disabled tracking does not stop backend inserts.
- Local backend and Codex cost semantics differ from OpenRouter but share the same table model.
Recommended table changes:
- provider
- session_id
- request_id
- status
- estimated_cost
- actual_cost
- currency
- metadata_json
Models And Provider Audit
Current strengths:
- Per-task model selection is directionally right.
- OpenRouter model refresh exists.
- Custom models can be added.
- Provider routing supports OpenRouter, local backend, and Codex.
Current gaps:
- Model cache bug is production-blocking.
- Provider fallback is silent.
- Codex provider uses older Chat Completions assumptions and hardcoded stale pricing.
- Local backend test runs an inference call, which may be unexpectedly slow/costly for a "test connection".
- Image model selection trusts OpenRouter modalities but custom models bypass capability validation.
Recommended provider contract:
ProviderResult {
provider: string,
model: string,
content: string,
usage: Usage,
cost: Cost,
capabilities: string[],
warnings: string[]
}
Test And Verification Gaps
Checks run during this audit:
- php -l wp-agentic-writer.php
- php -l includes/class-gutenberg-sidebar.php
- php -l includes/class-settings-v2.php
- php -l includes/class-openrouter-provider.php
- php -l includes/class-image-manager.php
- php -l includes/class-conversation-migration.php
- node --check assets/js/sidebar.js
- node --check assets/js/settings-v2.js
- node --check assets/js/sidebar-utils.js
- node --check assets/js/block-refine.js
- node --check assets/js/block-image-generate.js
All checked files passed syntax checks.
Missing test coverage:
- Activation/migration tests for clean install and upgrade.
- REST permission tests for conversations and post config.
- Provider model-cache regression tests.
- Context assembly snapshots per mode.
- Streaming parser tests for OpenRouter, local backend, and Codex.
- Cost ledger tests with tracking disabled, zero-cost local calls, and failed requests.
- Gutenberg e2e tests for chat -> plan -> write -> refresh -> resume.
Stabilization Roadmap
Phase 1: Stop Runtime Breakage
- Fix PHP 7.4 compatibility or raise PHP requirement.
- Fix OpenRouter model cache shape conflict.
- Wire conversation migrations correctly.
- Add ownership checks on all conversation endpoints.
- Gate debug logging.
Phase 2: Stabilize State
- Declare one source of truth for conversation messages.
- Create a context service used by all generation paths.
- Migrate legacy post meta chat history into sessions.
- Make clear context/session/post behavior explicit.
- Add workflow state to session context.
Phase 3: Stabilize Cost And Provider Behavior
- Add provider metadata to all AI responses.
- Make provider fallback explicit.
- Add budget preflight and optional hard limit.
- Expand cost table with provider/session/request fields.
- Track image and failed request costs consistently.
Phase 4: Reduce Blast Radius
- Split class-gutenberg-sidebar.php into controllers and services.
- Add REST schemas and shared request validators.
- Build integration tests around the main workflows.
- Add a small internal fixture suite for model/provider responses.
- Remove backup files and duplicate settings/documentation paths after confirming they are unused.
Highest Leverage Opportunities
- Make the plugin feel safer: add preview/diff/accept/reject for refinements and article-wide edits.
- Make the agent feel smarter: show "current context" and let users edit what the agent remembers.
- Make costs trustworthy: show preflight estimate, actual cost, provider, and model after every operation.
- Make local backend trustworthy: no silent cloud fallback unless the user explicitly opts in.
- Make model selection resilient: capability badges, availability checks, and clear fallbacks.
- Make the codebase easier to evolve: services plus tests around the workflows that matter.
Suggested Definition Of Done For Future Fixes
For any feature or bug fix touching chat, planning, writing, refinement, context, provider, or cost:
- It must state which storage layer is authoritative.
- It must include the provider/model actually used in the response.
- It must update or preserve cost records intentionally.
- It must pass at least one workflow test from chat to final editor state.
- It must not add another source of truth for the same state.
This is the guardrail that prevents losing A while fixing B.