Yellow Bank Soal Perfection Tasklist
Date: 2026-04-29
Purpose: hands-off development guide for hardening the system, improving correctness, and polishing the admin/user experience.
1. Security and Auth
- Add centralized authentication dependencies for student, website admin, and system admin roles.
- Replace raw
X-Website-IDtrust with token-derived website access. - Require authorization on reports, tryout configuration updates, imports, calibration, and session endpoints.
- Add session ownership checks using verified WordPress identity.
- Add rate limiting for admin login, AI generation, imports, and WordPress verification.
- Add admin login rate limiting (IP-based, Redis-backed attempt window).
- Add CSRF tokens to all admin POST forms.
- Mark admin session cookies `secure` in production.
- Fail production startup when default or empty secrets are used.
- Add tests proving cross-website access is blocked.
- Add token integrity tests (issue/decode, tamper rejection, expiry rejection).
2. Session Integrity
- Verify `submit_answer` item belongs to the session's `website_id` and `tryout_id`.
- Prevent answer submission for items not issued by `next_item`.
- Stop returning `correct_answer` during live adaptive sessions.
- Decide whether explanations should be shown only after completion or never during an active session.
- Add duplicate-answer validation before DB commit.
- Make repeated submissions return `409 Conflict` instead of DB errors.
- Validate or auto-create WordPress users before creating sessions.
- Add tests for invalid item IDs, foreign-tryout items, repeated answers, and completed sessions.
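
The checks in this section can be sketched as a single validation step that runs before any database write. The `Session`, `Item`, and `ApiError` types below are hypothetical stand-ins for the real models. The status codes chosen for a foreign item (404) and for a never-issued item (400) are assumptions; only the `409 Conflict` for repeated answers comes from the list above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


class ApiError(Exception):
    """Carries the HTTP status a router would return (hypothetical helper)."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(detail)
        self.status = status
        self.detail = detail


@dataclass(frozen=True)
class Item:
    id: int
    website_id: int
    tryout_id: int


@dataclass
class Session:
    website_id: int
    tryout_id: int
    completed: bool = False
    issued_item_ids: Set[int] = field(default_factory=set)    # filled by next_item
    answered_item_ids: Set[int] = field(default_factory=set)  # filled on submit


def validate_submission(session: Session, item: Optional[Item]) -> None:
    """Reject a submission before it reaches the database."""
    if item is None:
        raise ApiError(404, "item not found")
    if session.completed:
        raise ApiError(409, "session already completed")
    # Items from another website or tryout are reported as missing so the
    # response does not reveal that they exist elsewhere.
    if (item.website_id, item.tryout_id) != (session.website_id, session.tryout_id):
        raise ApiError(404, "item not found")
    if item.id not in session.issued_item_ids:
        raise ApiError(400, "item was not issued by next_item")
    if item.id in session.answered_item_ids:
        raise ApiError(409, "item already answered")


def submit_answer(session: Session, item: Optional[Item]) -> None:
    validate_submission(session, item)
    session.answered_item_ids.add(item.id)
```

A unique constraint on the answer table is still worth keeping as a backstop for concurrent requests, since an in-process check alone cannot rule out a race between two submissions.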
3. Scoring Correctness
- Revisit CTT `total_bobot_max` logic so earned and max weights use the same item set.
- Define scoring behavior for mixed-level tryouts.
- Confirm whether fixed tryouts should require every item to be answered before completion.
- Add tests for all-correct, all-wrong, partial, mixed-level, missing-bobot, and duplicate-answer cases.
- Add regression tests for static, dynamic, and hybrid normalization switching.
- Confirm NM, NN, theta, and report formulas against PRD examples.
- Add explicit handling for zero/near-zero standard deviation in reporting and normalization.
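
Two of the items above lend themselves to a short sketch: computing earned and maximum weight in one pass over the same item set, and guarding against a zero or near-zero standard deviation. The function names, raising on a missing bobot, and returning `0.0` for a degenerate distribution are assumptions made for illustration. The actual NM and NN formulas must come from the PRD and are deliberately not reproduced here.

```python
from typing import Dict, Optional, Tuple


def ctt_weights(
    item_bobot: Dict[int, Optional[float]],
    answers: Dict[int, bool],
) -> Tuple[float, float]:
    """Return (earned, max) weight over one shared item set.

    item_bobot maps every item in the scoring set to its weight.
    answers maps item id to whether the answer was correct.
    Answers for items outside the scoring set are ignored, so earned
    can never exceed max.
    """
    earned = 0.0
    max_total = 0.0
    for item_id, bobot in item_bobot.items():
        if bobot is None:
            # Assumption: a missing weight is a data error, not a zero.
            raise ValueError(f"item {item_id} has no bobot")
        max_total += bobot
        if answers.get(item_id, False):
            earned += bobot
    return earned, max_total


def safe_z_score(value: float, mean: float, sd: float, eps: float = 1e-9) -> float:
    """Standard score with an explicit guard for degenerate distributions."""
    if sd < eps:
        # Assumption: when every score is (nearly) identical, treat the
        # value as sitting exactly at the mean instead of dividing by ~0.
        return 0.0
    return (value - mean) / sd
```

The same fixtures can drive the all-correct, all-wrong, partial, and missing-bobot test cases listed above.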
4. Database and Migrations
- Resolve model/migration drift for item uniqueness indexes.
- Decide whether items are unique by `(website_id, tryout_id, slot)` or `(website_id, tryout_id, slot, level)`.
- Align Excel import duplicate detection with the final uniqueness rule.
- Remove production `create_all` startup behavior or gate it to development only.
- Add migration smoke tests for fresh database upgrade to head.
- Add DB constraint tests for FK failures and uniqueness conflicts.
- Create seed/dev fixtures for websites, users, tryouts, items, and sessions.
- Document migration rollback expectations.
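
As an illustration of the constraint tests above, the sketch below builds a throwaway schema and shows how a uniqueness conflict and a foreign-key failure surface. It assumes the four-column rule `(website_id, tryout_id, slot, level)` purely for demonstration, since that decision is still open. It uses in-memory SQLite only to stay self-contained, whereas the production database is PostgreSQL. Table layout, index name, and level values are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE tryouts (
    id INTEGER PRIMARY KEY,
    website_id INTEGER NOT NULL
);
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    website_id INTEGER NOT NULL,
    tryout_id INTEGER NOT NULL REFERENCES tryouts(id),
    slot INTEGER NOT NULL,
    level TEXT NOT NULL
);
CREATE UNIQUE INDEX uq_items_slot_level
    ON items (website_id, tryout_id, slot, level);
"""


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def insert_item(conn, website_id, tryout_id, slot, level) -> bool:
    """Return True on success, False when a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO items (website_id, tryout_id, slot, level) "
            "VALUES (?, ?, ?, ?)",
            (website_id, tryout_id, slot, level),
        )
    except sqlite3.IntegrityError:
        return False
    return True


conn = make_db()
conn.execute("INSERT INTO tryouts (id, website_id) VALUES (1, 1)")
first = insert_item(conn, 1, 1, 1, "easy")        # accepted
duplicate = insert_item(conn, 1, 1, 1, "easy")    # same slot and level
other_level = insert_item(conn, 1, 1, 1, "hard")  # same slot, new level
orphan = insert_item(conn, 1, 99, 1, "easy")      # tryout 99 does not exist
```

Under the three-column rule the `other_level` insert would also be rejected, which is exactly the behavior the Excel import's duplicate detection needs to mirror.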
5. API Reliability
- Standardize error response shape across routers.
- Convert expected DB constraint failures into clear `400`, `404`, or `409` responses.
- Add request size limits for Excel and JSON imports.
- Add structured logging with request IDs.
- Add health checks that distinguish DB, Redis, WordPress, and OpenRouter status.
- Add OpenAPI examples for core workflows.
- Add pagination to list/report endpoints that can grow large.
- Add timeout and retry policy for external service calls.
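
One possible shape for the standardized error response, with expected failures mapped to `400`, `404`, or `409` and everything else collapsed to a generic `500`, is sketched below. The exception classes, the `error` envelope, and its field names are assumptions for illustration; the routers may already have a convention that should take precedence.

```python
import uuid
from typing import Optional, Tuple


class InvalidInput(Exception):
    """The request was understood but its data is not acceptable."""


class NotFound(Exception):
    """The referenced record does not exist for this tenant."""


class Conflict(Exception):
    """The request collides with existing state, e.g. a duplicate answer."""


# Hypothetical mapping from domain errors to (status, machine-readable code).
ERROR_MAP = {
    InvalidInput: (400, "invalid_input"),
    NotFound: (404, "not_found"),
    Conflict: (409, "conflict"),
}


def to_error_response(
    exc: Exception, request_id: Optional[str] = None
) -> Tuple[int, dict]:
    """Translate an exception into (status, body) with one fixed shape."""
    status, code, message = 500, "internal_error", "Unexpected server error."
    for exc_type, (mapped_status, mapped_code) in ERROR_MAP.items():
        if isinstance(exc, exc_type):
            # Only expected errors expose their message to the client.
            status, code, message = mapped_status, mapped_code, str(exc)
            break
    body = {
        "error": {
            "code": code,
            "message": message,
            "request_id": request_id or uuid.uuid4().hex,
        }
    }
    return status, body
```

Including the request ID in every error body ties the response back to the structured logs, so a user-reported failure can be traced without guessing at timestamps.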
6. Import and Export
- Validate website existence before Excel preview and import.
- Validate tryout existence before Excel question import.
- Add downloadable validation error reports.
- Add import preview diff for new records, skipped duplicates, and updates.
- Clean up generated export temp files after response lifecycle.
- Add tests for malformed Excel, duplicate slots, invalid p-values, invalid bobot values, and missing tryout.
- Add tests for JSON snapshot import edge cases.
- Add file size/type hardening beyond extension checks.
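
For the file hardening item, one option beyond extension checks is to confirm that an upload really is an `.xlsx` package: a ZIP archive containing `[Content_Types].xml` and `xl/workbook.xml`. The sketch below uses only the standard library. The function name and the 5 MiB limit are hypothetical, and it checks container structure only, not spreadsheet contents.

```python
import io
import zipfile
from typing import List

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical limit, tune per deployment

REQUIRED_MEMBERS = ("[Content_Types].xml", "xl/workbook.xml")


def check_xlsx_upload(
    filename: str, data: bytes, max_bytes: int = MAX_UPLOAD_BYTES
) -> List[str]:
    """Return a list of validation errors; an empty list means accepted."""
    errors: List[str] = []
    if not filename.lower().endswith(".xlsx"):
        errors.append("file extension must be .xlsx")
    if len(data) > max_bytes:
        errors.append(f"file exceeds {max_bytes} bytes")
        return errors  # do not parse oversized uploads
    if not zipfile.is_zipfile(io.BytesIO(data)):
        errors.append("file is not a ZIP container")
        return errors
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        errors.append("ZIP container is corrupt")
        return errors
    for member in REQUIRED_MEMBERS:
        if member not in names:
            errors.append(f"missing required part: {member}")
    return errors
```

Returning a list of messages rather than raising on the first problem fits the downloadable validation report described above, since every issue can be shown to the admin at once.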
7. Reporting
- Persist report schedules in the database instead of process memory.
- Add real scheduler/worker execution for scheduled reports.
- Add email delivery or remove recipient fields until delivery is implemented.
- Add report permission checks.
- Add tests for empty reports, partial data, and multi-tryout comparisons.
- Add pagination/export limits for large report datasets.
- Verify `avg_nn`, pass rate, medians, and standard deviations against fixture data.
- Add user-facing messages when report data is incomplete.
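
A fixture-based check of the report statistics could start from a small reference implementation like the one below, which the real report queries can then be compared against. The field names other than `avg_nn`, the use of population standard deviation, and treating a score equal to the pass mark as a pass are all assumptions that need confirming against the PRD.

```python
from statistics import fmean, median, pstdev
from typing import Dict, Optional, Sequence


def summarize_scores(
    nn_scores: Sequence[float], pass_mark: float
) -> Dict[str, Optional[float]]:
    """Reference statistics for one tryout's NN scores."""
    if not nn_scores:
        # No data: return explicit nulls and a flag the UI can turn
        # into a user-facing "report data is incomplete" message.
        return {
            "count": 0,
            "avg_nn": None,
            "median_nn": None,
            "std_nn": None,
            "pass_rate": None,
            "incomplete": True,
        }
    passed = sum(1 for score in nn_scores if score >= pass_mark)
    return {
        "count": len(nn_scores),
        "avg_nn": fmean(nn_scores),
        "median_nn": median(nn_scores),
        "std_nn": pstdev(nn_scores),  # population SD; an assumption
        "pass_rate": passed / len(nn_scores),
        "incomplete": False,
    }


fixture = summarize_scores([400.0, 500.0, 600.0], pass_mark=500.0)
```

For the three-score fixture the mean and median are both 500, two of three scores pass, and the population standard deviation is the square root of 20000/3, which gives a hand-checkable expected value for the test.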
8. Admin UI and UX
- Add responsive mobile/tablet layout.
- Add active navigation state and breadcrumbs.
- Add pagination, sorting, and search to admin tables.
- Replace destructive browser confirms with safer confirmation modals.
- Add inline validation and success/error banners that persist after redirects.
- Add import progress indicators and clearer preview screens.
- Add empty states with recommended next actions.
- Improve visual hierarchy for dashboard stats and high-risk actions.
- Add accessibility pass: labels, focus states, contrast, keyboard navigation.
9. Testing and Tooling
- Add `pyproject.toml` or `pytest.ini` with test config.
- Add pinned dependency lock workflow.
- Add `make test`, `make lint`, `make migrate`, and `make dev` commands.
- Add CI for lint, tests, mapper config, Alembic upgrade, and import smoke tests.
- Add integration tests using a test database.
- Add auth boundary tests for every tenant-scoped endpoint.
- Add regression tests for previously found defects.
- Document the canonical local setup path.
10. Production Readiness
- Validate required secrets in production startup.
- Document deployment environment variables.
- Add backup and restore guidance for PostgreSQL.
- Add observability: logs, metrics, traces, and error monitoring.
- Add operational runbooks for import failures, calibration failures, WordPress API outages, and AI provider outages.
- Add Redis availability checks when admin or background jobs are enabled.
- Add deployment checklist for migrations, admin credentials, CORS, HTTPS, and rollback.
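
The secret validation item can be sketched as a check that runs once at startup and refuses to continue in production when a required secret is empty or still set to a known default. The environment variable names `APP_ENV`, `SECRET_KEY`, and `ADMIN_SESSION_SECRET`, and the list of insecure defaults, are hypothetical and should be replaced with the names documented for deployment.

```python
import os
from typing import Mapping

# Hypothetical names; align with the documented environment variables.
REQUIRED_SECRETS = ("SECRET_KEY", "ADMIN_SESSION_SECRET")
INSECURE_DEFAULTS = {"", "changeme", "secret", "dev-secret"}


def validate_production_secrets(env: Mapping[str, str]) -> None:
    """Raise RuntimeError if production would start with weak secrets."""
    if env.get("APP_ENV", "development") != "production":
        return  # development and test environments may use defaults
    bad = [
        name
        for name in REQUIRED_SECRETS
        if env.get(name, "").strip().lower() in INSECURE_DEFAULTS
    ]
    if bad:
        raise RuntimeError(
            "Refusing to start in production with default or empty secrets: "
            + ", ".join(bad)
        )


if __name__ == "__main__":
    # Typical call site: before the application object is created.
    validate_production_secrets(os.environ)
```

Taking the environment as a parameter rather than reading `os.environ` inside the function keeps the check easy to unit test with plain dictionaries.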
Suggested Execution Order
- Security and auth hardening.
- Session integrity and scoring correctness.
- Database/migration alignment.
- Test and tooling foundation.
- Import/export and reporting reliability.
- Admin UI/UX polish.
- Production readiness and operations.
Definition of Perfect Enough
- Every tenant-scoped endpoint has an authorization test.
- Every scoring path has deterministic fixture tests.
- Fresh database migration to head succeeds in CI.
- Admin destructive actions are CSRF-protected.
- Live sessions cannot reveal answers before completion.
- Imports fail safely with actionable validation output.
- Reports are reproducible, permissioned, and persisted where scheduled.
- The app can be installed, tested, migrated, and run from documented commands.