# Yellow Bank Soal Perfection Tasklist

Date: 2026-04-29

Purpose: hands-off development guide for hardening the system, improving correctness, and polishing the admin/user experience.

## 1. Security and Auth

- [x] Add centralized authentication dependencies for student, website admin, and system admin roles.
- [x] Replace raw `X-Website-ID` trust with token-derived website access.
- [x] Require authorization on reports, tryout configuration updates, imports, calibration, and session endpoints.
- [x] Add session ownership checks using verified WordPress identity.
- [x] Add rate limiting for admin login, AI generation, imports, and WordPress verification.
- [x] Add admin login rate limiting (IP-based, Redis-backed attempt window).
- [x] Add CSRF tokens to all admin POST forms.
- [x] Mark admin session cookies `secure` in production.
- [x] Fail production startup when default or empty secrets are used.
- [x] Add tests proving cross-website access is blocked.
- [x] Add token integrity tests (issue/decode, tamper rejection, expiry rejection).

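
The token integrity checks listed above (issue/decode, tamper rejection, expiry rejection) can be illustrated with a minimal sketch. It assumes an HMAC-signed JSON payload built from the standard library only; the project's real token format is not described here and may well be JWT or something else. The names `issue_token`, `decode_token`, `TokenError`, and the claim keys `uid`, `wid`, `exp` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


class TokenError(Exception):
    """Raised when a token is malformed, tampered with, or expired."""


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_token(secret: bytes, user_id: int, website_id: int,
                ttl_seconds: int, now: Optional[float] = None) -> str:
    """Sign a payload carrying the website the caller may access."""
    issued_at = time.time() if now is None else now
    payload = json.dumps(
        {"uid": user_id, "wid": website_id, "exp": issued_at + ttl_seconds},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64(payload)}.{_b64(signature)}"


def decode_token(secret: bytes, token: str, now: Optional[float] = None) -> dict:
    """Return the claims, or raise TokenError on tampering or expiry."""
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError as exc:  # wrong part count or invalid base64
        raise TokenError("malformed token") from exc
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(signature, expected):
        raise TokenError("signature mismatch")
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if claims["exp"] <= current:
        raise TokenError("token expired")
    return claims
```

With this shape, an endpoint would read the website from `claims["wid"]` rather than from a client-supplied `X-Website-ID` header, so a caller cannot switch tenants without invalidating the signature.
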
## 2. Session Integrity

- [ ] Verify `submit_answer` item belongs to the session's `website_id` and `tryout_id`.
- [ ] Prevent answer submission for items not issued by `next_item`.
- [ ] Stop returning `correct_answer` during live adaptive sessions.
- [ ] Decide whether explanations should be shown only after completion or never during an active session.
- [ ] Add duplicate-answer validation before DB commit.
- [ ] Make repeated submissions return `409 Conflict` instead of DB errors.
- [ ] Validate or auto-create WordPress users before creating sessions.
- [ ] Add tests for invalid item IDs, foreign-tryout items, repeated answers, and completed sessions.

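
The validation order implied by the items above could look roughly like the following in-memory sketch. The `Session`, `Item`, and `SessionError` types are hypothetical stand-ins for the real models, and the status codes other than `409` for repeated submissions are assumptions, not decisions recorded in this tasklist.

```python
from dataclasses import dataclass, field


class SessionError(Exception):
    """Carries the HTTP status an API layer could translate into a response."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(detail)
        self.status = status
        self.detail = detail


@dataclass(frozen=True)
class Item:
    item_id: int
    website_id: int
    tryout_id: int


@dataclass
class Session:
    website_id: int
    tryout_id: int
    completed: bool = False
    issued: set = field(default_factory=set)     # item IDs handed out by next_item
    answers: dict = field(default_factory=dict)  # item_id -> submitted answer


def submit_answer(session: Session, item: Item, answer: str) -> dict:
    """Validate before touching storage so the DB never sees a bad write."""
    if session.completed:
        raise SessionError(409, "session already completed")
    if (item.website_id, item.tryout_id) != (session.website_id, session.tryout_id):
        raise SessionError(404, "item not found in this tryout")
    if item.item_id not in session.issued:
        raise SessionError(400, "item was not issued to this session")
    if item.item_id in session.answers:
        raise SessionError(409, "answer already submitted")
    session.answers[item.item_id] = answer
    # The response deliberately omits correct_answer while the session is live.
    return {"item_id": item.item_id, "accepted": True}
```

Checking for duplicates in application code does not replace the database uniqueness constraint; under concurrent requests the constraint is still the last line of defence, and its violation should map to the same `409`.
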
## 3. Scoring Correctness

- [ ] Revisit CTT `total_bobot_max` logic so earned and max weights use the same item set.
- [ ] Define scoring behavior for mixed-level tryouts.
- [ ] Confirm whether fixed tryouts should require every item to be answered before completion.
- [ ] Add tests for all-correct, all-wrong, partial, mixed-level, missing-bobot, and duplicate-answer cases.
- [ ] Add regression tests for static, dynamic, and hybrid normalization switching.
- [ ] Confirm NM, NN, theta, and report formulas against PRD examples.
- [ ] Add explicit handling for zero/near-zero standard deviation in reporting and normalization.

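
Two of the items above, keeping earned and maximum weights on the same item set and guarding against zero standard deviation, can be sketched as follows. This is not the PRD's NM or NN formula, which is not reproduced in this tasklist. `weighted_ratio` and `safe_z_score` are hypothetical helpers, and returning `0.0` when the spread is degenerate is an assumption that the team would need to confirm.

```python
import math
from typing import Mapping


def weighted_ratio(bobot_by_item: Mapping[int, float],
                   correct_by_item: Mapping[int, bool]) -> float:
    """Earned weight divided by max weight, both taken over bobot_by_item."""
    unknown = set(correct_by_item) - set(bobot_by_item)
    if unknown:
        raise ValueError(f"answers reference items outside the scored set: {sorted(unknown)}")
    total_bobot_max = sum(bobot_by_item.values())
    if total_bobot_max <= 0:
        raise ValueError("scored item set has no positive weight")
    # Unanswered items stay in the denominator and contribute zero earned weight.
    earned = sum(
        weight
        for item_id, weight in bobot_by_item.items()
        if correct_by_item.get(item_id, False)
    )
    return earned / total_bobot_max


def safe_z_score(value: float, mean: float, std: float, eps: float = 1e-9) -> float:
    """Standardize a value, falling back to 0.0 when std is zero, tiny, or not finite."""
    if not math.isfinite(std) or std < eps:
        return 0.0
    return (value - mean) / std
```

Passing the scored item set in explicitly makes the open question visible: whichever set is chosen (every item in the tryout, or only the items issued to the session), numerator and denominator cannot drift apart.
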
## 4. Database and Migrations

- [ ] Resolve model/migration drift for item uniqueness indexes.
- [ ] Decide whether items are unique by `(website_id, tryout_id, slot)` or `(website_id, tryout_id, slot, level)`.
- [ ] Align Excel import duplicate detection with the final uniqueness rule.
- [ ] Remove production `create_all` startup behavior or gate it to development only.
- [ ] Add migration smoke tests for fresh database upgrade to head.
- [ ] Add DB constraint tests for FK failures and uniqueness conflicts.
- [ ] Create seed/dev fixtures for websites, users, tryouts, items, and sessions.
- [ ] Document migration rollback expectations.

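
To make the uniqueness decision concrete, the sketch below builds the `items` table both ways and shows which inserts each rule rejects. It uses an in-memory SQLite database from the standard library purely as a stand-in; the production database is PostgreSQL, and the table layout, column types, and level values here are assumptions.

```python
import sqlite3


def make_db(with_level: bool) -> sqlite3.Connection:
    """Create a throwaway schema with one of the two candidate uniqueness rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.execute("CREATE TABLE tryouts (id INTEGER PRIMARY KEY, website_id INTEGER NOT NULL)")
    unique_cols = "website_id, tryout_id, slot" + (", level" if with_level else "")
    conn.execute(
        f"""
        CREATE TABLE items (
            id INTEGER PRIMARY KEY,
            website_id INTEGER NOT NULL,
            tryout_id INTEGER NOT NULL REFERENCES tryouts(id),
            slot INTEGER NOT NULL,
            level TEXT NOT NULL,
            UNIQUE ({unique_cols})
        )
        """
    )
    conn.execute("INSERT INTO tryouts (id, website_id) VALUES (1, 1)")
    conn.commit()
    return conn


def try_insert(conn: sqlite3.Connection, tryout_id: int, slot: int, level: str) -> bool:
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO items (website_id, tryout_id, slot, level) VALUES (1, ?, ?, ?)",
                (tryout_id, slot, level),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Under the three-column rule a second level for the same slot is a conflict; under the four-column rule it is a new row. The Excel import's duplicate detection needs to agree with whichever behaviour is chosen.
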
## 5. API Reliability

- [ ] Standardize error response shape across routers.
- [ ] Convert expected DB constraint failures into clear `400`, `404`, or `409` responses.
- [ ] Add request size limits for Excel and JSON imports.
- [ ] Add structured logging with request IDs.
- [ ] Add health checks that distinguish DB, Redis, WordPress, and OpenRouter status.
- [ ] Add OpenAPI examples for core workflows.
- [ ] Add pagination to list/report endpoints that can grow large.
- [ ] Add timeout and retry policy for external service calls.

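
One possible shape for the retry half of the external-call policy is a small wrapper with exponential backoff, sketched below. The helper name `call_with_retry`, the default attempt count, the delays, and the set of retryable exceptions are all assumptions; the timeout itself would be set on whatever HTTP client the wrapped function uses.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying transient failures with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Only errors listed in `retry_on` are retried, so validation failures and other non-transient errors still fail fast. The injectable `sleep` keeps tests from waiting on real delays.
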
## 6. Import and Export

- [ ] Validate website existence before Excel preview and import.
- [ ] Validate tryout existence before Excel question import.
- [ ] Add downloadable validation error reports.
- [ ] Add import preview diff for new records, skipped duplicates, and updates.
- [ ] Clean up generated export temp files after response lifecycle.
- [ ] Add tests for malformed Excel, duplicate slots, invalid p-values, invalid bobot values, and missing tryout.
- [x] Add tests for JSON snapshot import edge cases.
- [ ] Add file size/type hardening beyond extension checks.

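
For the file size/type hardening item, a first pass could check the upload's size and container structure before handing it to a spreadsheet parser. The sketch below relies on the fact that an `.xlsx` file is a ZIP archive that typically contains `[Content_Types].xml` and `xl/workbook.xml`; the size limit and the function name `check_xlsx_upload` are hypothetical.

```python
import io
import zipfile

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical limit, tune per deployment
REQUIRED_MEMBERS = {"[Content_Types].xml", "xl/workbook.xml"}


def check_xlsx_upload(filename: str, data: bytes,
                      max_bytes: int = MAX_UPLOAD_BYTES) -> list:
    """Return a list of human-readable problems; an empty list means it passed."""
    errors = []
    if not filename.lower().endswith(".xlsx"):
        errors.append("file extension must be .xlsx")
    if len(data) > max_bytes:
        errors.append(f"file exceeds {max_bytes} bytes")
        return errors  # do not parse oversized uploads at all
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        errors.append("file is not a valid ZIP container")
        return errors
    missing = REQUIRED_MEMBERS - names
    if missing:
        errors.append("missing workbook parts: " + ", ".join(sorted(missing)))
    return errors
```

Returning a list of messages rather than raising on the first problem fits the downloadable validation report item, since every issue can be shown to the admin at once. This check does not guard against decompression bombs or malicious cell content, which would need separate limits.
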
## 7. Reporting

- [ ] Persist report schedules in the database instead of process memory.
- [ ] Add real scheduler/worker execution for scheduled reports.
- [ ] Add email delivery or remove recipient fields until delivery is implemented.
- [ ] Add report permission checks.
- [ ] Add tests for empty reports, partial data, and multi-tryout comparisons.
- [ ] Add pagination/export limits for large report datasets.
- [ ] Verify `avg_nn`, pass rate, medians, and standard deviations against fixture data.
- [ ] Add user-facing messages when report data is incomplete.

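
A reference implementation built on the standard `statistics` module can serve as the oracle when verifying report aggregates against fixtures. The sketch below is one such oracle under stated assumptions: the function name `summarize_nn` and the `pass_mark` parameter are hypothetical, the sample standard deviation is used (the report may intend the population form), and empty or single-score inputs yield `None` rather than an error.

```python
import statistics
from typing import Sequence


def summarize_nn(scores: Sequence[float], pass_mark: float) -> dict:
    """Aggregate NN scores, returning None for values that are undefined."""
    n = len(scores)
    if n == 0:
        return {"count": 0, "avg_nn": None, "median": None, "std": None, "pass_rate": None}
    return {
        "count": n,
        "avg_nn": statistics.fmean(scores),
        "median": statistics.median(scores),
        # Sample standard deviation needs at least two data points.
        "std": statistics.stdev(scores) if n > 1 else None,
        "pass_rate": sum(1 for score in scores if score >= pass_mark) / n,
    }
```

Explicit `None` values give the UI something to key on when showing a message that report data is incomplete, instead of displaying a misleading zero.
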
## 8. Admin UI and UX

- [ ] Add responsive mobile/tablet layout.
- [ ] Add active navigation state and breadcrumbs.
- [ ] Add pagination, sorting, and search to admin tables.
- [ ] Replace destructive browser confirms with safer confirmation modals.
- [ ] Add inline validation and success/error banners that persist after redirects.
- [ ] Add import progress indicators and clearer preview screens.
- [ ] Add empty states with recommended next actions.
- [ ] Improve visual hierarchy for dashboard stats and high-risk actions.
- [ ] Add accessibility pass: labels, focus states, contrast, keyboard navigation.

## 9. Testing and Tooling

- [ ] Add `pyproject.toml` or `pytest.ini` with test config.
- [ ] Add pinned dependency lock workflow.
- [ ] Add `make test`, `make lint`, `make migrate`, and `make dev` commands.
- [ ] Add CI for lint, tests, mapper config, Alembic upgrade, and import smoke tests.
- [ ] Add integration tests using a test database.
- [ ] Add auth boundary tests for every tenant-scoped endpoint.
- [x] Add regression tests for previously found defects.
- [ ] Document the canonical local setup path.

## 10. Production Readiness

- [ ] Validate required secrets in production startup.
- [ ] Document deployment environment variables.
- [ ] Add backup and restore guidance for PostgreSQL.
- [ ] Add observability: logs, metrics, traces, and error monitoring.
- [ ] Add operational runbooks for import failures, calibration failures, WordPress API outages, and AI provider outages.
- [ ] Add Redis availability checks when admin or background jobs are enabled.
- [ ] Add deployment checklist for migrations, admin credentials, CORS, HTTPS, and rollback.

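
The startup secret validation could be as small as the following sketch, run before the application begins serving. The variable names in `REQUIRED_SECRETS`, the `ENVIRONMENT` switch, and the placeholder values in `KNOWN_DEFAULTS` are all hypothetical; the real names belong in the deployment environment documentation.

```python
from typing import Mapping

REQUIRED_SECRETS = ("SECRET_KEY", "ADMIN_PASSWORD")      # hypothetical variable names
KNOWN_DEFAULTS = {"", "changeme", "secret", "password"}  # hypothetical placeholder values


def validate_secrets(env: Mapping[str, str]) -> None:
    """Raise RuntimeError in production if any required secret is unset or a default."""
    if env.get("ENVIRONMENT", "development") != "production":
        return  # development and test runs may use placeholder values
    bad = [
        name
        for name in REQUIRED_SECRETS
        if env.get(name, "").strip().lower() in KNOWN_DEFAULTS
    ]
    if bad:
        raise RuntimeError("refusing to start: unset or default secrets: " + ", ".join(bad))
```

Calling `validate_secrets(os.environ)` at the top of the startup path makes a misconfigured deployment fail immediately and visibly. The error message names the offending variables but never prints their values.
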
## Suggested Execution Order

1. Security and auth hardening.
2. Session integrity and scoring correctness.
3. Database/migration alignment.
4. Test and tooling foundation.
5. Import/export and reporting reliability.
6. Admin UI/UX polish.
7. Production readiness and operations.

## Definition of Perfect Enough

- [ ] Every tenant-scoped endpoint has an authorization test.
- [ ] Every scoring path has deterministic fixture tests.
- [ ] Fresh database migration to head succeeds in CI.
- [ ] Admin destructive actions are CSRF-protected.
- [ ] Live sessions cannot reveal answers before completion.
- [ ] Imports fail safely with actionable validation output.
- [ ] Reports are reproducible, permissioned, and persisted where scheduled.
- [ ] The app can be installed, tested, migrated, and run from documented commands.