# Yellow Bank Soal - Defect and Gap Audit Report Date: 2026-03-31 Auditor: Codex (GPT-5) Scope: Static code trace, dependency-aware runtime checks, targeted test execution ## 1) Executive Summary This audit identified critical reliability and security risks that should be addressed before production rollout: - P0: ORM relationship configuration is broken for key entities (`Tryout` to `Item`/`Session`/`TryoutStats`), causing mapper setup failure. - P0: Admin panel is mounted while authentication provider registration is disabled. - P1: Multiple SQL aggregation queries use invalid `func.cast(..., type_=func.INTEGER)` patterns that will fail at runtime. - P1: Normalization parameter retrieval uses incorrect scalar fetch semantics for multi-column queries. - P1: Reporting logic contains at least one mathematically invalid metric computation (`avg_nn`). Overall status: **Not production-ready** until P0 and P1 findings are remediated. ## 2) Methodology - Repository structure and architecture trace across routers, services, models, schemas, and admin modules. - Runtime checks after dependency install in a local virtual environment. - Targeted validation of SQLAlchemy mapper setup and SQL expression construction. - Existing test suite execution. ## 3) Verification Results - Dependency install: successful (`.venv`, requirements installed). - Test execution: `.venv/bin/pytest -q` -> `3 passed`. - Mapper validation: failed (`NoForeignKeysError`) when forcing mapper configuration. - SQLAlchemy expression validation: failed for `func.cast(..., type_=func.INTEGER)` usage. Note: passing tests do not indicate system correctness due to weak assertion coverage (see finding G-02). ## 4) Findings ## P0 Findings ### P0-01: Broken ORM relationship joins (mapper configuration failure) Severity: P0 (Critical) Category: Data model / Runtime stability Description: `Tryout` relationships to `Item`, `Session`, and `TryoutStats` are defined, but corresponding child fields (`tryout_id`) are plain strings without foreign key constraints or explicit `primaryjoin` definitions. SQLAlchemy cannot infer join conditions for relationship mapping. Evidence: - `app/models/item.py:67` and `app/models/item.py:169` - `app/models/session.py:79` and `app/models/session.py:160` - `app/models/tryout_stats.py:52` and `app/models/tryout_stats.py:120` - `app/models/tryout.py:162-170` Runtime proof: - `configure_mappers()` raises `sqlalchemy.exc.NoForeignKeysError` with message indicating no FK between `items` and `tryouts`. Impact: - Relationship loading may fail when mappings are configured. - Admin/data operations depending on relationship loading are unstable. - High risk of runtime failures in production paths. Recommendation: - Add proper FK model constraints (or explicit `primaryjoin` if intentional composite mapping). - Prefer canonical FK design: - `items.tryout_pk` -> `tryouts.id` - `sessions.tryout_pk` -> `tryouts.id` - `tryout_stats.tryout_pk` -> `tryouts.id` - Keep `tryout_id` business identifier as separate indexed field if needed. --- ### P0-02: Admin panel mounted with auth provider not registered Severity: P0 (Critical) Category: Security / Access control Description: Admin app is mounted in main app, but auth provider registration is commented out. Evidence: - `app/main.py:176` (admin mounted on `/admin`) - `app/admin.py:607` (`admin_app.settings.auth_provider = AdminAuthProvider()` commented) Impact: - Potential unauthorized access to administrative resources. - High security exposure depending on default behavior/deployment setup. Recommendation: - Enable and enforce auth provider before exposing `/admin`. - Add environment-aware hard gate to disable admin in production until auth is verified. - Add integration tests for admin route authentication and authorization. ## P1 Findings ### P1-01: Invalid SQLAlchemy cast pattern in aggregate queries Severity: P1 (High) Category: Query correctness / Runtime stability Description: Several queries use `func.cast(..., type_=func.INTEGER)`. This is not valid SQLAlchemy typing usage and fails during expression construction. Evidence: - `app/services/ctt_scoring.py:193` - `app/services/reporting.py:418` - `app/services/reporting.py:681` - `app/routers/tryouts.py:295` Impact: - Runtime failures in calibration, reporting, and CTT calculations. Recommendation: - Replace with canonical cast: - `from sqlalchemy import cast, Integer` - `func.sum(cast(UserAnswer.is_correct, Integer))` --- ### P1-02: Incorrect multi-column fetch in normalization service Severity: P1 (High) Category: Business logic / Runtime stability Description: Queries selecting `(Tryout.static_rataan, Tryout.static_sb)` use `scalar_one_or_none()` and then attempt tuple unpacking. `scalar_one_or_none()` returns a single scalar (first column), not both columns. Evidence: - `app/services/normalization.py:311`, `:318` - `app/services/normalization.py:355`, `:377` Impact: - Type/logic errors during normalization parameter retrieval. - Can break score normalization flows. Recommendation: - Use `one_or_none()` for tuple rows: - `row = result.one_or_none()` - `rataan, sb = row` --- ### P1-03: Invalid `avg_nn` computation in tryout comparison report Severity: P1 (High) Category: Reporting correctness Description: `avg_nn` is currently calculated as `stats.rataan + 500`, which does not match the normalization formula and is mathematically incorrect. Evidence: - `app/services/reporting.py:713` Impact: - Misleading management and educational performance reports. - Incorrect decisions from bad analytics output. Recommendation: - Compute average NN from session-level NN values directly. - If unavailable, explicitly return `None` and annotate report incompleteness. --- ### P1-04: Timestamp defaults use string literal `"NOW()"` instead of SQL function Severity: P1 (High) Category: Data integrity / Schema correctness Description: Model fields use `server_default="NOW()"`, which compiles to string literal default `'NOW()'`, not database function execution. Evidence: - Present in all models, e.g. `app/models/website.py:51`, `:56` Impact: - Incorrect or invalid default timestamps depending on DB behavior. - Inconsistent created/updated tracking. Recommendation: - Use SQLAlchemy expressions: - `server_default=func.now()` - `onupdate=func.now()` or DB trigger strategy. ## P2 Findings ### P2-01: Multi-tenant consistency gap in session creation endpoint Severity: P2 (Medium) Category: Authorization / Data isolation Description: Most routes enforce tenant context via `X-Website-ID` header, but session creation accepts `website_id` in request body without header-based tenant check. Evidence: - `app/routers/sessions.py:341-390` Impact: - Increased risk of cross-tenant write mistakes. Recommendation: - Enforce `X-Website-ID` across all write endpoints. - Reject payload tenant mismatch if header and body both exist. --- ### P2-02: Test suite has weak assertions Severity: P2 (Medium) Category: Test quality Description: Current tests are print-oriented and mostly non-assertive, allowing false confidence. Evidence: - `tests/test_normalization.py` uses print outputs and contains no robust asserts in core test blocks. Impact: - Regressions can pass CI undetected. Recommendation: - Rewrite tests with deterministic assertions. - Add coverage for: - mapper configuration - report calculations - normalization path selection - admin auth enforcement --- ### P2-03: Scheduled reports are in-memory only Severity: P2 (Medium) Category: Operational reliability Description: Scheduled report registry is process-local (`_scheduled_reports` dict), not persisted. Evidence: - `app/services/reporting.py:1344` Impact: - Schedules disappear on restart. - Inconsistent behavior in multi-process deployment. Recommendation: - Move scheduling metadata to persistent storage (DB/Redis + worker scheduler). --- ### P2-04: Migration gap Severity: P2 (Medium) Category: Delivery process Description: Alembic setup exists but no migration version files are present. Impact: - Schema evolution lacks reproducibility and deployment confidence. Recommendation: - Generate and commit baseline migration + incremental migration workflow. ## 5) Risk Register Snapshot - Security risk: high (admin auth gap). - Runtime stability risk: high (ORM mapping + invalid query casts). - Data correctness risk: high (reporting and timestamp defaults). - Delivery risk: medium (test quality and migration process). ## 6) Remediation Priority Plan Phase 1 (Immediate, block release): 1. Fix admin auth enforcement and gate `/admin`. 2. Repair ORM join/FK model design and validate mapper configuration. 3. Replace invalid cast expressions with valid SQLAlchemy casts. 4. Fix normalization tuple fetch bug. Phase 2 (Short-term): 1. Correct reporting formulas (`avg_nn` and related derivations). 2. Correct timestamp defaults across all models. 3. Enforce tenant header consistency on all write paths. Phase 3 (Hardening): 1. Introduce robust assertion-based tests with CI gate. 2. Persist scheduled report metadata. 3. Establish migration discipline (`alembic/versions` under source control). ## 7) Acceptance Criteria for Closure All of the following must be true: - Mapper configuration passes (`configure_mappers()` without errors). - All P0 and P1 findings resolved and validated by tests. - Admin endpoints require verified auth in target environment. - Query and report correctness covered by automated tests. - Migration scripts exist and apply cleanly from empty DB to latest.