Files
yellow-bank-soal/DEFECT_GAP_AUDIT_REPORT.md

9.5 KiB

Yellow Bank Soal - Defect and Gap Audit Report

Date: 2026-03-31
Auditor: Codex (GPT-5)
Scope: Static code trace, dependency-aware runtime checks, targeted test execution

1) Executive Summary

This audit identified critical reliability and security risks that should be addressed before production rollout:

  • P0: ORM relationship configuration is broken for key entities (Tryout to Item/Session/TryoutStats), causing mapper setup failure.
  • P0: Admin panel is mounted while authentication provider registration is disabled.
  • P1: Multiple SQL aggregation queries use invalid func.cast(..., type_=func.INTEGER) patterns that will fail at runtime.
  • P1: Normalization parameter retrieval uses incorrect scalar fetch semantics for multi-column queries.
  • P1: Reporting logic contains at least one mathematically invalid metric computation (avg_nn).

Overall status: Not production-ready until P0 and P1 findings are remediated.

2) Methodology

  • Repository structure and architecture trace across routers, services, models, schemas, and admin modules.
  • Runtime checks after dependency install in a local virtual environment.
  • Targeted validation of SQLAlchemy mapper setup and SQL expression construction.
  • Existing test suite execution.

3) Verification Results

  • Dependency install: successful (.venv, requirements installed).
  • Test execution: .venv/bin/pytest -q -> 3 passed.
  • Mapper validation: failed (NoForeignKeysError) when forcing mapper configuration.
  • SQLAlchemy expression validation: failed for func.cast(..., type_=func.INTEGER) usage.

Note: passing tests do not indicate system correctness due to weak assertion coverage (see finding G-02).

4) Findings

P0 Findings

P0-01: Broken ORM relationship joins (mapper configuration failure)

Severity: P0 (Critical)
Category: Data model / Runtime stability

Description:

Tryout relationships to Item, Session, and TryoutStats are defined, but corresponding child fields (tryout_id) are plain strings without foreign key constraints or explicit primaryjoin definitions. SQLAlchemy cannot infer join conditions for relationship mapping.

Evidence:

  • app/models/item.py:67 and app/models/item.py:169
  • app/models/session.py:79 and app/models/session.py:160
  • app/models/tryout_stats.py:52 and app/models/tryout_stats.py:120
  • app/models/tryout.py:162-170

Runtime proof:

  • configure_mappers() raises sqlalchemy.exc.NoForeignKeysError with message indicating no FK between items and tryouts.

Impact:

  • Relationship loading may fail when mappings are configured.
  • Admin/data operations depending on relationship loading are unstable.
  • High risk of runtime failures in production paths.

Recommendation:

  • Add proper FK model constraints (or explicit primaryjoin if intentional composite mapping).
  • Prefer canonical FK design:
    • items.tryout_pk -> tryouts.id
    • sessions.tryout_pk -> tryouts.id
    • tryout_stats.tryout_pk -> tryouts.id
  • Keep tryout_id business identifier as separate indexed field if needed.

P0-02: Admin panel mounted with auth provider not registered

Severity: P0 (Critical)
Category: Security / Access control

Description:

Admin app is mounted in main app, but auth provider registration is commented out.

Evidence:

  • app/main.py:176 (admin mounted on /admin)
  • app/admin.py:607 (admin_app.settings.auth_provider = AdminAuthProvider() commented)

Impact:

  • Potential unauthorized access to administrative resources.
  • High security exposure depending on default behavior/deployment setup.

Recommendation:

  • Enable and enforce auth provider before exposing /admin.
  • Add environment-aware hard gate to disable admin in production until auth is verified.
  • Add integration tests for admin route authentication and authorization.

P1 Findings

P1-01: Invalid SQLAlchemy cast pattern in aggregate queries

Severity: P1 (High)
Category: Query correctness / Runtime stability

Description:

Several queries use func.cast(..., type_=func.INTEGER). This is not valid SQLAlchemy typing usage and fails during expression construction.

Evidence:

  • app/services/ctt_scoring.py:193
  • app/services/reporting.py:418
  • app/services/reporting.py:681
  • app/routers/tryouts.py:295

Impact:

  • Runtime failures in calibration, reporting, and CTT calculations.

Recommendation:

  • Replace with canonical cast:
    • from sqlalchemy import cast, Integer
    • func.sum(cast(UserAnswer.is_correct, Integer))

P1-02: Incorrect multi-column fetch in normalization service

Severity: P1 (High)
Category: Business logic / Runtime stability

Description:

Queries selecting (Tryout.static_rataan, Tryout.static_sb) use scalar_one_or_none() and then attempt tuple unpacking. scalar_one_or_none() returns a single scalar (first column), not both columns.

Evidence:

  • app/services/normalization.py:311, :318
  • app/services/normalization.py:355, :377

Impact:

  • Type/logic errors during normalization parameter retrieval.
  • Can break score normalization flows.

Recommendation:

  • Use one_or_none() for tuple rows:
    • row = result.one_or_none()
    • rataan, sb = row

P1-03: Invalid avg_nn computation in tryout comparison report

Severity: P1 (High)
Category: Reporting correctness

Description:

avg_nn is currently calculated as stats.rataan + 500, which does not match the normalization formula and is mathematically incorrect.

Evidence:

  • app/services/reporting.py:713

Impact:

  • Misleading management and educational performance reports.
  • Incorrect decisions from bad analytics output.

Recommendation:

  • Compute average NN from session-level NN values directly.
  • If unavailable, explicitly return None and annotate report incompleteness.

P1-04: Timestamp defaults use string literal "NOW()" instead of SQL function

Severity: P1 (High)
Category: Data integrity / Schema correctness

Description:

Model fields use server_default="NOW()", which compiles to string literal default 'NOW()', not database function execution.

Evidence:

  • Present in all models, e.g. app/models/website.py:51, :56

Impact:

  • Incorrect or invalid default timestamps depending on DB behavior.
  • Inconsistent created/updated tracking.

Recommendation:

  • Use SQLAlchemy expressions:
    • server_default=func.now()
    • onupdate=func.now() or DB trigger strategy.

P2 Findings

P2-01: Multi-tenant consistency gap in session creation endpoint

Severity: P2 (Medium)
Category: Authorization / Data isolation

Description:

Most routes enforce tenant context via X-Website-ID header, but session creation accepts website_id in request body without header-based tenant check.

Evidence:

  • app/routers/sessions.py:341-390

Impact:

  • Increased risk of cross-tenant write mistakes.

Recommendation:

  • Enforce X-Website-ID across all write endpoints.
  • Reject payload tenant mismatch if header and body both exist.

P2-02: Test suite has weak assertions

Severity: P2 (Medium)
Category: Test quality

Description:

Current tests are print-oriented and mostly non-assertive, allowing false confidence.

Evidence:

  • tests/test_normalization.py uses print outputs and contains no robust asserts in core test blocks.

Impact:

  • Regressions can pass CI undetected.

Recommendation:

  • Rewrite tests with deterministic assertions.
  • Add coverage for:
    • mapper configuration
    • report calculations
    • normalization path selection
    • admin auth enforcement

P2-03: Scheduled reports are in-memory only

Severity: P2 (Medium)
Category: Operational reliability

Description:

Scheduled report registry is process-local (_scheduled_reports dict), not persisted.

Evidence:

  • app/services/reporting.py:1344

Impact:

  • Schedules disappear on restart.
  • Inconsistent behavior in multi-process deployment.

Recommendation:

  • Move scheduling metadata to persistent storage (DB/Redis + worker scheduler).

P2-04: Migration gap

Severity: P2 (Medium)
Category: Delivery process

Description:

Alembic setup exists but no migration version files are present.

Impact:

  • Schema evolution lacks reproducibility and deployment confidence.

Recommendation:

  • Generate and commit baseline migration + incremental migration workflow.

5) Risk Register Snapshot

  • Security risk: high (admin auth gap).
  • Runtime stability risk: high (ORM mapping + invalid query casts).
  • Data correctness risk: high (reporting and timestamp defaults).
  • Delivery risk: medium (test quality and migration process).

6) Remediation Priority Plan

Phase 1 (Immediate, block release):

  1. Fix admin auth enforcement and gate /admin.
  2. Repair ORM join/FK model design and validate mapper configuration.
  3. Replace invalid cast expressions with valid SQLAlchemy casts.
  4. Fix normalization tuple fetch bug.

Phase 2 (Short-term):

  1. Correct reporting formulas (avg_nn and related derivations).
  2. Correct timestamp defaults across all models.
  3. Enforce tenant header consistency on all write paths.

Phase 3 (Hardening):

  1. Introduce robust assertion-based tests with CI gate.
  2. Persist scheduled report metadata.
  3. Establish migration discipline (alembic/versions under source control).

7) Acceptance Criteria for Closure

All of the following must be true:

  • Mapper configuration passes (configure_mappers() without errors).
  • All P0 and P1 findings resolved and validated by tests.
  • Admin endpoints require verified auth in target environment.
  • Query and report correctness covered by automated tests.
  • Migration scripts exist and apply cleanly from empty DB to latest.