fix: harden admin access, repair ORM joins, and add migration/tests
This commit is contained in:
339
DEFECT_GAP_AUDIT_REPORT.md
Normal file
339
DEFECT_GAP_AUDIT_REPORT.md
Normal file
@@ -0,0 +1,339 @@
|
||||
# Yellow Bank Soal - Defect and Gap Audit Report
|
||||
|
||||
Date: 2026-03-31
|
||||
Auditor: Codex (GPT-5)
|
||||
Scope: Static code trace, dependency-aware runtime checks, targeted test execution
|
||||
|
||||
## 1) Executive Summary
|
||||
|
||||
This audit identified critical reliability and security risks that should be addressed before production rollout:
|
||||
|
||||
- P0: ORM relationship configuration is broken for key entities (`Tryout` to `Item`/`Session`/`TryoutStats`), causing mapper setup failure.
|
||||
- P0: Admin panel is mounted while authentication provider registration is disabled.
|
||||
- P1: Multiple SQL aggregation queries use invalid `func.cast(..., type_=func.INTEGER)` patterns that will fail at runtime.
|
||||
- P1: Normalization parameter retrieval uses incorrect scalar fetch semantics for multi-column queries.
|
||||
- P1: Reporting logic contains at least one mathematically invalid metric computation (`avg_nn`).
|
||||
|
||||
Overall status: **Not production-ready** until P0 and P1 findings are remediated.
|
||||
|
||||
## 2) Methodology
|
||||
|
||||
- Repository structure and architecture trace across routers, services, models, schemas, and admin modules.
|
||||
- Runtime checks after dependency install in a local virtual environment.
|
||||
- Targeted validation of SQLAlchemy mapper setup and SQL expression construction.
|
||||
- Existing test suite execution.
|
||||
|
||||
## 3) Verification Results
|
||||
|
||||
- Dependency install: successful (`.venv`, requirements installed).
|
||||
- Test execution: `.venv/bin/pytest -q` -> `3 passed`.
|
||||
- Mapper validation: failed (`NoForeignKeysError`) when forcing mapper configuration.
|
||||
- SQLAlchemy expression validation: failed for `func.cast(..., type_=func.INTEGER)` usage.
|
||||
|
||||
Note: passing tests do not indicate system correctness due to weak assertion coverage (see finding G-02).
|
||||
|
||||
## 4) Findings
|
||||
|
||||
## P0 Findings
|
||||
|
||||
### P0-01: Broken ORM relationship joins (mapper configuration failure)
|
||||
|
||||
Severity: P0 (Critical)
|
||||
Category: Data model / Runtime stability
|
||||
|
||||
Description:
|
||||
|
||||
`Tryout` relationships to `Item`, `Session`, and `TryoutStats` are defined, but corresponding child fields (`tryout_id`) are plain strings without foreign key constraints or explicit `primaryjoin` definitions. SQLAlchemy cannot infer join conditions for relationship mapping.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/models/item.py:67` and `app/models/item.py:169`
|
||||
- `app/models/session.py:79` and `app/models/session.py:160`
|
||||
- `app/models/tryout_stats.py:52` and `app/models/tryout_stats.py:120`
|
||||
- `app/models/tryout.py:162-170`
|
||||
|
||||
Runtime proof:
|
||||
|
||||
- `configure_mappers()` raises `sqlalchemy.exc.NoForeignKeysError` with message indicating no FK between `items` and `tryouts`.
|
||||
|
||||
Impact:
|
||||
|
||||
- Relationship loading may fail when mappings are configured.
|
||||
- Admin/data operations depending on relationship loading are unstable.
|
||||
- High risk of runtime failures in production paths.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Add proper FK model constraints (or explicit `primaryjoin` if intentional composite mapping).
|
||||
- Prefer canonical FK design:
|
||||
- `items.tryout_pk` -> `tryouts.id`
|
||||
- `sessions.tryout_pk` -> `tryouts.id`
|
||||
- `tryout_stats.tryout_pk` -> `tryouts.id`
|
||||
- Keep `tryout_id` business identifier as separate indexed field if needed.
|
||||
|
||||
---
|
||||
|
||||
### P0-02: Admin panel mounted with auth provider not registered
|
||||
|
||||
Severity: P0 (Critical)
|
||||
Category: Security / Access control
|
||||
|
||||
Description:
|
||||
|
||||
Admin app is mounted in main app, but auth provider registration is commented out.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/main.py:176` (admin mounted on `/admin`)
|
||||
- `app/admin.py:607` (`admin_app.settings.auth_provider = AdminAuthProvider()` commented)
|
||||
|
||||
Impact:
|
||||
|
||||
- Potential unauthorized access to administrative resources.
|
||||
- High security exposure depending on default behavior/deployment setup.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Enable and enforce auth provider before exposing `/admin`.
|
||||
- Add environment-aware hard gate to disable admin in production until auth is verified.
|
||||
- Add integration tests for admin route authentication and authorization.
|
||||
|
||||
## P1 Findings
|
||||
|
||||
### P1-01: Invalid SQLAlchemy cast pattern in aggregate queries
|
||||
|
||||
Severity: P1 (High)
|
||||
Category: Query correctness / Runtime stability
|
||||
|
||||
Description:
|
||||
|
||||
Several queries use `func.cast(..., type_=func.INTEGER)`. This is not valid SQLAlchemy typing usage and fails during expression construction.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/services/ctt_scoring.py:193`
|
||||
- `app/services/reporting.py:418`
|
||||
- `app/services/reporting.py:681`
|
||||
- `app/routers/tryouts.py:295`
|
||||
|
||||
Impact:
|
||||
|
||||
- Runtime failures in calibration, reporting, and CTT calculations.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Replace with canonical cast:
|
||||
- `from sqlalchemy import cast, Integer`
|
||||
- `func.sum(cast(UserAnswer.is_correct, Integer))`
|
||||
|
||||
---
|
||||
|
||||
### P1-02: Incorrect multi-column fetch in normalization service
|
||||
|
||||
Severity: P1 (High)
|
||||
Category: Business logic / Runtime stability
|
||||
|
||||
Description:
|
||||
|
||||
Queries selecting `(Tryout.static_rataan, Tryout.static_sb)` use `scalar_one_or_none()` and then attempt tuple unpacking. `scalar_one_or_none()` returns a single scalar (first column), not both columns.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/services/normalization.py:311`, `:318`
|
||||
- `app/services/normalization.py:355`, `:377`
|
||||
|
||||
Impact:
|
||||
|
||||
- Type/logic errors during normalization parameter retrieval.
|
||||
- Can break score normalization flows.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Use `one_or_none()` for tuple rows:
|
||||
- `row = result.one_or_none()`
|
||||
- `rataan, sb = row`
|
||||
|
||||
---
|
||||
|
||||
### P1-03: Invalid `avg_nn` computation in tryout comparison report
|
||||
|
||||
Severity: P1 (High)
|
||||
Category: Reporting correctness
|
||||
|
||||
Description:
|
||||
|
||||
`avg_nn` is currently calculated as `stats.rataan + 500`, which does not match the normalization formula and is mathematically incorrect.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/services/reporting.py:713`
|
||||
|
||||
Impact:
|
||||
|
||||
- Misleading management and educational performance reports.
|
||||
- Incorrect decisions from bad analytics output.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Compute average NN from session-level NN values directly.
|
||||
- If unavailable, explicitly return `None` and annotate report incompleteness.
|
||||
|
||||
---
|
||||
|
||||
### P1-04: Timestamp defaults use string literal `"NOW()"` instead of SQL function
|
||||
|
||||
Severity: P1 (High)
|
||||
Category: Data integrity / Schema correctness
|
||||
|
||||
Description:
|
||||
|
||||
Model fields use `server_default="NOW()"`, which compiles to string literal default `'NOW()'`, not database function execution.
|
||||
|
||||
Evidence:
|
||||
|
||||
- Present in all models, e.g. `app/models/website.py:51`, `:56`
|
||||
|
||||
Impact:
|
||||
|
||||
- Incorrect or invalid default timestamps depending on DB behavior.
|
||||
- Inconsistent created/updated tracking.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Use SQLAlchemy expressions:
|
||||
- `server_default=func.now()`
|
||||
- `onupdate=func.now()` or DB trigger strategy.
|
||||
|
||||
## P2 Findings
|
||||
|
||||
### P2-01: Multi-tenant consistency gap in session creation endpoint
|
||||
|
||||
Severity: P2 (Medium)
|
||||
Category: Authorization / Data isolation
|
||||
|
||||
Description:
|
||||
|
||||
Most routes enforce tenant context via `X-Website-ID` header, but session creation accepts `website_id` in request body without header-based tenant check.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/routers/sessions.py:341-390`
|
||||
|
||||
Impact:
|
||||
|
||||
- Increased risk of cross-tenant write mistakes.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Enforce `X-Website-ID` across all write endpoints.
|
||||
- Reject payload tenant mismatch if header and body both exist.
|
||||
|
||||
---
|
||||
|
||||
### P2-02: Test suite has weak assertions
|
||||
|
||||
Severity: P2 (Medium)
|
||||
Category: Test quality
|
||||
|
||||
Description:
|
||||
|
||||
Current tests are print-oriented and mostly non-assertive, allowing false confidence.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `tests/test_normalization.py` uses print outputs and contains no robust asserts in core test blocks.
|
||||
|
||||
Impact:
|
||||
|
||||
- Regressions can pass CI undetected.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Rewrite tests with deterministic assertions.
|
||||
- Add coverage for:
|
||||
- mapper configuration
|
||||
- report calculations
|
||||
- normalization path selection
|
||||
- admin auth enforcement
|
||||
|
||||
---
|
||||
|
||||
### P2-03: Scheduled reports are in-memory only
|
||||
|
||||
Severity: P2 (Medium)
|
||||
Category: Operational reliability
|
||||
|
||||
Description:
|
||||
|
||||
Scheduled report registry is process-local (`_scheduled_reports` dict), not persisted.
|
||||
|
||||
Evidence:
|
||||
|
||||
- `app/services/reporting.py:1344`
|
||||
|
||||
Impact:
|
||||
|
||||
- Schedules disappear on restart.
|
||||
- Inconsistent behavior in multi-process deployment.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Move scheduling metadata to persistent storage (DB/Redis + worker scheduler).
|
||||
|
||||
---
|
||||
|
||||
### P2-04: Migration gap
|
||||
|
||||
Severity: P2 (Medium)
|
||||
Category: Delivery process
|
||||
|
||||
Description:
|
||||
|
||||
Alembic setup exists but no migration version files are present.
|
||||
|
||||
Impact:
|
||||
|
||||
- Schema evolution lacks reproducibility and deployment confidence.
|
||||
|
||||
Recommendation:
|
||||
|
||||
- Generate and commit baseline migration + incremental migration workflow.
|
||||
|
||||
## 5) Risk Register Snapshot
|
||||
|
||||
- Security risk: high (admin auth gap).
|
||||
- Runtime stability risk: high (ORM mapping + invalid query casts).
|
||||
- Data correctness risk: high (reporting and timestamp defaults).
|
||||
- Delivery risk: medium (test quality and migration process).
|
||||
|
||||
## 6) Remediation Priority Plan
|
||||
|
||||
Phase 1 (Immediate, block release):
|
||||
|
||||
1. Fix admin auth enforcement and gate `/admin`.
|
||||
2. Repair ORM join/FK model design and validate mapper configuration.
|
||||
3. Replace invalid cast expressions with valid SQLAlchemy casts.
|
||||
4. Fix normalization tuple fetch bug.
|
||||
|
||||
Phase 2 (Short-term):
|
||||
|
||||
1. Correct reporting formulas (`avg_nn` and related derivations).
|
||||
2. Correct timestamp defaults across all models.
|
||||
3. Enforce tenant header consistency on all write paths.
|
||||
|
||||
Phase 3 (Hardening):
|
||||
|
||||
1. Introduce robust assertion-based tests with CI gate.
|
||||
2. Persist scheduled report metadata.
|
||||
3. Establish migration discipline (`alembic/versions` under source control).
|
||||
|
||||
## 7) Acceptance Criteria for Closure
|
||||
|
||||
All of the following must be true:
|
||||
|
||||
- Mapper configuration passes (`configure_mappers()` without errors).
|
||||
- All P0 and P1 findings resolved and validated by tests.
|
||||
- Admin endpoints require verified auth in target environment.
|
||||
- Query and report correctness covered by automated tests.
|
||||
- Migration scripts exist and apply cleanly from empty DB to latest.
|
||||
|
||||
Reference in New Issue
Block a user