refactor: redesign admin permalinks to RESTful paths

- Replace fragile /admin/ai-playground and /admin/ai-generation routes
  with item-scoped /admin/questions/{id}/generate endpoints
- Add /admin/tryouts/{id}/questions route using Tryout primary key
  instead of composite query params (?tryout_id=X&website_id=Y)
- Fix variant detail and review-bulk endpoints to use scoped paths
- Update all internal links (dashboard, hierarchy, exams) to new routes
- Remove obsolete ai_playground_view/submit/save functions
Dwindi Ramadhana
2026-06-16 23:54:24 +07:00
parent 7adbc5fb97
commit 792f9b7483
16 changed files with 4606 additions and 741 deletions

PROJECT_UNDERSTANDING.md Normal file

@@ -0,0 +1,612 @@
# Project Understanding: IRT-Powered Adaptive Question Bank System
> **Project Name:** IRT Bank Soal
> **Version:** 1.0.0
> **Last Updated:** 2026-06-15
> **Repository:** https://git.backoffice.biz.id/dwindown/yellow-bank-soal
---
## Table of Contents
1. [Executive Summary](#executive-summary)
2. [Project Purpose](#project-purpose)
3. [Tech Stack](#tech-stack)
4. [Project Structure](#project-structure)
5. [Core Concepts](#core-concepts)
6. [Data Models](#data-models)
7. [API Endpoints](#api-endpoints)
8. [Key Services](#key-services)
9. [Scoring Formulas](#scoring-formulas)
10. [Configuration](#configuration)
11. [Workflows](#workflows)
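12. [Deployment](#deployment)
13. [Security Considerations](#security-considerations)
14. [Glossary](#glossary)
15. [References](#references)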
---
## Executive Summary
This is a **FastAPI-based backend** for managing adaptive tryout exams. It scores attempts with either **Classical Test Theory (CTT)** or **Item Response Theory (IRT)**, and a single deployment serves multiple WordPress sites.
### Key Features
| Feature | Description |
|---------|-------------|
| **CTT Scoring** | Classical Test Theory with exact Excel formula compatibility |
| **IRT Support** | Item Response Theory (1PL Rasch model) for adaptive testing |
| **Multi-Site** | Single backend serving multiple WordPress sites |
| **AI Generation** | Automatic question variant generation via OpenRouter |
| **Excel Import/Export** | Bulk import and export of questions via Excel files |
| **Adaptive Testing** | Computer Adaptive Testing (CAT) with theta estimation |
| **Normalization** | Static, dynamic, or hybrid score normalization |
---
## Project Purpose
The system replaces traditional fixed-difficulty exams with an **adaptive question bank** that:
1. **Measures student ability accurately** using IRT theta estimation
2. **Provides comparable scores** across different exam sessions via normalization
3. **Generates new questions** using AI when needed
4. **Integrates with WordPress** LMS platforms for student access
5. **Reduces exam fraud** by delivering different question variants to each student
---
## Tech Stack
### Core Technologies
```
Framework: FastAPI >= 0.104.1
Server: Uvicorn >= 0.24.0
Database: PostgreSQL + SQLAlchemy 2.0 (async)
ORM: SQLAlchemy >= 2.0.23
Driver: asyncpg >= 0.29.0
Migrations: Alembic >= 1.13.0
Validation: Pydantic >= 2.5.0
```
### Data Processing
```
Excel: openpyxl >= 3.1.2, pandas >= 2.1.4
Math/Science: numpy >= 1.26.2, scipy >= 1.11.4
```
### External Integrations
```
AI: OpenAI >= 1.6.1 (OpenRouter API)
Task Queue: Celery >= 5.3.6, Redis >= 5.0.1
Admin Panel: FastAPI-Admin >= 1.0.0
```
---
## Project Structure
```
yellow-bank-soal/
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI app entry point
│ ├── admin.py # FastAPI Admin configuration
│ ├── admin_web.py # Admin web interface
│ ├── database.py # Database configuration & session
│ │
│ ├── api/
│ │ └── v1/
│ │ ├── __init__.py
│ │ └── session.py # Adaptive session endpoints
│ │
│ ├── core/
│ │ ├── __init__.py
│ │ ├── auth.py # Authentication & authorization
│ │ ├── config.py # Settings from environment
│ │ └── rate_limit.py # Rate limiting
│ │
│ ├── models/
│ │ ├── __init__.py
│ │ ├── ai_generation_run.py
│ │ ├── item.py # Question items
│ │ ├── report_schedule.py
│ │ ├── session.py # Student tryout sessions
│ │ ├── tryout.py # Tryout configurations
│ │ ├── tryout_import_snapshot.py
│ │ ├── tryout_snapshot_question.py
│ │ ├── tryout_stats.py # Normalization statistics
│ │ ├── user.py
│ │ ├── user_answer.py # Student responses
│ │ └── website.py
│ │
│ ├── routers/
│ │ ├── __init__.py
│ │ ├── admin.py # Admin-only endpoints
│ │ ├── ai.py # AI generation endpoints
│ │ ├── import_export.py # Excel import/export
│ │ ├── reports.py # Report generation
│ │ ├── sessions.py # Session management
│ │ ├── tryouts.py # Tryout configuration
│ │ └── wordpress.py # WordPress integration
│ │
│ ├── schemas/ # Pydantic request/response models
│ │ ├── __init__.py
│ │ ├── ai.py
│ │ ├── report.py
│ │ ├── session.py
│ │ ├── tryout.py
│ │ └── wordpress.py
│ │
│ └── services/
│ ├── __init__.py
│ ├── ai_generation.py # OpenRouter integration
│ ├── cat_selection.py # Computer Adaptive Testing
│ ├── config_management.py
│ ├── ctt_scoring.py # CTT scoring engine
│ ├── excel_import.py # Excel parsing
│ ├── irt_calibration.py # IRT calibration
│ ├── normalization.py
│ ├── reporting.py
│ ├── tryout_json_import.py
│ └── wordpress_auth.py
├── alembic/ # Database migrations
│ ├── env.py
│ ├── script.py.mako
│ └── versions/
├── tests/ # Unit & integration tests
│ ├── test_auth_scope.py
│ ├── test_auth_tokens.py
│ ├── test_model_mappings.py
│ ├── test_normalization.py
│ ├── test_operational_hardening.py
│ ├── test_route_wiring.py
│ ├── test_security_regressions.py
│ └── test_tryout_json_import.py
├── requirements.txt
├── alembic.ini
├── irt_1pl_mle.py # Standalone IRT MLE script
├── PRD.md # Product Requirements Document
├── project-brief.md # Technical specification
└── handoff.md # Project handoff context
```
---
## Core Concepts
### 1. Tryout (Exam)
A **Tryout** represents a complete exam/test with configurable behavior:
```python
from typing import Literal

# Allowed values for each Tryout behavior setting
scoring_mode: Literal["ctt", "irt", "hybrid"]
selection_mode: Literal["fixed", "adaptive", "hybrid"]
normalization_mode: Literal["static", "dynamic", "hybrid"]
```
### 2. Item (Question)
An **Item** represents a single question with:
- **Content**: stem (question text), options (A/B/C/D), correct_answer
- **CTT Parameters**: p-value (difficulty), bobot (weight)
- **IRT Parameters**: b (difficulty), se (standard error)
- **Metadata**: slot position, difficulty level, AI generation info
### 3. Session (Student Attempt)
A **Session** tracks a student's attempt:
- Links student (`wp_user_id`) to a Tryout
- Records all answers via `UserAnswer` records
- Stores computed scores: NM, NN, theta
### 4. Website (Multi-Tenant)
The system supports **multiple WordPress websites** from a single backend:
- Each website has isolated data
- Requests identify their website via the `X-Website-ID` header
- Users authenticate with WordPress JWT tokens
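As a rough illustration of the tenant scoping above, the sketch below resolves the `X-Website-ID` header to an integer ID and rejects missing or malformed values. The names `resolve_website_id` and `WebsiteHeaderError`, the plain-mapping header representation, and the assumption that website IDs are positive integers are all hypothetical; the real logic in `app/core/auth.py` may differ.

```python
from typing import Mapping


class WebsiteHeaderError(ValueError):
    """Raised when the X-Website-ID header is missing or malformed."""


def resolve_website_id(headers: Mapping[str, str]) -> int:
    """Return the tenant ID carried in the X-Website-ID header."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-website-id")
    if raw is None:
        raise WebsiteHeaderError("X-Website-ID header is required")
    try:
        website_id = int(raw.strip())
    except ValueError as exc:
        raise WebsiteHeaderError("X-Website-ID must be an integer") from exc
    # Assumption: IDs are positive integers (typical for a serial primary key).
    if website_id <= 0:
        raise WebsiteHeaderError("X-Website-ID must be positive")
    return website_id
```

Every query for tryouts, items, or sessions would then filter on the returned ID so that one website never sees another website's data.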
---
## Data Models
### Entity Relationship Diagram
```mermaid
erDiagram
Website ||--o{ Tryout : "hosts"
Website ||--o{ User : "contains"
Website ||--o{ Session : "serves"
Website ||--o{ Item : "contains"
Tryout ||--o{ Item : "contains"
Tryout ||--o{ Session : "has"
Tryout ||--o{ TryoutStats : "tracks"
Session ||--o{ UserAnswer : "contains"
Session }o--|| User : "belongs to"
Item ||--o{ UserAnswer : "answered by"
Item ||--o{ Item : "has variants"
AIGenerationRun ||--o{ Item : "generates"
```
### Model Summary
| Model | Purpose | Key Fields |
|-------|---------|------------|
| `Website` | Multi-tenant isolation | domain, wordpress_url |
| `User` | WordPress user mapping | wp_user_id, website_id |
| `Tryout` | Exam configuration | scoring_mode, selection_mode, normalization_mode |
| `Item` | Question | stem, options, ctt_p, ctt_bobot, irt_b, irt_se |
| `Session` | Student attempt | session_id, NM, NN, theta |
| `UserAnswer` | Single response | response, is_correct, bobot_earned |
| `TryoutStats` | Normalization data | participant_count, rataan, sb |
| `AIGenerationRun` | AI generation batch | model, status, items_generated |
---
## API Endpoints
### Public API (via `/api/v1`)
| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/tryout/{tryout_id}/config` | Get tryout configuration |
| `PUT` | `/tryout/{tryout_id}/normalization` | Update normalization settings |
| `GET` | `/tryout/` | List tryouts for website |
| `GET` | `/tryout/{tryout_id}/calibration-status` | Get IRT calibration status |
| `POST` | `/tryout/{tryout_id}/calibrate` | Trigger IRT calibration |
| `POST` | `/session/` | Create new session |
| `GET` | `/session/{session_id}` | Get session details |
| `POST` | `/session/{session_id}/complete` | Submit answers, calculate scores |
### Admin API (requires admin role)
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/ai/generate` | Generate AI questions |
| `POST` | `/import/excel` | Import questions from Excel |
| `GET` | `/export/excel/{tryout_id}` | Export questions to Excel |
| `GET` | `/reports/*` | Generate various reports |
### Adaptive Session API (via `/api/v1/session`)
| Method | Endpoint | Description |
|--------|----------|-------------|
| `POST` | `/adaptive/start` | Start adaptive session |
| `POST` | `/adaptive/respond` | Submit answer, get next item |
| `POST` | `/adaptive/complete` | Complete adaptive session |
---
## Key Services
### 1. CTT Scoring Engine (`ctt_scoring.py`)
Implements Classical Test Theory scoring with exact Excel formulas.
**Key Functions:**
- `calculate_ctt_p()` - Difficulty: p = Σ Benar / Total Peserta (correct answers / total participants)
- `calculate_ctt_bobot()` - Weight: Bobot = 1 - p
- `calculate_ctt_nm()` - Raw Score: NM = (Total_Bobot / Total_Bobot_Max) × 1000
- `calculate_ctt_nn()` - Normalized: NN = 500 + 100 × ((NM - Rataan) / SB)
- `categorize_difficulty()` - Categorize by p-value
- `update_tryout_stats()` - Incrementally update normalization stats
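One way to update normalization statistics incrementally, as `update_tryout_stats()` is described as doing, is Welford's running mean and variance. The sketch below is an assumption about the approach, not the project's code: the `RunningStats` class and its `m2` field are hypothetical, and it uses the population standard deviation, which may or may not match the client's Excel sheet.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean and SD of NM scores (field names loosely mirror TryoutStats)."""

    participant_count: int = 0
    rataan: float = 0.0  # running mean of NM
    m2: float = 0.0      # running sum of squared deviations from the mean

    def add(self, nm: float) -> None:
        # Welford's update: numerically stable, no need to keep past scores.
        self.participant_count += 1
        delta = nm - self.rataan
        self.rataan += delta / self.participant_count
        self.m2 += delta * (nm - self.rataan)

    @property
    def sb(self) -> float:
        # Population SD (divide by n); a sample SD would divide by n - 1.
        if self.participant_count == 0:
            return 0.0
        return math.sqrt(self.m2 / self.participant_count)
```

Each completed session calls `add()` once with its NM, so `rataan` and `sb` stay current without rescanning every earlier session.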
### 2. IRT Calibration (`irt_calibration.py`)
Implements Item Response Theory (1PL Rasch model) for adaptive testing.
**Key Functions:**
- `estimate_theta_mle()` - MLE theta estimation for students
- `estimate_b()` - IRT difficulty calibration for items
- `calibrate_item()` - Calibrate single item from response data
- `calibrate_all()` - Batch calibrate all items in tryout
- `calculate_fisher_information()` - Fisher information for item selection
**Parameters:**
- θ (theta): Student ability [-3, +3]
- b: Item difficulty [-3, +3]
- Probability: P(θ) = 1 / (1 + exp(-(θ - b)))
### 3. AI Generation (`ai_generation.py`)
Generates question variants using OpenRouter API.
**Key Functions:**
- `generate_question()` - Generate single question via OpenRouter
- `generate_questions_batch()` - Generate multiple questions
- `save_ai_question()` - Save generated question to database
- `check_cache_reuse()` - Check for reusable similar questions
**Models Supported:**
- Qwen 2.5 32B (balanced)
- Mistral Small (low cost)
- Llama 3.3 70B (premium)
### 4. Excel Import/Export (`excel_import.py`)
Bulk import and export of questions via Excel files.
**Key Functions:**
- `parse_excel_import()` - Parse Excel file to items
- `bulk_insert_items()` - Insert parsed items to database
- `export_questions_to_excel()` - Export tryout to Excel
### 5. CAT Selection (`cat_selection.py`)
Computer Adaptive Testing item selection algorithm.
**Key Functions:**
- `select_next_item()` - Select next item based on theta estimate
- `calculate_theta_update()` - Update theta after response
- `check_termination()` - Check if test should end
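A common CAT selection rule is to pick the unadministered item with the highest Fisher information at the current theta estimate. The sketch below assumes that rule; the source does not confirm which criterion `select_next_item()` actually uses, and `pick_next_item` with its `(item_id, b)` tuples is a hypothetical interface.

```python
import math
from typing import Iterable, Optional, Set, Tuple


def rasch_probability(theta: float, b: float) -> float:
    """1PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def fisher_information(theta: float, b: float) -> float:
    """Item information under the 1PL model: I = P * (1 - P)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)


def pick_next_item(
    theta: float,
    items: Iterable[Tuple[int, float]],  # (item_id, irt_b) pairs
    administered: Set[int],
) -> Optional[int]:
    """Return the ID of the most informative unused item, or None if none remain."""
    best_id: Optional[int] = None
    best_info = -1.0
    for item_id, b in items:
        if item_id in administered:
            continue
        info = fisher_information(theta, b)
        if info > best_info:
            best_id, best_info = item_id, info
    return best_id
```

Under the 1PL model, information peaks where `b` equals theta, so this rule amounts to choosing the item whose difficulty is closest to the student's current ability estimate.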
---
## Scoring Formulas
### CTT (Classical Test Theory)
These steps reproduce the client's Excel formulas exactly:
```text
# STEP 1: Tingkat Kesukaran (item difficulty, p-value)
p = Σ Benar / Total Peserta            # correct answers / total participants
# STEP 2: Bobot (weight)
Bobot = 1 - p
# STEP 3: Total Benar per Siswa (correct answers per student)
Total_Benar = count of correct answers
# STEP 4: Total Bobot per Siswa (weight earned per student)
Total_Bobot_Siswa = Σ Bobot for each correct answer
# STEP 5: Nilai Mentah (raw score)
NM = (Total_Bobot_Siswa / Total_Bobot_Max) × 1000
# STEP 6: Nilai Nasional (normalized score)
NN = 500 + 100 × ((NM - Rataan) / SB)   # Rataan = mean, SB = standard deviation
```
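The six steps above translate into a few small functions. This is a minimal sketch: the function names and signatures are assumptions and are not taken from `ctt_scoring.py`.

```python
from typing import Iterable, Mapping


def ctt_p(correct_count: int, participant_count: int) -> float:
    """Step 1: proportion of participants who answered the item correctly."""
    if participant_count <= 0:
        raise ValueError("participant_count must be positive")
    return correct_count / participant_count


def ctt_bobot(p: float) -> float:
    """Step 2: harder items (low p) carry more weight."""
    return 1.0 - p


def ctt_nm(bobot_by_item: Mapping[str, float], correct_items: Iterable[str]) -> float:
    """Steps 3-5: weight earned over maximum weight, scaled to 0-1000."""
    total_max = sum(bobot_by_item.values())
    if total_max <= 0:
        raise ValueError("total bobot must be positive")
    earned = sum(bobot_by_item[item_id] for item_id in correct_items)
    return earned / total_max * 1000


def ctt_nn(nm: float, rataan: float, sb: float) -> float:
    """Step 6: rescale NM to mean 500 and standard deviation 100."""
    if sb <= 0:
        raise ValueError("sb must be positive")
    return 500 + 100 * ((nm - rataan) / sb)
```

For three items with p-values 0.8, 0.5 and 0.2, the weights are 0.2, 0.5 and 0.8. A student who answers only the first two correctly earns 0.7 out of 1.5, so NM is about 466.67. With Rataan = 400 and SB = 100, NN is about 566.67.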
### IRT (Item Response Theory)
1PL Rasch Model:
```text
# Probability of correct response
P(θ, b) = 1 / (1 + exp(-(θ - b)))
# Log-likelihood for MLE
LL = Σ [u_i × log(P) + (1-u_i) × log(1-P)]
# Theta estimation via MLE
θ_mle = argmax_θ LL(θ)
```
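The argmax above has no closed form, but Newton-Raphson on the log-likelihood converges quickly for the 1PL model. The sketch below is one possible implementation, not necessarily what `estimate_theta_mle()` does. The name `estimate_theta`, the iteration cap, and the tolerance are assumptions; the clamp to [-3, +3] follows the ability range stated in this document.

```python
import math
from typing import Sequence


def estimate_theta(
    responses: Sequence[int],       # u_i: 1 = correct, 0 = incorrect
    difficulties: Sequence[float],  # b_i of each answered item
    lo: float = -3.0,
    hi: float = 3.0,
    iterations: int = 25,
) -> float:
    """Maximum-likelihood ability estimate under the 1PL model."""
    theta = 0.0
    for _ in range(iterations):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        # dLL/dtheta = sum(u_i - P_i); -d2LL/dtheta2 = sum(P_i * (1 - P_i))
        gradient = sum(u - p for u, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        if information < 1e-9:
            break
        step = gradient / information
        # Clamp so all-correct or all-wrong patterns stay finite.
        theta = min(hi, max(lo, theta + step))
        if abs(step) < 1e-6:
            break
    return theta
```

The clamp matters because a student who answers everything correctly (or everything incorrectly) has no finite MLE. Without the bounds the estimate would drift toward infinity.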
### Difficulty Categories (CTT Standard)
| p-value | Category | Description |
|---------|----------|-------------|
| p < 0.30 | Sulit | Difficult |
| 0.30 ≤ p ≤ 0.70 | Sedang | Medium |
| p > 0.70 | Mudah | Easy |
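The table maps directly onto a small function. The name matches `categorize_difficulty()` from the CTT service, but the signature and return values shown here are assumptions.

```python
def categorize_difficulty(p: float) -> str:
    """Map a CTT p-value to the category labels used in the table above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p < 0.30:
        return "Sulit"   # difficult: fewer than 30% answered correctly
    if p <= 0.70:
        return "Sedang"  # medium: both boundaries are inclusive
    return "Mudah"       # easy: more than 70% answered correctly
```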
---
## Configuration
### Environment Variables
```bash
# Database
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/irt_bank_soal
# FastAPI
SECRET_KEY=your-secret-key-here
ENVIRONMENT=development # development, staging, production
ENABLE_ADMIN=true
ADMIN_USERNAME=admin
ADMIN_PASSWORD=your-password
# OpenRouter (AI)
OPENROUTER_API_KEY=sk-or-v1-xxx
OPENROUTER_MODEL_QWEN=qwen/qwen2.5-32b-instruct
OPENROUTER_MODEL_CHEAP=mistralai/mistral-small-2603
OPENROUTER_MODEL_LLAMA=meta-llama/llama-3.3-70b-instruct
# Redis/Celery
REDIS_URL=redis://localhost:6379/0
CELERY_BROKER_URL=redis://localhost:6379/0
# CORS
ALLOWED_ORIGINS=http://localhost:3000,https://yourdomain.com
```
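For illustration, a standard-library sketch of reading a few of these variables is shown below. The project's `app/core/config.py` likely uses a different mechanism, so the `Settings` dataclass, `load_settings`, and the `development` fallback are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    secret_key: str
    environment: str
    allowed_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from an environment mapping; required keys raise KeyError."""
    # ALLOWED_ORIGINS is a comma-separated list; drop blanks and stray spaces.
    origins = [
        origin.strip()
        for origin in env.get("ALLOWED_ORIGINS", "").split(",")
        if origin.strip()
    ]
    return Settings(
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        # Assumption: fall back to "development" when ENVIRONMENT is unset.
        environment=env.get("ENVIRONMENT", "development"),
        allowed_origins=origins,
    )
```

Passing the mapping as a parameter keeps the loader testable without touching the real process environment.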
### Tryout Configuration Options
```python
# Scoring Mode
scoring_mode = "ctt" # Classical Test Theory
scoring_mode = "irt" # Item Response Theory
scoring_mode = "hybrid" # Both (IRT for calibration, CTT for scoring)
# Selection Mode
selection_mode = "fixed" # Fixed order questions
selection_mode = "adaptive" # Computer Adaptive Testing
selection_mode = "hybrid" # Start fixed, switch to adaptive
# Normalization Mode
normalization_mode = "static" # Use hardcoded rataan/sb
normalization_mode = "dynamic" # Calculate from participant data
normalization_mode = "hybrid" # Dynamic when sufficient data
```
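The hybrid normalization mode can be read as a threshold rule: use the static values until enough participants have completed the tryout, then switch to the computed ones. The sketch below assumes that reading. The function name and the threshold of 100 participants are hypothetical, since the source does not state what counts as sufficient data.

```python
from typing import Tuple

MIN_PARTICIPANTS = 100  # hypothetical threshold for "sufficient data"


def resolve_normalization(
    mode: str,
    static: Tuple[float, float],   # configured (rataan, sb)
    dynamic: Tuple[float, float],  # (rataan, sb) computed from participants
    participant_count: int,
    min_participants: int = MIN_PARTICIPANTS,
) -> Tuple[float, float]:
    """Pick the (rataan, sb) pair to use when computing NN."""
    if mode == "static":
        return static
    if mode == "dynamic":
        return dynamic
    if mode == "hybrid":
        # Small samples give unstable means and SDs, so fall back to static.
        return dynamic if participant_count >= min_participants else static
    raise ValueError(f"unknown normalization_mode: {mode!r}")
```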
---
## Workflows
### 1. Student Taking a Tryout
```mermaid
sequenceDiagram
participant S as Student
participant API as FastAPI
participant WP as WordPress
S->>API: POST /session/ (start session)
API-->>S: session_id
loop For each question
S->>API: GET /session/{id}/next-item
API-->>S: Question data
S->>API: POST /session/{id}/answer
API-->>S: Next question or completion
end
S->>API: POST /session/{id}/complete
API-->>S: NM, NN scores
```
### 2. Admin Importing Questions
```mermaid
flowchart TD
A[Upload Excel File] --> B[Parse Excel]
B --> C{Validate Structure}
C -->|Invalid| D[Return Error]
C -->|Valid| E[Calculate CTT p & bobot]
E --> F[Bulk Insert Items]
F --> G[Commit to Database]
G --> H[Return Import Summary]
```
### 3. AI Question Generation
```mermaid
flowchart TD
A[Request Generation] --> B{Check Cache}
B -->|Found similar| C[Return Cached]
B -->|Not found| D[Call OpenRouter API]
D --> E{Parse Response}
E -->|Parse Error| F[Return Error]
E -->|Success| G[Save to Database]
G --> H[Return Generated Item]
```
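The flow above can be sketched with the model call injected as a plain callable, so the example runs without network access. Everything here is illustrative: `generate_variant` is a hypothetical name, the cache is an exact-key dictionary standing in for both the similarity check and the database save, and the required-field list is taken from the Item content fields described earlier.

```python
import json
from typing import Callable, Dict

REQUIRED_FIELDS = {"stem", "options", "correct_answer"}


def generate_variant(
    cache_key: str,
    cache: Dict[str, dict],
    call_model: Callable[[str], str],  # stand-in for the OpenRouter request
) -> dict:
    """Return a cached item when available, otherwise generate and store one."""
    cached = cache.get(cache_key)
    if cached is not None:
        return {"source": "cache", "item": cached}

    raw = call_model(cache_key)
    try:
        item = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model response was not valid JSON") from exc
    if not isinstance(item, dict):
        raise ValueError("model response must be a JSON object")
    missing = REQUIRED_FIELDS - item.keys()
    if missing:
        raise ValueError(f"model response is missing fields: {sorted(missing)}")

    cache[cache_key] = item  # stands in for the "Save to Database" step
    return {"source": "generated", "item": item}
```

Validating the parsed response before saving keeps malformed model output from reaching the question bank.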
### 4. IRT Calibration
```mermaid
flowchart TD
A[Collect Responses] --> B{Enough Data?}
B -->|No| C[Wait for more]
B -->|Yes| D[For each Item]
D --> E[Get Response Matrix]
E --> F[Estimate b via MLE]
F --> G[Calculate Standard Error]
G --> H[Update Item]
H --> D
D --> I[Mark Items Calibrated]
```
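The per-item step of this flow can be sketched as a Newton-Raphson estimate of `b` given the respondents' theta values, with the standard error taken from the information at the estimate. This is an assumption about the method, not a copy of `calibrate_item()`. The name `calibrate_b` and the minimum of 30 responses are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def calibrate_b(
    thetas: Sequence[float],   # ability estimate of each respondent
    responses: Sequence[int],  # 1 = correct, 0 = incorrect, same order
    min_responses: int = 30,   # hypothetical "enough data" threshold
    lo: float = -3.0,
    hi: float = 3.0,
) -> Optional[Tuple[float, float]]:
    """Return (b, standard_error), or None when there is too little data."""
    if len(responses) < min_responses:
        return None

    def item_probs(b: float) -> list:
        return [1.0 / (1.0 + math.exp(-(t - b))) for t in thetas]

    b = 0.0
    for _ in range(50):
        probs = item_probs(b)
        information = sum(p * (1.0 - p) for p in probs)
        if information < 1e-9:
            break
        # dLL/db = sum(P_j - u_j); the Newton step divides by the information.
        step = sum(p - u for p, u in zip(probs, responses)) / information
        b = min(hi, max(lo, b + step))
        if abs(step) < 1e-6:
            break

    information = sum(p * (1.0 - p) for p in item_probs(b))
    se = 1.0 / math.sqrt(information) if information > 0 else math.inf
    return b, se
```

Returning `None` corresponds to the "Wait for more" branch. A large standard error signals that the item needs more responses before its `b` should drive adaptive selection.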
---
## Deployment
### Requirements
- Python 3.10+
- PostgreSQL 14+
- Redis 6+ (for Celery)
- Nginx (reverse proxy)
- aaPanel with Python Manager (recommended)
### Running the Application
```bash
# Install dependencies
pip install -r requirements.txt
# Run migrations
alembic upgrade head
# Start server
uvicorn app.main:app --host 0.0.0.0 --port 8000
# Or with reload (development)
uvicorn app.main:app --reload
```
### Running Tests
```bash
pytest tests/ -v
```
### API Documentation
- Swagger UI: `http://localhost:8000/docs`
- ReDoc: `http://localhost:8000/redoc`
- OpenAPI JSON: `http://localhost:8000/openapi.json`
---
## Security Considerations
### Authentication
- WordPress JWT tokens for user authentication
- `X-Website-ID` header for multi-tenant isolation
- Admin routes protected by admin role check
### Production Hardening
1. **SECRET_KEY** must be set to a strong, unique value
2. **ADMIN_PASSWORD** must not be the default
3. **CORS** origins should be explicitly configured
4. **Database** connections should use SSL in production
5. **Rate limiting** enabled for AI generation endpoints
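Some of these checks can be automated at startup. The sketch below covers the first three. The function name, the 32-character minimum for `SECRET_KEY`, and treating the placeholder values from the example environment file as insecure are all assumptions.

```python
from typing import List, Mapping

# Placeholder values from the example environment file; never valid in production.
PLACEHOLDERS = {"", "your-secret-key-here", "your-password"}


def production_config_problems(env: Mapping[str, str]) -> List[str]:
    """Return a list of hardening problems; empty when nothing was found."""
    problems: List[str] = []
    if env.get("ENVIRONMENT") != "production":
        return problems  # only enforce in production

    secret = env.get("SECRET_KEY", "")
    # Assumption: treat anything shorter than 32 characters as too weak.
    if secret in PLACEHOLDERS or len(secret) < 32:
        problems.append("SECRET_KEY is missing, a placeholder, or too short")

    if env.get("ADMIN_PASSWORD", "") in PLACEHOLDERS:
        problems.append("ADMIN_PASSWORD is missing or still a placeholder")

    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",")]
    if not any(origins) or "*" in origins:
        problems.append("ALLOWED_ORIGINS must list explicit origins")

    return problems
```

An application could refuse to start when this list is non-empty, which turns a misconfiguration into an immediate, visible failure.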
---
## Glossary
| Term | Definition |
|------|------------|
| **Tryout** | An exam/test assessment |
| **Item** | A single question in a tryout |
| **Session** | A student's attempt at a tryout |
| **CTT** | Classical Test Theory - traditional scoring |
| **IRT** | Item Response Theory - modern adaptive scoring |
| **NM** | Nilai Mentah - raw score [0-1000] |
| **NN** | Nilai Nasional - normalized score [0-1000] |
| **θ (theta)** | IRT ability estimate [-3 to +3] |
| **b** | IRT item difficulty [-3 to +3] |
| **p-value** | CTT proportion correct [0 to 1] |
| **Bobot** | CTT weight (1 - p) |
| **Rataan** | Mean (Indonesian) |
| **SB** | Simpangan Baku - Standard Deviation |
| **CAT** | Computer Adaptive Testing |
| **MLE** | Maximum Likelihood Estimation |
---
## References
- [PRD.md](./PRD.md) - Complete Product Requirements Document
- [project-brief.md](./project-brief.md) - Original technical specification
- [FastAPI Documentation](https://fastapi.tiangolo.com/)
- [SQLAlchemy 2.0](https://docs.sqlalchemy.org/en/20/)
- [Item Response Theory](https://en.wikipedia.org/wiki/Item_response_theory)