
Project Understanding: IRT-Powered Adaptive Question Bank System

Project Name: IRT Bank Soal
Version: 1.0.0
Last Updated: 2026-06-15
Repository: https://git.backoffice.biz.id/dwindown/yellow-bank-soal


Table of Contents

  1. Executive Summary
  2. Project Purpose
  3. Tech Stack
  4. Project Structure
  5. Core Concepts
  6. Data Models
  7. API Endpoints
  8. Key Services
  9. Scoring Formulas
  10. Configuration
  11. Workflows
  12. Deployment

Executive Summary

This is a FastAPI-based backend for managing adaptive assessment (tryout) exams. It supports both Classical Test Theory (CTT) and Item Response Theory (IRT) scoring, and a single deployment serves multiple WordPress sites.

Key Features

| Feature | Description |
| --- | --- |
| CTT Scoring | Classical Test Theory with exact Excel formula compatibility |
| IRT Support | Item Response Theory (1PL Rasch model) for adaptive testing |
| Multi-Site | Single backend serving multiple WordPress sites |
| AI Generation | Automatic question variant generation via OpenRouter |
| Excel Import/Export | Bulk import/export questions from Excel files |
| Adaptive Testing | Computer Adaptive Testing (CAT) with theta estimation |
| Normalization | Static, dynamic, or hybrid score normalization |

Project Purpose

The system replaces traditional fixed-difficulty exams with an adaptive question bank that:

  1. Measures student ability accurately using IRT theta estimation
  2. Provides comparable scores across different exam sessions via normalization
  3. Generates new questions using AI when needed
  4. Integrates with WordPress LMS platforms for student access
  5. Reduces exam fraud by delivering different question variants to each student

Tech Stack

Core Technologies

Framework:       FastAPI >= 0.104.1
Server:          Uvicorn >= 0.24.0
Database:        PostgreSQL + SQLAlchemy 2.0 (async)
ORM:             SQLAlchemy >= 2.0.23
Driver:          asyncpg >= 0.29.0
Migrations:      Alembic >= 1.13.0
Validation:      Pydantic >= 2.5.0

Data Processing

Excel:           openpyxl >= 3.1.2, pandas >= 2.1.4
Math/Science:    numpy >= 1.26.2, scipy >= 1.11.4

External Integrations

AI:              openai >= 1.6.1 (OpenAI SDK, used against the OpenRouter API)
Task Queue:      celery >= 5.3.6, redis >= 5.0.1 (Python client packages)
Admin Panel:     FastAPI-Admin >= 1.0.0

Project Structure

yellow-bank-soal/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app entry point
│   ├── admin.py             # FastAPI Admin configuration
│   ├── admin_web.py         # Admin web interface
│   ├── database.py          # Database configuration & session
│   │
│   ├── api/
│   │   └── v1/
│   │       ├── __init__.py
│   │       └── session.py   # Adaptive session endpoints
│   │
│   ├── core/
│   │   ├── __init__.py
│   │   ├── auth.py          # Authentication & authorization
│   │   ├── config.py        # Settings from environment
│   │   └── rate_limit.py    # Rate limiting
│   │
│   ├── models/
│   │   ├── __init__.py
│   │   ├── ai_generation_run.py
│   │   ├── item.py          # Question items
│   │   ├── report_schedule.py
│   │   ├── session.py       # Student tryout sessions
│   │   ├── tryout.py        # Tryout configurations
│   │   ├── tryout_import_snapshot.py
│   │   ├── tryout_snapshot_question.py
│   │   ├── tryout_stats.py  # Normalization statistics
│   │   ├── user.py
│   │   ├── user_answer.py   # Student responses
│   │   └── website.py
│   │
│   ├── routers/
│   │   ├── __init__.py
│   │   ├── admin.py         # Admin-only endpoints
│   │   ├── ai.py            # AI generation endpoints
│   │   ├── import_export.py # Excel import/export
│   │   ├── reports.py       # Report generation
│   │   ├── sessions.py      # Session management
│   │   ├── tryouts.py       # Tryout configuration
│   │   └── wordpress.py     # WordPress integration
│   │
│   ├── schemas/             # Pydantic request/response models
│   │   ├── __init__.py
│   │   ├── ai.py
│   │   ├── report.py
│   │   ├── session.py
│   │   ├── tryout.py
│   │   └── wordpress.py
│   │
│   └── services/
│       ├── __init__.py
│       ├── ai_generation.py # OpenRouter integration
│       ├── cat_selection.py # Computer Adaptive Testing
│       ├── config_management.py
│       ├── ctt_scoring.py   # CTT scoring engine
│       ├── excel_import.py  # Excel parsing
│       ├── irt_calibration.py # IRT calibration
│       ├── normalization.py
│       ├── reporting.py
│       ├── tryout_json_import.py
│       └── wordpress_auth.py
│
├── alembic/                 # Database migrations
│   ├── env.py
│   ├── script.py.mako
│   └── versions/
│
├── tests/                   # Unit & integration tests
│   ├── test_auth_scope.py
│   ├── test_auth_tokens.py
│   ├── test_model_mappings.py
│   ├── test_normalization.py
│   ├── test_operational_hardening.py
│   ├── test_route_wiring.py
│   ├── test_security_regressions.py
│   └── test_tryout_json_import.py
│
├── requirements.txt
├── alembic.ini
├── irt_1pl_mle.py          # Standalone IRT MLE script
├── PRD.md                  # Product Requirements Document
├── project-brief.md        # Technical specification
└── handoff.md             # Project handoff context

Core Concepts

1. Tryout (Exam)

A Tryout represents a complete exam/test with configurable behavior:

scoring_mode:       "ctt" | "irt" | "hybrid"
selection_mode:     "fixed" | "adaptive" | "hybrid"
normalization_mode: "static" | "dynamic" | "hybrid"

2. Item (Question)

An Item represents a single question with:

  • Content: stem (question text), options (A/B/C/D), correct_answer
  • CTT Parameters: p-value (difficulty), bobot (weight)
  • IRT Parameters: b (difficulty), se (standard error)
  • Metadata: slot position, difficulty level, AI generation info

3. Session (Student Attempt)

A Session tracks a student's attempt:

  • Links student (wp_user_id) to a Tryout
  • Records all answers via UserAnswer records
  • Stores computed scores: NM, NN, theta

4. Website (Multi-Tenant)

The system supports multiple WordPress websites from a single backend:

  • Each website has isolated data
  • Each request identifies its website via the X-Website-ID header
  • WordPress JWT tokens for authentication

Data Models

Entity Relationship Diagram

erDiagram
    Website ||--o{ Tryout : "hosts"
    Website ||--o{ User : "contains"
    Website ||--o{ Session : "serves"
    Website ||--o{ Item : "contains"
    
    Tryout ||--o{ Item : "contains"
    Tryout ||--o{ Session : "has"
    Tryout ||--o{ TryoutStats : "tracks"
    
    Session ||--o{ UserAnswer : "contains"
    Session }o--|| User : "belongs to"
    
    Item ||--o{ UserAnswer : "answered by"
    Item ||--o{ Item : "has variants"
    
    AIGenerationRun ||--o{ Item : "generates"

Model Summary

| Model | Purpose | Key Fields |
| --- | --- | --- |
| Website | Multi-tenant isolation | domain, wordpress_url |
| User | WordPress user mapping | wp_user_id, website_id |
| Tryout | Exam configuration | scoring_mode, selection_mode, normalization_mode |
| Item | Question | stem, options, ctt_p, ctt_bobot, irt_b, irt_se |
| Session | Student attempt | session_id, NM, NN, theta |
| UserAnswer | Single response | response, is_correct, bobot_earned |
| TryoutStats | Normalization data | participant_count, rataan, sb |
| AIGenerationRun | AI generation batch | model, status, items_generated |

API Endpoints

Public API (via /api/v1)

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /tryout/{tryout_id}/config | Get tryout configuration |
| PUT | /tryout/{tryout_id}/normalization | Update normalization settings |
| GET | /tryout/ | List tryouts for website |
| GET | /tryout/{tryout_id}/calibration-status | Get IRT calibration status |
| POST | /tryout/{tryout_id}/calibrate | Trigger IRT calibration |
| POST | /session/ | Create new session |
| GET | /session/{session_id} | Get session details |
| POST | /session/{session_id}/complete | Submit answers, calculate scores |

Admin API (requires admin role)

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /ai/generate | Generate AI questions |
| POST | /import/excel | Import questions from Excel |
| GET | /export/excel/{tryout_id} | Export questions to Excel |
| GET | /reports/* | Generate various reports |

Adaptive Session API (via /api/v1/session)

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /adaptive/start | Start adaptive session |
| POST | /adaptive/respond | Submit answer, get next item |
| POST | /adaptive/complete | Complete adaptive session |

Key Services

1. CTT Scoring Engine (ctt_scoring.py)

Implements Classical Test Theory scoring with exact Excel formulas.

Key Functions:

  • calculate_ctt_p() - Difficulty: p = Σ Benar / Total Peserta (correct answers / total participants)
  • calculate_ctt_bobot() - Weight: Bobot = 1 - p
  • calculate_ctt_nm() - Raw Score: NM = (Total_Bobot / Total_Bobot_Max) × 1000
  • calculate_ctt_nn() - Normalized: NN = 500 + 100 × ((NM - Rataan) / SB)
  • categorize_difficulty() - Categorize by p-value
  • update_tryout_stats() - Incrementally update normalization stats
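
How update_tryout_stats() performs its incremental update is not spelled out here. One common way to keep a running mean (rataan) and standard deviation (SB) without re-reading every session is Welford's algorithm. The sketch below assumes that approach; the class name RunningStats, the m2 accumulator, and the choice of population (rather than sample) variance are assumptions for illustration, not the project's actual schema.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical stand-in for the TryoutStats normalization fields."""
    participant_count: int = 0
    rataan: float = 0.0   # running mean of NM
    m2: float = 0.0       # running sum of squared deviations (assumed helper field)

    def update(self, nm: float) -> None:
        """Fold one new raw score (NM) into the running statistics."""
        self.participant_count += 1
        delta = nm - self.rataan
        self.rataan += delta / self.participant_count
        # Uses the mean both before and after the update (Welford's trick).
        self.m2 += delta * (nm - self.rataan)

    @property
    def sb(self) -> float:
        """Population standard deviation; 0.0 until two scores are seen."""
        if self.participant_count < 2:
            return 0.0
        return math.sqrt(self.m2 / self.participant_count)


stats = RunningStats()
for score in (400.0, 500.0, 600.0):
    stats.update(score)
```

Each update is O(1), so the statistics can be refreshed whenever a session completes.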

2. IRT Calibration (irt_calibration.py)

Implements Item Response Theory (1PL Rasch model) for adaptive testing.

Key Functions:

  • estimate_theta_mle() - MLE theta estimation for students
  • estimate_b() - IRT difficulty calibration for items
  • calibrate_item() - Calibrate single item from response data
  • calibrate_all() - Batch calibrate all items in tryout
  • calculate_fisher_information() - Fisher information for item selection

Parameters:

  • θ (theta): Student ability [-3, +3]
  • b: Item difficulty [-3, +3]
  • Probability: P(θ) = 1 / (1 + exp(-(θ - b)))
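
As a minimal illustration of the parameters above, the Rasch probability and the corresponding item information can be written in a few lines. For the 1PL model the Fisher information of an item at ability θ is P(θ)(1 − P(θ)), which peaks when θ equals b. The function names below are illustrative and may not match the signatures in irt_calibration.py.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """P(correct) under the 1PL Rasch model: 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def fisher_information(theta: float, b: float) -> float:
    """Item information for the 1PL model: P * (1 - P)."""
    p = rasch_probability(theta, b)
    return p * (1.0 - p)
```

An item whose difficulty matches the student's ability gives a 50% chance of success and the maximum information of 0.25.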

3. AI Generation (ai_generation.py)

Generates question variants using OpenRouter API.

Key Functions:

  • generate_question() - Generate single question via OpenRouter
  • generate_questions_batch() - Generate multiple questions
  • save_ai_question() - Save generated question to database
  • check_cache_reuse() - Check for reusable similar questions

Models Supported:

  • Qwen 2.5 32B (balanced)
  • Mistral Small (low cost)
  • Llama 3.3 70B (premium)

4. Excel Import/Export (excel_import.py)

Bulk import/export questions from Excel files.

Key Functions:

  • parse_excel_import() - Parse Excel file to items
  • bulk_insert_items() - Insert parsed items to database
  • export_questions_to_excel() - Export tryout to Excel

5. CAT Selection (cat_selection.py)

Computer Adaptive Testing item selection algorithm.

Key Functions:

  • select_next_item() - Select next item based on theta estimate
  • calculate_theta_update() - Update theta after response
  • check_termination() - Check if test should end
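
The selection and termination rules are not detailed in this document. A common CAT strategy, assumed here, is to pick the unanswered item with the highest Fisher information at the current theta and to stop once a maximum test length or a target standard error is reached. The signatures, the max_items default, and the se_target default below are hypothetical.

```python
import math
from typing import Optional


def item_information(theta: float, b: float) -> float:
    """1PL item information P * (1 - P) at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)


def select_next_item(
    theta: float,
    item_difficulties: dict[str, float],
    administered: set[str],
) -> Optional[str]:
    """Return the id of the most informative unanswered item, or None."""
    candidates = {
        item_id: b
        for item_id, b in item_difficulties.items()
        if item_id not in administered
    }
    if not candidates:
        return None
    return max(candidates, key=lambda i: item_information(theta, candidates[i]))


def should_terminate(
    theta: float,
    administered_difficulties: list[float],
    max_items: int = 30,      # assumed limit
    se_target: float = 0.3,   # assumed precision target
) -> bool:
    """Stop at max length or when SE = 1 / sqrt(total information) is small enough."""
    if len(administered_difficulties) >= max_items:
        return True
    total_info = sum(item_information(theta, b) for b in administered_difficulties)
    if total_info == 0.0:
        return False
    return 1.0 / math.sqrt(total_info) <= se_target
```

Under the Rasch model the most informative item is simply the one whose difficulty is closest to the current theta estimate.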

Scoring Formulas

CTT (Classical Test Theory)

Based on exact client Excel formulas:

# STEP 1: Item difficulty (Tingkat Kesukaran, p-value)
p = Σ Benar / Total Peserta          # correct answers / total participants

# STEP 2: Weight (Bobot)
Bobot = 1 - p

# STEP 3: Total correct per student (Total Benar per Siswa)
Total_Benar = count of correct answers

# STEP 4: Total weight earned per student (Total Bobot per Siswa)
Total_Bobot_Siswa = Σ Bobot for each correct answer

# STEP 5: Raw score (Nilai Mentah)
NM = (Total_Bobot_Siswa / Total_Bobot_Max) × 1000

# STEP 6: Normalized score (Nilai Nasional); Rataan = mean, SB = standard deviation
NN = 500 + 100 × ((NM - Rataan) / SB)
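
The six steps can be sketched directly in Python. This is an illustration of the formulas only: the function names are shortened, the inputs are plain lists rather than database rows, and any rounding or clamping that the real ctt_scoring.py applies to match the Excel sheet is not reproduced here.

```python
def ctt_p(item_responses: list[int]) -> float:
    """Step 1: proportion of participants who answered the item correctly."""
    if not item_responses:
        raise ValueError("item has no responses")
    return sum(item_responses) / len(item_responses)


def ctt_bobot(p: float) -> float:
    """Step 2: harder items (lower p) carry more weight."""
    return 1.0 - p


def ctt_nm(student_correct: list[int], bobots: list[float]) -> float:
    """Steps 3-5: weight earned over maximum weight, scaled to 0-1000."""
    total_max = sum(bobots)
    if total_max == 0:
        raise ValueError("total weight is zero")
    earned = sum(b for ok, b in zip(student_correct, bobots) if ok)
    return earned / total_max * 1000


def ctt_nn(nm: float, rataan: float, sb: float) -> float:
    """Step 6: z-score of NM rescaled to mean 500, standard deviation 100."""
    if sb == 0:
        raise ValueError("standard deviation is zero")
    return 500 + 100 * ((nm - rataan) / sb)


# Rows are items, columns are four students (1 = correct).
responses_by_item = [[1, 1, 1, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
bobots = [ctt_bobot(ctt_p(item)) for item in responses_by_item]
nm_student_2 = ctt_nm([1, 0, 1], bobots)  # second student: items 1 and 3 correct
```

In this toy data the weights are 0.25, 0.75 and 0.5, so the second student earns 0.75 of a possible 1.5.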

IRT (Item Response Theory)

1PL Rasch Model:

# Probability of correct response
P(θ, b) = 1 / (1 + exp(-(θ - b)))

# Log-likelihood for MLE
LL = Σ [u_i × log(P) + (1-u_i) × log(1-P)]

# Theta estimation via MLE
θ_mle = argmax_θ LL(θ)
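
The argmax has no closed form, so it is solved numerically. The sketch below uses Newton-Raphson, which is one standard choice; the project may instead rely on a scipy optimizer, and the function name and defaults here are assumptions. For the Rasch model the gradient of LL is Σ(u_i − P_i) and the negative second derivative is Σ P_i(1 − P_i). All-correct or all-wrong response patterns have no finite maximum, so the sketch returns the bounds of the [-3, +3] range used in this document.

```python
import math


def estimate_theta(
    responses: list[int],
    difficulties: list[float],
    max_iter: int = 50,
    tol: float = 1e-6,
) -> float:
    """Newton-Raphson MLE of theta for 1PL responses (1 = correct, 0 = wrong)."""
    if not responses or len(responses) != len(difficulties):
        raise ValueError("responses and difficulties must be non-empty and aligned")
    if all(u == 1 for u in responses):
        return 3.0   # likelihood keeps increasing; clamp to upper bound
    if all(u == 0 for u in responses):
        return -3.0  # likelihood keeps increasing downward; clamp to lower bound

    theta = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = sum(u - p for u, p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        step = gradient / information
        theta = max(-3.0, min(3.0, theta + step))
        if abs(step) < tol:
            break
    return theta
```

With two of three equally difficult items (b = 0) answered correctly, the estimate converges to ln 2 ≈ 0.693, the ability at which P = 2/3.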

Difficulty Categories (CTT Standard)

| p-value | Category | Description |
| --- | --- | --- |
| p < 0.30 | Sulit | Difficult |
| 0.30 ≤ p ≤ 0.70 | Sedang | Medium |
| p > 0.70 | Mudah | Easy |
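
The thresholds translate into a small function. The sketch mirrors the boundaries listed above; the actual categorize_difficulty() signature and return values are assumed.

```python
def categorize_difficulty(p: float) -> str:
    """Map a CTT p-value to the Indonesian category label."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    if p < 0.30:
        return "Sulit"    # difficult
    if p <= 0.70:
        return "Sedang"   # medium (both boundaries inclusive)
    return "Mudah"        # easy
```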

Configuration

Environment Variables

# Database
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/irt_bank_soal

# FastAPI
SECRET_KEY=your-secret-key-here
ENVIRONMENT=development  # development, staging, production
ENABLE_ADMIN=true
ADMIN_USERNAME=admin
ADMIN_PASSWORD=your-password

# OpenRouter (AI)
OPENROUTER_API_KEY=sk-or-v1-xxx
OPENROUTER_MODEL_QWEN=qwen/qwen2.5-32b-instruct
OPENROUTER_MODEL_CHEAP=mistralai/mistral-small-2603
OPENROUTER_MODEL_LLAMA=meta-llama/llama-3.3-70b-instruct

# Redis/Celery
REDIS_URL=redis://localhost:6379/0
CELERY_BROKER_URL=redis://localhost:6379/0

# CORS
ALLOWED_ORIGINS=http://localhost:3000,https://yourdomain.com
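
As a rough idea of how these variables might be read, the sketch below loads a few of them with the standard library only. The real app/core/config.py most likely uses Pydantic for this; the Settings class, its field subset, and load_settings() are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical subset of the settings listed above."""
    database_url: str
    secret_key: str
    environment: str
    allowed_origins: list[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read required variables, apply defaults, and split the CORS list."""
    missing = [name for name in ("DATABASE_URL", "SECRET_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required environment variables: {missing}")
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        database_url=env["DATABASE_URL"],
        secret_key=env["SECRET_KEY"],
        environment=env.get("ENVIRONMENT", "development"),
        allowed_origins=origins,
    )


example = load_settings({
    "DATABASE_URL": "postgresql+asyncpg://postgres:postgres@localhost:5432/irt_bank_soal",
    "SECRET_KEY": "your-secret-key-here",
    "ALLOWED_ORIGINS": "http://localhost:3000, https://yourdomain.com",
})
```

Passing the environment as a mapping keeps the loader easy to exercise in tests without touching the process environment.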

Tryout Configuration Options

# Scoring Mode
scoring_mode = "ctt"        # Classical Test Theory
scoring_mode = "irt"        # Item Response Theory
scoring_mode = "hybrid"     # Both (IRT for calibration, CTT for scoring)

# Selection Mode
selection_mode = "fixed"    # Fixed order questions
selection_mode = "adaptive" # Computer Adaptive Testing
selection_mode = "hybrid"   # Start fixed, switch to adaptive

# Normalization Mode
normalization_mode = "static"   # Use hardcoded rataan/sb
normalization_mode = "dynamic"  # Calculate from participant data
normalization_mode = "hybrid"   # Dynamic when sufficient data
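
The hybrid normalization rule ("dynamic when sufficient data") can be sketched as a small resolver. What counts as sufficient is not stated in this document, so the min_participants threshold of 100 below is purely an assumed placeholder, as is the function name.

```python
def resolve_normalization(
    mode: str,
    static_params: tuple[float, float],
    dynamic_params: tuple[float, float],
    participant_count: int,
    min_participants: int = 100,  # assumed threshold, not from the source
) -> tuple[float, float]:
    """Return the (rataan, sb) pair to use for the NN calculation."""
    if mode == "static":
        return static_params
    if mode == "dynamic":
        return dynamic_params
    if mode == "hybrid":
        # Fall back to the configured values until enough sessions exist.
        if participant_count >= min_participants:
            return dynamic_params
        return static_params
    raise ValueError(f"unknown normalization_mode: {mode!r}")
```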

Workflows

1. Student Taking a Tryout

sequenceDiagram
    participant S as Student
    participant API as FastAPI
    participant WP as WordPress
    
    S->>API: POST /session/ (start session)
    API-->>S: session_id
    
    loop For each question
        S->>API: GET /session/{id}/next-item
        API-->>S: Question data
        
        S->>API: POST /session/{id}/answer
        API-->>S: Next question or completion
    end
    
    S->>API: POST /session/{id}/complete
    API-->>S: NM, NN scores

2. Admin Importing Questions

flowchart TD
    A[Upload Excel File] --> B[Parse Excel]
    B --> C{Validate Structure}
    C -->|Invalid| D[Return Error]
    C -->|Valid| E[Calculate CTT p & bobot]
    E --> F[Bulk Insert Items]
    F --> G[Commit to Database]
    G --> H[Return Import Summary]

3. AI Question Generation

flowchart TD
    A[Request Generation] --> B{Check Cache}
    B -->|Found similar| C[Return Cached]
    B -->|Not found| D[Call OpenRouter API]
    D --> E{Parse Response}
    E -->|Parse Error| F[Return Error]
    E -->|Success| G[Save to Database]
    G --> H[Return Generated Item]
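
The control flow of this diagram can be sketched without any network code by injecting the model call as a function. This is a simplification under stated assumptions: the cache here is an exact-key dictionary rather than a similarity search, the "save" step is reduced to a cache write, and generate_item(), call_model, and REQUIRED_FIELDS are hypothetical names.

```python
import json
from typing import Callable

# Fields taken from the Item content description; the real schema may differ.
REQUIRED_FIELDS = {"stem", "options", "correct_answer"}


def generate_item(
    cache_key: str,
    cache: dict[str, dict],
    call_model: Callable[[str], str],
) -> dict:
    """Return a cached item if present, else call the model and validate its JSON."""
    if cache_key in cache:
        return cache[cache_key]          # "Found similar" branch

    raw = call_model(cache_key)          # stands in for the OpenRouter request
    try:
        item = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model response is not valid JSON") from exc
    if not isinstance(item, dict) or not REQUIRED_FIELDS.issubset(item):
        raise ValueError("model response is missing required fields")

    cache[cache_key] = item              # stands in for the database save
    return item
```

Injecting call_model keeps the parse and validation path testable without an API key.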

4. IRT Calibration

flowchart TD
    A[Collect Responses] --> B{Enough Data?}
    B -->|No| C[Wait for more]
    B -->|Yes| D[For each Item]
    D --> E[Get Response Matrix]
    E --> F[Estimate b via MLE]
    F --> G[Calculate Standard Error]
    G --> H[Update Item]
    H --> D
    D --> I[Mark Items Calibrated]
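
The per-item step of this flow (estimate b, then its standard error) can be sketched as follows, assuming student thetas are already available and treated as fixed. For the Rasch model the derivative of the log-likelihood with respect to b is Σ(P_i − u_i), and the standard error is 1 / sqrt(Σ P_i(1 − P_i)). The min_responses threshold of 30 and the function signature are assumptions; the document does not state how much data counts as enough.

```python
import math
from typing import Optional


def calibrate_item(
    thetas: list[float],
    responses: list[int],
    min_responses: int = 30,  # assumed "enough data" threshold
    max_iter: int = 50,
    tol: float = 1e-6,
) -> Optional[tuple[float, float]]:
    """Return (b, se) for one item, or None when there is too little data."""
    if len(thetas) != len(responses):
        raise ValueError("thetas and responses must be aligned")
    if len(responses) < min_responses:
        return None

    def probabilities(b: float) -> list[float]:
        return [1.0 / (1.0 + math.exp(-(t - b))) for t in thetas]

    b = 0.0
    for _ in range(max_iter):
        probs = probabilities(b)
        gradient = sum(p - u for p, u in zip(probs, responses))
        information = sum(p * (1.0 - p) for p in probs)
        step = gradient / information
        b = max(-3.0, min(3.0, b + step))  # keep b inside the documented range
        if abs(step) < tol:
            break

    information = sum(p * (1.0 - p) for p in probabilities(b))
    return b, 1.0 / math.sqrt(information)
```

Items answered correctly (or incorrectly) by everyone have no finite estimate; in this sketch they end up clamped at the range boundary.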

Deployment

Requirements

  • Python 3.10+
  • PostgreSQL 14+
  • Redis 6+ (for Celery)
  • Nginx (reverse proxy)
  • aaPanel with Python Manager (recommended)

Running the Application

# Install dependencies
pip install -r requirements.txt

# Run migrations
alembic upgrade head

# Start server
uvicorn app.main:app --host 0.0.0.0 --port 8000

# Or with reload (development)
uvicorn app.main:app --reload

Running Tests

pytest tests/ -v

API Documentation

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc
  • OpenAPI JSON: http://localhost:8000/openapi.json

Security Considerations

Authentication

  • WordPress JWT tokens for user authentication
  • X-Website-ID header for multi-tenant isolation
  • Admin routes protected by admin role check

Production Hardening

  1. SECRET_KEY must be set to a strong, unique value
  2. ADMIN_PASSWORD must not be the default
  3. CORS origins should be explicitly configured
  4. Database connections should use SSL in production
  5. Rate limiting enabled for AI generation endpoints
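
The first three items lend themselves to an automated startup check. The sketch below is one possible shape for it and is not taken from the codebase: the function name, the placeholder list (drawn from the example values in the Environment Variables section), and the 32-character minimum for SECRET_KEY are all assumptions.

```python
# Example values from the environment template, treated as "default" here.
PLACEHOLDER_VALUES = {"", "your-secret-key-here", "your-password"}


def check_production_settings(settings: dict[str, str]) -> list[str]:
    """Return a list of hardening problems; empty outside production or when clean."""
    problems: list[str] = []
    if settings.get("ENVIRONMENT") != "production":
        return problems

    secret = settings.get("SECRET_KEY", "")
    if secret in PLACEHOLDER_VALUES or len(secret) < 32:  # assumed minimum length
        problems.append("SECRET_KEY must be a strong, unique value")

    if settings.get("ADMIN_PASSWORD", "") in PLACEHOLDER_VALUES:
        problems.append("ADMIN_PASSWORD must not be the default")

    origins = [o.strip() for o in settings.get("ALLOWED_ORIGINS", "").split(",")]
    if not any(origins) or "*" in origins:
        problems.append("ALLOWED_ORIGINS must list explicit origins")

    return problems
```

Failing fast at startup when the list is non-empty makes a misconfigured production deployment visible immediately.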

Glossary

| Term | Definition |
| --- | --- |
| Tryout | An exam/test assessment |
| Item | A single question in a tryout |
| Session | A student's attempt at a tryout |
| CTT | Classical Test Theory - traditional scoring |
| IRT | Item Response Theory - modern adaptive scoring |
| NM | Nilai Mentah - raw score [0-1000] |
| NN | Nilai Nasional - normalized score [0-1000] |
| θ (theta) | IRT ability estimate [-3 to +3] |
| b | IRT item difficulty [-3 to +3] |
| p-value | CTT proportion correct [0 to 1] |
| Bobot | CTT weight (1 - p) |
| Rataan | Mean (Indonesian) |
| SB | Simpangan Baku - Standard Deviation |
| CAT | Computer Adaptive Testing |
| MLE | Maximum Likelihood Estimation |
