# Application Flow of the IRT-Powered Question Bank
This document describes the complete application flow, from data input to producing the next question based on IRT.
---
## 1. System Architecture
### 1.1 Technology Stack
```
Framework: FastAPI >= 0.104.1
Database: PostgreSQL + SQLAlchemy 2.0 (async)
AI: OpenAI (OpenRouter API)
Admin Panel: FastAPI-Admin
Math: numpy, scipy
Excel: openpyxl, pandas
```
### 1.2 Entity Relationship
```mermaid
erDiagram
Website ||--o{ Tryout : "hosts"
Website ||--o{ User : "contains"
Website ||--o{ Session : "serves"
Website ||--o{ Item : "contains"
Tryout ||--o{ Item : "contains"
Tryout ||--o{ Session : "has"
Session ||--o{ UserAnswer : "contains"
Item ||--o{ Item : "has variants"
Item ||--o{ UserAnswer : "answered by"
AIGenerationRun ||--o{ Item : "generates"
```
---
## 2. Core Concepts
### 2.1 Tryout (Exam)
A **Tryout** represents one complete exam, configured with:
| Field | Options | Default | Description |
|-------|---------|---------|-------------|
| `scoring_mode` | `ctt`, `irt`, `hybrid` | `ctt` | Score calculation method |
| `selection_mode` | `fixed`, `adaptive`, `hybrid` | `fixed` | Item selection strategy |
| `normalization_mode` | `static`, `dynamic`, `hybrid` | `static` | Normalization method |
### 2.2 Item (Question)
An **Item** represents one question, with these parameters:
| Field | Description |
|-------|-------------|
| `stem` | Question text |
| `options` | Answer choices (A/B/C/D/E) |
| `correct_answer` | Answer key |
| `slot` | Question number position (1, 2, 3...) |
| `level` | Difficulty category (mudah/sedang/sulit, i.e. easy/medium/hard) |
| `parent_item_id` | ID of the original item (if this item is a variant) |
| `calibrated` | IRT calibration status |
| `irt_b` | Item difficulty parameter |
| `irt_se` | Standard error of the difficulty estimate |
| `ctt_p` | P-value (CTT difficulty index) |
| `ctt_bobot` | Item weight = 1 - p |
### 2.3 Session (Student Attempt)
A **Session** tracks a student's activity:
| Field | Description |
|-------|-------------|
| `session_id` | Unique identifier |
| `wp_user_id` | User ID from WordPress |
| `tryout_id` | The tryout being taken |
| `theta` | Estimated IRT ability |
| `theta_se` | Standard error of theta |
| `NM` | Nilai Mentah (raw score) |
| `NN` | Nilai Nasional (normalized score) |
| `is_completed` | Completion status |
### 2.4 Website (Multi-Tenant)
The system serves multiple WordPress websites from a single backend:
- Data is isolated per website
- Authentication via the `X-Website-ID` header
- WordPress JWT tokens
---
## 3. Data Input Flow
### 3.1 Incoming Data Sources
| Source | Format | Endpoint / Module | Purpose |
|--------|--------|-------------------|---------|
| Admin Import | Excel (.xlsx) | `POST /import/excel` | Bulk import from an Excel file |
| JSON Import | JSON | `tryout_json_import.py` | Import from JSON (external LMS) |
| AI Generation | API Request | `POST /ai/generate` | Generate new item variants |
### 3.2 JSON Import Flow
```
┌────────────────────────────────────────────────────────────┐
│  ADMIN: Import Tryout JSON                                 │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  1. Upload JSON file                                       │
│     └─> File contains one complete tryout (e.g. "TO 2024") │
│     └─> Made up of N items (slot 1, 2, 3, ...)             │
│                                                            │
│  2. Parse JSON                                             │
│     └─> Extract each question → Item record                │
│     └─> Generate a unique item_id                          │
│                                                            │
│  3. Save to Database                                       │
│     └─> Item.calibrated = False (no IRT params yet)        │
│     └─> Item.ctt_p = NULL (no response data yet)           │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
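The import steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's `tryout_json_import.py`: the JSON layout (`tryout_id`, `items`, and the per-question keys) and the helper `parse_tryout_json` are assumptions, and only the Item field names are taken from Section 2.2.

```python
import json
import uuid


def parse_tryout_json(raw: str) -> list[dict]:
    """Turn one tryout JSON document into a list of Item-like records.

    The input layout is hypothetical; adapt the keys to the real export format.
    """
    data = json.loads(raw)
    items = []
    for question in data["items"]:
        items.append({
            "item_id": uuid.uuid4().hex,   # step 2: generate a unique item_id
            "tryout_id": data["tryout_id"],
            "slot": question["slot"],
            "stem": question["stem"],
            "options": question["options"],
            "correct_answer": question["correct_answer"],
            "parent_item_id": None,        # imported items are originals
            "calibrated": False,           # step 3: no IRT params yet
            "irt_b": None,
            "ctt_p": None,                 # step 3: no response data yet
        })
    return items


sample = json.dumps({
    "tryout_id": "TO-2024",
    "items": [
        {"slot": 1, "stem": "2 + 2 = ?", "options": {"A": "3", "B": "4"}, "correct_answer": "B"},
        {"slot": 2, "stem": "3 x 3 = ?", "options": {"A": "9", "B": "6"}, "correct_answer": "A"},
    ],
})
records = parse_tryout_json(sample)
print(len(records), records[0]["calibrated"], records[0]["ctt_p"])
```

Every record starts uncalibrated, so a freshly imported tryout can only be served with fixed selection until calibration has run.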
### 3.3 AI Variant Generation Flow
```
┌────────────────────────────────────────────────────────────┐
│  ADMIN: Generate AI Variants                               │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  1. Select the Original Item                               │
│     └─> Take one item from the imported tryout             │
│                                                            │
│  2. Send a Request to the OpenRouter API                   │
│     └─> Send a prompt containing the original item         │
│     └─> Ask for a variant at a different level             │
│                                                            │
│  3. Save the Variant                                       │
│     └─> variant.item_id = unique_id                        │
│     └─> variant.parent_item_id = original.id               │
│     └─> variant.slot = original.slot (same number)         │
│                                                            │
│  4. Result                                                 │
│     └─> Slot 1: 1 original + 1 variant = 2 items           │
│     └─> Slot 2: 1 original + 1 variant = 2 items           │
│     └─> Total: 2N items (N slots × 2 items per slot)       │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
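The relationship between an original and its variant (step 3) can be illustrated as follows. This is a sketch under assumptions: the OpenRouter call is left out entirely, `variant_content` stands in for whatever the model returns, and `make_variant` is a hypothetical helper rather than a function from the codebase.

```python
import uuid


def make_variant(original: dict, variant_content: dict) -> dict:
    """Build a variant record that shares the original's slot.

    `variant_content` stands in for the AI response (stem, options, level, ...).
    """
    return {
        "item_id": uuid.uuid4().hex,            # variant.item_id = unique_id
        "parent_item_id": original["item_id"],  # points back at the original
        "tryout_id": original["tryout_id"],
        "slot": original["slot"],               # same slot number as the original
        "calibrated": False,                    # assumption: variants start uncalibrated
        **variant_content,
    }


original = {"item_id": "item-1", "tryout_id": "TO-2024", "slot": 1,
            "stem": "2 + 2 = ?", "level": "mudah"}
variant = make_variant(original, {"stem": "12 + 29 = ?", "level": "sedang"})

# Group by slot: the slot now holds 1 original + 1 variant.
by_slot = {}
for item in (original, variant):
    by_slot.setdefault(item["slot"], []).append(item["item_id"])
print(by_slot)
```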
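### 3.4 Example Data Structure After Import + Generate
The tree shows the items once calibration (Section 5) has also run; immediately after import, items still have `calibrated=False` and no `irt_b`.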
```
Tryout: "TO-2024"
├── Slot 1
│ ├── Item #1 (original, calibrated=True, irt_b=0.5)
│ └── Item #2 (variant, calibrated=True, irt_b=-0.3)
├── Slot 2
│ ├── Item #3 (original, calibrated=True, irt_b=0.8)
│ └── Item #4 (variant, calibrated=True, irt_b=0.2)
└── ...
```
---
## 4. Score Processing
### 4.1 CTT (Classical Test Theory)
#### Step-by-Step Formula:
```python
# STEP 1: Item difficulty (p-value) = correct responses / total participants
p = 70 / 100  # 70 of 100 students answered correctly → p = 0.70
# STEP 2: Item weight (bobot)
bobot = 1 - p  # 1 - 0.70 = 0.30
# STEP 3: Number of correct answers per student
correct_weights = [0.3, 0.5, 0.2]  # weights of the items this student got right
total_benar = len(correct_weights)  # 3
# STEP 4: Total weight earned per student = Σ bobot over correct answers
total_bobot_siswa = sum(correct_weights)  # 0.3 + 0.5 + 0.2 = 1.0
# STEP 5: Raw score (Nilai Mentah, NM)
total_bobot_max = 2.5  # Σ bobot over all items in the tryout
NM = (total_bobot_siswa / total_bobot_max) * 1000  # (1.0 / 2.5) × 1000 = 400
# STEP 6: Normalized score (Nilai Nasional, NN)
rataan, sb = 450, 80  # mean (Rataan) and standard deviation (SB) of NM
NN = 500 + 100 * ((NM - rataan) / sb)  # 500 + 100 × ((400 - 450) / 80) = 437.5
```
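To apply the six steps to a whole cohort rather than a single worked example, something like the sketch below could be used. It is not the code in `app/services/ctt_scoring.py`: the function name `ctt_scores`, the response-matrix layout, and the use of the population standard deviation (`statistics.pstdev`) are assumptions, since the document does not say which standard deviation SB refers to.

```python
from statistics import mean, pstdev


def ctt_scores(responses: list[list[int]]) -> list[dict]:
    """Compute total_benar, NM and NN for each student.

    responses[s][i] is 1 if student s answered item i correctly, else 0.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Steps 1-2: p-value and weight (bobot) per item
    p = [sum(row[i] for row in responses) / n_students for i in range(n_items)]
    bobot = [1 - p_i for p_i in p]
    total_bobot_max = sum(bobot)
    # Steps 3-5: correct count, earned weight and raw score (NM) per student
    results = []
    for row in responses:
        earned = sum(w for w, u in zip(bobot, row) if u == 1)
        results.append({"total_benar": sum(row), "NM": earned / total_bobot_max * 1000})
    # Step 6: normalize NM to mean 500 and standard deviation 100
    nm_values = [r["NM"] for r in results]
    rataan, sb = mean(nm_values), pstdev(nm_values)
    for r in results:
        r["NN"] = 500 + 100 * (r["NM"] - rataan) / sb
    return results


cohort = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
scores = ctt_scores(cohort)
for s in scores:
    print(s["total_benar"], round(s["NM"], 1), round(s["NN"], 1))
```

The sketch divides by `sb` and `total_bobot_max` without guards, so a cohort where every student has the same NM, or where every item has p = 1, would need special handling.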
#### Difficulty Categories (CTT Standard):
| p-value | Category | Meaning |
|---------|----------|---------|
| p < 0.30 | Hard (sulit) | Fewer than 30% of students answer correctly |
| 0.30 ≤ p ≤ 0.70 | Medium (sedang) | 30-70% of students answer correctly |
| p > 0.70 | Easy (mudah) | More than 70% of students answer correctly |
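The thresholds in the table translate directly into a small classifier. The function name `ctt_category` is hypothetical, and returning the `level` strings from Section 2.2 is an assumption; the document does not state that `level` is derived from `ctt_p`.

```python
def ctt_category(p: float) -> str:
    """Map a CTT p-value to the difficulty category from the table above."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p < 0.30:
        return "sulit"   # hard
    if p <= 0.70:
        return "sedang"  # medium
    return "mudah"       # easy


for p in (0.10, 0.30, 0.70, 0.85):
    print(p, ctt_category(p))
```

Both boundary values, 0.30 and 0.70, fall in the medium category, matching the inclusive range in the table.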
### 4.2 IRT (Item Response Theory) - 1PL Rasch Model
#### Core Formula:
```python
from math import exp

# Probability of a correct response
def P(θ, b):
    return 1 / (1 + exp(-(θ - b)))
# Where:
# - θ (theta) = student ability, typically in [-3, +3]
# - b = item difficulty, typically in [-3, +3]
# Example:
# - A student with θ = 0.5 faces an item with b = 0.5
# - P(0.5, 0.5) = 1 / (1 + exp(0)) = 0.5 (50% chance of answering correctly)
```
#### Interpreting Theta:
| Theta | Ability | Probability of a Correct Answer (if b=0) |
|-------|---------|-------------------------------------------|
| -3.0 | Very weak | ~5% |
| -1.5 | Weak | ~18% |
| 0.0 | Average | ~50% |
| +1.5 | Strong | ~82% |
| +3.0 | Very strong | ~95% |
#### Theta Estimation via MLE:
```python
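from math import exp, log
# Log-likelihood: LL = Σ [u_i × log(P_i) + (1 - u_i) × log(1 - P_i)]
# u_i = 1 if correct, 0 if incorrect; b_i = difficulty of item i
def LL(θ, u, b):
    P = [1 / (1 + exp(-(θ - b_i))) for b_i in b]
    return sum(u_i * log(P_i) + (1 - u_i) * log(1 - P_i) for u_i, P_i in zip(u, P))
# Theta estimate = the θ that maximizes LL: θ_mle = argmax_θ LL(θ)
# (illustrated here with a coarse grid search over [-3, +3] in steps of 0.01)
def θ_mle(u, b):
    return max((k / 100 for k in range(-300, 301)), key=lambda θ: LL(θ, u, b))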
```
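One common way to carry out this maximization is Newton-Raphson, which also yields the standard error stored in `theta_se`. The sketch below is an assumption about how it could be done, not the method confirmed to be in `app/services/irt_calibration.py`; the names `rasch_p` and `estimate_theta` and the clamp to [-3, 3] are illustrative choices.

```python
from math import exp, sqrt


def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under the 1PL Rasch model."""
    return 1.0 / (1.0 + exp(-(theta - b)))


def estimate_theta(responses, difficulties, theta=0.0, max_iter=50, tol=1e-6):
    """Newton-Raphson MLE of theta; returns (theta, standard_error).

    responses: 1 = correct, 0 = incorrect; difficulties: the items' b values.
    Theta is clamped to [-3, 3] because all-correct or all-incorrect
    response patterns have no finite maximum.
    """
    for _ in range(max_iter):
        probs = [rasch_p(theta, b) for b in difficulties]
        gradient = sum(u - p for u, p in zip(responses, probs))  # dLL/dθ
        information = sum(p * (1 - p) for p in probs)            # -d²LL/dθ²
        step = gradient / information
        theta = max(-3.0, min(3.0, theta + step))
        if abs(step) < tol:
            break
    information = sum(p * (1 - p) for p in (rasch_p(theta, b) for b in difficulties))
    return theta, 1.0 / sqrt(information)


theta, se = estimate_theta([1, 1, 0, 1, 0], [-1.0, -0.5, 0.0, 0.5, 1.0])
print(round(theta, 3), round(se, 3))
```

The standard error is the inverse square root of the test information at the estimate, so it shrinks as more items are answered.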
### 4.3 Scoring Mode Combinations
| Configuration | Meaning |
|---------------|---------|
| `scoring_mode="ctt"` | Final score = NM, NN |
| `scoring_mode="irt"` | Final score = theta × 200 + 500 |
| `scoring_mode="hybrid"` | Both the CTT score and the IRT theta are tracked |
---
## 5. IRT Calibration
### 5.1 What Is Calibration?
**IRT calibration** is the process of estimating the `b` (difficulty) parameter of each item from student response data.
### 5.2 When Does an Item Become Calibrated?
```
┌────────────────────────────────────────────────────────────┐
│  REQUIREMENTS FOR A CALIBRATED ITEM                        │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  1. Minimum Response Sample                                │
│     └─> Enough response data (default: 100 students)       │
│                                                            │
│  2. IRT b Parameter                                        │
│     └─> Already estimated via MLE                          │
│                                                            │
│  3. IRT SE (Standard Error)                                │
│     └─> Already computed                                   │
│                                                            │
│  4. Item.calibrated = True                                 │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
### 5.3 IRT Calibration Flow
```mermaid
flowchart TD
A[Collect Response Data] --> B{Have Min Sample?}
B -->|No| C[Wait for more students]
C --> A
B -->|Yes| D[For each Item]
D --> E[Build Response Matrix]
E --> F[Estimate b via MLE]
F --> G[Calculate Standard Error]
G --> H[Update Item.irt_b]
H --> I[Item.calibrated = True]
I --> D
D --> J[Calibration Complete]
```
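The per-item part of this flow (minimum sample check, estimate `b`, compute the standard error, set the flag) might look like the sketch below. It assumes the abilities of the responding students are already known, for example from provisional estimates; the document only says that `b` is estimated via MLE, so `calibrate_item` and its Newton-Raphson update are illustrative, not the project's implementation.

```python
from math import exp, sqrt

MIN_CALIBRATION_SAMPLE = 100  # default named in the tryout configuration


def calibrate_item(thetas, responses, min_sample=MIN_CALIBRATION_SAMPLE,
                   max_iter=50, tol=1e-6):
    """Estimate the Rasch difficulty b of one item; returns Item-like fields.

    thetas: abilities of the students who answered the item (assumed known).
    responses: 1 = correct, 0 = incorrect, in the same order as thetas.
    """
    n_correct = sum(responses)
    if len(responses) < min_sample or n_correct in (0, len(responses)):
        # Too few responses, or no finite MLE (everyone right or everyone wrong)
        return {"calibrated": False, "irt_b": None, "irt_se": None}
    b = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + exp(-(t - b))) for t in thetas]
        gradient = sum(p - u for p, u in zip(probs, responses))  # dLL/db
        information = sum(p * (1 - p) for p in probs)            # -d²LL/db²
        step = gradient / information
        b += step
        if abs(step) < tol:
            break
    probs = [1.0 / (1.0 + exp(-(t - b))) for t in thetas]
    information = sum(p * (1 - p) for p in probs)
    return {"calibrated": True, "irt_b": b, "irt_se": 1.0 / sqrt(information)}


# 100 average students (θ = 0), 75 of whom answered correctly: an easy item.
easy = calibrate_item([0.0] * 100, [1] * 75 + [0] * 25)
print(easy["calibrated"], round(easy["irt_b"], 3), round(easy["irt_se"], 3))
```

With all abilities at 0, the estimate reduces to minus the log-odds of a correct answer, so 75% correct gives b close to -ln 3.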
### 5.4 Triggering Calibration
Calibration can be triggered via:
1. **API Endpoint:**
```
POST /tryout/{tryout_id}/calibrate
```
2. **Admin Panel:**
- Open `/admin` → Tryouts → select a tryout → Trigger calibration
3. **Background Job (if configured):**
- Runs after enough responses have been collected
---
## 6. Item Selection Modes
### 6.1 Fixed Selection
**Fixed** = Items are presented in slot order.
```python
# Flow:
# 1. The student starts a session
# 2. Serve the item with slot=1 (lowest position)
# 3. Once it is answered, serve slot=2
# 4. Continue until the last slot
```
**Characteristics:**
- Predictable: the item order is fixed
- Does not require IRT calibration
- Every student gets the same item at the same position
### 6.2 Adaptive Selection (CAT)
**Adaptive** = Items are selected according to the student's current ability estimate (theta).
```python
# Flow:
# 1. The student starts a session (θ = 0.0 by default)
# 2. Select an item with b ≈ θ
# 3. The student answers → update θ
# 4. Select a new item with b ≈ the updated θ
# 5. Repeat until the termination condition is met
```
**Characteristics:**
- Personalized: each student sees different items
- Requires calibrated items
- Item selection uses Fisher information
#### Fisher Information Formula:
```python
from math import exp

# Item information at the current theta
def I(θ, b):
    P = 1 / (1 + exp(-(θ - b)))  # P(θ) = 1 / (1 + exp(-(θ - b)))
    return P * (1 - P)           # I(θ) = P(θ) × (1 - P(θ))
# The item with the MAXIMUM information is selected:
# it is the most informative item for the current theta
```
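Selecting the next item by maximum information could be sketched as below. The item dictionaries and the function names `fisher_information` and `next_item_adaptive` are assumptions for illustration; the field names follow Section 2.2.

```python
from math import exp


def fisher_information(theta: float, b: float) -> float:
    """Information of a 1PL item with difficulty b at ability theta."""
    p = 1.0 / (1.0 + exp(-(theta - b)))
    return p * (1.0 - p)


def next_item_adaptive(items, answered_ids, theta):
    """Pick the calibrated, unanswered item with maximum information at theta."""
    candidates = [
        item for item in items
        if item["calibrated"] and item["item_id"] not in answered_ids
    ]
    return max(candidates,
               key=lambda item: fisher_information(theta, item["irt_b"]),
               default=None)


items = [
    {"item_id": 1, "irt_b": 0.5, "calibrated": True},
    {"item_id": 2, "irt_b": -0.3, "calibrated": True},
    {"item_id": 3, "irt_b": 0.8, "calibrated": True},
    {"item_id": 4, "irt_b": None, "calibrated": False},  # skipped: not calibrated
]
choice = next_item_adaptive(items, answered_ids={1}, theta=0.5)
print(choice["item_id"])
```

Under the 1PL model the information peaks at 0.25 when b = θ and falls off as |θ - b| grows, so maximizing information is equivalent to picking the item whose difficulty is closest to theta.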
### 6.3 Hybrid Selection
**Hybrid** = A combination of fixed and adaptive selection.
```python
# Flow:
# 1. Slots 1-N: fixed selection (sequential)
# 2. After slot N: switch to adaptive selection
# 3. Theta has already been updated during the fixed portion
# 4. The adaptive portion uses theta to select items
```
**Parameter:**
- `hybrid_transition_slot` = The slot at which selection switches to adaptive
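A hybrid selector can be sketched as a fixed phase followed by an adaptive phase. The sketch assumes that slots up to and including `hybrid_transition_slot` are served in order and that everything after is chosen adaptively; the function name `next_item_hybrid` and the item dictionaries are hypothetical.

```python
def next_item_hybrid(items, answered_ids, theta, hybrid_transition_slot):
    """Fixed order up to the transition slot, then nearest-difficulty selection."""
    remaining = [i for i in items if i["item_id"] not in answered_ids]
    fixed_part = [i for i in remaining if i["slot"] <= hybrid_transition_slot]
    if fixed_part:
        # Fixed portion: lowest unanswered slot first
        return min(fixed_part, key=lambda i: i["slot"])
    # Adaptive portion: calibrated item with b closest to the current theta
    adaptive_part = [i for i in remaining if i["calibrated"]]
    return min(adaptive_part, key=lambda i: abs(i["irt_b"] - theta), default=None)


items = [
    {"item_id": "a", "slot": 1, "irt_b": 0.0, "calibrated": True},
    {"item_id": "b", "slot": 2, "irt_b": 1.0, "calibrated": True},
    {"item_id": "c", "slot": 3, "irt_b": -1.2, "calibrated": True},
    {"item_id": "d", "slot": 4, "irt_b": 0.9, "calibrated": True},
]
first = next_item_hybrid(items, set(), theta=0.0, hybrid_transition_slot=2)
later = next_item_hybrid(items, {"a", "b"}, theta=0.7, hybrid_transition_slot=2)
print(first["item_id"], later["item_id"])
```

In the second call the fixed portion is exhausted, so the selector skips slot 3 and jumps to the slot-4 item, whose difficulty is nearest to θ = 0.7.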
### 6.4 Comparison of Selection Modes
| Mode | Requires Calibration | Personalization | Predictable |
|------|----------------------|-----------------|-------------|
| Fixed | No | No | Yes |
| Adaptive | Yes | Yes | No |
| Hybrid | Partial | Partial | Partial |
---
## 7. Student Session Flow
### 7.1 Full Student Flow
```mermaid
sequenceDiagram
participant S as Student
participant API as FastAPI
participant DB as Database
S->>API: POST /session/ (start session)
API->>DB: Create session, θ=0.0
DB-->>API: session_id
API-->>S: session_id
loop For each question (adaptive/fixed/hybrid)
S->>API: GET /session/{id}/next-item
API->>DB: Query next item based on selection_mode
DB-->>API: Item data
API-->>S: Question
S->>API: POST /session/{id}/answer
API->>API: Update θ (if adaptive)
API->>DB: Save UserAnswer
DB-->>API: Saved
API-->>S: Ack + next question
end
S->>API: POST /session/{id}/complete
API->>API: Calculate NM, NN, final theta
API->>DB: Update session
DB-->>API: Updated
API-->>S: Final scores
```
### 7.2 Next-Item Selection by Mode
```
┌────────────────────────────────────────────────────────────┐
│  SELECTION MODE = FIXED                                    │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  SELECT * FROM items                                       │
│  WHERE tryout_id = ?                                       │
│    AND items.id NOT IN (answered_items)                    │
│  ORDER BY slot ASC                                         │
│  LIMIT 1                                                   │
│                                                            │
│  Result: the unanswered item with the lowest slot          │
│                                                            │
└────────────────────────────────────────────────────────────┘
┌────────────────────────────────────────────────────────────┐
│  SELECTION MODE = ADAPTIVE                                 │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  current_theta = session.theta  -- e.g., 0.5               │
│                                                            │
│  SELECT * FROM items                                       │
│  WHERE tryout_id = ?                                       │
│    AND calibrated = TRUE                                   │
│    AND items.id NOT IN (answered_items)                    │
│  ORDER BY ABS(irt_b - current_theta) ASC  -- closest       │
│  LIMIT 1                                                   │
│                                                            │
│  Result: the item with b ≈ θ                               │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
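Both queries can be tried against an in-memory SQLite database using only the Python standard library. This is a sketch, not the production query layer (the project uses PostgreSQL with SQLAlchemy): the table layout, the `user_answers` table name, and the `id ASC` tiebreaker are assumptions added so the example is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, tryout_id TEXT, slot INTEGER,
                        calibrated INTEGER, irt_b REAL);
    CREATE TABLE user_answers (session_id TEXT, item_id INTEGER);
    INSERT INTO items VALUES (1, 'TO-2024', 1, 1, 0.5), (2, 'TO-2024', 1, 1, -0.3),
                             (3, 'TO-2024', 2, 1, 0.8), (4, 'TO-2024', 2, 0, NULL);
    INSERT INTO user_answers VALUES ('s1', 1);  -- item 1 is already answered
""")

FIXED_SQL = """
    SELECT id FROM items
    WHERE tryout_id = :tryout_id
      AND id NOT IN (SELECT item_id FROM user_answers WHERE session_id = :session_id)
    ORDER BY slot ASC, id ASC
    LIMIT 1
"""
ADAPTIVE_SQL = """
    SELECT id FROM items
    WHERE tryout_id = :tryout_id
      AND calibrated = 1
      AND id NOT IN (SELECT item_id FROM user_answers WHERE session_id = :session_id)
    ORDER BY ABS(irt_b - :theta) ASC
    LIMIT 1
"""
fixed_id = conn.execute(
    FIXED_SQL, {"tryout_id": "TO-2024", "session_id": "s1"}
).fetchone()[0]
adaptive_id = conn.execute(
    ADAPTIVE_SQL, {"tryout_id": "TO-2024", "session_id": "s1", "theta": 0.5}
).fetchone()[0]
print(fixed_id, adaptive_id)
```

With two items per slot, the fixed query as written returns the slot-1 variant (item 2) once the original has been answered; whether the real service limits a session to one item per slot is not stated in this document.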
---
## 8. Tryout Configuration
### 8.1 All Configuration Options
```python
# Scoring
scoring_mode = "ctt" # Final score = NM, NN (default)
scoring_mode = "irt" # Final score = theta × 200 + 500
scoring_mode = "hybrid" # CTT score and IRT theta are both tracked
# Selection
selection_mode = "fixed" # Sequential
selection_mode = "adaptive" # CAT based on theta
selection_mode = "hybrid" # Fixed until transition slot
# Normalization
normalization_mode = "static" # Use static_rataan, static_sb
normalization_mode = "dynamic" # Calculate from participant data
normalization_mode = "hybrid" # Dynamic when min_sample reached
# IRT Settings
min_calibration_sample = 100 # Min responses for calibration
theta_estimation_method = "mle" # mle, map, eap
fallback_to_ctt_on_error = True # Fallback if IRT fails
# Hybrid Settings
hybrid_transition_slot = 10 # Switch to adaptive at slot 10
# AI Settings
ai_generation_enabled = True # Allow AI generated items
```
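The options above could be grouped into a validated configuration object, as in the sketch below. The class `TryoutConfig` and the property `needs_calibrated_items` are hypothetical; the mode defaults come from Section 2.1, and the remaining defaults simply reuse the values listed above, which is an assumption for fields such as `hybrid_transition_slot`.

```python
from dataclasses import dataclass

SCORING_MODES = {"ctt", "irt", "hybrid"}
SELECTION_MODES = {"fixed", "adaptive", "hybrid"}
NORMALIZATION_MODES = {"static", "dynamic", "hybrid"}


@dataclass
class TryoutConfig:
    scoring_mode: str = "ctt"
    selection_mode: str = "fixed"
    normalization_mode: str = "static"
    min_calibration_sample: int = 100
    theta_estimation_method: str = "mle"
    fallback_to_ctt_on_error: bool = True
    hybrid_transition_slot: int = 10
    ai_generation_enabled: bool = True

    def __post_init__(self):
        # Reject values outside the documented option sets
        checks = [
            ("scoring_mode", SCORING_MODES),
            ("selection_mode", SELECTION_MODES),
            ("normalization_mode", NORMALIZATION_MODES),
        ]
        for name, allowed in checks:
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")

    @property
    def needs_calibrated_items(self) -> bool:
        """Adaptive and hybrid selection rely on calibrated items."""
        return self.selection_mode != "fixed"


default_config = TryoutConfig()
adaptive_config = TryoutConfig(scoring_mode="hybrid", selection_mode="adaptive")
print(default_config.needs_calibrated_items, adaptive_config.needs_calibrated_items)
```

Validating at construction time keeps a typo such as `selection_mode="adaptif"` from silently reaching the item selector.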
### 8.2 How to Change the Configuration
#### Via Database:
```sql
UPDATE tryouts
SET
scoring_mode = 'hybrid',
selection_mode = 'adaptive',
normalization_mode = 'dynamic'
WHERE tryout_id = 'your-tryout-id';
```
#### Via Admin Panel:
1. Open `/admin`
2. Select the **Tryouts** menu
3. Edit the desired tryout
4. Change the fields as needed
5. Save
---
## 9. End-to-End Flow Summary
### 9.1 Admin Flow (One-Time / Periodic)
```
┌────────────────────────────────────────────────────────────┐
│  1. IMPORT TRYOUT JSON                                     │
│     Input: JSON file (1 tryout = 1 exam)                   │
│     Output: N items in the database                        │
│                                                            │
│  2. AI GENERATE VARIANTS                                   │
│     Input: Original item                                   │
│     Output: Variant item (same slot, different content)    │
│     Result: 2N items (N slots × 2 items per slot)          │
│                                                            │
│  3. COLLECT RESPONSE DATA                                  │
│     Input: Student answers                                 │
│     Output: UserAnswer records                             │
│                                                            │
│  4. IRT CALIBRATION                                        │
│     Input: Response data (min 100 students)                │
│     Output: Item.irt_b, Item.irt_se, Item.calibrated=True  │
│                                                            │
│  5. CONFIGURE TRYOUT                                       │
│     Input: Set selection_mode = 'adaptive'                 │
│     Output: Tryout ready for adaptive testing              │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
### 9.2 Student Flow (Every Exam)
```
┌────────────────────────────────────────────────────────────┐
│  1. START SESSION                                          │
│     Input: tryout_id                                       │
│     Output: session_id, theta=0.0                          │
│                                                            │
│  2. ANSWER LOOP                                            │
│     For each question:                                     │
│     - Get next item (based on selection_mode)              │
│     - Submit answer                                        │
│     - If adaptive: update theta                            │
│                                                            │
│  3. COMPLETE SESSION                                       │
│     Input: All answers                                     │
│     Output: NM, NN, theta, completion status               │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
### 9.3 Key Concepts
| Concept | Explanation |
|---------|-------------|
| **Tryout** | One exam imported from JSON |
| **Item** | One question (original or variant) |
| **Slot** | Question number position (1, 2, 3...) |
| **Variant** | A different question in the same slot |
| **Calibrated** | The item already has `irt_b` (ready for adaptive selection) |
| **Theta** | Estimate of the student's ability on the IRT scale |
---
## 10. FAQ
### Q: Why is the default scoring_mode = "ctt"?
A: CTT is simpler and needs no IRT calibration. It suits the early phase, before enough data has been collected.
### Q: Why is the default selection_mode = "fixed"?
A: Fixed selection does not need calibrated items. It works right after import.
### Q: How do I switch to adaptive?
A:
1. Make sure the items are calibrated (`calibrated = True`)
2. Set `selection_mode = 'adaptive'` on the tryout
3. New students will then get adaptive selection
### Q: How much data does adaptive need?
A: The default is `min_calibration_sample = 100`, meaning at least 100 students must have answered before calibration can run.
### Q: Are CTT and Fixed the same thing?
A: No. They are orthogonal:
- **scoring_mode** = how the final score is calculated
- **selection_mode** = how the next item is chosen
### Q: Does this application create exams?
A: No. This application is a **question bank**. Exams are imported from JSON. The application "breeds" items by creating variants.
---
## 11. Code Reference
| File | Purpose |
|------|--------|
| `app/services/ctt_scoring.py` | CTT scoring calculations |
| `app/services/irt_calibration.py` | IRT calibration, theta estimation |
| `app/services/cat_selection.py` | Item selection (fixed/adaptive/hybrid) |
| `app/services/ai_generation.py` | OpenRouter AI integration |
| `app/services/excel_import.py` | Excel import/export |
| `app/routers/sessions.py` | Session management API |
| `app/models/tryout.py` | Tryout model definition |
| `app/models/item.py` | Item model definition |
| `app/models/session.py` | Session model definition |
---
*Document version: 1.0*
*Last updated: 2026-06-15*