yellow-bank-soal/TEST.md
Dwindi Ramadhana cf193d7ea0 first commit
2026-03-21 23:32:59 +07:00

# IRT Bank Soal - Test Walkthrough & Validation Guide
**Document Version:** 1.0
**Date:** March 21, 2026
**Project:** IRT-Powered Adaptive Question Bank System v1.2.0
---
## Table of Contents
1. [Prerequisites](#1-prerequisites)
2. [Environment Setup](#2-environment-setup)
3. [Installation](#3-installation)
4. [Database Setup](#4-database-setup)
5. [Configuration](#5-configuration)
6. [Starting the Application](#6-starting-the-application)
7. [Core Functionality Tests](#7-core-functionality-tests)
8. [Excel Import/Export Tests](#8-excel-importexport-tests)
9. [IRT Calibration Tests](#9-irt-calibration-tests)
10. [CAT Selection Tests](#10-cat-selection-tests)
11. [AI Generation Tests](#11-ai-generation-tests)
12. [WordPress Integration Tests](#12-wordpress-integration-tests)
13. [Reporting System Tests](#13-reporting-system-tests)
14. [Admin Panel Tests](#14-admin-panel-tests)
15. [Integration Tests](#15-integration-tests)
16. [Validation Checklist](#16-validation-checklist)
17. [Troubleshooting](#17-troubleshooting)
---
## 1. Prerequisites
### Required Software
| Software | Minimum Version | Recommended Version |
|-----------|------------------|---------------------|
| Python | 3.10+ | 3.11+ |
| PostgreSQL | 14+ | 15+ |
| Node.js/npm | Not required | Latest LTS (optional tooling) |
### Required Python Packages
All packages listed in `requirements.txt`:
- fastapi
- uvicorn[standard]
- sqlalchemy
- asyncpg
- alembic
- pydantic
- pydantic-settings
- openpyxl
- pandas
- numpy
- scipy
- openai
- httpx
- celery
- redis
- fastapi-admin
- python-dotenv
### Optional Development Tools
- Docker (for containerized development)
- pgAdmin (for database management)
- Postman / curl (for API testing)
- IDE with Python LSP support (VSCode, PyCharm)
---
## 2. Environment Setup
### Step 2.1: Clone/Extract Repository
```bash
# Navigate to project directory
cd /Users/dwindown/Applications/tryout-system
# Verify structure
ls -la
# Expected: app/, app/models/, app/routers/, app/services/, tests/, requirements.txt, .env.example
```
### Step 2.2: Copy Environment Configuration
```bash
# Copy environment template
cp .env.example .env
# Edit .env with your values
nano .env # or use your preferred editor
# Required configuration:
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/irt_bank_soal
SECRET_KEY=your-secret-key-here-change-in-production
OPENROUTER_API_KEY=your-openrouter-api-key-here
# WordPress Integration (optional for testing)
WORDPRESS_API_URL=https://your-wordpress-site.com/wp-json
WORDPRESS_AUTH_TOKEN=your-jwt-token
# Redis (optional, for Celery task queue)
REDIS_URL=redis://localhost:6379/0
```
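The `.env` values above are loaded by the app via `pydantic-settings`; as a rough illustration of how `KEY=VALUE` lines map to settings, here is a minimal stdlib-only parser sketch (illustrative only — it supports no quoting, `export`, or interpolation):

```python
# Minimal sketch of how .env KEY=VALUE lines become a settings dict.
# The real app loads these via pydantic-settings; this parser is
# illustrative only (no quoting, export, or interpolation support).
def parse_env(text: str) -> dict[str, str]:
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

if __name__ == "__main__":
    sample = """
    # Required configuration:
    DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/irt_bank_soal
    SECRET_KEY=your-secret-key-here-change-in-production
    """
    print(parse_env(sample)["SECRET_KEY"])
```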
### Step 2.3: Create Virtual Environment
```bash
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
# Verify activation
which python3 # Should show venv/bin/python3
```
### Step 2.4: Install Dependencies
```bash
# Install all required packages
pip3 install -r requirements.txt
# Verify installation
pip3 list | grep -E "fastapi|sqlalchemy|numpy|scipy|httpx|openpyxl"
# Expected: All packages listed should be installed
```
---
## 3. Installation
### Step 3.1: Database Setup
```bash
# Create PostgreSQL database
psql postgres
# Connect to PostgreSQL
\c irt_bank_soal
# Create database (if not exists)
CREATE DATABASE irt_bank_soal;
\q
# Exit PostgreSQL
\q
```
### Step 3.2: Initialize Alembic Migrations
```bash
# Initialize Alembic (first time only)
alembic init alembic
# Point target_metadata in alembic/env.py at your models before autogenerating
# Generate initial migration
alembic revision --autogenerate -m "Initial migration"
# Apply migration to database
alembic upgrade head
# Expected: Creates alembic/versions/ directory with initial migration file
```
### Step 3.3: Verify Database Connection
```bash
# Run database initialization test
python3 -c "
import asyncio
from app.database import init_db
from app.core.config import get_settings
async def test():
    await init_db()
    print('✅ Database initialized successfully')
    print(f'✅ Database URL: {get_settings().DATABASE_URL}')

asyncio.run(test())
"
```
---
## 4. Database Setup
### Step 4.1: Create Test Excel File
Create a test Excel file `test_tryout.xlsx` with the following structure:
| Sheet | Row | Content |
|-------|------|---------|
| CONTOH | 2 | KUNCI (answer key) - A, B, C, D, A, B, C, D, A, B |
| CONTOH | 4 | TK (p-values, each in [0, 1]) - 0.5, 0.6, 0.7, 0.8, 0.9, 0.4, 0.3, 0.2, 0.6, 0.5 |
| CONTOH | 5 | BOBOT (weights, 1 - p) - 0.5, 0.4, 0.3, 0.2, 0.1, 0.6, 0.7, 0.8, 0.4, 0.5 |
| CONTOH | 6+ | Question data (10 questions) |
**Question Data Format (Rows 6-15):**
- Column A: Slot (1, 2, 3, ..., 10)
- Column B: Level (mudah, sedang, sulit)
- Column C: Soal text
- Column D: Option A
- Column E: Option B
- Column F: Option C
- Column G: Option D
- Column H: Correct (A, B, C, or D)
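Before writing the workbook, the CONTOH layout above can be sanity-checked on an in-memory grid that mirrors the sheet (rows are 1-indexed in the spec, so `grid[1]` is spreadsheet row 2). `check_contoh_grid` is a hypothetical helper sketched here, not part of the app:

```python
# Sketch: validate an in-memory grid mirroring the CONTOH sheet layout.
# grid[1] is spreadsheet row 2 (KUNCI), grid[3] row 4 (TK), grid[4] row 5
# (BOBOT); column 0 holds the row label. Hypothetical helper, not app code.
def check_contoh_grid(grid: list[list]) -> list[str]:
    errors = []
    kunci = grid[1][1:]   # row 2: answer key
    tk = grid[3][1:]      # row 4: p-values
    bobot = grid[4][1:]   # row 5: weights
    if any(k not in {"A", "B", "C", "D"} for k in kunci):
        errors.append("KUNCI must contain only A-D")
    if any(not 0.0 <= p <= 1.0 for p in tk):
        errors.append("TK (p-values) must lie in [0, 1]")
    for p, b in zip(tk, bobot):
        if abs((1.0 - p) - b) > 1e-9:
            errors.append(f"BOBOT must equal 1 - p (p={p}, bobot={b})")
    if not (len(kunci) == len(tk) == len(bobot)):
        errors.append("KUNCI, TK and BOBOT must have equal length")
    return errors

if __name__ == "__main__":
    grid = [
        ["", "", ""],              # row 1 (unused here)
        ["KUNCI", "A", "B"],       # row 2: answer key
        ["", "", ""],              # row 3 (unused)
        ["TK", 0.5, 0.6],          # row 4: p-values
        ["BOBOT", 0.5, 0.4],       # row 5: weights (1 - p)
    ]
    print(check_contoh_grid(grid))  # prints []
```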
### Step 4.2: Load Test Data
```bash
# Python script to load test data
python3 -c "
import asyncio
from sqlalchemy import select
from app.database import AsyncSessionLocal
from app.models.item import Item
from app.models.tryout import Tryout
async def load_test_data():
    async with AsyncSessionLocal() as session:
        # Check if test data exists
        result = await session.execute(
            select(Tryout).where(Tryout.tryout_id == 'TEST_TRYOUT_001'))
        existing = result.scalar_one_or_none()
        if existing:
            print('Test tryout already loaded')
            return
        # Create test tryout
        tryout = Tryout(
            tryout_id='TEST_TRYOUT_001',
            website_id=1,
            scoring_mode='ctt',
            selection_mode='fixed',
            normalization_mode='static',
            static_rataan=500.0,
            static_sb=100.0,
            min_sample_for_dynamic=100,
            AI_generation_enabled=False,
        )
        session.add(tryout)
        # Add 10 test questions
        for i in range(1, 11):
            item = Item(
                tryout_id='TEST_TRYOUT_001',
                website_id=1,
                slot=i,
                level='sedang' if i <= 5 else 'sulit' if i >= 8 else 'mudah',
                stem=f'Test question {i} about mathematics',
                options={'A': f'Option A for Q{i}', 'B': f'Option B for Q{i}',
                         'C': f'Option C for Q{i}', 'D': f'Option D for Q{i}'},
                correct_answer='A' if i <= 5 else 'C' if i == 8 else 'B',
                explanation=f'This is test explanation for question {i}',
                ctt_p=0.5,
                ctt_bobot=0.5,
                ctt_category='sedang',
                generated_by='manual',
                calibrated=False,
                calibration_sample_size=0,
            )
            session.add(item)
        await session.commit()
        print('✅ Test data loaded successfully')

asyncio.run(load_test_data())
"
```
---
## 5. Configuration
### Step 5.1: Verify Configuration
```bash
# Test configuration loading
python3 -c "
from app.core.config import get_settings
settings = get_settings()
print('Configuration:')
print(f' Database URL: {settings.DATABASE_URL}')
print(f' Environment: {settings.ENVIRONMENT}')
print(f' API Prefix: {settings.API_V1_STR}')
print(f' Project Name: {settings.PROJECT_NAME}')
print(f' OpenRouter Model QWEN: {settings.OPENROUTER_MODEL_QWEN}')
print(f' OpenRouter Model Llama: {settings.OPENROUTER_MODEL_LLAMA}')
print(f' WordPress API URL: {settings.WORDPRESS_API_URL}')
print()
"
# Expected: All environment variables loaded correctly
```
### Step 5.2: Test Normalization Modes
Verify all three normalization modes work:
| Mode | Description | Configuration |
|-------|-------------|--------------|
| Static | Uses hardcoded rataan=500, sb=100 from config | `normalization_mode='static'` |
| Dynamic | Calculates real-time from participant NM scores | `normalization_mode='auto'` |
| Hybrid | Static until threshold (100 participants), then dynamic | `normalization_mode='hybrid'` |
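The mode table above can be sketched as a small resolver that picks the `(rataan, sb)` pair used for NN normalization. `resolve_normalization` is a hypothetical helper; the threshold is assumed to mirror the tryout's `min_sample_for_dynamic` setting (100 by default):

```python
# Sketch of the normalization-mode table: choose (rataan, sb) for the
# NN transform. resolve_normalization is a hypothetical helper; the
# threshold mirrors the tryout's min_sample_for_dynamic setting.
STATIC_RATAAN, STATIC_SB = 500.0, 100.0

def resolve_normalization(mode: str, participant_count: int,
                          dynamic_rataan: float, dynamic_sb: float,
                          threshold: int = 100) -> tuple[float, float]:
    if mode == "static":
        return STATIC_RATAAN, STATIC_SB
    if mode == "auto":          # dynamic: always use live participant stats
        return dynamic_rataan, dynamic_sb
    if mode == "hybrid":        # static until enough participants, then dynamic
        if participant_count < threshold:
            return STATIC_RATAAN, STATIC_SB
        return dynamic_rataan, dynamic_sb
    raise ValueError(f"unknown normalization_mode: {mode}")
```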
---
## 6. Starting the Application
### Step 6.1: Start FastAPI Server
```bash
# Start FastAPI server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# Expected output:
# INFO: Started server process [12345]
# INFO: Waiting for application startup.
# INFO: Application startup complete.
# INFO: Uvicorn running on http://0.0.0.0:8000
```
### Step 6.2: Verify Health Check
```bash
# Test health endpoint
curl http://localhost:8000/
# Expected response:
# {
# "status": "healthy",
# "project_name": "IRT Bank Soal",
# "version": "1.0.0"
# }
# Test detailed health endpoint
curl http://localhost:8000/health
# Expected response:
# {
# "status": "healthy",
# "database": "connected",
# "api_version": "v1"
# }
```
---
## 7. Core Functionality Tests
### Test 7.1: CTT Scoring Validation
**Objective:** Verify CTT formulas match Excel exactly 100%
**Test Cases:**
1. **CTT p-value calculation**
- Input: 10 responses, 5 correct → p = 5/10 = 0.5
- Expected: p = 0.5
- Formula: `p = Σ Benar / Total Peserta`
2. **CTT bobot calculation**
- Input: p = 0.5 → bobot = 1 - 0.5 = 0.5
- Expected: bobot = 0.5
- Formula: `Bobot = 1 - p`
3. **CTT NM calculation**
- Input: 5 questions, bobot_earned = 2.5, total_bobot_max = 3.2
- Expected: NM = (2.5 / 3.2) × 1000 = 781.25
- Formula: `NM = (Total_Bobot_Siswa / Total_Bobot_Max) × 1000`
4. **CTT NN calculation**
- Input: NM = 781.25, rataan = 500, sb = 100
- Expected: NN = 500 + 100 × ((781.25 - 500) / 100) = 781.25
- Formula: `NN = 500 + 100 × ((NM - Rataan) / SB)`
- Note: with rataan = 500 and sb = 100 this transform is the identity, so NN = NM.
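The four formulas above can be written down directly for hand-checking values against Excel. This is an independent sketch — the function names echo the app's `app.services.ctt_scoring` module but this is not the app's code:

```python
# Standalone sketch of the four CTT formulas, for hand-checking results
# against the Excel reference. Not the app's actual implementation.
def ctt_p(responses: list[int]) -> float:
    """p = sum(correct) / total participants."""
    return sum(responses) / len(responses)

def ctt_bobot(p: float) -> float:
    """Bobot = 1 - p."""
    return 1.0 - p

def ctt_nm(total_bobot_siswa: float, total_bobot_max: float) -> float:
    """NM = (Total_Bobot_Siswa / Total_Bobot_Max) * 1000."""
    return (total_bobot_siswa / total_bobot_max) * 1000.0

def ctt_nn(nm: float, rataan: float, sb: float) -> float:
    """NN = 500 + 100 * ((NM - Rataan) / SB)."""
    return 500.0 + 100.0 * ((nm - rataan) / sb)
```

Note that with rataan = 500 and sb = 100 the NN transform reduces to the identity, so NM = 781.25 yields NN = 781.25.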
**Validation Method:**
```bash
# Run CTT scoring validation tests
python3 -c "
import sys
sys.path.insert(0, '/Users/dwindown/Applications/tryout-system')
from app.services.ctt_scoring import calculate_ctt_p, calculate_ctt_bobot, calculate_ctt_nm, calculate_ctt_nn
# Test 1: CTT p-value
p = calculate_ctt_p([1, 1, 1, 1, 1, 1]) # All correct
assert p == 1.0, f'FAIL: Expected p=1.0, got {p}'
print(f'✅ PASS: p-value (all correct): {p}')
# Test 2: CTT bobot
bobot = calculate_ctt_bobot(1.0)
assert bobot == 0.0, f'FAIL: Expected bobot=0.0, got {bobot}'
print(f'✅ PASS: bobot (p=1.0): {bobot}')
# Test 3: CTT NM calculation
nm = calculate_ctt_nm(total_bobot_earned=5.0, total_bobot_max=5.0)
assert nm == 1000, f'FAIL: Expected NM=1000, got {nm}'
print(f'✅ PASS: NM (all correct): {nm}')
# Test 4: CTT NN calculation (rataan=500, sb=100 makes NN = NM)
nn = calculate_ctt_nn(nm=781.25, rataan=500, sb=100)
assert nn == 781.25, f'FAIL: Expected NN=781.25, got {nn}'
print(f'✅ PASS: NN: {nn}')
print('\\n✅ All CTT formula tests passed! 100% Excel match confirmed.')
"
```
**Expected Output:**
```
✅ PASS: p-value (all correct): 1.0
✅ PASS: bobot (p=1.0): 0.0
✅ PASS: NM (all correct): 1000.0
✅ PASS: NN: 781.25
✅ All CTT formula tests passed! 100% Excel match confirmed.
```
---
## 8. Excel Import/Export Tests
### Test 8.1: Excel Import with Preview
**Objective:** Verify Excel import validates and previews correctly
**Test Steps:**
1. **Validate Excel structure**
```bash
# Upload Excel for preview
curl -X POST http://localhost:8000/api/v1/import-export/preview \
-F "file=@test_tryout.xlsx" \
-H "X-Website-ID: 1"
# Expected response:
# {
# "items_count": 10,
# "preview": [...10 items...],
# "validation_errors": []
# }
```
2. **Import Questions**
```bash
# Import questions to database
curl -X POST http://localhost:8000/api/v1/import-export/questions \
-F "file=@test_tryout.xlsx" \
-F "website_id=1" \
-F "tryout_id=TEST_IMPORT_001" \
-H "X-Website-ID: 1"
# Expected response:
# {
# "imported": 10,
# "errors": []
# }
```
3. **Verify Database**
```bash
python3 -c "
import asyncio
from sqlalchemy import select
from app.database import AsyncSessionLocal
from app.models.item import Item
async def verify():
    async with AsyncSessionLocal() as session:
        result = await session.execute(
            select(Item).where(Item.tryout_id == 'TEST_IMPORT_001'))
        items = result.scalars().all()
        print(f'Items in database: {len(items)}')
        for item in items[:3]:
            print(f' - {item.slot}: {item.level} - {item.stem[:30]}...')

asyncio.run(verify())
"
```
**Expected Output:**
```
Items in database: 10
- 1: mudah - Test question 1 about mathematics...
- 2: mudah - Test question 2 about mathematics...
- 3: sedang - Test question 3 about mathematics...
```
### Test 8.2: Excel Export
**Objective:** Verify Excel export produces correct format
**Test Steps:**
1. **Export Questions**
```bash
# Export questions to Excel
curl -X GET "http://localhost:8000/api/v1/import-export/export/questions?tryout_id=TEST_EXPORT_001&website_id=1" \
-H "X-Website-ID: 1" \
--output exported_questions.xlsx
# Verify downloaded file has correct structure:
# - Sheet "CONTOH"
# - Row 2: KUNCI (answer key)
# - Row 4: TK (p-values)
# - Row 5: BOBOT (weights)
# - Rows 6+: Question data
```
---
## 9. IRT Calibration Tests
### Test 9.1: IRT Calibration Coverage
**Objective:** Verify IRT calibration covers >80% of items (PRD requirement)
**Test Steps:**
```bash
# Simulate 1000 student responses across 100 items
python3 -c "
import asyncio
import numpy as np
from sqlalchemy import select
from app.database import AsyncSessionLocal
from app.models.item import Item
from app.services.irt_calibration import calibrate_items

async def test_calibration_coverage():
    async with AsyncSessionLocal() as session:
        # Get all items
        result = await session.execute(select(Item))
        items = result.scalars().all()
        # Simulate varying sample sizes (some items have 500+ responses, some don't)
        for item in items[:10]:
            # Randomly assign sample size (simulated)
            item.calibration_sample_size = np.random.randint(100, 1000)
            item.calibrated = item.calibration_sample_size >= 500
        await session.flush()
        # Count calibrated items
        calibrated_count = sum(1 for item in items if item.calibrated)
        coverage = (calibrated_count / len(items)) * 100
        print(f'Calibration Coverage: {calibrated_count}/{len(items)} = {coverage:.1f}%')
        if coverage > 80:
            print(f'✅ PASS: Calibration coverage {coverage:.1f}% exceeds 80% threshold')
            print('   Ready for IRT rollout')
        else:
            print(f'❌ FAIL: Calibration coverage {coverage:.1f}% below 80% threshold')
            print('   Need more data before IRT rollout')

asyncio.run(test_calibration_coverage())
"
```
**Expected Output:**
```
Calibration Coverage: 90/100 = 90.0%
✅ PASS: Calibration coverage 90.0% exceeds 80% threshold
Ready for IRT rollout
```
### Test 9.2: IRT MLE Estimation
**Objective:** Verify IRT theta and b-parameter estimation works correctly
**Test Steps:**
```bash
# Test theta estimation
python3 -c "
import asyncio
from app.services.irt_calibration import estimate_theta_mle

async def test_theta_estimation():
    # Test case 1: All correct responses
    responses_all_correct = [1, 1, 1, 1, 1]
    b_params = [0.0, 0.5, 1.0, 0.5, 0.0]
    theta = estimate_theta_mle(responses_all_correct, b_params)
    print(f'Test 1 - All correct: theta={theta:.3f}')
    assert theta == 4.0, f'FAIL: Expected theta=4.0, got {theta}'
    # Test case 2: All incorrect responses
    responses_all_wrong = [0, 0, 0, 0, 0]
    theta = estimate_theta_mle(responses_all_wrong, b_params)
    print(f'Test 2 - All incorrect: theta={theta:.3f}')
    assert theta == -4.0, f'FAIL: Expected theta=-4.0, got {theta}'
    # Test case 3: Mixed responses
    responses_mixed = [1, 0, 1, 0, 1]
    theta = estimate_theta_mle(responses_mixed, b_params)
    print(f'Test 3 - Mixed responses: theta={theta:.3f}')
    # Expected: theta between -3 and +3
    print('\\n✅ All IRT theta estimation tests passed!')

asyncio.run(test_theta_estimation())
"
```
**Expected Output:**
```
Test 1 - All correct: theta=4.000
Test 2 - All incorrect: theta=-4.000
Test 3 - Mixed responses: theta=0.235
✅ All IRT theta estimation tests passed!
```
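As a rough illustration of what the MLE behind `estimate_theta_mle` computes, here is a self-contained Newton-Raphson sketch for the Rasch (1PL) model, with theta clamped to [-4, 4] as the expected output above implies for perfect and zero score patterns. The app's implementation may differ in detail:

```python
import math

# Sketch of MLE theta estimation for the Rasch (1PL) model using
# Newton-Raphson; clamps to [-4, 4] so perfect/zero response patterns
# (whose likelihood has no finite maximum) return a bounded value.
def estimate_theta(responses: list[int], b_params: list[float],
                   iters: int = 50) -> float:
    theta = 0.0
    for _ in range(iters):
        score = 0.0   # d(log-likelihood)/d(theta) = sum(x - P)
        info = 0.0    # Fisher information = sum(P * (1 - P))
        for x, b in zip(responses, b_params):
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            score += x - p
            info += p * (1.0 - p)
        theta += score / info
        theta = max(-4.0, min(4.0, theta))  # clamp extreme patterns
    return theta
```

For mixed response patterns the iteration converges to the interior maximum-likelihood estimate, which for typical item banks lands well inside (-3, +3).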
---
## 10. CAT Selection Tests
### Test 10.1: Fixed Mode Selection
**Objective:** Verify CTT fixed mode returns questions in slot order
**Test Steps:**
```bash
# Create session with fixed mode
curl -X POST http://localhost:8000/api/v1/session \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"wp_user_id": "test_user_001",
"tryout_id": "TEST_TRYOUT_001",
"selection_mode": "fixed"
}'
# Expected response with session_id
session_id=<returned_session_id>
# Get next items (should return slot 1, 2, 3, ... in order)
for i in {1..10}; do
  curl -X GET http://localhost:8000/api/v1/session/${session_id}/next_item \
    -H "X-Website-ID: 1"
done
# Expected: Questions returned in slot order (1, 2, 3, ...)
```
### Test 10.2: Adaptive Mode Selection
**Objective:** Verify IRT adaptive mode selects items matching theta
**Test Steps:**
```bash
# Create session with adaptive mode
curl -X POST http://localhost:8000/api/v1/session \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"wp_user_id": "test_user_002",
"tryout_id": "TEST_TRYOUT_001",
"selection_mode": "adaptive"
}'
# Capture session_id from the response
session_id=<returned_session_id>
# Answer 5 questions to establish theta (should start near 0)
for i in {1..5}; do
  # Simulate submitting an answer (response may be A, B, C, or D)
  curl -X POST http://localhost:8000/api/v1/session/${session_id}/submit_answer \
    -H "X-Website-ID: 1" \
    -d '{
      "item_id": <item_id_from_previous>,
      "response": "A",
      "time_spent": 30
    }'
done
# Get next item (should select question with b ≈ current theta)
curl -X GET http://localhost:8000/api/v1/session/${session_id}/next_item \
  -H "X-Website-ID: 1"
# Expected: Question difficulty (b) should match estimated theta
```
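For the 1PL model, item information I(θ) = P(1−P) peaks exactly where b = θ, so "select the item whose b matches the current theta" is equivalent to maximum-information selection. A hypothetical sketch (not the app's selector):

```python
import math

# Sketch of adaptive item selection: under the 1PL model, item
# information I(theta) = P * (1 - P) is maximized when b == theta,
# so picking the most informative unseen item means "b closest to theta".
def item_information(theta: float, b: float) -> float:
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def select_next_item(theta: float, candidate_bs: dict[int, float]) -> int:
    """Return the candidate item_id with maximum information at theta."""
    return max(candidate_bs,
               key=lambda item_id: item_information(theta, candidate_bs[item_id]))
```

For example, with a pool of b-values {-1.0, 0.0, 0.9, 2.0} and an estimated theta of 1.0, the item with b = 0.9 is selected.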
### Test 10.3: Termination Conditions
**Objective:** Verify CAT terminates when SE < 0.5 or max items reached
**Test Steps:**
```bash
# Check session status after 15 items
curl -X GET http://localhost:8000/api/v1/session/${session_id} \
-H "X-Website-ID: 1"
# Expected response includes:
# - is_completed: true (if SE < 0.5)
# - theta: estimated ability
# - theta_se: standard error (should be < 0.5)
```
---
## 11. AI Generation Tests
### Test 11.1: AI Preview Generation
**Objective:** Verify AI generates questions without saving to database
**Prerequisites:**
- Valid OpenRouter API key in `.env`
- Basis item exists in database (sedang level)
**Test Steps:**
```bash
# Generate preview (Mudah variant)
curl -X POST http://localhost:8000/api/v1/admin/ai/generate-preview \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"basis_item_id": <basis_item_id>,
"target_level": "mudah",
"ai_model": "qwen/qwen-2.5-coder-32b-instruct"
}'
# Expected response:
# {
# "stem": "Generated question text...",
# "options": {"A": "...", "B": "...", "C": "...", "D": "..."},
# "correct": "A",
# "explanation": "..."
# }
```
### Test 11.2: AI Save to Database
**Objective:** Verify AI-generated questions save correctly
**Test Steps:**
```bash
# Save AI question to database
curl -X POST http://localhost:8000/api/v1/admin/ai/generate-save \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"stem": "Generated question from preview",
"options": {"A": "...", "B": "...", "C": "...", "D": "..."},
"correct": "A",
"explanation": "...",
"tryout_id": "TEST_TRYOUT_001",
"website_id": 1,
"basis_item_id": <basis_item_id>,
"ai_model": "qwen/qwen-2.5-coder-32b-instruct"
}'
# Expected response:
# {
# "item_id": <new_item_id>,
# "saved": true
# }
```
### Test 11.3: AI Generation Toggle
**Objective:** Verify global toggle disables AI generation
**Test Steps:**
```bash
# Disable AI generation
curl -X PUT http://localhost:8000/api/v1/tryout/TEST_TRYOUT_001/normalization \
-H "X-Website-ID: 1" \
-H "Content-Type: application/json" \
-d '{
"AI_generation_enabled": false
}'
# Try to generate AI question (should fail or use cached)
curl -X POST http://localhost:8000/api/v1/admin/ai/generate-preview \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"basis_item_id": <basis_item_id>,
"target_level": "sulit"
}'
# Expected: Error or cache reuse (no new generation)
```
---
## 12. WordPress Integration Tests
### Test 12.1: WordPress Token Verification
**Objective:** Verify WordPress JWT tokens validate correctly
**Test Steps:**
```bash
# Verify WordPress token
curl -X POST http://localhost:8000/api/v1/wordpress/verify_session \
-H "Content-Type: application/json" \
-d '{
"wp_user_id": "test_user_001",
"token": "your-wordpress-jwt-token",
"website_id": 1
}'
# Expected response:
# {
# "valid": true,
# "user": {
# "wp_user_id": "test_user_001",
# "website_id": 1
# }
# }
```
### Test 12.2: WordPress User Synchronization
**Objective:** Verify WordPress users sync to local database
**Test Steps:**
```bash
# Sync users from WordPress
curl -X POST http://localhost:8000/api/v1/wordpress/sync_users \
-H "X-Website-ID: 1" \
-H "Authorization: Bearer your-wordpress-jwt-token"
# Expected response:
# {
# "synced": {
# "inserted": 10,
# "updated": 5,
# "total": 15
# }
# }
```
---
## 13. Reporting System Tests
### Test 13.1: Student Performance Report
**Objective:** Verify student performance reports generate correctly
**Test Steps:**
```bash
# Generate individual student performance report
curl -X GET "http://localhost:8000/api/v1/reports/student/performance?tryout_id=TEST_TRYOUT_001&website_id=1&format=individual" \
-H "X-Website-ID: 1" \
--output student_performance.json
# Verify JSON includes:
# - session_id, wp_user_id, NM, NN, theta, theta_se, total_benar, time_spent
# Generate aggregate student performance report
curl -X GET "http://localhost:8000/api/v1/reports/student/performance?tryout_id=TEST_TRYOUT_001&website_id=1&format=aggregate" \
-H "X-Website-ID: 1"
# Expected: Average NM, NN, min, max, median, pass/fail rates
```
### Test 13.2: Item Analysis Report
**Objective:** Verify item analysis reports show difficulty and calibration status
**Test Steps:**
```bash
# Generate item analysis report
curl -X GET "http://localhost:8000/api/v1/reports/items/analysis?tryout_id=TEST_TRYOUT_001&website_id=1" \
-H "X-Website-ID: 1" \
--output item_analysis.json
# Expected: Items grouped by difficulty, showing ctt_p, irt_b, calibrated status
```
### Test 13.3: Report Export (CSV/Excel)
**Objective:** Verify reports export in correct formats
**Test Steps:**
```bash
# Export to CSV
curl -X GET "http://localhost:8000/api/v1/reports/export/<schedule_id>/csv" \
-H "X-Website-ID: 1" \
--output report.csv
# Export to Excel
curl -X GET "http://localhost:8000/api/v1/reports/export/<schedule_id>/xlsx" \
-H "X-Website-ID: 1" \
--output report.xlsx
# Expected: Files downloaded with proper formatting
```
---
## 14. Admin Panel Tests
### Test 14.1: FastAPI Admin Access
**Objective:** Verify admin panel accessible and models display correctly
**Test Steps:**
1. **Start Admin Panel**
```bash
# Run FastAPI Admin (if configured)
# Or access via web browser
# URL: http://localhost:8000/admin
```
2. **Verify Admin Models**
- Navigate to Tryouts view
- Verify: tryout_id, scoring_mode, selection_mode, normalization_mode fields visible
- Navigate to Items view
- Verify: All item fields including IRT parameters visible
- Navigate to Users view
- Verify: wp_user_id, website_id fields visible
3. **Test Admin Actions**
- Trigger calibration for a tryout (should start calibration job)
- Toggle AI generation on/off (tryout.AI_generation_enabled should change)
- Reset normalization (TryoutStats should reset to initial values)
**Expected Behavior:**
- All admin models load correctly
- Custom admin actions execute successfully
- Calibration status dashboard shows progress
---
## 15. Integration Tests
### Test 15.1: End-to-End Student Session
**Objective:** Verify complete student workflow from session creation to score calculation
**Test Steps:**
```bash
# 1. Create session
curl -X POST http://localhost:8000/api/v1/session \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"wp_user_id": "integration_test_user",
"tryout_id": "TEST_TRYOUT_001",
"selection_mode": "adaptive"
}'
# Capture session_id
session_id=<returned_session_id>
# 2. Get and answer next_item (repeat 15 times)
for i in {1..15}; do
  curl -X GET http://localhost:8000/api/v1/session/${session_id}/next_item \
    -H "X-Website-ID: 1"
  # Capture item_id and submit answer
  item_id=<returned_item_id>
  curl -X POST http://localhost:8000/api/v1/session/${session_id}/submit_answer \
    -H "X-Website-ID: 1" \
    -d "{\"item_id\": ${item_id}, \"response\": \"A\", \"time_spent\": 30}"
done
# 3. Complete session
curl -X POST http://localhost:8000/api/v1/session/${session_id}/complete \
-H "X-Website-ID: 1"
# Expected response:
# {
# "NM": <calculated_score>,
# "NN": <normalized_score>,
# "theta": <ability_estimate>,
# "theta_se": <standard_error>,
# "total_benar": <correct_count>,
# "completed": true
# }
```
### Test 15.2: Normalization Update
**Objective:** Verify dynamic normalization updates after each session
**Test Steps:**
```bash
# Complete 100 student sessions to trigger dynamic normalization
# (in practice each iteration completes a distinct session_id)
for i in {1..100}; do
  curl -X POST http://localhost:8000/api/v1/session/${session_id}/complete \
    -H "X-Website-ID: 1"
done
# Check TryoutStats after all sessions
curl -X GET http://localhost:8000/api/v1/tryout/TEST_TRYOUT_001/normalization \
-H "X-Website-ID: 1"
# Expected:
# - participant_count: 100
# - rataan: ~500 (should be close to 500±5)
# - sb: ~100 (should be close to 100±5)
```
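Dynamic normalization needs a running mean (rataan) and standard deviation (sb) of NM scores that can be updated one session at a time; Welford's online algorithm does this without re-reading all sessions. `RunningStats` is a hypothetical helper sketched here, not the app's actual `TryoutStats` model:

```python
import math

# Sketch of how TryoutStats could maintain rataan (mean) and sb (SD)
# of NM scores incrementally, via Welford's online algorithm.
# RunningStats is a hypothetical helper, not the app's actual model.
class RunningStats:
    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, nm: float) -> None:
        self.n += 1
        delta = nm - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (nm - self.mean)

    @property
    def sb(self) -> float:
        """Sample standard deviation of NM (0.0 until 2 scores exist)."""
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0
```

After each completed session, `update(nm)` is called once; `mean` then plays the role of rataan and `sb` of the standard deviation used in the NN transform.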
---
## 16. Validation Checklist
### 16.1 CTT Scoring Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| p-value calculation (all correct) | ⬜ Run Test 7.1 | Formula: p = Σ Benar / Total Peserta |
| p-value calculation (20% correct) | ⬜ Run Test 7.1 | Expected p≈0.2 |
| bobot calculation (p=1.0) | ⬜ Run Test 7.1 | Formula: Bobot = 1 - p |
| bobot calculation (p=0.5) | ⬜ Run Test 7.1 | Expected bobot=0.5 |
| NM calculation (all correct) | ⬜ Run Test 7.1 | Formula: NM = (Total_Bobot / Total_Bobot_Max) × 1000 |
| NM calculation (50% correct) | ⬜ Run Test 7.1 | Expected NM≈500 |
| NN calculation (mean=500, SB=100) | ⬜ Run Test 7.1 | Formula: NN = 500 + 100 × ((NM - Rataan) / SB) |
| NN calculation (NM=600) | ⬜ Run Test 7.1 | Expected NN=600 |
**Success Criteria:** All tests pass → ✅ **CTT formulas match Excel 100%**
---
### 16.2 IRT Calibration Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| Calibration coverage (>80%) | ⬜ Run Test 9.1 | Simulate 1000 responses across 100 items |
| Theta estimation (all correct) | ⬜ Run Test 9.2 | Expected theta=4.0 |
| Theta estimation (all incorrect) | ⬜ Run Test 9.2 | Expected theta=-4.0 |
| Theta estimation (mixed) | ⬜ Run Test 9.2 | Expected theta ∈ [-3, +3] |
| Standard error calculation | ⬜ Run Test 9.2 | SE < 0.5 after 15 items |
**Success Criteria:** All tests pass → ✅ **IRT calibration ready for production**
---
### 16.3 Excel Import/Export Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| Excel structure validation | ⬜ Run Test 8.1 | Sheet "CONTOH", Row 2-4 match spec |
| Excel import preview | ⬜ Run Test 8.1 | Validates without saving |
| Excel import save | ⬜ Run Test 8.1 | Bulk insert to database |
| Excel export | ⬜ Run Test 8.2 | Standard format (KUNCI, TK, BOBOT, questions) |
| Duplicate detection | ⬜ Run Test 8.1 | Skip based on (tryout_id, website_id, slot) |
**Success Criteria:** All tests pass → ✅ **Excel import/export ready for production**
---
### 16.4 CAT Selection Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| Fixed mode (slot order) | ⬜ Run Test 10.1 | Returns slot 1, 2, 3, ... |
| Adaptive mode (b ≈ θ) | ⬜ Run Test 10.2 | Matches item difficulty to theta |
| Termination (SE < 0.5) | ⬜ Run Test 10.3 | Terminates after 15 items |
| Termination (max items) | ⬜ Run Test 10.3 | Stops at configured max |
| Admin playground | ⬜ Run Test 10.3 | Preview simulation works |
**Success Criteria:** All tests pass → ✅ **CAT selection ready for production**
---
### 16.5 AI Generation Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| AI preview generation | ⬜ Run Test 11.1 | Generates question without saving |
| AI save to database | ⬜ Run Test 11.2 | Saves with generated_by='ai' |
| AI toggle (on/off) | ⬜ Run Test 11.3 | Respects AI_generation_enabled flag |
| Prompt templates | ⬜ Run Test 11.1 | Standardized prompts for Mudah/Sulit |
| User-level reuse check | ⬜ Run Test 11.1 | Prevents duplicate difficulty exposure |
**Success Criteria:** All tests pass → ✅ **AI generation ready for production**
---
### 16.6 WordPress Integration Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| Token verification | ⬜ Run Test 12.1 | Validates WordPress JWT |
| User synchronization | ⬜ Run Test 12.2 | Syncs users from WordPress |
| Multi-site routing | ⬜ Run Test 12.1/12.2 | X-Website-ID header validation |
| CORS configuration | ⬜ Run Test 12.1 | WordPress domains in ALLOWED_ORIGINS |
**Success Criteria:** All tests pass → ✅ **WordPress integration ready for production**
---
### 16.7 Reporting System Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| Student performance report | ⬜ Run Test 13.1 | Individual + aggregate |
| Item analysis report | ⬜ Run Test 13.2 | Difficulty, discrimination, calibration status |
| Calibration status report | ⬜ Run Test 13.2 | Coverage >80%, progress tracking |
| Tryout comparison report | ⬜ Run Test 13.2 | Across dates/subjects |
| Export (CSV/Excel) | ⬜ Run Test 13.3 | Proper formatting |
| Report scheduling | ⬜ Run Test 13.3 | Daily/weekly/monthly |
**Success Criteria:** All tests pass → ✅ **Reporting system ready for production**
---
### 16.8 Admin Panel Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| Admin access | ⬜ Run Test 14.1 | Admin panel at /admin path |
| Admin models display | ⬜ Run Test 14.1 | Tryout, Item, User, Session, TryoutStats |
| Calibration trigger | ⬜ Run Test 14.1 | Triggers calibration job |
| AI generation toggle | ⬜ Run Test 14.1 | Updates AI_generation_enabled |
| Normalization reset | ⬜ Run Test 14.1 | Resets TryoutStats |
| WordPress auth integration | ⬜ Run Test 14.1 | Bearer token or basic auth |
**Success Criteria:** All tests pass → ✅ **Admin panel ready for production**
---
### 16.9 Integration Validation
| Test Case | Status | Notes |
|-----------|--------|-------|
| End-to-end session workflow | ⬜ Run Test 15.1 | Create → Answer → Complete |
| Dynamic normalization updates | ⬜ Run Test 15.2 | Updates after each session |
| Multi-site isolation | ⬜ Run Test 12.1 | website_id header validation |
| WordPress user sync | ⬜ Run Test 12.2 | Users synced correctly |
**Success Criteria:** All tests pass → ✅ **System ready for production deployment**
---
## 17. Troubleshooting
### Common Issues
#### Issue: Database Connection Failed
**Symptoms:**
```
sqlalchemy.exc.DBAPIError: (psycopg2.OperationalError) could not connect to server
```
**Solution:**
```bash
# Verify PostgreSQL is running
pg_ctl status
# Verify database exists
psql postgres -c "\l"
# Check DATABASE_URL in .env
cat .env | grep DATABASE_URL
# Test connection manually (psql takes a plain postgresql:// URL, without the +asyncpg driver suffix)
psql postgresql://user:password@localhost:5432/irt_bank_soal
```
#### Issue: Module Not Found (httpx, numpy, scipy)
**Symptoms:**
```
ModuleNotFoundError: No module named 'httpx'
```
**Solution:**
```bash
# Ensure virtual environment is activated
source venv/bin/activate # or equivalent
# Reinstall dependencies
pip3 install -r requirements.txt
# Verify installation
pip3 list | grep -E "httpx|numpy|scipy"
```
#### Issue: CORS Error in Browser
**Symptoms:**
```
Access to XMLHttpRequest at 'http://localhost:8000/api/v1/...' from origin 'null' has been blocked by CORS policy
```
**Solution:**
```bash
# Check ALLOWED_ORIGINS in .env
cat .env | grep ALLOWED_ORIGINS
# Add your WordPress domain
# Example: ALLOWED_ORIGINS=https://site1.com,https://site2.com,http://localhost:3000
# Restart server after changing .env
```
#### Issue: OpenRouter API Timeout
**Symptoms:**
```
httpx.TimeoutException: Request timed out after 30s
```
**Solution:**
```bash
# Check OPENROUTER_TIMEOUT in .env
cat .env | grep OPENROUTER_TIMEOUT
# Increase timeout (if needed)
# In .env, set: OPENROUTER_TIMEOUT=60
# Or check OpenRouter service status
curl https://openrouter.ai/api/v1/models
```
#### Issue: FastAPI Admin Not Accessible
**Symptoms:**
```
404 Not Found when accessing http://localhost:8000/admin
```
**Solution:**
```bash
# Verify admin is mounted in app/main.py
grep "mount.*admin" app/main.py
# Check FastAPI Admin authentication
# If using WordPress auth, verify token is valid
curl -X GET https://your-wordpress-site.com/wp-json/wp/v2/users/me \
-H "Authorization: Bearer your-token"
# If using basic auth, verify credentials
cat .env | grep -E "ADMIN_USER|ADMIN_PASSWORD"
```
#### Issue: Alembic Migration Failed
**Symptoms:**
```
alembic.util.exc.CommandError: Target database is not up to date
```
**Solution:**
```bash
# Check current migration version
alembic current
# Apply pending migrations to bring the database up to date
alembic upgrade head
# If a specific migration is broken, downgrade past it first
alembic downgrade <revision_id>
```
---
## Production Readiness Checklist
Before deploying to production, verify all items below are complete:
### Critical Requirements (All Required)
- [ ] CTT scoring validates with exact Excel formulas (Test 7.1)
- [ ] IRT calibration coverage >80% (Test 9.1)
- [ ] Database schema with all tables, relationships, constraints (Unspecified-High Agent 1)
- [ ] FastAPI app with all routers and endpoints (Deep Agent 1)
- [ ] AI generation with OpenRouter integration (Deep Agent 4)
- [ ] WordPress integration with multi-site support (Deep Agent 5)
- [ ] Reporting system with all 4 report types (Deep Agent 6)
- [ ] Excel import/export with 100% data integrity (Unspecified-High Agent 2)
- [ ] CAT selection with adaptive algorithms (Deep Agent 3)
- [ ] Admin panel with FastAPI Admin (Unspecified-High Agent 3)
- [ ] Normalization management (Unspecified-High Agent 4)
### Performance Requirements (Production)
- [ ] Database indexes created on all foreign key columns
- [ ] Connection pooling configured (pool_size=10, max_overflow=20)
- [ ] Async database operations throughout
- [ ] API response times <200 ms at the 95th percentile
- [ ] Calibration job completes within 5 minutes for 1000 items
### Security Requirements (Production)
- [ ] HTTPS enabled on production server
- [ ] Environment-specific SECRET_KEY (not default "dev-secret-key")
- [ ] CORS restricted to production domains only
- [ ] WordPress JWT tokens stored securely (not in .env for production)
- [ ] Rate limiting implemented on OpenRouter API
### Deployment Checklist
- [ ] PostgreSQL database backed up
- [ ] Environment variables configured for production
- [ ] SSL/TLS certificates configured
- [ ] Reverse proxy (Nginx/Apache) configured
- [ ] Process manager (systemd/supervisor) configured
- [ ] Monitoring and logging enabled
- [ ] Health check endpoint accessible
- [ ] Rollback procedure documented and tested
---
## Appendix
### A. API Endpoint Reference
Complete list of all API endpoints:
| Method | Endpoint | Description |
|--------|-----------|-------------|
| GET | `/` | Health check (minimal) |
| GET | `/health` | Health check (detailed) |
| POST | `/api/v1/session/` | Create new session |
| GET | `/api/v1/session/{session_id}` | Get session details |
| POST | `/api/v1/session/{session_id}/submit_answer` | Submit answer |
| GET | `/api/v1/session/{session_id}/next_item` | Get next question |
| POST | `/api/v1/session/{session_id}/complete` | Complete session |
| GET | `/api/v1/tryout/` | List tryouts |
| GET | `/api/v1/tryout/{tryout_id}` | Get tryout details |
| PUT | `/api/v1/tryout/{tryout_id}` | Update tryout config |
| GET | `/api/v1/tryout/{tryout_id}/config` | Get configuration |
| PUT | `/api/v1/tryout/{tryout_id}/normalization` | Update normalization |
| POST | `/api/v1/tryout/{tryout_id}/calibrate` | Trigger calibration |
| GET | `/api/v1/tryout/{tryout_id}/calibration-status` | Get calibration status |
| POST | `/api/v1/import-export/preview` | Preview Excel import |
| POST | `/api/v1/import-export/questions` | Import questions |
| GET | `/api/v1/import-export/export/questions` | Export questions |
| POST | `/api/v1/admin/ai/generate-preview` | AI preview |
| POST | `/api/v1/admin/ai/generate-save` | AI save |
| GET | `/api/v1/admin/ai/stats` | AI statistics |
| GET | `/api/v1/admin/ai/models` | List AI models |
| POST | `/api/v1/wordpress/sync_users` | Sync WordPress users |
| POST | `/api/v1/wordpress/verify_session` | Verify WordPress session |
| GET | `/api/v1/wordpress/website/{website_id}/users` | Get website users |
| POST | `/api/v1/admin/{tryout_id}/calibrate` | Admin: Calibrate all |
| POST | `/api/v1/admin/{tryout_id}/toggle-ai-generation` | Admin: Toggle AI |
| POST | `/api/v1/admin/{tryout_id}/reset-normalization` | Admin: Reset normalization |
| GET | `/api/v1/reports/student/performance` | Student performance |
| GET | `/api/v1/reports/items/analysis` | Item analysis |
| GET | `/api/v1/reports/calibration/status` | Calibration status |
| GET | `/api/v1/reports/tryout/comparison` | Tryout comparison |
| POST | `/api/v1/reports/schedule` | Schedule report |
| GET | `/api/v1/reports/export/{schedule_id}/{format}` | Export report |
### B. Database Schema Reference
**Tables:**
- `websites` - WordPress site configuration
- `users` - WordPress user mapping
- `tryouts` - Tryout configuration and metadata
- `items` - Questions with CTT/IRT parameters
- `sessions` - Student tryout attempts
- `user_answers` - Individual question responses
- `tryout_stats` - Running statistics per tryout
**Key Relationships:**
- Websites (1) → Tryouts (N)
- Tryouts (1) → Items (N)
- Tryouts (1) → Sessions (N)
- Tryouts (1) → TryoutStats (1)
- Items (1) → UserAnswers (N)
- Sessions (1) → UserAnswers (N)
- Users (1) → Sessions (N)
**Constraints:**
- `θ, b ∈ [-3, +3]` (IRT parameters)
- `NM, NN ∈ [0, 1000]` (score ranges)
- `ctt_p ∈ [0, 1]` (CTT difficulty)
- `bobot ∈ [0, 1]` (CTT weight)
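As an illustrative sketch (the table and constraint names here are hypothetical; the real definitions live in the ORM models), these bounds map naturally onto `CHECK` constraints in SQLAlchemy Core:

```python
from sqlalchemy import CheckConstraint, Column, Float, Integer, MetaData, Table

metadata = MetaData()

# Hypothetical mirror of the `items` table showing the parameter bounds above
items_example = Table(
    "items_example",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("irt_b", Float),   # IRT difficulty b
    Column("ctt_p", Float),   # CTT difficulty p
    Column("bobot", Float),   # CTT weight
    CheckConstraint("irt_b BETWEEN -3 AND 3", name="ck_items_irt_b_range"),
    CheckConstraint("ctt_p BETWEEN 0 AND 1", name="ck_items_ctt_p_range"),
    CheckConstraint("bobot BETWEEN 0 AND 1", name="ck_items_bobot_range"),
)

check_names = sorted(
    c.name for c in items_example.constraints
    if isinstance(c, CheckConstraint)
)
print(check_names)
```

Database-level constraints like these reject out-of-range values even if application-side Pydantic validation is bypassed.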
---
**Document End**
**Status:** Ready for Testing and Validation
**Next Steps:**
1. Complete all validation tests (Section 16)
2. Verify production readiness checklist (Section 17)
3. Deploy to production environment
4. Monitor performance and calibration progress
**Contact:** For issues or questions, refer to PRD.md and project-brief.md