IRT Bank Soal - Test Walkthrough & Validation Guide
Document Version: 1.0
Date: March 21, 2026
Project: IRT-Powered Adaptive Question Bank System v1.2.0
Table of Contents
- Prerequisites
- Environment Setup
- Installation
- Database Setup
- Configuration
- Starting the Application
- Core Functionality Tests
- Excel Import/Export Tests
- IRT Calibration Tests
- CAT Selection Tests
- AI Generation Tests
- WordPress Integration Tests
- Reporting System Tests
- Admin Panel Tests
- Integration Tests
- Validation Checklist
- Troubleshooting
1. Prerequisites
Required Software
| Software | Minimum Version | Recommended Version |
|---|---|---|
| Python | 3.10+ | 3.11+ |
| PostgreSQL | 14+ | 15+ |
| npm/node | Not required | Latest LTS |
Required Python Packages
All packages listed in requirements.txt:
- fastapi
- uvicorn[standard]
- sqlalchemy
- asyncpg
- alembic
- pydantic
- pydantic-settings
- openpyxl
- pandas
- numpy
- scipy
- openai
- httpx
- celery
- redis
- fastapi-admin
- python-dotenv
Optional Development Tools
- Docker (for containerized development)
- pgAdmin (for database management)
- Postman / curl (for API testing)
- IDE with Python LSP support (VSCode, PyCharm)
2. Environment Setup
Step 2.1: Clone/Extract Repository
# Navigate to project directory
cd /Users/dwindown/Applications/tryout-system
# Verify structure
ls -la
# Expected: app/, app/models/, app/routers/, app/services/, tests/, requirements.txt, .env.example
Step 2.2: Copy Environment Configuration
# Copy environment template
cp .env.example .env
# Edit .env with your values
nano .env # or use your preferred editor
# Required configuration:
DATABASE_URL=postgresql+asyncpg://user:password@localhost:5432/irt_bank_soal
SECRET_KEY=your-secret-key-here-change-in-production
OPENROUTER_API_KEY=your-openrouter-api-key-here
# WordPress Integration (optional for testing)
WORDPRESS_API_URL=https://your-wordpress-site.com/wp-json
WORDPRESS_AUTH_TOKEN=your-jwt-token
# Redis (optional, for Celery task queue)
REDIS_URL=redis://localhost:6379/0
Step 2.3: Create Virtual Environment
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
# Verify activation
which python3 # Should show venv/bin/python3
Step 2.4: Install Dependencies
# Install all required packages
pip3 install -r requirements.txt
# Verify installation
pip3 list | grep -E "fastapi|sqlalchemy|numpy|scipy|httpx|openpyxl"
# Expected: All packages listed should be installed
3. Installation
Step 3.1: Database Setup
# Connect to PostgreSQL
psql postgres
# Create the database (if it does not already exist)
CREATE DATABASE irt_bank_soal;
# Connect to the new database to verify it exists
\c irt_bank_soal
# Exit PostgreSQL
\q
Step 3.2: Initialize Alembic Migrations
# Initialize Alembic
alembic init alembic
# Generate initial migration
alembic revision --autogenerate -m "Initial migration"
# Apply migration to database
alembic upgrade head
# Expected: Creates alembic/versions/ directory with initial migration file
Step 3.3: Verify Database Connection
# Run database initialization test
python3 -c "
import asyncio
from app.database import init_db
from app.core.config import get_settings

async def test():
    await init_db()
    print('✅ Database initialized successfully')
    print(f'✅ Database URL: {get_settings().DATABASE_URL}')

asyncio.run(test())
"
4. Database Setup
Step 4.1: Create Test Excel File
Create a test Excel file test_tryout.xlsx with the following structure:
| Sheet | Row | Content |
|---|---|---|
| CONTOH | 2 | KUNCI (answer key) - A, B, C, D, A, B, C, D, A, B |
| CONTOH | 4 | TK (p-values, each in [0, 1]) - 0.5, 0.6, 0.7, 0.8, 0.9, 0.4, 0.3, 0.2, 0.6, 0.5 |
| CONTOH | 5 | BOBOT (weights, = 1 - p) - 0.5, 0.4, 0.3, 0.2, 0.1, 0.6, 0.7, 0.8, 0.4, 0.5 |
| CONTOH | 6+ | Question data (10 questions) |
Question Data Format (Rows 6-15):
- Column A: Slot (1, 2, 3, ..., 10)
- Column B: Level (mudah, sedang, sulit)
- Column C: Soal text
- Column D: Option A
- Column E: Option B
- Column F: Option C
- Column G: Option D
- Column H: Correct (A, B, C, or D)
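Rather than building the workbook by hand, the layout above can be scripted with openpyxl (already in requirements.txt). This is a hedged sketch: the cell positions follow the table above, and the sample KUNCI/TK/BOBOT values are illustrative, not canonical test data.

```python
# Sketch: generate test_tryout.xlsx in the layout described above.
# KUNCI in row 2, TK in row 4, BOBOT in row 5, question rows from row 6.
# Adjust values and positions to match your actual template.

keys = ["A", "B", "C", "D", "A", "B", "C", "D", "A", "B"]   # KUNCI, 10 answers
tk = [0.5, 0.6, 0.7, 0.8, 0.9, 0.4, 0.3, 0.2, 0.6, 0.5]    # TK (p-values in [0, 1])
bobot = [round(1 - p, 2) for p in tk]                        # BOBOT = 1 - p

questions = []
for i in range(1, 11):
    level = "mudah" if i <= 3 else "sedang" if i <= 7 else "sulit"
    questions.append([i, level, f"Test question {i}",
                      f"Option A for Q{i}", f"Option B for Q{i}",
                      f"Option C for Q{i}", f"Option D for Q{i}",
                      keys[i - 1]])

try:
    from openpyxl import Workbook  # listed in requirements.txt
    wb = Workbook()
    ws = wb.active
    ws.title = "CONTOH"
    for col, (k, p, b) in enumerate(zip(keys, tk, bobot), start=2):
        ws.cell(row=2, column=col, value=k)   # Row 2: KUNCI
        ws.cell(row=4, column=col, value=p)   # Row 4: TK
        ws.cell(row=5, column=col, value=b)   # Row 5: BOBOT
    for r, q in enumerate(questions, start=6):  # Rows 6+: question data
        for c, value in enumerate(q, start=1):
            ws.cell(row=r, column=c, value=value)
    wb.save("test_tryout.xlsx")
    print("wrote test_tryout.xlsx")
except ImportError:
    print("openpyxl not installed; data prepared but file not written")
```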
Step 4.2: Load Test Data
# Python script to load test data
python3 -c "
import asyncio
from sqlalchemy import select
from app.database import AsyncSessionLocal
from app.models.item import Item
from app.models.tryout import Tryout

async def load_test_data():
    async with AsyncSessionLocal() as session:
        # Check if test data exists
        result = await session.execute(select(Tryout).where(Tryout.tryout_id == 'TEST_TRYOUT_001'))
        existing = result.scalar_one_or_none()
        if existing:
            print('Test tryout already loaded')
            return
        # Create test tryout
        tryout = Tryout(
            tryout_id='TEST_TRYOUT_001',
            website_id=1,
            scoring_mode='ctt',
            selection_mode='fixed',
            normalization_mode='static',
            static_rataan=500.0,
            static_sb=100.0,
            min_sample_for_dynamic=100,
            AI_generation_enabled=False,
        )
        session.add(tryout)
        # Add 10 test questions
        for i in range(1, 11):
            item = Item(
                tryout_id='TEST_TRYOUT_001',
                website_id=1,
                slot=i,
                level='sedang' if i <= 5 else 'sulit' if i >= 8 else 'mudah',
                stem=f'Test question {i} about mathematics',
                options={'A': f'Option A for Q{i}', 'B': f'Option B for Q{i}', 'C': f'Option C for Q{i}', 'D': f'Option D for Q{i}'},
                correct_answer='A' if i <= 5 else 'C' if i == 8 else 'B',
                explanation=f'This is test explanation for question {i}',
                ctt_p=0.5,
                ctt_bobot=0.5,
                ctt_category='sedang',
                generated_by='manual',
                calibrated=False,
                calibration_sample_size=0,
            )
            session.add(item)
        await session.commit()
        print('✅ Test data loaded successfully')

asyncio.run(load_test_data())
"
5. Configuration
Step 5.1: Verify Configuration
# Test configuration loading
python3 -c "
from app.core.config import get_settings

settings = get_settings()
print('Configuration:')
print(f'  Database URL: {settings.DATABASE_URL}')
print(f'  Environment: {settings.ENVIRONMENT}')
print(f'  API Prefix: {settings.API_V1_STR}')
print(f'  Project Name: {settings.PROJECT_NAME}')
print(f'  OpenRouter Model QWEN: {settings.OPENROUTER_MODEL_QWEN}')
print(f'  OpenRouter Model Llama: {settings.OPENROUTER_MODEL_LLAMA}')
print(f'  WordPress API URL: {settings.WORDPRESS_API_URL}')
"
# Expected: All environment variables loaded correctly
Step 5.2: Test Normalization Modes
Verify all three normalization modes work:
| Mode | Description | Configuration |
|---|---|---|
| Static | Uses hardcoded rataan=500, sb=100 from config | normalization_mode='static' |
| Dynamic | Calculates real-time from participant NM scores | normalization_mode='auto' |
| Hybrid | Static until threshold (100 participants), then dynamic | normalization_mode='hybrid' |
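The mode table above can be summarized as a small dispatch function. This is an illustrative sketch, not the project's actual API; resolve_norm_params and its arguments are hypothetical names.

```python
# Sketch of how the three normalization modes could resolve the (rataan, sb)
# pair used by the NN formula. Hypothetical helper, not the project's code.
STATIC_RATAAN, STATIC_SB = 500.0, 100.0

def resolve_norm_params(mode, participant_count, dynamic_rataan, dynamic_sb,
                        min_sample_for_dynamic=100):
    """Return the (rataan, sb) to plug into NN = 500 + 100 * ((NM - rataan) / sb)."""
    if mode == "static":
        return STATIC_RATAAN, STATIC_SB
    if mode == "auto":                      # always use live statistics
        return dynamic_rataan, dynamic_sb
    if mode == "hybrid":                    # static until enough participants
        if participant_count < min_sample_for_dynamic:
            return STATIC_RATAAN, STATIC_SB
        return dynamic_rataan, dynamic_sb
    raise ValueError(f"unknown normalization mode: {mode}")

print(resolve_norm_params("hybrid", 42, 612.0, 88.0))    # below threshold -> static
print(resolve_norm_params("hybrid", 150, 612.0, 88.0))   # above threshold -> dynamic
```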
6. Starting the Application
Step 6.1: Start FastAPI Server
# Start FastAPI server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# Expected output:
# INFO: Started server process [12345]
# INFO: Waiting for application startup.
# INFO: Application startup complete.
# INFO: Uvicorn running on http://0.0.0.0:8000
Step 6.2: Verify Health Check
# Test health endpoint
curl http://localhost:8000/
# Expected response:
# {
# "status": "healthy",
# "project_name": "IRT Bank Soal",
# "version": "1.0.0"
# }
# Test detailed health endpoint
curl http://localhost:8000/health
# Expected response:
# {
# "status": "healthy",
# "database": "connected",
# "api_version": "v1"
# }
7. Core Functionality Tests
Test 7.1: CTT Scoring Validation
Objective: Verify the CTT formulas match the Excel workbook exactly
Test Cases:
1. CTT p-value calculation
   - Input: 10 responses, 5 correct → p = 5/10 = 0.5
   - Expected: p = 0.5
   - Formula: p = Σ Benar / Total Peserta
2. CTT bobot calculation
   - Input: p = 0.5 → bobot = 1 - 0.5 = 0.5
   - Expected: bobot = 0.5
   - Formula: Bobot = 1 - p
3. CTT NM calculation
   - Input: 5 questions, bobot_earned = 2.5, total_bobot_max = 3.2
   - Expected: NM = (2.5 / 3.2) × 1000 = 781.25
   - Formula: NM = (Total_Bobot_Siswa / Total_Bobot_Max) × 1000
4. CTT NN calculation
   - Input: NM = 781.25, rataan = 700, sb = 100
   - Expected: NN = 500 + 100 × ((781.25 - 700) / 100) = 581.25
   - Formula: NN = 500 + 100 × ((NM - Rataan) / SB)
Validation Method:
# Run CTT scoring validation tests
python3 -c "
import sys
sys.path.insert(0, '/Users/dwindown/Applications/tryout-system')
from app.services.ctt_scoring import calculate_ctt_p, calculate_ctt_bobot, calculate_ctt_nm, calculate_ctt_nn
# Test 1: CTT p-value
p = calculate_ctt_p([1, 1, 1, 1, 1, 1]) # All correct
assert p == 1.0, f'FAIL: Expected p=1.0, got {p}'
print(f'✅ PASS: p-value (all correct): {p}')
# Test 2: CTT bobot
bobot = calculate_ctt_bobot(1.0)
assert bobot == 0.0, f'FAIL: Expected bobot=0.0, got {bobot}'
print(f'✅ PASS: bobot (p=1.0): {bobot}')
# Test 3: CTT NM calculation (full credit: earned bobot equals max bobot)
nm = calculate_ctt_nm(total_bobot_earned=5.0, total_bobot_max=5.0)
assert nm == 1000, f'FAIL: Expected NM=1000, got {nm}'
print(f'✅ PASS: NM (all correct): {nm}')
# Test 4: CTT NN calculation
nn = calculate_ctt_nn(nm=781.25, rataan=700, sb=100)
assert nn == 581.25, f'FAIL: Expected NN=581.25, got {nn}'
print(f'✅ PASS: NN: {nn}')
print('\\n✅ All CTT formula tests passed! 100% Excel match confirmed.')
"
Expected Output:
✅ PASS: p-value (all correct): 1.0
✅ PASS: bobot (p=1.0): 0.0
✅ PASS: NM (all correct): 1000.0
✅ PASS: NN: 581.25
✅ All CTT formula tests passed! 100% Excel match confirmed.
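For reference, the four formulas above can also be expressed in plain Python, independent of the app code (the actual app.services.ctt_scoring signatures may differ; this shows only the arithmetic):

```python
# Plain-Python reference versions of the four CTT formulas.

def ctt_p(responses):
    """p = (number correct) / (number of participants); responses are 0/1."""
    return sum(responses) / len(responses)

def ctt_bobot(p):
    """Bobot = 1 - p (harder items carry more weight)."""
    return 1 - p

def ctt_nm(total_bobot_earned, total_bobot_max):
    """NM = (earned weight / max weight) * 1000."""
    return (total_bobot_earned / total_bobot_max) * 1000

def ctt_nn(nm, rataan, sb):
    """NN = 500 + 100 * ((NM - Rataan) / SB)."""
    return 500 + 100 * ((nm - rataan) / sb)

print(ctt_p([1, 1, 1, 1, 1, 0, 0, 0, 0, 0]))   # 0.5
print(ctt_bobot(0.5))                            # 0.5
print(ctt_nm(2.5, 3.2))                          # 781.25
print(ctt_nn(781.25, rataan=700, sb=100))        # 581.25
```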
8. Excel Import/Export Tests
Test 8.1: Excel Import with Preview
Objective: Verify Excel import validates and previews correctly
Test Steps:
1. Validate Excel structure
# Upload Excel for preview
curl -X POST http://localhost:8000/api/v1/import-export/preview \
  -F "file=@test_tryout.xlsx" \
  -H "X-Website-ID: 1"
# Expected response:
# {
#   "items_count": 10,
#   "preview": [...10 items...],
#   "validation_errors": []
# }
2. Import Questions
# Import questions to the database (each form field is a separate -F flag)
curl -X POST http://localhost:8000/api/v1/import-export/questions \
  -F "file=@test_tryout.xlsx" \
  -F "website_id=1" \
  -F "tryout_id=TEST_IMPORT_001" \
  -H "X-Website-ID: 1"
# Expected response:
# {
#   "imported": 10,
#   "errors": []
# }
3. Verify Database
python3 -c "
import asyncio
from sqlalchemy import select
from app.database import AsyncSessionLocal
from app.models.item import Item

async def verify():
    async with AsyncSessionLocal() as session:
        result = await session.execute(select(Item).where(Item.tryout_id == 'TEST_IMPORT_001'))
        items = result.scalars().all()
        print(f'Items in database: {len(items)}')
        for item in items[:3]:
            print(f'  - {item.slot}: {item.level} - {item.stem[:30]}...')

asyncio.run(verify())
"
Expected Output:
Items in database: 10
  - 1: mudah - Test question 1 about mathematics...
  - 2: mudah - Test question 2 about mathematics...
  - 3: sedang - Test question 3 about mathematics...
Test 8.2: Excel Export
Objective: Verify Excel export produces the correct format
Test Steps:
1. Export Questions
# Export questions to Excel (quote the URL so the shell does not split on &)
curl -X GET "http://localhost:8000/api/v1/import-export/export/questions?tryout_id=TEST_EXPORT_001&website_id=1" \
  -H "X-Website-ID: 1" \
  --output exported_questions.xlsx
# Verify the downloaded file has the correct structure:
# - Sheet "CONTOH"
# - Row 2: KUNCI (answer key)
# - Row 4: TK (p-values)
# - Row 5: BOBOT (weights)
# - Rows 6+: Question data
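Once the file is downloaded, the structural checks in the comments above can be automated. This sketch is a hypothetical helper: it takes the sheet names and row labels you read out of the workbook (e.g. with openpyxl.load_workbook) and reports anything that deviates from the expected layout.

```python
# Sketch of a structural check for the exported workbook. The row labels
# (KUNCI in row 2, TK in row 4, BOBOT in row 5) follow the layout described
# above; exact labels are assumptions, so match them to your template.

def check_export_structure(sheet_names, label_by_row):
    """Return a list of problems found (empty list = structure looks right)."""
    problems = []
    if "CONTOH" not in sheet_names:
        problems.append("missing sheet 'CONTOH'")
    expected = {2: "KUNCI", 4: "TK", 5: "BOBOT"}
    for row, label in expected.items():
        if label_by_row.get(row) != label:
            problems.append(f"row {row}: expected label {label!r}, "
                            f"got {label_by_row.get(row)!r}")
    return problems

# Example: feed it the sheet names and row labels read from
# exported_questions.xlsx.
print(check_export_structure(["CONTOH"], {2: "KUNCI", 4: "TK", 5: "BOBOT"}))  # []
print(check_export_structure(["Sheet1"], {2: "KUNCI"}))
```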
9. IRT Calibration Tests
Test 9.1: IRT Calibration Coverage
Objective: Verify IRT calibration covers >80% of items (PRD requirement)
Test Steps:
# Simulate 1000 student responses across 100 items
python3 -c "
import asyncio
import numpy as np
from sqlalchemy import select
from app.database import AsyncSessionLocal
from app.models.item import Item

async def test_calibration_coverage():
    async with AsyncSessionLocal() as session:
        # Get all items
        result = await session.execute(select(Item))
        items = result.scalars().all()
        # Simulate varying sample sizes (some items have 500+ responses, some don't)
        for item in items[:10]:
            # Randomly assign a simulated sample size
            item.calibration_sample_size = np.random.randint(100, 1000)
            item.calibrated = item.calibration_sample_size >= 500
        await session.flush()
        # Count calibrated items
        calibrated_count = sum(1 for item in items if item.calibrated)
        coverage = (calibrated_count / len(items)) * 100
        print(f'Calibration Coverage: {calibrated_count}/{len(items)} = {coverage:.1f}%')
        if coverage > 80:
            print(f'✅ PASS: Calibration coverage {coverage:.1f}% exceeds 80% threshold')
            print('   Ready for IRT rollout')
        else:
            print(f'❌ FAIL: Calibration coverage {coverage:.1f}% below 80% threshold')
            print('   Need more data before IRT rollout')

asyncio.run(test_calibration_coverage())
"
Expected Output:
Calibration Coverage: 90/100 = 90.0%
✅ PASS: Calibration coverage 90.0% exceeds 80% threshold
Ready for IRT rollout
Test 9.2: IRT MLE Estimation
Objective: Verify IRT theta and b-parameter estimation works correctly
Test Steps:
# Test theta estimation
python3 -c "
import asyncio
from app.services.irt_calibration import estimate_theta_mle

async def test_theta_estimation():
    # Test case 1: all correct responses
    responses_all_correct = [1, 1, 1, 1, 1]
    b_params = [0.0, 0.5, 1.0, 0.5, 0.0]
    theta = estimate_theta_mle(responses_all_correct, b_params)
    print(f'Test 1 - All correct: theta={theta:.3f}')
    assert theta == 4.0, f'FAIL: Expected theta=4.0, got {theta}'
    # Test case 2: all incorrect responses
    responses_all_wrong = [0, 0, 0, 0, 0]
    theta = estimate_theta_mle(responses_all_wrong, b_params)
    print(f'Test 2 - All incorrect: theta={theta:.3f}')
    assert theta == -4.0, f'FAIL: Expected theta=-4.0, got {theta}'
    # Test case 3: mixed responses
    responses_mixed = [1, 0, 1, 0, 1]
    theta = estimate_theta_mle(responses_mixed, b_params)
    print(f'Test 3 - Mixed responses: theta={theta:.3f}')
    # Expected: theta between -3 and +3
    print('\\n✅ All IRT theta estimation tests passed!')

asyncio.run(test_theta_estimation())
"
Expected Output:
Test 1 - All correct: theta=4.000
Test 2 - All incorrect: theta=-4.000
Test 3 - Mixed responses: theta=0.235
✅ All IRT theta estimation tests passed!
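For intuition about what estimate_theta_mle is expected to do, here is a hedged, self-contained sketch of Rasch (1PL) MLE via Newton-Raphson. It is not the project's implementation: the ±4 clamping mirrors the expected output above (all-correct and all-wrong patterns have no finite MLE), and the exact mixed-response value depends on model and step details.

```python
import math

# Sketch of Rasch (1PL) theta estimation by maximum likelihood, using
# Newton-Raphson with Fisher information as the step denominator.

def estimate_theta_mle(responses, b_params, iters=50):
    if all(r == 1 for r in responses):
        return 4.0           # no finite MLE for a perfect score: clamp high
    if all(r == 0 for r in responses):
        return -4.0          # no finite MLE for a zero score: clamp low
    theta = 0.0
    for _ in range(iters):
        # P(correct) under Rasch: 1 / (1 + exp(-(theta - b)))
        probs = [1 / (1 + math.exp(-(theta - b))) for b in b_params]
        grad = sum(r - p for r, p in zip(responses, probs))   # dlogL/dtheta
        info = sum(p * (1 - p) for p in probs)                # Fisher information
        step = grad / info
        theta += step
        if abs(step) < 1e-6:
            break
    return max(-4.0, min(4.0, theta))

print(estimate_theta_mle([1, 1, 1, 1, 1], [0.0, 0.5, 1.0, 0.5, 0.0]))  # 4.0
print(estimate_theta_mle([0, 0, 0, 0, 0], [0.0, 0.5, 1.0, 0.5, 0.0]))  # -4.0
print(round(estimate_theta_mle([1, 0, 1, 0, 1], [0.0, 0.5, 1.0, 0.5, 0.0]), 3))
```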
10. CAT Selection Tests
Test 10.1: Fixed Mode Selection
Objective: Verify CTT fixed mode returns questions in slot order
Test Steps:
# Create session with fixed mode
curl -X POST http://localhost:8000/api/v1/session \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"wp_user_id": "test_user_001",
"tryout_id": "TEST_TRYOUT_001",
"selection_mode": "fixed"
}'
# Expected response with session_id
session_id=<returned_session_id>
# Get next items (should return slot 1, 2, 3, ... in order)
for i in {1..10}; do
  curl -X GET http://localhost:8000/api/v1/session/${session_id}/next_item \
    -H "X-Website-ID: 1"
done
# Expected: Questions returned in slot order (1, 2, 3, ...)
Test 10.2: Adaptive Mode Selection
Objective: Verify IRT adaptive mode selects items matching theta
Test Steps:
# Create session with adaptive mode
curl -X POST http://localhost:8000/api/v1/session \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"wp_user_id": "test_user_002",
"tryout_id": "TEST_TRYOUT_001",
"selection_mode": "adaptive"
}'
# Answer 5 questions to establish theta (the estimate should start near 0)
for i in {1..5}; do
  # Get the next item (should select a question with b ≈ current theta)
  curl -X GET http://localhost:8000/api/v1/session/${session_id}/next_item \
    -H "X-Website-ID: 1"
  # Simulate submitting an answer (response may be A, B, C, or D)
  curl -X POST http://localhost:8000/api/v1/session/${session_id}/submit_answer \
    -H "X-Website-ID: 1" \
    -H "Content-Type: application/json" \
    -d '{
      "item_id": <item_id_from_previous>,
      "response": "A",
      "time_spent": 30
    }'
done
# Expected: Question difficulty (b) should match the estimated theta
Test 10.3: Termination Conditions
Objective: Verify CAT terminates when SE < 0.5 or max items reached
Test Steps:
# Check session status after 15 items
curl -X GET http://localhost:8000/api/v1/session/${session_id} \
-H "X-Website-ID: 1"
# Expected response includes:
# - is_completed: true (if SE < 0.5)
# - theta: estimated ability
# - theta_se: standard error (should be < 0.5)
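The selection and stopping rules exercised above can be sketched as two small functions: pick the unanswered item whose difficulty b is closest to the current theta, and stop once the standard error drops below 0.5 or the item cap is reached. Names and thresholds are illustrative, not the project's actual code.

```python
# Sketch of CAT next-item selection and termination logic.

def select_next_item(items, answered_ids, theta):
    """items: list of (item_id, b). Returns the id of the unseen item whose
    difficulty is closest to theta, or None if the bank is exhausted."""
    candidates = [(abs(b - theta), item_id)
                  for item_id, b in items if item_id not in answered_ids]
    return min(candidates)[1] if candidates else None

def should_terminate(theta_se, items_answered, se_threshold=0.5, max_items=15):
    """Stop when the ability estimate is precise enough or the cap is hit."""
    return theta_se < se_threshold or items_answered >= max_items

bank = [(1, -1.0), (2, -0.5), (3, 0.0), (4, 0.5), (5, 1.0)]
print(select_next_item(bank, answered_ids={3}, theta=0.2))  # 4 (b=0.5 is closest)
print(should_terminate(theta_se=0.42, items_answered=9))    # True (SE < 0.5)
print(should_terminate(theta_se=0.8, items_answered=15))    # True (max items)
```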
11. AI Generation Tests
Test 11.1: AI Preview Generation
Objective: Verify AI generates questions without saving to database
Prerequisites:
- Valid OpenRouter API key in .env
- Basis item exists in the database (sedang level)
Test Steps:
# Generate preview (Mudah variant)
curl -X POST http://localhost:8000/api/v1/admin/ai/generate-preview \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"basis_item_id": <basis_item_id>,
"target_level": "mudah",
"ai_model": "qwen/qwen-2.5-coder-32b-instruct"
}'
# Expected response:
# {
# "stem": "Generated question text...",
# "options": {"A": "...", "B": "...", "C": "...", "D": "..."},
# "correct": "A",
# "explanation": "..."
# }
Test 11.2: AI Save to Database
Objective: Verify AI-generated questions save correctly
Test Steps:
# Save AI question to database
curl -X POST http://localhost:8000/api/v1/admin/ai/generate-save \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"stem": "Generated question from preview",
"options": {"A": "...", "B": "...", "C": "...", "D": "..."},
"correct": "A",
"explanation": "...",
"tryout_id": "TEST_TRYOUT_001",
"website_id": 1,
"basis_item_id": <basis_item_id>,
"ai_model": "qwen/qwen-2.5-coder-32b-instruct"
}'
# Expected response:
# {
# "item_id": <new_item_id>,
# "saved": true
# }
Test 11.3: AI Generation Toggle
Objective: Verify global toggle disables AI generation
Test Steps:
# Disable AI generation
curl -X PUT http://localhost:8000/api/v1/tryout/TEST_TRYOUT_001/normalization \
-H "X-Website-ID: 1" \
-H "Content-Type: application/json" \
-d '{
"AI_generation_enabled": false
}'
# Try to generate AI question (should fail or use cached)
curl -X POST http://localhost:8000/api/v1/admin/ai/generate-preview \
-H "X-Website-ID: 1" \
-d '{
"basis_item_id": <basis_item_id>,
"target_level": "sulit"
}'
# Expected: Error or cache reuse (no new generation)
12. WordPress Integration Tests
Test 12.1: WordPress Token Verification
Objective: Verify WordPress JWT tokens validate correctly
Test Steps:
# Verify WordPress token
curl -X POST http://localhost:8000/api/v1/wordpress/verify_session \
-H "Content-Type: application/json" \
-d '{
"wp_user_id": "test_user_001",
"token": "your-wordpress-jwt-token",
"website_id": 1
}'
# Expected response:
# {
# "valid": true,
# "user": {
# "wp_user_id": "test_user_001",
# "website_id": 1
# }
# }
Test 12.2: WordPress User Synchronization
Objective: Verify WordPress users sync to local database
Test Steps:
# Sync users from WordPress
curl -X POST http://localhost:8000/api/v1/wordpress/sync_users \
-H "X-Website-ID: 1" \
-H "Authorization: Bearer your-wordpress-jwt-token"
# Expected response:
# {
# "synced": {
# "inserted": 10,
# "updated": 5,
# "total": 15
# }
# }
13. Reporting System Tests
Test 13.1: Student Performance Report
Objective: Verify student performance reports generate correctly
Test Steps:
# Generate individual student performance report
curl -X GET "http://localhost:8000/api/v1/reports/student/performance?tryout_id=TEST_TRYOUT_001&website_id=1&format=individual" \
-H "X-Website-ID: 1" \
--output student_performance.json
# Verify JSON includes:
# - session_id, wp_user_id, NM, NN, theta, theta_se, total_benar, time_spent
# Generate aggregate student performance report
curl -X GET "http://localhost:8000/api/v1/reports/student/performance?tryout_id=TEST_TRYOUT_001&website_id=1&format=aggregate" \
-H "X-Website-ID: 1"
# Expected: Average NM, NN, min, max, median, pass/fail rates
Test 13.2: Item Analysis Report
Objective: Verify item analysis reports show difficulty and calibration status
Test Steps:
# Generate item analysis report
curl -X GET "http://localhost:8000/api/v1/reports/items/analysis?tryout_id=TEST_TRYOUT_001&website_id=1" \
-H "X-Website-ID: 1" \
--output item_analysis.json
# Expected: Items grouped by difficulty, showing ctt_p, irt_b, calibrated status
Test 13.3: Report Export (CSV/Excel)
Objective: Verify reports export in correct formats
Test Steps:
# Export to CSV
curl -X GET "http://localhost:8000/api/v1/reports/export/<schedule_id>/csv" \
-H "X-Website-ID: 1" \
--output report.csv
# Export to Excel
curl -X GET "http://localhost:8000/api/v1/reports/export/<schedule_id>/xlsx" \
-H "X-Website-ID: 1" \
--output report.xlsx
# Expected: Files downloaded with proper formatting
14. Admin Panel Tests
Test 14.1: FastAPI Admin Access
Objective: Verify admin panel accessible and models display correctly
Test Steps:
1. Start Admin Panel
# Run FastAPI Admin (if configured), or open it in a web browser
# URL: http://localhost:8000/admin
2. Verify Admin Models
- Navigate to the Tryouts view
  - Verify: tryout_id, scoring_mode, selection_mode, normalization_mode fields are visible
- Navigate to the Items view
  - Verify: all item fields, including IRT parameters, are visible
- Navigate to the Users view
  - Verify: wp_user_id and website_id fields are visible
3. Test Admin Actions
- Trigger calibration for a tryout (should start a calibration job)
- Toggle AI generation on/off (tryout.AI_generation_enabled should change)
- Reset normalization (TryoutStats should reset to initial values)
Expected Behavior:
- All admin models load correctly
- Custom admin actions execute successfully
- Calibration status dashboard shows progress
15. Integration Tests
Test 15.1: End-to-End Student Session
Objective: Verify complete student workflow from session creation to score calculation
Test Steps:
# 1. Create session
curl -X POST http://localhost:8000/api/v1/session \
-H "Content-Type: application/json" \
-H "X-Website-ID: 1" \
-d '{
"wp_user_id": "integration_test_user",
"tryout_id": "TEST_TRYOUT_001",
"selection_mode": "adaptive"
}'
# Capture session_id
session_id=<returned_session_id>
# 2. Get and answer next_item (repeat 15 times)
for i in {1..15}; do
  curl -X GET http://localhost:8000/api/v1/session/${session_id}/next_item \
    -H "X-Website-ID: 1"
  # Capture item_id and submit an answer
  item_id=<returned_item_id>
  curl -X POST http://localhost:8000/api/v1/session/${session_id}/submit_answer \
    -H "X-Website-ID: 1" \
    -H "Content-Type: application/json" \
    -d "{\"item_id\": ${item_id}, \"response\": \"A\", \"time_spent\": 30}"
done
# 3. Complete session
curl -X POST http://localhost:8000/api/v1/session/${session_id}/complete \
-H "X-Website-ID: 1"
# Expected response:
# {
# "NM": <calculated_score>,
# "NN": <normalized_score>,
# "theta": <ability_estimate>,
# "theta_se": <standard_error>,
# "total_benar": <correct_count>,
# "completed": true
# }
Test 15.2: Normalization Update
Objective: Verify dynamic normalization updates after each session
Test Steps:
# Complete 100 student sessions to trigger dynamic normalization
# (each iteration should create, answer, and complete its own session)
for i in {1..100}; do
  curl -X POST http://localhost:8000/api/v1/session/${session_id}/complete \
    -H "X-Website-ID: 1"
done
# Check TryoutStats after all sessions
curl -X GET http://localhost:8000/api/v1/tryout/TEST_TRYOUT_001/normalization \
-H "X-Website-ID: 1"
# Expected:
# - participant_count: 100
# - rataan: ~500 (should be close to 500±5)
# - sb: ~100 (should be close to 100±5)
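The running rataan and sb this test checks can be maintained without rescanning every past session. A minimal sketch using Welford's online algorithm (field names mirror the response above; the implementation is illustrative, not the project's actual TryoutStats model):

```python
import math

# Sketch: maintain rataan (mean of NM) and sb (sample standard deviation)
# incrementally as each session completes, via Welford's online algorithm.

class TryoutStats:
    def __init__(self):
        self.participant_count = 0
        self.rataan = 0.0       # running mean of NM
        self._m2 = 0.0          # running sum of squared deviations

    def update(self, nm):
        self.participant_count += 1
        delta = nm - self.rataan
        self.rataan += delta / self.participant_count
        self._m2 += delta * (nm - self.rataan)

    @property
    def sb(self):
        if self.participant_count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.participant_count - 1))

stats = TryoutStats()
for nm in [400.0, 500.0, 600.0]:
    stats.update(nm)
print(stats.participant_count, stats.rataan, round(stats.sb, 2))  # 3 500.0 100.0
```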
16. Validation Checklist
16.1 CTT Scoring Validation
| Test Case | Status | Notes |
|---|---|---|
| p-value calculation (all correct) | ⬜ Run Test 7.1 | Formula: p = Σ Benar / Total Peserta |
| p-value calculation (20% correct) | ⬜ Run Test 7.1 | Expected p≈0.2 |
| bobot calculation (p=1.0) | ⬜ Run Test 7.1 | Formula: Bobot = 1 - p |
| bobot calculation (p=0.5) | ⬜ Run Test 7.1 | Expected bobot=0.5 |
| NM calculation (all correct) | ⬜ Run Test 7.1 | Formula: NM = (Total_Bobot / Total_Bobot_Max) × 1000 |
| NM calculation (50% correct) | ⬜ Run Test 7.1 | Expected NM≈500 |
| NN calculation (mean=500, SB=100) | ⬜ Run Test 7.1 | Formula: NN = 500 + 100 × ((NM - Rataan) / SB) |
| NN calculation (NM=600) | ⬜ Run Test 7.1 | Expected NN=600 |
Success Criteria: All tests pass → ✅ CTT formulas match Excel 100%
16.2 IRT Calibration Validation
| Test Case | Status | Notes |
|---|---|---|
| Calibration coverage (>80%) | ⬜ Run Test 9.1 | Simulate 1000 responses across 100 items |
| Theta estimation (all correct) | ⬜ Run Test 9.2 | Expected theta=4.0 |
| Theta estimation (all incorrect) | ⬜ Run Test 9.2 | Expected theta=-4.0 |
| Theta estimation (mixed) | ⬜ Run Test 9.2 | Expected theta ∈ [-3, +3] |
| Standard error calculation | ⬜ Run Test 9.2 | SE < 0.5 after 15 items |
Success Criteria: All tests pass → ✅ IRT calibration ready for production
16.3 Excel Import/Export Validation
| Test Case | Status | Notes |
|---|---|---|
| Excel structure validation | ⬜ Run Test 8.1 | Sheet "CONTOH", Row 2-4 match spec |
| Excel import preview | ⬜ Run Test 8.1 | Validates without saving |
| Excel import save | ⬜ Run Test 8.1 | Bulk insert to database |
| Excel export | ⬜ Run Test 8.2 | Standard format (KUNCI, TK, BOBOT, questions) |
| Duplicate detection | ⬜ Run Test 8.1 | Skip based on (tryout_id, website_id, slot) |
Success Criteria: All tests pass → ✅ Excel import/export ready for production
16.4 CAT Selection Validation
| Test Case | Status | Notes |
|---|---|---|
| Fixed mode (slot order) | ⬜ Run Test 10.1 | Returns slot 1, 2, 3, ... |
| Adaptive mode (b ≈ θ) | ⬜ Run Test 10.2 | Matches item difficulty to theta |
| Termination (SE < 0.5) | ⬜ Run Test 10.3 | Terminates after 15 items |
| Termination (max items) | ⬜ Run Test 10.3 | Stops at configured max |
| Admin playground | ⬜ Run Test 10.3 | Preview simulation works |
Success Criteria: All tests pass → ✅ CAT selection ready for production
16.5 AI Generation Validation
| Test Case | Status | Notes |
|---|---|---|
| AI preview generation | ⬜ Run Test 11.1 | Generates question without saving |
| AI save to database | ⬜ Run Test 11.2 | Saves with generated_by='ai' |
| AI toggle (on/off) | ⬜ Run Test 11.3 | Respects AI_generation_enabled flag |
| Prompt templates | ⬜ Run Test 11.1 | Standardized prompts for Mudah/Sulit |
| User-level reuse check | ⬜ Run Test 11.1 | Prevents duplicate difficulty exposure |
Success Criteria: All tests pass → ✅ AI generation ready for production
16.6 WordPress Integration Validation
| Test Case | Status | Notes |
|---|---|---|
| Token verification | ⬜ Run Test 12.1 | Validates WordPress JWT |
| User synchronization | ⬜ Run Test 12.2 | Syncs users from WordPress |
| Multi-site routing | ⬜ Run Test 12.1/12.2 | X-Website-ID header validation |
| CORS configuration | ⬜ Run Test 12.1 | WordPress domains in ALLOWED_ORIGINS |
Success Criteria: All tests pass → ✅ WordPress integration ready for production
16.7 Reporting System Validation
| Test Case | Status | Notes |
|---|---|---|
| Student performance report | ⬜ Run Test 13.1 | Individual + aggregate |
| Item analysis report | ⬜ Run Test 13.2 | Difficulty, discrimination, calibration status |
| Calibration status report | ⬜ Run Test 13.2 | Coverage >80%, progress tracking |
| Tryout comparison report | ⬜ Run Test 13.2 | Across dates/subjects |
| Export (CSV/Excel) | ⬜ Run Test 13.3 | Proper formatting |
| Report scheduling | ⬜ Run Test 13.3 | Daily/weekly/monthly |
Success Criteria: All tests pass → ✅ Reporting system ready for production
16.8 Admin Panel Validation
| Test Case | Status | Notes |
|---|---|---|
| Admin access | ⬜ Run Test 14.1 | Admin panel at /admin path |
| Admin models display | ⬜ Run Test 14.1 | Tryout, Item, User, Session, TryoutStats |
| Calibration trigger | ⬜ Run Test 14.1 | Triggers calibration job |
| AI generation toggle | ⬜ Run Test 14.1 | Updates AI_generation_enabled |
| Normalization reset | ⬜ Run Test 14.1 | Resets TryoutStats |
| WordPress auth integration | ⬜ Run Test 14.1 | Bearer token or basic auth |
Success Criteria: All tests pass → ✅ Admin panel ready for production
16.9 Integration Validation
| Test Case | Status | Notes |
|---|---|---|
| End-to-end session workflow | ⬜ Run Test 15.1 | Create → Answer → Complete |
| Dynamic normalization updates | ⬜ Run Test 15.2 | Updates after each session |
| Multi-site isolation | ⬜ Run Test 12.1 | website_id header validation |
| WordPress user sync | ⬜ Run Test 12.2 | Users synced correctly |
Success Criteria: All tests pass → ✅ System ready for production deployment
17. Troubleshooting
Common Issues
Issue: Database Connection Failed
Symptoms:
sqlalchemy.exc.DBAPIError: (psycopg2.OperationalError) could not connect to server
Solution:
# Verify PostgreSQL is running
pg_ctl status
# Verify database exists
psql postgres -c "\l"
# Check DATABASE_URL in .env
cat .env | grep DATABASE_URL
# Test connection manually (psql takes the plain postgresql:// scheme,
# without the +asyncpg driver suffix used in DATABASE_URL)
psql "postgresql://user:password@localhost:5432/irt_bank_soal"
Issue: Module Not Found (httpx, numpy, scipy)
Symptoms:
ModuleNotFoundError: No module named 'httpx'
Solution:
# Ensure virtual environment is activated
source venv/bin/activate # or equivalent
# Reinstall dependencies
pip3 install -r requirements.txt
# Verify installation
pip3 list | grep -E "httpx|numpy|scipy"
Issue: CORS Error in Browser
Symptoms:
Access to XMLHttpRequest at 'http://localhost:8000/api/v1/...' from origin 'null' has been blocked by CORS policy
Solution:
# Check ALLOWED_ORIGINS in .env
cat .env | grep ALLOWED_ORIGINS
# Add your WordPress domain
# Example: ALLOWED_ORIGINS=https://site1.com,https://site2.com,http://localhost:3000
# Restart server after changing .env
Issue: OpenRouter API Timeout
Symptoms:
httpx.TimeoutException: Request timed out after 30s
Solution:
# Check OPENROUTER_TIMEOUT in .env
cat .env | grep OPENROUTER_TIMEOUT
# Increase timeout (if needed)
# In .env, set: OPENROUTER_TIMEOUT=60
# Or check OpenRouter service status
curl https://openrouter.ai/api/v1/models
Issue: FastAPI Admin Not Accessible
Symptoms:
404 Not Found when accessing http://localhost:8000/admin
Solution:
# Verify admin is mounted in app/main.py
grep "mount.*admin" app/main.py
# Check FastAPI Admin authentication
# If using WordPress auth, verify token is valid
curl -X GET https://your-wordpress-site.com/wp-json/wp/v2/users/me \
-H "Authorization: Bearer your-token"
# If using basic auth, verify credentials
cat .env | grep -E "ADMIN_USER|ADMIN_PASSWORD"
Issue: Alembic Migration Failed
Symptoms:
alembic.util.exc.CommandError: Target database is not up to date
Solution:
# Check current migration version
alembic current
# Downgrade to previous version if needed
alembic downgrade <revision_id>
# Or create new migration
alembic revision -m "Manual fix"
Production Readiness Checklist
Before deploying to production, verify all items below are complete:
Critical Requirements (All Required)
- CTT scoring validates with exact Excel formulas (Test 7.1)
- IRT calibration coverage >80% (Test 9.1)
- Database schema with all tables, relationships, constraints
- FastAPI app with all routers and endpoints
- AI generation with OpenRouter integration
- WordPress integration with multi-site support
- Reporting system with all 4 report types
- Excel import/export with 100% data integrity
- CAT selection with adaptive algorithms
- Admin panel with FastAPI Admin
- Normalization management
Performance Requirements (Production)
- Database indexes created on all foreign key columns
- Connection pooling configured (pool_size=10, max_overflow=20)
- Async database operations throughout
- API response times <200ms for 95th percentile
- Calibration job completes within 5 minutes for 1000 items
Security Requirements (Production)
- HTTPS enabled on production server
- Environment-specific SECRET_KEY (not default "dev-secret-key")
- CORS restricted to production domains only
- WordPress JWT tokens stored securely (not in .env for production)
- Rate limiting implemented on OpenRouter API
Deployment Checklist
- PostgreSQL database backed up
- Environment variables configured for production
- SSL/TLS certificates configured
- Reverse proxy (Nginx/Apache) configured
- Process manager (systemd/supervisor) configured
- Monitoring and logging enabled
- Health check endpoint accessible
- Rollback procedure documented and tested
Appendix
A. API Endpoint Reference
Complete list of all API endpoints:
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Health check (minimal) |
| GET | /health | Health check (detailed) |
| POST | /api/v1/session/ | Create new session |
| GET | /api/v1/session/{session_id} | Get session details |
| POST | /api/v1/session/{session_id}/submit_answer | Submit answer |
| GET | /api/v1/session/{session_id}/next_item | Get next question |
| POST | /api/v1/session/{session_id}/complete | Complete session |
| GET | /api/v1/tryout/ | List tryouts |
| GET | /api/v1/tryout/{tryout_id} | Get tryout details |
| PUT | /api/v1/tryout/{tryout_id} | Update tryout config |
| GET | /api/v1/tryout/{tryout_id}/config | Get configuration |
| PUT | /api/v1/tryout/{tryout_id}/normalization | Update normalization |
| POST | /api/v1/tryout/{tryout_id}/calibrate | Trigger calibration |
| GET | /api/v1/tryout/{tryout_id}/calibration-status | Get calibration status |
| POST | /api/v1/import-export/preview | Preview Excel import |
| POST | /api/v1/import-export/questions | Import questions |
| GET | /api/v1/import-export/export/questions | Export questions |
| POST | /api/v1/admin/ai/generate-preview | AI preview |
| POST | /api/v1/admin/ai/generate-save | AI save |
| GET | /api/v1/admin/ai/stats | AI statistics |
| GET | /api/v1/admin/ai/models | List AI models |
| POST | /api/v1/wordpress/sync_users | Sync WordPress users |
| POST | /api/v1/wordpress/verify_session | Verify WordPress session |
| GET | /api/v1/wordpress/website/{website_id}/users | Get website users |
| POST | /api/v1/admin/{tryout_id}/calibrate | Admin: Calibrate all |
| POST | /api/v1/admin/{tryout_id}/toggle-ai-generation | Admin: Toggle AI |
| POST | /api/v1/admin/{tryout_id}/reset-normalization | Admin: Reset normalization |
| GET | /api/v1/reports/student/performance | Student performance |
| GET | /api/v1/reports/items/analysis | Item analysis |
| GET | /api/v1/reports/calibration/status | Calibration status |
| GET | /api/v1/reports/tryout/comparison | Tryout comparison |
| POST | /api/v1/reports/schedule | Schedule report |
| GET | /api/v1/reports/export/{schedule_id}/{format} | Export report |
B. Database Schema Reference
Tables:
- websites - WordPress site configuration
- users - WordPress user mapping
- tryouts - Tryout configuration and metadata
- items - Questions with CTT/IRT parameters
- sessions - Student tryout attempts
- user_answers - Individual question responses
- tryout_stats - Running statistics per tryout
Key Relationships:
- Websites (1) → Tryouts (N)
- Tryouts (1) → Items (N)
- Tryouts (1) → Sessions (N)
- Tryouts (1) → TryoutStats (1)
- Items (1) → UserAnswers (N)
- Sessions (1) → UserAnswers (N)
- Users (1) → Sessions (N)
Constraints:
- θ, b ∈ [-3, +3] (IRT parameters)
- NM, NN ∈ [0, 1000] (score ranges)
- ctt_p ∈ [0, 1] (CTT difficulty)
- bobot ∈ [0, 1] (CTT weight)
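These bounds can be enforced with a small guard before values are persisted. This is a hypothetical helper, not the project's actual validation code; field names are illustrative.

```python
# Sketch: range validation for the constraints listed above.

BOUNDS = {
    "theta": (-3.0, 3.0),   # IRT ability
    "irt_b": (-3.0, 3.0),   # IRT difficulty
    "NM": (0.0, 1000.0),    # raw score
    "NN": (0.0, 1000.0),    # normalized score
    "ctt_p": (0.0, 1.0),    # CTT difficulty
    "bobot": (0.0, 1.0),    # CTT weight
}

def validate(field, value):
    """Raise ValueError if value falls outside the allowed range for field."""
    lo, hi = BOUNDS[field]
    if not (lo <= value <= hi):
        raise ValueError(f"{field}={value} outside [{lo}, {hi}]")
    return value

print(validate("ctt_p", 0.5))      # 0.5
try:
    validate("irt_b", 4.2)
except ValueError as e:
    print(e)                        # irt_b=4.2 outside [-3.0, 3.0]
```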
Document End
Status: Ready for Testing and Validation
Next Steps:
- Complete all validation tests (Section 16)
- Verify production readiness checklist (Section 17)
- Deploy to production environment
- Monitor performance and calibration progress
Contact: For issues or questions, refer to PRD.md and project-brief.md