RAG System Evaluation Tool

Professional evaluation framework with weighted scoring (100-point scale)

Individual Evaluation Form

Instructions: Enter a score for each metric, from 0 up to that metric's maximum points. The system calculates the total score automatically. When you are done, export the form to Excel for later batch analysis.

| Evaluation Dimension | Metric | Weight (%) | Score (0-Max Points) | Weighted Score |
|---|---|---|---|---|
| Accuracy | Factual & Technical Accuracy | 25 | Max: 25 | 0.00 |
| Accuracy | Standards Compliance (ISO, DIN, etc.) | 15 | Max: 15 | 0.00 |
| Completeness | Coverage of Key Information | 15 | Max: 15 | 0.00 |
| Completeness | Provision of Context & Rationale | 10 | Max: 10 | 0.00 |
| Completeness | Directness & Relevance of Answer | 5 | Max: 5 | 0.00 |
| Usability & Relevance | Clarity & Professional Terminology | 10 | Max: 10 | 0.00 |
| Usability & Relevance | Readability & Formatting | 5 | Max: 5 | 0.00 |
| Safety & Reliability | Safety Warnings & Best Practices | 10 | Max: 10 | 0.00 |
| Safety & Reliability | Refusal of Inappropriate/Unsafe Queries | 5 | Max: 5 | 0.00 |
| Total Score (100-point scale) | | 100 | | 0.00 |
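
The automatic total can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the weighted score of a metric equals the raw score entered (each metric's maximum equals its weight, so the maxima sum to 100), and that a blank entry counts as 0. The names `MAX_POINTS` and `total_score` are hypothetical.

```python
# Maximum points per metric, copied from the evaluation form above.
MAX_POINTS = {
    "Factual & Technical Accuracy": 25,
    "Standards Compliance (ISO, DIN, etc.)": 15,
    "Coverage of Key Information": 15,
    "Provision of Context & Rationale": 10,
    "Directness & Relevance of Answer": 5,
    "Clarity & Professional Terminology": 10,
    "Readability & Formatting": 5,
    "Safety Warnings & Best Practices": 10,
    "Refusal of Inappropriate/Unsafe Queries": 5,
}


def total_score(scores):
    """Check each score against its maximum and return the 100-point total."""
    total = 0.0
    for metric, max_points in MAX_POINTS.items():
        # Assumption: a metric with no entry is treated as a score of 0.
        score = float(scores.get(metric, 0.0))
        if not 0.0 <= score <= max_points:
            raise ValueError(f"{metric}: {score} is outside 0-{max_points}")
        total += score
    return round(total, 2)


example = {"Factual & Technical Accuracy": 20, "Readability & Formatting": 4.5}
print(f"{total_score(example):.2f}")  # 24.50
```

Rejecting out-of-range values, rather than clamping them, keeps data-entry mistakes visible before the form is exported.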

Import Multiple Evaluation Files

Instructions: Upload multiple Excel files (up to 50) containing individual evaluation scores. The system will aggregate all scores for comprehensive analysis.

Supports .xlsx and .xls formats (Maximum 50 files)
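
The upload limits described here (only .xlsx and .xls files, at most 50 per import) could be checked before any file is parsed. The sketch below is an assumption about how such a check might look. `validate_selection` is a hypothetical helper, and matching extensions case-insensitively is an assumption not stated in the source.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".xlsx", ".xls"}  # formats named in the upload instructions
MAX_FILES = 50                          # upper limit named in the upload instructions


def validate_selection(filenames):
    """Return the selection unchanged if it satisfies the stated upload limits."""
    if len(filenames) > MAX_FILES:
        raise ValueError(f"{len(filenames)} files selected; the maximum is {MAX_FILES}")
    # Assumption: extensions are compared case-insensitively (".XLSX" is accepted).
    rejected = [
        name for name in filenames
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS
    ]
    if rejected:
        raise ValueError(f"Unsupported file type: {', '.join(rejected)}")
    return list(filenames)


print(validate_selection(["eval_01.xlsx", "eval_02.xls"]))
```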

Comprehensive Evaluation Analysis

Results: Aggregated analysis from all imported evaluation files. The "Average Total Score" is the final 100-point scale score, averaged across all files.
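
As a rough illustration of the summary figures, the sketch below assumes each imported file has already been reduced to a single 100-point total. `summarize` is a hypothetical name, and returning zeros for an empty import is an assumption that mirrors the dashboard's initial 0 / 0.00 state.

```python
from statistics import mean


def summarize(totals):
    """Aggregate per-file total scores into the four dashboard figures."""
    if not totals:
        # Assumption: with nothing imported, every figure is shown as zero.
        return {"files_analyzed": 0, "average": 0.0, "highest": 0.0, "lowest": 0.0}
    return {
        "files_analyzed": len(totals),
        "average": round(mean(totals), 2),  # the "Average Total Score"
        "highest": max(totals),
        "lowest": min(totals),
    }


print(summarize([80.0, 90.0, 70.5]))
# {'files_analyzed': 3, 'average': 80.17, 'highest': 90.0, 'lowest': 70.5}
```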

Radar Chart Logic: The chart shows the average score for each metric as a **percentage** of its maximum possible score. This allows for a fair comparison between metrics with different weights.

Example: If 'Factual & Technical Accuracy' (Max 25 points) has an average score of 20, the chart will display **80%** (20 / 25 = 0.8).
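
The normalization in the example above can be sketched as follows. This is a minimal sketch that assumes each evaluation is available as a mapping from metric name to raw score. `radar_percentages` is a hypothetical name, and only two metrics are included to keep the example short.

```python
def radar_percentages(evaluations, max_points):
    """Average each metric across evaluations, as a percentage of its maximum."""
    percentages = {}
    for metric, maximum in max_points.items():
        if not evaluations:
            percentages[metric] = 0.0  # assumption: nothing imported plots as 0%
            continue
        average = sum(e[metric] for e in evaluations) / len(evaluations)
        percentages[metric] = round(100 * average / maximum, 1)
    return percentages


max_points = {"Factual & Technical Accuracy": 25, "Readability & Formatting": 5}
evaluations = [
    {"Factual & Technical Accuracy": 18, "Readability & Formatting": 4},
    {"Factual & Technical Accuracy": 22, "Readability & Formatting": 5},
]
result = radar_percentages(evaluations, max_points)
print(result)
# {'Factual & Technical Accuracy': 80.0, 'Readability & Formatting': 90.0}
```

Dividing by each metric's own maximum is what puts a 25-point metric and a 5-point metric on the same 0-100% axis.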

Files Analyzed: 0

Average Total Score: 0.00

Highest Score: 0.00

Lowest Score: 0.00