Projects with this topic
Robot Framework test harness for LLM evaluation: deterministic grading, containerized execution, multi-model comparison, safety testing, test history, and native CI/CD integration.