TerrierGrader

Academic AI

Production evaluation pipeline with rubric-aligned generative assessment and self-consistency verification. Retrieval-augmented feedback synthesis across configurable assessment criteria. 500+ documents per evaluation cycle, 3 institutional deployments, 60% overhead reduction.

Python, FastAPI, Claude API, LangChain, React, PostgreSQL, Redis, NLP, RAG

About

TerrierGrader is a production evaluation pipeline deployed at 3 institutions, processing 500+ documents per evaluation cycle. The architecture addresses four core challenges: keeping assessments consistent, grounding generative feedback in configurable rubric criteria, balancing throughput against feedback granularity, and supporting human-in-the-loop overrides without slowing the pipeline.
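As a rough illustration of what grounding feedback in configurable rubric criteria can look like, the sketch below treats the rubric as plain data and builds the grading prompt from it. The `Criterion` schema, the `build_prompt` function, and the sample rubric are all hypothetical names chosen for this example; they are not taken from the TerrierGrader codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric criterion (hypothetical schema, assumed for illustration)."""
    name: str
    description: str
    max_points: int


def build_prompt(submission: str, rubric: list[Criterion]) -> str:
    """Assemble a grading prompt that lists every criterion explicitly."""
    lines = ["Assess the submission against each criterion below.", ""]
    for c in rubric:
        lines.append(f"- {c.name} (0-{c.max_points}): {c.description}")
    lines += ["", "Submission:", submission]
    return "\n".join(lines)


# Swapping the rubric changes the prompt; the function itself is untouched.
essay_rubric = [
    Criterion("Thesis", "States a clear, arguable claim.", 5),
    Criterion("Evidence", "Supports claims with cited sources.", 10),
]
prompt = build_prompt("Sample essay text.", essay_rubric)
```

Because the criteria live in data rather than in the prompt template, a different deployment could supply its own list of criteria without any change to the surrounding code.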

The Problem

High-throughput evaluation environments face a fundamental scaling constraint. Manual assessment at scale results in delayed feedback cycles, inconsistent evaluation across parallel tracks, and reduced granularity under time pressure.

The Approach

The system addresses the consistency-throughput tradeoff through rubric-aligned generation with self-consistency verification at the output boundary. Uncertain evaluations are flagged for human review. Configurable assessment criteria allow adaptation across deployment targets without architectural changes.
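One common way to implement self-consistency verification is to sample the grader several times and measure how often the samples agree. The sketch below assumes that approach; `grade_once`, the sample count, and the 0.6 agreement threshold are hypothetical stand-ins, not the project's actual interface or settings.

```python
from collections import Counter
from typing import Callable


def self_consistent_score(
    grade_once: Callable[[], int],
    samples: int = 5,
    min_agreement: float = 0.6,
) -> dict:
    """Sample the grader several times and flag low-agreement results.

    grade_once is a hypothetical zero-argument callable standing in for
    one model call that returns an integer score.
    """
    scores = [grade_once() for _ in range(samples)]
    # Majority vote: the most frequent score and how many samples chose it.
    top_score, votes = Counter(scores).most_common(1)[0]
    agreement = votes / samples
    return {
        "score": top_score,
        "agreement": agreement,
        "needs_review": agreement < min_agreement,
    }


# Deterministic stand-ins for model calls, used only for illustration.
stable = iter([4, 4, 4, 4, 3])
unstable = iter([1, 2, 3, 4, 5])
confident = self_consistent_score(lambda: next(stable))
uncertain = self_consistent_score(lambda: next(unstable))
```

In this sketch the first evaluation reaches 80% agreement and passes through, while the second has no majority and is marked `needs_review`, which is the point where a human reviewer would take over.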

Tech Stack

Frontend: React 18, TypeScript, TailwindCSS, Vite
Backend: Python 3.11, FastAPI, PostgreSQL, Redis, Celery
AI/ML: Claude API, LangChain, Sentence Transformers, RAG Pipeline, Custom Rubric Engine

Apply

Apply by April 30, 2026

You'll learn

RAG Systems
Prompt Engineering
NLP
FastAPI
Production Deployment
Self-Consistency Checking

Open roles

Applied AI Engineer

Build and extend RAG evaluation pipelines with self-consistency verification. Optimize rubric-aligned generation and feedback quality at scale.

Backend Engineer

Design API layers, document ingestion, async evaluation queues, and instructor override workflows.

Frontend Engineer

Build evaluation dashboards with real-time feedback rendering and clear UX for instructors.

Data Engineer

Build document processing from raw upload to structured representation for the evaluation pipeline.

AI Researcher

Benchmark rubric-aligned RAG and self-consistency; design evaluation datasets, ablation studies, and methodology for institutional deployments.

DevOps / MLOps

Maintain CI/CD, deployment pipelines, and monitoring for FastAPI, Celery, Redis, and production APIs.
