#model-evaluation

Here are 325 public repositories matching this topic...

A hands-on TensorFlow image recognition project teaching a computer to identify 10 everyday objects, originally for a linear algebra class, with tools to train a CNN, auto-tune settings, and test accuracy on random internet images.

  • Updated Mar 20, 2026
  • Python

PolyCouncil is an open-source multi-model deliberation engine for LM Studio. It runs multiple LLMs in parallel, gathers their answers, scores each response using a shared rubric, and produces a final, consensus-driven result. Designed for testing, comparing, and orchestrating local models with ease.

  • Updated Mar 24, 2026
  • Python

Detect and classify fraudulent transactions using SQL and Python. Generate behavioral features with SQLite, train a Logistic Regression model, and evaluate performance with AUC, precision, recall, and ROC analysis. A complete supervised fraud detection workflow.

  • Updated Oct 21, 2025
  • Python
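
The evaluation metrics named in the description above (precision, recall, AUC) can be illustrated with a small dependency-free sketch. This is not the repository's code; the function names `precision_recall` and `roc_auc`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = fraud, 0 = legitimate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half). O(n_pos * n_neg), fine for small data."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(precision_recall(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Precision and recall depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why a fraud workflow typically reports both.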

A comprehensive resource for machine learning interview preparation, featuring coding challenges, algorithm explanations, and practical Python examples. Covers supervised and unsupervised learning, model evaluation, and data preprocessing for technical interviews.

  • Updated May 22, 2025
  • Python

CognitiveLens is a Streamlit-powered analytics tool for exploring alignment between human and AI decisions. It visualizes fairness, calibration, and interpretability through metrics like Cohen’s κ, AUC, and Brier score. Designed for ethical AI, bias auditing, and decision transparency in machine learning systems.

  • Updated Nov 5, 2025
  • Python
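
Two of the metrics mentioned above, Cohen's κ (agreement between two raters beyond chance) and the Brier score (mean squared error of predicted probabilities), are short enough to sketch directly. The code below is an illustrative sketch only, not CognitiveLens's implementation; the function names and the toy human/model decisions are assumptions.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:  # both raters always use the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Hypothetical decisions: 1 = approve, 0 = reject.
human = [1, 1, 0, 0]
model = [1, 0, 0, 0]
model_probs = [1.0, 0.5, 0.5, 0.0]  # assumed model probabilities of "approve"

print(cohen_kappa(human, model))
print(brier_score(human, model_probs))
```

A κ of 0 means agreement no better than chance and 1 means perfect agreement; a lower Brier score indicates better-calibrated, sharper probabilities.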

A complete end-to-end fraud detection system for financial transactions, featuring data pipelines, cost-sensitive ML modeling, explainability with SHAP, threshold optimization, batch scoring, and an interactive Streamlit dashboard. Designed to simulate real-world fintech fraud-risk workflows.

  • Updated Dec 4, 2025
  • Python

A complete machine-learning system that predicts AI assistant user satisfaction using behavioral signals such as device, usage category, time features, session metrics, and model metadata. Includes a full ML pipeline, SHAP explainability, an evaluation suite, and an interactive Streamlit analytics dashboard.

  • Updated Dec 5, 2025
  • Python

A decision-safety lab for loan approval: trains a baseline classifier, calibrates probabilities (ECE/Brier), sweeps confidence thresholds to build a coverage-quality frontier, and outputs a defensible abstention policy (auto-decide vs. review). Includes a Streamlit dashboard for report cards, triage UI, and data quality checks.

  • Updated May 30, 2026
  • Python
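
The calibration check and threshold sweep described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the function names are hypothetical, "quality" is taken to mean accuracy on the auto-decided cases, confidence is taken as `max(p, 1 - p)` for a binary classifier, and the data is made up.

```python
def expected_calibration_error(y_true, probs, n_bins=10):
    """ECE: weighted average gap between confidence and accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, probs):
        pred = 1 if p >= 0.5 else 0
        conf = max(p, 1 - p)  # confidence in the predicted class
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, pred == t))
    n = len(y_true)
    ece = 0.0
    for b in bins:
        if b:
            avg_conf = sum(c for c, _ in b) / len(b)
            acc = sum(ok for _, ok in b) / len(b)
            ece += len(b) / n * abs(acc - avg_conf)
    return ece


def coverage_accuracy_frontier(y_true, probs, thresholds):
    """For each threshold, auto-decide only cases with confidence >= threshold;
    the rest would go to human review. Returns (threshold, coverage, accuracy)."""
    rows = []
    for th in thresholds:
        decided = [(t, 1 if p >= 0.5 else 0)
                   for t, p in zip(y_true, probs) if max(p, 1 - p) >= th]
        coverage = len(decided) / len(y_true)
        accuracy = (sum(t == d for t, d in decided) / len(decided)
                    if decided else None)
        rows.append((th, coverage, accuracy))
    return rows


# Hypothetical data: 1 = loan repaid, probs = predicted P(repaid).
y_true = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]

frontier = coverage_accuracy_frontier(y_true, probs, [0.5, 0.75])
print(frontier)
print(expected_calibration_error(y_true, probs))
```

Raising the threshold trades coverage for accuracy on the auto-decided cases; an abstention policy amounts to picking the point on this frontier that meets a required quality level.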
TrustLens is an open-source Python library for evaluating ML model reliability beyond accuracy, with calibration, failure, and fairness diagnostics for informed deployment decisions.

  • Updated Jun 3, 2026
  • Python
