🔍 Minimal examples of machine learning tests for implementation, behaviour, and performance.
Measure and visualize machine learning model performance without the usual boilerplate.
A High-level Scorecard Modeling API | Everything for scorecard modeling is here
A hands-on TensorFlow image recognition project teaching a computer to identify 10 everyday objects, originally for a linear algebra class, with tools to train a CNN, auto-tune settings, and test accuracy on random internet images.
Valor is a lightweight, numpy-based library designed for fast and seamless evaluation of machine learning models.
PolyCouncil is an open-source multi-model deliberation engine for LM Studio. It runs multiple LLMs in parallel, gathers their answers, scores each response using a shared rubric, and produces a final, consensus-driven result. Designed for testing, comparing, and orchestrating local models with ease.
Evaluate the performance of computer vision models and prompts for zero-shot models (Grounding DINO, CLIP, BLIP, DINOv2, ImageBind, models hosted on Roboflow)
🎓 2020 undergraduate graduation project at Jiangnan University. All code is included: Data-convert, keras-Train, model-Evaluate, and Web-App.
Detect and classify fraudulent transactions using SQL and Python. Generate behavioral features with SQLite, train a Logistic Regression model, and evaluate performance with AUC, precision, recall, and ROC analysis. A complete supervised fraud detection workflow.
Python tools for climate and air quality model evaluation
Open-source platform connecting AI assistants to government open data — MCP server, curated civic MCP directory, and anti-hallucination framework for all 559 Socrata portals
A comprehensive resource for machine learning interview preparation, featuring coding challenges, algorithm explanations, and practical Python examples. Covers supervised and unsupervised learning, model evaluation, and data preprocessing for technical interviews.
Titus 2 : Portable Format for Analytics (PFA) implementation for Python 3.4+
skrobot is a Python module for designing, running, and tracking machine learning experiments and tasks. It is built on top of the scikit-learn framework.
Rapid Evaluation Framework for climate data
CognitiveLens is a Streamlit-powered analytics tool for exploring alignment between human and AI decisions. It visualizes fairness, calibration, and interpretability through metrics like Cohen’s κ, AUC, and Brier score. Designed for ethical AI, bias auditing, and decision transparency in machine learning systems.
A complete end-to-end fraud detection system for financial transactions, featuring data pipelines, cost-sensitive ML modeling, explainability with SHAP, threshold optimization, batch scoring, and an interactive Streamlit dashboard. Designed to simulate real-world fintech fraud-risk workflows.
A complete machine-learning system that predicts AI assistant user satisfaction using behavioral signals such as device, usage category, time features, session metrics, and model metadata. Includes full ML pipeline, SHAP explainability, evaluation suite, and an interactive Streamlit analytics dashboard.
A decision-safety lab for loan approval: trains a baseline classifier, calibrates probabilities (ECE/Brier), sweeps confidence thresholds to build a coverage-quality frontier, and outputs a defensible abstention policy (auto-decide vs. review). Includes a Streamlit dashboard for report cards, triage UI, and data quality checks.
Open-source Python library for evaluating ML model reliability beyond accuracy — with calibration, failure, and fairness diagnostics for informed deployment decisions.
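Several of the projects listed above mention calibration metrics such as the Brier score and expected calibration error (ECE). As a rough illustration of what those metrics compute, here is a minimal standard-library sketch for binary labels. It is not taken from any of the listed repositories; the function names `brier_score` and `expected_calibration_error`, the equal-width binning, and the default of 10 bins are all assumptions made for this example.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 10
) -> float:
    """Weighted average gap between observed positive rate and mean predicted
    probability, computed over equal-width probability bins (an assumed scheme)."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and the same length")
    n = len(y_true)
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        positive_rate = sum(y for y, _ in members) / len(members)
        mean_prob = sum(p for _, p in members) / len(members)
        ece += (len(members) / n) * abs(positive_rate - mean_prob)
    return ece


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    probs = [0.9, 0.1, 0.8, 0.3]
    print("Brier score:", brier_score(labels, probs))
    print("ECE:", expected_calibration_error(labels, probs))
```

Lower is better for both metrics. The ECE value depends on the binning scheme, so numbers from this sketch are not directly comparable with those produced by libraries that bin differently.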