int8-quantization

certainly-param / garuda-accelerator

Garuda: CVXIF coprocessor optimizing batch-1 attention microkernels with 7.5-9× lower p99 latency. RISC-V INT8 MAC accelerator for transformer inference.

machine-learning neural-network inference simd low-latency systemverilog attention-mechanism risc-v int8 systemverilog-hdl systolic-arrays edge-ai hardware-accelerator int8-quantization cva6 custom-instructions ai-hardware cvxif

Updated Jan 23, 2026
SystemVerilog

JohnClaw / chatllm.vb

VB.NET API wrapper for the LLM inference library chatllm.cpp

bindings api-wrapper llama vb-net vbnet gemma mistral int8 int8-inference int8-quantization cpu-inference chatllm ggml llm-inference qwen

Updated Nov 26, 2024
Visual Basic .NET

JohnClaw / chatllm.cs

C# API wrapper for the LLM inference library chatllm.cpp

csharp inference bindings api-wrapper llama gemma mistral int8 int8-inference int8-quantization cpu-inference llm llms chatllm ggml llm-inference qwen

Updated Nov 20, 2024
C#

dreuxx / Grammar-Checker-v2-BY-ML

Corrects your grammar in 5 languages directly in your browser. Powered by an open-source AI model.

javascript chrome-extension nlp machine-learning browser-extension collaborate student-project grammar-checker onnx int8-quantization t5-model github-copilot

Updated Jul 12, 2025
JavaScript

yester31 / TensorRT_ONNX

Generates a TensorRT model from an ONNX model

pytorch quantization tensorrt onnx int8-inference onnxruntime post-training-quantization int8-quantization tensorrt-inference ptq

Updated Jun 22, 2023
C++

Thehunk1206 / Arduino-Surrounding-Sound-Classifier

TinyML project. The system monitors your room or surroundings using the onboard microphone of the Arduino Nano 33 BLE Sense. Still under development.

arduino machine-learning deep-learning quantization tf-data audio-processing spectrograms tflite-conversion tflite-models arduino-nano-33-ble-sense int8-quantization tflite-micro

Updated Oct 18, 2021
Jupyter Notebook

ThunderFun / convert_to_quant_QuIP_INT8

A fork of convert_to_quant that adds QuIP quantization for INT‑8 models.

int8 int8-quantization

Updated Jun 2, 2026
Python

chayuto / yamnet-cry-distill-int8

Python ML for training a custom on-device cry model (knowledge-distilled from YAMNet, INT8, deployed on ESP32-S3)

tensorflow esp32 audio-classification knowledge-distillation audioset tflite cry-detection on-device-ml yamnet tinyml int8-quantization esp32-s3 embedded-ml

Updated May 3, 2026
Python

bauratynov / fastface

CPU face-embedding engine: 13 ms/face ArcFace INT8, 99.65% LFW 10-fold (beats FP32), 96 KB binary, 2.4x faster than ONNX Runtime. C99 + AVX-VNNI.

Updated Apr 21, 2026
C

Ammar-Wahidi / NPU

Silicon-proven INT8 systolic NPU (8×8 MAC array) taped out on SkyWater 130nm via LibreLane. Features a custom 32-bit ISA, UART–APB host interface, and fused streaming datapath. Validated on chest X-ray pneumonia detection. Silicon Sprint 2026 - AUC.

tapeout systolic-arrays npu openroad edge-ai asic-design medical-ai int8-quantization digital-ic-design librelane rtl-to-gdsii skywater-130nm

Updated May 24, 2026
Verilog

JohnClaw / gemma-2-2b-it.cs

gemma-2-2b-it INT8 CPU inference in a single file of pure C#

csharp inference quantization gemma int8 inference-engine model-serving int8-inference int8-quantization cpu-inference llm llms llm-serving llm-inference gemma2 gemma2-2b-it

Updated Jun 14, 2025
C#

olibartfast / model-optimizations

Scripts and tools for optimizing deep learning models

yolo quantization quantization-aware-training int8-quantization ultralytics

Updated May 27, 2026
Python

Here are 44 public repositories matching this topic...

willard-yuan / cvt

NJU-Jet / SR_Mobile_Quantization

GiorgosXou / NeuralNetworks

clovaai / frostnet

jahongir7174 / YOLOv8-qat

GiorgosXou / ATTiny85-MNIST-RNN-EEPROM

Howell-Yang / onnx2trt

ENOT-AutoDL / ENOT-transformers

certainly-param / garuda-accelerator

JohnClaw / chatllm.vb

JohnClaw / chatllm.cs

dreuxx / Grammar-Checker-v2-BY-ML

yester31 / TensorRT_ONNX

Thehunk1206 / Arduino-Surrounding-Sound-Classifier

ThunderFun / convert_to_quant_QuIP_INT8

chayuto / yamnet-cry-distill-int8

bauratynov / fastface

Ammar-Wahidi / NPU

JohnClaw / gemma-2-2b-it.cs

olibartfast / model-optimizations
