An unofficial, simple WeChat Pay gem
Yet another reimplementation of jetson-containers, targeting Jetson Thor, Spark, and x86.
Fast and memory-efficient exact attention
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A high-throughput and memory-efficient inference and serving engine for LLMs
8-bit CUDA functions for PyTorch
Hackable and optimized Transformers building blocks, supporting a composable construction.
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications...
Collection of crates used in Parity projects