Yet another reimplementation of jetson-containers, targeting Jetson Thor, Spark, and x86.
Hackable and optimized Transformers building blocks, supporting a composable construction.
FlashInfer: Kernel Library for LLM Serving
Open deep learning compiler stack for CPU, GPU, and specialized accelerators
A high-throughput and memory-efficient inference and serving engine for LLMs
Machine Learning Containers for NVIDIA Jetson and JetPack-L4T