Official NVIDIA high-performance inference optimization library for Large Language Models on NVIDIA hardware, providing advanced kernels, quantization, and orchestration.
Stars: 13,319 · Forks: 2,263
TensorRT-LLM is the gold standard for LLM inference on NVIDIA hardware. Its position is defensible because it is maintained by the hardware manufacturer itself, with deep architectural access that third parties cannot easily replicate. Frontier labs are strategic partners and users rather than competitors, since they rely on this stack to maximize the ROI of their H100/B200 clusters.
TECH STACK
INTEGRATION: library_import
READINESS