Gerolamo
Blink: CPU-Free LLM Inference by Delegating the Serving Stack to GPU and SmartNIC - 7/10 Defensibility