Co-Location of CPU and GPU Workloads with High Resource Efficiency - Penghao Cen & Jian He
GPU Sharing for Machine Learning Workload on Kubernetes - Henry Zhang & Yang Yu, VMware
How The Massive Power Draw Of Generative AI Is Overtaxing Our Grid
Scaling AI Inference Workloads with GPUs and Kubernetes - Renaud Gaubert & Ryan Olson, NVIDIA
USENIX ATC '20 - FineStream: Fine-Grained Window-Based Stream Processing on CPU-GPU Integrated...
Put Ai Deep Learning Server with 8 x RTX 4090
The Path to GPU as a Service in Kubernetes - Renaud Gaubert, NVIDIA (Intermediate Skill Level)
[HPCA 2018] Efficient and Fair Multi-programming in GPUs via Effective Bandwidth Management
State of the GPU(nion) by Rudi Chiarito, Clarifai
What is a Data Center?
A full-scenario colocation of workloads based on Kubernetes - Dongdong Chen & Lingpeng Chen, Tencent
Minimizing GPU Cost for Your Deep Learning on Kubernetes - Kai Zhang & Yang Che, Alibaba
Ana Klimovic - Scalable Input Data Processing for Resource-Efficient Machine Learning
How does HPE’s GPU-as-a-Service (GPUaaS) solution work? Watch this demo
Networking Optimizations for Multi-Node Deep Learning on Kubernetes - Rajat Chopra & Erez Cohen
USENIX ATC '13 - DeepDive: Transparently Identifying and Managing Performance Interference
USENIX ATC '20 - Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing
OSDI '20 - Twine: a Unified Cluster Management System for Shared Infrastructure
FAST '23 - Intelligent Resource Scheduling for Co-located Latency-critical Services: A Multi-Model
[ENG] Alexander Kanevskiy: "Using advanced hardware platform features in Kubernetes" / #LinuxPiter