Talks and Papers

This page collects publications, conference talks, and videos related to Iris.

Papers

Iris: First-Class Multi-GPU Programming Experience in Triton

Muhammad Awad, Muhammad Osama, Brandon Potter — arXiv, November 2025

Introduces the Iris framework and its SHMEM-like Remote Memory Access (RMA) APIs for multi-GPU programming inside Triton kernels, demonstrating programmability and competitive performance on AMD MI300X GPUs.

BibTeX

@misc{Awad:2025:IFM,
  author        = {Muhammad Awad and Muhammad Osama and Brandon Potter},
  title         = {Iris: First-Class Multi-{GPU} Programming Experience in {Triton}},
  year          = {2025},
  archivePrefix = {arXiv},
  eprint        = {2511.12500},
  primaryClass  = {cs.DC},
  doi           = {10.48550/arXiv.2511.12500}
}

Eliminating Multi-GPU Performance Taxes: A Systems Approach to Efficient Distributed LLMs

Octavian Alexandru Trifan, Karthik Sangaiah, Muhammad Awad, Muhammad Osama, Sumanth Gudaparthi, Alexandru Nicolau, Alexander Veidenbaum, Ganesh Dasika — arXiv, November 2025

Presents a systems-level approach for reducing communication overhead in distributed large language model inference, leveraging Iris for fine-grained GPU-to-GPU data movement.

BibTeX

@misc{Trifan:2025:EMT,
  author        = {Octavian Alexandru Trifan and Karthik Sangaiah and Muhammad Awad and Muhammad Osama and Sumanth Gudaparthi and Alexandru Nicolau and Alexander Veidenbaum and Ganesh Dasika},
  title         = {Eliminating Multi-{GPU} Performance Taxes: A Systems Approach to Efficient Distributed {LLMs}},
  year          = {2025},
  archivePrefix = {arXiv},
  eprint        = {2511.02168},
  primaryClass  = {cs.DC},
  doi           = {10.48550/arXiv.2511.02168}
}

Software Citation

If you use the Iris software directly, please also cite the software release:

@software{Awad:2025:IFM:Software,
  author        = {Muhammad Awad and Muhammad Osama and Brandon Potter},
  title         = {Iris: First-Class Multi-{GPU} Programming Experience in {Triton}},
  year          = {2025},
  month         = oct,
  doi           = {10.5281/zenodo.17382307},
  url           = {https://github.com/ROCm/iris}
}

Talks and Videos

Iris at GPU Mode — September 2025

Iris was presented at the GPU Mode meetup. The talk covers the design of the RMA API, the symmetric heap, and performance results on multi-GPU workloads.


Iris All-Scatter Taxonomy — August 2025

A deep-dive video on the taxonomy of multi-GPU programming patterns, with a focus on All-Scatter and GEMM + communication overlap.


Iris Presented in Chinese — September 2025

Iris was presented in Chinese for participants of the AMD Distributed Inference Kernel Contest.