Installation

This guide covers the prerequisites, the supported hardware, installation from source, and a quick check that tritonBLAS works on your system.

Prerequisites#

Before installing tritonBLAS, ensure you have:

Python 3.10+
PyTorch 2.0+ (ROCm build)
ROCm 6.3.1+ HIP runtime
Triton (compatible version)
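
The list above can be partly checked from Python before you install anything. The following is a minimal sketch, not part of tritonBLAS: the helper names (`parse_major_minor`, `installed_version`, `check_prerequisites`) are made up for illustration, and it assumes PyTorch is installed under the distribution name `torch` and Triton is importable as `triton`.

```python
import importlib.util
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum from the prerequisites list
MIN_TORCH = (2, 0)    # minimum from the prerequisites list


def parse_major_minor(version):
    """Turn a version string such as '2.3.1+rocm6.0' into (2, 3)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites():
    """Return a dict mapping each checked prerequisite to True or False."""
    torch_version = installed_version("torch")
    report = {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        "torch": torch_version is not None
        and parse_major_minor(torch_version) >= MIN_TORCH,
        "torch_rocm_build": False,
        # Assumption: Triton is importable under the module name "triton".
        "triton": importlib.util.find_spec("triton") is not None,
    }
    if report["torch"]:
        import torch  # only imported once we know it is installed

        # torch.version.hip is a version string on ROCm builds, None otherwise.
        report["torch_rocm_build"] = getattr(torch.version, "hip", None) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:18s} {'OK' if ok else 'MISSING'}")
```

This sketch does not check the ROCm 6.3.1+ requirement or whether the installed Triton version is compatible with tritonBLAS; verify those separately.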

Supported Hardware#

tritonBLAS supports the following AMD GPUs:

GPU Model	Support Status
MI300X	Supported
MI300A	Supported
MI308X	Supported
MI350X	Supported
MI355X	Supported

Note: tritonBLAS is optimized for AMD Instinct MI300 and MI350 series GPUs.
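
To see whether the GPU in your machine appears in the table above, you can compare the device name PyTorch reports against the listed models. This is a rough sketch with hypothetical helper names (`match_supported_model`, `detect_gpu_name`); it assumes the reported name contains the model string (for example "MI300X"), which may not hold for every driver or virtualized setup.

```python
# Models taken from the support table above.
SUPPORTED_MODELS = ("MI300X", "MI300A", "MI308X", "MI350X", "MI355X")


def match_supported_model(device_name):
    """Return the supported model contained in device_name, or None."""
    upper = device_name.upper()
    for model in SUPPORTED_MODELS:
        if model in upper:
            return model
    return None


def detect_gpu_name():
    """Return the name of GPU 0 as reported by PyTorch, or None."""
    try:
        import torch
    except ImportError:
        return None
    # ROCm builds of PyTorch expose AMD GPUs through the torch.cuda API.
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_name(0)


if __name__ == "__main__":
    name = detect_gpu_name()
    if name is None:
        print("No GPU visible to PyTorch (or PyTorch is not installed).")
    else:
        model = match_supported_model(name)
        status = f"listed as supported ({model})" if model else "not in the support table"
        print(f"{name}: {status}")
```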

Installation#

Install tritonBLAS from source:

# Clone the repository
git clone https://github.com/ROCm/tritonBLAS.git
cd tritonBLAS

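# Install tritonBLAS in editable mode
pip3 install -e .

# Add the repository's include/ directory to PYTHONPATH for the current shell session
export PYTHONPATH="$(pwd)/include/${PYTHONPATH:+:$PYTHONPATH}"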
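
After these commands, it can be useful to confirm which copy of the package Python will import and that the `include/` directory is on `PYTHONPATH`. The following is an illustrative sketch only: `locate_module` and `pythonpath_entries` are hypothetical helpers, and the module name `tritonblas` is taken from the verification script in this guide.

```python
import importlib.util
import os


def locate_module(name):
    """Return the file a top-level module would be imported from.

    Returns None if the module is not found (namespace packages
    without an __init__.py also report None).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def pythonpath_entries():
    """Return the non-empty entries of the PYTHONPATH environment variable."""
    raw = os.environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    location = locate_module("tritonblas")
    if location is None:
        print("tritonblas is not importable from this environment.")
    else:
        # For an editable install this should point inside your clone.
        print(f"tritonblas resolves to: {location}")

    include_dirs = [
        entry for entry in pythonpath_entries()
        if os.path.basename(os.path.normpath(entry)) == "include"
    ]
    print(f"include/ entries on PYTHONPATH: {include_dirs or 'none'}")
```

Because the `export` only affects the current shell session, run this check in the same shell, or add the export to your shell profile if you need it to persist.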

Verifying Installation#

After installation, verify that tritonBLAS is working correctly:

import torch
import tritonblas

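# Create test matrices. ROCm builds of PyTorch expose AMD GPUs under the 'cuda' device name.
A = torch.randn(1024, 1024, dtype=torch.float16, device='cuda')
B = torch.randn(1024, 1024, dtype=torch.float16, device='cuda')

# Perform matrix multiplication and wait for the GPU to finish,
# so that any asynchronous kernel error surfaces here
C = tritonblas.matmul(A, B)
torch.cuda.synchronize()

print("tritonBLAS is working correctly!")
print(f"Result shape: {C.shape}")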
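
The script above confirms that the kernel runs and returns a tensor of the expected shape, but it does not compare the values against a reference. One possible extension, sketched below, compares the result with `torch.matmul`. The helper names (`relative_error`, `verify_matmul`) and the tolerance of `1e-2` are assumptions chosen for illustration, not values documented by tritonBLAS; float16 results from two different kernels are not expected to match bit for bit.

```python
import importlib.util


def relative_error(result, reference):
    """Largest absolute difference, scaled by the largest reference magnitude."""
    diff = (result.float() - reference.float()).abs().max().item()
    scale = reference.float().abs().max().item()
    return diff / scale if scale > 0 else diff


def verify_matmul(matmul_fn, size=1024, tolerance=1e-2):
    """Compare matmul_fn against torch.matmul on random float16 inputs.

    Returns (error, passed). The tolerance is an assumed, fairly loose bound.
    """
    import torch  # imported lazily so the helpers can be defined without PyTorch

    a = torch.randn(size, size, dtype=torch.float16, device="cuda")
    b = torch.randn(size, size, dtype=torch.float16, device="cuda")
    err = relative_error(matmul_fn(a, b), torch.matmul(a, b))
    return err, err <= tolerance


if __name__ == "__main__":
    missing = [m for m in ("torch", "tritonblas")
               if importlib.util.find_spec(m) is None]
    if missing:
        print(f"Skipping check, not installed: {', '.join(missing)}")
    else:
        import torch
        import tritonblas

        if not torch.cuda.is_available():
            print("Skipping check, no GPU visible to PyTorch.")
        else:
            err, passed = verify_matmul(tritonblas.matmul)
            print(f"relative error vs torch.matmul: {err:.2e} "
                  f"({'OK' if passed else 'above tolerance'})")
```

If the reported error is above the tolerance, treat that as a prompt to investigate rather than as proof of a bug, since the appropriate bound depends on matrix size and data type.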

Next Steps#

Now that you have tritonBLAS installed, you can:

Follow the Quick Start Guide to run your first example
Explore Examples for common use cases
Read the API Reference for detailed documentation
