FlyDSL Documentation
FlyDSL is a Python DSL and MLIR compiler stack for authoring high-performance GPU kernels with explicit layout algebra, targeting AMD ROCm/HIP GPUs.
FlyDSL (Flexible Layout Python DSL) is the Python front-end to the
Fly dialect, an MLIR-native compiler stack with first-class layout IR types
(!fly.int_tuple, !fly.layout, !fly.coord_tensor, !fly.memref),
explicit layout algebra and coordinate mapping, and a composable lowering
pipeline to GPU/ROCDL.
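For readers new to layout algebra, the core idea of coordinate mapping can be illustrated in plain Python. This is a minimal sketch that does not use the FlyDSL API: the helper `layout_offset` and the shape and stride values are hypothetical, chosen only for illustration. It assumes the usual shape/stride view of a layout, in which a logical coordinate maps to a linear memory offset through its strides.

```python
# Illustrative only: plain Python, not the FlyDSL API.
# `layout_offset` is a hypothetical helper name.

def layout_offset(coord, stride):
    """Map a logical coordinate to a linear offset: sum(c_i * d_i)."""
    if len(coord) != len(stride):
        raise ValueError("coord and stride must have the same rank")
    return sum(c * d for c, d in zip(coord, stride))


# The same logical 4x8 tile under two different stride choices.
shape = (4, 8)
row_major = (8, 1)   # stepping along a row moves by 1 element
col_major = (1, 4)   # stepping down a column moves by 1 element

print(layout_offset((2, 3), row_major))  # 2*8 + 3*1 = 19
print(layout_offset((2, 3), col_major))  # 2*1 + 3*4 = 14
```

The same logical coordinate lands at different offsets depending on the strides. A layout makes that mapping explicit, rather than leaving it buried in index arithmetic.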
Getting Started
Guides
- Architecture & Compilation Pipeline Guide
- Layout Algebra Guide
  - Quick Reference
  - 1. Core Types
  - 2. Construction
  - 3. Coordinate Mapping
  - 4. Query Operations
  - 5. Layout Algebra
  - 6. Product Operations
  - 7. Divide Operations
  - 8. Structural Operations
  - 9. MemRef / View / Copy Operations
  - 10. Nested / Hierarchical Layouts
  - 11. IntTuple Arithmetic
  - 12. Printf Debugging
  - 13. Decision Tree
  - 14. Source Files
- Kernel Authoring Guide
  - Quick Reference
  - 1. Basic Kernel Pattern
  - 2. Parameter Types
  - 3. Thread / Block Hierarchy
  - 4. Expression API (flydsl.expr)
  - 5. Control Flow
  - 6. Shared Memory (LDS)
  - 7. Launch Configuration
  - 8. Synchronization
  - 9. Compilation & Caching
  - 10. Debugging
  - 11. Complete Example: Preshuffle GEMM
  - 12. Decision Tree
  - 13. Source Files
- Pre-built Kernel Library Guide
- Testing & Benchmarking Guide
- CuTe Layout Algebra Reference for FlyDSL
API Reference
Tutorials