Features

Overview

OmniTrace is designed to be highly extensible. Internally, it uses the timemory performance analysis toolkit to manage extensions, resources, data, and similar concerns.

Data Collection Modes

  • Dynamic instrumentation

    • Runtime instrumentation

      • Instrument executables and shared libraries at runtime

    • Binary rewriting

      • Generate a new executable and/or library with instrumentation built in

  • Statistical sampling

    • Periodic software interrupts per-thread

  • Process-level sampling

    • Background thread records process-, system- and device-level metrics while the application executes

  • Causal profiling

    • Quantifies the potential impact of optimizations in parallel codes
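The process-level sampling mode listed above (a background thread recording metrics while the application runs) can be pictured with a small Python sketch. It is only an illustration of the idea, not OmniTrace's implementation: the names `sample_process` and `run_with_sampler` are hypothetical, the sampled metrics are limited to what the standard `resource` module exposes, and the code assumes a Unix-like system.

```python
import resource
import threading
import time


def sample_process(stop, samples, interval):
    """Append (timestamp, peak RSS, CPU seconds) tuples until `stop` is set."""
    while True:
        usage = resource.getrusage(resource.RUSAGE_SELF)
        samples.append(
            (time.monotonic(), usage.ru_maxrss, usage.ru_utime + usage.ru_stime)
        )
        # Event.wait returns True once the event is set, which ends the loop.
        if stop.wait(interval):
            break


def run_with_sampler(workload, interval=0.01):
    """Run `workload()` while a background thread samples the process."""
    stop = threading.Event()
    samples = []
    sampler = threading.Thread(
        target=sample_process, args=(stop, samples, interval), daemon=True
    )
    sampler.start()
    try:
        result = workload()
    finally:
        # Always stop the sampler, even if the workload raises.
        stop.set()
        sampler.join()
    return result, samples


result, samples = run_with_sampler(lambda: sum(i * i for i in range(200_000)))
print(f"workload result: {result}, samples collected: {len(samples)}")
```

Because the sampler lives in its own thread, the workload itself needs no modification. That is the main appeal of this mode compared with instrumentation.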

Data Analysis

  • High-level summary profiles with mean/min/max/stddev statistics

    • Low overhead, memory efficient

    • Ideal for running at scale

  • Comprehensive traces

    • Every individual event/measurement

  • Application speedup predictions resulting from potential optimizations in functions and lines of code (causal profiling)
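To illustrate why a summary profile with mean/min/max/stddev statistics can be more memory efficient than a comprehensive trace, the sketch below folds each measurement into a fixed-size accumulator using Welford's online algorithm instead of storing every event. This is an assumption about how such statistics could be maintained, not a description of OmniTrace or timemory internals. The class name `RunningStats` is hypothetical, and the choice of sample (rather than population) standard deviation is also an assumption.

```python
import math


class RunningStats:
    """Streaming mean/min/max/stddev in O(1) memory (Welford's algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.min = math.inf
        self.max = -math.inf

    def add(self, value):
        """Fold one measurement into the summary without storing it."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)
        self.min = min(self.min, value)
        self.max = max(self.max, value)

    @property
    def stddev(self):
        """Sample standard deviation (0.0 with fewer than two values)."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


# Hypothetical per-call durations (in milliseconds) for one function.
durations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for d in durations:
    stats.add(d)
print(stats.count, stats.mean, stats.min, stats.max, round(stats.stddev, 3))
```

A trace would keep all eight durations (and, in general, every event), whereas the accumulator keeps five numbers regardless of how many calls are recorded.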

Parallelism API Support

  • HIP

  • HSA

  • Pthreads

  • MPI

  • Kokkos-Tools (KokkosP)

  • OpenMP-Tools (OMPT)

GPU Metrics

  • GPU hardware counters

  • HIP API tracing

  • HIP kernel tracing

  • HSA API tracing

  • HSA operation tracing

  • System-level sampling (via rocm-smi)

    • Memory usage

    • Power usage

    • Temperature

    • Utilization

CPU Metrics

  • CPU hardware counter sampling and profiles

  • CPU frequency sampling

  • Various timing metrics

    • Wall time

    • CPU time (process and/or thread)

    • CPU utilization (process and/or thread)

    • User CPU time

    • Kernel CPU time

  • Various memory metrics

    • High-water mark (sampling and profiles)

    • Memory page allocation

    • Virtual memory usage

  • Network statistics

  • I/O metrics

  • … many more
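For readers unfamiliar with how the timing and memory metrics above differ, the following sketch gathers rough equivalents of several of them using only the Python standard library. It is not how OmniTrace collects these values. The function name `measure` and the dictionary keys are hypothetical, and the code assumes a Unix-like system where the `resource` module is available (on Linux, `ru_maxrss` is reported in kilobytes).

```python
import resource
import time


def measure(workload):
    """Run `workload()` and return a dict of timing and memory metrics."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()    # user + kernel CPU time, whole process
    thread_start = time.thread_time()  # user + kernel CPU time, calling thread
    before = resource.getrusage(resource.RUSAGE_SELF)

    workload()

    after = resource.getrusage(resource.RUSAGE_SELF)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return {
        "wall_time_s": wall,
        "process_cpu_time_s": cpu,
        "thread_cpu_time_s": time.thread_time() - thread_start,
        "user_cpu_time_s": after.ru_utime - before.ru_utime,
        "kernel_cpu_time_s": after.ru_stime - before.ru_stime,
        # CPU utilization: CPU time consumed per unit of wall time.
        "cpu_utilization_pct": 100.0 * cpu / wall if wall > 0 else 0.0,
        # High-water mark of resident memory for the process so far.
        "peak_rss": after.ru_maxrss,
    }


metrics = measure(lambda: sum(i * i for i in range(100_000)))
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Wall time includes any time spent waiting, while CPU time counts only time the process or thread was actually scheduled. Comparing the two is what the utilization figure expresses.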

Third-party API Support

  • TAU

  • LIKWID

  • Caliper

  • CrayPAT

  • VTune

  • NVTX

  • ROCTX