# bundle.perf_report

Extract Tracy profiler zone statistics from CSV exports, store them as structured HDF5 datasets keyed by version and platform, and generate PDF performance reports with automatic cross-version comparison.

## Public API

| Class | Module | Description |
|---|---|---|
| `ProfileExtractor` | `extractor.py` | Parse Tracy CSV files (or `.tracy` capture files) into structured `ProfileData` objects. |
| `ProfileRecord` | `extractor.py` | Single zone record (name, src_file, src_line, total_ns, total_perc, counts, mean_ns, min_ns, max_ns, std_ns). |
| `ProfileData` | `extractor.py` | All records from one CSV, with `name` and `total_calls` properties. |
| `ProfileStorage` | `storage.py` | Multi-version, multi-platform HDF5 storage via `bundle.hdf5.Store`. |

## Full pipeline

The recommended way to run profiling is through the test CLI:

```sh
bundle testing python pytest --perf
# or with a custom output directory:
bundle testing python pytest --perf --perf-output ./my-perf-dir
```

This runs the full pipeline automatically:

1. Starts `tracy-capture` in the background → `bundle..tracy`
2. Runs the test suite as a subprocess with `PERF_MODE=true` (Tracy hook active, logs silenced)
3. Waits for `tracy-capture` to finish writing when the subprocess exits
4. Exports the `.tracy` file to CSV via `tracy-csvexport`
5. Loads and stores profile data in HDF5 (`profiles.h5`)
6. Generates a PDF report (`bundle..pdf`)

Output files in `/perf/` (or `--perf-output`):

- `bundle..tracy`: raw Tracy capture
- `bundle..csv`: exported zone statistics
- `bundle..pdf`: performance report
- `profiles.h5`: historical HDF5 store

Prerequisites: `bundle tracy build` (builds and installs `tracy-capture` and `tracy-csvexport`).

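Before running the pipeline, it can help to confirm that the two external binaries named in the prerequisites are reachable. The following is a minimal sketch, not part of `bundle.perf_report`: the helper `missing_tools` is hypothetical, and it assumes that `bundle tracy build` places the binaries on `PATH`, which may not match your installation.

```python
import shutil

# Binary names taken from the prerequisites note above.
# Assumption: they are installed somewhere on PATH.
REQUIRED_TOOLS = ("tracy-capture", "tracy-csvexport")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing) + " (try `bundle tracy build`)")
    else:
        print("All Tracy tools found on PATH.")
```

If the binaries are installed outside `PATH`, the check reports them as missing even though the pipeline itself may still locate them.
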
## CLI

The module provides a standalone `generate` command via the bundle CLI:

```sh
# Auto-detect backend from input files
bundle perf-report generate -i perf/ -o perf/

# Explicitly select backend
bundle perf-report generate --backend tracy -i perf/ -o perf/
bundle perf-report generate --backend cprofile -i references/linux/cprofile/ -o perf/

# Custom PDF filename, skip HDF5
bundle perf-report generate -i perf/ -o perf/ --pdf-name my_report.pdf --no-h5
```

This auto-detects the profiler backend from input files (`.prof` → cProfile, `.csv`/`.tracy` → Tracy), saves profiling data to HDF5, auto-detects a previous version as baseline for comparison, and generates a PDF with per-profile charts and optional delta columns.

## Usage

### Extract profiles

```python
from bundle.perf_report import ProfileExtractor
from pathlib import Path

# From a Tracy capture file (runs tracy-csvexport internally)
profile = ProfileExtractor.extract_from_tracy(Path("bundle.1.0.0.tracy"))

# From an already-exported CSV
profile = ProfileExtractor.extract(Path("bundle.1.0.0.csv"))

# All CSV/Tracy files in a directory
profiles = ProfileExtractor.extract_all(Path("perf/"))

for rec in profile.records:
    print(f"{rec.name}: mean={rec.mean_ns}ns total={rec.total_ns}ns ({rec.counts} calls)")
```

### Store to HDF5

```python
from bundle.perf_report import ProfileStorage
from pathlib import Path

# Save with version + platform key
storage = ProfileStorage.from_directory(
    prof_dir=Path("perf/"),
    h5_path=Path("perf/profiles.h5"),
    machine_id="my-machine",
    bundle_version="1.5.0",
    platform_id="linux-x86_64-CPython3.12.8",
    platform_meta={"system": "linux", "arch": "x86_64", "processor": "..."},
)

# Discover stored data
versions = storage.list_versions()           # ["1.5.0", "1.5.1"]
platforms = storage.list_platforms("1.5.0")  # ["linux-x86_64-CPython3.12.8"]

# Read back
meta = storage.load_meta("1.5.0", "linux-x86_64-CPython3.12.8")
profiles = storage.load_profiles("1.5.0", "linux-x86_64-CPython3.12.8")
```

## Tracy CSV format

`tracy-csvexport` produces a CSV with one row per profiled zone:

| Column | Type | Description |
|---|---|---|
| `name` | str | Zone / function name |
| `src_file` | str | Source file path |
| `src_line` | int | Source line number |
| `total_ns` | int | Total time in all calls (nanoseconds) |
| `total_perc` | float | Percentage of total capture time |
| `counts` | int | Number of zone invocations |
| `mean_ns` | int | Mean time per call (nanoseconds) |
| `min_ns` | int | Minimum time per call (nanoseconds) |
| `max_ns` | int | Maximum time per call (nanoseconds) |
| `std_ns` | float | Standard deviation (nanoseconds) |

## HDF5 layout

```
///
  meta
    attrs: machine_id, platform_id, bundle_version, timestamp,
           system, arch, node, processor, python_version, ...
  profiles/
    structured dataset (name, src_file, src_line, total_ns, total_perc,
                        counts, mean_ns, min_ns, max_ns, std_ns)
    attrs: csv_path, total_calls
```

## Dependencies

- `bundle.hdf5` (HDF5 store)
- `bundle.latex` (PDF generation)
- `numpy`, `matplotlib`
- `click` (CLI)
- `tracy-csvexport` (external binary, installed via `bundle tracy build`)
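
For inspecting an exported CSV without installing any of the dependencies above, the column layout from the Tracy CSV format table can be read with the standard library alone. This is a minimal sketch, not how `ProfileExtractor` works internally: `ZoneRow` and `parse_zone_csv` are hypothetical names, the sample rows are invented for illustration, and the code assumes a comma-separated file whose header row matches the column names in the table.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class ZoneRow:
    """One CSV row; field names and types follow the format table."""
    name: str
    src_file: str
    src_line: int
    total_ns: int
    total_perc: float
    counts: int
    mean_ns: int
    min_ns: int
    max_ns: int
    std_ns: float


_INT_FIELDS = ("src_line", "total_ns", "counts", "mean_ns", "min_ns", "max_ns")
_FLOAT_FIELDS = ("total_perc", "std_ns")


def parse_zone_csv(text):
    """Parse CSV text with a header row into a list of ZoneRow objects."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        values = {"name": raw["name"], "src_file": raw["src_file"]}
        values.update({key: int(raw[key]) for key in _INT_FIELDS})
        values.update({key: float(raw[key]) for key in _FLOAT_FIELDS})
        rows.append(ZoneRow(**values))
    return rows


# Invented sample data, for illustration only.
SAMPLE = """name,src_file,src_line,total_ns,total_perc,counts,mean_ns,min_ns,max_ns,std_ns
load,app/io.py,12,3000,60.0,3,1000,900,1100,81.6
parse,app/io.py,40,2000,40.0,4,500,450,550,35.4
"""

zones = parse_zone_csv(SAMPLE)
slowest = max(zones, key=lambda z: z.total_ns)  # zone with the largest total time
total_calls = sum(z.counts for z in zones)      # sum of the counts column
print(f"{slowest.name}: {slowest.total_ns}ns total, {total_calls} calls overall")
```

To read a real export, pass the file contents instead of `SAMPLE`, for example `parse_zone_csv(Path("perf/export.csv").read_text())` with a path of your own.
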