Atharva Sawant 852957a7de Initialized MiniProfiler project
- Contains the host code with a protocol implementation, data analyser and web-based visualiser
2025-11-27 20:34:41 +05:30

MiniProfiler

A real-time profiling tool for embedded STM32 applications, built on GCC's -finstrument-functions instrumentation.

Features

  • Embedded Profiling: Automatic function instrumentation using GCC hooks
  • Real-time Visualization: Live flame graphs, timelines, and statistics
  • Low Overhead: DMA-based UART transmission, with a target of <5% performance impact (to be measured in Phase 3)
  • Symbol Resolution: Automatic function name resolution from ELF/DWARF debug info
  • Web Interface: Modern, responsive web UI with multiple visualization modes

Architecture

MiniProfiler consists of two main components:

1. Embedded Module (STM32)

  • Uses __cyg_profile_func_enter/exit hooks to capture function calls
  • Lock-free ring buffer for storing profiling data
  • UART/Serial communication with host
  • Small memory footprint (~2-10 KB of RAM for the profiling buffer)

2. Host Application (Python)

  • Serial communication and protocol parsing
  • ELF/DWARF symbol resolution
  • Web server with real-time updates (Flask + SocketIO)
  • Three visualization modes:
    • Flame Graph: Aggregate CPU time by function
    • Timeline: Execution over time (flame chart)
    • Statistics: Call counts, min/max/avg durations
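The Statistics mode reduces to a per-function aggregation over resolved records. The sketch below is illustrative only and is not taken from the host code: the function name `summarize` and the `(name, duration_us)` input shape are assumptions, and it treats each duration as reported, without deciding whether time spent in callees is included.

```python
from collections import defaultdict


def summarize(records):
    """Aggregate (function_name, duration_us) pairs into per-function stats.

    Hypothetical helper: the real analyser may use different names and fields.
    """
    durations = defaultdict(list)
    for name, duration_us in records:
        durations[name].append(duration_us)

    return {
        name: {
            "calls": len(values),
            "min_us": min(values),
            "max_us": max(values),
            "avg_us": sum(values) / len(values),
            "total_us": sum(values),
        }
        for name, values in durations.items()
    }


stats = summarize([("main", 900), ("read_adc", 120), ("read_adc", 80)])
print(stats["read_adc"])
```

The `total_us` field is the kind of value a flame graph would use for frame width, while the timeline view needs the individual records rather than the aggregate.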

Quick Start

Installation

cd host
pip install -r requirements.txt

Or install as a package:

cd host
pip install -e .

Running the Host Application

# Using the installed CLI
miniprofiler

# Or directly with Python
python -m miniprofiler.cli

# With custom host/port
miniprofiler --host localhost --port 8080

# With verbose logging
miniprofiler --verbose

Testing Without Hardware

Generate sample profiling data to test the visualization:

cd host/tests
python sample_data_generator.py

This creates:

  • sample_profile_data.bin - Binary protocol data
  • sample_flamegraph.json - Flame graph data
  • sample_statistics.json - Statistics data
  • sample_timeline.json - Timeline data

Using the Web Interface

  1. Start the host application: miniprofiler
  2. Open a browser at http://localhost:5000 (or the port passed with --port)
  3. Enter the serial port (e.g., /dev/ttyUSB0 or COM3)
  4. Optionally provide the path to the .elf file for symbol resolution
  5. Click Connect
  6. Click Start Profiling
  7. View real-time profiling data in the three visualization tabs
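The symbol resolution in step 4 amounts to mapping each recorded function address to the symbol whose address range contains it. The following is a minimal sketch of that lookup, not the project's implementation: the symbol table is hard-coded and hypothetical (in practice it would be read from the .elf file), and the code assumes addresses may carry the Thumb bit (bit 0), which is set on function symbols for Cortex-M targets.

```python
import bisect

# Hypothetical symbol table: (start_address, size_in_bytes, name), sorted by address.
SYMBOLS = [
    (0x08000100, 0x40, "main"),
    (0x08000140, 0x20, "read_adc"),
    (0x08000160, 0x80, "process_sample"),
]
STARTS = [start for start, _size, _name in SYMBOLS]


def resolve(addr):
    """Return the name of the function containing addr, or a hex fallback."""
    addr &= ~1  # clear the Thumb bit before comparing against symbol ranges
    index = bisect.bisect_right(STARTS, addr) - 1
    if index >= 0:
        start, size, name = SYMBOLS[index]
        if addr < start + size:
            return name
    return f"0x{addr:08X}"


print(resolve(0x08000141))  # address of read_adc with the Thumb bit set
print(resolve(0x20000000))  # not in any symbol range, falls back to hex
```

Falling back to the raw address keeps the views usable when no .elf file is supplied.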

Protocol

Command-Response Structure

Commands (Host → Embedded)

  • START_PROFILING (0x01)
  • STOP_PROFILING (0x02)
  • GET_STATUS (0x03)
  • RESET_BUFFERS (0x04)
  • GET_METADATA (0x05)

Responses (Embedded → Host)

  • ACK (0x01)
  • NACK (0x02)
  • METADATA (0x03)
  • STATUS (0x04)
  • PROFILE_DATA (0x05)
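On the host side, the codes above map naturally onto two enumerations. This is a sketch of one way to represent them in Python; the class names and the `describe_response` helper are illustrative assumptions, while the numeric values are the ones listed above.

```python
from enum import IntEnum


class Command(IntEnum):
    """Host -> embedded command codes."""
    START_PROFILING = 0x01
    STOP_PROFILING = 0x02
    GET_STATUS = 0x03
    RESET_BUFFERS = 0x04
    GET_METADATA = 0x05


class Response(IntEnum):
    """Embedded -> host response codes."""
    ACK = 0x01
    NACK = 0x02
    METADATA = 0x03
    STATUS = 0x04
    PROFILE_DATA = 0x05


def describe_response(code):
    """Return a readable name for a response code, tolerating unknown values."""
    try:
        return Response(code).name
    except ValueError:
        return f"UNKNOWN(0x{code:02X})"


print(describe_response(0x05))
print(describe_response(0x7F))
```

Command and response codes share the same numeric range, so a code can only be interpreted once the direction of the packet is known. Keeping two separate enumerations makes that explicit.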

Profile Record Format

Each profiling record is 14 bytes:

struct ProfileRecord {
    uint32_t func_addr;      // Function address
    uint32_t entry_time;     // Entry timestamp (μs)
    uint32_t duration_us;    // Duration (μs)
    uint16_t depth;          // Call stack depth
} __attribute__((packed));
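For reference, the packed layout above can be decoded on the host with Python's `struct` module. This sketch assumes the records arrive back to back in little-endian byte order (the native order on STM32 Cortex-M cores); the names `RECORD`, `FIELDS`, and `parse_records` are illustrative and not taken from the host code.

```python
import struct

# "<" = little-endian, no padding; I = uint32, H = uint16 (4 + 4 + 4 + 2 = 14 bytes).
RECORD = struct.Struct("<IIIH")
FIELDS = ("func_addr", "entry_time", "duration_us", "depth")


def parse_records(payload):
    """Decode a payload of back-to-back ProfileRecord structs into dicts."""
    if len(payload) % RECORD.size:
        raise ValueError(
            f"payload length {len(payload)} is not a multiple of {RECORD.size}"
        )
    return [dict(zip(FIELDS, values)) for values in RECORD.iter_unpack(payload)]


sample = RECORD.pack(0x08000141, 1000, 250, 2) + RECORD.pack(0x08000161, 1300, 40, 3)
for record in parse_records(sample):
    print(record)
```

Because `entry_time` is a 32-bit microsecond counter, it wraps roughly every 71.6 minutes (2^32 μs), so long captures need wraparound handling when ordering records.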

Packet Format

┌──────────┬────────┬───────────┬──────┬────────┐
│ Header   │ Length │ Payload   │ CRC  │ End    │
│ (0xAA55) │ (2B)   │ (N bytes) │ (2B) │ (0x0A) │
└──────────┴────────┴───────────┴──────┴────────┘
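The diagram leaves several details open, so the following framing sketch rests on explicit assumptions that should be checked against the actual protocol code: the header is sent as the bytes 0xAA then 0x55, the length field counts payload bytes only, both 16-bit fields are little-endian, and the CRC is CRC-16/CCITT with initial value 0xFFFF computed over the payload. The function names are illustrative.

```python
import binascii
import struct

HEADER = b"\xAA\x55"  # assumed on-wire byte order of the 0xAA55 marker
END = b"\x0A"
OVERHEAD = len(HEADER) + 2 + 2 + len(END)  # header + length + CRC + end = 7 bytes


def crc16(data):
    # Assumed algorithm: CRC-16/CCITT (polynomial 0x1021), initial value 0xFFFF.
    return binascii.crc_hqx(data, 0xFFFF)


def encode_packet(payload):
    """Wrap a payload in header, length, CRC, and end marker."""
    return (
        HEADER
        + struct.pack("<H", len(payload))
        + payload
        + struct.pack("<H", crc16(payload))
        + END
    )


def decode_packet(frame):
    """Validate one complete frame and return its payload."""
    if len(frame) < OVERHEAD:
        raise ValueError("frame too short")
    if frame[:2] != HEADER:
        raise ValueError("bad header")
    (length,) = struct.unpack_from("<H", frame, 2)
    if len(frame) != OVERHEAD + length:
        raise ValueError("length mismatch")
    payload = frame[4:4 + length]
    (crc,) = struct.unpack_from("<H", frame, 4 + length)
    if crc != crc16(payload):
        raise ValueError("CRC mismatch")
    if frame[-1:] != END:
        raise ValueError("bad end marker")
    return payload


frame = encode_packet(b"\x01")  # e.g. a one-byte START_PROFILING command
print(frame.hex(), decode_packet(frame))
```

Checking the length before the CRC lets a receiver reject truncated frames cheaply. A streaming parser would additionally need to resynchronise on the header after an error.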

Development Roadmap

Phase 1: Host Application ✓

  • Protocol implementation
  • Serial communication
  • Symbol resolution (ELF/DWARF)
  • Data analysis and statistics
  • Web interface with Flask + SocketIO
  • Flame graph visualization (d3-flame-graph)
  • Timeline visualization (Plotly.js)
  • Sample data generator

Phase 2: Embedded Module (Next)

  • Instrumentation hooks (__cyg_profile_func_enter/exit)
  • DWT/SysTick timing implementation
  • Ring buffer implementation
  • UART communication with DMA
  • Command handling
  • STM32 example project

Phase 3: Integration & Testing

  • End-to-end testing with real hardware
  • Performance overhead measurement
  • Buffer overflow handling
  • Symbol resolution verification

Phase 4: Renode Emulation

  • Renode platform description
  • Virtual UART setup
  • CI/CD integration
  • Automated testing

Configuration

GCC Compilation Flags

To enable instrumentation in your embedded project:

CFLAGS += -finstrument-functions
CFLAGS += -finstrument-functions-exclude-file-list=drivers/,lib/

Excluding Functions

// Exclude specific functions
void __attribute__((no_instrument_function)) driver_function(void);

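// Exclude entire files: GCC does not treat -finstrument-functions as an
// optimization option, so #pragma GCC optimize is not a reliable way to
// switch it off. Use -finstrument-functions-exclude-file-list (see above)
// or compile those files without -finstrument-functions.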

Requirements

Host Application

  • Python 3.8+
  • Flask 3.0+
  • pyserial 3.5+
  • pyelftools 0.29+
  • Modern web browser with JavaScript enabled

Embedded Target

  • STM32 MCU (STM32F4/F7/H7 recommended)
  • GCC ARM toolchain with -finstrument-functions support
  • UART/USB-CDC peripheral
  • ~2-10KB RAM for profiling buffer

License

MIT License - See LICENSE file for details

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Acknowledgments