Initialized MiniProfiler project
- Contains the host code with a protocol implementation, data analyser and web-based visualiser
349
docs/GETTING_STARTED.md
Normal file
@@ -0,0 +1,349 @@
|
||||
# Getting Started with MiniProfiler
|
||||
|
||||
This guide will help you get started with MiniProfiler for profiling your embedded STM32 applications.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Host System
|
||||
- Python 3.8 or higher
|
||||
- pip package manager
|
||||
- Modern web browser (Chrome, Firefox, Edge)
|
||||
- Serial port access (USB-to-Serial adapter or built-in UART)
|
||||
|
||||
### Embedded Target (for Phase 2)
|
||||
- STM32 microcontroller (STM32F4/F7/H7 recommended)
|
||||
- GCC ARM toolchain
|
||||
- UART or USB-CDC peripheral configured
|
||||
- ST-Link or similar programmer/debugger
|
||||
|
||||
## Installation
|
||||
|
||||
### 1. Clone the Repository
|
||||
|
||||
```bash
|
||||
git clone https://github.com/yourusername/miniprofiler.git
|
||||
cd miniprofiler
|
||||
```
|
||||
|
||||
### 2. Install Python Dependencies
|
||||
|
||||
```bash
|
||||
cd host
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
Or install as a package:
|
||||
|
||||
```bash
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
### 3. Verify Installation
|
||||
|
||||
```bash
|
||||
miniprofiler --help
|
||||
```
|
||||
|
||||
You should see the help message with available options.
|
||||
|
||||
## Testing Without Hardware
|
||||
|
||||
Before connecting to real hardware, you can test the visualization with sample data.
|
||||
|
||||
### Generate Sample Data
|
||||
|
||||
```bash
|
||||
cd host/tests
|
||||
python sample_data_generator.py
|
||||
```
|
||||
|
||||
This creates several sample data files:
|
||||
- `sample_flamegraph.json` - Flame graph visualization data
|
||||
- `sample_statistics.json` - Function statistics
|
||||
- `sample_timeline.json` - Timeline data
|
||||
- `sample_profile_data.bin` - Binary protocol data
|
||||
|
||||
### View Sample Visualizations
|
||||
|
||||
You can view the sample JSON files by loading them in the web interface or by opening them directly:
|
||||
|
||||
```bash
|
||||
# View flame graph data
|
||||
python -m json.tool sample_flamegraph.json
|
||||
|
||||
# View statistics
|
||||
python -m json.tool sample_statistics.json
|
||||
```
|
||||
|
||||
## Running the Host Application
|
||||
|
||||
### Start the Web Server
|
||||
|
||||
```bash
|
||||
# From the host directory
|
||||
python run.py
|
||||
|
||||
# Or using the installed CLI
|
||||
miniprofiler
|
||||
```
|
||||
|
||||
The server will start on `http://localhost:5000` by default.
|
||||
|
||||
### Custom Host/Port
|
||||
|
||||
```bash
|
||||
miniprofiler --host 0.0.0.0 --port 8080
|
||||
```
|
||||
|
||||
### Enable Debug Mode
|
||||
|
||||
```bash
|
||||
miniprofiler --debug --verbose
|
||||
```
|
||||
|
||||
## Using the Web Interface
|
||||
|
||||
### 1. Open the Browser
|
||||
|
||||
Navigate to `http://localhost:5000`
|
||||
|
||||
You should see the MiniProfiler dashboard with:
|
||||
- Connection controls
|
||||
- Profiling controls (disabled until connected)
|
||||
- Status display
|
||||
- Three visualization tabs
|
||||
|
||||
### 2. Configure Connection
|
||||
|
||||
Enter your serial port details:
|
||||
- **Serial Port**: `/dev/ttyUSB0` (Linux), `/dev/tty.usb*` (macOS), or `COM3` (Windows)
|
||||
- **Baud Rate**: `115200` (default)
|
||||
- **ELF Path**: Path to your `.elf` file (optional, for symbol resolution)
|
||||
|
||||
**Finding Your Serial Port:**
|
||||
|
||||
Linux/Mac:
|
||||
```bash
|
||||
ls /dev/tty* | grep -i usb
|
||||
# or, on macOS
|
||||
ls /dev/tty.usb*
|
||||
```
|
||||
|
||||
Windows:
|
||||
- Open Device Manager
|
||||
- Look under "Ports (COM & LPT)"
|
||||
- Note the COM port number (e.g., COM3)
|
||||
|
||||
### 3. Connect to Device
|
||||
|
||||
Click the **Connect** button.
|
||||
|
||||
If successful, you'll see:
|
||||
- Status indicator turns green
|
||||
- "Connected to /dev/ttyUSB0" message
|
||||
- Metadata panel appears with device information
|
||||
- Profiling controls become enabled
|
||||
|
||||
### 4. Start Profiling
|
||||
|
||||
Click **Start Profiling**.
|
||||
|
||||
The device will begin sending profiling data, and you'll see:
|
||||
- Real-time updates in all three visualization tabs
|
||||
- Record count incrementing
|
||||
- Summary statistics updating
|
||||
|
||||
### 5. Explore Visualizations
|
||||
|
||||
#### Flame Graph Tab
|
||||
- Shows aggregate CPU time by function
|
||||
- Wider bars = more time spent
|
||||
- Click to zoom into specific call stacks
|
||||
- Search for functions by name
|
||||
- Hover for details
|
||||
|
||||
#### Timeline Tab
|
||||
- Shows function execution over time
|
||||
- X-axis = time in microseconds
|
||||
- Y-axis = call stack depth
|
||||
- Color = duration (darker = longer)
|
||||
- Useful for finding timing issues
|
||||
|
||||
#### Statistics Tab
|
||||
- Sortable table of function statistics
|
||||
- Columns: Function, Address, Calls, Total/Avg/Min/Max Time
|
||||
- Click column headers to sort
|
||||
- Find hot spots and outliers
|
||||
|
||||
### 6. Control Profiling
|
||||
|
||||
- **Stop Profiling**: Pause data collection
|
||||
- **Clear Data**: Reset all visualizations
|
||||
- **Reset Buffers**: Clear device-side buffers
|
||||
|
||||
### 7. Disconnect
|
||||
|
||||
Click **Disconnect** when done to close the serial connection.
|
||||
|
||||
## Understanding the Visualizations
|
||||
|
||||
### Flame Graph
|
||||
|
||||
The flame graph shows **aggregated** profiling data:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────┐
|
||||
│ main (10s) │ ← Root function
|
||||
├──────────────┬──────────────────┤
|
||||
│ app_loop │ process_data │ ← Called by main
|
||||
│ (6s) │ (4s) │
|
||||
├──────┬───────┼──────────────────┤
|
||||
│ read │ write │ calculate │ ← Nested calls
|
||||
│ (3s) │ (3s) │ (4s) │
|
||||
└──────┴───────┴──────────────────┘
|
||||
```
|
||||
|
||||
**Interpretation:**
|
||||
- Width = total time (including children)
|
||||
- Read from the root toward the leaves; the diagram above draws the root (`main`) at the top
|
||||
- Wide bars far from the root are the hotspots to optimize (the root is always the widest bar)
|
||||
|
||||
### Timeline
|
||||
|
||||
The timeline shows **chronological** execution:
|
||||
|
||||
```
|
||||
Time ───────────────────────►
|
||||
│ ████ func_a
|
||||
│ ██ func_b (called by func_a)
|
||||
│ ████ func_c
|
||||
│ ██ func_d
|
||||
```
|
||||
|
||||
**Interpretation:**
|
||||
- X-axis = time progression
|
||||
- Y-axis = call depth
|
||||
- Gaps = idle time or excluded functions
|
||||
- Useful for timing analysis and debugging
|
||||
|
||||
### Statistics Table
|
||||
|
||||
| Function | Calls | Total Time | Avg Time |
|
||||
|----------|-------|------------|----------|
|
||||
| main | 1 | 10000 μs | 10000 μs |
|
||||
| app_loop | 100 | 6000 μs | 60 μs |
|
||||
| calculate | 100 | 4000 μs | 40 μs |
|
||||
|
||||
**Interpretation:**
|
||||
- Calls = number of times function was called
|
||||
- Total = cumulative time across all calls
|
||||
- Avg = total / calls
|
||||
- Min/Max = shortest/longest single execution
|
||||
|
||||
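The columns above can be reproduced from raw per-call durations. The following is a minimal sketch, not the analyzer's actual code; the `summarize` helper and its `(name, duration_us)` input shape are assumptions made for illustration.

```python
from collections import defaultdict


def summarize(samples):
    """Aggregate (function_name, duration_us) pairs into table columns."""
    durations = defaultdict(list)
    for name, duration_us in samples:
        durations[name].append(duration_us)

    table = {}
    for name, values in durations.items():
        total = sum(values)
        table[name] = {
            "calls": len(values),          # number of invocations
            "total": total,                # cumulative time across all calls
            "avg": total / len(values),    # total / calls
            "min": min(values),            # shortest single execution
            "max": max(values),            # longest single execution
        }
    return table


samples = [("calculate", 30), ("calculate", 50), ("main", 10000)]
for name, row in summarize(samples).items():
    print(name, row)
```

A large gap between `min` and `max` for the same function is usually the first thing worth checking in the Timeline tab.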
## Troubleshooting
|
||||
|
||||
### "Failed to connect to /dev/ttyUSB0"
|
||||
|
||||
**Possible causes:**
|
||||
- Wrong port name
|
||||
- Port in use by another application
|
||||
- Insufficient permissions
|
||||
|
||||
**Solutions:**
|
||||
```bash
|
||||
# Linux: Check permissions
|
||||
ls -l /dev/ttyUSB0
|
||||
sudo chmod 666 /dev/ttyUSB0
|
||||
|
||||
# Or, for a permanent fix, add your user to the dialout group
|
||||
sudo usermod -a -G dialout $USER
|
||||
# Log out and back in
|
||||
|
||||
# Check if port is in use
|
||||
lsof /dev/ttyUSB0
|
||||
```
|
||||
|
||||
### No Data Appearing
|
||||
|
||||
**Check:**
|
||||
1. Is profiling started? (Click "Start Profiling")
2. Is the embedded device actually profiling?
3. Is the UART configured correctly on the embedded side?
4. Does the baud rate match on both sides?
5. Are there errors in the browser console (F12)?
|
||||
|
||||
### CRC Errors in Console
|
||||
|
||||
**Possible causes:**
|
||||
- Baud rate mismatch
|
||||
- Electrical noise on UART lines
|
||||
- Cable issues
|
||||
|
||||
**Solutions:**
|
||||
- Verify baud rate configuration
|
||||
- Use shielded cable
|
||||
- Add delays in embedded UART transmission
|
||||
- Reduce baud rate to 57600
|
||||
|
||||
### Buffer Overflows
|
||||
|
||||
**Symptoms:**
|
||||
- `buffer_overflows` counter > 0 in device status
|
||||
- Missing profiling data
|
||||
|
||||
**Solutions:**
|
||||
- Increase baud rate (460800 or 921600)
|
||||
- Increase embedded ring buffer size
|
||||
- Reduce instrumentation (exclude more files)
|
||||
- Use sampling mode (future feature)
|
||||
|
||||
### Symbols Not Resolved
|
||||
|
||||
**Symptoms:**
|
||||
- Function names show as `func_0x08000XXX` or `unknown_0x08000XXX`
|
||||
|
||||
**Solutions:**
|
||||
- Provide path to `.elf` file in connection settings
|
||||
- Ensure `.elf` file has debug symbols (`-g` flag)
|
||||
- Verify `.elf` file matches firmware on device
|
||||
- Check build ID in metadata matches
|
||||
|
||||
### Web Interface Not Loading
|
||||
|
||||
**Check:**
|
||||
1. Is server running? Look for "Starting web server..." message
|
||||
2. Correct URL? Should be `http://localhost:5000`
|
||||
3. Port already in use? Try different port: `miniprofiler --port 8080`
|
||||
4. Firewall blocking? Add exception for Python/Flask
|
||||
|
||||
## Next Steps
|
||||
|
||||
### For Development
|
||||
1. Read [PROTOCOL.md](PROTOCOL.md) to understand the communication protocol
|
||||
2. Review the code in `host/miniprofiler/` to customize behavior
|
||||
3. Modify visualizations in `host/web/`
|
||||
|
||||
### For Embedded Integration
|
||||
1. Wait for Phase 2 implementation of embedded module
|
||||
2. Or start implementing based on protocol specification
|
||||
3. See examples in `embedded/` directory (coming soon)
|
||||
|
||||
### For Testing
|
||||
1. Create custom sample data with `sample_data_generator.py`
|
||||
2. Test with Renode emulation (Phase 4)
|
||||
3. Benchmark overhead on real hardware
|
||||
|
||||
## Support
|
||||
|
||||
- **Documentation**: See `docs/` directory
|
||||
- **Issues**: Open an issue on GitHub
|
||||
- **Examples**: Check `examples/` directory (coming soon)
|
||||
|
||||
## What's Next?
|
||||
|
||||
After getting familiar with the host application:
|
||||
1. **Phase 2**: Implement embedded module for STM32
|
||||
2. **Phase 3**: Test on real hardware
|
||||
3. **Phase 4**: Set up Renode emulation for automated testing
|
||||
|
||||
Stay tuned for updates!
|
||||
422
docs/PROJECT_STRUCTURE.md
Normal file
@@ -0,0 +1,422 @@
|
||||
# MiniProfiler Project Structure
|
||||
|
||||
## Directory Layout
|
||||
|
||||
```
|
||||
MiniProfiler/
|
||||
├── docs/ # Documentation
|
||||
│ ├── GETTING_STARTED.md # Quick start guide
|
||||
│ ├── PROTOCOL.md # Communication protocol specification
|
||||
│ └── PROJECT_STRUCTURE.md # This file
|
||||
│
|
||||
├── host/ # Host application (Python)
|
||||
│ ├── miniprofiler/ # Main package
|
||||
│ │ ├── __init__.py # Package initialization
|
||||
│ │ ├── analyzer.py # Data analysis and visualization data generation
|
||||
│ │ ├── cli.py # Command-line interface
|
||||
│ │ ├── protocol.py # Binary protocol implementation
|
||||
│ │ ├── serial_reader.py # Serial communication
|
||||
│ │ ├── symbolizer.py # ELF/DWARF symbol resolution
|
||||
│ │ └── web_server.py # Flask web server with SocketIO
|
||||
│ │
|
||||
│ ├── web/ # Web interface assets
|
||||
│ │ ├── static/
|
||||
│ │ │ ├── css/
|
||||
│ │ │ │ └── style.css # Stylesheet
|
||||
│ │ │ └── js/
|
||||
│ │ │ └── app.js # JavaScript application logic
|
||||
│ │ └── templates/
|
||||
│ │ └── index.html # Main HTML template
|
||||
│ │
|
||||
│ ├── tests/ # Tests and utilities
|
||||
│ │ ├── __init__.py
|
||||
│ │ └── sample_data_generator.py # Generate mock profiling data
|
||||
│ │
|
||||
│ ├── requirements.txt # Python dependencies
|
||||
│ ├── setup.py # Package setup
|
||||
│ └── run.py # Quick start script
|
||||
│
|
||||
├── embedded/ # Embedded module (Phase 2 - TODO)
|
||||
│ ├── src/
|
||||
│ ├── inc/
|
||||
│ └── examples/
|
||||
│
|
||||
├── .gitignore # Git ignore rules
|
||||
├── CLAUDE.md # Project overview for Claude
|
||||
└── README.md # Main project README
|
||||
```
|
||||
|
||||
## Module Descriptions
|
||||
|
||||
### Host Application (`host/miniprofiler/`)
|
||||
|
||||
#### `protocol.py`
|
||||
**Purpose:** Binary protocol implementation for serial communication
|
||||
|
||||
**Key Components:**
|
||||
- `ProfileRecord`: Data class for profiling records (14 bytes)
|
||||
- `Metadata`: Device metadata (MCU clock, timer freq, etc.)
|
||||
- `StatusInfo`: Device status information
|
||||
- `CommandPacket`: Commands sent to device
|
||||
- `ResponsePacket`: Responses from device
|
||||
- CRC16 calculation and validation
|
||||
|
||||
**Used by:** `serial_reader.py`, `analyzer.py`, `sample_data_generator.py`
|
||||
|
||||
---
|
||||
|
||||
#### `serial_reader.py`
|
||||
**Purpose:** Serial port communication and packet parsing
|
||||
|
||||
**Key Components:**
|
||||
- `SerialReader`: Main class for serial I/O
|
||||
- Background thread for continuous reading
|
||||
- State machine for packet parsing
|
||||
- Callback-based event handling
|
||||
- Command sending (START, STOP, GET_STATUS, etc.)
|
||||
|
||||
**Callbacks:**
|
||||
- `on_profile_data`: Profiling records received
|
||||
- `on_metadata`: Device metadata received
|
||||
- `on_status`: Status update received
|
||||
- `on_error`: Error occurred
|
||||
|
||||
**Used by:** `web_server.py`
|
||||
|
||||
---
|
||||
|
||||
#### `symbolizer.py`
|
||||
**Purpose:** Resolve function addresses to names using ELF/DWARF debug info
|
||||
|
||||
**Key Components:**
|
||||
- `Symbolizer`: ELF file parser
|
||||
- Loads symbol table from `.elf` file
|
||||
- Parses DWARF debug info for file/line mappings
|
||||
- Address-to-name resolution
|
||||
- Handles function address ranges
|
||||
|
||||
**Dependencies:** `pyelftools`
|
||||
|
||||
**Used by:** `analyzer.py`, `web_server.py`
|
||||
|
||||
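Address-to-name resolution over function address ranges can be sketched with a sorted table and a binary search. This is an illustration only: the `SymbolTable` class and the hard-coded `(start, size, name)` tuples are hypothetical stand-ins for what would be read from the `.elf` symbol table, and the fallback name follows the `func_0x...` pattern shown in the troubleshooting guide.

```python
import bisect


class SymbolTable:
    """Hypothetical lookup table built from (start_address, size, name) tuples."""

    def __init__(self, symbols):
        self._symbols = sorted(symbols)                 # ordered by start address
        self._starts = [start for start, _, _ in self._symbols]

    def resolve(self, address):
        # Cortex-M code addresses carry the Thumb bit in bit 0; clear it so the
        # lookup matches the even start addresses assumed for this table.
        address &= ~1
        index = bisect.bisect_right(self._starts, address) - 1
        if index >= 0:
            start, size, name = self._symbols[index]
            if start <= address < start + size:
                return name
        return f"func_0x{address:08x}"                  # unresolved address


table = SymbolTable([
    (0x08000100, 0x40, "main"),
    (0x08000220, 0x20, "calculate"),
])
print(table.resolve(0x08000121))   # address inside main
print(table.resolve(0x08000300))   # outside every known range
```

Sorting once and bisecting keeps each lookup logarithmic in the number of symbols, which matters when every incoming record needs a name.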
---
|
||||
|
||||
#### `analyzer.py`
|
||||
**Purpose:** Analyze profiling data and generate visualization data structures
|
||||
|
||||
**Key Components:**
|
||||
- `ProfileAnalyzer`: Main analysis engine
|
||||
- Build call tree from flat records
|
||||
- Compute statistics (call counts, durations)
|
||||
- Generate flame graph data (d3-flame-graph format)
|
||||
- Generate timeline data (Plotly format)
|
||||
- Generate statistics table data
|
||||
|
||||
**Data Structures:**
|
||||
- `CallTreeNode`: Hierarchical call tree
|
||||
- `FunctionStats`: Per-function statistics
|
||||
|
||||
**Used by:** `web_server.py`
|
||||
|
||||
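Building a call tree from flat records relies on the `depth` field: a record at depth `d` is a child of the most recent record at depth `d - 1`. The sketch below is one possible approach, not necessarily how `ProfileAnalyzer` does it. It assumes records are `(name, entry_time, duration_us, depth)` tuples that can be sorted by entry time, and it emits the nested `name` / `value` / `children` shape that d3-flame-graph consumes, with `value` holding inclusive time.

```python
def build_flamegraph(records):
    """Aggregate flat (name, entry_time, duration_us, depth) records into a tree."""
    root = {"name": "root", "value": 0, "children": []}
    stack = [root]  # stack[d + 1] holds the most recent node seen at depth d

    # Sorting by entry time (then depth) guarantees a parent precedes its children.
    for name, _entry, duration, depth in sorted(records, key=lambda r: (r[1], r[3])):
        del stack[depth + 1:]                 # unwind to this call's parent
        parent = stack[-1]
        node = next((c for c in parent["children"] if c["name"] == name), None)
        if node is None:                      # first time this call path is seen
            node = {"name": name, "value": 0, "children": []}
            parent["children"].append(node)
        node["value"] += duration             # inclusive time, children included
        stack.append(node)

    root["value"] = sum(child["value"] for child in root["children"])
    return root


records = [
    ("main", 0, 10000, 0),
    ("app_loop", 100, 60, 1),
    ("read", 110, 30, 2),
    ("app_loop", 200, 60, 1),
    ("calculate", 300, 40, 1),
]
print(build_flamegraph(records))
```

Repeated calls along the same path are merged into one node, which is what makes the flame graph an aggregate view rather than a timeline.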
---
|
||||
|
||||
#### `web_server.py`
|
||||
**Purpose:** Flask web server with SocketIO for real-time updates
|
||||
|
||||
**Key Components:**
|
||||
- `ProfilerWebServer`: Main server class
|
||||
- Flask HTTP routes (`/`, `/api/status`, `/api/flamegraph`, etc.)
|
||||
- SocketIO event handlers (connect, start_profiling, etc.)
|
||||
- Integrates `SerialReader`, `Symbolizer`, and `ProfileAnalyzer`
|
||||
- Real-time data streaming to web clients
|
||||
|
||||
**Routes:**
|
||||
- `GET /`: Main web interface
|
||||
- `GET /api/status`: Server status JSON
|
||||
- `GET /api/flamegraph`: Flame graph data JSON
|
||||
- `GET /api/timeline`: Timeline data JSON
|
||||
- `GET /api/statistics`: Statistics table JSON
|
||||
|
||||
**SocketIO Events:**
|
||||
- `connect_serial`: Connect to device
|
||||
- `start_profiling`: Start profiling
|
||||
- `stop_profiling`: Stop profiling
|
||||
- `clear_data`: Clear all data
|
||||
- Emits: `flamegraph_update`, `statistics_update`, etc.
|
||||
|
||||
**Used by:** `cli.py`
|
||||
|
||||
---
|
||||
|
||||
#### `cli.py`
|
||||
**Purpose:** Command-line interface entry point
|
||||
|
||||
**Key Components:**
|
||||
- Argument parsing (--host, --port, --debug, --verbose)
|
||||
- Logging configuration
|
||||
- Server initialization and startup
|
||||
|
||||
**Entry point:** `miniprofiler` command
|
||||
|
||||
---
|
||||
|
||||
### Web Interface (`host/web/`)
|
||||
|
||||
#### `templates/index.html`
|
||||
**Purpose:** Main HTML page structure
|
||||
|
||||
**Features:**
|
||||
- Connection controls (serial port, baud rate, ELF path)
|
||||
- Profiling controls (start, stop, clear, reset)
|
||||
- Status display
|
||||
- Metadata panel
|
||||
- Summary panel
|
||||
- Three-tab interface (Flame Graph, Timeline, Statistics)
|
||||
|
||||
**Dependencies:**
|
||||
- Socket.IO client
|
||||
- D3.js
|
||||
- d3-flame-graph
|
||||
- Plotly.js
|
||||
|
||||
---
|
||||
|
||||
#### `static/css/style.css`
|
||||
**Purpose:** Styling and layout
|
||||
|
||||
**Features:**
|
||||
- Dark theme (VSCode-inspired)
|
||||
- Responsive design
|
||||
- Flexbox layouts
|
||||
- Custom button styles
|
||||
- Table styling
|
||||
- Status indicators with animations
|
||||
|
||||
---
|
||||
|
||||
#### `static/js/app.js`
|
||||
**Purpose:** Client-side application logic
|
||||
|
||||
**Key Functions:**
|
||||
- `initializeSocket()`: Set up SocketIO connection
|
||||
- `toggleConnection()`: Connect/disconnect from device
|
||||
- `startProfiling()`, `stopProfiling()`: Control profiling
|
||||
- `updateFlameGraph()`: Render flame graph with d3-flame-graph
|
||||
- `updateTimeline()`: Render timeline with Plotly.js
|
||||
- `updateStatistics()`: Update statistics table
|
||||
- `showTab()`: Tab switching
|
||||
|
||||
**Event Handlers:**
|
||||
- Socket events (connect, disconnect, data updates)
|
||||
- Button clicks
|
||||
- Window resize
|
||||
|
||||
---
|
||||
|
||||
### Tests (`host/tests/`)
|
||||
|
||||
#### `sample_data_generator.py`
|
||||
**Purpose:** Generate realistic mock profiling data for testing
|
||||
|
||||
**Features:**
|
||||
- Simulates typical embedded application (main, init, loop, sensors, etc.)
|
||||
- Generates nested function calls with realistic timing
|
||||
- Creates binary protocol packets
|
||||
- Exports JSON files for visualization testing
|
||||
|
||||
**Outputs:**
|
||||
- `sample_profile_data.bin`: Binary protocol data
|
||||
- `sample_flamegraph.json`: Flame graph data
|
||||
- `sample_statistics.json`: Statistics data
|
||||
- `sample_timeline.json`: Timeline data
|
||||
|
||||
**Usage:**
|
||||
```bash
|
||||
cd host/tests
|
||||
python sample_data_generator.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Flow
|
||||
|
||||
### Connection and Initialization
|
||||
|
||||
```
|
||||
User Web UI Web Server Serial Reader Device
|
||||
│ │ │ │ │
|
||||
│─── Open Browser ──►│ │ │ │
|
||||
│ │ │ │ │
|
||||
│─── Enter Port ────►│ │ │ │
|
||||
│─── Click Connect ─►│─── connect_serial ──►│─── connect() ─────►│ │
|
||||
│ │ │ │─── Open ─────►│
|
||||
│ │ │ │ │
|
||||
│ │ │─── get_metadata() ►│─── CMD ──────►│
|
||||
│ │ │ │◄── METADATA ──│
|
||||
│ │◄── metadata ─────────│◄── on_metadata() ──│ │
|
||||
│◄── Display Info ───│ │ │ │
|
||||
```
|
||||
|
||||
### Profiling Session
|
||||
|
||||
```
|
||||
User Web UI Web Server Analyzer Device
|
||||
│ │ │ │ │
|
||||
│─── Start ─────────►│─── start_profiling ─►│─── start() ─────►│ │
|
||||
│ │ │ │─── CMD ────────►│
|
||||
│ │ │ │ │
|
||||
│ │ │ │◄── DATA ────────│
|
||||
│ │ │◄── on_profile ───│ │
|
||||
│ │ │ │ │
|
||||
│ │ │── add_records() ►│ │
|
||||
│ │ │ │─ Analyze │
|
||||
│ │ │ │─ Build Tree │
|
||||
│ │ │ │─ Compute Stats │
|
||||
│ │ │◄── JSON ─────────│ │
|
||||
│ │◄─ flamegraph_update ─│ │ │
|
||||
│◄── Update Viz ─────│ │ │ │
|
||||
```
|
||||
|
||||
## Technology Stack
|
||||
|
||||
### Backend
|
||||
- **Python 3.8+**: Main language
|
||||
- **Flask 3.0+**: Web framework
|
||||
- **Flask-SocketIO 5.3+**: Real-time WebSocket communication
|
||||
- **pyserial 3.5+**: Serial port communication
|
||||
- **pyelftools 0.29+**: ELF/DWARF parsing
|
||||
- **crc 6.1+**: CRC16 calculation
|
||||
- **eventlet**: Async I/O for SocketIO
|
||||
|
||||
### Frontend
|
||||
- **HTML5/CSS3**: Structure and styling
|
||||
- **JavaScript (ES6)**: Application logic
|
||||
- **Socket.IO Client**: Real-time communication
|
||||
- **D3.js v7**: Visualization library
|
||||
- **d3-flame-graph 4.1**: Flame graph component
|
||||
- **Plotly.js 2.27**: Timeline/chart visualization
|
||||
|
||||
### Development Tools
|
||||
- **setuptools**: Package management
|
||||
- **pip**: Dependency management
|
||||
- **git**: Version control
|
||||
|
||||
## Configuration Files
|
||||
|
||||
### `requirements.txt`
|
||||
Python package dependencies with minimum versions
|
||||
|
||||
### `setup.py`
|
||||
Package metadata and installation configuration
|
||||
- Entry point: `miniprofiler` CLI command
|
||||
- Package data includes web assets
|
||||
|
||||
### `.gitignore`
|
||||
Excludes:
|
||||
- Python bytecode and caches
|
||||
- Virtual environments
|
||||
- IDE configs
|
||||
- Build artifacts
|
||||
- Generated test data
|
||||
|
||||
## Key Design Decisions
|
||||
|
||||
### Why Command-Response Protocol?
|
||||
- Allows host to control profiling (start/stop)
|
||||
- Can request status and metadata
|
||||
- More flexible than auto-start mode
|
||||
- Small overhead acceptable at 115200 baud
|
||||
|
||||
### Why Entry Time + Duration?
|
||||
- Enables both flame graphs (aggregate) and timelines (chronological)
|
||||
- Only 40% more data than duration-only
|
||||
- Essential for debugging timing-sensitive embedded systems
|
||||
|
||||
### Why d3-flame-graph?
|
||||
- Industry standard for flame graph visualization
|
||||
- Interactive (zoom, search, tooltips)
|
||||
- Customizable colors and layout
|
||||
- Handles large datasets efficiently
|
||||
|
||||
### Why Separate Analyzer Module?
|
||||
- Decouples data processing from I/O
|
||||
- Easier to test in isolation
|
||||
- Can swap visualization formats without changing protocol
|
||||
- Allows offline analysis of captured data
|
||||
|
||||
## Extension Points
|
||||
|
||||
### Adding New Commands
|
||||
1. Add to `Command` enum in `protocol.py`
|
||||
2. Implement in `SerialReader.send_command()`
|
||||
3. Add handler in `web_server.py` SocketIO events
|
||||
4. Update embedded firmware to handle command
|
||||
|
||||
### Adding New Visualizations
|
||||
1. Add route in `web_server.py` (e.g., `/api/callgraph`)
|
||||
2. Implement data generation in `analyzer.py`
|
||||
3. Add HTML tab in `index.html`
|
||||
4. Add JavaScript rendering in `app.js`
|
||||
5. Update CSS as needed
|
||||
|
||||
### Supporting More Microcontrollers
|
||||
1. Ensure GCC toolchain supports `-finstrument-functions`
|
||||
2. Implement timing mechanism (DWT, SysTick, or custom timer)
|
||||
3. Port ring buffer and UART code to new MCU
|
||||
4. Test and document
|
||||
|
||||
### Adding Compression
|
||||
1. Update protocol version to 0x02
|
||||
2. Implement compression in embedded module (e.g., delta encoding)
|
||||
3. Add decompression in `protocol.py`
|
||||
4. Update `ProfileDataPayload` parsing
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
### Phase 2: Embedded Module
|
||||
- [ ] STM32 HAL/LL implementation
|
||||
- [ ] FreeRTOS integration
|
||||
- [ ] Example projects for STM32F4/F7/H7
|
||||
- [ ] CMake build system
|
||||
|
||||
### Phase 3: Advanced Features
|
||||
- [ ] Statistical sampling mode
|
||||
- [ ] ISR profiling
|
||||
- [ ] Multi-core support (dual-core STM32H7)
|
||||
- [ ] Task/thread tracking for RTOS
|
||||
- [ ] Filtering and search
|
||||
|
||||
### Phase 4: Renode Integration
|
||||
- [ ] Renode platform description
|
||||
- [ ] Virtual UART setup
|
||||
- [ ] CI/CD integration
|
||||
- [ ] Automated regression tests
|
||||
|
||||
### Phase 5: Analysis Tools
|
||||
- [ ] Differential profiling (compare two runs)
|
||||
- [ ] Export to Chrome Trace Format
|
||||
- [ ] Call graph visualization
|
||||
- [ ] Performance regression detection
|
||||
- [ ] Integration with debuggers (GDB)
|
||||
|
||||
## Performance Targets
|
||||
|
||||
### Embedded Overhead
|
||||
- **Target**: <5% CPU overhead
|
||||
- **Memory**: 2-10 KB RAM for buffers
|
||||
- **Instrumentation**: 1-2 μs per function call
|
||||
|
||||
### Host Performance
|
||||
- **Latency**: <100ms from device to visualization
|
||||
- **Throughput**: Handle 500-1000 records/sec
|
||||
- **Memory**: Scale to 100K+ records in browser
|
||||
|
||||
### Bandwidth
|
||||
- **115200 baud**: ~780 records/sec
|
||||
- **460800 baud**: ~3100 records/sec
|
||||
- **921600 baud**: ~6200 records/sec
|
||||
|
||||
## Contributing
|
||||
|
||||
See individual module docstrings for implementation details.
|
||||
Follow existing code style and structure when adding features.
|
||||
300
docs/PROTOCOL.md
Normal file
@@ -0,0 +1,300 @@
|
||||
# MiniProfiler Communication Protocol
|
||||
|
||||
## Overview
|
||||
|
||||
MiniProfiler uses a binary command-response protocol over a UART/serial link, at 115200 baud by default (configurable).
|
||||
|
||||
- **Command packets**: Host → Embedded device
|
||||
- **Response packets**: Embedded device → Host
|
||||
|
||||
## Command Packet Format
|
||||
|
||||
Commands are sent from the host to the embedded device.
|
||||
|
||||
### Structure
|
||||
|
||||
```
|
||||
┌────────┬─────────┬─────────────┬──────────┬──────────┐
|
||||
│ Header │ Command │ Payload Len │ Payload │ Checksum │
|
||||
│ (1B) │ (1B) │ (1B) │ (8B) │ (1B) │
|
||||
└────────┴─────────┴─────────────┴──────────┴──────────┘
|
||||
Total: 12 bytes
|
||||
```
|
||||
|
||||
### Fields
|
||||
|
||||
| Field | Size | Value | Description |
|
||||
|-------|------|-------|-------------|
|
||||
| Header | 1 byte | `0x55` | Packet start marker |
|
||||
| Command | 1 byte | See table below | Command code |
|
||||
| Payload Length | 1 byte | 0-8 | Actual payload size |
|
||||
| Payload | 8 bytes | Variable | Command parameters (padded with 0x00) |
|
||||
| Checksum | 1 byte | Sum of all preceding bytes & 0xFF | Simple checksum over header, command, payload length, and payload |
|
||||
|
||||
### Command Codes
|
||||
|
||||
| Command | Code | Description | Payload |
|
||||
|---------|------|-------------|---------|
|
||||
| START_PROFILING | `0x01` | Start profiling | None |
|
||||
| STOP_PROFILING | `0x02` | Stop profiling | None |
|
||||
| GET_STATUS | `0x03` | Request status | None |
|
||||
| RESET_BUFFERS | `0x04` | Clear profiling buffers | None |
|
||||
| GET_METADATA | `0x05` | Request device metadata | None |
|
||||
| SET_CONFIG | `0x06` | Configure profiler | Config bytes (reserved) |
|
||||
|
||||
### Example
|
||||
|
||||
Start profiling command:
|
||||
```
|
||||
55 01 00 00 00 00 00 00 00 00 00 56
|
||||
│ │ │ └─────────────────────┘ │
|
||||
│ │ │ Payload (8B) │
|
||||
│ │ └── Payload Length (0) │
|
||||
│ └── Command (START_PROFILING) │
|
||||
└── Header (0x55) └── Checksum
|
||||
```
|
||||
|
||||
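The command layout above can be encoded in a few lines. This is a minimal sketch rather than the contents of `protocol.py`; the `build_command` helper name is an assumption, while the command codes and field sizes come from the tables above.

```python
from enum import IntEnum

HEADER = 0x55
PAYLOAD_SIZE = 8  # fixed-size payload field, padded with 0x00


class Command(IntEnum):
    START_PROFILING = 0x01
    STOP_PROFILING = 0x02
    GET_STATUS = 0x03
    RESET_BUFFERS = 0x04
    GET_METADATA = 0x05
    SET_CONFIG = 0x06


def build_command(command: Command, payload: bytes = b"") -> bytes:
    """Return the 12-byte command packet for `command`."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload is limited to 8 bytes")
    padded = payload.ljust(PAYLOAD_SIZE, b"\x00")
    body = bytes([HEADER, command, len(payload)]) + padded
    checksum = sum(body) & 0xFF  # low 8 bits of the sum of all preceding bytes
    return body + bytes([checksum])


print(build_command(Command.START_PROFILING).hex(" "))
# 55 01 00 00 00 00 00 00 00 00 00 56
```

Because the payload field is always 8 bytes, every command is exactly 12 bytes, which keeps the embedded-side receiver simple.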
## Response Packet Format
|
||||
|
||||
Responses are sent from the embedded device to the host.
|
||||
|
||||
### Structure
|
||||
|
||||
```
|
||||
┌─────────┬──────┬──────────┬──────────┬────────┬─────┐
|
||||
│ Header │ Type │ Length │ Payload │ CRC │ End │
|
||||
│ (2B) │ (1B) │ (2B) │ (N bytes)│ (2B) │(1B) │
|
||||
└─────────┴──────┴──────────┴──────────┴────────┴─────┘
|
||||
Total: 8 + N bytes
|
||||
```
|
||||
|
||||
### Fields
|
||||
|
||||
| Field | Size | Value | Description |
|
||||
|-------|------|-------|-------------|
|
||||
| Header | 2 bytes | `0xAA55` | Packet start marker (bytes `AA 55` on the wire, as in the packet dumps below) |
|
||||
| Type | 1 byte | See table below | Response type |
|
||||
| Length | 2 bytes | 0-65535 | Payload size (little-endian) |
|
||||
| Payload | Variable | Depends on type | Response data |
|
||||
| CRC16 | 2 bytes | CRC16-CCITT | Checksum of header+type+length+payload |
|
||||
| End | 1 byte | `0x0A` | Packet end marker (newline) |
|
||||
|
||||
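Framing and CRC validation for a response packet can be sketched as follows. Several details are assumptions rather than statements from this specification: the CRC is taken to be the unreflected CRC16-CCITT variant (polynomial 0x1021, initial value 0xFFFF, no final XOR), the CRC field is assumed to be transmitted little-endian like the length field, and the header is assumed to appear on the wire as `AA 55`. The helper names are hypothetical.

```python
import struct

HEADER = b"\xAA\x55"
END_MARKER = b"\x0A"


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC16-CCITT: polynomial 0x1021, initial value 0xFFFF."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def build_response(resp_type: int, payload: bytes = b"") -> bytes:
    """Frame a payload: header, type, length, payload, CRC, end marker."""
    head = HEADER + struct.pack("<BH", resp_type, len(payload))
    crc = crc16_ccitt(head + payload)  # covers header + type + length + payload
    return head + payload + struct.pack("<H", crc) + END_MARKER


def parse_response(packet: bytes):
    """Validate one complete packet and return (type, payload)."""
    if len(packet) < 8 or packet[:2] != HEADER or packet[-1:] != END_MARKER:
        raise ValueError("bad framing")
    resp_type, length = struct.unpack_from("<BH", packet, 2)
    if len(packet) != 8 + length:
        raise ValueError("length mismatch")
    (crc,) = struct.unpack_from("<H", packet, 5 + length)
    if crc != crc16_ccitt(packet[:5 + length]):
        raise ValueError("CRC mismatch")
    return resp_type, packet[5:5 + length]


ack = build_response(0x01)  # ACK carries no payload, so the packet is 8 bytes
print(ack.hex(" "), parse_response(ack))
```

`parse_response` expects one complete packet; bytes arriving in arbitrary chunks from a serial port need an incremental parser on top of it.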
### Response Types
|
||||
|
||||
| Type | Code | Description | Payload Format |
|
||||
|------|------|-------------|----------------|
|
||||
| ACK | `0x01` | Command acknowledged | None |
|
||||
| NACK | `0x02` | Command failed | None |
|
||||
| METADATA | `0x03` | Device metadata | See Metadata Payload |
|
||||
| STATUS | `0x04` | Device status | See Status Payload |
|
||||
| PROFILE_DATA | `0x05` | Profiling records | See Profile Data Payload |
|
||||
|
||||
## Payload Formats
|
||||
|
||||
### Metadata Payload (28 bytes)
|
||||
|
||||
Sent in response to `GET_METADATA` command or automatically on startup.
|
||||
|
||||
```c
|
||||
struct MetadataPayload {
|
||||
uint32_t mcu_clock_hz; // MCU clock frequency in Hz
|
||||
uint32_t timer_freq; // Profiling timer frequency in Hz
|
||||
uint32_t elf_build_id; // CRC32 of .text section for version matching
|
||||
char fw_version[16]; // Firmware version string (null-terminated)
|
||||
} __attribute__((packed));
|
||||
```
|
||||
|
||||
**Example:**
|
||||
- MCU Clock: 168,000,000 Hz (168 MHz STM32F4)
|
||||
- Timer Freq: 1,000,000 Hz (1 MHz for microsecond precision)
|
||||
- Build ID: 0xDEADBEEF
|
||||
- FW Version: "v1.0.0"
|
||||
|
||||
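On the host, the packed struct above maps directly onto a `struct` format string. A minimal sketch, assuming little-endian fields as stated in the references section; the `DeviceMetadata` tuple and `parse_metadata` helper are illustrative names, not the `Metadata` class in `protocol.py`.

```python
import struct
from typing import NamedTuple

# "<" selects little-endian with no padding: 4 + 4 + 4 + 16 = 28 bytes
METADATA_FORMAT = "<III16s"


class DeviceMetadata(NamedTuple):
    mcu_clock_hz: int
    timer_freq: int
    elf_build_id: int
    fw_version: str


def parse_metadata(payload: bytes) -> DeviceMetadata:
    clock, timer, build_id, raw_version = struct.unpack(METADATA_FORMAT, payload)
    # The version string is null-terminated inside its 16-byte field.
    version = raw_version.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return DeviceMetadata(clock, timer, build_id, version)


example = struct.pack(METADATA_FORMAT, 168_000_000, 1_000_000, 0xDEADBEEF, b"v1.0.0")
print(example[:4].hex(" "))      # little-endian bytes of mcu_clock_hz
print(parse_metadata(example))
```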
### Status Payload (10 bytes)
|
||||
|
||||
Sent in response to `GET_STATUS` command.
|
||||
|
||||
```c
|
||||
struct StatusPayload {
|
||||
uint8_t is_profiling; // 1 if profiling active, 0 otherwise
|
||||
uint32_t buffer_overflows; // Number of buffer overflow events
|
||||
uint32_t records_captured; // Total records captured
|
||||
uint8_t buffer_usage_percent; // Current buffer usage (0-100)
|
||||
} __attribute__((packed));
|
||||
```
|
||||
|
||||
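Because the struct is packed, the `uint8_t` is followed immediately by the first `uint32_t` with no alignment padding, which is why the payload is 10 bytes. A hedged host-side sketch (the `parse_status` helper and its dictionary output are assumptions for illustration):

```python
import struct

# Packed little-endian layout: 1 + 4 + 4 + 1 = 10 bytes, no padding
STATUS = struct.Struct("<BIIB")


def parse_status(payload: bytes) -> dict:
    is_profiling, overflows, captured, usage = STATUS.unpack(payload)
    return {
        "is_profiling": bool(is_profiling),
        "buffer_overflows": overflows,
        "records_captured": captured,
        "buffer_usage_percent": usage,
    }


status = parse_status(STATUS.pack(1, 2, 5000, 37))
if status["buffer_overflows"] > 0:
    # Overflows mean records were dropped on the device before transmission.
    print(f"warning: {status['buffer_overflows']} buffer overflow events")
```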
### Profile Data Payload (Variable)
|
||||
|
||||
Sent automatically during profiling or in response to data requests.
|
||||
|
||||
```c
|
||||
struct ProfileDataPayload {
|
||||
uint8_t version; // Protocol version (0x01)
|
||||
uint16_t record_count; // Number of records in this packet
|
||||
ProfileRecord records[]; // Array of profile records
|
||||
} __attribute__((packed));
|
||||
```
|
||||
|
||||
Each `ProfileRecord` is 14 bytes:
|
||||
|
||||
```c
|
||||
struct ProfileRecord {
|
||||
uint32_t func_addr; // Function address (from instrumentation)
|
||||
uint32_t entry_time; // Entry timestamp in microseconds
|
||||
uint32_t duration_us; // Function duration in microseconds
|
||||
uint16_t depth; // Call stack depth (0 = root)
|
||||
} __attribute__((packed));
|
||||
```
|
||||
|
||||
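Decoding a profile data payload amounts to reading the 3-byte header and then `record_count` fixed-size records. The sketch below assumes little-endian fields; the `Record` tuple and `parse_profile_data` helper are illustrative names. The sample bytes are the two-record payload from the packet dump later in this document.

```python
import struct
from typing import List, NamedTuple

PAYLOAD_HEADER = struct.Struct("<BH")   # version, record_count (3 bytes)
RECORD = struct.Struct("<IIIH")         # func_addr, entry_time, duration_us, depth (14 bytes)
SUPPORTED_VERSION = 0x01


class Record(NamedTuple):
    func_addr: int
    entry_time: int
    duration_us: int
    depth: int


def parse_profile_data(payload: bytes) -> List[Record]:
    version, count = PAYLOAD_HEADER.unpack_from(payload, 0)
    if version != SUPPORTED_VERSION:
        raise ValueError(f"unsupported protocol version {version:#04x}")
    expected = PAYLOAD_HEADER.size + count * RECORD.size
    if len(payload) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(payload)}")
    return [
        Record(*RECORD.unpack_from(payload, PAYLOAD_HEADER.size + i * RECORD.size))
        for i in range(count)
    ]


payload = bytes.fromhex(
    "01 02 00 "
    "00 01 00 08 e8 03 00 00 d0 07 00 00 00 00 "
    "20 02 00 08 f4 01 00 00 2c 01 00 00 01 00 "
)
for record in parse_profile_data(payload):
    print(f"0x{record.func_addr:08x}", record.entry_time, record.duration_us, record.depth)
```

Rejecting unknown versions up front keeps a newer firmware format from being silently misread as version 0x01 records.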
**Field Details:**
|
||||
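- `func_addr`: Address captured in the instrumentation hook. `__builtin_return_address(0)` evaluated inside the hook yields an address within the instrumented function; the `this_fn` argument of `__cyg_profile_func_enter` gives its entry address directly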
|
||||
- `entry_time`: Microsecond timestamp when function was entered (the 32-bit counter wraps after 2^32 μs, about 71.6 minutes)
|
||||
- `duration_us`: Time spent in function including children
|
||||
- `depth`: Call stack depth (0 for main, 1 for functions called by main, etc.)
|
||||
|
||||
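Since `entry_time` wraps, a host that records sessions longer than about 71 minutes has to extend the timestamps itself. One way to do that is sketched below; `unwrap_timestamps` is a hypothetical helper, and it assumes consecutive records are never more than half the counter range (roughly 35 minutes) apart, which lets it tolerate records that arrive slightly out of order.

```python
WRAP = 1 << 32          # 32-bit microsecond counter
HALF = WRAP // 2


def unwrap_timestamps(raw_times):
    """Extend wrapped 32-bit timestamps into a continuous timeline."""
    unwrapped = []
    last_raw = None
    last_full = 0
    for raw in raw_times:
        if last_raw is None:
            full = raw
        else:
            # Interpret the difference as a signed 32-bit step, so a small
            # backward step stays small and a wrap counts as moving forward.
            delta = (raw - last_raw) & (WRAP - 1)
            if delta >= HALF:
                delta -= WRAP
            full = last_full + delta
        unwrapped.append(full)
        last_raw, last_full = raw, full
    return unwrapped


print(unwrap_timestamps([4294967000, 4294967200, 100, 50]))
```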
## Communication Flow
|
||||
|
||||
### Initial Connection
|
||||
|
||||
```
|
||||
Host Device
|
||||
| |
|
||||
|--- GET_METADATA ------>|
|
||||
|<---- METADATA ---------|
|
||||
| |
|
||||
|--- START_PROFILING --->|
|
||||
|<---- ACK --------------|
|
||||
| |
|
||||
|<---- PROFILE_DATA -----| (continuous stream)
|
||||
|<---- PROFILE_DATA -----|
|
||||
|<---- PROFILE_DATA -----|
|
||||
| ... |
|
||||
```
|
||||
|
||||
### Typical Session
|
||||
|
||||
```
|
||||
1. Host connects to serial port
|
||||
2. Host sends GET_METADATA
|
||||
3. Device responds with METADATA packet
|
||||
4. Host sends START_PROFILING
|
||||
5. Device responds with ACK
|
||||
6. Device begins streaming PROFILE_DATA packets
|
||||
7. Host processes and visualizes data in real-time
|
||||
8. Host sends STOP_PROFILING when done
|
||||
9. Device responds with ACK and stops streaming
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
### CRC Mismatch
|
||||
If the host detects a CRC mismatch:
|
||||
- Log the error
|
||||
- Discard the packet
|
||||
- Continue listening for next packet
|
||||
- No retransmission (real-time streaming)
|
||||
|
||||
### Packet Loss
|
||||
- Sequence numbers not implemented (keeps protocol simple)
|
||||
- Missing data will create gaps in visualization
|
||||
- Not critical for profiling use case
|
||||
|
||||
### Buffer Overflow
|
||||
- Device sets `buffer_overflows` counter in status
|
||||
- Host should warn user
|
||||
- Options: increase baud rate, reduce instrumentation, or use sampling
|
||||
|
||||
## Performance Considerations
|
||||
|
||||
### Bandwidth Calculation
|
||||
|
||||
At 115200 baud:
|
||||
- Effective throughput: ~11.5 KB/s (115200 bits/s at 10 bits on the wire per byte, assuming 8N1 framing)
|
||||
- Profile record size: 14 bytes
|
||||
- Packet overhead: ~8 bytes per packet
|
||||
- Records per packet (typical): 20
|
||||
- Packet size: 8 (framing) + 3 (payload header) + 20 × 14 (records) = 291 bytes
|
||||
- Packets per second: ~39
|
||||
- Records per second: ~780
|
||||
|
||||
**Recommendation:** If profiling >780 function calls/sec, increase baud rate to 460800 or 921600.
|
||||
|
||||
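The same arithmetic can be repeated for other baud rates or batch sizes. This is a rough estimate only: it assumes 10 bits on the wire per byte (8N1), a link that is never idle, and whole packets per second; `records_per_second` is a hypothetical helper.

```python
RECORD_SIZE = 14        # bytes per ProfileRecord
FRAMING_OVERHEAD = 8    # header, type, length, CRC, end marker
PAYLOAD_HEADER = 3      # version + record_count


def records_per_second(baud: int, records_per_packet: int = 20, bits_per_byte: int = 10) -> int:
    """Estimate how many records per second fit on the serial link."""
    bytes_per_second = baud // bits_per_byte
    packet_size = FRAMING_OVERHEAD + PAYLOAD_HEADER + records_per_packet * RECORD_SIZE
    packets_per_second = bytes_per_second // packet_size
    return packets_per_second * records_per_packet


for baud in (115200, 460800, 921600):
    print(baud, records_per_second(baud))
```

Larger batches amortise the 11 bytes of per-packet overhead, at the cost of more latency before each packet is sent.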
### Timing Overhead
|
||||
|
||||
Instrumentation overhead per function:
|
||||
- Entry hook: ~0.5-1 μs
|
||||
- Exit hook: ~0.5-1 μs
|
||||
- Total: ~1-2 μs per function call
|
||||
|
||||
Target: <5% overhead for typical applications.
|
||||
|
||||
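The 5% target translates into a call-rate budget. A small worked calculation, assuming the upper estimate of 2 μs per instrumented call; both helpers are hypothetical.

```python
def overhead_percent(calls_per_second: float, hook_cost_us: float = 2.0) -> float:
    """CPU time spent in entry + exit hooks, as a percentage of one second."""
    return calls_per_second * hook_cost_us / 10_000


def max_calls_per_second(budget_percent: float = 5.0, hook_cost_us: float = 2.0) -> float:
    """Highest call rate that stays within the overhead budget."""
    return budget_percent * 10_000 / hook_cost_us


print(overhead_percent(780))        # overhead at the 115200-baud record rate
print(max_calls_per_second())       # call rate that reaches the 5% budget
```

Under these assumptions the serial link saturates long before the CPU budget does, so bandwidth rather than hook cost is the first limit to hit at 115200 baud.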
## Protocol Versioning
|
||||
|
||||
Current version: **0x01**
|
||||
|
||||
The `version` field in `ProfileDataPayload` allows for future protocol extensions:
|
||||
- v0x01: Current format (entry_time + duration)
|
||||
- v0x02: Future - could add ISR markers, task IDs, etc.
|
||||
- v0x03: Future - compressed format, delta encoding
|
||||
|
||||
Host should check version and handle accordingly or reject unsupported versions.
|
||||
|
||||
## Example Packet Dumps
|
||||
|
||||
### GET_METADATA Command
|
||||
```
|
||||
55 05 00 00 00 00 00 00 00 00 00 5A
|
||||
```
|
||||
|
||||
### METADATA Response
|
||||
```
|
||||
AA 55 03 1C 00 // Header, Type=METADATA, Length=28
|
||||
00 7A 03 0A               // mcu_clock_hz = 168000000 (0x0A037A00)
|
||||
40 42 0F 00 // timer_freq = 1000000
|
||||
EF BE AD DE // build_id = 0xDEADBEEF
|
||||
76 31 2E 30 2E 30 00 ... // fw_version = "v1.0.0\0..."
|
||||
XX XX // CRC16
|
||||
0A // End marker
|
||||
```
|
||||
|
||||
### PROFILE_DATA Response (2 records)
|
||||
```
|
||||
AA 55 05 1F 00 // Header, Type=PROFILE_DATA, Length=31
|
||||
01 // Version = 1
|
||||
02 00 // Record count = 2
|
||||
|
||||
// Record 1
|
||||
00 01 00 08 // func_addr = 0x08000100
|
||||
E8 03 00 00 // entry_time = 1000 μs
|
||||
D0 07 00 00 // duration = 2000 μs
|
||||
00 00 // depth = 0
|
||||
|
||||
// Record 2
|
||||
20 02 00 08 // func_addr = 0x08000220
|
||||
F4 01 00 00 // entry_time = 500 μs
|
||||
2C 01 00 00 // duration = 300 μs
|
||||
01 00 // depth = 1
|
||||
|
||||
XX XX // CRC16
|
||||
0A // End marker
|
||||
```
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
### Embedded Side
|
||||
- Use DMA for UART transmission to minimize CPU overhead
|
||||
- Implement ring buffer with power-of-2 size for efficient modulo operations
|
||||
- Send packets in background task or idle hook
|
||||
- Consider double-buffering: one buffer for capturing, one for transmitting
|
||||
|
||||
### Host Side
|
||||
- Use state machine for packet parsing (don't assume atomicity)
|
||||
- Handle partial packets gracefully
|
||||
- Verify CRC before processing payload
|
||||
- Use background thread for serial reading to not block UI
|
||||
|
||||
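A byte-at-a-time state machine along the lines of the first host-side note might look like the sketch below. It is not the `SerialReader` implementation: the `PacketParser` class and its states are hypothetical, the CRC is assumed to be sent little-endian, and a corrupted length field simply causes the parser to consume that many bytes before resynchronising.

```python
from enum import Enum, auto


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC16-CCITT: polynomial 0x1021, initial value 0xFFFF."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc


class State(Enum):
    SYNC1 = auto()   # waiting for 0xAA
    SYNC2 = auto()   # waiting for 0x55
    HEAD = auto()    # type (1 byte) + length (2 bytes)
    BODY = auto()    # payload + CRC (2 bytes) + end marker (1 byte)


class PacketParser:
    def __init__(self):
        self.state = State.SYNC1
        self.frame = bytearray()
        self.remaining = 0
        self.errors = 0

    def feed(self, chunk: bytes):
        """Consume any number of bytes; return the packets completed so far."""
        packets = []
        for byte in chunk:
            if self.state is State.SYNC1:
                if byte == 0xAA:
                    self.frame = bytearray([byte])
                    self.state = State.SYNC2
            elif self.state is State.SYNC2:
                if byte == 0x55:
                    self.frame.append(byte)
                    self.remaining = 3
                    self.state = State.HEAD
                elif byte != 0xAA:          # a repeated 0xAA keeps us in SYNC2
                    self.state = State.SYNC1
            else:
                self.frame.append(byte)
                self.remaining -= 1
                if self.remaining:
                    continue
                if self.state is State.HEAD:
                    length = self.frame[3] | (self.frame[4] << 8)
                    self.remaining = length + 3
                    self.state = State.BODY
                else:
                    packet = self._finish()
                    if packet is not None:
                        packets.append(packet)
                    self.state = State.SYNC1
        return packets

    def _finish(self):
        frame = bytes(self.frame)
        body, end = frame[:-3], frame[-1]
        crc = frame[-3] | (frame[-2] << 8)
        if end != 0x0A or crc != crc16_ccitt(body):
            self.errors += 1                # discard the packet and keep listening
            return None
        return body[2], body[5:]            # (type, payload)
```

Because all state lives in the parser object, `feed()` can be called from the background reader thread with whatever the serial port returns, including half a packet.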
## References
|
||||
|
||||
- CRC16-CCITT: Polynomial 0x1021, initial value 0xFFFF
|
||||
- Little-endian byte order for multi-byte integers
|
||||
- GCC instrumentation: `__cyg_profile_func_enter/exit`
|
||||