# MiniProfiler Communication Protocol

## Overview

MiniProfiler uses a binary command-response protocol over UART/Serial communication at 115200 baud (configurable).

- **Command packets**: Host → Embedded device
- **Response packets**: Embedded device → Host

## Command Packet Format

Commands are sent from the host to the embedded device.

### Structure

```
┌────────┬─────────┬─────────────┬──────────┬──────────┐
│ Header │ Command │ Payload Len │ Payload  │ Checksum │
│  (1B)  │  (1B)   │    (1B)     │   (8B)   │   (1B)   │
└────────┴─────────┴─────────────┴──────────┴──────────┘
Total: 12 bytes
```

### Fields

| Field | Size | Value | Description |
|-------|------|-------|-------------|
| Header | 1 byte | `0x55` | Packet start marker |
| Command | 1 byte | See table below | Command code |
| Payload Length | 1 byte | 0-8 | Actual payload size |
| Payload | 8 bytes | Variable | Command parameters (padded with 0x00) |
| Checksum | 1 byte | Sum of all preceding bytes & 0xFF | Simple checksum |

### Command Codes

| Command | Code | Description | Payload |
|---------|------|-------------|---------|
| START_PROFILING | `0x01` | Start profiling | None |
| STOP_PROFILING | `0x02` | Stop profiling | None |
| GET_STATUS | `0x03` | Request status | None |
| RESET_BUFFERS | `0x04` | Clear profiling buffers | None |
| GET_METADATA | `0x05` | Request device metadata | None |
| SET_CONFIG | `0x06` | Configure profiler | Config bytes (reserved) |

### Example

Start profiling command:
```
55 01 00 00 00 00 00 00 00 00 00 56
│  │  │  └─────────────────────┘ │
│  │  │       Payload (8B)       │
│  │  └── Payload Length (0)     │
│  └── Command (START_PROFILING) │
└── Header (0x55)                └── Checksum
```

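The command layout and checksum rule above can be sketched on the host side as follows. This is a minimal illustration in Python, not the project's actual host code; `build_command` and the constant names are hypothetical, and the checksum is assumed to cover the 11 bytes that precede it, which is what the example packet implies (`0x55 + 0x01 = 0x56`).

```python
HEADER = 0x55          # command packet start marker
PAYLOAD_SIZE = 8       # fixed width of the payload field
START_PROFILING = 0x01
GET_METADATA = 0x05


def build_command(code: int, payload: bytes = b"") -> bytes:
    """Return the 12-byte command packet for `code` and an optional payload."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload is limited to 8 bytes")
    # Header, command code, actual payload length, then the zero-padded payload.
    body = bytes([HEADER, code, len(payload)]) + payload.ljust(PAYLOAD_SIZE, b"\x00")
    # Checksum: sum of the 11 preceding bytes, truncated to one byte.
    checksum = sum(body) & 0xFF
    return body + bytes([checksum])
```

Calling `build_command(START_PROFILING)` reproduces the 12 bytes shown in the example above.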
## Response Packet Format

Responses are sent from the embedded device to the host.

### Structure

```
┌─────────┬──────┬──────────┬──────────┬────────┬─────┐
│ Header  │ Type │ Length   │ Payload  │ CRC    │ End │
│ (2B)    │ (1B) │ (2B)     │ (N bytes)│ (2B)   │(1B) │
└─────────┴──────┴──────────┴──────────┴────────┴─────┘
Total: 8 + N bytes
```

### Fields

| Field | Size | Value | Description |
|-------|------|-------|-------------|
| Header | 2 bytes | `0xAA55` | Packet start marker (bytes `AA 55` on the wire, as in the example dumps) |
| Type | 1 byte | See table below | Response type |
| Length | 2 bytes | 0-65535 | Payload size (little-endian) |
| Payload | Variable | Depends on type | Response data |
| CRC16 | 2 bytes | CRC16-CCITT | Computed over header, type, length, and payload |
| End | 1 byte | `0x0A` | Packet end marker (newline) |

### Response Types

| Type | Code | Description | Payload Format |
|------|------|-------------|----------------|
| ACK | `0x01` | Command acknowledged | None |
| NACK | `0x02` | Command failed | None |
| METADATA | `0x03` | Device metadata | See Metadata Payload |
| STATUS | `0x04` | Device status | See Status Payload |
| PROFILE_DATA | `0x05` | Profiling records | See Profile Data Payload |

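To make the response framing concrete, here is a hedged Python sketch that frames and validates a single complete packet. Several details are assumptions rather than facts from this document: the header is taken to be the bytes `AA 55` as in the example dumps, the CRC is assumed to be transmitted little-endian like the other multi-byte integers, and the helper names `build_response` and `parse_response` are hypothetical. The CRC routine uses polynomial `0x1021` with initial value `0xFFFF`, as listed in the References.

```python
import struct

RESPONSE_HEADER = b"\xAA\x55"   # byte order taken from the example dumps
END_MARKER = 0x0A


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC16-CCITT: polynomial 0x1021, initial value 0xFFFF, no reflection."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def build_response(resp_type: int, payload: bytes = b"") -> bytes:
    """Frame a payload the way the device would (handy for host-side tests)."""
    head = RESPONSE_HEADER + struct.pack("<BH", resp_type, len(payload))
    crc = crc16_ccitt(head + payload)
    # Assumption: the CRC is sent little-endian like the length field.
    return head + payload + struct.pack("<H", crc) + bytes([END_MARKER])


def parse_response(frame: bytes) -> tuple:
    """Validate one complete frame and return (type, payload)."""
    if len(frame) < 8 or frame[:2] != RESPONSE_HEADER:
        raise ValueError("bad header")
    resp_type, length = struct.unpack_from("<BH", frame, 2)
    if len(frame) != 8 + length:
        raise ValueError("length mismatch")
    if frame[-1] != END_MARKER:
        raise ValueError("bad end marker")
    (received_crc,) = struct.unpack_from("<H", frame, 5 + length)
    if received_crc != crc16_ccitt(frame[:5 + length]):
        raise ValueError("CRC mismatch")
    return resp_type, frame[5:5 + length]
```

`parse_response` expects exactly one whole frame; data arriving in arbitrary chunks from a serial port needs an incremental parser on top of it.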
## Payload Formats

### Metadata Payload (28 bytes)

Sent in response to `GET_METADATA` command or automatically on startup.

```c
struct MetadataPayload {
    uint32_t mcu_clock_hz;    // MCU clock frequency in Hz
    uint32_t timer_freq;      // Profiling timer frequency in Hz
    uint32_t elf_build_id;    // CRC32 of .text section for version matching
    char     fw_version[16];  // Firmware version string (null-terminated)
} __attribute__((packed));
```

**Example:**
- MCU Clock: 168,000,000 Hz (168 MHz STM32F4)
- Timer Freq: 1,000,000 Hz (1 MHz for microsecond precision)
- Build ID: 0xDEADBEEF
- FW Version: "v1.0.0"

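A host can decode this payload with Python's `struct` module. The sketch below is illustrative only; `Metadata` and `parse_metadata` are hypothetical names. The `<` prefix selects little-endian order with no alignment padding, which mirrors the packed C struct.

```python
import struct
from typing import NamedTuple

# Three little-endian uint32 fields followed by a 16-byte string: 28 bytes.
METADATA_FORMAT = "<III16s"


class Metadata(NamedTuple):
    mcu_clock_hz: int
    timer_freq: int
    elf_build_id: int
    fw_version: str


def parse_metadata(payload: bytes) -> Metadata:
    """Decode a 28-byte metadata payload."""
    if len(payload) != struct.calcsize(METADATA_FORMAT):
        raise ValueError("metadata payload must be 28 bytes")
    clock, timer, build_id, raw_version = struct.unpack(METADATA_FORMAT, payload)
    # The version string is null-terminated; ignore everything after the first 0x00.
    version = raw_version.split(b"\x00", 1)[0].decode("ascii", errors="replace")
    return Metadata(clock, timer, build_id, version)
```

Packing the example values with the same format string gives `00 7A 03 0A` for the 168 MHz clock and `40 42 0F 00` for the 1 MHz timer.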
### Status Payload (10 bytes)

Sent in response to `GET_STATUS` command.

```c
struct StatusPayload {
    uint8_t  is_profiling;          // 1 if profiling active, 0 otherwise
    uint32_t buffer_overflows;      // Number of buffer overflow events
    uint32_t records_captured;      // Total records captured
    uint8_t  buffer_usage_percent;  // Current buffer usage (0-100)
} __attribute__((packed));
```

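Because the struct is packed, the 32-bit fields sit at unaligned offsets (1 and 5). A host-side decoder therefore has to disable alignment padding. The following is a minimal Python sketch under that assumption; `Status` and `parse_status` are hypothetical names.

```python
import struct
from typing import NamedTuple

# "<" = little-endian with no padding, so the layout is 1 + 4 + 4 + 1 = 10 bytes.
STATUS_FORMAT = "<BIIB"


class Status(NamedTuple):
    is_profiling: bool
    buffer_overflows: int
    records_captured: int
    buffer_usage_percent: int


def parse_status(payload: bytes) -> Status:
    """Decode a 10-byte status payload."""
    if len(payload) != struct.calcsize(STATUS_FORMAT):
        raise ValueError("status payload must be 10 bytes")
    active, overflows, captured, usage = struct.unpack(STATUS_FORMAT, payload)
    return Status(bool(active), overflows, captured, usage)
```

A non-zero `buffer_overflows` value is the signal the host can use to warn the user that records were dropped on the device.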
### Profile Data Payload (Variable)

Sent automatically during profiling or in response to data requests.

```c
struct ProfileDataPayload {
    uint8_t       version;       // Protocol version (0x01)
    uint16_t      record_count;  // Number of records in this packet
    ProfileRecord records[];     // Array of profile records
} __attribute__((packed));
```

Each `ProfileRecord` is 14 bytes:

```c
struct ProfileRecord {
    uint32_t func_addr;    // Function address (from instrumentation)
    uint32_t entry_time;   // Entry timestamp in microseconds
    uint32_t duration_us;  // Function duration in microseconds
    uint16_t depth;        // Call stack depth (0 = root)
} __attribute__((packed));
```

**Field Details:**
- `func_addr`: Return address from `__builtin_return_address(0)` in instrumentation hook
- `entry_time`: Microsecond timestamp when the function was entered (the 32-bit counter wraps after 2^32 μs, about 71.6 minutes)
- `duration_us`: Time spent in the function, including time spent in the functions it calls
- `depth`: Call stack depth (0 for main, 1 for functions called by main, etc.)

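The payload above is a 3-byte header followed by `record_count` fixed-size records, so it can be decoded in one pass. The sketch below is a hedged Python illustration; `parse_profile_data` and `elapsed_us` are hypothetical helpers, and rejecting unknown versions is one possible policy rather than documented behaviour. `elapsed_us` shows one way to compare timestamps across the 32-bit wraparound.

```python
import struct

HEADER_FORMAT = "<BH"     # version, record_count (3 bytes)
RECORD_FORMAT = "<IIIH"   # func_addr, entry_time, duration_us, depth (14 bytes)
SUPPORTED_VERSION = 0x01


def parse_profile_data(payload: bytes) -> list:
    """Decode a PROFILE_DATA payload into a list of record dictionaries."""
    header_size = struct.calcsize(HEADER_FORMAT)
    record_size = struct.calcsize(RECORD_FORMAT)
    if len(payload) < header_size:
        raise ValueError("payload shorter than its header")
    version, count = struct.unpack_from(HEADER_FORMAT, payload, 0)
    if version != SUPPORTED_VERSION:
        raise ValueError(f"unsupported payload version {version:#04x}")
    if len(payload) != header_size + count * record_size:
        raise ValueError("record_count does not match payload size")
    records = []
    for func_addr, entry_time, duration_us, depth in struct.iter_unpack(
        RECORD_FORMAT, payload[header_size:]
    ):
        records.append({
            "func_addr": func_addr,
            "entry_time": entry_time,
            "duration_us": duration_us,
            "depth": depth,
        })
    return records


def elapsed_us(earlier: int, later: int) -> int:
    """Difference between two 32-bit timestamps, valid across one wraparound."""
    return (later - earlier) & 0xFFFFFFFF
```

The modulo-2^32 subtraction in `elapsed_us` stays correct as long as the two timestamps are less than one full counter period apart.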
## Communication Flow

### Initial Connection

```
Host                      Device
 |                          |
 |--- GET_METADATA -------->|
 |<---- METADATA -----------|
 |                          |
 |--- START_PROFILING ----->|
 |<---- ACK ----------------|
 |                          |
 |<---- PROFILE_DATA -------|  (continuous stream)
 |<---- PROFILE_DATA -------|
 |<---- PROFILE_DATA -------|
 |          ...             |
```

### Typical Session

```
1. Host connects to serial port
2. Host sends GET_METADATA
3. Device responds with METADATA packet
4. Host sends START_PROFILING
5. Device responds with ACK
6. Device begins streaming PROFILE_DATA packets
7. Host processes and visualizes data in real-time
8. Host sends STOP_PROFILING when done
9. Device responds with ACK and stops streaming
```

## Error Handling

### CRC Mismatch
If the host detects a CRC mismatch:
- Log the error
- Discard the packet
- Continue listening for the next packet
- No retransmission (real-time streaming)

### Packet Loss
- Sequence numbers are not implemented (keeps the protocol simple)
- Missing data will create gaps in the visualization
- Not critical for the profiling use case

### Buffer Overflow
- Device increments the `buffer_overflows` counter reported in the status payload
- Host should warn the user
- Options: increase baud rate, reduce instrumentation, or use sampling

## Performance Considerations

### Bandwidth Calculation

At 115200 baud:
- Effective throughput: ~11.5 KB/s (assuming 10 bits on the wire per byte, as with 8N1 framing)
- Profile record size: 14 bytes
- Packet framing overhead: 8 bytes per packet, plus the 3-byte `ProfileDataPayload` header
- Records per packet (typical): 20
- Packet size: 8 + 3 + 280 = 291 bytes
- Packets per second: ~39
- Records per second: ~780

**Recommendation:** If profiling >780 function calls/sec, increase baud rate to 460800 or 921600.

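The same arithmetic can be reproduced for other baud rates. This is a small Python sketch, assuming 10 bits on the wire per byte (8N1 framing) and whole packets only; `max_records_per_second` is a hypothetical helper name.

```python
FRAMING_BYTES = 8          # header, type, length, CRC, end marker
PAYLOAD_HEADER_BYTES = 3   # version + record_count
RECORD_BYTES = 14          # one ProfileRecord


def max_records_per_second(baud: int, records_per_packet: int = 20,
                           bits_per_byte: int = 10) -> int:
    """Upper bound on records per second the link can carry at `baud`."""
    bytes_per_second = baud // bits_per_byte
    packet_size = FRAMING_BYTES + PAYLOAD_HEADER_BYTES + records_per_packet * RECORD_BYTES
    # Only complete packets count, so round down.
    packets_per_second = bytes_per_second // packet_size
    return packets_per_second * records_per_packet
```

With the defaults this gives 780 records per second at 115200 baud, 3160 at 460800, and 6320 at 921600.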
### Timing Overhead

Instrumentation overhead per function:
- Entry hook: ~0.5-1 μs
- Exit hook: ~0.5-1 μs
- Total: ~1-2 μs per function call

Target: <5% overhead for typical applications.

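To relate the per-call cost to the 5% target, the overhead can be estimated from the call rate. The Python sketch below is a rough model that assumes the hooks cost a fixed time per call and nothing else contributes; both function names are hypothetical.

```python
def instrumentation_overhead_percent(calls_per_second: float,
                                     per_call_us: float = 2.0) -> float:
    """CPU time spent in the entry and exit hooks, as a percentage."""
    return calls_per_second * per_call_us / 1_000_000 * 100


def max_call_rate(budget_percent: float = 5.0, per_call_us: float = 2.0) -> float:
    """Highest call rate that keeps hook overhead within the given budget."""
    return budget_percent / 100 * 1_000_000 / per_call_us
```

At the 2 μs upper bound, 25,000 instrumented calls per second consume about 5% of CPU time. That is well above what the serial link can report at 115200 baud, so bandwidth is likely to be the tighter limit.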
## Protocol Versioning

Current version: **0x01**

The `version` field in `ProfileDataPayload` allows for future protocol extensions:
- v0x01: Current format (entry_time + duration)
- v0x02: Future - could add ISR markers, task IDs, etc.
- v0x03: Future - compressed format, delta encoding

Host should check version and handle accordingly or reject unsupported versions.

## Example Packet Dumps

### GET_METADATA Command
```
55 05 00 00 00 00 00 00 00 00 00 5A
```

### METADATA Response
```
AA 55 03 1C 00              // Header, Type=METADATA, Length=28
00 7A 03 0A                 // mcu_clock_hz = 168000000
40 42 0F 00                 // timer_freq = 1000000
EF BE AD DE                 // elf_build_id = 0xDEADBEEF
76 31 2E 30 2E 30 00 ...    // fw_version = "v1.0.0\0..."
XX XX                       // CRC16
0A                          // End marker
```

### PROFILE_DATA Response (2 records)
```
AA 55 05 1F 00              // Header, Type=PROFILE_DATA, Length=31
01                          // Version = 1
02 00                       // Record count = 2

// Record 1
00 01 00 08                 // func_addr = 0x08000100
E8 03 00 00                 // entry_time = 1000 μs
D0 07 00 00                 // duration = 2000 μs
00 00                       // depth = 0

// Record 2
20 02 00 08                 // func_addr = 0x08000220
F4 01 00 00                 // entry_time = 500 μs
2C 01 00 00                 // duration = 300 μs
01 00                       // depth = 1

XX XX                       // CRC16
0A                          // End marker
```

## Implementation Notes

### Embedded Side
- Use DMA for UART transmission to minimize CPU overhead
- Implement ring buffer with power-of-2 size for efficient modulo operations
- Send packets in background task or idle hook
- Consider double-buffering: one buffer for capturing, one for transmitting

### Host Side
- Use state machine for packet parsing (don't assume atomicity)
- Handle partial packets gracefully
- Verify CRC before processing payload
- Use background thread for serial reading to not block UI

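The host-side points above can be sketched as an incremental parser that accepts arbitrary chunks from the serial port and yields only complete, CRC-checked frames. This is one possible Python design, not the project's implementation: `ResponseParser` is a hypothetical name, the header bytes `AA 55` follow the example dumps, and the CRC is assumed to be little-endian. After a failed check it drops the suspected header and rescans, which matches the "discard and keep listening" policy.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC16-CCITT: polynomial 0x1021, initial value 0xFFFF."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc


class ResponseParser:
    """Feed it raw chunks; it returns (type, payload) for each valid frame."""

    HEADER = b"\xAA\x55"

    def __init__(self):
        self._buf = bytearray()
        self.crc_errors = 0   # frames discarded because of a bad CRC or end marker

    def feed(self, chunk: bytes) -> list:
        self._buf += chunk
        frames = []
        while True:
            start = self._buf.find(self.HEADER)
            if start < 0:
                # No header yet: keep a trailing 0xAA in case the header is split.
                keep = 1 if self._buf.endswith(b"\xAA") else 0
                del self._buf[:len(self._buf) - keep]
                return frames
            del self._buf[:start]              # drop noise before the header
            if len(self._buf) < 5:
                return frames                  # type/length not received yet
            length = int.from_bytes(self._buf[3:5], "little")
            total = 8 + length
            if len(self._buf) < total:
                return frames                  # wait for the rest of the frame
            frame = bytes(self._buf[:total])
            crc_ok = int.from_bytes(frame[-3:-1], "little") == crc16_ccitt(frame[:-3])
            if crc_ok and frame[-1] == 0x0A:
                frames.append((frame[2], frame[5:-3]))
                del self._buf[:total]
            else:
                self.crc_errors += 1
                del self._buf[:2]              # skip this header and resynchronise
```

A background reader thread would call `feed()` with whatever the serial port returns and pass the resulting frames to the UI through a queue. One limitation of this sketch: a corrupted length field can make the parser wait for up to 65535 bytes, so a real implementation may want to cap the accepted payload size.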
## References

- CRC16-CCITT: Polynomial 0x1021, initial value 0xFFFF
- Little-endian byte order for multi-byte integers
- GCC instrumentation: `__cyg_profile_func_enter/exit`