Data Loggers

The loggers package provides comprehensive data collection, recording, and analysis capabilities for multi-agent simulations. These loggers capture simulation dynamics, performance metrics, and behavioral patterns with configurable formats and frequencies.

Overview

SwarmSim’s logging framework enables systematic data collection from complex multi-agent simulations, supporting real-time monitoring, post-simulation analysis, and reproducible research workflows. The modular design allows for specialized logging behaviors while maintaining consistent data formats.

Key Features

  • Multi-Format Output: Support for CSV, NumPy, MATLAB, and custom formats

  • Configurable Frequency: Adaptive logging intervals and selective data capture

  • Real-Time Monitoring: Live performance tracking and system diagnostics

  • Memory Efficient: Streaming writes and chunked processing for large datasets

  • Extensible Architecture: Custom logger development for specialized applications

  • Automated Organization: Timestamped directories and structured file naming

Module Reference

Base Logger Interface

The abstract foundation for all logging implementations, defining the core interface and common functionality.

class swarmsim.Loggers.base_logger.BaseLogger(populations, environment, config_path)[source]

Bases: Logger

Comprehensive base logger for multi-agent simulation data collection and analysis.

This logger provides the foundational logging infrastructure for recording simulation data, managing file outputs, timing execution, and providing extensible hooks for specialized logging behaviors. It handles multiple output formats, configurable logging frequencies, and automatic file organization with timestamped naming.

The BaseLogger serves as the parent class for all specialized loggers in the framework, providing common functionality while allowing customization through method overriding. It automatically manages file creation, data serialization, and experiment metadata.

Parameters:
  • populations (list of Population) – List of population objects whose data will be logged throughout the simulation. Each population provides state information and dynamics data.

  • environment (Environment) – Environment object containing spatial and contextual information for the simulation. Provides environmental state and parameters for logging.

  • config_path (str) – Path to the YAML configuration file containing logger parameters and settings.

Variables:
  • config (dict) – Complete configuration dictionary loaded from the YAML file.

  • logger_config (dict) – Logger-specific configuration subset extracted from the main config.

  • activate (bool) – Flag controlling whether logging is active. If False, logging operations are skipped.

  • date (str) – Human-readable timestamp of logger initialization for metadata.

  • name (str) – Unique identifier for the logging session, combining timestamp and config name.

  • log_freq (int) – Frequency (in simulation steps) for printing progress information to console. Set to 0 to disable console output.

  • save_freq (int) – Frequency (in simulation steps) for saving data to files. Set to 0 to disable file saving.

  • save_data_freq (int) – Frequency for saving raw data arrays (positions, states, etc.).

  • save_global_data_freq (int) – Frequency for saving accumulated global simulation data.

  • log_path (str) – Base directory path where all log files will be stored.

  • comment_enable (bool) – Whether to prompt for and include user comments in the log files.

  • populations (list of Population) – Reference to the populations being logged.

  • environment (Environment) – Reference to the simulation environment.

  • log_name_csv (str) – Full path to the CSV output file for tabular data.

  • log_name_txt (str) – Full path to the human-readable text output file.

  • log_name_npz (str) – Full path to the compressed NumPy data file.

  • log_name_mat (str) – Full path to the MATLAB-compatible data file.

  • start (float or None) – Timestamp when logging session started.

  • end (float or None) – Timestamp when logging session ended.

  • step_count (int or None) – Current simulation step counter.

  • experiment_count (int or None) – Counter for multiple experiment runs.

  • done (bool or None) – Flag indicating if the simulation should terminate early.

  • current_info (dict or None) – Dictionary containing data for the current simulation timestep.

  • global_info (dict or None) – Dictionary containing accumulated data across all timesteps.

Config Requirements:

The YAML configuration file should define the following parameters under the BaseLogger class section. All of them are optional and fall back to the defaults listed:

  • activate : bool, optional

    Enable/disable logging. Default: True

  • log_freq : int, optional

    Console output frequency (0 = never). Default: 0

  • save_freq : int, optional

    File save frequency (0 = never). Default: 1

  • save_data_freq : int, optional

    Raw data save frequency. Default: 0

  • save_global_data_freq : int, optional

    Global data save frequency. Default: 0

  • log_path : str, optional

    Output directory path. Default: "./logs"

  • log_name : str, optional

    Log file name suffix. Default: ""

  • comment_enable : bool, optional

    Enable user comments. Default: False
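As a rough illustration of how these optional parameters resolve, the sketch below overlays a user-supplied section on the documented defaults. The helper name resolve_logger_config and the plain dict standing in for the parsed YAML file are assumptions made for this example; they are not part of SwarmSim.

```python
# Documented defaults for the BaseLogger section (copied from the list above).
DEFAULTS = {
    "activate": True,
    "log_freq": 0,
    "save_freq": 1,
    "save_data_freq": 0,
    "save_global_data_freq": 0,
    "log_path": "./logs",
    "log_name": "",
    "comment_enable": False,
}


def resolve_logger_config(config, section="BaseLogger"):
    """Hypothetical helper: overlay a config section on the documented defaults."""
    user_section = config.get(section) or {}
    # Values given by the user win; everything else keeps its default.
    return {**DEFAULTS, **user_section}


# A dict standing in for the parsed YAML file (assumption for this sketch).
parsed = {"BaseLogger": {"log_freq": 100, "save_freq": 10}}
settings = resolve_logger_config(parsed)
print(settings["log_freq"], settings["save_freq"], settings["log_path"])
```

Only the keys present in the section are overridden, so a minimal configuration can stay short.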

Notes

File Organization:

The logger creates the following directory structure:

```
log_path/
└── YYYYMMDD_HHMMSS_log_name/
    ├── YYYYMMDD_HHMMSS_log_name.csv    # Tabular data
    ├── YYYYMMDD_HHMMSS_log_name.txt    # Human-readable
    ├── YYYYMMDD_HHMMSS_log_name.npz    # NumPy arrays
    └── YYYYMMDD_HHMMSS_log_name.mat    # MATLAB format
```
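The naming scheme can be sketched with the standard library alone. The helper build_log_paths below is hypothetical and only mirrors the layout described in this section; how the real logger handles an empty log_name is not specified here, so the sketch does not treat that case specially.

```python
import os
from datetime import datetime


def build_log_paths(log_path, log_name, now=None):
    """Hypothetical helper: derive the per-format file paths of one session."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    # Session name combines the timestamp with the configured suffix.
    name = f"{stamp}_{log_name}"
    session_dir = os.path.join(log_path, name)
    return {
        ext: os.path.join(session_dir, f"{name}.{ext}")
        for ext in ("csv", "txt", "npz", "mat")
    }


paths = build_log_paths("./logs", "base_experiment", datetime(2024, 1, 31, 9, 30, 0))
print(paths["csv"])
```

Passing an explicit timestamp, as in the usage line, keeps the resulting names reproducible in tests.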

Logging Workflow:

  1. Initialization: Create directories, initialize files

  2. Start Experiment: Begin timing and setup data structures

  3. Step Logging: Record data at each simulation timestep

  4. End Experiment: Finalize files and compute summary statistics

Extensibility:

Subclasses can override key methods:

  • log(): Customize what data is collected each step

  • log_internal_data(): Modify data processing and storage

  • start_experiment(): Add initialization procedures

  • end_experiment(): Add finalization procedures

Performance Considerations:

  • Data is accumulated in memory between save operations

  • Large simulations should use appropriate save_freq values

  • Multiple output formats can be disabled for performance

  • File I/O is batched for efficiency
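The trade-off between memory use and I/O can be illustrated with a small buffered writer. This is a minimal sketch under stated assumptions: BufferedCsvWriter is a hypothetical class written for this example, not the BaseLogger implementation. It keeps rows in memory and appends them to a CSV file every save_freq steps.

```python
import csv
import os
import tempfile


class BufferedCsvWriter:
    """Hypothetical buffer: hold rows in memory and append them to disk in batches."""

    def __init__(self, path, fieldnames, save_freq):
        self.path = path
        self.fieldnames = fieldnames
        self.save_freq = save_freq
        self.buffer = []
        # Write the header once so that later flushes can simply append.
        with open(path, "w", newline="") as f:
            csv.DictWriter(f, fieldnames=fieldnames).writeheader()

    def add(self, step, row):
        self.buffer.append(row)
        # Flush only every save_freq steps; 0 disables periodic saving.
        if self.save_freq and step % self.save_freq == 0:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        with open(self.path, "a", newline="") as f:
            csv.DictWriter(f, fieldnames=self.fieldnames).writerows(self.buffer)
        self.buffer.clear()


path = os.path.join(tempfile.mkdtemp(), "demo.csv")
writer = BufferedCsvWriter(path, ["step", "value"], save_freq=10)
for step in range(1, 26):
    writer.add(step, {"step": step, "value": step * 0.5})
pending = len(writer.buffer)  # rows still held in memory after the last step
writer.flush()                # final flush, as a close() method would do
```

A larger save_freq means fewer file operations but more rows held in memory between flushes, which is the balance the notes above refer to.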

Examples

Basic Configuration:

BaseLogger:
    activate: true
    log_freq: 100
    save_freq: 10
    log_path: "./simulation_logs"
    log_name: "base_experiment"
    comment_enable: false

__init__(populations, environment, config_path)[source]

Initialize the Logger base class.

Subclasses should call this constructor and then initialize their specific logging infrastructure (files, databases, network connections, etc.).

reset()[source]

Reset the logger at the beginning of the simulation. Checks whether the logger is active, then resets the step counter and the start time.

Returns:

activate (bool) – Flag indicating whether the logger is active.

log(data=None)[source]

Define the information to log at each step.

Parameters:

data (dict, optional) – Dictionary of the form {variable_name: value} containing the values to log.

Returns:

done (bool) – Flag to truncate the simulation early. Default: False.

Notes

The YAML configuration file should contain a section named after the logger you are creating. By default, the logger does not truncate the episode early. See add_data in Utils/logger_utils.py for a quick way to add variables to log.

close(data=None)[source]

Store the final step and end-of-experiment information, then close the logger.

Parameters:

data (dict, optional) – Dictionary of the form {variable_name: value} containing the values to log.

Returns:

activate (bool) – Flag indicating whether the logger is active.

log_external_data(data, save_mode=['npz', 'mat'])[source]
log_internal_data(save_mode=['txt', 'print'])[source]
output_data()[source]

Standard Logger Implementation

The abstract interface that every logger implements, defining the reset(), log(), and close() methods.

class swarmsim.Loggers.logger.Logger[source]

Bases: ABC

Abstract base class for data logging in multi-agent simulations.

This class defines the interface for recording simulation data, including agent states, environment information, and custom metrics. Loggers can write data to various formats (CSV, NPZ, HDF5, etc.) and provide mechanisms for early simulation termination.

Notes

Subclasses must implement the abstract methods:

  • reset(): Initialize logging for a new simulation run

  • log(): Record data at each timestep and return termination flag

  • close(): Finalize logging and close files/connections

The logger operates in the simulation loop and can influence simulation control by returning a termination flag from the log() method.

__init__()[source]

Initialize the Logger base class.

Subclasses should call this constructor and then initialize their specific logging infrastructure (files, databases, network connections, etc.).

abstractmethod reset()[source]

Initialize or reset the logger for a new simulation run.

This method should prepare the logger for a new simulation by clearing previous data, creating new files, or resetting internal state. It is called before the simulation loop begins.

Notes

Implementations should:

  • Clear any accumulated data from previous runs

  • Create new output files or database entries

  • Initialize timestamps and counters

  • Set up any required data structures

abstractmethod log(data=None)[source]

Record simulation data at the current timestep.

This method is called at each simulation timestep to record relevant data. It can log agent positions, velocities, environment state, performance metrics, or any other simulation data. The method can also signal early termination.

Parameters:

data (dict or None, optional) – Dictionary containing custom data to log, with format {variable_name: value}. If None, the logger should record default simulation data. Default is None.

Returns:

bool – Flag indicating whether the simulation should terminate early. If True, the simulation loop will exit before reaching the specified end time.

Notes

Implementations should:

  • Record current simulation state (time, agent positions, etc.)

  • Process and store custom data if provided

  • Update any running calculations (averages, statistics, etc.)

  • Check termination conditions (convergence, time limits, etc.)

  • Return True only if early termination is desired

The logger has access to all simulation components and can extract data from populations, environment, controllers, and interactions.

abstractmethod close()[source]

Finalize logging and clean up resources.

This method is called at the end of a simulation to properly close files, save final data, and clean up any resources used by the logger. It should ensure all data is safely stored and accessible for analysis.

Notes

Implementations should:

  • Close any open files or database connections

  • Save accumulated data to persistent storage

  • Write metadata (simulation parameters, timing info, etc.)

  • Compress or archive data if appropriate

  • Clean up temporary files or memory structures

  • Print summary information if desired

This method is called even if the simulation terminates early due to the logger returning True from the log() method.
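The contract described above can be sketched as follows. To keep the example runnable without SwarmSim, it defines a stand-in Logger ABC that mirrors the documented interface. ThresholdLogger and the toy loop are hypothetical and serve only to show reset(), log(), and close() working together, including early termination.

```python
from abc import ABC, abstractmethod


class Logger(ABC):
    """Stand-in mirroring the documented interface (not the SwarmSim class)."""

    @abstractmethod
    def reset(self): ...

    @abstractmethod
    def log(self, data=None): ...

    @abstractmethod
    def close(self): ...


class ThresholdLogger(Logger):
    """Hypothetical logger: records values and requests termination below a threshold."""

    def __init__(self, threshold):
        super().__init__()
        self.threshold = threshold
        self.records = []
        self.closed = False

    def reset(self):
        # Clear data accumulated in previous runs.
        self.records = []
        self.closed = False

    def log(self, data=None):
        data = data or {}
        self.records.append(dict(data))
        # Return True only when early termination is desired.
        return data.get("error", float("inf")) < self.threshold

    def close(self):
        # Reached even when the loop exits early.
        self.closed = True


logger = ThresholdLogger(threshold=0.1)
logger.reset()
for step in range(100):
    error = 1.0 / (step + 1)  # toy quantity that shrinks over time
    if logger.log({"step": step, "error": error}):
        break
logger.close()
print(len(logger.records), logger.records[-1]["step"])
```

Keeping the termination decision inside log() lets the simulation loop stay generic: it only needs to check the returned flag.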

Features

  • Multi-Population Support: Handle multiple agent populations simultaneously

  • Flexible Output Formats: CSV, NumPy, MATLAB, and HDF5 support

  • Real-Time Analytics: Live computation of statistical measures

  • Memory Optimization: Efficient handling of large-scale simulations

Position Logger

Specialized logger focused on trajectory tracking and spatial analysis.

class swarmsim.Loggers.position_logger.PositionLogger(populations, environment, config_path)[source]

Bases: BaseLogger

Position-based data logger for tracking agent spatial dynamics.

This logger extends the BaseLogger to capture and record detailed positional information about agent populations throughout the simulation. It automatically logs agent positions, computes position-based statistics, and tracks control inputs when available.

The logger captures position data at regular intervals and saves it in multiple formats (CSV, NPZ, MAT) for analysis and visualization.

Parameters:
  • populations (list of Population) – List of population objects whose positions will be logged.

  • environment (Environment) – Environment object containing spatial context and boundaries.

  • config_path (str) – Path to the YAML configuration file containing logger parameters.

Variables:
  • populations (list of Population) – Population objects being monitored for position data.

  • environment (Environment) – Environment context for spatial logging, inherited from BaseLogger.

  • global_info (dict) – Accumulated data structure for position information across timesteps.

  • step_count (int) – Current simulation step counter, inherited from BaseLogger.

  • save_freq (int) – Frequency (in steps) for saving position data, inherited from BaseLogger.

Config Requirements:
  • The YAML configuration file must contain logger parameters under the class section

  • PositionLogger (dict) – Configuration section for the position logger:

    • activate : bool, optional Enable/disable logging. Default: True

    • log_freq : int, optional Print frequency (0 = never). Default: 0

    • save_freq : int, optional Save frequency (0 = never). Default: 1

    • save_data_freq : int, optional Data save frequency. Default: 0

    • save_global_data_freq : int, optional Global data save frequency. Default: 0

    • log_path : str, optional Output directory path. Default: "./logs"

    • log_name : str, optional Log file name suffix. Default: ""

    • comment_enable : bool, optional Enable experiment comments. Default: False

Notes

Data Capture:

The logger automatically captures:

  • Agent Positions: Full position arrays for all agents in all populations

  • Control Inputs: Mean control input values when available (e.g., u for first population)

  • Temporal Information: Timestep and timing data for synchronization

File Formats:

Position data is saved in multiple formats for flexibility:

  • CSV: Human-readable comma-separated values for spreadsheet analysis

  • NPZ: Compressed NumPy format for efficient Python data loading

  • MAT: MATLAB format for analysis in MATLAB/Octave

Performance Considerations:

  • Data is accumulated in memory and saved at specified intervals

  • Memory usage scales with number of agents and save frequency

  • For large populations, consider increasing save_freq to reduce I/O overhead

Examples

Basic Configuration:

PositionLogger:
    activate: true
    save_freq: 10
    log_path: "./simulation_logs"
    log_name: "position_data"

__init__(populations, environment, config_path)[source]

Initialize the Logger base class.

Subclasses should call this constructor and then initialize their specific logging infrastructure (files, databases, network connections, etc.).

log_internal_data(save_mode=['csv', 'npz', 'mat'])[source]

Log position data and control for all populations.

This method captures detailed position information for all agents in all populations and information about control inputs. Data is saved according to the specified save frequency and formats.

Parameters:

save_mode (list of str, optional) –

List of output formats for data saving. Default: ['csv','npz','mat']

  • 'csv' : Comma-separated values for spreadsheet analysis

  • 'npz' : Compressed NumPy format for Python analysis

  • 'mat' : MATLAB format for MATLAB/Octave analysis

  • 'print' : Console output for debugging

  • 'txt' : Human-readable text format

Notes

Position Data Capture:

The method uses get_positions() utility to extract:

  • Full position arrays for all agents in all populations

  • Timestep information

  • Population identifiers for multi-population simulations

Control Input:

For the first population (index 0), computes and logs:

  • Mean control input value: np.mean(populations[0].u)

  • Useful for monitoring control effort and system behavior

Save Frequency:

Data logging occurs only when step_count % save_freq == 0, ensuring efficient memory usage and I/O performance for long simulations.

Data Format:

The logged data structure includes:

  • positions_Population_N : Position arrays for population N

  • u : Mean control input for first population

  • step : Current simulation step

  • time : Simulation time or timestamp
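A rough picture of one logged entry, assuming plain Python lists in place of NumPy arrays: the helper build_position_record is hypothetical, the population index is assumed to be the N in positions_Population_N, and time is assumed to be step times a fixed timestep dt. None of these details are confirmed by the documentation above.

```python
from statistics import fmean


def build_position_record(step, dt, populations, controls, save_freq):
    """Hypothetical helper: assemble one logged entry, or None if the step is skipped."""
    # Logging is gated on the save frequency; 0 disables saving.
    if not save_freq or step % save_freq != 0:
        return None
    record = {"step": step, "time": step * dt}
    for n, positions in enumerate(populations):
        # One entry per population, keyed as in the list above.
        record[f"positions_Population_{n}"] = [list(p) for p in positions]
    # Mean control input of the first population, over all agents and axes.
    record["u"] = fmean(value for agent_u in controls for value in agent_u)
    return record


targets = [(0.0, 1.0), (2.0, 3.0)]
herders = [(5.0, 5.0)]
u = [(0.5, -0.5), (1.0, 1.0)]  # control inputs of the first population

skipped = build_position_record(3, 0.01, [targets, herders], u, save_freq=10)
saved = build_position_record(10, 0.01, [targets, herders], u, save_freq=10)
print(skipped, sorted(saved))
```

Returning None for skipped steps makes the save-frequency gate explicit to the caller.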

Capabilities

  • High-Resolution Tracking: Detailed trajectory recording

  • Spatial Statistics: Automatic computation of spatial measures

  • Trajectory Analysis: Built-in diffusion and mobility metrics

  • Compression: Efficient storage of position time series

Shepherding Logger

Domain-specific logger for shepherding simulations with predator-prey dynamics.

class swarmsim.Loggers.shepherding_logger.ShepherdingLogger(populations, environment, config_path)[source]

Bases: BaseLogger

Specialized logger for shepherding simulations with task-specific metrics.

This logger extends the BaseLogger to capture and analyze shepherding-specific metrics such as target capture rates, completion status, and task progression. It monitors the effectiveness of shepherding algorithms by tracking how many targets are successfully guided to goal regions.

The logger computes the shepherding metric xi (ξ), which represents the fraction of targets successfully captured, and monitors task completion to enable early termination when all targets reach the goal region.

Parameters:
  • populations (list of Population) – List of population objects in the shepherding simulation. Typically includes target agents (index 0) and optionally shepherd agents.

  • environment (Environment) – Shepherding environment object containing goal regions and spatial boundaries. Must support shepherding-specific geometric calculations.

  • config_path (str) – Path to the YAML configuration file containing logger parameters.

Variables:
  • populations (list of Population) – Population objects being monitored, inherited from BaseLogger. First population (index 0) is typically the target agents.

  • environment (Environment) – Shepherding environment with goal regions, inherited from BaseLogger.

  • xi (float) – Current shepherding metric (fraction of captured targets). Range: [0.0, 1.0] where 1.0 indicates all targets captured.

  • done (bool) – Task completion flag indicating whether all targets are captured.

  • current_info (dict) – Current timestep information including shepherding metrics.

  • global_info (dict) – Accumulated simulation data across all timesteps.

Config Requirements:
  • The YAML configuration file must contain logger parameters under the class section

  • ShepherdingLogger (dict) – Configuration section for the shepherding logger:

    • activate : bool, optional Enable/disable logging. Default: True

    • log_freq : int, optional Print frequency (0 = never). Default: 0

    • save_freq : int, optional Save frequency (0 = never). Default: 1

    • save_data_freq : int, optional Data save frequency. Default: 0

    • save_global_data_freq : int, optional Global data save frequency. Default: 0

    • log_path : str, optional Output directory path. Default: "./logs"

    • log_name : str, optional Log file name suffix. Default: ""

    • comment_enable : bool, optional Enable experiment comments. Default: False

Notes

Shepherding Metrics:

The logger computes several task-specific metrics:

  • Xi (ξ): Fraction of targets successfully captured in goal region

  • Task Completion: Boolean flag indicating complete task success

  • Temporal Progression: Evolution of capture rate over time

Early Termination:

The logger can trigger early simulation termination when all targets are successfully shepherded to the goal region, improving computational efficiency for successful trials.

Integration with Shepherding Utils:

Uses specialized utility functions:

  • xi_shepherding(): Computes capture fraction metric

  • get_done_shepherding(): Determines task completion status
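For intuition, the capture fraction and the completion flag can be sketched with the standard library. The circular goal region, the parameter names goal_pos and goal_radius, and the helper names capture_fraction and all_captured are assumptions for this example; the actual xi_shepherding() and get_done_shepherding() utilities may differ.

```python
import math


def capture_fraction(targets, goal_pos, goal_radius):
    """Fraction of targets inside a circular goal region (assumed geometry)."""
    if not targets:
        return 0.0
    inside = sum(1 for p in targets if math.dist(p, goal_pos) <= goal_radius)
    return inside / len(targets)


def all_captured(targets, goal_pos, goal_radius):
    """True when every target lies inside the goal region."""
    return capture_fraction(targets, goal_pos, goal_radius) == 1.0


targets = [(0.0, 0.5), (1.0, 1.0), (3.0, 4.0), (0.2, -0.1)]
xi = capture_fraction(targets, goal_pos=(0.0, 0.0), goal_radius=2.0)
done = all_captured(targets, goal_pos=(0.0, 0.0), goal_radius=2.0)
print(xi, done)
```

Here three of the four targets lie within the radius, so xi is 0.75 and the task is not yet complete.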

Examples

Basic Configuration:

ShepherdingLogger:
    activate: true
    log_freq: 10
    save_freq: 1
    log_path: "./shepherding_logs"
    log_name: "shepherding_experiment"

__init__(populations, environment, config_path)[source]

Initialize the Logger base class.

Subclasses should call this constructor and then initialize their specific logging infrastructure (files, databases, network connections, etc.).

log(data=None)[source]

Define the information to log at each step.

Parameters:

data (dict, optional) – Dictionary of the form {variable_name: value} containing the values to log.

Returns:

done (bool) – Flag to truncate the simulation early. Default: False.

Notes

The YAML configuration file should contain a section named after the logger you are creating. By default, the logger does not truncate the episode early. See add_data in Utils/logger_utils.py for a quick way to add variables to log.

log_internal_data(save_mode=['print', 'txt'])[source]
get_xi()[source]

Get the shepherding metric xi, i.e., the fraction of captured targets.

Returns:

float – Fraction of captured targets.

get_event()[source]

Check whether every target is inside the goal region.

Returns:

bool – True if every target is inside the goal region, False otherwise.

Specialized Features

  • Herding Metrics: Success rates, containment measures, escape events

  • Predator-Prey Analysis: Pursuit dynamics, capture statistics

  • Formation Tracking: Herd cohesion, shape evolution, splitting events

  • Strategic Analysis: Decision points, behavioral transitions

See Also