Device Management
ScalingPolicy
-
class ScalingPolicy : public Object
Abstract base class for scaling policies.
ScalingPolicy decides the target performance state for a backend accelerator based on per-backend state.
Subclassed by ns3::ConservativeScalingPolicy, ns3::UtilizationScalingPolicy
Public Functions
-
virtual Ptr<ScalingDecision> Decide(uint32_t backendIdx, const ClusterState::BackendState &backend) = 0
Decide the target performance state for a backend.
- Parameters:
backendIdx – The backend index in the cluster.
backend – The backend’s current state.
- Returns:
A scaling decision, or nullptr if no change is needed.
-
virtual std::string GetName() const = 0
Get the policy name for logging.
- Returns:
A string identifying this policy type.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
UtilizationScalingPolicy
-
class UtilizationScalingPolicy : public ns3::ScalingPolicy
Utilization-based DVFS scaling policy inspired by the Linux ondemand governor.
Simple binary policy using the accelerator’s performance state table:
Busy or queued tasks -> highest performance state (aggressive scale-up for latency)
Idle -> lowest performance state (energy savings)
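The binary rule above can be sketched in standalone C++17. This is an illustration only, not the ns-3 implementation: BackendSnapshot and DecideBinary are hypothetical names, and the sketch assumes that index 0 is the lowest performance state and numStates - 1 the highest, which this reference does not state.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the per-backend state. The real
// ClusterState::BackendState fields are not listed in this reference.
struct BackendSnapshot
{
    bool busy;             // a task is currently executing
    uint32_t queuedTasks;  // tasks waiting behind it
    uint32_t currentState; // current performance state index
    uint32_t numStates;    // size of the performance state table
};

// Returns the target state index, or std::nullopt when no change is
// needed (mirroring Decide() returning nullptr).
// Assumption: index 0 is the lowest state, numStates - 1 the highest.
std::optional<uint32_t>
DecideBinary(const BackendSnapshot& b)
{
    assert(b.numStates > 0);
    const bool active = b.busy || b.queuedTasks > 0;
    const uint32_t target = active ? b.numStates - 1 : 0;
    if (target == b.currentState)
    {
        return std::nullopt; // already in the desired state
    }
    return target;
}
```

With a four-entry table, a busy backend in state 0 is sent to state 3, while an idle backend already in state 0 yields no decision.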
Public Functions
-
virtual Ptr<ScalingDecision> Decide(uint32_t backendIdx, const ClusterState::BackendState &backend) override
Decide the target performance state for a backend.
- Parameters:
backendIdx – The backend index in the cluster.
backend – The backend’s current state.
- Returns:
A scaling decision, or nullptr if no change is needed.
-
virtual std::string GetName() const override
Get the policy name for logging.
- Returns:
A string identifying this policy type.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
ConservativeScalingPolicy
-
class ConservativeScalingPolicy : public ns3::ScalingPolicy
Work-proportional DVFS policy that steps conservatively, one operating performance point (OPP) at a time.
Computes a target OPP proportional to the estimated drain time of pending compute work (remaining FLOPs / nominal compute rate), then steps one OPP at a time toward that target. The TargetDrainTime attribute sets the drain time that maps to the highest OPP. Handles heterogeneous workloads naturally, since heavy tasks contribute more FLOPs than lightweight ones.
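The drain-time mapping and one-step movement described above might look roughly like the following standalone C++17 sketch. TargetOpp and StepToward are hypothetical names, and the linear mapping with round-up and saturation at TargetDrainTime is an assumption; the reference only says the target is proportional to drain time.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Map the estimated drain time linearly onto OPP indices [0, numOpps - 1].
// Assumptions: index 0 is the lowest OPP, drain times at or above
// targetDrainTimeSec saturate at the highest OPP, and fractional
// results round up.
uint32_t
TargetOpp(double remainingFlops,
          double nominalFlopsPerSec,
          double targetDrainTimeSec,
          uint32_t numOpps)
{
    assert(numOpps > 0 && nominalFlopsPerSec > 0.0 && targetDrainTimeSec > 0.0);
    const double drainTimeSec = remainingFlops / nominalFlopsPerSec;
    const double fraction = std::clamp(drainTimeSec / targetDrainTimeSec, 0.0, 1.0);
    return static_cast<uint32_t>(std::ceil(fraction * (numOpps - 1)));
}

// Move at most one OPP from the current index toward the target.
uint32_t
StepToward(uint32_t current, uint32_t target)
{
    if (current < target)
    {
        return current + 1;
    }
    if (current > target)
    {
        return current - 1;
    }
    return current;
}
```

Because each evaluation moves a single step, a backend at OPP 0 with a target of 4 needs four successive evaluations to reach the top state, which is what makes the policy conservative.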
Public Functions
-
virtual Ptr<ScalingDecision> Decide(uint32_t backendIdx, const ClusterState::BackendState &backend) override
Decide the target performance state for a backend.
- Parameters:
backendIdx – The backend index in the cluster.
backend – The backend’s current state.
- Returns:
A scaling decision, or nullptr if no change is needed.
-
virtual std::string GetName() const override
Get the policy name for logging.
- Returns:
A string identifying this policy type.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
DeviceProtocol
-
class DeviceProtocol : public Object
Abstract base class for device metrics protocols.
DeviceProtocol encapsulates how accelerator metrics are serialized into packets and parsed back into DeviceMetrics objects. Each accelerator type provides its own concrete protocol implementation.
Subclassed by ns3::GpuDeviceProtocol
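To illustrate the serialize/parse split, here is a toy protocol in standalone C++17 that round-trips a metrics record through a byte vector. Everything in it is hypothetical: the Metrics fields, the 9-byte big-endian layout, and the class names are invented for the example and do not reflect the actual DeviceMetricsHeader wire format. Only the type tag value 4 comes from this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical metrics record; the real DeviceMetrics fields are not listed here.
struct Metrics
{
    uint32_t perfStateIdx;
    uint32_t queuedTasks;
};

constexpr uint8_t kMetricsType = 4; // type tag for metrics packets

// Simplified analogue of the abstract DeviceProtocol interface.
class Protocol
{
  public:
    virtual ~Protocol() = default;
    virtual std::vector<uint8_t> CreateMetricsPacket(const Metrics& m) const = 0;
    virtual std::optional<Metrics> ParseMetrics(const std::vector<uint8_t>& pkt) const = 0;
    virtual std::string GetName() const = 0;
};

// Toy concrete protocol: 1 type byte followed by two big-endian uint32 fields.
class ToyGpuProtocol : public Protocol
{
  public:
    std::vector<uint8_t> CreateMetricsPacket(const Metrics& m) const override
    {
        std::vector<uint8_t> out{kMetricsType};
        PutU32(out, m.perfStateIdx);
        PutU32(out, m.queuedTasks);
        return out;
    }

    std::optional<Metrics> ParseMetrics(const std::vector<uint8_t>& pkt) const override
    {
        if (pkt.size() != 9 || pkt[0] != kMetricsType)
        {
            return std::nullopt; // wrong length or not a metrics packet
        }
        return Metrics{GetU32(pkt, 1), GetU32(pkt, 5)};
    }

    std::string GetName() const override { return "ToyGpu"; }

  private:
    static void PutU32(std::vector<uint8_t>& out, uint32_t v)
    {
        for (int shift = 24; shift >= 0; shift -= 8)
        {
            out.push_back(static_cast<uint8_t>(v >> shift));
        }
    }

    static uint32_t GetU32(const std::vector<uint8_t>& in, std::size_t off)
    {
        assert(off + 4 <= in.size());
        uint32_t v = 0;
        for (std::size_t i = 0; i < 4; ++i)
        {
            v = (v << 8) | in[off + i];
        }
        return v;
    }
};
```

Keeping both directions in one class means the sender and the receiver cannot drift apart on layout, which is presumably why each accelerator type supplies its own protocol object.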
Public Functions
-
virtual Ptr<Packet> CreateMetricsPacket(Ptr<const Accelerator> accel) = 0
Serialize accelerator state into a metrics packet.
Called by the server application on task lifecycle events.
- Parameters:
accel – The accelerator whose state is read.
- Returns:
A packet containing the serialized metrics header.
-
virtual Ptr<DeviceMetrics> ParseMetrics(Ptr<Packet> packet) = 0
Parse a metrics packet into a DeviceMetrics object.
Called by the DeviceManager when a type-4 packet arrives.
- Parameters:
packet – The packet containing the metrics header.
- Returns:
The parsed DeviceMetrics.
-
virtual std::string GetName() const = 0
Get the name of this device protocol.
- Returns:
A string identifying the protocol.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
GpuDeviceProtocol
-
class GpuDeviceProtocol : public ns3::DeviceProtocol
Concrete DeviceProtocol for GPU accelerators.
Serializes metrics using DeviceMetricsHeader (type 4) and parses received metrics packets into DeviceMetrics objects.
Public Functions
-
virtual Ptr<Packet> CreateMetricsPacket(Ptr<const Accelerator> accel) override
Serialize accelerator state into a metrics packet.
Called by the server application on task lifecycle events.
- Parameters:
accel – The accelerator whose state is read.
- Returns:
A packet containing the serialized metrics header.
-
virtual Ptr<DeviceMetrics> ParseMetrics(Ptr<Packet> packet) override
Parse a metrics packet into a DeviceMetrics object.
Called by the DeviceManager when a type-4 packet arrives.
- Parameters:
packet – The packet containing the metrics header.
- Returns:
The parsed DeviceMetrics.
-
virtual std::string GetName() const override
Get the name of this device protocol.
- Returns:
A string identifying the protocol.
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.
DeviceManager
-
class DeviceManager : public Object
Manages performance scaling for backend accelerators in the orchestrator.
DeviceManager is a concrete component of Orchestrator. It evaluates a pluggable ScalingPolicy using per-backend state from ClusterState, and sends ScalingCommandHeader packets to backends via the orchestrator’s worker ConnectionManager.
Public Types
-
typedef void (*StateChangedTracedCallback)(uint32_t backendIdx, uint32_t oldStateIdx, uint32_t newStateIdx)
TracedCallback signature for performance state change events.
- Param backendIdx:
The backend index.
- Param oldStateIdx:
The previous performance state index.
- Param newStateIdx:
The new performance state index.
Public Functions
-
void Start(const Cluster &cluster, Ptr<ConnectionManager> backendCm, ClusterState &state)
Initialize the device manager with a cluster and connection manager.
Must be called before HandleMetrics() or EvaluateScaling().
- Parameters:
cluster – The backend cluster.
backendCm – The backend connection manager for sending commands.
state – The cluster state in which performance states and backend compute capabilities are initialized.
-
void HandleMetrics(Ptr<Packet> packet, uint32_t backendIdx, ClusterState &state)
Store metrics received from a backend.
Called by Orchestrator when a type-4 packet arrives.
- Parameters:
packet – The metrics packet (DeviceMetricsHeader).
backendIdx – The backend index in the cluster.
state – The cluster state to update with parsed metrics.
-
bool TryConsumeMetrics(Ptr<Packet> buffer, const Address &from, ClusterState &state)
Try to consume a device metrics message from a receive buffer.
Peeks at the first byte of the buffer. If it is a metrics message and enough data is available, the message is consumed (removed from the buffer), parsed, and stored in ClusterState.
- Parameters:
buffer – The receive buffer (modified in-place if consumed).
from – The backend address (used to resolve backend index).
state – The cluster state to update.
- Returns:
true if a metrics message was consumed, false otherwise.
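The peek-then-consume behaviour can be sketched without ns-3 types as follows (C++17). The std::deque buffer, the TryConsume name, and the fixed 9-byte message size are assumptions made for the example; the real method operates on a Ptr<Packet> and takes the message length from the actual header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

constexpr uint8_t kMetricsType = 4;     // type tag for metrics messages
constexpr std::size_t kMetricsSize = 9; // hypothetical fixed message size

// Returns the consumed message bytes, or std::nullopt if the buffer does
// not start with a complete metrics message. The buffer is left untouched
// when nothing is consumed.
std::optional<std::vector<uint8_t>>
TryConsume(std::deque<uint8_t>& buffer)
{
    // Peek at the first byte without removing anything.
    if (buffer.empty() || buffer.front() != kMetricsType)
    {
        return std::nullopt;
    }
    // A partial message stays in the buffer until more data arrives.
    if (buffer.size() < kMetricsSize)
    {
        return std::nullopt;
    }
    const auto end = buffer.begin() + static_cast<std::ptrdiff_t>(kMetricsSize);
    std::vector<uint8_t> msg(buffer.begin(), end);
    buffer.erase(buffer.begin(), end);
    assert(msg.size() == kMetricsSize);
    return msg;
}
```

A caller would typically invoke this after each receive and fall through to other message handlers whenever it reports that nothing was consumed.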
-
void EvaluateScaling(ClusterState &state)
Evaluate scaling decisions for all backends.
Called by Orchestrator on task events. For each backend, runs ScalingPolicy::Decide() and sends command packets if the performance state changed.
- Parameters:
state – The cluster state with per-backend load and metrics.
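A rough, standalone model of this evaluation loop is shown below (C++17). Backend, Command, Policy, and EvaluateAll are hypothetical names; the returned Command list stands in for the ScalingCommandHeader packets, and updating the stored state index immediately is an assumption of the sketch rather than documented behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical per-backend state.
struct Backend
{
    uint32_t stateIdx; // current performance state index
    bool active;       // busy or has queued work
};

// Stand-in for a scaling command; the fields mirror the arguments of
// StateChangedTracedCallback.
struct Command
{
    uint32_t backendIdx;
    uint32_t oldStateIdx;
    uint32_t newStateIdx;
};

// Policy returns a target state, or std::nullopt for "no change".
using Policy = std::function<std::optional<uint32_t>(uint32_t, const Backend&)>;

// Runs the policy once per backend and records a command only when the
// performance state actually changes.
std::vector<Command>
EvaluateAll(std::vector<Backend>& backends, const Policy& policy)
{
    assert(static_cast<bool>(policy));
    std::vector<Command> sent;
    for (uint32_t i = 0; i < backends.size(); ++i)
    {
        const auto target = policy(i, backends[i]);
        if (!target || *target == backends[i].stateIdx)
        {
            continue; // nothing to send for this backend
        }
        sent.push_back({i, backends[i].stateIdx, *target});
        backends[i].stateIdx = *target; // assumption: state is updated right away
    }
    return sent;
}
```

Running the loop twice with unchanged load produces commands only on the first pass, since the second pass finds every backend already in its target state.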
Public Static Functions
-
static TypeId GetTypeId()
Get the type ID.
- Returns:
The object TypeId.