Accelerators

Accelerator

class Accelerator : public Object

Abstract base class for computational accelerators.

Accelerator defines the interface for hardware that processes computational tasks. GpuAccelerator is the provided concrete implementation; ASIC, FPGA, or other specialized hardware accelerators could be added as further subclasses.

Example usage:

// Create a concrete accelerator (e.g., GPU)
Ptr<GpuAccelerator> gpu = CreateObject<GpuAccelerator>();
gpu->SetAttribute("ProcessingModel", PointerValue(processingModel));
node->AggregateObject(gpu);

// Connect to trace sources
gpu->TraceConnectWithoutContext("TaskStarted", MakeCallback(&OnStart));
gpu->TraceConnectWithoutContext("TaskCompleted", MakeCallback(&OnComplete));

// Submit a task
Ptr<SimpleTask> task = CreateObject<SimpleTask>();
task->SetComputeDemand(1e9);
task->SetInputSize(1e6);
task->SetOutputSize(1e6);
gpu->SubmitTask(task);

Subclassed by ns3::GpuAccelerator

Public Types

typedef void (*TaskTracedCallback)(Ptr<const Task> task)

TracedCallback signature for task events.

Param task:

The task.

typedef void (*TaskCompletedTracedCallback)(Ptr<const Task> task, Time duration)

TracedCallback signature for task completion.

Param task:

The completed task.

Param duration:

The total processing duration.

typedef void (*TaskFailedTracedCallback)(Ptr<const Task> task, std::string reason)

TracedCallback signature for task failure.

Param task:

The failed task.

Param reason:

Description of why the task failed.

typedef void (*PowerTracedCallback)(double power)

TracedCallback signature for power state changes.

Param power:

Current power consumption in Watts.

typedef void (*EnergyTracedCallback)(double energy)

TracedCallback signature for energy accumulation.

Param energy:

Total energy consumed in Joules.

typedef void (*TaskEnergyTracedCallback)(Ptr<const Task> task, double energy)

TracedCallback signature for per-task energy.

Param task:

The completed task.

Param energy:

Energy consumed by this task in Joules.

Public Functions

Accelerator()

Default constructor.

~Accelerator() override

Destructor.

virtual void SubmitTask(Ptr<Task> task) = 0

Submit a task for execution.

This is the core method that all accelerators must implement. The implementation decides how to process the task (queueing, scheduling, execution model, etc.).

Parameters:

task – The task to execute.

virtual std::string GetName() const = 0

Get the name of this accelerator type.

Returns:

A string identifying the accelerator (e.g., “GPU”, “FPGA”, “ASIC”).

virtual uint32_t GetQueueLength() const

Get the current queue length.

Default implementation returns 0. Override in subclasses that maintain a task queue.

Returns:

Number of tasks in queue (including currently executing).

virtual bool IsBusy() const

Check if accelerator is currently busy.

Default implementation returns false. Override in subclasses that track execution state.

Returns:

True if executing a task.

virtual double GetUtilization() const

Get the current compute utilization.

Default implementation returns 0.0. GpuAccelerator overrides this with a rolling-window measurement of the fraction of time spent in the compute phase over the configured UtilizationWindow.

Returns:

Utilization in [0.0, 1.0].
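The rolling-window measurement can be illustrated with a minimal self-contained sketch (plain C++, not the ns-3 API; function and parameter names are illustrative): busy intervals are clipped to the window and their durations summed.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical sketch of a rolling-window compute utilization.
// `busy` holds (start, end) compute-phase intervals in seconds; utilization
// over [now - window, now] is the fraction of the window covered by them.
double RollingUtilization(const std::vector<std::pair<double, double>>& busy,
                          double now, double window)
{
    const double windowStart = now - window;
    double busyTime = 0.0;
    for (const auto& [start, end] : busy)
    {
        // Clip each busy interval to the measurement window.
        const double s = std::max(start, windowStart);
        const double e = std::min(end, now);
        if (e > s)
        {
            busyTime += e - s;
        }
    }
    return busyTime / window;
}
```

For example, one compute interval covering half of a one-second window yields a utilization of 0.5.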

virtual double GetComputeRate() const

Get the effective compute rate of the accelerator.

Default implementation returns 0.0. Override in subclasses that model computational throughput.

Returns:

Effective compute rate in FLOP/s.

virtual void ApplyState(uint32_t stateIdx)

Apply a performance state by index.

Each subclass interprets the index for its hardware. For GPUs this looks up the OPP table and calls SetFrequency/SetVoltage. Default implementation is a no-op.

Parameters:

stateIdx – Index into the device’s performance state table.

virtual uint32_t GetNumPerformanceStates() const

Get the number of available performance states.

Default implementation returns 0 (no scaling). Override in subclasses that support multiple performance states.

Returns:

Number of performance states.

virtual double GetComputeRateAtState(uint32_t stateIdx) const

Get the compute rate at a given performance state.

Default implementation returns the current compute rate regardless of state index. Override in subclasses that support scaling.

Parameters:

stateIdx – Performance state index.

Returns:

Compute rate at that state in FLOP/s.
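One common use of GetNumPerformanceStates()/GetComputeRateAtState() is picking the slowest state that still meets a deadline before calling ApplyState(). A minimal self-contained sketch (plain C++, not the ns-3 API; names and data are illustrative), assuming states are sorted by compute rate ascending as in an OPP table sorted by frequency:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch: choose the lowest performance state whose compute
// rate (FLOP/s) finishes `computeDemand` FLOPs within `deadline` seconds.
// `ratesByState` plays the role of GetComputeRateAtState().
uint32_t LowestStateMeetingDeadline(const std::vector<double>& ratesByState,
                                    double computeDemand, double deadline)
{
    if (ratesByState.empty())
    {
        return 0;
    }
    for (uint32_t i = 0; i < ratesByState.size(); ++i)
    {
        if (computeDemand / ratesByState[i] <= deadline)
        {
            return i;  // Slowest state that is still fast enough.
        }
    }
    // No state meets the deadline: fall back to the fastest one.
    return static_cast<uint32_t>(ratesByState.size()) - 1;
}
```

The chosen index would then be passed to ApplyState().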

Ptr<Node> GetNode() const

Get the node this accelerator is aggregated to.

Returns:

Pointer to the node, or nullptr if not aggregated.

double GetCurrentPower() const

Get the current power consumption.

Returns:

Current power in Watts, or 0 if no EnergyModel is configured.

double GetTotalEnergy() const

Get the total energy consumed.

Returns:

Total energy consumed in Joules, or 0 if no EnergyModel is configured.
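Total energy is the time integral of power. With piecewise-constant power (each interval of constant draw contributing power times duration), the accumulation reduces to a sum; a minimal self-contained sketch (plain C++, not the ns-3 API; names are illustrative):

```cpp
#include <utility>
#include <vector>

// Hypothetical sketch: accumulate energy from piecewise-constant power
// segments. Each segment is (power in Watts, duration in seconds); the
// total energy in Joules is the sum of power * duration.
double AccumulateEnergy(const std::vector<std::pair<double, double>>& segments)
{
    double joules = 0.0;
    for (const auto& [watts, seconds] : segments)
    {
        joules += watts * seconds;
    }
    return joules;
}
```

For example, 2 s at 30 W idle plus 1 s at 32.43 W active accumulates about 92.43 J.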

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

GpuAccelerator

class GpuAccelerator : public ns3::Accelerator

GPU accelerator for processing computational tasks.

GpuAccelerator models a GPU processing unit. Task processing time is determined by the attached ProcessingModel, which must be set before tasks can be submitted.

Public Functions

virtual void SubmitTask(Ptr<Task> task) override

Submit a task for execution.

The GPU queues the task and processes it according to the attached ProcessingModel, which must be set before tasks are submitted.

Parameters:

task – The task to execute.

virtual std::string GetName() const override

Get the name of this accelerator type.

Returns:

A string identifying the accelerator (e.g., “GPU”, “FPGA”, “ASIC”).

virtual uint32_t GetQueueLength() const override

Get the current queue length.

GpuAccelerator maintains a task queue; the returned count includes the currently executing task.

Returns:

Number of tasks in queue (including currently executing).

virtual bool IsBusy() const override

Check if accelerator is currently busy.

GpuAccelerator tracks execution state, so this reports whether a task is currently being executed.

Returns:

True if executing a task.

virtual double GetUtilization() const override

Get the current compute utilization.

Reports a rolling-window measurement: the fraction of time spent in the compute phase over the configured UtilizationWindow.

Returns:

Utilization in [0.0, 1.0].

virtual double GetComputeRate() const override

Get the effective compute rate of the accelerator.

Returns the GPU's effective computational throughput at the current operating frequency.

Returns:

Effective compute rate in FLOP/s.

virtual void ApplyState(uint32_t stateIdx) override

Apply a performance state by index.

Looks up the entry at stateIdx in the GPU's OPP table and calls SetFrequency/SetVoltage accordingly.

Parameters:

stateIdx – Index into the device’s performance state table.

virtual uint32_t GetNumPerformanceStates() const override

Get the number of available performance states.

Returns the number of entries in the GPU's OPP table.

Returns:

Number of performance states.

virtual double GetComputeRateAtState(uint32_t stateIdx) const override

Get the compute rate at a given performance state.

Returns the compute rate the GPU would achieve at the given entry of its OPP table.

Parameters:

stateIdx – Performance state index.

Returns:

Compute rate at that state in FLOP/s.

double GetVoltage() const

Get the current operating voltage.

Returns:

Operating voltage in Volts.

double GetFrequency() const

Get the current operating frequency.

Returns:

Operating frequency in Hz.

void SetFrequency(double frequency)

Set the operating frequency.

Scales the compute rate proportionally. If a task is in progress, only the remaining compute phase is rescaled.

Parameters:

frequency – The new frequency in Hz.
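Because the compute rate scales proportionally with frequency, rescaling only the remaining compute phase amounts to multiplying the remaining time by the inverse frequency ratio. A minimal self-contained sketch (plain C++, not the ns-3 API; names are illustrative):

```cpp
// Hypothetical sketch of rescaling the remaining compute phase on a
// frequency change: compute rate scales with frequency, so remaining
// time scales with the inverse ratio oldHz / newHz.
double RescaleRemainingTime(double remainingSeconds, double oldHz, double newHz)
{
    return remainingSeconds * oldHz / newHz;
}
```

For example, doubling the frequency from 1.5 GHz to 3 GHz halves a 1 ms remaining compute phase to 0.5 ms.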

void SetVoltage(double voltage)

Set the operating voltage.

Parameters:

voltage – The new voltage in Volts.

double GetMemoryBandwidth() const

Get GPU DRAM memory bandwidth in bytes/sec.

Returns:

The memory bandwidth.

double GetTransferBandwidth() const

Get host-device transfer bandwidth in bytes/sec (e.g. PCIe).

Returns:

The transfer bandwidth.

void AddOperatingPoint(double frequency, double voltage)

Add a discrete operating point (frequency-voltage pair).

Points are kept sorted by frequency ascending. This defines the GPU’s OPP table used by ApplyState() to translate performance state indices into frequency/voltage commands.

Parameters:
  • frequency – Operating frequency in Hz.

  • voltage – Operating voltage in Volts.

const std::vector<OperatingPoint> &GetOperatingPoints() const

Get the operating point table.

Returns:

Const reference to the OPP table, sorted by frequency ascending.
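Keeping the table sorted by frequency ascending on every insertion can be sketched as follows (plain C++, not the ns-3 implementation; the struct layout is an assumption based on the frequency-voltage pairs described above):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical sketch of an OPP (operating performance point) table kept
// sorted by frequency ascending, mirroring AddOperatingPoint().
struct OperatingPoint
{
    double frequency;  // Hz
    double voltage;    // Volts
};

void AddOperatingPoint(std::vector<OperatingPoint>& table, double frequency, double voltage)
{
    // Find the first entry with a higher frequency and insert before it,
    // preserving ascending frequency order.
    auto pos = std::lower_bound(table.begin(), table.end(), frequency,
                                [](const OperatingPoint& p, double f) { return p.frequency < f; });
    table.insert(pos, {frequency, voltage});
}
```

With this invariant, a performance state index is simply a position in the sorted table, which is what ApplyState() consumes.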

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

ProcessingModel

class ProcessingModel : public Object

Abstract base class for compute processing models.

ProcessingModel defines the interface for calculating task processing characteristics such as execution time and output size.

Example usage:

Ptr<FixedRatioProcessingModel> model = CreateObject<FixedRatioProcessingModel>();

Ptr<GpuAccelerator> gpu = CreateObject<GpuAccelerator>();
gpu->SetAttribute("ComputeRate", DoubleValue(1e12));
gpu->SetAttribute("MemoryBandwidth", DoubleValue(900e9));
gpu->SetAttribute("ProcessingModel", PointerValue(model));

Ptr<SimpleTask> task = CreateObject<SimpleTask>();
task->SetComputeDemand(1e9);
task->SetInputSize(1e6);
task->SetOutputSize(1e6);

ProcessingModel::Result result = model->Process(task, gpu);

Subclassed by ns3::FixedRatioProcessingModel

Public Functions

ProcessingModel()

Default constructor.

~ProcessingModel() override

Destructor.

virtual Result Process(Ptr<const Task> task, Ptr<const Accelerator> accelerator) const = 0

Calculate processing characteristics for a task.

Implementations should examine the task properties and return the calculated phase timings, processing time, and output size. If the task type is not supported, return a Result with success=false.

Parameters:
  • task – The task to process.

  • accelerator – The accelerator providing hardware characteristics.

Returns:

Result containing phase timings, output size, and success status.

virtual std::string GetName() const = 0

Get the name of this processing model.

Returns:

A string identifying the model (e.g., “FixedRatio”).

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

struct Result

Result of processing a task.

Contains the calculated phase timings, total processing time, output size, and success status.

Public Functions

inline Result()

Default constructor creates a failed result.

inline Result(Time inputTime, Time compute, Time outputTime, uint64_t output)

Constructor for successful result.

Parameters:
  • inputTime – Input-transfer time

  • compute – Input-independent compute-execution time

  • outputTime – Output-transfer time

  • output – Output size in bytes

Public Members

Time processingTime

Total time to process the task.

Time inputTransferTime

Time spent transferring input to the device.

Time computeTime

Time spent executing compute work.

Time outputTransferTime

Time spent transferring output from the device.

uint64_t outputSize

Output data size in bytes.

bool success

True if processing calculation succeeded.

FixedRatioProcessingModel

class FixedRatioProcessingModel : public ns3::ProcessingModel

Processing model using three-phase timing on GpuAccelerator.

FixedRatioProcessingModel calculates processing time using a three-phase model (input transfer, compute, output transfer) based on Task properties and GpuAccelerator hardware characteristics.

Processing time calculation:

  • Input transfer: inputSize / GpuAccelerator.TransferBandwidth

  • Compute: computeDemand / GpuAccelerator.ComputeRate

  • Output transfer: outputSize / GpuAccelerator.TransferBandwidth

  • Total: sum of all three phases

Example usage:

Ptr<FixedRatioProcessingModel> model = CreateObject<FixedRatioProcessingModel>();

Ptr<GpuAccelerator> gpu = CreateObject<GpuAccelerator>();
gpu->SetAttribute("ComputeRate", DoubleValue(1e12));           // 1 TFLOPS
gpu->SetAttribute("MemoryBandwidth", DoubleValue(900e9));      // 900 GB/s GPU DRAM
gpu->SetAttribute("TransferBandwidth", DoubleValue(32e9));     // 32 GB/s PCIe 5.0 x8
gpu->SetAttribute("ProcessingModel", PointerValue(model));
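The three-phase arithmetic can be worked through in a minimal self-contained sketch (plain C++, not the ns-3 API; struct and function names are illustrative), using the attribute values from the example above:

```cpp
#include <cstdint>

// Hypothetical sketch of the three-phase timing calculation.
// Sizes are in bytes, computeDemand in FLOPs, rates in FLOP/s and
// bytes/s; all times are in seconds.
struct PhaseTimes
{
    double input;
    double compute;
    double output;
    double Total() const { return input + compute + output; }
};

PhaseTimes FixedRatioTimes(uint64_t inputSize, double computeDemand, uint64_t outputSize,
                           double computeRate, double transferBandwidth)
{
    return {inputSize / transferBandwidth,    // input transfer (e.g. over PCIe)
            computeDemand / computeRate,      // compute phase
            outputSize / transferBandwidth};  // output transfer
}
```

With 1 MB in and out at 32 GB/s (31.25 µs each way) and 1 GFLOP at 1 TFLOP/s (1 ms), the total is about 1.0625 ms; the transfer phases are small here, but they dominate for data-heavy, compute-light tasks.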

Public Functions

FixedRatioProcessingModel()

Default constructor.

~FixedRatioProcessingModel() override

Destructor.

virtual Result Process(Ptr<const Task> task, Ptr<const Accelerator> accelerator) const override

Calculate processing characteristics for a task.

Calculates the three-phase timings (input transfer, compute, output transfer) from the task's sizes and compute demand and the accelerator's hardware characteristics. If the task type is not supported, returns a Result with success=false.

Parameters:
  • task – The task to process.

  • accelerator – The accelerator providing hardware characteristics.

Returns:

Result containing phase timings, output size, and success status.

virtual std::string GetName() const override

Get the name of this processing model.

Returns:

A string identifying the model (e.g., “FixedRatio”).

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

QueueScheduler

class QueueScheduler : public Object

Abstract base class for task queue scheduling policies.

QueueScheduler defines the interface for managing task queues within accelerators. Subclasses implement different scheduling algorithms such as FIFO, priority queues, or batching strategies.

Example usage:

Ptr<FifoQueueScheduler> scheduler = CreateObject<FifoQueueScheduler>();

Ptr<SimpleTask> task1 = CreateObject<SimpleTask>();
Ptr<SimpleTask> task2 = CreateObject<SimpleTask>();

scheduler->Enqueue(task1);
scheduler->Enqueue(task2);

Ptr<Task> next = scheduler->Dequeue();  // Returns task1 (FIFO order)

Subclassed by ns3::FifoQueueScheduler

Public Functions

QueueScheduler()

Default constructor.

~QueueScheduler() override

Destructor.

virtual void Enqueue(Ptr<Task> task) = 0

Add a task to the queue.

The task is added according to the scheduling policy implemented by the subclass.

Parameters:

task – The task to enqueue.

virtual Ptr<Task> Dequeue() = 0

Remove and return the next task from the queue.

Returns the next task according to the scheduling policy.

Returns:

The next task, or nullptr if the queue is empty.

virtual Ptr<Task> Peek() const = 0

Return the next task without removing it.

Returns:

The next task, or nullptr if the queue is empty.

virtual bool IsEmpty() const = 0

Check if the queue is empty.

Returns:

True if the queue contains no tasks.

virtual uint32_t GetLength() const = 0

Get the number of tasks in the queue.

Returns:

The number of queued tasks.

virtual std::string GetName() const = 0

Get the name of this scheduling algorithm.

Returns:

A string identifying the scheduler (e.g., “FIFO”).

virtual void Clear() = 0

Remove all tasks from the queue.

Used during cleanup to release all task references.
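The contract above (nullptr from an empty Dequeue()/Peek(), length including all queued tasks) can be sketched with a FIFO policy in self-contained C++ (not the ns-3 implementation; int task IDs stand in for Ptr<Task>, and std::optional stands in for the nullptr-on-empty convention):

```cpp
#include <cstdint>
#include <optional>
#include <queue>
#include <string>

// Hypothetical sketch of the QueueScheduler interface semantics with a
// FIFO policy.
class FifoSketch
{
  public:
    void Enqueue(int task) { m_queue.push(task); }

    std::optional<int> Dequeue()
    {
        if (m_queue.empty())
        {
            return std::nullopt;  // Mirrors the nullptr-on-empty contract.
        }
        int task = m_queue.front();
        m_queue.pop();
        return task;
    }

    std::optional<int> Peek() const
    {
        return m_queue.empty() ? std::nullopt : std::optional<int>(m_queue.front());
    }

    bool IsEmpty() const { return m_queue.empty(); }
    uint32_t GetLength() const { return static_cast<uint32_t>(m_queue.size()); }
    std::string GetName() const { return "FIFO"; }
    void Clear() { m_queue = {}; }

  private:
    std::queue<int> m_queue;
};
```

A priority or batching scheduler would keep the same interface and change only the ordering inside Enqueue()/Dequeue().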

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

FifoQueueScheduler

class FifoQueueScheduler : public ns3::QueueScheduler

FIFO (First-In-First-Out) task queue scheduler.

FifoQueueScheduler processes tasks in strict first-in-first-out order. Tasks are dequeued in the same order they were enqueued, making this the simplest and most predictable scheduling policy.

Example usage:

Ptr<FifoQueueScheduler> scheduler = CreateObject<FifoQueueScheduler>();

Ptr<SimpleTask> task1 = CreateObject<SimpleTask>();
Ptr<SimpleTask> task2 = CreateObject<SimpleTask>();

scheduler->Enqueue(task1);
scheduler->Enqueue(task2);

Ptr<Task> next = scheduler->Dequeue();  // Returns task1
next = scheduler->Dequeue();             // Returns task2

Public Functions

FifoQueueScheduler()

Default constructor.

~FifoQueueScheduler() override

Destructor.

virtual void Enqueue(Ptr<Task> task) override

Add a task to the queue.

The task is appended to the tail of the queue (FIFO order).

Parameters:

task – The task to enqueue.

virtual Ptr<Task> Dequeue() override

Remove and return the next task from the queue.

Returns the oldest task in the queue (FIFO order).

Returns:

The next task, or nullptr if the queue is empty.

virtual Ptr<Task> Peek() const override

Return the next task without removing it.

Returns:

The next task, or nullptr if the queue is empty.

virtual bool IsEmpty() const override

Check if the queue is empty.

Returns:

True if the queue contains no tasks.

virtual uint32_t GetLength() const override

Get the number of tasks in the queue.

Returns:

The number of queued tasks.

virtual std::string GetName() const override

Get the name of this scheduling algorithm.

Returns:

A string identifying the scheduler (e.g., “FIFO”).

virtual void Clear() override

Remove all tasks from the queue.

Used during cleanup to release all task references.

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

EnergyModel

class EnergyModel : public Object

Abstract base class for accelerator energy models.

EnergyModel defines the interface for calculating power consumption of computational accelerators.

Power consumption is modeled with two components:

  • Static power: Always consumed when the device is powered on

  • Dynamic power: Consumed during active computation

Example usage:

Ptr<DvfsEnergyModel> energy = CreateObject<DvfsEnergyModel>();
energy->SetAttribute("StaticPower", DoubleValue(30.0));
energy->SetAttribute("EffectiveCapacitance", DoubleValue(2e-9));

Ptr<GpuAccelerator> gpu = CreateObject<GpuAccelerator>();
gpu->SetAttribute("EnergyModel", PointerValue(energy));

Subclassed by ns3::DvfsEnergyModel

Public Functions

EnergyModel()

Default constructor.

~EnergyModel() override

Destructor.

virtual PowerState CalculateIdlePower(Ptr<Accelerator> accelerator) = 0

Calculate power consumption when accelerator is idle.

Parameters:

accelerator – The accelerator to calculate idle power for.

Returns:

PowerState with idle power values.

virtual PowerState CalculateActivePower(Ptr<Accelerator> accelerator, double utilization) = 0

Calculate power consumption when accelerator is active.

Parameters:
  • accelerator – The accelerator to calculate active power for.

  • utilization – Current task-local utilization level [0.0, 1.0].

Returns:

PowerState with active power values.

virtual std::string GetName() const = 0

Get the name of this energy model.

Returns:

A string identifying the energy model (e.g., “DVFS”, “Fixed”).

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

struct PowerState

Represents the current power state of an accelerator.

PowerState encapsulates the static and dynamic power components of an accelerator’s power consumption.

Public Functions

inline double GetTotalPower() const

Get total power consumption.

Returns:

Sum of static and dynamic power in Watts.

Public Members

double staticPower = {0.0}

Static/leakage power in Watts.

double dynamicPower = {0.0}

Dynamic/switching power in Watts.

DvfsEnergyModel

class DvfsEnergyModel : public ns3::EnergyModel

DVFS-based energy model for accelerators.

DvfsEnergyModel implements the standard DVFS power equation: P_dynamic = C * V^2 * f * utilization

Where:

  • C is the effective capacitance (switching capacitance)

  • V is the operating voltage

  • f is the operating frequency

  • utilization is the task-local compute utilization factor [0.0, 1.0]

Total power = P_static + P_dynamic

Example usage:

Ptr<DvfsEnergyModel> energy = CreateObject<DvfsEnergyModel>();
energy->SetAttribute("StaticPower", DoubleValue(30.0));     // 30W static
energy->SetAttribute("EffectiveCapacitance", DoubleValue(2e-9));  // 2nF

// For a GPU at 1.5 GHz, 0.9 V, full utilization (1.0):
// P_dynamic = 2e-9 * (0.9)^2 * 1.5e9 * 1.0 = 2.43 W
// Total when active: 30 + 2.43 = 32.43 W
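The equation can be evaluated directly in a minimal self-contained sketch (plain C++, not the ns-3 API; the function name is illustrative):

```cpp
// Hypothetical sketch of the DVFS power equation:
//   P_total = P_static + C * V^2 * f * utilization
// staticPower in Watts, capacitance in Farads, voltage in Volts,
// frequency in Hz, utilization in [0.0, 1.0]; result in Watts.
double DvfsPower(double staticPower, double capacitance, double voltage,
                 double frequency, double utilization)
{
    return staticPower + capacitance * voltage * voltage * frequency * utilization;
}
```

Plugging in the example values (30 W static, 2 nF, 0.9 V, 1.5 GHz, utilization 1.0) reproduces the roughly 32.43 W total above; at idle (utilization 0) only the static 30 W remains.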

Public Functions

virtual EnergyModel::PowerState CalculateIdlePower(Ptr<Accelerator> accelerator) override

Calculate power consumption when accelerator is idle.

Parameters:

accelerator – The accelerator to calculate idle power for.

Returns:

PowerState with idle power values.

virtual EnergyModel::PowerState CalculateActivePower(Ptr<Accelerator> accelerator, double utilization) override

Calculate power consumption when accelerator is active.

Parameters:
  • accelerator – The accelerator to calculate active power for.

  • utilization – Current task-local utilization level [0.0, 1.0].

Returns:

PowerState with active power values.

virtual std::string GetName() const override

Get the name of this energy model.

Returns:

A string identifying the energy model (e.g., “DVFS”, “Fixed”).

double GetEffectiveCapacitance() const

Get the effective capacitance.

Returns:

Effective capacitance in Farads.

double GetStaticPower() const

Get the static power.

Returns:

Static power in Watts.

Public Static Functions

static TypeId GetTypeId()

Get the type ID.

Returns:

The object TypeId.

DeviceMetrics

class DeviceMetrics : public SimpleRefCount<DeviceMetrics>

Backend telemetry reported from accelerator to orchestrator.

Contains only metrics the orchestrator cannot infer from its own bookkeeping.

Public Members

double utilization = {0}

Rolling compute utilization [0.0, 1.0].

double currentPower = {0}

Current total power draw in Watts.

ScalingDecision

class ScalingDecision : public SimpleRefCount<ScalingDecision>

Scaling command produced by a ScalingPolicy.

Carries the target performance state index for a backend accelerator. Serialized into a ScalingCommandHeader by DeviceManager and sent to the backend.

Public Members

uint32_t targetStateIdx = {0}

Target performance state index.