Optimizers

class HeteroSymNN.Core.optimizers.AdamOptimizer(learning_rate: float | None = None, computational_device: Literal['GPU', 'CPU'] | None = None, device_id: int | None = None, beta1: float = 0.9, beta2: float = 0.999, epsilon: float = 1e-8)

Bases: Optimizer

Adam optimizer.

Parameters:

learning_rate (float, optional) – The learning rate. Defaults to 0.001.
computational_device (Literal["GPU", "CPU"], optional) – The device where computations will be performed.
device_id (int, optional) – The ID of the GPU to use if computational_device is "GPU".
beta1 (float, optional) – The exponential decay rate for the 1st moment estimates. Defaults to 0.9.
beta2 (float, optional) – The exponential decay rate for the 2nd moment estimates. Defaults to 0.999.
epsilon (float, optional) – A small constant for numerical stability. Defaults to 1e-8.

get_config()

Returns the configuration of the optimizer.

The returned dictionary contains the configuration parameters necessary to reconstruct the optimizer instance.

Returns: Dictionary containing the configuration parameters.
Return type: dict
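The exact keys of the configuration dictionary are not listed here. The following is a minimal sketch, assuming the keys mirror the constructor arguments in the signature above. AdamConfigSketch is a hypothetical stand-in used for illustration, not the library class.

```python
from dataclasses import dataclass, asdict
from typing import Literal, Optional


@dataclass
class AdamConfigSketch:
    """Hypothetical stand-in that models only the constructor arguments."""
    learning_rate: Optional[float] = None
    computational_device: Optional[Literal["GPU", "CPU"]] = None
    device_id: Optional[int] = None
    beta1: float = 0.9
    beta2: float = 0.999
    epsilon: float = 1e-8

    def get_config(self) -> dict:
        # Assumption: config keys match the constructor argument names.
        return asdict(self)


original = AdamConfigSketch(learning_rate=0.001, computational_device="CPU")
config = original.get_config()

# Rebuild an equivalent instance from the configuration alone.
rebuilt = AdamConfigSketch(**config)
```

Under this assumption, the configuration carries hyperparameters only; running values such as moment estimates belong to the state returned by get_state().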

get_state()

Returns the internal state of the optimizer.

The returned dictionary contains the current internal state (e.g., iteration count, moving averages) for serialization.

Returns: Dictionary containing the optimizer state.
Return type: dict

set_state(state, be)

Sets the internal state of the optimizer.

Restores the internal state (e.g., iteration count, moving averages) from the provided dictionary.

Parameters:

state (dict) – The state dictionary to load.
be (module) – The backend module (numpy or cupy) to use for creating arrays.
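A state round trip might look like the sketch below. The layout of the state dictionary (keys "t", "m", "v") and the class AdamStateSketch are assumptions made for illustration; the library's actual state format is not documented here. The sketch uses NumPy as the backend; CuPy arrays would first need to be copied to the host (for example with cupy.asnumpy) before being converted to lists.

```python
import json

import numpy as np


class AdamStateSketch:
    """Hypothetical optimizer-like object holding Adam-style state."""

    def __init__(self):
        self.t = 0   # iteration count
        self.m = {}  # first-moment estimates, keyed by parameter name
        self.v = {}  # second-moment estimates, keyed by parameter name

    def get_state(self) -> dict:
        # Convert arrays to nested lists so the state can be serialized.
        return {
            "t": self.t,
            "m": {k: np.asarray(a).tolist() for k, a in self.m.items()},
            "v": {k: np.asarray(a).tolist() for k, a in self.v.items()},
        }

    def set_state(self, state: dict, be) -> None:
        # `be` is the array backend module; numpy and cupy both provide asarray.
        self.t = int(state["t"])
        self.m = {k: be.asarray(a) for k, a in state["m"].items()}
        self.v = {k: be.asarray(a) for k, a in state["v"].items()}


source = AdamStateSketch()
source.t = 3
source.m = {"w0": np.array([0.1, -0.2])}
source.v = {"w0": np.array([0.01, 0.04])}

# Serialize to JSON and restore into a fresh object on the NumPy backend.
payload = json.dumps(source.get_state())
restored = AdamStateSketch()
restored.set_state(json.loads(payload), be=np)
```

Passing the backend module explicitly lets the same saved state be restored onto either device without the state itself recording where it was created.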

step(layers: list, inputs)

Performs a single optimization step using Adam.

Parameters:

layers (list) – List of layers to update.
inputs (Any) – Input data.
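For reference, the standard Adam update for a single parameter array can be sketched as follows. This illustrates the textbook rule using the beta1, beta2, and epsilon defaults listed above; the function adam_update and its arguments are hypothetical, and how the library applies the rule to layers may differ.

```python
import numpy as np


def adam_update(param, grad, m, v, t, learning_rate=0.001,
                beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Apply one Adam update to a parameter array; t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    param = param - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return param, m, v


w = np.array([1.0, -2.0])
g = np.array([0.5, -0.25])
m = np.zeros_like(w)
v = np.zeros_like(w)
w, m, v = adam_update(w, g, m, v, t=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly learning_rate regardless of the gradient's magnitude.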

class HeteroSymNN.Core.optimizers.Optimizer(learning_rate: float | None = None, computational_device: Literal['GPU', 'CPU'] | None = None, device_id: int | None = None)

Bases: object

Base class for all optimizers.

Parameters:

learning_rate (float, optional) – The learning rate for the optimizer.
computational_device (Literal["GPU", "CPU"], optional) – The device where computations will be performed.
device_id (int, optional) – The ID of the GPU to use if computational_device is "GPU".

get_config()

Returns the configuration of the optimizer.

This method should be implemented by subclasses to return a dictionary containing the configuration parameters necessary to reconstruct the optimizer instance.

Returns: Dictionary containing the configuration parameters.
Return type: dict

get_state()

Returns the internal state of the optimizer.

This method should be implemented by subclasses to return a dictionary containing the current internal state (e.g., iteration count, moving averages) for serialization.

Returns: Dictionary containing the optimizer state.
Return type: dict

set_gpu_id(new_id: int)

Sets the GPU ID for the optimizer.

Parameters: new_id (int) – The new GPU ID.

set_state(state, be)

Sets the internal state of the optimizer.

This method should be implemented by subclasses to restore the internal state from a provided dictionary.

Parameters:

state (dict) – The state dictionary to load.
be (module) – The backend module (numpy or cupy) to use for creating arrays.

step(layers: list, inputs)

Performs a single optimization step.

This method must be implemented by subclasses to define the specific optimization logic (e.g., SGD update, Adam update) applied to the layers.

Parameters:

layers (list) – List of layers to update.
inputs (Any) – Input data (used by some optimizers for gradient calculation context if needed).
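A custom optimizer would override step() along the lines of the sketch below. Everything here is a stand-in: OptimizerSketch mimics only the step() contract of the base class, and the layer attributes weights and grad_weights are assumed names, since the layer interface is not documented in this section. The example implements SGD with classical momentum.

```python
import numpy as np


class OptimizerSketch:
    """Stand-in for the base class; the real constructor takes more arguments."""

    def __init__(self, learning_rate=None):
        self.learning_rate = learning_rate

    def step(self, layers, inputs):
        raise NotImplementedError


class MomentumSketch(OptimizerSketch):
    """Hypothetical subclass: SGD with classical momentum."""

    def __init__(self, learning_rate=0.01, momentum=0.9):
        super().__init__(learning_rate)
        self.momentum = momentum
        self.velocity = {}  # one velocity buffer per layer, keyed by id(layer)

    def step(self, layers, inputs):
        # `inputs` is accepted to match the signature but is unused here.
        for layer in layers:
            vel = self.velocity.get(id(layer), np.zeros_like(layer.weights))
            vel = self.momentum * vel - self.learning_rate * layer.grad_weights
            self.velocity[id(layer)] = vel
            layer.weights = layer.weights + vel


class DenseSketch:
    """Hypothetical layer exposing `weights` and `grad_weights` (assumed names)."""

    def __init__(self, weights, grad_weights):
        self.weights = np.asarray(weights, dtype=float)
        self.grad_weights = np.asarray(grad_weights, dtype=float)


layer = DenseSketch([1.0, 2.0], [0.5, -0.5])
opt = MomentumSketch(learning_rate=0.1, momentum=0.9)
opt.step([layer], inputs=None)  # first step: plain gradient step
opt.step([layer], inputs=None)  # second step: velocity carries over
```

Any per-layer buffers kept this way are the kind of internal state that get_state() and set_state() would need to serialize and restore.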

class HeteroSymNN.Core.optimizers.SgdOptimizer(learning_rate: float | None = None, computational_device: Literal['GPU', 'CPU'] | None = None, device_id: int | None = None)

Bases: Optimizer

Stochastic Gradient Descent (SGD) optimizer.

Parameters:

learning_rate (float, optional) – The learning rate. Defaults to 0.01.
computational_device (Literal["GPU", "CPU"], optional) – The device where computations will be performed.
device_id (int, optional) – The ID of the GPU to use if computational_device is "GPU".

step(layers: list, inputs)

Performs a single optimization step using SGD.

Parameters:

layers (list) – List of layers to update.
inputs (Any) – Input data.
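The plain SGD rule for a single parameter array can be sketched as follows, using the 0.01 default learning rate stated above. The function sgd_update is hypothetical and illustrates only the update rule, not how the library traverses layers.

```python
import numpy as np


def sgd_update(param, grad, learning_rate=0.01):
    """Move the parameter against its gradient, scaled by the learning rate."""
    return param - learning_rate * grad


w = np.array([1.0, -2.0])
g = np.array([0.5, -0.25])
w_new = sgd_update(w, g)
```

Unlike Adam, the step size here scales directly with the gradient magnitude, and no state is carried between steps.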