Optimizers

class HeteroSymNN.Core.optimizers.AdamOptimizer(learning_rate: float | None = None, computational_device: Literal['GPU', 'CPU'] | None = None, device_id: int | None = None, beta1: float = 0.9, beta2: float = 0.999, epsilon: float = 1e-8)

Bases: Optimizer

Adam optimizer.

Parameters:
  • learning_rate (float, optional) – The learning rate. Defaults to 0.001 when left as None.

  • computational_device (Literal["GPU", "CPU"], optional) – The device where computations will be performed.

  • device_id (int, optional) – The ID of the GPU to use if computational_device is "GPU".

  • beta1 (float, optional) – The exponential decay rate for the 1st moment estimates. Defaults to 0.9.

  • beta2 (float, optional) – The exponential decay rate for the 2nd moment estimates. Defaults to 0.999.

  • epsilon (float, optional) – A small constant for numerical stability. Defaults to 1e-8.

get_config()

Returns the configuration of the optimizer.

The returned dictionary contains the configuration parameters necessary to reconstruct the optimizer instance.

Returns:

Dictionary containing the configuration parameters.

Return type:

dict
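As an illustration of how a configuration dictionary can be used to reconstruct an optimizer, here is a minimal sketch. It does not use the HeteroSymNN classes: `ToyAdamConfig` is a hypothetical stand-in, and the dictionary keys are assumptions chosen to mirror the constructor parameters documented above.

```python
class ToyAdamConfig:
    """Hypothetical stand-in, not the HeteroSymNN AdamOptimizer."""

    def __init__(self, learning_rate=None, beta1=0.9, beta2=0.999, epsilon=1e-8):
        # Assumption: None falls back to the documented default of 0.001.
        self.learning_rate = 0.001 if learning_rate is None else learning_rate
        self.beta1 = beta1
        self.beta2 = beta2
        self.epsilon = epsilon

    def get_config(self):
        # Return only plain values, so the dict can be passed back to __init__.
        return {
            "learning_rate": self.learning_rate,
            "beta1": self.beta1,
            "beta2": self.beta2,
            "epsilon": self.epsilon,
        }


original = ToyAdamConfig(learning_rate=0.01, beta1=0.8)
config = original.get_config()

# Reconstruct an equivalent instance from the configuration alone.
rebuilt = ToyAdamConfig(**config)
```

Keeping the dictionary keys identical to the constructor argument names is what makes the `**config` reconstruction work in this sketch.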

get_state()

Returns the internal state of the optimizer.

The returned dictionary contains the current internal state (e.g., iteration count, moving averages) for serialization.

Returns:

Dictionary containing the optimizer state.

Return type:

dict

set_state(state, be)

Sets the internal state of the optimizer.

Restores the internal state from the provided dictionary.

Parameters:
  • state (dict) – The state dictionary to load.

  • be (module) – The backend module (numpy or cupy) to use for creating arrays.
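The round trip between get_state and set_state can be sketched as follows. This is not the HeteroSymNN implementation: `ToyMomentState`, its attribute names, and the layout of the state dictionary are all assumptions made for illustration. Only the idea of passing the backend module as `be` comes from the documentation above.

```python
import numpy as np


class ToyMomentState:
    """Hypothetical stand-in that tracks an iteration count and moving averages."""

    def __init__(self):
        self.t = 0   # iteration count
        self.m = {}  # first-moment estimates, keyed by parameter name
        self.v = {}  # second-moment estimates, keyed by parameter name

    def get_state(self):
        # Convert arrays to nested lists so the state is easy to serialize.
        # This toy version assumes the arrays are NumPy arrays.
        return {
            "t": self.t,
            "m": {name: np.asarray(a).tolist() for name, a in self.m.items()},
            "v": {name: np.asarray(a).tolist() for name, a in self.v.items()},
        }

    def set_state(self, state, be):
        # `be` is the backend module; both numpy and cupy provide asarray.
        self.t = state["t"]
        self.m = {name: be.asarray(a) for name, a in state["m"].items()}
        self.v = {name: be.asarray(a) for name, a in state["v"].items()}


saved = ToyMomentState()
saved.t = 3
saved.m = {"w0": np.array([0.1, -0.2])}
saved.v = {"w0": np.array([0.01, 0.04])}
state = saved.get_state()

restored = ToyMomentState()
restored.set_state(state, be=np)
```

Passing the backend as an argument lets the same stored state be rebuilt as arrays for either device, which is presumably why set_state takes `be`.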

step(layers: list, inputs)

Performs a single optimization step using Adam.

Parameters:
  • layers (list) – List of layers to update.

  • inputs (Any) – Input data.
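For reference, the standard Adam update for a single parameter array can be sketched as below. This shows the textbook rule with the documented defaults for beta1, beta2, and epsilon. It is not taken from the HeteroSymNN source: `adam_update` is a hypothetical helper, and how step reads gradients from the layers is not shown here.

```python
import numpy as np


def adam_update(param, grad, m, v, t,
                learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Apply one textbook Adam update (hypothetical helper, t starts at 1)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # epsilon keeps the division stable when v_hat is close to zero.
    param = param - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return param, m, v


w = np.array([1.0, -2.0])
g = np.array([0.5, -0.5])
m = np.zeros_like(w)
v = np.zeros_like(w)

w_new, m, v = adam_update(w, g, m, v, t=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly learning_rate regardless of the gradient's magnitude.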

class HeteroSymNN.Core.optimizers.Optimizer(learning_rate: float | None = None, computational_device: Literal['GPU', 'CPU'] | None = None, device_id: int | None = None)

Bases: object

Base class for all optimizers.

Parameters:
  • learning_rate (float, optional) – The learning rate for the optimizer.

  • computational_device (Literal["GPU", "CPU"], optional) – The device where computations will be performed.

  • device_id (int, optional) – The ID of the GPU to use if computational_device is "GPU".

get_config()

Returns the configuration of the optimizer.

This method should be implemented by subclasses to return a dictionary containing the configuration parameters necessary to reconstruct the optimizer instance.

Returns:

Dictionary containing the configuration parameters.

Return type:

dict

get_state()

Returns the internal state of the optimizer.

This method should be implemented by subclasses to return a dictionary containing the current internal state (e.g., iteration count, moving averages) for serialization.

Returns:

Dictionary containing the optimizer state.

Return type:

dict

set_gpu_id(new_id: int)

Sets the GPU ID for the optimizer.

Parameters:
  • new_id (int) – The new GPU ID.

set_state(state, be)

Sets the internal state of the optimizer.

This method should be implemented by subclasses to restore the internal state from a provided dictionary.

Parameters:
  • state (dict) – The state dictionary to load.

  • be (module) – The backend module (numpy or cupy) to use for creating arrays.

step(layers: list, inputs)

Performs a single optimization step.

This method must be implemented by subclasses to define the specific optimization logic (e.g., SGD update, Adam update) applied to the layers.

Parameters:
  • layers (list) – List of layers to update.

  • inputs (Any) – Input data. Some optimizers may use it as context for gradient calculation.

class HeteroSymNN.Core.optimizers.SgdOptimizer(learning_rate: float | None = None, computational_device: Literal['GPU', 'CPU'] | None = None, device_id: int | None = None)

Bases: Optimizer

Stochastic Gradient Descent (SGD) optimizer.

Parameters:
  • learning_rate (float, optional) – The learning rate. Defaults to 0.01 when left as None.

  • computational_device (Literal["GPU", "CPU"], optional) – The device where computations will be performed.

  • device_id (int, optional) – The ID of the GPU to use if computational_device is "GPU".

step(layers: list, inputs)

Performs a single optimization step using SGD.

Parameters:
  • layers (list) – List of layers to update.

  • inputs (Any) – Input data.
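The plain SGD rule moves each parameter against its gradient, scaled by the learning rate. The sketch below shows that rule for a single parameter array, using the documented default of 0.01. `sgd_update` is a hypothetical helper, not part of HeteroSymNN, and whether the library's step adds anything beyond this basic rule is not stated in this reference.

```python
import numpy as np


def sgd_update(param, grad, learning_rate=0.01):
    """Apply one plain SGD update (hypothetical helper)."""
    # Step in the direction opposite to the gradient.
    return param - learning_rate * grad


w = np.array([1.0, -2.0])
g = np.array([0.5, -0.5])

w_new = sgd_update(w, g)
```

Unlike Adam, this rule keeps no moving averages, so the step size scales directly with the gradient's magnitude.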