soft.fuzzy.reinforcement package


soft.fuzzy.reinforcement.policy module

This module contains the custom policy for use with the stable_baselines3 library. It is based on the ActorCriticPolicy class from stable_baselines3, but modified to use the FuzzyActorCritic class instead of the ActorCritic class. This is necessary because the FuzzyActorCritic class is a

class soft.fuzzy.reinforcement.policy.FuzzyActorCritic(policy_network, value_network)

Bases: Module

Custom network for policy and value function. It receives as input the features extracted by the features extractor.

forward(features: Tensor) → Tuple[Tensor, Tensor]

Returns: (torch.Tensor, torch.Tensor) latent_policy, latent_value of the specified network. If all layers are shared, then latent_policy == latent_value.

forward_actor(features: Tensor) → Tensor

Forward pass in the policy network.


Parameters: features – The features to pass through the policy network.

Returns: The output of the policy network.

forward_critic(features: Tensor) → Tensor

Forward pass in the value function.


Parameters: features – The features to pass through the value function.

Returns: The output of the value function.
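
The three methods above follow the usual two-headed pattern: forward returns both latents, while forward_actor and forward_critic each run one branch. The sketch below illustrates that interface only. It is not the package's FuzzyActorCritic: the class name SketchActorCritic, the attribute names, and the nn.Linear stand-ins for the policy and value networks are all assumptions made for illustration.

```python
from typing import Tuple

import torch
from torch import Tensor, nn


class SketchActorCritic(nn.Module):
    """Illustrative stand-in with the same three methods (hypothetical class)."""

    def __init__(self, policy_network: nn.Module, value_network: nn.Module):
        super().__init__()
        # Attribute names are assumptions; the real class may store these differently.
        self.policy_network = policy_network
        self.value_network = value_network

    def forward(self, features: Tensor) -> Tuple[Tensor, Tensor]:
        # Returns (latent_policy, latent_value) for the same input features.
        return self.forward_actor(features), self.forward_critic(features)

    def forward_actor(self, features: Tensor) -> Tensor:
        return self.policy_network(features)

    def forward_critic(self, features: Tensor) -> Tensor:
        return self.value_network(features)


# A batch of 4 feature vectors of size 8 (sizes chosen arbitrarily).
features = torch.randn(4, 8)

# Separate branches: plain linear layers stand in for the real networks.
net = SketchActorCritic(nn.Linear(8, 16), nn.Linear(8, 16))
latent_policy, latent_value = net(features)

# Fully shared branch: the same module serves as actor and critic,
# so both latents are identical.
shared = nn.Linear(8, 16)
shared_net = SketchActorCritic(shared, shared)
shared_policy, shared_value = shared_net(features)
```

Passing the same module for both arguments reproduces the "all layers are shared" case mentioned under forward, where latent_policy == latent_value.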

class soft.fuzzy.reinforcement.policy.FuzzyActorCriticPolicy(observation_space: Space, action_space: Space, lr_schedule: Callable[[float], float], controllers: Dict, *args, **kwargs)

Bases: ActorCriticPolicy

A custom policy for use with the stable_baselines3 library.

add_controllers(controllers) → None

Add the controllers to the policy.


Parameters: controllers – The controllers to add.



class soft.fuzzy.reinforcement.policy.FuzzyDQNPolicy(observation_space, action_space: Discrete, lr_schedule: Callable[[float], float], net_arch: List[int] | None = None, activation_fn: Type[Module] = ReLU, features_extractor_class: Type[BaseFeaturesExtractor] = FlattenExtractor, features_extractor_kwargs: Dict[str, Any] | None = None, normalize_images: bool = True, optimizer_class: Type[Optimizer] = Adam, optimizer_kwargs: Dict[str, Any] | None = None)

Bases: DQNPolicy

Fuzzy Q-Network policy.

make_q_net() → ZeroOrderTSK
q_net: ZeroOrderTSK
q_net_target: ZeroOrderTSK
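
This page names ZeroOrderTSK only as the type of the Q-network and does not describe it. As background, a zero-order Takagi-Sugeno-Kang (TSK) system in its generic textbook form assigns each rule a constant consequent and outputs the firing-strength-weighted average of those constants. The sketch below shows that computation for a Q-value setting, with one constant per action in each rule. It assumes Gaussian membership functions and a product conjunction; the function names and parameters are hypothetical, and the package's ZeroOrderTSK may differ in every one of these choices.

```python
import math
from typing import List, Sequence


def gaussian(x: float, center: float, width: float) -> float:
    """Gaussian membership degree of x (assumed membership shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def zero_order_tsk(
    x: Sequence[float],
    centers: Sequence[Sequence[float]],
    widths: Sequence[Sequence[float]],
    consequents: Sequence[Sequence[float]],
) -> List[float]:
    """Generic zero-order TSK inference (hypothetical helper).

    centers[r][d] and widths[r][d] describe rule r on input dimension d;
    consequents[r][a] is rule r's constant output for action a.
    """
    # Firing strength of each rule: product of its per-dimension memberships.
    strengths = [
        math.prod(gaussian(xi, c, w) for xi, c, w in zip(x, rule_c, rule_w))
        for rule_c, rule_w in zip(centers, widths)
    ]
    # Assumes at least one rule fires; total is zero only on numerical underflow.
    total = sum(strengths)
    n_actions = len(consequents[0])
    # Normalised weighted average of the constant consequents, per action.
    return [
        sum(s * q[a] for s, q in zip(strengths, consequents)) / total
        for a in range(n_actions)
    ]


# Two rules over a 1-D observation, two actions.
centers = [[0.0], [1.0]]
widths = [[1.0], [1.0]]
consequents = [[1.0, 0.0], [0.0, 1.0]]

q_mid = zero_order_tsk([0.5], centers, widths, consequents)   # equidistant from both rules
q_left = zero_order_tsk([0.0], centers, widths, consequents)  # sits on the first rule's center
```

At the midpoint both rules fire equally, so the two action values coincide; at the first rule's center that rule dominates and its preferred action receives the larger value.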

