soft.fuzzy.reinforcement package
Submodules
soft.fuzzy.reinforcement.policy module
This module contains the custom policy for use with the stable_baselines3 library. It is based on the ActorCriticPolicy class from stable_baselines3, but modified to use the FuzzyActorCritic class instead of the ActorCritic class. This is necessary because the FuzzyActorCritic class is a custom network for the policy and value functions.
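A minimal usage sketch, assuming a standard Gymnasium environment. The contents of the controllers dict are package-specific, so an empty dict stands in as a placeholder here:

```python
import gymnasium as gym
from stable_baselines3 import PPO

from soft.fuzzy.reinforcement.policy import FuzzyActorCriticPolicy

env = gym.make("CartPole-v1")

# "controllers" is forwarded to the policy constructor via policy_kwargs;
# the empty dict is only a placeholder for real controller objects.
model = PPO(FuzzyActorCriticPolicy, env, policy_kwargs={"controllers": {}}, verbose=1)
model.learn(total_timesteps=10_000)
```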
- class soft.fuzzy.reinforcement.policy.FuzzyActorCritic(policy_network, value_network)
Bases: Module
Custom network for the policy and value function. It receives as input the features extracted by the features extractor (see the construction sketch after the method listing below).
- forward(features: Tensor) → Tuple[Tensor, Tensor]
- Returns:
A tuple (latent_policy, latent_value) of torch.Tensor from the specified network. If all layers are shared, then latent_policy == latent_value.
- forward_actor(features: Tensor) → Tensor
Forward pass in the policy network.
- Parameters:
features – The features to pass through the policy network.
- Returns:
The output of the policy network.
- forward_critic(features: Tensor) → Tensor
Forward pass in the value function.
- Parameters:
features – The features to pass through the value function.
- Returns:
The output of the value function.
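A hedged construction sketch: it is an assumption here that plain torch nn.Sequential modules are accepted as policy_network and value_network, and the feature and layer sizes are illustrative only.

```python
import torch
from torch import nn

from soft.fuzzy.reinforcement.policy import FuzzyActorCritic

# Two ordinary sub-networks standing in for the policy and value networks
# (assumed acceptable inputs; sizes are illustrative).
policy_net = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 64))
value_net = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 64))

net = FuzzyActorCritic(policy_net, value_net)

features = torch.randn(4, 8)  # a batch of 4 feature vectors
latent_policy, latent_value = net(features)  # forward() returns both latents
```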
- class soft.fuzzy.reinforcement.policy.FuzzyActorCriticPolicy(observation_space: Space, action_space: Space, lr_schedule: Callable[[float], float], controllers: Dict, *args, **kwargs)
Bases: ActorCriticPolicy
A custom policy for use with the stable_baselines3 library.
- add_controllers(controllers) → None
Add the controllers to the policy.
- Parameters:
controllers – The controllers to add.
- Returns:
None
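A hedged sketch of calling add_controllers on a live policy instance; as above, the empty dicts are placeholders for real controller objects from this package.

```python
import gymnasium as gym
from stable_baselines3 import PPO

from soft.fuzzy.reinforcement.policy import FuzzyActorCriticPolicy

env = gym.make("CartPole-v1")
model = PPO(FuzzyActorCriticPolicy, env, policy_kwargs={"controllers": {}})

# Attach additional controllers to the constructed policy; returns None.
model.policy.add_controllers({})
```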
- class soft.fuzzy.reinforcement.policy.FuzzyDQNPolicy(observation_space: Space, action_space: Discrete, lr_schedule: Callable[[float], float], net_arch: List[int] | None = None, activation_fn: Type[Module] = ReLU, features_extractor_class: Type[BaseFeaturesExtractor] = FlattenExtractor, features_extractor_kwargs: Dict[str, Any] | None = None, normalize_images: bool = True, optimizer_class: Type[Optimizer] = Adam, optimizer_kwargs: Dict[str, Any] | None = None)
Bases: DQNPolicy
Fuzzy Q-Network policy.
- make_q_net() → ZeroOrderTSK
- q_net: ZeroOrderTSK
- q_net_target: ZeroOrderTSK
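A hedged training sketch with this policy. Whether FuzzyDQNPolicy requires extra policy_kwargs is package-specific; none are passed here, and the environment and hyperparameters are illustrative.

```python
import gymnasium as gym
from stable_baselines3 import DQN

from soft.fuzzy.reinforcement.policy import FuzzyDQNPolicy

env = gym.make("CartPole-v1")

# DQN builds q_net and q_net_target via make_q_net(), so both are
# ZeroOrderTSK networks per the attribute annotations above.
model = DQN(FuzzyDQNPolicy, env, learning_rate=1e-3, verbose=1)
model.learn(total_timesteps=10_000)
```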