lisbet.modeling#
PyTorch models and their extensions. The transformer model is based on ViT [1] and its reference implementation in JAX/Flax, available at google-research/vision_transformer.
Notes
[a] Early versions of LISBET used TensorFlow/Keras.
References
- [1] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv:2010.11929 [cs]. http://arxiv.org/abs/2010.11929
- class lisbet.modeling.FrameClassificationHead(output_token_idx, input_dim, num_classes, hidden_dim=None)[source]#
Frame-level classification head.
This head selects a specific token from the sequence (typically the last one) and applies a classification layer to predict frame-level labels.
- Parameters:
output_token_idx (int) – Index of the token to use for classification (e.g., -1 for the last token).
input_dim (int) – Dimension of the input embeddings (formerly emb_dim).
num_classes (int) – Number of output classes (formerly out_dim).
hidden_dim (int | None) – Dimension of the hidden layer. If None, uses a single linear layer. If provided, uses an MLP with the specified hidden dimension.
- output_token_idx#
Index of the token used for classification.
- Type:
int
- logits#
Classification layer (either Linear or MLP).
- Type:
nn.Module
- __init__(output_token_idx, input_dim, num_classes, hidden_dim=None)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Forward pass through the frame classification head.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, sequence_length, input_dim).
- Returns:
Classification logits of shape (batch_size, num_classes).
- Return type:
Tensor
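The token selection followed by a Linear or MLP classifier described above can be sketched in plain PyTorch. This is a minimal illustration, not LISBET's implementation: `FrameHeadSketch` is a hypothetical stand-in, and the ReLU activation inside the MLP branch is an assumption.

```python
import torch
from torch import nn


class FrameHeadSketch(nn.Module):
    """Hypothetical stand-in for a frame-level head (not LISBET's code)."""

    def __init__(self, output_token_idx, input_dim, num_classes, hidden_dim=None):
        super().__init__()
        self.output_token_idx = output_token_idx
        if hidden_dim is None:
            # Single linear layer mapping the selected token to class logits.
            self.logits = nn.Linear(input_dim, num_classes)
        else:
            # Two-layer MLP; the ReLU activation is an assumption of this sketch.
            self.logits = nn.Sequential(
                nn.Linear(input_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, num_classes),
            )

    def forward(self, x):
        # x: (batch_size, sequence_length, input_dim)
        token = x[:, self.output_token_idx]  # (batch_size, input_dim)
        return self.logits(token)  # (batch_size, num_classes)


head = FrameHeadSketch(output_token_idx=-1, input_dim=32, num_classes=5)
x = torch.randn(4, 10, 32)
out = head(x)
print(out.shape)  # torch.Size([4, 5])
```

Because only one token is selected, the sequence dimension disappears from the output, which matches the documented (batch_size, num_classes) shape.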
- get_config()[source]#
Get the configuration dictionary for this head.
- Returns:
Configuration dictionary containing all parameters needed to recreate this head instance.
- Return type:
dict[str,Any]
- classmethod from_config(config)[source]#
Create a FrameClassificationHead instance from a configuration dictionary.
- Parameters:
config (dict[str,Any]) – Configuration dictionary containing all parameters needed to create the head instance.
- Returns:
New FrameClassificationHead instance created from the configuration.
- Return type:
FrameClassificationHead
- class lisbet.modeling.WindowClassificationHead(input_dim, num_classes, hidden_dim=None)[source]#
Window-level classification head.
This head performs global max pooling over the sequence dimension and applies a classification layer to predict window-level labels.
- Parameters:
input_dim (int) – Dimension of the input embeddings (formerly emb_dim).
num_classes (int) – Number of output classes (formerly out_dim).
hidden_dim (int | None) – Dimension of the hidden layer. If None, uses a single linear layer. If provided, uses an MLP with the specified hidden dimension.
- logits#
Classification layer (either Linear or MLP).
- Type:
nn.Module
- __init__(input_dim, num_classes, hidden_dim=None)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Forward pass through the window classification head.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, sequence_length, input_dim).
- Returns:
Classification logits of shape (batch_size, num_classes).
- Return type:
Tensor
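Global max pooling over the sequence dimension, as described for this head, can be illustrated as follows. This is a hedged sketch: `WindowHeadSketch` is a hypothetical name, only the single-linear-layer case is shown, and the real head may differ in detail.

```python
import torch
from torch import nn


class WindowHeadSketch(nn.Module):
    """Hypothetical stand-in for a window-level head (not LISBET's code)."""

    def __init__(self, input_dim, num_classes):
        super().__init__()
        self.logits = nn.Linear(input_dim, num_classes)

    def forward(self, x):
        # x: (batch_size, sequence_length, input_dim)
        # Take the maximum of every feature across all frames in the window.
        pooled = x.max(dim=1).values  # (batch_size, input_dim)
        return self.logits(pooled)  # (batch_size, num_classes)


head = WindowHeadSketch(input_dim=32, num_classes=3)
x = torch.randn(4, 10, 32)
out = head(x)
print(out.shape)  # torch.Size([4, 3])
```

Unlike the frame-level head, every frame in the window can contribute to the prediction, since each pooled feature comes from whichever frame had the largest value.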
- get_config()[source]#
Get the configuration dictionary for this head.
- Returns:
Configuration dictionary containing all parameters needed to recreate this head instance.
- Return type:
dict[str,Any]
- classmethod from_config(config)[source]#
Create a WindowClassificationHead instance from a configuration dictionary.
- Parameters:
config (dict[str,Any]) – Configuration dictionary containing all parameters needed to create the head instance.
- Returns:
New WindowClassificationHead instance created from the configuration.
- Return type:
WindowClassificationHead
- class lisbet.modeling.EmbeddingHead(output_token_idx)[source]#
Embedding head for extracting behavior embeddings.
This head selects a specific token from the sequence (typically the last one) and returns it as the behavior embedding without any additional transformation.
- Parameters:
output_token_idx (int) – Index of the token to use for embedding extraction (e.g., -1 for the last token).
- output_token_idx#
Index of the token used for embedding extraction.
- Type:
int
- __init__(output_token_idx)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Forward pass through the embedding head.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, sequence_length, embedding_dim).
- Returns:
Embedding tensor of shape (batch_size, embedding_dim).
- Return type:
Tensor
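Since this head applies no transformation, its forward pass reduces to indexing along the sequence dimension. The sketch below shows that behavior on a small tensor; `EmbeddingHeadSketch` is a hypothetical name used only for illustration.

```python
import torch
from torch import nn


class EmbeddingHeadSketch(nn.Module):
    """Hypothetical stand-in that returns one token unchanged."""

    def __init__(self, output_token_idx):
        super().__init__()
        self.output_token_idx = output_token_idx

    def forward(self, x):
        # x: (batch_size, sequence_length, embedding_dim)
        return x[:, self.output_token_idx]  # (batch_size, embedding_dim)


# Two windows, three frames each, four embedding features per frame.
x = torch.arange(2 * 3 * 4, dtype=torch.float32).reshape(2, 3, 4)
head = EmbeddingHeadSketch(output_token_idx=-1)
emb = head(x)
print(emb.shape)  # torch.Size([2, 4])
```

With `output_token_idx=-1`, the result is the embedding of the last frame of each window.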
- get_config()[source]#
Get the configuration dictionary for this head.
- Returns:
Configuration dictionary containing all parameters needed to recreate this head instance.
- Return type:
dict[str,Any]
- classmethod from_config(config)[source]#
Create an EmbeddingHead instance from a configuration dictionary.
- Parameters:
config (dict[str,Any]) – Configuration dictionary containing all parameters needed to create the head instance.
- Returns:
New EmbeddingHead instance created from the configuration.
- Return type:
EmbeddingHead
- class lisbet.modeling.MultiTaskModel(backbone, task_heads, model_id='lisbet_model')[source]#
Multi-task model that combines a backbone with multiple task-specific heads.
This model enables training and inference across multiple tasks using a shared backbone representation. Each task has its own dedicated head that processes the backbone output.
- Parameters:
backbone (BackboneInterface) – The backbone model that processes input sequences and produces shared representations.
task_heads (dict[str,Module]) – Dictionary mapping task IDs to their corresponding task-specific heads.
- backbone#
The shared backbone model.
- Type:
BackboneInterface
- task_heads#
Dictionary of task-specific heads.
- Type:
nn.ModuleDict
- model_id#
Identifier for the model instance, useful for logging or saving. Defaults to “lisbet_model”.
- Type:
str
- __init__(backbone, task_heads, model_id='lisbet_model')[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, task_id)[source]#
Forward pass through the model for a specific task.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, sequence_length, input_dim).
task_id (str) – Identifier for the task to use. Must be a key in task_heads.
- Returns:
Task-specific output tensor. Shape depends on the specific task head.
- Return type:
Tensor
- Raises:
KeyError – If task_id is not found in the available task heads.
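The per-task dispatch described above (shared backbone first, then the head selected by `task_id`) might look like the following. All class names, task IDs, and layer sizes here are hypothetical and chosen only for illustration; this is not LISBET's implementation.

```python
import torch
from torch import nn


class LastToken(nn.Module):
    """Toy head component that keeps only the last token."""

    def forward(self, x):
        return x[:, -1]


class MultiTaskSketch(nn.Module):
    """Hypothetical stand-in: one shared backbone, one head per task."""

    def __init__(self, backbone, task_heads):
        super().__init__()
        self.backbone = backbone
        # nn.ModuleDict registers every head's parameters with the model.
        self.task_heads = nn.ModuleDict(task_heads)

    def forward(self, x, task_id):
        if task_id not in self.task_heads:
            raise KeyError(f"Unknown task_id: {task_id!r}")
        features = self.backbone(x)  # shared representation
        return self.task_heads[task_id](features)  # task-specific output

    def get_task_ids(self):
        return list(self.task_heads.keys())


model = MultiTaskSketch(
    backbone=nn.Linear(6, 8),  # toy backbone acting on the feature dimension
    task_heads={
        "frame_cls": nn.Sequential(LastToken(), nn.Linear(8, 3)),
        "embedding": LastToken(),
    },
)
x = torch.randn(2, 5, 6)
print(model(x, "frame_cls").shape)  # torch.Size([2, 3])
print(model(x, "embedding").shape)  # torch.Size([2, 8])
```

The output shape changes with the task because each head decides how to reduce the shared (batch_size, sequence_length, embedding_dim) representation.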
- get_task_ids()[source]#
Get the list of available task IDs.
- Returns:
List of task IDs that can be used with this model.
- Return type:
list[str]
- get_config()[source]#
Get the configuration dictionary for this model.
- Returns:
Configuration dictionary containing backbone config and task head configs.
- Return type:
dict[str,Any]
- classmethod from_config(config, backbone_registry=None, head_registry=None)[source]#
Create a MultiTaskModel instance from a configuration dictionary.
- Parameters:
config (dict[str,Any]) – Configuration dictionary containing backbone and task head configs.
backbone_registry (dict[str,type] | None) – Registry mapping backbone type names to their classes. If None, uses a default registry.
head_registry (dict[str,type] | None) – Registry mapping head type names to their classes. If None, uses a default registry.
- Returns:
New MultiTaskModel instance created from the configuration.
- Return type:
MultiTaskModel
- Raises:
ValueError – If backbone or head types are not found in the registries.
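The registry lookup behind `from_config` can be illustrated with a small, framework-free sketch. The `"type"` key, the `DummyHead` class, and the `build_from_config` helper are assumptions made for this example; the actual layout of LISBET's configuration dictionaries is not documented here and may differ.

```python
from typing import Any


class DummyHead:
    """Hypothetical component that can be rebuilt from a plain dict."""

    def __init__(self, input_dim: int, num_classes: int):
        self.input_dim = input_dim
        self.num_classes = num_classes

    def get_config(self) -> dict:
        # Everything needed to recreate the instance, plus a type name
        # (the "type" key is an assumption of this sketch).
        return {
            "type": "dummy",
            "input_dim": self.input_dim,
            "num_classes": self.num_classes,
        }

    @classmethod
    def from_config(cls, config: dict) -> "DummyHead":
        return cls(config["input_dim"], config["num_classes"])


def build_from_config(config: dict, registry: dict) -> Any:
    """Look up the class named in the config and delegate to its from_config."""
    type_name = config["type"]
    if type_name not in registry:
        raise ValueError(f"Type {type_name!r} not found in registry")
    return registry[type_name].from_config(config)


registry = {"dummy": DummyHead}
original = DummyHead(input_dim=32, num_classes=5)
rebuilt = build_from_config(original.get_config(), registry)
print(rebuilt.get_config() == original.get_config())  # True
```

Passing a custom registry is what would let a caller substitute their own backbone or head classes without changing the stored configuration format.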
- class lisbet.modeling.LSTMBackbone(feature_dim, embedding_dim, hidden_dim, num_layers)[source]#
LSTM backbone for sequence modeling.
- Parameters:
feature_dim (int) – Dimension of the input features.
embedding_dim (int) – Dimension of the output embeddings.
hidden_dim (int) – Dimension of the LSTM hidden state.
num_layers (int) – Number of LSTM layers.
- __init__(feature_dim, embedding_dim, hidden_dim, num_layers)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Forward pass through the LSTM backbone.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, sequence_length, feature_dim).
- Returns:
Output tensor of shape (batch_size, sequence_length, embedding_dim).
- Return type:
Tensor
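One plausible way to obtain the documented input and output shapes from these four parameters is an `nn.LSTM` followed by a per-frame linear projection. This is a sketch under that assumption: the projection layer and the name `LSTMBackboneSketch` are hypothetical, and the actual LSTMBackbone may be structured differently.

```python
import torch
from torch import nn


class LSTMBackboneSketch(nn.Module):
    """Hypothetical stand-in for an LSTM sequence backbone."""

    def __init__(self, feature_dim, embedding_dim, hidden_dim, num_layers):
        super().__init__()
        # batch_first=True keeps tensors in (batch, sequence, feature) order.
        self.lstm = nn.LSTM(
            feature_dim, hidden_dim, num_layers=num_layers, batch_first=True
        )
        # Assumed projection from the hidden state to the embedding size.
        self.proj = nn.Linear(hidden_dim, embedding_dim)

    def forward(self, x):
        # x: (batch_size, sequence_length, feature_dim)
        hidden, _ = self.lstm(x)  # (batch_size, sequence_length, hidden_dim)
        return self.proj(hidden)  # (batch_size, sequence_length, embedding_dim)


backbone = LSTMBackboneSketch(feature_dim=6, embedding_dim=16, hidden_dim=24, num_layers=2)
x = torch.randn(3, 10, 6)
out = backbone(x)
print(out.shape)  # torch.Size([3, 10, 16])
```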
- get_config()[source]#
Get the configuration dictionary for this backbone.
- Return type:
dict[str,Any]
- class lisbet.modeling.TransformerBackbone(feature_dim, embedding_dim, hidden_dim, num_heads, num_layers, max_length)[source]#
Transformer backbone for sequence modeling.
A transformer-based backbone that processes input sequences using self-attention mechanisms. The backbone includes frame embedding, positional embedding, transformer encoder layers, and layer normalization.
- Parameters:
feature_dim (int) – Dimension of the input features.
embedding_dim (int) – Dimension of the output embeddings.
hidden_dim (int) – Dimension of the feedforward network inside transformer layers.
num_heads (int) – Number of attention heads in the multi-head attention mechanism.
num_layers (int) – Number of transformer encoder layers.
max_length (int) – Maximum sequence length for positional embeddings.
- frame_embedder#
Linear layer for embedding input frames.
- Type:
nn.Linear
- pos_embedder#
Positional embedding module.
- Type:
- transformer_encoder#
Stack of transformer encoder layers.
- Type:
nn.TransformerEncoder
- layer_norm#
Layer normalization applied to the output.
- Type:
nn.LayerNorm
- __init__(feature_dim, embedding_dim, hidden_dim, num_heads, num_layers, max_length)[source]#
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]#
Forward pass through the transformer backbone.
- Parameters:
x (Tensor) – Input tensor of shape (batch_size, sequence_length, feature_dim).
- Returns:
Output tensor of shape (batch_size, sequence_length, embedding_dim).
- Return type:
Tensor
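The four stages listed for this backbone (frame embedding, positional embedding, encoder stack, layer normalization) can be chained as in the sketch below. It is a minimal illustration built from standard PyTorch modules. The learned positional embedding stored as an `nn.Parameter` is an assumption in the spirit of ViT, and `TransformerBackboneSketch` is a hypothetical name, not LISBET's class.

```python
import torch
from torch import nn


class TransformerBackboneSketch(nn.Module):
    """Hypothetical stand-in for a transformer sequence backbone."""

    def __init__(self, feature_dim, embedding_dim, hidden_dim, num_heads,
                 num_layers, max_length):
        super().__init__()
        self.max_length = max_length
        self.frame_embedder = nn.Linear(feature_dim, embedding_dim)
        # Assumed learned positional embedding, one vector per position.
        self.pos_embedding = nn.Parameter(torch.zeros(1, max_length, embedding_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embedding_dim,
            nhead=num_heads,
            dim_feedforward=hidden_dim,
            batch_first=True,
        )
        self.transformer_encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.layer_norm = nn.LayerNorm(embedding_dim)

    def forward(self, x):
        # x: (batch_size, sequence_length, feature_dim)
        seq_len = x.shape[1]
        if seq_len > self.max_length:
            raise ValueError("sequence_length exceeds max_length")
        h = self.frame_embedder(x) + self.pos_embedding[:, :seq_len]
        h = self.transformer_encoder(h)
        return self.layer_norm(h)  # (batch_size, sequence_length, embedding_dim)


backbone = TransformerBackboneSketch(
    feature_dim=6, embedding_dim=16, hidden_dim=32,
    num_heads=4, num_layers=2, max_length=20,
)
x = torch.randn(3, 10, 6)
out = backbone(x)
print(out.shape)  # torch.Size([3, 10, 16])
```

Note that `embedding_dim` must be divisible by `num_heads` for PyTorch's multi-head attention, and that `max_length` bounds the longest window this sketch can process.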
- get_config()[source]#
Get the configuration dictionary for this backbone.
- Returns:
Configuration dictionary containing all parameters needed to recreate this backbone instance.
- Return type:
dict[str,Any]
- classmethod from_config(config)[source]#
Create a TransformerBackbone instance from a configuration dictionary.
- Parameters:
config (dict[str,Any]) – Configuration dictionary containing all parameters needed to create the backbone instance.
- Returns:
New TransformerBackbone instance created from the configuration.
- Return type:
TransformerBackbone
Modules