lisbet.datasets.iterable_style
Iterable-style datasets for social behavior classification and self-supervised tasks.
Classes
GroupConsistencyDataset | Iterable dataset for the Group Consistency self-supervised task.
SocialBehaviorDataset | Iterable dataset for social behavior classification.
TemporalOrderDataset | Iterable dataset for the temporal order prediction self-supervised task.
TemporalShiftDataset | Iterable dataset for the temporal shift prediction or regression task.
TemporalWarpDataset | Iterable dataset for the temporal warp prediction or regression task.
- class lisbet.datasets.iterable_style.SocialBehaviorDataset(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, annot_format='multiclass', base_seed=None)
Iterable dataset for social behavior classification.
Generates windows of pose data and corresponding labels for supervised classification of social behaviors. Each sample consists of a window of frames selected from a record, with the label extracted according to the specified annotation format. Supports binary, multiclass, and multilabel classification tasks.
- __init__(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, annot_format='multiclass', base_seed=None)
Initialize the SocialBehaviorDataset.
- Parameters:
records (list) – List of records containing the data.
window_size (int) – Size of the window in frames.
window_offset (int, optional) – Offset for the window in frames (default is 0).
fps_scaling (float, optional) – Scaling factor for the frames per second (default is 1.0).
transform (callable, optional) – A function/transform to apply to the data (default is None).
annot_format (str, optional) – Format of the labels. Valid options are ‘binary’, ‘multiclass’, or ‘multilabel’ for the respective classification tasks (default is ‘multiclass’).
base_seed (int, optional) – Base seed for random number generation (default is None, which generates a random seed).
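The difference between the three annot_format options can be sketched as follows. This is only an illustration, not the library's implementation: the helper name extract_label and the per-frame layout (one 0/1 flag per behavior) are assumptions made for the example.

```python
def extract_label(frame_annot, annot_format="multiclass"):
    """Turn per-behavior 0/1 flags into a label (hypothetical helper).

    frame_annot is assumed to be a list with one flag per behavior.
    """
    if annot_format == "binary":
        # Single yes/no target: is the behavior of interest present?
        return int(any(frame_annot))
    if annot_format == "multiclass":
        # Exactly one behavior is expected to be active: return its index.
        return frame_annot.index(max(frame_annot))
    if annot_format == "multilabel":
        # Several behaviors may be active at once: keep the whole vector.
        return list(frame_annot)
    raise ValueError(f"Unknown annot_format: {annot_format!r}")
```

The shape of the label is what changes between formats: a single flag, a class index, or a multi-hot vector.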
- class lisbet.datasets.iterable_style.GroupConsistencyDataset(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, base_seed=None)
Iterable dataset for the Group Consistency self-supervised task.
Generates windows for training models to determine whether a group of tracked individuals in a window of frames originates from the same recording (“consistent”) or is artificially constructed by combining individuals from different records (“inconsistent”).
Each sample consists of a window of frames, with 50% probability of being consistent and 50% probability of being inconsistent (via swapping individuals from another record).
Notes
The swap is performed by splitting the group of individuals at a random index and concatenating the individuals before the split from the original window with those after it from the swap window. This allows for arbitrary group sizes and compositions.
Padding may not be consistent for swapped windows, especially near the sequence boundaries. However, since the number of padded windows is small compared to the total, this edge case is not explicitly handled.
This dataset requires that each record contains at least two individuals.
The label is 0 for consistent (all individuals from the same record) and 1 for inconsistent (group contains individuals from different records).
- __init__(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, base_seed=None)
Initialize the GroupConsistencyDataset.
- Parameters:
records (list) – List of records containing the data.
window_size (int) – Size of the window in frames.
window_offset (int, optional) – Offset for the window in frames (default is 0).
fps_scaling (float, optional) – Scaling factor for the frames per second (default is 1.0).
transform (callable, optional) – A function/transform to apply to the data (default is None).
base_seed (int, optional) – Base seed for random number generation (default is None, which generates a random seed).
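The swap described in the notes for this class can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: make_group_sample is a hypothetical name, each window is represented as a list with one trajectory per individual, and both windows are assumed to contain the same number of individuals.

```python
import random


def make_group_sample(window, swap_window, rng):
    """Return (group, label): label 0 = consistent, 1 = inconsistent."""
    n_individuals = len(window)
    if n_individuals < 2:
        raise ValueError("At least two individuals are required.")
    if rng.random() < 0.5:
        # Consistent: every individual comes from the original window.
        return list(window), 0
    # Inconsistent: split at a random index in 1..n-1 so that both
    # windows contribute at least one individual.
    split = rng.randrange(1, n_individuals)
    return list(window[:split]) + list(swap_window[split:]), 1
```

Restricting the split index to 1..n-1 is a design choice of this sketch: it guarantees that an "inconsistent" sample really mixes two sources.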
- class lisbet.datasets.iterable_style.TemporalOrderDataset(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, method='strict', base_seed=None)
Iterable dataset for the temporal order prediction self-supervised task.
Generates samples for predicting whether a ‘post’ half-window follows a ‘pre’ half-window in the same recording (ordered) or not (unordered).
Each sample consists of a window created by concatenating the first half of one window (‘pre’) and the second half of another window (‘post’). Positive samples have the post window following the pre window in the same record; negative samples have the post window from a different record or from an earlier time in the same record, depending on the chosen ‘method’.
Notes
Padding may differ between pre and post windows, especially near sequence boundaries. This is not explicitly handled, as the number of such cases is small relative to the dataset size.
The last window_size frames of each record may produce overlapping pre and post windows in the positive (‘ordered’) case. These are included for simplicity and may help the model learn temporal relationships.
The concatenation of pre and post windows along the time dimension is used for simplicity and compatibility with multi-task training. This approach encourages the model to learn both positional and sequence embeddings jointly. In the future, embeddings could be computed separately and concatenated before the classifier.
In rare cases, pre and post windows may overlap perfectly and be labeled as both positive and negative, depending on random sampling. This ambiguity is tolerated for simplicity, as it is infrequent and unlikely to significantly impact model performance.
The ‘method’ parameter controls how negative samples are selected. With ‘simple’, post windows can come from any record or from earlier times in the same record. With ‘strict’, post windows are always from the same record but must precede the pre window in time.
- __init__(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, method='strict', base_seed=None)
Initialize the TemporalOrderDataset.
- Parameters:
records (list) – List of records containing the data.
window_size (int) – Size of the window in frames.
window_offset (int, optional) – Offset for the window in frames (default is 0).
fps_scaling (float, optional) – Scaling factor for the frames per second (default is 1.0).
transform (callable, optional) – A function/transform to apply to the data (default is None).
method (str, optional) – Selection method for negative class examples. Options are ‘simple’ (post window can be from any record or earlier in the same record) and ‘strict’ (post window is always from the same record but must precede the pre window). Default is ‘strict’.
base_seed (int, optional) – Base seed for random number generation (default is None, which generates a random seed).
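The construction of a ‘strict’ sample from a single record can be sketched as below. This is an illustrative sketch only: temporal_order_sample is a hypothetical name, the record is a plain list of frames, window positions are restricted so that no padding is needed, and the sampling ranges (which allow the pre and post windows to coincide) are assumptions based on the notes for this class.

```python
import random


def temporal_order_sample(frames, window_size, rng):
    """Return (sample, label): label 1 = ordered, 0 = unordered."""
    last_start = len(frames) - window_size
    if last_start < 0:
        raise ValueError("Record is shorter than window_size.")
    half = window_size // 2
    pre_start = rng.randint(0, last_start)
    label = rng.randint(0, 1)
    if label == 1:
        # Ordered: the post window starts at or after the pre window.
        post_start = rng.randint(pre_start, last_start)
    else:
        # Unordered ('strict'): same record, at or before the pre window.
        post_start = rng.randint(0, pre_start)
    pre = frames[pre_start:pre_start + window_size]
    post = frames[post_start:post_start + window_size]
    # First half of 'pre' followed by second half of 'post'.
    return pre[:half] + post[half:], label
```

When post_start equals pre_start, the same sample can receive either label, which is the ambiguity the notes describe as tolerated.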
- class lisbet.datasets.iterable_style.TemporalShiftDataset(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, max_shift=60, regression=False, base_seed=None)
Iterable dataset for the temporal shift prediction or regression task.
Generates samples in which the trajectory of the second individual in a group is shifted in time by a random delay within a specified interval (default: -60 to +60 frames).
For each sample, a window of frames is selected from a record. The first individual’s data is taken from this window, while the second individual’s data is taken from a window at the same location but shifted by a random delay (positive or negative) within the allowed range. The two individuals’ data are then concatenated along the ‘individuals’ dimension, forming a group window where one individual’s trajectory is temporally shifted relative to the other.
The task can be formulated either as binary classification, predicting whether the shift is positive (future) or negative (past), or as regression, estimating the normalized value of the temporal shift.
Notes
This dataset requires that each record contains at least two individuals.
The shift is always performed within the same record; the shifted window is sampled such that it stays within the valid frame range.
The split between individuals is randomized for each sample, so the shifted trajectory may correspond to any individual in the group except the first.
The label is either the normalized shift value in [0, 1], where 0 corresponds to -max_shift and 1 to +max_shift (regression), or 1 if the shift is positive (delta_delay > 0) and 0 otherwise (classification).
Padding may occur if the shifted window extends beyond the sequence boundaries, but this is handled by the window extraction logic and is rare for typical settings.
The window_offset parameter determines the temporal alignment of the window relative to the reference frame.
- __init__(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, max_shift=60, regression=False, base_seed=None)
Initialize the TemporalShiftDataset.
- Parameters:
records (list) – List of records containing the data.
window_size (int) – Size of the window in frames.
window_offset (int, optional) – Offset for the window in frames (default is 0).
fps_scaling (float, optional) – Scaling factor for the frames per second (default is 1.0).
transform (callable, optional) – A function/transform to apply to the data (default is None).
max_shift (int, optional) – Maximum time shift to apply, expressed in number of frames (default is 60).
regression (bool, optional) – Whether to perform regression (default is False, which performs binary classification).
base_seed (int, optional) – Base seed for random number generation (default is None, which generates a random seed).
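The two label conventions for the temporal shift task can be illustrated with a small helper. This is a sketch, not the library's code: shift_label is a hypothetical name, and the linear mapping from [-max_shift, +max_shift] to [0, 1] is an assumption consistent with the endpoints given in the notes.

```python
def shift_label(delta_delay, max_shift=60, regression=False):
    """Map a shift in frames to a regression or classification label."""
    if not -max_shift <= delta_delay <= max_shift:
        raise ValueError("delta_delay must lie in [-max_shift, max_shift].")
    if regression:
        # Assumed linear rescaling: -max_shift -> 0.0, +max_shift -> 1.0.
        return (delta_delay + max_shift) / (2 * max_shift)
    # Classification: 1 for a positive (future) shift, 0 otherwise.
    return int(delta_delay > 0)
```

Under this mapping a zero shift gives 0.5 in regression mode and class 0 in classification mode.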
- class lisbet.datasets.iterable_style.TemporalWarpDataset(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, max_warp=50.0, regression=False, base_seed=None)
Iterable dataset for the temporal warp prediction or regression task.
Generates windows in which the temporal pace (speed) of the window is artificially warped by resampling the frames at a random speed factor within a specified range (default: 50% to 150%).
For each sample, a window is extracted from a random location in a record, and the time axis is rescaled by a randomly chosen speed factor. The resulting window is then interpolated back to the original window size, so the model always receives a fixed number of frames, but the underlying motion is either sped up or slowed down.
The task can be formulated either as binary classification, predicting whether the window was sped up (speed > 100%) or slowed down (speed < 100%), or as regression, estimating the normalized speed factor used to warp the window.
Notes
The speed factor is sampled uniformly at random from [max_warp, 100 + max_warp] for each sample.
The actual window is extracted by resampling the original frames at the chosen speed, then interpolated to the fixed window size.
For regression, the label is the normalized speed factor in [0, 1], where 0 corresponds to max_warp and 1 to 100 + max_warp.
For classification, the label is 1 if the speed is above 100, and 0 otherwise.
Padding may occur if the resampled window extends beyond the sequence boundaries, but this is handled by the window extraction logic and is rare for typical settings.
The window_offset parameter determines the temporal alignment of the window relative to the reference frame.
- __init__(records, window_size, window_offset=0, fps_scaling=1.0, transform=None, max_warp=50.0, regression=False, base_seed=None)
Initialize the TemporalWarpDataset.
- Parameters:
records (list) – List of records containing the data.
window_size (int) – Size of the window in frames.
window_offset (int, optional) – Offset for the window in frames (default is 0).
fps_scaling (float, optional) – Scaling factor for the frames per second (default is 1.0).
transform (callable, optional) – A function/transform to apply to the data (default is None).
max_warp (float, optional) – Maximum time warp to apply, expressed as a percentage (default is 50).
regression (bool, optional) – Whether to perform regression (default is False, which performs binary classification).
base_seed (int, optional) – Base seed for random number generation (default is None, which generates a random seed).
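The warping and labeling described for this class can be sketched on a one-dimensional signal. This is a simplified illustration rather than the library's implementation: warp_window and warp_label are hypothetical names, resampling uses plain linear interpolation, and positions past the end of the signal are clamped to the last frame (an assumed stand-in for padding).

```python
def warp_window(signal, start, window_size, speed):
    """Sample window_size values from signal, advancing speed/100 frames per step."""
    step = speed / 100.0
    last = len(signal) - 1
    out = []
    for k in range(window_size):
        # Fractional source position, clamped to the valid range.
        t = min(max(start + k * step, 0.0), float(last))
        lo = int(t)
        hi = min(lo + 1, last)
        frac = t - lo
        # Linear interpolation between the two neighboring frames.
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


def warp_label(speed, max_warp=50.0, regression=False):
    """Label for a speed in [max_warp, 100 + max_warp] (percent)."""
    if regression:
        # max_warp -> 0.0 and 100 + max_warp -> 1.0, as stated in the notes.
        return (speed - max_warp) / 100.0
    return int(speed > 100)
```

With a speed of 150 the window covers 1.5 source frames per output frame, so the motion appears faster; with a speed of 50 it covers half a frame per step and appears slower. The output length is window_size in both cases.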