lisbet.unsupervised
LISBET module for sequence segmentation and dimensionality reduction.
Functions
- Dimensionality reduction using UMAP.
- Segment time series data using Hidden Markov Models.
- lisbet.unsupervised.segment_hmm(data_path, min_n_components=2, max_n_components=32, num_iter=10, data_filter=None, fit_frac=None, hmm_seed=None, n_jobs=-1, pretrained_path=None, output_path=None)
Segment time series data using Hidden Markov Models.
This function fits one or more HMMs to the embeddings and uses them to segment the data into discrete states.
- Parameters:
data_path (str or Path) – Path to the directory containing LISBET embeddings.
min_n_components (int, default=2) – Minimum number of states to use in the HMM.
max_n_components (int, default=32) – Maximum number of states to use in the HMM.
num_iter (int, default=10) – Maximum number of iterations for the Baum-Welch algorithm.
data_filter (callable, optional) – Function to filter the data before fitting.
fit_frac (float, optional) – Fraction of data to use for fitting. If None, use all data.
hmm_seed (int, optional) – Random seed for reproducibility.
n_jobs (int, default=-1) – Number of parallel jobs to run. A value of -1 uses all processors.
pretrained_path (str or Path, optional) – Path to the directory containing pretrained HMM models. If None, models are trained from scratch.
output_path (str or Path, optional) – Path to save the results. If None, results are not saved to disk.
- Returns:
predictions – Dictionary mapping the number of states to the predicted segments for each sequence.
- Return type:
dict
- Raises:
ValueError – If min_n_components or max_n_components is smaller than 2, or if max_n_components is smaller than min_n_components.
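The conditions under Raises can be illustrated with a minimal sketch. The helper below is hypothetical and is not part of LISBET. It only mirrors the documented checks on min_n_components and max_n_components. It also assumes that one model is fitted per state count in the inclusive range between the two bounds, which the description of the returned dictionary suggests but does not state.

```python
def check_n_components(min_n_components=2, max_n_components=32):
    """Hypothetical helper mirroring the documented ValueError conditions.

    Not part of LISBET; defaults are taken from the segment_hmm signature.
    """
    # Both bounds must be at least 2, as stated under Raises.
    if min_n_components < 2 or max_n_components < 2:
        raise ValueError(
            "min_n_components and max_n_components must be at least 2"
        )
    # The upper bound must not fall below the lower bound.
    if max_n_components < min_n_components:
        raise ValueError(
            "max_n_components must not be smaller than min_n_components"
        )
    # Assumption: one candidate per state count, both bounds included.
    return list(range(min_n_components, max_n_components + 1))


print(check_n_components(2, 5))  # [2, 3, 4, 5]
```

Under that assumption, the keys of the returned predictions dictionary would correspond to the values in this list.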