seiz_eeg.utils
Utility functions for EEG Datasets
- seiz_eeg.utils.cut_long_sessions(segments_df: DataFrame[ClipsDF], max_time: float) -> DataFrame[ClipsDF]
Cut EEG sessions longer than max_time. If max_time is smaller than zero, then segments_df is returned unchanged.
- Parameters:
segments_df (DataFrame[ClipsDF]) – Dataframe of EEG segments
max_time (float) – Cutoff time
- Returns:
Dataset of clipped sessions.
- Return type:
DataFrame[ClipsDF]
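The cutoff behaviour described above can be illustrated with a small dependency-free sketch. It is not the library's implementation: segments are modelled as plain dicts, the keys start_time and end_time (in seconds from the start of the session) are assumptions, and so is the choice to drop segments that start after the cutoff and to truncate the one that crosses it.

```python
from typing import Dict, List


def cut_long_sessions_sketch(
    segments: List[Dict[str, float]], max_time: float
) -> List[Dict[str, float]]:
    """Clip segments at max_time; a negative max_time disables the cut."""
    if max_time < 0:
        return segments  # returned unchanged, as documented

    clipped = []
    for seg in segments:
        if seg["start_time"] >= max_time:
            continue  # segment lies entirely after the cutoff (assumed behaviour)
        # Truncate the segment that crosses the cutoff (assumed behaviour).
        clipped.append({**seg, "end_time": min(seg["end_time"], max_time)})
    return clipped


session = [
    {"start_time": 0.0, "end_time": 10.0},
    {"start_time": 10.0, "end_time": 30.0},
    {"start_time": 40.0, "end_time": 50.0},
]
short = cut_long_sessions_sketch(session, max_time=20.0)
```

Here the third segment is dropped and the second one ends at 20 seconds.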
- seiz_eeg.utils.extract_target_labels(df: DataFrame[ClipsDF], target_labels: List[int], relabel: bool = False) -> DataFrame[ClipsDF]
Old name of segments_by_labels. DEPRECATED.
- seiz_eeg.utils.patient_split(df: DataFrame[ClipsDF], ratio_min: float, ratio_max: float, seed: int | None = None) -> List[str]
Compute a set of patients from the indices of df such that they cover between ratio_min and ratio_max of the appearances of each label.
Patients are randomly sampled to satisfy the constraint on each label iteratively. In some cases, previous selections make it impossible to satisfy the constraint by adding any patient. This function tries ten different random subsets before failing. In that case, with high probability no split exists.
- Parameters:
df (DataFrame[ClipsDF]) – Dataframe of EEG segments
ratio_min (float) – Minimum fraction of labels to cover
ratio_max (float) – Maximum fraction of labels to cover
seed (Optional[int], optional) – Random seed. Defaults to None.
- Raises:
ValueError – If ratio_min and ratio_max do not satisfy 0 < ratio_min <= ratio_max < 1
ValueError – If the sampling algorithm fails 10 times, which probably means that ratio_[min|max] are too restrictive
- Returns:
List of selected patients
- Return type:
List[str]
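The sampling procedure described for patient_split can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: the input is a hypothetical mapping from patient id to per-label counts instead of a dataframe, and the greedy rule for accepting a patient (add it only if no label exceeds ratio_max) is a guess at how the constraint is enforced.

```python
import random
from collections import Counter
from typing import Dict, List, Mapping, Optional


def patient_split_sketch(
    label_counts: Mapping[str, Mapping[int, int]],
    ratio_min: float,
    ratio_max: float,
    seed: Optional[int] = None,
    max_tries: int = 10,
) -> List[str]:
    """Pick patients covering between ratio_min and ratio_max of each label."""
    if not 0 < ratio_min <= ratio_max < 1:
        raise ValueError("Ratios must satisfy 0 < ratio_min <= ratio_max < 1")

    counts: Dict[str, Counter] = {p: Counter(c) for p, c in label_counts.items()}
    totals: Counter = Counter()
    for patient_counts in counts.values():
        totals.update(patient_counts)

    rng = random.Random(seed)
    for _ in range(max_tries):
        candidates = sorted(counts)
        rng.shuffle(candidates)  # a new random order on every attempt
        selected: List[str] = []
        covered: Counter = Counter()

        for label in totals:  # satisfy the constraint label by label
            for patient in candidates:
                if covered[label] >= ratio_min * totals[label]:
                    break  # this label is covered enough
                if patient in selected or counts[patient][label] == 0:
                    continue
                new_cover = covered + counts[patient]
                # Accept the patient only if no label overshoots ratio_max.
                if all(new_cover[l] <= ratio_max * totals[l] for l in totals):
                    selected.append(patient)
                    covered = new_cover

        if all(
            ratio_min * totals[l] <= covered[l] <= ratio_max * totals[l]
            for l in totals
        ):
            return sorted(selected)

    raise ValueError(
        f"No valid split found in {max_tries} tries; the ratios may be too restrictive"
    )
```

With ten identical patients and ratios 0.25 and 0.5, for instance, any three patients form a valid split, while a dataset with a single patient can never satisfy ratio_max < 1 and raises ValueError.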
- seiz_eeg.utils.patients_by_seizures(segments_df: DataFrame[ClipsDF], low: int = 0, high: int = inf, total=False) -> DataFrame[ClipsDF]
Filter patients to have a number of seizures between low and high.
- Parameters:
segments_df (DataFrame[ClipsDF]) – Segments annotation dataframe
low (int, optional) – Minimum number of seizures per patient (inclusive). Defaults to 0.
high (int, optional) – Maximum number of seizures per patient (inclusive). Defaults to infinity.
total (bool, optional) – Whether to aggregate all kinds of seizures in the count. Defaults to False.
- Returns:
Filtered dataframe
- Return type:
DataFrame[ClipsDF]
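As a rough illustration of the inclusive low/high filter, the sketch below counts seizures per patient and keeps only the rows of patients inside the range. Several things here are assumptions rather than documented behaviour: rows are plain dicts with hypothetical keys patient and label, label 0 stands for background, each row with a non-zero label counts as one seizure, and only the aggregated count (the total=True case) is shown.

```python
from collections import Counter
from math import inf
from typing import Dict, List


def patients_by_seizures_sketch(
    segments: List[Dict[str, object]], low: float = 0, high: float = inf
) -> List[Dict[str, object]]:
    """Keep rows of patients whose seizure count lies in [low, high]."""
    # Assumption: label 0 is background, any other label is a seizure.
    seizures = Counter(s["patient"] for s in segments if s["label"] != 0)
    # Counter returns 0 for patients without seizures, so low=0 keeps them.
    return [s for s in segments if low <= seizures[s["patient"]] <= high]


rows = (
    [{"patient": "A", "label": 1}] * 2
    + [{"patient": "A", "label": 0}]
    + [{"patient": "B", "label": 0}]
    + [{"patient": "C", "label": 2}] * 5
)
kept = patients_by_seizures_sketch(rows, low=1, high=3)
```

Only patient A, with two seizures, falls inside the range; B has none and C has five.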
- seiz_eeg.utils.resample_label(df: DataFrame[ClipsDF], label: int, ratio: float = 1, seed: int | None = None) -> DataFrame[ClipsDF]
Resample the rows of the given label to a desired ratio of the total count of the other labels.
- Parameters:
df (DataFrame[ClipsDF]) – Dataframe of EEG clips or segments
label (int) – Label to resample.
ratio (float, optional) – Ratio of desired samples w.r.t. the total count of other labels. If the desired ratio requires more samples than the label has, then the label is bootstrapped (sampled with replacement), otherwise it is downsampled (no replacement). Defaults to 1.
seed (Optional[int], optional) – Random seed. Defaults to None.
- Returns:
Dataframe with target class resampled.
- Return type:
DataFrame[ClipsDF]
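The choice between bootstrapping and downsampling can be sketched with the standard library alone. The function below works on a plain list of labels and returns row indices, which is a simplification of the dataframe interface; rounding the desired count with round() is an assumption, as is the function name.

```python
import random
from typing import List, Optional


def resample_label_sketch(
    labels: List[int], label: int, ratio: float = 1.0, seed: Optional[int] = None
) -> List[int]:
    """Return row indices after resampling the rows carrying `label`."""
    rng = random.Random(seed)
    target_idx = [i for i, l in enumerate(labels) if l == label]
    other_idx = [i for i, l in enumerate(labels) if l != label]

    # Desired number of target rows, relative to all other labels together.
    n_desired = round(ratio * len(other_idx))

    if n_desired > len(target_idx):
        # Not enough rows: bootstrap, i.e. sample with replacement.
        sampled = rng.choices(target_idx, k=n_desired)
    else:
        # Enough rows: downsample without replacement.
        sampled = rng.sample(target_idx, k=n_desired)

    return sorted(other_idx + sampled)
```

With ten rows of label 0 and two rows of label 1, resampling label 1 at ratio 1 returns twenty indices, ten of which are repeated draws of the two original rows.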
- seiz_eeg.utils.segments_by_labels(df: DataFrame[ClipsDF], target_labels: List[int], relabel: bool = False) -> DataFrame[ClipsDF]
Extract rows of df whose labels are in target_labels.
- Parameters:
df (DataFrame[ClipsDF]) – Dataframe of EEG clips
target_labels (List[int]) – List of integer labels to extract
relabel (bool) – Whether to relabel target_labels progressively from 0
- Returns:
Subset of df with desired labels.
- Return type:
DataFrame[ClipsDF]
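A minimal sketch of the filtering and relabelling step, using plain dicts in place of dataframe rows. The key label is hypothetical, and mapping each target label to its position in target_labels is only one reading of "progressively from 0"; the library may order the labels differently.

```python
from typing import Dict, List


def segments_by_labels_sketch(
    rows: List[Dict[str, int]], target_labels: List[int], relabel: bool = False
) -> List[Dict[str, int]]:
    """Keep rows whose label is in target_labels, optionally relabelled from 0."""
    # Assumption: new labels follow the order of target_labels.
    mapping = {old: new for new, old in enumerate(target_labels)}
    kept = [dict(r) for r in rows if r["label"] in mapping]  # copy, do not mutate input
    if relabel:
        for r in kept:
            r["label"] = mapping[r["label"]]
    return kept


rows = [{"label": l} for l in (0, 3, 5, 3, 1)]
subset = segments_by_labels_sketch(rows, target_labels=[3, 5], relabel=True)
```

In this example the rows labelled 3 and 5 are kept and become 0 and 1 respectively.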
- seiz_eeg.utils.sessions_by_labels(df: DataFrame[ClipsDF], target_labels: List[int], relabel: bool = False) -> DataFrame[ClipsDF]
Extract sessions of df whose labels are in target_labels.
- Parameters:
df (DataFrame[ClipsDF]) – Dataframe of EEG clips
target_labels (List[int]) – List of integer labels to extract
relabel (bool) – Whether to relabel target_labels progressively from 0
- Returns:
Subset of df with desired labels.
- Return type:
DataFrame[ClipsDF]
- seiz_eeg.utils.sessions_by_seizures(segments_df: DataFrame[ClipsDF], low: int = 0, high: int = inf) -> DataFrame[ClipsDF]
Extract only sessions whose number of seizures is between low and high.
- Parameters:
segments_df (DataFrame[ClipsDF]) – Segments annotation dataframe
low (int) – Minimum number of seizures per session (inclusive). Defaults to 0.
high (int) – Maximum number of seizures per session. Defaults to infinity.
- Returns:
Annotation dataframe with only requested sessions
- Return type:
DataFrame[ClipsDF]