seiz_eeg.utils

Utility functions for EEG Datasets

seiz_eeg.utils.cut_long_sessions(segments_df: DataFrame[ClipsDF], max_time: float) DataFrame[ClipsDF]

Cut EEG sessions longer than max_time.

If max_time is smaller than zero then segments_df is returned unchanged.

Parameters:
  • segments_df (DataFrame[ClipsDF]) – Dataframe of EEG segments

  • max_time (float) – Cutoff time

Returns:

Dataset of clipped sessions.

Return type:

DataFrame[ClipsDF]

seiz_eeg.utils.extract_target_labels(df: DataFrame[ClipsDF], target_labels: List[int], relabel: bool = False) DataFrame[ClipsDF]

Deprecated. Old name of segments_by_labels.

seiz_eeg.utils.patient_split(df: DataFrame[ClipsDF], ratio_min: float, ratio_max: float, seed: int | None = None) List[str]

Compute a set of patients from the indices of df such that they cover between ratio_min and ratio_max of the appearances of each label.

Patients are randomly sampled to satisfy the constraint on each label iteratively. In some cases, previous selections make it impossible to satisfy the constraint by adding any patient. This function tries ten different random subsets before failing. If it does fail, then with high probability no valid split exists.

Parameters:
  • df (DataFrame[ClipsDF]) – Dataframe of EEG segments

  • ratio_min (float) – Minimum fraction of labels to cover

  • ratio_max (float) – Maximum fraction of labels to cover

  • seed (int | None, optional) – Random seed. Defaults to None.

Raises:
  • ValueError – If ratio_min and ratio_max do not satisfy 0 < ratio_min <= ratio_max < 1

  • ValueError – If the sampling algorithm fails 10 times, which probably means that ratio_min and ratio_max are too restrictive

Returns:

List of selected patients

Return type:

List[str]
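
The sampling procedure described above can be illustrated with a minimal sketch in plain Python. It is not the library's implementation: the function name sketch_patient_split, the label_counts input (a mapping from patient to per-label counts, standing in for the dataframe) and the attempts parameter are all assumptions made for illustration, and the real function may order or reject candidates differently.

```python
import random
from collections import Counter
from typing import Dict, List, Optional


def sketch_patient_split(
    label_counts: Dict[str, Dict[int, int]],
    ratio_min: float,
    ratio_max: float,
    seed: Optional[int] = None,
    attempts: int = 10,
) -> List[str]:
    """Hypothetical illustration of the sampling idea, not the library code."""
    if not 0 < ratio_min <= ratio_max < 1:
        raise ValueError("Expected 0 < ratio_min <= ratio_max < 1")

    # Total number of appearances of each label over all patients.
    totals: Counter = Counter()
    for per_label in label_counts.values():
        totals.update(per_label)

    rng = random.Random(seed)
    for _ in range(attempts):
        # Each attempt visits the patients in a fresh random order.
        order = sorted(label_counts)
        rng.shuffle(order)
        selected: List[str] = []
        covered: Counter = Counter()
        # Satisfy the lower bound one label at a time.
        for label in sorted(totals):
            for patient in order:
                if covered[label] >= ratio_min * totals[label]:
                    break
                if patient in selected or label_counts[patient].get(label, 0) == 0:
                    continue
                # Skip patients that would push any label above its upper bound.
                fits = all(
                    covered[lab] + n <= ratio_max * totals[lab]
                    for lab, n in label_counts[patient].items()
                )
                if fits:
                    selected.append(patient)
                    covered.update(label_counts[patient])
        # The upper bound holds by construction, so only the lower bound is checked.
        if all(covered[lab] >= ratio_min * totals[lab] for lab in totals):
            return sorted(selected)
    raise ValueError(f"No split found after {attempts} attempts")
```

In this sketch an attempt fails when earlier selections leave no patient that can be added without exceeding ratio_max for some label, which mirrors the failure mode mentioned in the description.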

seiz_eeg.utils.patients_by_seizures(segments_df: DataFrame[ClipsDF], low: int = 0, high: int = inf, total=False) DataFrame[ClipsDF]

Filter patients to have a number of seizures between low and high

Parameters:
  • segments_df (DataFrame[ClipsDF]) – Segments annotation dataframe

  • low (int, optional) – Minimum number of seizures per patient (inclusive). Defaults to 0.

  • high (int, optional) – Maximum number of seizures per patient (inclusive). Defaults to infinity.

  • total (bool, optional) – Whether to aggregate all kinds of seizures in the count. Defaults to False.

Returns:

Filtered dataframe

Return type:

DataFrame[ClipsDF]
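
As a rough illustration of the inclusive bounds, the sketch below filters patients by an already aggregated seizure count, which corresponds to the total=True case only. The function name and the seizures_per_patient mapping are hypothetical stand-ins for the dataframe, and the sketch makes no claim about how per-type counts are handled when total is False.

```python
import math
from typing import Dict, List


def sketch_patients_by_total_seizures(
    seizures_per_patient: Dict[str, int],
    low: int = 0,
    high: float = math.inf,
) -> List[str]:
    """Keep patients whose total seizure count lies in [low, high] (both inclusive)."""
    return sorted(
        patient
        for patient, count in seizures_per_patient.items()
        if low <= count <= high
    )


# Example with made-up counts: only "p2" has between 1 and 3 seizures.
example = sketch_patients_by_total_seizures({"p1": 0, "p2": 3, "p3": 10}, low=1, high=3)
```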

seiz_eeg.utils.resample_label(df: DataFrame[ClipsDF], label: int, ratio: float = 1, seed: int | None = None) DataFrame[ClipsDF]

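Resample the rows of df carrying the given label so that their count matches a desired ratio of the other labels.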

Parameters:
  • df (DataFrame[ClipsDF]) – Dataframe of EEG clips or segments

  • label (int) – Label to resample.

  • ratio (float, optional) – Ratio of desired samples w.r.t. the total count of other labels. If the desired number of samples exceeds the count of the label, then the label is bootstrapped (sampled with replacement), otherwise it is downsampled (no replacement). Defaults to 1.

  • seed (Optional[int], optional) – Random seed. Defaults to None.

Returns:

Dataframe with target class resampled.

Return type:

DataFrame[ClipsDF]
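
The choice between bootstrapping and downsampling described for ratio can be sketched in plain Python as follows. This is an assumption-laden illustration rather than the library's code: sketch_resample_label is a hypothetical name, a list of labels stands in for the dataframe, the function returns row positions instead of a dataframe, and truncating the desired count with int() is a guess about rounding.

```python
import random
from typing import List, Optional


def sketch_resample_label(
    labels: List[int],
    label: int,
    ratio: float = 1,
    seed: Optional[int] = None,
) -> List[int]:
    """Return the row positions kept after resampling `label` (illustration only)."""
    rng = random.Random(seed)
    target = [i for i, lab in enumerate(labels) if lab == label]
    others = [i for i, lab in enumerate(labels) if lab != label]
    if not target:
        return others  # nothing to resample

    # Desired number of rows for `label`, relative to all other labels (assumed rounding).
    wanted = int(ratio * len(others))
    if wanted > len(target):
        # Not enough rows: bootstrap, i.e. sample with replacement.
        sampled = rng.choices(target, k=wanted)
    else:
        # Enough rows: downsample without replacement.
        sampled = rng.sample(target, k=wanted)
    return sorted(others + sampled)
```

For instance, with six rows of label 0 and two rows of label 1, resampling label 1 with ratio=1 repeats its two rows until there are six of them, while resampling label 0 with ratio=1 keeps only two of its six rows.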

seiz_eeg.utils.segments_by_labels(df: DataFrame[ClipsDF], target_labels: List[int], relabel: bool = False) DataFrame[ClipsDF]

Extract rows of df whose labels are in target_labels

Parameters:
  • df (DataFrame[ClipsDF]) – Dataframe of EEG clips

  • target_labels (List[int]) – List of integer labels to extract

  • relabel (bool) – Whether to relabel target_labels progressively from 0

Returns:

Subset of df with desired labels.

Return type:

DataFrame[ClipsDF]
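
To make the effect of relabel concrete, here is a small plain-Python sketch. The name sketch_segments_by_labels and the (name, label) tuples standing in for dataframe rows are hypothetical, and the sketch assumes that new labels follow the order of target_labels, which the description does not state explicitly.

```python
from typing import List, Tuple


def sketch_segments_by_labels(
    rows: List[Tuple[str, int]],
    target_labels: List[int],
    relabel: bool = False,
) -> List[Tuple[str, int]]:
    """Keep rows whose label is in target_labels, optionally renumbering from 0."""
    # Assumed convention: the i-th entry of target_labels becomes label i.
    mapping = {old: new for new, old in enumerate(target_labels)}
    kept = [(name, label) for name, label in rows if label in mapping]
    if relabel:
        kept = [(name, mapping[label]) for name, label in kept]
    return kept


rows = [("a", 0), ("b", 2), ("c", 3), ("d", 2)]
subset = sketch_segments_by_labels(rows, [2, 3])
relabeled = sketch_segments_by_labels(rows, [2, 3], relabel=True)
```

Under these assumptions, subset keeps the rows "b", "c" and "d" with their original labels 2 and 3, while relabeled maps those labels to 0 and 1.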

seiz_eeg.utils.sessions_by_labels(df: DataFrame[ClipsDF], target_labels: List[int], relabel: bool = False) DataFrame[ClipsDF]

Extract sessions of df whose labels are in target_labels

Parameters:
  • df (DataFrame[ClipsDF]) – Dataframe of EEG clips

  • target_labels (List[int]) – List of integer labels to extract

  • relabel (bool) – Whether to relabel target_labels progressively from 0

Returns:

Subset of df with desired labels.

Return type:

DataFrame[ClipsDF]

seiz_eeg.utils.sessions_by_seizures(segments_df: DataFrame[ClipsDF], low: int = 0, high: int = inf) DataFrame[ClipsDF]

Extract only sessions whose number of seizures is between low and high.

Parameters:
  • segments_df (DataFrame[ClipsDF]) – Segments annotation dataframe

  • low (int) – Minimum number of seizures per session (inclusive). Defaults to 0.

Returns:

Annotation dataframe with only requested sessions

Return type:

DataFrame[ClipsDF]