Handling events and EEG dataframes¶

This tutorial shows what you will find in the segments dataframe, how to split it into clips and how to load corresponding parts of EEG recordings with the seiz_eeg package.

We suppose that you have an annotation dataframe, here called segments.parquet, which contains annotation for EEG events with the following format

Multiindex			Columns
`patient`	`session`	`segment`	`label`	`start_time`	`end_time`	`date`	`sampling_rate`	`signals_path`
str	str	int	int	float	float	datetime	int	str

The annoations are represented as pandas.DataFrame with hierarchical indexing. For more information on advaced pandas indexing, we refer you to pandas documentation.

Reading annotations¶

We can read an annotation file stored at path/to/segments.parquet directly as a dataframe with the following code:

import pandas as pd

df = pd.read_parquet("path/to/segments.parquet")

We could then ispect the content of the first 10 lines, and sort them by index:

print(df.head(10).sort_index())

Which for the annotations of the train set of Temple University Seizure Corpus would give the following:

As you can see, the signals_path points to the same file for all segments belonging to the same session (recording). These files contain contiguous recordings which can span multiple events. One can manually load them, or use the provided seiz_eeg.dataset.EEGDataset class, which allows to easily fetch segment signals. We give an example of its usage in the Loading signals with EEGDataset section.

Filtering events¶

The seiz_eeg.utils module provide multiple functions that can be helpful to filter the data in the annotation dataframe.

For instance, we might only be interested in sessions that contain either non-ictal activity (label 0), or generalized seizures (label 3). We can filter them with the seiz_eeg.utils.sessions_by_labels function, and relabel them to 0 and 1 respectively, by running:

df = sessions_by_labels(df, target_labels=[0,3], relabel=True)

Otherwise, we might be interested in working only with patients that have at least one seizure, but no more than 30 of them. For this purpose, we can use the seiz_eeg.utils.patients_by_seizures function:

df = patients_by_seizures(df, low=1, high=30)

Note that most of the functions in seiz_eeg.utils take as input an annotation dataframe, and return an object following the same schema. This allows to concatenate multiple preprocessing steps in a streamline pipeline. For readibility, we suggest to use the pandas.DataFrame.pipe method, which allows us to write:

df = (
    pd.read_parquet("path/to/segments.parquet")
    .pipe(sessions_by_labels, target_labels=[0,3], relabel=True)
    .pipe(patients_by_seizures, low=1, high=30)
)

Getting clips of the same length¶

In case you need to work

Loading signals with `EEGDataset`¶

seiz_eeg.dataset.EEGDataset.

Handling events and EEG dataframes¶

Reading annotations¶

Filtering events¶

Getting clips of the same length¶

Loading signals with `EEGDataset`¶

EEG pipeline for reproducible ML

Navigation

Related Topics

Handling events and EEG dataframes¶

Reading annotations¶

Filtering events¶

Getting clips of the same length¶

Loading signals with EEGDataset¶

Loading signals with `EEGDataset`¶