pulse2percept.datasets
Utilities to download and import datasets.
- Dataset loaders can be used to load small datasets that come pre-packaged with the pulse2percept software.
- Dataset fetchers can be used to download larger datasets from a given URL and directly import them into pulse2percept.
-
pulse2percept.datasets.clear_data_dir(data_dir=None)
Delete all content in the data directory
By default, this is set to a directory called ‘pulse2percept_data’ in the user home directory. Alternatively, it can be set by a PULSE2PERCEPT_DATA environment variable or set programmatically by specifying a path.
New in version 0.6.
Parameters: data_dir (str or None) – The path to the pulse2percept data directory.
-
pulse2percept.datasets.fetch_url(url, file_path, progress_bar=<function _report_hook>, remote_checksum=None)
Download a remote file
Fetch a dataset pointed to by url, check its SHA-256 checksum for integrity, and save it to file_path.
New in version 0.6.
Parameters: - url (string) – URL of file to download
- file_path (string) – Path to the local file that will be created
- progress_bar (func callback, optional) – A callback to a function func(count, block_size, total_size) that will display a progress bar.
- remote_checksum (str, optional) – The expected SHA-256 checksum of the file.
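The checksum step and the callback signature described above can be sketched with the standard library alone. This is a minimal illustration, not the pulse2percept implementation: the helper names sha256_of_file and report_progress are hypothetical, and the demo hashes a small temporary file rather than downloading anything.

```python
import hashlib
import os
import tempfile


def sha256_of_file(file_path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def report_progress(count, block_size, total_size):
    """Hypothetical callback with the documented (count, block_size, total_size) signature.

    Prints the percentage downloaded and also returns it so it can be inspected.
    """
    if total_size <= 0:
        return None  # total size unknown, nothing sensible to report
    percent = min(100.0, 100.0 * count * block_size / total_size)
    print(f"\rDownloaded {percent:5.1f}%", end="")
    return percent


# Demonstrate the integrity check on a small local file (no network access).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pulse2percept")
    demo_path = tmp.name

expected_checksum = hashlib.sha256(b"pulse2percept").hexdigest()
checksum_ok = sha256_of_file(demo_path) == expected_checksum
os.remove(demo_path)
```

A callback of this shape is also what urllib.request.urlretrieve accepts as its reporthook argument; whether fetch_url uses that function internally is not stated here.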
-
pulse2percept.datasets.fetch_beyeler2019(subjects=None, electrodes=None, data_path=None, shuffle=False, random_state=0, download_if_missing=True)
Load the phosphene drawing dataset from [Beyeler2019]
Download the phosphene drawing dataset described in [Beyeler2019] from https://osf.io/28uqg (66MB) to data_path. By default, all datasets are stored in ‘~/pulse2percept_data/’, but a different path can be specified.
Retinal implants: Argus I, Argus II
Subjects: 4
Number of samples: 400
Number of features: 16
The dataset includes the following features:
- subject – Subject ID, S1-S4
- electrode – Electrode ID, A1-F10
- image – Phosphene drawing
- img_shape – x,y shape of the phosphene drawing
- date – Experiment date (YYYY/mm/dd)
- stim_class – Stimulus type used to stimulate the array
- amp – Pulse amplitude used (x Threshold)
- freq – Pulse frequency used (Hz)
- pdur – Pulse duration used (ms)
- area – Phosphene area (see [Beyeler2019] for details)
- orientation – Phosphene orientation (see [Beyeler2019])
- eccentricity – Phosphene elongation (see [Beyeler2019])
- compactness – Phosphene compactness (see [Beyeler2019])
- x_center, y_center – Phosphene center of mass (see [Beyeler2019])
- xrange, yrange – Screen size in deg (see [Beyeler2019])
New in version 0.6.
Changed in version 0.7: Redirected download to 66MB version of the dataset that includes the fields x_center and y_center.
Parameters: - subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
- electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- data_path (string, optional) – Specify another download and cache folder for the dataset. By default all pulse2percept data is stored in ‘~/pulse2percept_data’ subfolders.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional, default: 0) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- download_if_missing (boolean, optional) – If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 400x16 Pandas DataFrame
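The selection and shuffling parameters shared by the loaders on this page (a single string, a list of strings, or None, plus a seeded shuffle) can be illustrated on plain Python rows. This is a sketch under stated assumptions, not the library's code: select_rows and the sample rows are hypothetical, and random.Random stands in for the numpy.random.RandomState that the real functions accept.

```python
import random


def select_rows(rows, subjects=None, electrodes=None, shuffle=False, random_state=0):
    """Filter rows by subject and electrode, then optionally shuffle them."""

    def as_set(value):
        # None means "select everything"; a single string is treated as a one-item list.
        if value is None:
            return None
        if isinstance(value, str):
            return {value}
        return set(value)

    wanted_subjects = as_set(subjects)
    wanted_electrodes = as_set(electrodes)
    selected = [
        row for row in rows
        if (wanted_subjects is None or row["subject"] in wanted_subjects)
        and (wanted_electrodes is None or row["electrode"] in wanted_electrodes)
    ]
    if shuffle:
        # A fixed integer seed gives the same order on every call.
        random.Random(random_state).shuffle(selected)
    return selected


# Made-up rows for illustration; not taken from the dataset.
rows = [
    {"subject": "S1", "electrode": "A1"},
    {"subject": "S1", "electrode": "B2"},
    {"subject": "S2", "electrode": "A1"},
    {"subject": "S3", "electrode": "C3"},
]

only_s1 = select_rows(rows, subjects="S1")
a1_of_s1_s2 = select_rows(rows, subjects=["S1", "S2"], electrodes="A1")
shuffled_a = select_rows(rows, shuffle=True, random_state=42)
shuffled_b = select_rows(rows, shuffle=True, random_state=42)
```

Passing the same integer seed twice yields the same row order, which is the reproducibility that the random_state description refers to.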
-
pulse2percept.datasets.get_data_dir(data_dir=None)
Return the path of the pulse2percept data directory
This directory is used to store the datasets retrieved by the data fetch utility functions to avoid downloading the data several times.
By default, this is set to a directory called ‘pulse2percept_data’ in the user home directory. Alternatively, it can be set by a PULSE2PERCEPT_DATA environment variable or set programmatically by specifying a path.
If the directory does not already exist, it is automatically created.
New in version 0.6.
Parameters: data_dir (str or None) – The path to the pulse2percept data directory.
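The lookup described above can be sketched as follows. The precedence used here (explicit argument first, then the PULSE2PERCEPT_DATA environment variable, then the home-directory default) is an assumption, since the text does not state the order, and resolve_data_dir is a hypothetical stand-in rather than the library function.

```python
import os
import tempfile


def resolve_data_dir(data_dir=None):
    """Resolve the data directory and create it if it does not exist."""
    if data_dir is None:
        # Assumed order: environment variable, then the default in the home directory.
        data_dir = os.environ.get(
            "PULSE2PERCEPT_DATA", os.path.join("~", "pulse2percept_data")
        )
    data_dir = os.path.expanduser(data_dir)
    os.makedirs(data_dir, exist_ok=True)
    return data_dir


# Demonstrate inside a temporary directory so the real home directory is untouched.
previous = os.environ.get("PULSE2PERCEPT_DATA")
with tempfile.TemporaryDirectory() as tmp:
    explicit = resolve_data_dir(os.path.join(tmp, "explicit"))
    explicit_exists = os.path.isdir(explicit)

    os.environ["PULSE2PERCEPT_DATA"] = os.path.join(tmp, "from_env")
    try:
        from_env = resolve_data_dir()
        env_exists = os.path.isdir(from_env)
    finally:
        # Restore whatever value the variable had before the demo.
        if previous is None:
            del os.environ["PULSE2PERCEPT_DATA"]
        else:
            os.environ["PULSE2PERCEPT_DATA"] = previous
```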
-
pulse2percept.datasets.load_horsager2009(subjects=None, electrodes=None, stim_types=None, shuffle=False, random_state=0)
Load data from [Horsager2009]
Load the threshold data described in [Horsager2009]. Average thresholds were extracted from the figures of the paper using WebPlotDigitizer.
Retinal implants: Argus I
Subjects: 2
Number of samples: 552
Number of features: 21
The dataset includes the following features:
- subject – Subject ID, S05-S06
- implant – Argus I
- electrode – Electrode ID, A1-F10
- task – ‘threshold’ or ‘matching’
- stim_type – ‘single_pulse’, ‘fixed_duration’, ‘variable_duration’, ‘fixed_duration_supra’, ‘bursting_triplets’, ‘bursting_triplets_supra’, ‘latent_addition’
- stim_dur – Stimulus duration (ms)
- stim_freq – Stimulus frequency (Hz)
- stim_amp – Stimulus amplitude (uA)
- pulse_type – ‘cathodic_first’
- pulse_dur – Pulse duration (ms)
- interphase_dur – Interphase gap (ms)
- delay_dur – Stimulus delivered after delay (ms)
- ref_stim_type – Reference stimulus type (‘single_pulse’, …)
- ref_freq – Reference stimulus frequency (Hz)
- ref_amp – Reference stimulus amplitude (uA)
- ref_amp_factor – Reference stimulus amplitude factor (xThreshold)
- ref_pulse_dur – Reference stimulus pulse duration (ms)
- ref_interphase_dur – Reference stimulus interphase gap (ms)
- theta – Temporal model output at threshold (a.u.)
- source – Figure from which data was extracted
Some stimulus types require a reference stimulus. For example, ‘bursting_triplets_supra’ were delivered at 2x or 3x threshold of a reference bursting triplet pulse. The parameters of the reference stimulus are given in ref_* fields.
Missing values are denoted by NaN.
New in version 0.6.
Parameters: - subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
- electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- stim_types (str | list of strings | None, optional) – Select data from a single stimulus type or a list of stimulus types. By default, all stimulus types are selected.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 552x21 Pandas DataFrame
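Because missing values are denoted by NaN, rows without a reference stimulus cannot be found with an equality test. The sketch below shows the pitfall on made-up rows; the values are illustrative only, and the assumption that a ‘single_pulse’ row has no ref_amp_factor is mine, not something stated in the dataset description.

```python
import math

# Made-up rows for illustration only; the values are not taken from the dataset.
rows = [
    {"stim_type": "single_pulse", "ref_amp_factor": float("nan")},
    {"stim_type": "bursting_triplets_supra", "ref_amp_factor": 2.0},
    {"stim_type": "bursting_triplets_supra", "ref_amp_factor": 3.0},
]


def is_missing(value):
    """Return True if value is a float NaN."""
    return isinstance(value, float) and math.isnan(value)


# NaN never compares equal to itself, so this comparison matches nothing.
naive_matches = [r for r in rows if r["ref_amp_factor"] == float("nan")]

# An explicit NaN check separates rows with and without a reference stimulus.
with_reference = [r for r in rows if not is_missing(r["ref_amp_factor"])]
without_reference = [r for r in rows if is_missing(r["ref_amp_factor"])]
```

On the returned DataFrame itself, the pandas methods isna and notna perform the same check column-wise.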
-
pulse2percept.datasets.load_nanduri2012(electrodes=None, task=None, shuffle=False, random_state=0)
Load data from [Nanduri2012]
Load the brightness and size rating data described in [Nanduri2012]. Datapoints were extracted from figure 4 of the paper using WebPlotDigitizer.
Retinal implants: Argus I
Subjects: 1
Number of samples: 128
Number of features: 17
The dataset includes the following features:
- subject – Subject ID, S06
- implant – Argus I
- electrode – Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
- task – ‘rate’ or ‘size’
- stim_class – “Nanduri2012”
- stim_dur – Stimulus duration (ms)
- freq – Stimulus frequency (Hz)
- amp_factor – Stimulus amplitude ratio over threshold
- brightness – Patient rated brightness compared to reference stimulus
- size – Patient rated size compared to reference stimulus
- ref_stim_class – “Nanduri2012”
- ref_amp_factor – Amplitude factor (xTh) of reference pulse
- ref_freq – Frequency (Hz) of reference pulse
- pulse_dur – Pulse duration (ms)
- pulse_type – ‘cathodicFirst’
- interphase_dur – Interphase gap (ms)
- varied_param – Whether this trial is a part of ‘amp’ or ‘freq’ modulation
New in version 0.7.
Parameters: - electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 128x17 Pandas DataFrame
-
pulse2percept.datasets.load_perezfornos2012(shuffle=False, subjects=None, figures=None, random_state=0)
Load data from [PerezFornos2012]
Load the brightness associated with joystick position data described in [PerezFornos2012]. Datapoints were extracted from Figures 3-7 of the paper.
Retinal implants: Argus II
Subjects: 9
Number of samples: 45
Number of features: 158
The dataset includes the following features:
- Subject – Subject ID
- Figure – Reference figure from [PerezFornos2012]
- time_series – Numpy array of the time series data associated with the figure. Note: This was generated by linear interpolation from [-2.0, 75.5] in steps of 0.5
New in version 0.8.
Parameters: - shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
- figures (str | list of strings | None, optional) – Select data from a single figure or a list of figures. By default, all figures are selected.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 45x158 Pandas DataFrame
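The interpolation grid mentioned in the time_series note can be reconstructed with a little arithmetic. The sketch below assumes that both endpoints of [-2.0, 75.5] are included; under that assumption the grid has 156 points, which together with the Subject and Figure columns would account for the 158 features listed above. That correspondence is an inference from the numbers on this page, not something the documentation states.

```python
# Grid bounds and step as given in the time_series note.
start, stop, step = -2.0, 75.5, 0.5

# Number of grid points, assuming both endpoints are included.
n_samples = int(round((stop - start) / step)) + 1

# Every value is a multiple of 0.5, so the floating-point arithmetic is exact.
time_axis = [start + step * i for i in range(n_samples)]

# Assumed layout: one column per grid point plus the Subject and Figure columns.
n_label_columns = 2
n_features = n_samples + n_label_columns
```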
-
pulse2percept.datasets.load_greenwald2009(subjects=None, electrodes=None, shuffle=False, random_state=0)
Load data from [Greenwald2009]
Load the brightness and size rating data described in [Greenwald2009]. Datapoints were extracted from figure 4 of the paper using WebPlotDigitizer.
Retinal implants: Argus I
Subjects: 2
Number of samples: 83
Number of features: 12
The dataset includes the following features:
- subject – Subject ID, S06
- implant – Argus I
- electrode – Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
- task – ‘rate’
- stim_class – “Greenwald2009”
- stim_dur – Stimulus duration (ms)
- amp – Amplitude of the stimulation
- brightness – Patient reported brightness
- pulse_dur – Pulse duration (ms)
- interphase_dur – Interphase gap (ms)
- pulse_type – ‘cathodicFirst’
- threshold – Electrical stimulation threshold
New in version 0.7.
Parameters: - subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
- electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns: data (pd.DataFrame) – The whole dataset is returned in an 83x12 Pandas DataFrame