pulse2percept.datasets
Utilities to download and import datasets.
- Dataset loaders can be used to load small datasets that come pre-packaged with the pulse2percept software.
- Dataset fetchers can be used to download larger datasets from a given URL and directly import them into pulse2percept.
pulse2percept.datasets.clear_data_dir(data_dir=None)
Delete all content in the data directory.
By default, the data directory is a directory called ‘pulse2percept_data’ in the user's home directory. Alternatively, it can be set via the PULSE2PERCEPT_DATA environment variable or programmatically by specifying a path.
New in version 0.6.
Parameters: data_dir (str or None) – The path to the pulse2percept data directory.
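The lookup described above can be sketched with the standard library alone. This is an illustration, not pulse2percept's own implementation: the helper names `resolve_data_dir` and `clear_dir_contents` are hypothetical, and two details are assumptions, namely that an explicit path takes precedence over the environment variable and that the directory itself is kept while its contents are removed.

```python
import os
import shutil
import tempfile
from pathlib import Path


def resolve_data_dir(data_dir=None):
    """Return the data directory (hypothetical helper, not library code).

    Assumed precedence: explicit path, then PULSE2PERCEPT_DATA, then
    'pulse2percept_data' in the user's home directory.
    """
    if data_dir is None:
        data_dir = os.environ.get(
            "PULSE2PERCEPT_DATA",
            os.path.join("~", "pulse2percept_data"),
        )
    return Path(data_dir).expanduser()


def clear_dir_contents(data_dir=None):
    """Delete everything inside the resolved directory, keeping the directory."""
    path = resolve_data_dir(data_dir)
    if not path.is_dir():
        return path
    for entry in path.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove sub-directories recursively
        else:
            entry.unlink()  # remove files and symlinks
    return path


# Demonstrate on a throwaway directory so that no real data is touched
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "example.csv").write_text("a,b\n1,2\n")
(demo_dir / "nested").mkdir()
clear_dir_contents(demo_dir)
remaining = list(demo_dir.iterdir())
```

With the library itself, the corresponding call is simply `clear_data_dir()` or `clear_data_dir(data_dir=...)` with the signature documented above.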
pulse2percept.datasets.fetch_url(url, file_path, progress_bar=<function _report_hook>, remote_checksum=None)
Download a remote file.
Fetch a dataset pointed to by url, check its SHA-256 checksum for integrity, and save it to file_path.
New in version 0.6.
Parameters: - url (string) – URL of file to download
- file_path (string) – Path to the local file that will be created
- progress_bar (func callback, optional) – A callback to a function func(count, block_size, total_size) that will display a progress bar.
- remote_checksum (str, optional) – The expected SHA-256 checksum of the file.
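The two ideas in this entry, an integrity check against a SHA-256 digest and a callback of the form func(count, block_size, total_size), can be sketched as follows. This is a minimal standalone illustration that works on a local temporary file instead of a real download. The names `sha256_of_file`, `checksum_matches` and `percent_done` are hypothetical and not part of pulse2percept.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(file_path, chunk_size=8192):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(file_path, expected_checksum):
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of_file(file_path) == expected_checksum.lower()


def percent_done(count, block_size, total_size):
    """Callback with the (count, block_size, total_size) shape.

    Returns the percentage transferred so far (capped at 100), or None
    when the total size is unknown. A real progress bar would print this.
    """
    if total_size <= 0:
        return None
    return min(100.0, 100.0 * count * block_size / total_size)


# Demonstrate on a local temporary file rather than a real download
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.bin"
    demo_file.write_bytes(b"hello")
    demo_ok = checksum_matches(demo_file, hashlib.sha256(b"hello").hexdigest())
    demo_bad = checksum_matches(demo_file, "0" * 64)
```

The file is hashed in chunks so that large datasets do not have to fit in memory at once.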
pulse2percept.datasets.fetch_beyeler2019(subjects=None, electrodes=None, data_path=None, shuffle=False, random_state=0, download_if_missing=True)
Load the phosphene drawing dataset from [Beyeler2019].
Download the phosphene drawing dataset described in [Beyeler2019] from https://osf.io/28uqg (66MB) to data_path. By default, all datasets are stored in ‘~/pulse2percept_data/’, but a different path can be specified.
Retinal implants: Argus I, Argus II
Subjects: 4
Number of samples: 400
Number of features: 16
The dataset includes the following features:
- subject – Subject ID, S1-S4
- electrode – Electrode ID, A1-F10
- image – Phosphene drawing
- img_shape – x,y shape of the phosphene drawing
- date – Experiment date (YYYY/mm/dd)
- stim_class – Stimulus type used to stimulate the array
- amp – Pulse amplitude used (x Threshold)
- freq – Pulse frequency used (Hz)
- pdur – Pulse duration used (ms)
- area – Phosphene area (see [Beyeler2019] for details)
- orientation – Phosphene orientation (see [Beyeler2019])
- eccentricity – Phosphene elongation (see [Beyeler2019])
- compactness – Phosphene compactness (see [Beyeler2019])
- x_center, y_center – Phosphene center of mass (see [Beyeler2019])
- xrange, yrange – Screen size in deg (see [Beyeler2019])
New in version 0.6.
Changed in version 0.7: Redirected download to 66MB version of the dataset that includes the fields x_center and y_center.
Parameters: - subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
- electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- data_path (string, optional) – Specify another download and cache folder for the dataset. By default all pulse2percept data is stored in ‘~/pulse2percept_data’ subfolders.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional, default: 0) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- download_if_missing (bool, optional, default: True) – If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 400x16 Pandas DataFrame
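The selection and shuffling parameters follow a pattern shared by the loaders on this page: a selector may be a single string, a list of strings, or None for "everything", and an integer random_state makes the shuffle repeatable. The sketch below illustrates those semantics on plain Python dicts. It is not the library's implementation (which returns a Pandas DataFrame), `select_rows` is a hypothetical name, and the rows are made up for illustration.

```python
import random


def select_rows(rows, subjects=None, electrodes=None, shuffle=False,
                random_state=0):
    """Filter a list of row dicts and optionally shuffle them reproducibly."""

    def as_set(selector):
        # None means "no filter"; a single string acts as a one-item list
        if selector is None:
            return None
        if isinstance(selector, str):
            return {selector}
        return set(selector)

    wanted_subjects = as_set(subjects)
    wanted_electrodes = as_set(electrodes)
    selected = [
        row for row in rows
        if (wanted_subjects is None or row["subject"] in wanted_subjects)
        and (wanted_electrodes is None or row["electrode"] in wanted_electrodes)
    ]
    if shuffle:
        # A fixed integer seed gives the same order on every call
        random.Random(random_state).shuffle(selected)
    return selected


# Made-up rows for illustration only; these are not values from the dataset
toy_rows = [
    {"subject": "S1", "electrode": "A1"},
    {"subject": "S1", "electrode": "B2"},
    {"subject": "S2", "electrode": "A1"},
    {"subject": "S3", "electrode": "C3"},
]

only_s1 = select_rows(toy_rows, subjects="S1")
s1_s2_on_a1 = select_rows(toy_rows, subjects=["S1", "S2"], electrodes="A1")
shuffled_once = select_rows(toy_rows, shuffle=True, random_state=42)
shuffled_again = select_rows(toy_rows, shuffle=True, random_state=42)
```

With the library installed, the equivalent call would be `fetch_beyeler2019(subjects=['S1', 'S2'], electrodes='A1')`, using only the parameters documented above.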
pulse2percept.datasets.load_horsager2009(subjects=None, electrodes=None, stim_types=None, shuffle=False, random_state=0)
Load data from [Horsager2009].
Load the threshold data described in [Horsager2009]. Average thresholds were extracted from the figures of the paper using WebPlotDigitizer.
Retinal implants: Argus I
Subjects: 2
Number of samples: 552
Number of features: 21
The dataset includes the following features:
- subject – Subject ID, S05-S06
- implant – Argus I
- electrode – Electrode ID, A1-F10
- task – ‘threshold’ or ‘matching’
- stim_type – ‘single_pulse’, ‘fixed_duration’, ‘variable_duration’, ‘fixed_duration_supra’, ‘bursting_triplets’, ‘bursting_triplets_supra’, ‘latent_addition’
- stim_dur – Stimulus duration (ms)
- stim_freq – Stimulus frequency (Hz)
- stim_amp – Stimulus amplitude (uA)
- pulse_type – ‘cathodic_first’
- pulse_dur – Pulse duration (ms)
- interphase_dur – Interphase gap (ms)
- delay_dur – Stimulus delivered after delay (ms)
- ref_stim_type – Reference stimulus type (‘single_pulse’, …)
- ref_freq – Reference stimulus frequency (Hz)
- ref_amp – Reference stimulus amplitude (uA)
- ref_amp_factor – Reference stimulus amplitude factor (xThreshold)
- ref_pulse_dur – Reference stimulus pulse duration (ms)
- ref_interphase_dur – Reference stimulus interphase gap (ms)
- theta – Temporal model output at threshold (a.u.)
- source – Figure from which data was extracted
Some stimulus types require a reference stimulus. For example, ‘bursting_triplets_supra’ were delivered at 2x or 3x threshold of a reference bursting triplet pulse. The parameters of the reference stimulus are given in ref_* fields.
Missing values are denoted by NaN.
New in version 0.6.
Parameters: - subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
- electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- stim_types (str | list of strings | None, optional) – Select data from a single stimulus type or a list of stimulus types. By default, all stimulus types are selected.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 552x21 Pandas DataFrame
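Because missing values are stored as NaN, code that reads the ref_* fields has to skip entries that hold no value. The sketch below shows one way to do that on a single row represented as a plain dict. The helper name `reference_fields` is hypothetical, and the example rows are made up for illustration rather than taken from the dataset.

```python
import math


def reference_fields(row):
    """Return the ref_* fields of a row that actually hold a value."""

    def is_missing(value):
        # NaN is a float that is not equal to itself; math.isnan detects it
        return isinstance(value, float) and math.isnan(value)

    return {
        key: value
        for key, value in row.items()
        if key.startswith("ref_") and not is_missing(value)
    }


nan = float("nan")
# Made-up rows for illustration only; the values are not from the dataset
row_without_reference = {
    "stim_type": "single_pulse",
    "ref_stim_type": nan,
    "ref_freq": nan,
}
row_with_reference = {
    "stim_type": "bursting_triplets_supra",
    "ref_stim_type": "bursting_triplets",
    "ref_amp_factor": 2.0,
    "ref_freq": nan,
}

no_ref = reference_fields(row_without_reference)
with_ref = reference_fields(row_with_reference)
```

On the returned DataFrame, the same check is usually done with the pandas methods `isna()` or `notna()`.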
pulse2percept.datasets.load_nanduri2012(electrodes=None, task=None, shuffle=False, random_state=0)
Load data from [Nanduri2012].
Load the brightness and size rating data described in [Nanduri2012]. Datapoints were extracted from figure 4 of the paper using WebPlotDigitizer.
Retinal implants: Argus I
Subjects: 1
Number of samples: 128
Number of features: 17
The dataset includes the following features:
- subject – Subject ID, S06
- implant – Argus I
- electrode – Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
- task – ‘rate’ or ‘size’
- stim_class – “Nanduri2012”
- stim_dur – Stimulus duration (ms)
- freq – Stimulus frequency (Hz)
- amp_factor – Stimulus amplitude ratio over threshold
- brightness – Patient rated brightness compared to reference stimulus
- size – Patient rated size compared to reference stimulus
- ref_stim_class – “Nanduri2012”
- ref_amp_factor – Amplitude factor (xTh) of reference pulse
- ref_freq – Frequency (Hz) of reference pulse
- pulse_dur – Pulse duration (ms)
- pulse_type – ‘cathodicFirst’
- interphase_dur – Interphase gap (ms)
- varied_param – Whether this trial is a part of ‘amp’ or ‘freq’ modulation
New in version 0.7.
Parameters: - electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 128x17 Pandas DataFrame
pulse2percept.datasets.load_perezfornos2012(shuffle=False, subjects=None, figures=None, random_state=0)
Load data from [PerezFornos2012].
Load the brightness associated with joystick position data described in [PerezFornos2012]. Datapoints were extracted from Figures 3-7 of the paper.
Retinal implants: Argus II
Subjects: 9
Number of samples: 45
Number of features: 158
The dataset includes the following features:
- Subject – Subject ID
- Figure – Reference figure from [PerezFornos2012]
- time_series – Numpy array of the time series data associated with the figure. Note: This was generated by linear interpolation from [-2.0, 75.5] in steps of 0.5
New in version 0.8.
Parameters: - shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
- figures (str | list of strings | None, optional) – Select data from a single figure or a list of figures. By default, all figures are selected.
Returns: data (pd.DataFrame) – The whole dataset is returned in a 45x158 Pandas DataFrame
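The time grid mentioned in the note on time_series can be worked out directly. Assuming both endpoints are included, stepping from -2.0 to 75.5 in increments of 0.5 yields 156 time points. Together with the Subject and Figure columns that would account for the 158 features listed above, although the source does not state this breakdown explicitly. The sketch below builds the grid and resamples a few made-up digitized points onto it by linear interpolation. The function names are hypothetical and the values are not from the dataset.

```python
def time_grid(start=-2.0, stop=75.5, step=0.5):
    """Return evenly spaced time points from start to stop, both included."""
    n_points = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n_points)]


def interpolate(x, xs, ys):
    """Linearly interpolate y at x from sorted sample points (xs, ys)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            y0, y1 = ys[i], ys[i + 1]
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


grid = time_grid()

# Made-up digitized points for illustration only
digitized_t = [-2.0, 0.0, 10.0, 75.5]
digitized_y = [0.0, 0.0, 10.0, 10.0]
series = [interpolate(t, digitized_t, digitized_y) for t in grid]
```

With NumPy available, `numpy.interp` performs the same resampling in a single call.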
pulse2percept.datasets.load_greenwald2009(subjects=None, electrodes=None, shuffle=False, random_state=0)
Load data from [Greenwald2009].
Load the brightness and size rating data described in [Greenwald2009]. Datapoints were extracted from figure 4 of the paper using WebPlotDigitizer.
Retinal implants: Argus I
Subjects: 2
Number of samples: 83
Number of features: 12
The dataset includes the following features:
- subject – Subject ID, S06
- implant – Argus I
- electrode – Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
- task – ‘rate’
- stim_class – “Greenwald2009”
- stim_dur – Stimulus duration (ms)
- amp – Amplitude of the stimulation
- brightness – Patient reported brightness
- pulse_dur – Pulse duration (ms)
- interphase_dur – Interphase gap (ms)
- pulse_type – ‘cathodicFirst’
- threshold – Electrical stimulation threshold
New in version 0.7.
Parameters: - subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected
- electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
- shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
- random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns: data (pd.DataFrame) – The whole dataset is returned in an 83x12 Pandas DataFrame