pulse2percept.datasets

Utilities to download and import datasets.

  • Dataset loaders can be used to load small datasets that come pre-packaged with the pulse2percept software.
  • Dataset fetchers can be used to download larger datasets from a given URL and directly import them into pulse2percept.
base             get_data_dir, clear_data_dir, fetch_url
horsager2009     load_horsager2009
beyeler2019      fetch_beyeler2019
nanduri2012      load_nanduri2012
perezfornos2012  load_perezfornos2012
greenwald2009    load_greenwald2009
pulse2percept.datasets.clear_data_dir(data_dir=None)

Delete all content in the data directory

By default, the data directory is a folder called ‘pulse2percept_data’ in the user's home directory. Alternatively, it can be set via the PULSE2PERCEPT_DATA environment variable or programmatically by passing a path.

New in version 0.6.

Parameters:
  • data_dir (str or None) – The path to the pulse2percept data directory.
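
The way the data directory is located can be sketched as follows. This is a minimal illustration rather than the library's implementation: the helper name `resolve_data_dir` is hypothetical, and the precedence shown (explicit path first, then the PULSE2PERCEPT_DATA environment variable, then the home-directory default) is an assumption.

```python
import os


def resolve_data_dir(data_dir=None):
    """Return the data directory implied by the rules described above.

    Hypothetical helper; the precedence order is an assumption.
    """
    if data_dir is None:
        # Fall back to the environment variable, then to the default folder
        default = os.path.join("~", "pulse2percept_data")
        data_dir = os.environ.get("PULSE2PERCEPT_DATA", default)
    # Expand a leading "~" to the user's home directory
    return os.path.expanduser(data_dir)
```

Because clear_data_dir deletes everything in that folder, it may be worth printing the resolved path before clearing it.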
pulse2percept.datasets.fetch_url(url, file_path, progress_bar=<function _report_hook>, remote_checksum=None)

Download a remote file

Fetch a dataset pointed to by url, check its SHA-256 checksum for integrity, and save it to file_path.

New in version 0.6.

Parameters:
  • url (string) – URL of file to download
  • file_path (string) – Path to the local file that will be created
  • progress_bar (func callback, optional) – A callback to a function func(count, block_size, total_size) that will display a progress bar.
  • remote_checksum (str, optional) – The expected SHA-256 checksum of the file.
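
The two supporting pieces described above, an integrity check and a progress callback, might look roughly like this. Both function names are hypothetical and this is not the library's own code. The callback uses the `(count, block_size, total_size)` signature given above, which is also the shape of the `reporthook` argument of `urllib.request.urlretrieve`.

```python
import hashlib


def sha256_of_file(file_path, chunk_size=8192):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def percent_done(count, block_size, total_size):
    """Return the percentage a progress bar would show, capped at 100."""
    if total_size <= 0:
        return 0.0  # total size unknown
    return min(100.0, 100.0 * count * block_size / total_size)
```

A downloaded file would then be accepted only if `sha256_of_file(file_path)` equals the expected `remote_checksum`.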
pulse2percept.datasets.fetch_beyeler2019(subjects=None, electrodes=None, data_path=None, shuffle=False, random_state=0, download_if_missing=True)

Load the phosphene drawing dataset from [Beyeler2019]

Download the phosphene drawing dataset described in [Beyeler2019] from https://osf.io/28uqg (66MB) to data_path. By default, all datasets are stored in ‘~/pulse2percept_data/’, but a different path can be specified.

Retinal implants: Argus I, Argus II
Subjects: 4
Number of samples: 400
Number of features: 16

The dataset includes the following features:

subject             Subject ID, S1-S4
electrode           Electrode ID, A1-F10
image               Phosphene drawing
img_shape           x,y shape of the phosphene drawing
date                Experiment date (YYYY/mm/dd)
stim_class          Stimulus type used to stimulate the array
amp                 Pulse amplitude used (x Threshold)
freq                Pulse frequency used (Hz)
pdur                Pulse duration used (ms)
area                Phosphene area (see [Beyeler2019] for details)
orientation         Phosphene orientation (see [Beyeler2019])
eccentricity        Phosphene elongation (see [Beyeler2019])
compactness         Phosphene compactness (see [Beyeler2019])
x_center, y_center  Phosphene center of mass (see [Beyeler2019])
xrange, yrange      Screen size in deg (see [Beyeler2019])

New in version 0.6.

Changed in version 0.7: Redirected download to 66MB version of the dataset that includes the fields x_center and y_center.

Parameters:
  • subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
  • data_path (string, optional) – Specify another download and cache folder for the dataset. By default all pulse2percept data is stored in ‘~/pulse2percept_data’ subfolders.
  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
  • random_state (int | numpy.random.RandomState | None, optional, default: 0) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
  • download_if_missing (boolean, optional) – If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.
Returns:

data (pd.DataFrame) – The whole dataset is returned in a 400x16 Pandas DataFrame
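
The caching behavior controlled by download_if_missing can be illustrated with a small sketch. The helper `local_dataset_path` and the file name used in the test are hypothetical; the actual file layout inside the data folder is not specified here.

```python
import os


def local_dataset_path(data_path, file_name, download_if_missing=True):
    """Return (file_path, needs_download) for a cached dataset file.

    Hypothetical helper illustrating the download_if_missing flag.
    """
    file_path = os.path.join(os.path.expanduser(data_path), file_name)
    if os.path.isfile(file_path):
        return file_path, False  # already cached, nothing to download
    if not download_if_missing:
        # IOError is an alias of OSError in Python 3
        raise IOError(f"No local data found at {file_path}")
    return file_path, True  # caller should download to file_path
```

Passing `download_if_missing=False` is useful on machines without network access, where a failed lookup should surface immediately instead of triggering a download attempt.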

pulse2percept.datasets.load_horsager2009(subjects=None, electrodes=None, stim_types=None, shuffle=False, random_state=0)

Load data from [Horsager2009]

Load the threshold data described in [Horsager2009]. Average thresholds were extracted from the figures of the paper using WebPlotDigitizer.

Retinal implants: Argus I
Subjects: 2
Number of samples: 552
Number of features: 21

The dataset includes the following features:

subject             Subject ID, S05-S06
implant             Argus I
electrode           Electrode ID, A1-F10
task                ‘threshold’ or ‘matching’
stim_type           ‘single_pulse’, ‘fixed_duration’, ‘variable_duration’, ‘fixed_duration_supra’, ‘bursting_triplets’, ‘bursting_triplets_supra’, ‘latent_addition’
stim_dur            Stimulus duration (ms)
stim_freq           Stimulus frequency (Hz)
stim_amp            Stimulus amplitude (uA)
pulse_type          ‘cathodic_first’
pulse_dur           Pulse duration (ms)
interphase_dur      Interphase gap (ms)
delay_dur           Stimulus delivered after delay (ms)
ref_stim_type       Reference stimulus type (‘single_pulse’, …)
ref_freq            Reference stimulus frequency (Hz)
ref_amp             Reference stimulus amplitude (uA)
ref_amp_factor      Reference stimulus amplitude factor (xThreshold)
ref_pulse_dur       Reference stimulus pulse duration (ms)
ref_interphase_dur  Reference stimulus interphase gap (ms)
theta               Temporal model output at threshold (a.u.)
source              Figure from which data was extracted

Some stimulus types require a reference stimulus. For example, ‘bursting_triplets_supra’ were delivered at 2x or 3x threshold of a reference bursting triplet pulse. The parameters of the reference stimulus are given in ref_* fields.

Missing values are denoted by NaN.
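
Because NaN never compares equal to anything, including itself, missing values need an explicit NaN test rather than `==`. The sketch below uses plain Python floats and made-up records, not actual values from the dataset.

```python
import math

# Made-up records for illustration; not actual values from the dataset
rows = [
    {"stim_type": "single_pulse", "ref_amp_factor": float("nan")},
    {"stim_type": "bursting_triplets_supra", "ref_amp_factor": 2.0},
]

# NaN is not equal to anything, including itself
assert float("nan") != float("nan")

# Keep only the rows that carry a reference stimulus
with_reference = [r for r in rows if not math.isnan(r["ref_amp_factor"])]
```

On a pandas DataFrame, `Series.isna()` or `DataFrame.dropna(subset=[...])` serve the same purpose.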

New in version 0.6.

Parameters:
  • subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
  • stim_types (str | list of strings | None, optional) – Select data from a single stimulus type or a list of stimulus types. By default, all stimulus types are selected.
  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns:

data (pd.DataFrame) – The whole dataset is returned in a 552x21 Pandas DataFrame
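
The `str | list of strings | None` convention shared by subjects, electrodes, and stim_types can be sketched with a hypothetical helper, `select_rows`, operating on plain dictionaries. This illustrates the selection semantics only and is not the loader's implementation; the records are made up.

```python
def select_rows(rows, column, wanted=None):
    """Filter rows by one column, accepting a str, a list of str, or None."""
    if wanted is None:
        return list(rows)  # None selects everything
    if isinstance(wanted, str):
        wanted = [wanted]  # a single value is treated as a one-item list
    allowed = set(wanted)
    return [row for row in rows if row[column] in allowed]


# Made-up records for illustration
rows = [
    {"subject": "S05", "stim_type": "single_pulse"},
    {"subject": "S06", "stim_type": "single_pulse"},
    {"subject": "S06", "stim_type": "latent_addition"},
]

# Filters combine: subject S06 AND stimulus type 'single_pulse'
s06_single = select_rows(select_rows(rows, "subject", "S06"),
                         "stim_type", ["single_pulse"])
```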

pulse2percept.datasets.load_nanduri2012(electrodes=None, task=None, shuffle=False, random_state=0)

Load data from [Nanduri2012]

Load the brightness and size rating data described in [Nanduri2012]. Datapoints were extracted from Figure 4 of the paper using WebPlotDigitizer.

Retinal implants: Argus I
Subjects: 1
Number of samples: 128
Number of features: 17

The dataset includes the following features:

subject         Subject ID, S06
implant         Argus I
electrode       Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
task            ‘rate’ or ‘size’
stim_class      “Nanduri2012”
stim_dur        Stimulus duration (ms)
freq            Stimulus frequency (Hz)
amp_factor      Stimulus amplitude ratio over threshold
brightness      Patient rated brightness compared to reference stimulus
size            Patient rated size compared to reference stimulus
ref_stim_class  “Nanduri2012”
ref_amp_factor  Amplitude factor (xTh) of reference pulse
ref_freq        Frequency (Hz) of reference pulse
pulse_dur       Pulse duration (ms)
pulse_type      ‘cathodicFirst’
interphase_dur  Interphase gap (ms)
varied_param    Whether this trial is a part of ‘amp’ or ‘freq’ modulation

New in version 0.7.

Parameters:
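  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
  • task (str | None, optional) – Select data from a single task (‘rate’ or ‘size’). By default, all tasks are selected.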
  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns:

data (pd.DataFrame) – The whole dataset is returned in a 128x17 Pandas DataFrame
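
The effect of passing an int as random_state can be shown with a small sketch. It uses the standard library's `random.Random` as a stand-in for `numpy.random.RandomState`, and the helper name `shuffled` is hypothetical; the loaders' actual shuffling code may differ.

```python
import random


def shuffled(rows, random_state=0):
    """Return a shuffled copy of rows; an int seed makes the order repeatable."""
    rng = random.Random(random_state)  # stand-in for numpy.random.RandomState
    out = list(rows)
    rng.shuffle(out)
    return out


first = shuffled(range(10), random_state=0)
second = shuffled(range(10), random_state=0)
# Same seed, same order: the shuffle is reproducible across calls
assert first == second
```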

pulse2percept.datasets.load_perezfornos2012(shuffle=False, subjects=None, figures=None, random_state=0)

Load data from [PerezFornos2012]

Load the joystick-position brightness data described in [PerezFornos2012]. Datapoints were extracted from Figures 3-7 of the paper.

Retinal implants: Argus II
Subjects: 9
Number of samples: 45
Number of features: 158

The dataset includes the following features:

Subject      Subject ID
Figure       Reference figure from [PerezFornos2012]
time_series  Numpy array of the time series data associated with the figure. Note: This was generated by linear interpolation from [-2.0, 75.5] in steps of 0.5.

New in version 0.8.

Parameters:
  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
  • subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
  • figures (str | list of strings | None, optional) – Select data from a single figure or a list of figures. By default, all figures are selected.
Returns:

data (pd.DataFrame) – The whole dataset is returned in a 45x158 Pandas DataFrame
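
The resampling grid mentioned for time_series can be worked out directly: stepping from -2.0 to 75.5 in increments of 0.5 gives (75.5 + 2.0) / 0.5 + 1 = 156 samples. If each sample is stored as its own column (an assumption), 156 samples plus the Subject and Figure columns would account for the 158 features. The sketch below uses hypothetical helper names and made-up digitized points.

```python
def time_grid(start=-2.0, stop=75.5, step=0.5):
    """Return evenly spaced sample times from start to stop, inclusive."""
    n_steps = round((stop - start) / step)
    return [start + i * step for i in range(n_steps + 1)]


def interp_linear(t, times, values):
    """Linearly interpolate values (sampled at increasing times) at time t."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            v0, v1 = values[i], values[i + 1]
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


grid = time_grid()
# Made-up digitized points (times, values) for illustration only
resampled = [interp_linear(t, [-2.0, 10.0, 75.5], [0.0, 6.0, 6.0]) for t in grid]
```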

pulse2percept.datasets.load_greenwald2009(subjects=None, electrodes=None, shuffle=False, random_state=0)

Load data from [Greenwald2009]

Load the brightness and size rating data described in [Greenwald2009]. Datapoints were extracted from Figure 4 of the paper using WebPlotDigitizer.

Retinal implants: Argus I
Subjects: 2
Number of samples: 83
Number of features: 12

The dataset includes the following features:

subject         Subject ID, S06
implant         Argus I
electrode       Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
task            ‘rate’
stim_class      “Greenwald2009”
stim_dur        Stimulus duration (ms)
amp             Amplitude of the stimulation
brightness      Patient reported brightness
pulse_dur       Pulse duration (ms)
interphase_dur  Interphase gap (ms)
pulse_type      ‘cathodicFirst’
threshold       Electrical stimulation threshold

New in version 0.7.

Parameters:
  • subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
Returns:

data (pd.DataFrame) – The whole dataset is returned in an 83x12 Pandas DataFrame