pulse2percept.datasets

Utilities to download and import datasets.

  • Dataset loaders can be used to load small datasets that come pre-packaged with the pulse2percept software.

  • Dataset fetchers can be used to download larger datasets from a given URL and directly import them into pulse2percept.

  • base – get_data_dir, clear_data_dir, fetch_url

  • han2021 – fetch_han2021

  • beyeler2019 – fetch_beyeler2019, subject_params

  • nanduri2012 – load_nanduri2012

  • perezfornos2012 – load_perezfornos2012

  • greenwald2009 – load_greenwald2009

  • horsager2009 – load_horsager2009

pulse2percept.datasets.clear_data_dir(data_dir=None)

Delete all content in the data directory

By default, the data directory is a directory called ‘pulse2percept_data’ in the user's home directory. Alternatively, it can be set via the PULSE2PERCEPT_DATA environment variable or programmatically by specifying a path.

Added in version 0.6.

Parameters:

data_dir (str or None) – The path to the pulse2percept data directory.
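
The lookup order described above can be sketched in a few lines. The helper below is hypothetical and is not pulse2percept's implementation. It assumes that an explicit path takes precedence over the PULSE2PERCEPT_DATA environment variable, which in turn takes precedence over the default directory in the user's home.

```python
import os


def resolve_data_dir(data_dir=None):
    """Hypothetical helper mirroring the documented lookup order.

    Assumed precedence: explicit path, then the PULSE2PERCEPT_DATA
    environment variable, then '~/pulse2percept_data'.
    """
    if data_dir is None:
        default = os.path.join("~", "pulse2percept_data")
        data_dir = os.environ.get("PULSE2PERCEPT_DATA", default)
    # Expand a leading '~' to the user's home directory.
    return os.path.expanduser(data_dir)
```

Checking the resolved path this way before calling clear_data_dir may help avoid deleting the wrong directory.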

pulse2percept.datasets.fetch_url(url, file_path, progress_bar=<function _report_hook>, remote_checksum=None)

Download a remote file

Fetch a dataset pointed to by url, check its SHA-256 checksum for integrity, and save it to file_path.

Added in version 0.6.

Parameters:
  • url (string) – URL of file to download

  • file_path (string) – Path to the local file that will be created

  • progress_bar (func callback, optional) – A callback to a function func(count, block_size, total_size) that will display a progress bar.

  • remote_checksum (str, optional) – The expected SHA-256 checksum of the file.
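
To make the two optional arguments more concrete, here is a minimal sketch using only the Python standard library. Both function names are hypothetical and neither is part of pulse2percept: sha256_of_file shows how a downloaded file's SHA-256 digest could be computed for comparison against remote_checksum, and percent_done is a callback with the documented (count, block_size, total_size) signature.

```python
import hashlib


def sha256_of_file(file_path, chunk_size=8192):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def percent_done(count, block_size, total_size):
    """Progress callback: print and return the percentage downloaded.

    The value is capped at 100 because the last block may be partial.
    """
    if total_size <= 0:
        # Total size unknown, so no meaningful percentage can be given.
        return 0.0
    percent = min(100.0, 100.0 * count * block_size / total_size)
    print(f"{percent:5.1f}% downloaded")
    return percent
```

A caller could pass percent_done as progress_bar and afterwards compare sha256_of_file(file_path) with the expected checksum.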

pulse2percept.datasets.fetch_beyeler2019(subjects=None, electrodes=None, data_path=None, shuffle=False, random_state=0, download_if_missing=True)

Load the phosphene drawing dataset from [Beyeler2019]

Download the phosphene drawing dataset described in [Beyeler2019] from https://osf.io/28uqg (66MB) to data_path. By default, all datasets are stored in ‘~/pulse2percept_data/’, but a different path can be specified.

Retinal implants: Argus I, Argus II

Subjects: 4

Number of samples: 400

Number of features: 16

The dataset includes the following features:

subject

Subject ID, S1-S4

electrode

Electrode ID, A1-F10

image

Phosphene drawing

img_shape

x,y shape of the phosphene drawing

date

Experiment date (YYYY/mm/dd)

stim_class

Stimulus type used to stimulate the array

amp

Pulse amplitude used (x Threshold)

freq

Pulse frequency used (Hz)

pdur

Pulse duration used (ms)

area

Phosphene area (see [Beyeler2019] for details)

orientation

Phosphene orientation (see [Beyeler2019])

eccentricity

Phosphene elongation (see [Beyeler2019])

compactness

Phosphene compactness (see [Beyeler2019])

x_center, y_center

Phosphene center of mass (see [Beyeler2019])

xrange, yrange

Screen size in deg (see [Beyeler2019])

Added in version 0.6.

Changed in version 0.7: Redirected download to 66MB version of the dataset that includes the fields x_center and y_center.

Parameters:
  • subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.

  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.

  • data_path (string, optional) – Specify another download and cache folder for the dataset. By default all pulse2percept data is stored in ‘~/pulse2percept_data’ subfolders.

  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.

  • random_state (int | numpy.random.RandomState | None, optional, default: 0) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.

  • download_if_missing (boolean, optional) – If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.

Returns:

data (pd.DataFrame) – The whole dataset is returned in a 400x16 Pandas DataFrame
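
As an illustration of how these parameters combine, the sketch below wraps fetch_beyeler2019 in a hypothetical helper, load_cached_drawings, that only reads data already on disk. It relies on the documented behavior that download_if_missing=False raises an IOError when the data is not available locally. The import is deferred into the function body so that defining the helper does not require pulse2percept to be installed.

```python
def load_cached_drawings(subjects=None, electrodes=None, data_path=None):
    """Return the cached drawings, or None if they are not on disk.

    Hypothetical wrapper around fetch_beyeler2019, for illustration only.
    """
    # Deferred import: only needed when the helper is actually called.
    from pulse2percept.datasets import fetch_beyeler2019

    try:
        return fetch_beyeler2019(subjects=subjects, electrodes=electrodes,
                                 data_path=data_path, shuffle=True,
                                 random_state=42, download_if_missing=False)
    except IOError:
        # Documented behavior when the data is not locally available.
        return None
```

For example, load_cached_drawings(subjects="S2", electrodes=["A1", "A2"]) would request shuffled rows for one subject and two electrodes without touching the network.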

pulse2percept.datasets.load_horsager2009(subjects=None, electrodes=None, stim_types=None, shuffle=False, random_state=0)

Load data from [Horsager2009]

Load the threshold data described in [Horsager2009]. Average thresholds were extracted from the figures of the paper using WebPlotDigitizer.

Retinal implants: Argus I

Subjects: 2

Number of samples: 552

Number of features: 21

The dataset includes the following features:

subject

Subject ID, S05-S06

implant

Argus I

electrode

Electrode ID, A1-F10

task

‘threshold’ or ‘matching’

stim_type

‘single_pulse’, ‘fixed_duration’, ‘variable_duration’, ‘fixed_duration_supra’, ‘bursting_triplets’, ‘bursting_triplets_supra’, ‘latent_addition’

stim_dur

Stimulus duration (ms)

stim_freq

Stimulus frequency (Hz)

stim_amp

Stimulus amplitude (uA)

pulse_type

‘cathodic_first’

pulse_dur

Pulse duration (ms)

interphase_dur

Interphase gap (ms)

delay_dur

Stimulus delivered after delay (ms)

ref_stim_type

Reference stimulus type (‘single_pulse’, …)

ref_freq

Reference stimulus frequency (Hz)

ref_amp

Reference stimulus amplitude (uA)

ref_amp_factor

Reference stimulus amplitude factor (xThreshold)

ref_pulse_dur

Reference stimulus pulse duration (ms)

ref_interphase_dur

Reference stimulus interphase gap (ms)

theta

Temporal model output at threshold (a.u.)

source

Figure from which data was extracted

Some stimulus types require a reference stimulus. For example, ‘bursting_triplets_supra’ were delivered at 2x or 3x threshold of a reference bursting triplet pulse. The parameters of the reference stimulus are given in ref_* fields.

Missing values are denoted by NaN.

Added in version 0.6.

Parameters:
  • subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.

  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.

  • stim_types (str | list of strings | None, optional) – Select data from a single stimulus type or a list of stimulus types. By default, all stimulus types are selected.

  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.

  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.

Returns:

data (pd.DataFrame) – The whole dataset is returned in a 552x21 Pandas DataFrame
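
Because missing values are denoted by NaN, a subset of stimulus types may leave some fields, such as the ref_* fields, entirely empty. The sketch below is a hypothetical helper (load_threshold_subset is not part of pulse2percept) that selects a subject and a few stimulus types and then drops all-NaN columns using pandas' DataFrame.dropna. Whether any columns are actually dropped depends on the selection. The import is deferred so that defining the helper does not require pulse2percept.

```python
def load_threshold_subset(subject="S05",
                          stim_types=("single_pulse", "fixed_duration")):
    """Load selected Horsager2009 rows and drop columns that are all NaN.

    Hypothetical helper, for illustration only.
    """
    # Deferred import: only needed when the helper is actually called.
    from pulse2percept.datasets import load_horsager2009

    data = load_horsager2009(subjects=subject, stim_types=list(stim_types))
    # Fields that do not apply to the selected rows contain only NaN,
    # so they can be removed without losing information.
    return data.dropna(axis=1, how="all")
```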

pulse2percept.datasets.load_nanduri2012(electrodes=None, task=None, shuffle=False, random_state=0)

Load data from [Nanduri2012]

Load the brightness and size rating data described in [Nanduri2012]. Datapoints were extracted from figure 4 of the paper using WebPlotDigitizer.

Retinal implants: Argus I

Subjects: 1

Number of samples: 128

Number of features: 17

The dataset includes the following features:

subject

Subject ID, S06

implant

Argus I

electrode

Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4

task

‘rate’ or ‘size’

stim_class

“Nanduri2012”

stim_dur

Stimulus duration (ms)

freq

Stimulus frequency (Hz)

amp_factor

Stimulus amplitude ratio over threshold

brightness

Patient rated brightness compared to reference stimulus

size

Patient rated size compared to reference stimulus

ref_stim_class

“Nanduri2012”

ref_amp_factor

Amplitude factor (xTh) of reference pulse

ref_freq

Frequency (Hz) of reference pulse

pulse_dur

Pulse duration (ms)

pulse_type

‘cathodicFirst’

interphase_dur

Interphase gap (ms)

varied_param

Whether this trial is a part of ‘amp’ or ‘freq’ modulation

Added in version 0.7.

Parameters:
  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.

  • task (str | None, optional) – Select data from a single task (‘rate’ or ‘size’). By default, all tasks are selected.

  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.

  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.

Returns:

data (pd.DataFrame) – The whole dataset is returned in a 128x17 Pandas DataFrame

pulse2percept.datasets.load_perezfornos2012(shuffle=False, subjects=None, figures=None, random_state=0)

Load data from [PerezFornos2012]

Load the perceived-brightness data, recorded as joystick position over time, described in [PerezFornos2012]. Datapoints were extracted from Figures 3-7 of the paper.

Retinal implants: Argus II

Subjects: 9

Number of samples: 45

Number of features: 158

The dataset includes the following features:

Subject

Subject ID

Figure

Reference figure from [PerezFornos2012]

time_series

Numpy array of the time series data associated with the figure. Note: This was generated by linear interpolation from [-2.0, 75.5] in steps of 0.5

Added in version 0.8.

Parameters:
  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.

  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.

  • subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.

  • figures (str | list of strings | None, optional) – Select data from a single figure or a list of figures. By default, all figures are selected.

Returns:

data (pd.DataFrame) – The whole dataset is returned in a 45x158 Pandas DataFrame
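
The feature count of 158 can be reconciled with the three listed features by a short calculation. The sketch below counts the points on the stated interpolation grid and then assumes that each time point occupies one column alongside the Subject and Figure labels. That column layout is an inference from the numbers above, not something the documentation states.

```python
# Time grid stated in the docs: -2.0 to 75.5 inclusive, in steps of 0.5.
start, stop, step = -2.0, 75.5, 0.5
n_points = round((stop - start) / step) + 1
time_grid = [start + i * step for i in range(n_points)]

# Assumption: the remaining two columns are the Subject and Figure labels.
n_label_columns = 2
n_features = n_points + n_label_columns

print(n_points, n_features)  # prints: 156 158
```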

pulse2percept.datasets.load_greenwald2009(subjects=None, electrodes=None, shuffle=False, random_state=0)

Load data from [Greenwald2009]

Load the brightness rating data described in [Greenwald2009]. Datapoints were extracted from figure 4 of the paper using WebPlotDigitizer.

Retinal implants: Argus I

Subjects: 2

Number of samples: 83

Number of features: 12

The dataset includes the following features:

subject

Subject ID, S06

implant

Argus I

electrode

Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4

task

‘rate’

stim_class

“Greenwald2009”

stim_dur

Stimulus duration (ms)

amp

Amplitude of the stimulation

brightness

Patient reported brightness

pulse_dur

Pulse duration (ms)

interphase_dur

Interphase gap (ms)

pulse_type

‘cathodicFirst’

threshold

Electrical stimulation threshold

Added in version 0.7.

Parameters:
  • subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.

  • electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.

  • shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.

  • random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.

Returns:

data (pd.DataFrame) – The whole dataset is returned in an 83x12 Pandas DataFrame