pulse2percept.datasets
Utilities to download and import datasets.
Dataset loaders can be used to load small datasets that come pre-packaged with the pulse2percept software.
Dataset fetchers can be used to download larger datasets from a given URL and directly import them into pulse2percept.
- pulse2percept.datasets.clear_data_dir(data_dir=None)[source]
Delete all content in the data directory
By default, the data directory is a directory called ‘pulse2percept_data’ in the user home directory. Alternatively, it can be set via the PULSE2PERCEPT_DATA environment variable or programmatically by specifying a path.
Added in version 0.6.
- Parameters:
data_dir (str or None) – The path to the pulse2percept data directory.
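The three ways of choosing the data directory described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `resolve_data_dir` is hypothetical, and the precedence order (explicit path, then environment variable, then default) is an assumption.

```python
import os
from pathlib import Path


def resolve_data_dir(data_dir=None):
    """Hypothetical helper: decide which directory holds the data.

    Assumed precedence: explicit argument, then the PULSE2PERCEPT_DATA
    environment variable, then ~/pulse2percept_data.
    """
    if data_dir is not None:
        # A path given programmatically wins over everything else
        return Path(data_dir).expanduser()
    env_dir = os.environ.get("PULSE2PERCEPT_DATA")
    if env_dir:
        # Fall back to the environment variable if it is set and non-empty
        return Path(env_dir).expanduser()
    # Default: a folder in the user home directory
    return Path.home() / "pulse2percept_data"
```

Calling `resolve_data_dir()` with no argument and no environment variable set returns the default folder in the home directory.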
- pulse2percept.datasets.fetch_url(url, file_path, progress_bar=<function _report_hook>, remote_checksum=None)[source]
Download a remote file
Fetch a dataset pointed to by url, check its SHA-256 checksum for integrity, and save it to file_path.
Added in version 0.6.
- Parameters:
url (string) – URL of file to download
file_path (string) – Path to the local file that will be created
progress_bar (func callback, optional) – A callback to a function func(count, block_size, total_size) that will display a progress bar.
remote_checksum (str, optional) – The expected SHA-256 checksum of the file.
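The integrity check and the progress callback can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the code behind fetch_url: the names `sha256_of`, `percent_done`, and `report_progress` are hypothetical. The three-argument callback shape is the same one that `urllib.request.urlretrieve` passes to its `reporthook`.

```python
import hashlib


def sha256_of(file_path, chunk_size=8192):
    """Hypothetical helper: SHA-256 hex digest of a file, read in chunks."""
    sha = hashlib.sha256()
    with open(file_path, "rb") as f:
        # Read fixed-size chunks so large files never sit fully in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def percent_done(count, block_size, total_size):
    """Percentage downloaded, or None if the total size is unknown."""
    if total_size <= 0:
        return None
    # The last block may overshoot the total, so cap at 100
    return min(100.0, 100.0 * count * block_size / total_size)


def report_progress(count, block_size, total_size):
    """Hypothetical callback with the func(count, block_size, total_size) shape."""
    pct = percent_done(count, block_size, total_size)
    if pct is not None:
        print(f"\rDownloading: {pct:5.1f}%", end="")
```

A downloaded file would be accepted only if `sha256_of(file_path)` equals the expected `remote_checksum` string.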
- pulse2percept.datasets.fetch_beyeler2019(subjects=None, electrodes=None, data_path=None, shuffle=False, random_state=0, download_if_missing=True)[source]
Load the phosphene drawing dataset from [Beyeler2019]
Download the phosphene drawing dataset described in [Beyeler2019] from https://osf.io/28uqg (66MB) to data_path. By default, all datasets are stored in ‘~/pulse2percept_data/’, but a different path can be specified.
Retinal implants:
Argus I, Argus II
Subjects:
4
Number of samples:
400
Number of features:
16
The dataset includes the following features:
subject
Subject ID, S1-S4
electrode
Electrode ID, A1-F10
image
Phosphene drawing
img_shape
x,y shape of the phosphene drawing
date
Experiment date (YYYY/mm/dd)
stim_class
Stimulus type used to stimulate the array
amp
Pulse amplitude used (x Threshold)
freq
Pulse frequency used (Hz)
pdur
Pulse duration used (ms)
area
Phosphene area (see [Beyeler2019] for details)
orientation
Phosphene orientation (see [Beyeler2019])
eccentricity
Phosphene elongation (see [Beyeler2019])
compactness
Phosphene compactness (see [Beyeler2019])
x_center, y_center
Phosphene center of mass (see [Beyeler2019])
xrange, yrange
Screen size in deg (see [Beyeler2019])
Added in version 0.6.
Changed in version 0.7: Redirected download to 66MB version of the dataset that includes the fields x_center and y_center.
- Parameters:
subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
data_path (string, optional) – Specify another download and cache folder for the dataset. By default all pulse2percept data is stored in ‘~/pulse2percept_data’ subfolders.
shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
random_state (int | numpy.random.RandomState | None, optional, default: 0) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
download_if_missing (bool, optional) – If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.
- Returns:
data (pd.DataFrame) – The whole dataset is returned in a 400x16 Pandas DataFrame
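The behavior of download_if_missing described above can be sketched as a small cache check. This is an illustration only: the helper `ensure_local` and its `downloader` argument are hypothetical and do not reflect how the library is written.

```python
import os


def ensure_local(file_path, download_if_missing=True, downloader=None):
    """Hypothetical helper: return file_path, downloading only if allowed."""
    if os.path.isfile(file_path):
        # Already cached locally, nothing to do
        return file_path
    if not download_if_missing:
        # Mirror the documented behavior: fail instead of downloading
        raise IOError(f"No local copy found at {file_path}")
    # Caller-supplied function that is expected to create file_path
    downloader(file_path)
    return file_path
```

With `download_if_missing=False`, a missing file raises an error immediately, which is useful on machines without network access.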
- pulse2percept.datasets.load_horsager2009(subjects=None, electrodes=None, stim_types=None, shuffle=False, random_state=0)[source]
Load data from [Horsager2009]
Load the threshold data described in [Horsager2009]. Average thresholds were extracted from the figures of the paper using WebplotDigitizer.
Retinal implants:
Argus I
Subjects:
2
Number of samples:
552
Number of features:
21
The dataset includes the following features:
subject
Subject ID, S05-S06
implant
Argus I
electrode
Electrode ID, A1-F10
task
‘threshold’ or ‘matching’
stim_type
‘single_pulse’, ‘fixed_duration’, ‘variable_duration’, ‘fixed_duration_supra’, ‘bursting_triplets’, ‘bursting_triplets_supra’, ‘latent_addition’
stim_dur
Stimulus duration (ms)
stim_freq
Stimulus frequency (Hz)
stim_amp
Stimulus amplitude (uA)
pulse_type
‘cathodic_first’
pulse_dur
Pulse duration (ms)
interphase_dur
Interphase gap (ms)
delay_dur
Stimulus delivered after delay (ms)
ref_stim_type
Reference stimulus type (‘single_pulse’, …)
ref_freq
Reference stimulus frequency (Hz)
ref_amp
Reference stimulus amplitude (uA)
ref_amp_factor
Reference stimulus amplitude factor (xThreshold)
ref_pulse_dur
Reference stimulus pulse duration (ms)
ref_interphase_dur
Reference stimulus interphase gap (ms)
theta
Temporal model output at threshold (a.u.)
source
Figure from which data was extracted
Some stimulus types require a reference stimulus. For example, ‘bursting_triplets_supra’ were delivered at 2x or 3x threshold of a reference bursting triplet pulse. The parameters of the reference stimulus are given in ref_* fields.
Missing values are denoted by NaN.
Added in version 0.6.
- Parameters:
subjects (str | list of strings | None, optional) – Select data from a subject or list of subjects. By default, all subjects are selected.
electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
stim_types (str | list of strings | None, optional) – Select data from a single stimulus type or a list of stimulus types. By default, all stimulus types are selected.
shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- Returns:
data (pd.DataFrame) – The whole dataset is returned in a 552x21 Pandas DataFrame
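The selection parameters shared by the loaders (a single string, a list of strings, or None for everything) and the shuffle and random_state options can be mimicked on a toy DataFrame. This is a sketch of the semantics only: `select_rows` is a hypothetical helper, the toy rows are made up for illustration, and the index reset after filtering is an assumption.

```python
import pandas as pd


def select_rows(df, shuffle=False, random_state=0, **filters):
    """Hypothetical helper: keep rows whose columns match the given values."""
    for column, wanted in filters.items():
        if wanted is None:
            continue  # None means "select everything" for this column
        if isinstance(wanted, str):
            wanted = [wanted]  # treat a single string like a one-item list
        df = df[df[column].isin(wanted)]
    if shuffle:
        # frac=1 returns all rows in random order; an int seed makes it repeatable
        df = df.sample(frac=1, random_state=random_state)
    return df.reset_index(drop=True)


# Made-up rows using column names from the feature list above
toy = pd.DataFrame({
    "subject": ["S05", "S05", "S06", "S06"],
    "electrode": ["A1", "C3", "A1", "B2"],
    "stim_type": ["single_pulse", "latent_addition",
                  "single_pulse", "fixed_duration"],
})

subset = select_rows(toy, subject="S05",
                     stim_type=["single_pulse", "latent_addition"])
```

Passing the same integer as `random_state` on repeated calls yields the same row order, which matches the reproducibility note in the parameter description.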
- pulse2percept.datasets.load_nanduri2012(electrodes=None, task=None, shuffle=False, random_state=0)[source]
Load data from [Nanduri2012]
Load the brightness and size rating data described in [Nanduri2012]. Datapoints were extracted from figure 4 of the paper using WebplotDigitizer.
Retinal implants:
Argus I
Subjects:
1
Number of samples:
128
Number of features:
17
The dataset includes the following features:
subject
Subject ID, S06
implant
Argus I
electrode
Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
task
‘rate’ or ‘size’
stim_class
“Nanduri2012”
stim_dur
Stimulus duration (ms)
freq
Stimulus frequency (Hz)
amp_factor
Stimulus amplitude ratio over threshold
brightness
Patient rated brightness compared to reference stimulus
size
Patient rated size compared to reference stimulus
ref_stim_class
“Nanduri2012”
ref_amp_factor
Amplitude factor (xTh) of reference pulse
ref_freq
Frequency (Hz) of reference pulse
pulse_dur
Pulse duration (ms)
pulse_type
‘cathodicFirst’
interphase_dur
Interphase gap (ms)
varied_param
Whether this trial is a part of ‘amp’ or ‘freq’ modulation
Added in version 0.7.
- Parameters:
electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- Returns:
data (pd.DataFrame) – The whole dataset is returned in a 128x17 Pandas DataFrame
- pulse2percept.datasets.load_perezfornos2012(shuffle=False, subjects=None, figures=None, random_state=0)[source]
Load data from [PerezFornos2012]
Load the brightness associated with joystick position data described in [PerezFornos2012]. Datapoints were extracted from Figures 3-7 of the paper.
Retinal implants:
Argus II
Subjects:
9
Number of samples:
45
Number of features:
158
The dataset includes the following features:
Subject
Subject ID
Figure
Reference figure from [PerezFornos2012]
time_series
Numpy array of the time series data associated with the figure. Note: This was generated by linear interpolation from [-2.0, 75.5] in steps of 0.5
Added in version 0.8.
- Parameters:
shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
figures (str | list of strings | None, optional) – Select data from a single figure or a list of figures. By default, all figures are selected.
- Returns:
data (pd.DataFrame) – The whole dataset is returned in a 45x158 Pandas DataFrame
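The interpolation grid mentioned in the time_series note can be reproduced with NumPy. This is a sketch, not the code used to build the dataset: the digitized points `raw_t` and `raw_y` are hypothetical values chosen for illustration.

```python
import numpy as np

# Regular grid from -2.0 to 75.5 (inclusive) in steps of 0.5
start, stop, step = -2.0, 75.5, 0.5
time_grid = np.arange(start, stop + step, step)

# Hypothetical digitized points (time, value), as might be read off a figure
raw_t = np.array([-2.0, 0.0, 10.0, 75.5])
raw_y = np.array([0.0, 0.0, 10.0, 2.0])

# Linear interpolation of the raw points onto the regular grid
resampled = np.interp(time_grid, raw_t, raw_y)

print(len(time_grid))  # number of points on the grid
```

The grid has 156 points. Together with the Subject and Figure columns, that would be consistent with the 158 features listed above, assuming each time point is stored as its own column.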
- pulse2percept.datasets.load_greenwald2009(subjects=None, electrodes=None, shuffle=False, random_state=0)[source]
Load data from [Greenwald2009]
Load the brightness and size rating data described in [Greenwald2009]. Datapoints were extracted from figure 4 of the paper using WebplotDigitizer.
Retinal implants:
Argus I
Subjects:
2
Number of samples:
83
Number of features:
12
The dataset includes the following features:
subject
Subject ID, S06
implant
Argus I
electrode
Electrode ID, A2, A4, B1, C1, C4, D2, D3, D4
task
‘rate’
stim_class
“Greenwald2009”
stim_dur
Stimulus duration (ms)
amp
Amplitude of the stimulation
brightness
Patient reported brightness
pulse_dur
Pulse duration (ms)
interphase_dur
Interphase gap (ms)
pulse_type
‘cathodicFirst’
threshold
Electrical stimulation threshold
Added in version 0.7.
- Parameters:
subjects (str | list of strings | None, optional) – Select data from a single subject or a list of subjects. By default, all subjects are selected.
electrodes (str | list of strings | None, optional) – Select data from a single electrode or a list of electrodes. By default, all electrodes are selected.
shuffle (boolean, optional) – If True, the rows of the DataFrame are shuffled.
random_state (int | numpy.random.RandomState | None, optional) – Determines random number generation for dataset shuffling. Pass an int for reproducible output across multiple function calls.
- Returns:
data (pd.DataFrame) – The whole dataset is returned in an 83x12 Pandas DataFrame