dataset module

This module defines tools for importing data from a dataset.

Data path functions

inpystem.dataset.read_data_path()

Read the saved data folder path.

The inpystem library proposes storing all data in a single directory together with the associated configuration files. The path to this folder is saved by inpystem. Use this function to read that path.

If no data path is saved, the function returns None. Otherwise, it returns the path.

Returns

None if no path is saved, otherwise the data path.

Return type

None, str
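Because the return value may be None, callers usually need a fallback. The following is a minimal sketch, not part of inpystem: the helper name data_path_or_default and the fallback folder ./data are hypothetical, and the reader argument stands in for read_data_path so the helper can be used without inpystem installed.

```python
from pathlib import Path
from typing import Callable, Optional


def data_path_or_default(reader: Callable[[], Optional[str]], default: str) -> Path:
    """Return the saved data path, or `default` when no path is saved.

    `reader` is any callable following the documented read_data_path
    contract: it returns None when nothing is saved and a str otherwise.
    """
    saved = reader()
    return Path(default) if saved is None else Path(saved)


def show_data_path() -> None:
    """Print the effective data folder (requires inpystem)."""
    # Imported lazily so the helper above stays usable without inpystem.
    from inpystem.dataset import read_data_path

    # "./data" is a hypothetical fallback chosen for this example.
    print(data_path_or_default(read_data_path, "./data"))
```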

inpystem.dataset.set_data_path(path)

Set the saved data folder path.

The inpystem library proposes storing all data in a single directory together with the associated configuration files. The path to this folder is saved by inpystem. Use this function to set that path.

A boolean is returned to confirm that the change is effective.

Parameters

path (str) – The desired data path.

Returns

True if the data path was actually changed, False otherwise.

Return type

bool
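Since the function reports failure through its return value rather than an exception, it can be worth checking that value explicitly. A minimal sketch, assuming a hypothetical wrapper set_data_path_checked (not part of inpystem) that turns a False result into an error:

```python
from typing import Callable


def set_data_path_checked(setter: Callable[[str], bool], path: str) -> str:
    """Call `setter(path)` and raise if it reports that nothing changed.

    `setter` is any callable following the documented set_data_path
    contract: it returns True when the path was changed, False otherwise.
    """
    changed = setter(path)
    if not changed:
        raise RuntimeError(f"data path was not changed to {path!r}")
    return path


def configure(path: str) -> None:
    """Set the inpystem data folder and fail loudly (requires inpystem)."""
    # Imported lazily so the wrapper above stays usable without inpystem.
    from inpystem.dataset import set_data_path

    set_data_path_checked(set_data_path, path)
```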

Load functions

inpystem.dataset.load_file(file, ndim, scan_ratio=None, scan_seed=None, dev=None, verbose=True)

This function loads a STEM acquisition from the path to a .conf configuration file.

The number of dimensions ndim should also be given.

The Path is generated from a scan file given in the configuration file or is randomly drawn. In either case, the Scan object ratio property can be set through the scan_ratio argument. Additionally, when no file is provided for the scan pattern, use the scan_seed argument to get reproducible data.

The function allows the user to ask for development data by setting the dev argument. If dev is None, then the usual Stem2D and Stem3D classes are returned. If dev is a dictionary, then Dev2D and Dev3D classes are returned. This dictionary can contain additional class arguments such as:

  • snr, seed and normalized for Dev2D,

  • snr, seed, normalized, PCA_transformed and PCA_th for Dev3D.
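The key lists above can be turned into a quick sanity check before loading. This is only a sketch under assumptions: the helper unlisted_dev_keys is hypothetical, the lists above are introduced with "such as" so the real constructors may accept more keys, and the values shown are illustrative because the text gives no types or valid ranges.

```python
from typing import Any, Dict, Optional, Set

# Keys listed in the text above; the real constructors may accept more.
LISTED_DEV_KEYS: Dict[int, Set[str]] = {
    2: {"snr", "seed", "normalized"},
    3: {"snr", "seed", "normalized", "PCA_transformed", "PCA_th"},
}


def unlisted_dev_keys(dev: Optional[Dict[str, Any]], ndim: int) -> Set[str]:
    """Return the keys of `dev` that are not listed for this `ndim`."""
    if ndim not in LISTED_DEV_KEYS:
        raise ValueError(f"ndim should be 2 or 3, got {ndim}")
    if dev is None:
        # None means usual (non-development) data, so nothing to check.
        return set()
    return set(dev) - LISTED_DEV_KEYS[ndim]


# Illustrative values only; they are assumptions, not documented defaults.
dev_2d = {"snr": 50, "seed": 0, "normalized": True}
dev_mixed = {"snr": 50, "PCA_transformed": True}
```

A non-empty result is a hint that a key may belong to the other class (for instance PCA_transformed passed with ndim=2), not proof that the key is invalid.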

Parameters
  • file (str) – The configuration file path.

  • ndim (int) – The data dimension. Should be 2 or 3.

  • scan_ratio (optional, None, float) – The Path object ratio. Default is None for full sampling.

  • scan_seed (optional, None, int) – The seed in case of random scan initialization. Default is None for random seed.

  • dev (optional, None, dictionary) – This argument allows the user to ask for development data. If this is None, usual data is returned. If this argument is a dictionary, then development data will be returned and the dictionary will be given to the data constructors. Default is None for usual data.

  • verbose (optional, bool) – If True, information will be sent to standard output. Default is True.

Returns

The inpystem data.

Return type

Stem2D, Stem3D, Dev2D, Dev3D

Todo

Maybe enable PCA_th in config file for 3D data.
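A call to load_file could look like the sketch below. The helper names load_file_kwargs and load_subsampled_2d are hypothetical, and the value 0.5 assumes that scan_ratio is a fraction of sampled positions, which the text above does not state explicitly.

```python
from typing import Any, Dict, Optional


def load_file_kwargs(ndim: int,
                     scan_ratio: Optional[float] = None,
                     scan_seed: Optional[int] = None,
                     dev: Optional[Dict[str, Any]] = None,
                     verbose: bool = True) -> Dict[str, Any]:
    """Collect load_file keyword arguments, rejecting an invalid ndim early."""
    if ndim not in (2, 3):
        raise ValueError(f"ndim should be 2 or 3, got {ndim}")
    return {
        "ndim": ndim,
        "scan_ratio": scan_ratio,   # None means full sampling
        "scan_seed": scan_seed,     # None means a random seed
        "dev": dev,                 # None means usual data
        "verbose": verbose,
    }


def load_subsampled_2d(conf_path: str):
    """Load 2D data with a seeded random scan (requires inpystem)."""
    # Imported lazily so the helper above stays usable without inpystem.
    from inpystem.dataset import load_file

    # 0.5 is an assumed, illustrative ratio; the seed makes the draw repeatable.
    kwargs = load_file_kwargs(ndim=2, scan_ratio=0.5, scan_seed=0, verbose=False)
    return load_file(conf_path, **kwargs)
```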

inpystem.dataset.load_key(key, ndim, scan_ratio=None, scan_seed=None, dev=None, verbose=True)

This function loads a STEM acquisition based on a key.

A key is a string that identifies a data set through its configuration file. The key should always be the name of the configuration file without the suffix (.conf). As an example, if a configuration file located in the data folder is named my-sample.conf, then its data could be loaded with the my-sample key.
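The naming rule can be expressed with the standard library alone. A minimal sketch, where key_from_conf is a hypothetical helper and not an inpystem function:

```python
from pathlib import Path


def key_from_conf(conf_path: str) -> str:
    """Return the key for a configuration file: its name without `.conf`."""
    path = Path(conf_path)
    if path.suffix != ".conf":
        raise ValueError(f"expected a .conf file, got {path.name!r}")
    # Path.stem drops the directory part and the final suffix.
    return path.stem


key = key_from_conf("my-sample.conf")  # "my-sample"
```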

The number of dimensions ndim should also be given.

The Path is generated from a scan file given in the configuration file or is randomly drawn. In either case, the Scan object ratio property can be set through the scan_ratio argument. Additionally, when no file is provided for the scan pattern, use the scan_seed argument to get reproducible data.

The function allows the user to ask for development data by setting the dev argument. If dev is None, then the usual Stem2D and Stem3D classes are returned. If dev is a dictionary, then Dev2D and Dev3D classes are returned. This dictionary can contain additional class arguments such as:

  • snr, seed, normalized and verbose for Dev2D,

  • snr, seed, normalized, PCA_transformed, PCA_th and verbose for Dev3D.

This function only locates the configuration file and then calls the load_file function.
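One way such a lookup could work is sketched below. This is an assumption for illustration, not the library's actual search: the helper find_conf is hypothetical, and whether inpystem searches sub-folders of the data folder is not stated here.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_conf(data_dir: str, key: str) -> Optional[Path]:
    """Return the first `<key>.conf` found under `data_dir`, or None.

    The recursive search is an assumption made for this sketch.
    """
    matches = sorted(Path(data_dir).rglob(f"{key}.conf"))
    return matches[0] if matches else None


# Small self-check on a throwaway folder with a hypothetical layout.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "session-1" / "my-sample.conf"
    conf.parent.mkdir()
    conf.touch()
    found = find_conf(tmp, "my-sample")
    missing = find_conf(tmp, "other-sample")
    found_name = found.name if found is not None else None
```

The path returned by such a helper would then be passed as the file argument of load_file.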

Parameters
  • key (str) – The data key.

  • ndim (int) – The data dimension. Should be 2 or 3.

  • scan_ratio (optional, None, float) – The Path object ratio. Default is None for full sampling.

  • scan_seed (optional, None, int) – The seed in case of random scan initialization. Default is None for random seed.

  • dev (optional, None, dictionary) – This argument allows the user to ask for development data. If this is None, usual data is returned. If this argument is a dictionary, then development data will be returned and the dictionary will be given to the data constructors. Default is None for usual data.

  • verbose (optional, bool) – If True, information will be sent to standard output. Default is True.

Returns

The inpystem data.

Return type

Stem2D, Stem3D, Dev2D, Dev3D
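To compare usual and development data for the same acquisition, both can be loaded with identical scan settings. A minimal sketch: load_usual_and_dev and load_my_sample_3d are hypothetical helpers, the loader argument stands in for load_key, and the dev values are illustrative assumptions.

```python
from typing import Any, Callable, Dict, Tuple


def load_usual_and_dev(loader: Callable[..., Any], key: str, ndim: int,
                       dev: Dict[str, Any]) -> Tuple[Any, Any]:
    """Load the same acquisition twice: usual data, then development data."""
    # The same fixed scan_seed is passed to both calls so that a randomly
    # drawn scan is initialized identically each time.
    common = {"ndim": ndim, "scan_seed": 0, "verbose": False}
    usual = loader(key, dev=None, **common)
    development = loader(key, dev=dev, **common)
    return usual, development


def load_my_sample_3d():
    """Load the my-sample key as Stem3D and Dev3D (requires inpystem)."""
    # Imported lazily so the helper above stays usable without inpystem.
    from inpystem.dataset import load_key

    # Illustrative dev values; they are assumptions, not documented defaults.
    dev = {"snr": 50, "seed": 0, "normalized": True}
    return load_usual_and_dev(load_key, "my-sample", ndim=3, dev=dev)
```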