Initializing the data

How to initialize the scan pattern

The scan pattern can be initialized in three ways: from the data shape alone, from a file, or as a random sampling.

Recall that a scan initialized with the data shape only follows a full raster (i.e. line-by-line) path.

Recall also that all scan initialization functions accept a ratio argument (see The first basic object is the scan).
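
As a rough illustration, a full raster path can be pictured as the flat pixel indices 0, 1, ..., m*n - 1 visited in order. The sketch below assumes that a path stores row-major flat indices (consistent with the np.random.permutation(m*n) example used later) and that a ratio keeps the leading fraction of the path. Both are assumptions about the convention, not a description of inpystem's internals.

```python
import numpy as np

m, n = 3, 4                       # small spatial shape (rows, columns)
raster_path = np.arange(m * n)    # assumed raster path: flat indices in row-major order

# Convert the flat indices back to (row, column) pairs to see the visiting order.
rows, cols = np.unravel_index(raster_path, (m, n))
visited = list(zip(rows.tolist(), cols.tolist()))

# Assumed effect of ratio=0.5: only the first half of the path is kept.
ratio = 0.5
partial_path = raster_path[:int(round(ratio * m * n))]
```

Here visited starts with the four pixels of the first row before moving to the second row, which is what line-by-line scanning means.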

Initialize it with a file

The scan pattern can be initialized with a numpy .npz file, which should store:

  • m (resp. n), the number of rows (resp. columns) of the data,

  • path, the sampling path (the path argument of Scan).

The file is then loaded with the from_file() method of Scan.

>>> import numpy as np
>>> import inpystem
>>> m, n = 50, 100
>>> path = np.random.permutation(m*n)
>>> data_2_save = {'m': m, 'n': n, 'path': path}
>>> np.savez('my_scan.npz', **data_2_save)  # This saves the Scan numpy file

>>> inpystem.Scan.from_file('my_scan.npz', ratio=0.5) # This loads the numpy scan file.
<Scan, shape: (50, 100), ratio: 0.500>
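
To check what a scan file contains before handing it to from_file(), the archive can be reopened with np.load. This minimal sketch uses NumPy only and writes to a temporary directory. The check that path is a permutation applies to this particular example and is not stated to be a requirement of inpystem.

```python
import os
import tempfile
import numpy as np

m, n = 50, 100
path = np.random.permutation(m * n)

# Write the scan file into a temporary directory to keep the example self-contained.
tmp_dir = tempfile.mkdtemp()
scan_file = os.path.join(tmp_dir, 'my_scan.npz')
np.savez(scan_file, m=m, n=n, path=path)

# Reload the archive and read back the three entries described above.
with np.load(scan_file) as archive:
    keys = sorted(archive.files)
    loaded_m, loaded_n = int(archive['m']), int(archive['n'])
    loaded_path = archive['path']

# In this example, every pixel index appears exactly once in the path.
is_permutation = np.array_equal(np.sort(loaded_path), np.arange(m * n))
```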

Initialize it as random sampling

Finally, the scan can be initialized with the random() method of Scan. Only the spatial data shape (m, n) is required. In addition to the ratio argument, a seed can be given to the method to make the results reproducible.

>>> inpystem.Scan.random((50, 100))
<Scan, shape: (50, 100), ratio: 1.000>
>>> scan = inpystem.Scan.random((50, 100), ratio=0.2)
>>> scan
<Scan, shape: (50, 100), ratio: 0.200>
>>> scan.path[:5]
array([4071,  662, 4168, 3787, 4584])

>>> scan = inpystem.Scan.random((50, 100), ratio=0.2, seed=0)
>>> scan.path[:5]
array([ 398, 3833, 4836, 4572,  636])
>>> scan = inpystem.Scan.random((50, 100), ratio=0.2, seed=0)
>>> scan.path[:5]  # This shows that setting the seed makes the results reproducible.
array([ 398, 3833, 4836, 4572,  636])
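
One way to picture what a seeded random scan does is to shuffle the flat pixel indices with a seeded generator and keep the leading fraction. The helper random_path below is hypothetical and written with NumPy only. It is not inpystem's implementation, so its values will not match the paths printed above.

```python
import numpy as np

def random_path(shape, ratio=1.0, seed=None):
    """Hypothetical helper: random visiting order over the pixels of `shape`."""
    m, n = shape
    rng = np.random.default_rng(seed)       # seed=None gives a different draw each run
    full_path = rng.permutation(m * n)      # every flat pixel index exactly once
    n_kept = int(round(ratio * m * n))      # number of pixels actually sampled
    return full_path[:n_kept]

# Two calls with the same seed give the same path.
path_a = random_path((50, 100), ratio=0.2, seed=0)
path_b = random_path((50, 100), ratio=0.2, seed=0)
```

With ratio=0.2 on a 50 x 100 grid, the path holds 1000 distinct pixel indices.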

Construct inpystem data manually

As explained in The result is inpystem data, inpystem data are made of a Scan object, which defines the sampling pattern, and a HyperSpy signal, which stores the data. Once both have been defined, the inpystem structure can be built by hand.

>>> inpystem_data = inpystem.Stem2D(hsdata, scan=scan_object)

Construct inpystem data from a Numpy array

If your image is a numpy array, the HyperSpy data should be defined before creating the inpystem data.

>>> import numpy as np
>>> import hyperspy.api as hs
>>> import inpystem
>>> shape = (50, 100, 1500)                 # This is the 3D data shape
>>> im = np.ones(shape)                     # This is our image (which is 3D this time).
>>> scan = inpystem.Scan.random(shape[:2])    # The scan is created (be careful to have 2-tuple shape).
>>> hsdata = hs.signals.Signal1D(im)        # Here, hs data is created from numpy array.
>>> inpystem.Stem3D(hsdata, scan)
<Stem3D, title: , dimensions: (100, 50|1500), sampling ratio: 1.00>

The problem here, as with any numpy-based HyperSpy data, is that both axes_manager and metadata are empty. To correct that, it is highly recommended to use a configuration file, which is the subject of the next section.

Construct inpystem data from a configuration file

As explained in Loading your data is faster, inpystem can load data from a .conf configuration file. Such a file is loaded with the load_file() function (or with the load_key() function if the configuration file is in the data path). The configuration file gives inpystem all the important information.

First, the configuration file is divided into three main sections (caution: section names are case-sensitive):

  • 2D DATA for 2D data,

  • 3D DATA for 3D data,

  • SCAN for the scan pattern.

Among these sections, only one of the 2D DATA and 3D DATA sections is required (if no data is given, inpystem cannot do anything). Inside this section, the only required key is file, which specifies the location of the data file (numpy .npy, .dm4, or any other format supported by HyperSpy) relative to the configuration file. Note that, contrary to section names, keys are not case-sensitive.

If no file key is given inside a SCAN section, the load_file() function automatically creates a random scan object (based on its scan_ratio and scan_seed arguments). Otherwise, a scan file (numpy or dm4/dm3) is loaded (the scan_ratio argument of load_file() can still be given).

Hence, a basic configuration file could look like this.

#
# This is a demo file.
# This text is ignored: it is a comment.
#

[3D DATA]
# This section defines all info about 3D data
File = eels_data.dm4

[SCAN]
# This section defines all info about scan pattern

# If the following line is commented out, the scan pattern is random.
FILE = scan.dm4
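
The case rules described for sections and keys match the behaviour of Python's standard configparser module. The following sketch assumes the .conf format is plain INI syntax that configparser can read (the text does not say which parser inpystem uses) and parses the demo file from a string. The path /some/dir/demo.conf is a hypothetical location, used only to show how a relative file entry would be resolved.

```python
import configparser
from pathlib import Path

conf_text = """
[3D DATA]
# This section defines all info about 3D data
File = eels_data.dm4

[SCAN]
FILE = scan.dm4
"""

config = configparser.ConfigParser()
config.read_string(conf_text)

# Section names are matched exactly, key names are lower-cased on reading.
has_upper = config.has_section('3D DATA')
has_lower = config.has_section('3d data')
data_file = config['3D DATA']['file']      # written as "File" in the text
scan_file = config['SCAN']['file']         # written as "FILE" in the text

# Hypothetical location of the configuration file, used to resolve the relative path.
conf_path = Path('/some/dir/demo.conf')
data_path = conf_path.parent / data_file
```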

In the special case where the data file is a numpy .npy file, additional information can be given to fill the HyperSpy axes_manager attribute. To that end, a set of keys can be added to the corresponding data section. These keys are of the form axis_dim_info where:

  • dim is the axis index (0 for the x axis, 1 for the y axis and, for 3D data, 2 for the spectrum axis),

  • info is one of name, scale, unit or offset.

As an example, the axes_manager of the data from the previous section looks like this.

>>> data = inpystem.Stem3D(hsdata, scan)
Creating STEM acquisition...

>>> data.hsdata.axes_manager
<Axes manager, axes: (100, 50|1500)>
            Name |   size |  index |  offset |   scale |  units
================ | ====== | ====== | ======= | ======= | ======
     <undefined> |    100 |      0 |       0 |       1 | <undefined>
     <undefined> |     50 |      0 |       0 |       1 | <undefined>
---------------- | ------ | ------ | ------- | ------- | ------
     <undefined> |   1500 |        |       0 |       1 | <undefined>

If the numpy array is saved inside a directory together with the following configuration file, this issue is fixed.

#
# This is a demo file to define Numpy data axes_manager.
#

[3D DATA]
file = numpy_data.npy

# Information for the axes_manager
axis_0_name = x
axis_1_name = y
axis_2_name = Energy loss

# Some more info for the energy loss axis
axis_2_offset = 4.6e+02
axis_2_scale = 0.32
axis_2_unit = eV

# No scan section, I want a random scan.

The data would then be loaded simply by typing this.

>>> inpystem.load_file('my-nice-file.conf', scan_ratio=0.5, scan_seed=0)
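
As an illustration of the axis_dim_info naming scheme, the sketch below reads the axis keys of the demo file with Python's standard configparser and groups them by axis index. It assumes plain INI syntax and is not inpystem's own loading code. The regular expression and the axes dictionary are illustrative choices.

```python
import configparser
import re

conf_text = """
[3D DATA]
file = numpy_data.npy
axis_0_name = x
axis_1_name = y
axis_2_name = Energy loss
axis_2_offset = 4.6e+02
axis_2_scale = 0.32
axis_2_unit = eV
"""

config = configparser.ConfigParser()
config.read_string(conf_text)

# Keys look like axis_<dim>_<info>, with info in name, scale, unit, offset.
axis_key = re.compile(r'^axis_(\d+)_(name|scale|unit|offset)$')
axes = {}                                   # {axis index: {info: value}}
for key, value in config['3D DATA'].items():
    match = axis_key.match(key)
    if match is None:
        continue                            # e.g. the "file" key
    dim, info = int(match.group(1)), match.group(2)
    if info in ('scale', 'offset'):
        value = float(value)                # numeric fields
    axes.setdefault(dim, {})[info] = value
```

After this loop, axes[2] gathers the name, offset, scale and unit of the energy loss axis, while axes[0] and axes[1] only hold a name.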

Some example data for fast testing

Some toy data are available for testing. Because of their size, they are not shipped inside the package itself. Please download them from the GitHub project page, under DATA/, and copy them to your data path (see Loading your data is faster). These data can then be loaded with the load_key() function.

The three example datasets are identified by the following keys:

  • 'HR-sample': this is a real atomic-scale HAADF/EELS sample,

  • 'HR-synth': this is a synthetic EELS image generated to be similar to 'HR-sample',

  • 'LR-synth': this is a synthetic low-resolution EELS image.

The first dataset was acquired in the context of the following works: [AZWT+19], [APLML+18]. The authors of these works would like to acknowledge Daniele Preziosi for the LAO-NNO thin film growth, Alexandre Gloter for the FIB lamella preparation and Xiaoyan Li for STEM experiments.

The last two datasets were generated to compare reconstruction methods in the context of STEM-EELS data inpainting [AMonierOberlinBrun+18]. The high-resolution works have been submitted.

References

APLML+18

Daniele Preziosi, Laura Lopez-Mir, Xiaoyan Li, Tom Cornelissen, Jin Hong Lee, Felix Trier, Karim Bouzehouane, Sergio Valencia, Alexandre Gloter, Agnès Barthélémy, and Manuel Bibes. Direct mapping of phase separation across the metal–insulator transition of NdNiO3. Nano Letters, 18(4):2226–2232, 2018. doi:10.1021/acs.nanolett.7b04728.

AZWT+19

Alberto Zobelli, Steffi Y. Woo, Luiz H. G. Tizei, Nathalie Brun, Anna Tararan, Xiaoyan Li, Odile Stéphan, Mathieu Kociak, and Marcel Tencé. Spatial and spectral dynamics in STEM hyperspectral imaging using random scan patterns. arXiv preprint arXiv:1909.07842, 2019.

AMonierOberlinBrun+18

É. Monier, T. Oberlin, N. Brun, M. Tencé, M. de Frutos, and N. Dobigeon. Reconstruction of partially sampled multiband images—application to STEM-EELS imaging. IEEE Trans. Comput. Imag., 4(4):585–598, Dec. 2018.