Welcome to inpystem’s documentation!¶
inpystem User Guide¶
Introduction¶
What is inpystem¶
inpystem is an open-source Python library which provides tools to reconstruct partially sampled 2D images as well as multi-band images.
inpystem's core is a set of reconstruction techniques, such as interpolation, regularized least-square and dictionary learning methods. It provides a user interface which simplifies the use of these techniques.
inpystem is mainly intended for the microscopy community, so it relies heavily on the excellent HyperSpy library.
This library was originally developed by its creator Etienne Monier to handle EELS data and develop reconstruction algorithms. It was afterwards offered to the microscopy community as a tool.
About the developer¶
This library is developed by Etienne Monier, a French PhD student.
His research interests are new methods and algorithms to solve challenging problems encountered in the acquisition of multi-band microscopy images. In particular, he is interested in acquiring high-SNR Electron Energy Loss Spectroscopy (EELS) images with extremely low beam energy to prevent the destruction of sensitive microscopy samples. His goal is to provide the EELS community with precise algorithms to rapidly detect the presence of a chemical element inside a sample, so as to reduce irradiation as much as possible. Such problems require signal processing methods, such as convex optimization and proximal splitting methods.
Installing inpystem¶
With pip¶
inpystem is listed in the Python Package Index. Therefore, it can be automatically downloaded and installed with pip. You may need to install pip for the following commands to run.
Install with pip:
$ pip install inpystem
Install from source¶
When installing manually, be sure that all dependencies are installed. For example, do:
$ pip install numpy scipy matplotlib hyperspy scikit-learn scikit-image
Be aware that the scikit-image package version should be 0.16 or above.
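If needed, this constraint can be enforced directly with pip:
$ pip install 'scikit-image>=0.16'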
Released version¶
To install from source, grab a tar.gz release from the Python Package Index (https://pypi.org/) and, if you are a Linux/Mac user, use the following commands:
$ tar -xzf inpystem.tar.gz
$ cd inpystem
$ python setup.py install
You can also use a Python installer, e.g.
$ pip install inpystem.tar.gz
Development version¶
To get the development version from our git repository you need to install git. Then just do:
$ git clone https://github.com/etienne-monier/inpystem
$ cd inpystem
Then, perform one of the following commands:
$ pip install -e .
$ python setup.py install
Getting started¶
In this documentation, we will assume that the reader is comfortable with Python, either on the command line or in a Jupyter notebook.
Starting inpystem in python¶
inpystem can be imported in python just as any python package.
>>> import inpystem
In addition to inpystem, the HyperSpy library is required to construct inpystem objects, as some arguments must be HyperSpy data.
>>> import hyperspy.api as hs
Last, note that most scientific Python applications require libraries such as numpy and matplotlib. It is recommended to import them as well.
>>> import numpy as np
>>> import matplotlib.pyplot as plt
The base: What is a STEM acquisition here?¶
A STEM acquisition here combines two main pieces of information:
The scan pattern, which is basically the array of the visited pixel indexes,
The data, which are the values acquired at the sampled pixels.
Be aware that the inpystem library can handle 2D data (e.g. HAADF data) as well as 3D data (e.g. EELS data). The data acquired at a spatial position (or pixel) can then be a single value (for 2D) or a spectrum (for 3D).
The first basic object is the scan¶
The basic object to understand is the Scan class. This object stores the scan pattern. To create one, the data spatial shape and the path indexes should be given.
>>> shape = (3, 4) # This is the spatial shape of the data : 3 rows, 4 columns.
>>> path = [ 1, 3, 6, 4, 7, 11, 8, 0] # These are the sampled pixels indexes.
>>> scan = inpystem.Scan(shape, path)
>>> scan
<Scan, shape: (3, 4), ratio: 0.667>
Note
Note that in this documentation, matrix indexes are denoted in row-major order (which is the order chosen in Python). It means that if the spatial dimensions are (M, N), then the i-th index refers to the pixel whose coordinates are (i // N, i % N). In the above example, the first visited pixel is the second pixel of the first line.
Be careful also that Python coordinates begin at 0, not 1.
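For instance, the conversion between a flat index and its (row, column) coordinates can be checked by hand or with numpy (a small sketch; np.unravel_index is a standard numpy helper, not an inpystem function):
>>> import numpy as np
>>> N = 4  # number of columns in the (3, 4) example above
>>> i = 1  # first visited pixel of the example path
>>> (i // N, i % N)
(0, 1)
>>> np.unravel_index(i, (3, 4))  # numpy equivalent
(0, 1)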
Additionally, the Scan object allows you to select a fraction of the scan pattern. Let us consider a fully sampled acquisition whose spatial shape is (M, N); then only 10% of the pixels can be selected by setting the ratio argument to 0.1. This argument can be modified after the object definition.
Though, be careful in case the acquisition is partially sampled (for example, if only r*M*N pixels were acquired, with r below 1). In such a case, if ratio is below r, then only ratio*M*N pixels are kept. If ratio is above r, then all r*M*N pixels are kept.
>>> scan
<Scan, shape: (3, 4), ratio: 0.667>
>>> scan.ratio = 0.8 # Here, we ask for a ratio which is higher than 0.667.
WARNING:root:Input ratio is higher than higher maximal ratio (0.667). Ratio is set to the maximal value.
>>> scan.ratio = 0.5 # Here, the value is correct.
>>> scan
<Scan, shape: (3, 4), ratio: 0.500>
>>> scan = inpystem.Scan(shape, path, ratio=0.5) # The ratio can be given at initialization.
>>> scan
<Scan, shape: (3, 4), ratio: 0.500>
>>> scan.ratio = 0.667 # But don't worry, the additional visited pixels are not lost.
>>> scan
<Scan, shape: (3, 4), ratio: 0.667>
In fact, the pixels that are given at initialization of scan are not lost when a lower ratio is given: the currently visited indexes are stored in the path attribute, while the path_0 attribute keeps all the pixels given at initialization.
>>> scan = inpystem.Scan(shape, path, ratio=0.5) # The ratio can be given at initialization.
>>> scan
<Scan, shape: (3, 4), ratio: 0.500>
>>> scan.path
array([ 1,  3,  6,  4,  7, 11])
>>> scan.path_0
array([ 1, 3, 6, 4, 7, 11, 8, 0])
>>> scan.ratio
0.5
Note
The Scan object data can be represented with a sampling mask \(\mathbf{M}\) defined as
\[\mathbf{M}_i = \begin{cases} 1, & \text{if pixel \# $i$ is acquired}\\ 0, & \text{otherwise}\\ \end{cases} \]
This representation suffers from information loss, but it is useful to study the distribution of the acquired pixels. This sampling mask, whose shape is the same as the spatial shape, can be obtained using the get_mask() method of Scan. It can also be plotted using the plot() method (see Data Visualization).
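As an illustration, here is a sketch of the mask for the example scan above, assuming the ratio is set back to 0.667 (so that the eight pixels 0, 1, 3, 4, 6, 7, 8 and 11 are sampled):
>>> scan.ratio = 0.667
>>> scan.get_mask()
array([[ True,  True, False,  True],
       [ True, False,  True,  True],
       [ True, False, False,  True]])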
The second basic object is data¶
Well, data here are nothing else than HyperSpy data. Please refer to the HyperSpy documentation for more information about it.
The result is inpystem data¶
As explained previously, inpystem data is the combination of a Scan object and HyperSpy data. Two classes are proposed to the user: Stem2D for 2D data and Stem3D for 3D data.
Both are initialized with a scan pattern and the associated data. Though, the scan pattern is optional, as the default scan pattern is a raster-scan (line-by-line) full sampling.
>>> import hyperspy.api as hs
>>> haadf_hs = hs.load('haadf_data.dm4')
>>> acquisition_1 = inpystem.Stem2D(haadf_hs) # fully sampled HAADF image.
>>> m, n = haadf_hs.data.shape
>>> N = int(0.5*m*n) # The number of pixels to visit.
>>> path = np.random.permutation(m*n)[:N]
>>> scan = inpystem.Scan((m, n), path)
>>> acquisition_2 = inpystem.Stem2D(haadf_hs, scan) # partially sampled HAADF image.
>>> eels_hs = hs.load('eels_data.dm4')
>>> acquisition_3 = inpystem.Stem3D(eels_hs) # fully sampled EELS image.
Loading your data is faster¶
inpystem offers a way to accelerate data definition. To that end, inpystem lets you set up a data directory (let's say /my/wonderful/data/dir/) and place your data inside it, so that the structure looks like this:
/my/wonderful/data/dir/
|
+-- MyData1
| |
| +-- eels_data.dm4
| +-- haadf_data.dm4
| +-- scan.dm4
| +-- MyData1.conf
|
+-- MyData2
|
+-- eels_data_2.dm4
+-- MyData2.conf
Note
The data directory is not set by default. You should use the set_data_path() function to set the path. It can then be read with the read_data_path() function.
>>> inpystem.set_data_path('/my/wonderful/data/dir/')
>>> inpystem.read_data_path()
'/my/wonderful/data/dir/'
The data directory contains sub-directories which host:
the data files (2D/3D data, scan pattern),
the configuration file (such as
MyData1.conf
in the above tree).
The configuration file has the structure of a .ini file and defines the relative location of the data files. It would look like the example below (be aware that section names such as 2D DATA are case-sensitive while keys such as file are not).
#
# This is a demo MyData1.conf file
#
[2D DATA]
# This section defines all info about 2D data
file = haadf_data.dm4
[3D DATA]
# This section defines all info about 3D data
File = eels_data.dm4
[SCAN]
# This section defines all info about scan pattern
FILE = scan.dm4
This file defines all that is necessary to define the inpystem data objects. To load the corresponding data, one should use the load_file() function, which loads the data based on the .conf configuration file. Alternatively, inpystem can load the mydata.conf data directly by using the load_key() function with the mydata key (as long as mydata.conf is located inside the data directory). The difference between the two functions? load_file() allows you to load a file which is not in the data directory.
In addition to the configuration file path, the user should specify which data to load with the ndim
argument (2 for 2D data and 3 for 3D data).
>>> inpystem.read_data_path()
'/my/wonderful/data/dir/'
>>> acquisition = inpystem.load_key('MyData1')
>>> acquisition = inpystem.load_file('/my/wonderful/data/dir/MyData2.conf', ndim=2)
Other arguments (such as the scan pattern ratio) can be passed to the two load functions. This will be seen later.
Note
From this point on, the examples can be tested directly on the command line, as long as the data path is set and the inpystem example data are downloaded and placed inside the data path. See Some example data for fast testing for more details.
What about restoration?¶
Well, everything was loaded and is ready for reconstruction. Let us consider that your acquisition was partially sampled with a ratio of 0.2. To use any reconstruction method, use the restore() method of inpystem objects.
The methods to reconstruct the data include nearest neighbor interpolation, regularized least-square and dictionary learning. Let's try with some example data (inpystem has three datasets that can be loaded easily; this is detailed in Some example data for fast testing).
>>> import inpystem
>>> data = inpystem.load_key('HR-sample', ndim=2, scan_ratio=0.2) # This loads example data.
Reading configuration file ...
Generating data ...
Creating STEM acquisition...
Correcting STEM acquisition...
>>> data
<Stem2D, title: HR-sample, dimensions: (|113, 63), sampling ratio: 0.20>
>>> reconstructed_data, info = data.restore('interpolation', parameters={'method': 'nearest'})
>>> reconstructed_data
<Signal2D, title: HR-sample, dimensions: (|113, 63)> # 2D hs data.
>>> info
{'time': 0.012229681015014648} # Execution time in sec.
Have a look at the reconstructed data, which is an HyperSpy signal. It means that the reconstructed data can be analyzed with HyperSpy tools. Additional information is returned in the info dictionary (for the nearest neighbor method, the only information returned is the execution time).
What about axes_manager and metadata information?¶
The initialization of Stem2D or Stem3D objects needs an HyperSpy image, which stores information about the axes (in the axes_manager attribute) and other general information (in the metadata attribute). This information is transferred to the reconstructed data.
>>> data.hsdata.metadata
├── General
│ ├── original_filename = spim4-2-df-manualy aligned image.dm4
│ └── title = HR-sample
└── Signal
├── Noise_properties
│ └── Variance_linear_model
│ ├── gain_factor = 1.0
│ └── gain_offset = 0.0
├── binned = False
├── quantity = Intensity
└── signal_type =
>>> data.hsdata.axes_manager
<Axes manager, axes: (|113, 63)>
Name | size | index | offset | scale | units
================ | ====== | ====== | ======= | ======= | ======
---------------- | ------ | ------ | ------- | ------- | ------
x | 113 | | -0 | 1 |
y | 63 | | -0 | 1 |
>>> reconstructed_data.metadata
├── General
│ ├── original_filename = spim4-2-df-manualy aligned image.dm4
│ └── title = HR-sample
└── Signal
├── Noise_properties
│ └── Variance_linear_model
│ ├── gain_factor = 1.0
│ └── gain_offset = 0.0
├── binned = False
├── quantity = Intensity
└── signal_type =
>>> reconstructed_data.axes_manager
<Axes manager, axes: (|113, 63)>
Name | size | index | offset | scale | units
================ | ====== | ====== | ======= | ======= | ======
---------------- | ------ | ------ | ------- | ------- | ------
x | 113 | | -0 | 1 |
y | 63 | | -0 | 1 |
Initializing the data¶
How to initialize the scan pattern¶
The scan pattern can be initialized using three recipes:
initialize it with the shape and path values (see The first basic object is the scan),
initialize it with a Numpy or HyperSpy file,
initialize it as random sampling.
Recall that a path which is initialized with the data shape only is set to be a full raster (i.e. line-by-line) scan.
Recall also that all scan initialization functions allow you to define a ratio argument (see The first basic object is the scan).
Initialize it with a file¶
The scan pattern can be initialized with a numpy .npz file which should store:
m (resp. n), which is the number of rows (resp. columns) of the data,
path, which is the path argument.
To that end, one should use the from_file() method of Scan.
>>> import numpy as np
>>> m, n = 50, 100
>>> path = np.random.permutation(m*n)
>>> data_2_save = {'m': m, 'n': n, 'path': path}
>>> np.savez('my_scan.npz', **data_2_save) # This saves the Scan numpy file
>>> inpystem.Scan.from_file('my_scan.npz', ratio=0.5) # This loads the numpy scan file.
<Scan, shape: (50, 100), ratio: 0.500>
Initialize it as random sampling¶
The sampling scan can finally be initialized with the random() method of Scan. One should just give the spatial data shape (m, n). In addition to the ratio argument, which can also be given, the user can provide a seed to the method to get reproducible results.
>>> inpystem.Scan.random((50, 100))
<Scan, shape: (50, 100), ratio: 1.000>
>>> scan = inpystem.Scan.random((50, 100), ratio=0.2)
>>> scan
<Scan, shape: (50, 100), ratio: 0.200>
>>> scan.path[:5]
array([4071, 662, 4168, 3787, 4584])
>>> scan = inpystem.Scan.random((50, 100), ratio=0.2, seed=0)
>>> scan.path[:5]
array([ 398, 3833, 4836, 4572, 636])
>>> scan = inpystem.Scan.random((50, 100), ratio=0.2, seed=0)
>>> scan.path[:5] # This shows that setting the seed makes the results reproducible.
array([ 398, 3833, 4836, 4572, 636])
Construct inpystem data manually¶
As explained in The result is inpystem data, inpystem data is composed of a Scan object, which defines the sampling pattern, and HyperSpy data, which stores the data. Once both have been defined, the inpystem structure can be defined by hand.
>>> inpystem_data = inpystem.Stem2D(hsdata, scan=scan_object)
Construct inpystem data from a Numpy array¶
In case your image is a numpy array, one should define the HyperSpy data before creating the inpystem data.
>>> import numpy as np
>>> import hyperspy.api as hs
>>> shape = (50, 100, 1500) # This is the 3D data shape
>>> im = np.ones(shape) # This is our image (which is 3D this time).
>>> scan = inpystem.Scan.random(shape[:2]) # The scan is created (be careful to have 2-tuple shape).
>>> hsdata = hs.signals.Signal1D(im) # Here, hs data is created from numpy array.
>>> inpystem.Stem3D(hsdata, scan)
<Stem3D, title: , dimensions: (100, 50|1500), sampling ratio: 1.00>
Well, the problem here, which is the same as for numpy-based HyperSpy data, is that both axes_manager and metadata are empty. To correct that, it is highly recommended to use a configuration file. That's the subject of the next section.
Construct inpystem data from a configuration file¶
As explained in Loading your data is faster, inpystem can load data from a .conf configuration file. This is done with the load_file() function (or the load_key() function if the configuration file is in the data path). To that end, a configuration file gives inpystem all the important information.
First, the configuration file is separated into three main sections (caution: these are case-sensitive!):
2D DATA for 2D data,
3D DATA for 3D data,
SCAN for the scan pattern.
Among these sections, only one of the 2D DATA and 3D DATA sections is required (if no data is given, inpystem can not do anything …). And inside this section, the only key which is required is file, which specifies the location of the data file (numpy .npy, .dm4 or any other file format allowed by HyperSpy) relative to the configuration file. One more note: contrary to section names, which are case-sensitive, keys are not.
In case no file key is given inside a SCAN section, the load_file() function automatically creates a random scan object (based on its scan_ratio and scan_seed arguments). Otherwise, a scan file (numpy or dm4/dm3) is loaded (the scan_ratio argument of load_file() can still be given).
Hence, a basic configuration file could look like this.
#
# This is a demo file.
# This text is not used, that's a commentary.
#
[3D DATA]
# This section defines all info about 3D data
File = eels_data.dm4
[SCAN]
# This section defines all info about scan pattern
# If the following line is commented, the scan pattern would be random.
FILE = scan.dm4
In the special case where the data file is a numpy .npy file, one can define additional information to fill the HyperSpy axes_manager attribute. To that end, a set of keys can be given inside the corresponding section. These keys should be of the form axis_dim_info where:
dim is the axis index (0 for the x axis, 1 for the y axis and, in the case of 3D data, 2 for the spectrum axis),
info belongs to name, scale, unit and offset.
As an example, the axes_manager of the previous section's data should look like this.
>>> data = inpystem.Stem3D(hsdata, scan)
Creating STEM acquisition...
>>> data.hsdata.axes_manager
<Axes manager, axes: (100, 50|1500)>
Name | size | index | offset | scale | units
================ | ====== | ====== | ======= | ======= | ======
<undefined> | 100 | 0 | 0 | 1 | <undefined>
<undefined> | 50 | 0 | 0 | 1 | <undefined>
---------------- | ------ | ------ | ------- | ------- | ------
<undefined> | 1500 | | 0 | 1 | <undefined>
If the numpy array is saved inside a directory with the following configuration file, this issue would be fixed.
#
# This is a demo file to define Numpy data axes_manager.
#
[3D DATA]
file = numpy_data.npy
# Infos for the axes_manager
axis_0_name = x
axis_1_name = y
axis_2_name = Energy loss
# Some more info for the energy loss axis
axis_2_offset = 4.6e+02
axis_2_scale = 0.32
axis_2_unit = eV
# No scan section, I want a random scan.
And the data would be loaded by simply typing this.
>>> inpystem.load_file('my-nice-file.conf', scan_ratio=0.5, scan_seed=0)
Some example data for fast testing¶
The package is delivered with some toy data for testing, which are not provided inside the package itself due to their size. Please download them from the GitHub project page under the DATA/ location and copy them to your data path (see Loading your data is faster). These data can be called afterwards with the load_key() function.
The three example data are called with the following keys:
'HR-sample': this is a real atomic-scale HAADF/EELS sample,
'HR-synth': this is a synthetic EELS image generated to be similar to 'HR-sample',
'LR-synth': this is a synthetic low-resolution EELS image.
The first data were acquired in the context of the following works [AZWT+19], [APLML+18]. Authors of these works would like to acknowledge Daniele Preziosi for the LAO-NNO thin film growth, Alexandre Gloter for the FIB lamella preparation and Xiaoyan Li for STEM experiments.
The last two datasets were generated to compare reconstruction methods in the context of STEM-EELS data inpainting [AMonierOberlinBrun+18]. The related high-resolution work was submitted.
References¶
- APLML+18
Daniele Preziosi, Laura Lopez-Mir, Xiaoyan Li, Tom Cornelissen, Jin Hong Lee, Felix Trier, Karim Bouzehouane, Sergio Valencia, Alexandre Gloter, Agnès Barthélémy, and Manuel Bibes. Direct mapping of phase separation across the metal–insulator transition of NdNiO3. Nano Letters, 18(4):2226–2232, 2018. doi:10.1021/acs.nanolett.7b04728.
- AZWT+19
Alberto Zobelli, Steffi Y Woo, Luiz HG Tizei, Nathalie Brun, Anna Tararan, Xiaoyan Li, Odile Stéphan, Mathieu Kociak, and Marcel Tencé. Spatial and spectral dynamics in STEM hyperspectral imaging using random scan patterns. arXiv preprint arXiv:1909.07842, 2019.
- AMonierOberlinBrun+18
É. Monier, T. Oberlin, N. Brun, M. Tencé, M. de Frutos, and N. Dobigeon. Reconstruction of partially sampled multiband images – application to STEM-EELS imaging. IEEE Trans. Comput. Imag., 4(4):585–598, Dec. 2018.
Correcting data¶
Introduction¶
Sometimes, you can face a situation where your data have outliers, i.e. data that are abnormal because of local dysfunctions. The data were acquired, but their values are not normal. In such a case, you may wish to:
purely remove columns / rows / bands,
correct particular dead pixels by replacing their values with correct ones.
For the dead pixels, the new values are set to a mean over the nearest correctly sampled positions.
In the below example, the user wants to remove the red rows, the green columns and correct a dead pixel in blue.

Correction with a single image¶
Correcting data manually¶
This is easily achieved by using the correct() method of the data object. It only requires:
slice objects that define the rows and columns to keep (and the bands in case of 3D data),
the positions of the dead pixels.
Note
Let us recall what a slice object is. When you define a numpy array A of shape (m, n), you can access columns and rows using e.g. A[1:2:5, :2]. In each direction, you specify the start, stop and step values.
For example, writing A[1:, :] means you want to select all rows from row 1 to the last row. This is equivalent to writing A[slice(1, None), :]. In fact, a slice object just defines, for one direction, the start, stop and step values. Let's have some examples:
1: is equivalent to slice(1, None),
:10 is equivalent to slice(None, 10),
::10 is equivalent to slice(None, None, 10),
2:10 is equivalent to slice(2, 10),
: is equivalent to slice(None).
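These equivalences can be checked directly with numpy (a small self-contained sketch):
>>> import numpy as np
>>> A = np.arange(12).reshape(3, 4)
>>> np.array_equal(A[1:, :], A[slice(1, None), :])
True
>>> np.array_equal(A[:, ::2], A[:, slice(None, None, 2)])
True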
For the above example, one should write the following code.
>>> data
<Stem3D, title: Test, dimensions: (|11, 11), sampling ratio: 0.20>
>>> data.correct(rows=slice(1, 9), cols=slice(2, 10), dpixels=[6*11+5])
Correcting STEM acquisition...
>>> data
<Stem3D, title: Test, dimensions: (|8, 8), sampling ratio: 0.20>
Correcting data with the configuration file¶
A lot of information can be given to inpystem through the configuration file. To pass correction info to the load functions, all you need is to define slices inside the corresponding 2D DATA or 3D DATA sections. For this, use the following keys: rows, columns, bands and dpixels. The slices should be written as for numpy array selection (e.g. 1:-60). Let's illustrate this with the configuration file of the HR-sample example image.
[2D DATA]
file = spim4-2-df-manualy aligned image.dm4
columns = 15:-60
dpixels = [9384, 8468]
[3D DATA]
file = spim4-2_ali.dm4
columns = 15:-60
bands = 90:
dpixels = [9384, 8468]
Note
Contrary to the correct() method, which expects slice objects, the configuration files require literal slices such as 1:-50.
One other main difference is that the slice objects passed to correct() do not accept negative values, while the configuration file values are parsed and do accept negative values. As a consequence, to tell inpystem you just want to keep all rows but the first one and the five last ones, you should write slice(1, m-5) (where m is the number of rows) for the correct() method, while the configuration file would accept 1:-5.
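As a quick sanity check, both notations select the same elements (a sketch with a hypothetical length m):
>>> m = 20
>>> list(range(m))[1:-5] == list(range(m))[slice(1, m - 5)]
True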
Restoration¶
Welcome to the main page of this documentation. The inpystem library is nothing else than a plugin to HyperSpy to allow reconstruction.
Some words about pre and post-processing steps¶
For 2D data¶
2D data are centered and normalized before reconstruction. The reason is to avoid reconstruction method parameters that vary greatly from one dataset to another.
# data numpy array is ready to be reconstructed.
#
# Let's get data mean and standard devition.
data_mean, data_std = data.mean(), data.std()
# Let's center and normalize the data
data_ready = (data - data_mean) / data_std
#
# Reconstruction is performed
#
# Let's perform the inverse transformation
data_out = reconstructed_data * data_std + data_mean
# data_out is returned
For 3D data¶
3D data have the same pre- and post-processing as 2D data. But in addition to centering and normalization, a thresholded principal component analysis (PCA) is performed to reduce the data dimension along the bands axis and to ensure the low-rank assumption. This assumption states that most multi-band data result from the mixing of R basic components, with R small compared to the data dimension.
The default behavior is to perform PCA with an automatically estimated threshold (which can be heavily over-estimated in data starvation situations, i.e. if you have almost as many samples as the data size). Though, the user can set both parameters and choose whether PCA should be performed and which value the threshold should have.
To sum up, the steps are:
perform thresholded PCA if required,
center and normalize the data,
perform reconstruction,
re-set the data mean and standard deviation values,
perform inverse PCA.
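As an illustration, here is a minimal numpy sketch of these steps; the variable names, the fixed threshold R and the SVD-based PCA are illustrative assumptions, not the actual inpystem implementation:
import numpy as np

# data is assumed to be a (m, n, L) numpy array.
m, n, L = data.shape
Y = data.reshape(m * n, L)

# 1. Thresholded PCA: keep the R strongest components.
R = 4                                   # illustrative threshold
Ym = Y.mean(axis=0)                     # band-wise mean
U, s, Vt = np.linalg.svd(Y - Ym, full_matrices=False)
H = Vt[:R].T                            # truncated PCA basis, shape (L, R)
S = (Y - Ym) @ H                        # reduced data, shape (m*n, R)

# 2. Center and normalize.
S_mean, S_std = S.mean(), S.std()
S_ready = (S - S_mean) / S_std

# 3. Reconstruction is performed on S_ready here.
S_rec = S_ready                         # placeholder for the restored data

# 4. Re-set the data mean and standard deviation.
S_out = S_rec * S_std + S_mean

# 5. Perform inverse PCA.
data_out = (S_out @ H.T + Ym).reshape(m, n, L)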
How to reconstruct my data¶
To reconstruct the data, the user should use the restore() method. Both Stem2D and Stem3D classes need the following arguments to restore their data:
method, which is the method name (default is 'interpolation'),
parameters, which should be a dictionary with the input parameters,
verbose, which allows the method to display information on the console (default is True).
In addition to these common parameters, the Stem3D class has the following inputs:
PCA_transform, which controls the PCA execution (default is True for PCA execution),
PCA_th, which states the PCA threshold.
A common reconstruction task will then look like this.
>>> data = inpystem.load_key('HR-sample', ndim=2, scan_ratio=0.2)
Reading configuration file ...
Generating data ...
Creating STEM acquisition...
Correcting STEM acquisition...
>>> rec, info = data.restore()
Restoring the 2D STEM acquisition...
-- Interpolation reconstruction algorithm --
Done in 0.01s.
---
>>> rec
<Signal2D, title: HR-sample, dimensions: (|113, 63)>
>>> info
{'time': 0.011758089065551758}
The reconstruction methods available¶
All you need to know for each method is:
what the method does (of course, you need to know a little about it),
its nickname to give to restore(),
its parameters,
what information is returned.
Restoration cheat sheet¶
Additional info, in case PCA_transform is True, is PCA_info, which stores the following keys:
H: the truncated PCA basis,
PCA_th: the PCA threshold,
Ym: the data mean.
Interpolation¶
The interpolation method calls linear, cubic or nearest neighbor interpolation.
The method name to give to the restore() method is interpolation. The associated function is interpolate().
The input parameters are:
method: (optional, str) The interpolation method (among nearest, linear and cubic). Default is nearest neighbor.
The output dictionary stores the following information:
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
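For instance, a cubic interpolation could be run as follows (a sketch, assuming data is a loaded inpystem object):
>>> rec, info = data.restore('interpolation', parameters={'method': 'cubic'})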
L1¶
This regularized least-square method solves the following optimization problem:
\[\hat{\mathbf{x}} = \mathop{\mathrm{arg\,min}}_{\mathbf{x}} \frac{1}{2}\|\mathbf{y} - \Phi\Psi\mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1\]
where \(\mathbf{y}\) are the corrupted data, \(\Phi\) is a subsampling operator and \(\Psi\) is a 2D DCT operator.
The method name to give to the restore() method is L1. The associated function is L1_LS().
The input parameters are:
Lambda: (float) The regularization parameter,
init: (optional, numpy array) An initial point for the gradient descent algorithm, which should have the same shape as the input data,
Nit: (optional, int) The number of iterations.
The output dictionary stores the following information:
E: the evolution of the functional value,
Gamma: the set of all pixel positions whose coefficients in the DCT basis are non-zero,
nnz-ratio: the ratio of non-zero coefficients over the number of DCT coefficients,
time: the execution time (in sec.).
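A typical call could look like this (a sketch; the Lambda and Nit values are purely illustrative):
>>> rec, info = data.restore('L1', parameters={'Lambda': 0.1, 'Nit': 100})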
Smoothed SubSpace¶
The 3S algorithm denoises or reconstructs a multi-band image, possibly spatially sub-sampled, in the case of spatially smooth images. It is well adapted to intermediate-scale images.
This algorithm performs a PCA pre-processing operation to estimate:
the data subspace basis \(\mathbf{H}\),
the subspace dimension \(R\),
the associated eigenvalues in decreasing order \(\mathbf{d}\),
the noise level \(\hat{\sigma}\).
After this estimation step, the algorithm solves the following regularization problem in the PCA space:
\[\hat{\mathbf{S}} = \mathop{\mathrm{arg\,min}}_{\mathbf{S}} \frac{1}{2}\|\mathbf{S}\mathbf{D}\|_{\mathrm{F}}^2 \quad \text{s.t.} \quad \|(\mathbf{S}-\mathbf{Y})_{\mathcal{I}}\|_{\mathrm{F}}^2 \le \alpha\,|\mathcal{I}|\,R\,\hat{\sigma}^2\]
where \(\mathbf{Y}\) are the corrupted data, \(\mathbf{D}\) is a spatial finite difference operator and \(\mathcal{I}\) is the set of all sampled pixels. The coefficient \(\alpha\) scales the power of the data fidelity term.
For more details, see [BMonierOberlinBrun+18].
The method name to give to the restore() method is 3S. The associated function is SSS().
The input parameters are:
Lambda: (float) The regularization parameter,
scale: (optional, float) The spectrum scale,
init: (optional, numpy array) An initial point for the gradient descent algorithm, which should have the same shape as the input data,
Nit: (optional, int) The number of iterations.
The output dictionary stores the following information:
E: the evolution of the functional value,
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
Smoothed Nuclear Norm¶
The SNN algorithm denoises or reconstructs a multi-band image, possibly spatially sub-sampled, in the case of spatially smooth images. It is well adapted to intermediate-scale images.
This algorithm solves the following optimization problem:
\[\hat{\mathbf{X}} = \mathop{\mathrm{arg\,min}}_{\mathbf{X}} \frac{1}{2}\|(\mathbf{X}-\mathbf{Y})_{\mathcal{I}}\|_{\mathrm{F}}^2 + \frac{\lambda}{2}\|\mathbf{X}\mathbf{D}\|_{\mathrm{F}}^2 + \mu\|\mathbf{X}\|_*\]
where \(\mathbf{Y}\) are the corrupted data, \(\mathbf{D}\) is a spatial finite difference operator and \(\mathcal{I}\) is the set of all sampled pixels.
For more details, see [BMonierOberlinBrun+18].
The method name to give to the restore() method is SNN. The associated function is SNN().
The input parameters are:
Lambda: (float) The \(\lambda\) regularization parameter,
Mu: (float) The \(\mu\) regularization parameter,
init: (optional, numpy array) An initial point for the gradient descent algorithm, which should have the same shape as the input data,
Nit: (optional, int) The number of iterations.
The output dictionary stores the following information:
E: the evolution of the functional value,
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
Cosine Least Square¶
The CLS algorithm denoises or reconstructs a multi-band image possibly spatially sub-sampled in the case of spatially sparse content in the DCT basis. It is well adapted to periodic data.
This algorithm solves the following optimization problem:
\[\hat{\mathbf{X}} = \mathop{\mathrm{arg\,min}}_{\mathbf{X}} \frac{1}{2}\|(\mathbf{X}-\mathbf{Y})_{\mathcal{I}}\|_{\mathrm{F}}^2 + \lambda\|\Psi(\mathbf{X})\|_{2,1}\]
where \(\mathbf{Y}\) are the corrupted data, \(\Psi\) is a 2D DCT operator and \(\mathcal{I}\) is the set of all sampled pixels.
The method name to give to the restore() method is CLS. The associated function is CLS().
The input parameters are:
Lambda: (float) The \(\lambda\) regularization parameter,
init: (optional, numpy array) An initial point for the gradient descent algorithm, which should have the same shape as the input data,
Nit: (optional, int) The number of iterations.
The output dictionary stores the following information:
E: the evolution of the functional value,
Gamma: the set of all pixel positions whose coefficients in the DCT basis are non-zero,
nnz-ratio: the ratio of non-zero coefficients over the number of DCT coefficients,
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
Post-Lasso CLS algorithm¶
This algorithm consists in applying CLS to restore the data and determine the data support in the DCT basis. A post-least-square optimization is then performed to reduce the coefficients bias.
The method name to give to the restore() method is Post_LS_CLS. The associated function is Post_LS_CLS().
The input parameters are:
Lambda: (float) The \(\lambda\) regularization parameter,
init: (optional, numpy array) An initial point for the gradient descent algorithm, which should have the same shape as the input data,
Nit: (optional, int) The number of iterations.
The output dictionary stores the following information:
E_CLS: the evolution of the functional value for the CLS optimization step,
E_post_ls: the evolution of the functional value for the post-LS optimization step,
Gamma: the set of all pixel positions whose coefficients in the DCT basis are non-zero,
nnz-ratio: the ratio of non-zero coefficients over the number of DCT coefficients,
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
ITKrMM and wKSVD¶
Weighted K-SVD (see [BMairalEladSapiro08]) and Iterative Thresholding and K residual Means for Masked data (see [BNS18]) methods.
The wKSVD and ITKrMM algorithms share a lot of their code, so their inputs and outputs are the same. Though, two implementations exist to run these algorithms: one in Python (the ITKrMM and wKSVD methods) and one in Matlab (the ITKrMM_matlab and wKSVD_matlab methods). The original Matlab codes are distributed by Karin Schnass; they were afterwards translated into Python. Nothing distinguishes the two implementations, except that the Matlab one is faster for wKSVD. The only requirement is that the matlab command be available in your system path.
The method names to give to the restore() method are ITKrMM, wKSVD, ITKrMM_matlab or wKSVD_matlab. The associated functions are resp. ITKrMM(), wKSVD(), ITKrMM_matlab() and wKSVD_matlab().
The input parameters are:
Patchsize: (optional, int) The patch width,
K: (optional, int) The dictionary size (incl. the low-rank component),
L: (optional, int) The number of low-rank components to estimate,
S: (optional, int) The sparsity level,
Nit: (optional, int) The number of iterations for the dictionary estimation,
Nit_lr: (optional, int) The number of iterations for the low-rank estimation.
The output dictionary stores the following information:
dico: the dictionary,
E: the evolution of the error,
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
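A typical call could look like this (a sketch; all parameter values are purely illustrative):
>>> rec, info = data.restore('ITKrMM', parameters={'Patchsize': 5, 'K': 32, 'S': 1})
>>> dico = info['dico']  # the learned dictionary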
BPFA¶
Beta Process Factor Analysis algorithm (see [BXZC+12]).
As for wKSVD and ITKrMM, BPFA is based on a Matlab code from Zhengming Xing (distributed without any license). The Python code just calls it, so Matlab should be in the system path so that the matlab command can be called from the command line.
The method name to give to the restore() method is BPFA_matlab. The associated function is BPFA_matlab().
The input parameters are:
Patchsize: (optional, int) The patch width,
K: (optional, int) The dictionary size,
step: (optional, int) The pixel spacing between two consecutive patches (if 1, full overlap),
Nit: (optional, int) The number of iterations for the dictionary estimation.
The output dictionary stores the following information:
dico: the dictionary,
time: the execution time (in sec.),
PCA_info: in case of 3D data with PCA pre-processing, it stores info about PCA.
That’s all folks !¶
This was the main content of the documentation. Congrats, you understood 90% of this library :)
References¶
- BNS18
Valeriya Naumova and Karin Schnass. Fast dictionary learning from incomplete data. EURASIP journal on advances in signal processing, 2018(1):12, 2018.
- BXZC+12
Z. Xing, M. Zhou, A. Castrodad, G. Sapiro, and L. Carin. Dictionary learning for noisy and incomplete hyperspectral images. SIAM J. Imag. Sci., 5(1):33–56, 2012.
- BMairalEladSapiro08
J. Mairal, M. Elad, and G. Sapiro. Sparse representation for color image restoration. IEEE Trans. Image Process., 17(1):53–69, Jan 2008.
- BMonierOberlinBrun+18(1,2)
É. Monier, T. Oberlin, N. Brun, M. Tencé, M. de Frutos, and N. Dobigeon. Reconstruction of partially sampled multiband images – application to STEM-EELS imaging. IEEE Trans. Comput. Imag., 4(4):585–598, Dec. 2018.
Data Visualization¶
Visualizing the data¶
As for HyperSpy data, the data objects Stem2D and Stem3D include a plot() method to display the data.
In addition to this basic function, the Stem3D class implements four other functions:
plot_sum(), which displays the sum of the 3D data along the last axis,
plot_as2D(), which displays the data as 2D data (the navigation direction is the "channel" axis while the data are spatial),
plot_as1D(), which displays the data as 1D data (the navigation directions are the spatial axes while the data is a spectrum; this is the behavior of the default plot() function),
plot_roi(), which considers the data as 1D and enables the user to average the data over a spatial region of interest.
Visualizing the scan sampling mask¶
As explained in The first basic object is the scan, a Scan object can be represented with its sampling mask \(\mathbf{M}\) defined as
\[\mathbf{M}_i = \begin{cases} 1, & \text{if pixel \# $i$ is acquired}\\ 0, & \text{otherwise}\\ \end{cases} \]
This representation suffers from information loss, but it is useful to study the distribution of the acquired pixels. This sampling mask, whose shape is the same as the spatial shape, can be obtained using the get_mask() method of Scan. It can also be plotted using the plot() method.
inpystem Developer Guide¶
Well, for now, this small library is developed by myself. I don't know yet how useful this could be. However, if someone is interested in helping me, please let me know at etienne[dot]monier[at]enseeiht[dot]fr.
This section just gives some information about the child classes of Stem2D and Stem3D that help me develop algorithms.
Dev data objects¶
Basically, signal processing algorithm development aims at testing a method on synthetic (i.e. noise-free) data that has been manually degraded. As the ground-truth image is known, metrics can be computed to test and compare the method.
To that end, two data objects Dev2D and Dev3D have been developed. These are child classes of Stem2D and Stem3D (which are themselves children of the basic data structure AbstractStem).

A simple UML diagram.¶
These objects basically work as non-dev data objects:
they are initialized with data and a Scan object,
loading them can be done with the same functions as for non-dev data,
they can be corrected with a configuration file,
they can be reconstructed using the basic restore() method of AbstractStem,
they can be plotted with the same tools.
Their difference is that:
noise can be added,
the results are reproducible,
new noise can be drawn whenever you want,
the Dev3D class allows you to consider the PCA-transformed data as the base data (this is useful for developing 3D restoration algorithms).
Initialize Dev data¶
Manually¶
The Dev data objects can be initialized manually with the same arguments as the Stem2D and Stem3D classes, i.e. an HyperSpy signal and an optional Scan object.
In addition to these arguments, a required input is key, which is a small descriptive keyword to help referencing. Other optional inputs are:
modif_file: a configuration file which is sent to the correction function,
sigma: the desired noise standard deviation, in case additional noise is desired,
seed: the noise seed, to get reproducible data,
normalize: if set to True, the data are normalized at initialization.
>>> stem2d_data = inpystem.load_key('HR-sample', 2)
Reading configuration file ...
Generating data ...
Creating STEM acquisition...
Correcting STEM acquisition...
>>> scan_shape = stem2d_data.scan.shape
>>> scan = inpystem.Scan.random(shape=scan_shape, ratio=0.5)
>>> dev_data = inpystem.Dev2D('my-dev-data', hsdata=stem2d_data.hsdata, scan=scan, sigma=0.5, seed=0)
Creating STEM acquisition...
>>> dev_data
<Dev2D, title: HR-sample, dimensions: (|113, 63), sampling ratio: 0.50>
With load functions¶
The load_file() and load_key() functions also enable loading development data. To that end, the user just has to use the dev input, which is a dictionary. This dictionary should store the desired inputs:
for 2D data: modif_file, sigma, seed and normalize,
for 3D data: PCA_transform, PCA_th, modif_file, sigma, seed and normalize.
>>> dev = {'sigma': 0.5, 'seed': 0}
>>> inpystem.load_key('HR-sample', 2, dev=dev, scan_ratio=0.5, scan_seed=1)
Reading configuration file ...
Generating data ...
Creating STEM acquisition...
Correcting STEM acquisition...
<Dev2D, title: HR-sample, dimensions: (|113, 63), sampling ratio: 0.50>
Note that in case development data is loaded, the key will be the name of the .conf file (e.g. for my-conf-file.conf, the key would be my-conf-file).
Some words about data storage¶
Contrary to Stem2D and Stem3D objects, development objects work with numpy techniques (to generate the noise, add it and perform PCA), so numpy arrays are the central data stored in Dev objects.
More precisely, the data are stored twice or three times under the following attributes:
data, which stores the noise-free data,
ndata, which stores the noisy data (in case sigma is None and no noise-corruption procedure was applied, this attribute is None),
hsdata, which stores the data as an HyperSpy signal.
The last attribute only exists to send the data into restoration procedures. To display noisy or noise-free data, prefer the first two attributes.
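For instance, with the dev_data object created above, one would access (a sketch, no output shown):
>>> dev_data.data    # the noise-free numpy array
>>> dev_data.ndata   # the noisy numpy array (None if sigma is None)
>>> dev_data.hsdata  # the same data wrapped as an HyperSpy signal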
Reproducibility and noise¶
Inside the data creation procedure, random effects can come from the scan generation (in case of random initialization) or from the noise generation. For both situations, a seed can be set to get reproducible results. Indeed, the scan seed can be set in the load functions with the scan_seed parameter, or directly when calling random() with its seed parameter. The noise seed itself can be set with the seed attribute.
Caution
The seed values are set for the startup procedures. When a seed is set for the Dev2D class, it is set just before drawing the noise matrix for the first time. If the user wants to draw another noise matrix, the seed will not be the same any more. The same holds for random scans.
To generate a new noise matrix, just use the set_ndata() method. To generate a new random scan, just re-run the random() method without the seed argument.
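For example (a sketch, assuming dev_data is the Dev2D object created above):
>>> dev_data.set_ndata()  # draws a new noise matrix
>>> scan = inpystem.Scan.random(dev_data.scan.shape)  # draws a new random scan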
How PCA works for Dev3D¶
At Stem3D initialization, the 3D data are fully stored as an HyperSpy signal. When the user wants to reconstruct the data, the usual PCA_transform and PCA_th arguments can be given. These are passed to the reconstruction algorithms, which perform PCA as pre- and post-processing steps.
In the case of the Dev3D class, these parameters are given at initialization. If PCA_transform is False, then the full data is stored in the data attribute. The user can still choose to perform PCA by giving PCA_transform to restore() as an argument.
>>> dev = {'sigma': 0.5, 'PCA_transform': False}
>>> data = inpystem.load_key('HR-sample', 3, dev=dev)
Reading configuration file ...
Generating data ...
Creating STEM acquisition...
Correcting STEM acquisition...
>>> data.data.shape
(63, 115, 1510)
>>> outdata, info = data.restore()
Restoring the 3D STEM acquisition...
-- Interpolation reconstruction algorithm --
- PCA transformation -
Dimension reduced from 1510 to 4.
Estimated sigma^2 is 2.76e-01.
Done in 1.27s.
-
Done in 0.05s.
---
>>> info['PCA_info']['H'].shape # The PCA basis used for restoration
(1510, 4)
In case PCA_transform is True at initialization, a PCA procedure is executed at initialization and the data (and possibly ndata) arrays are reduced along the last axis. Additional information is stored in the PCA_info attribute. In such a case, the user should use the restore() method without giving the PCA_transform argument.
>>> dev = {'sigma': 0.5, 'PCA_transform': True}
>>> data = inpystem.load_key('HR-sample', 3, dev=dev)
Reading configuration file ...
Generating data ...
Creating STEM acquisition...
Correcting STEM acquisition...
- PCA transformation -
Dimension reduced from 1510 to 290.
Estimated sigma^2 is 1.43e+03.
Done in 1.27s.
-
>>> outdata, info = data.restore()
Restoring the 3D STEM acquisition...
-- Interpolation reconstruction algorithm --
Done in 11.12s.
---
>>> 'PCA_info' in info
False
Note
The default behavior of the restore() PCA_transform argument is to take the logical not of the PCA_transform argument given at initialization. If the Dev3D class has been initialized without PCA, then a PCA is applied by default before restoration. If PCA has been required at initialization, then no PCA will be applied at restoration.
Yet, the user can explicitly ask for an additional PCA (which is stupid, I agree) or for no PCA at all. Let's state it clearly: if you don't want PCA, say it at initialization and at restoration.
Caution
The two examples above show something important, as the estimated PCA threshold is 4 in the case where PCA_transform is not given at initialization and 290 otherwise. This means that both orders do not have the same effects.
If PCA_transform is True at initialization, the PCA transformation is performed before adding noise, so the signal is clean enough to get a high threshold. Besides, the noise is added to the data in PCA space.
If PCA_transform is False at initialization but True at restoration, the noise is added to the full-dimension data. Besides, PCA is applied to noisy data, so only a few principal components remain more powerful than the noise and the threshold drops.
To easily handle direct and inverse PCA transformations, two methods are provided: direct_transform() and inverse_transform(). They both allow the user to perform the same direct/inverse PCA transformation as in the Dev3D initialization method. These methods accept HyperSpy as well as numpy data.
These methods also incorporate the normalization procedure. This means that the direct_transform() method also performs centering and normalization, whereas inverse_transform() injects the standard deviation and the mean back.
>>> import inpystem
# Case with non-PCA-initialized object
>>> dev = {'sigma': 0.5, 'PCA_transform': False, 'normalize': False}
>>> data = inpystem.load_key('HR-sample', 3, dev=dev)
>>> direct_data = data.direct_transform(data.data) # Performing direct transformation to data
>>> inverse_data = data.inverse_transform(data.data) # Performing inverse transformation to data
>>> import numpy.testing as npt # This is to check arrays are equal
>>> npt.assert_allclose(data.data, direct_data) # Equal
>>> npt.assert_allclose(data.data, inverse_data) # Equal
>>> dev = {'sigma': 0.5, 'PCA_transform': False} # 'normalize' defaults to True here
>>> data = inpystem.load_key('HR-sample', 3, dev=dev)
>>> direct_data = data.direct_transform(data.data)
>>> inverse_data = data.inverse_transform(direct_data)
>>> npt.assert_allclose(data.data, direct_data) # Error because of normalization
>>> npt.assert_allclose(data.data, inverse_data) # Equal: direct, then inverse is identity :)
# Case with PCA-initialized object
>>> dev = {'sigma': 0.5, 'PCA_transform': True}
>>> data = inpystem.load_key('HR-sample', 3, dev=dev)
>>> inverse_data = data.inverse_transform(data.data)
>>> direct_data = data.direct_transform(inverse_data)
>>> npt.assert_allclose(data.data, direct_data) # Equal: direct, then inverse is still identity :)
>>> data.data.shape # PCA shape
(63, 115, 290)
>>> inverse_data.shape # True image shape
(63, 115, 1510)
>>> direct_data.shape # PCA shape
(63, 115, 290)
One word about visualization¶
The development data visualization works as for non-dev classes. Yet, development data visualization methods accept an additional argument, noised. This optional argument, which is False by default, sets which data should be displayed (noise-free data by default, or noisy data).
List of todos:¶
Todo
Maybe enable PCA_th in config file for 3D data.
Full inpystem API documentation¶
inpystem module.
signals module¶
This package defines all sort of classes to handle data for inpystem.
The Scan class¶
-
class
inpystem.signals.
Scan
(shape, path, ratio=None)¶ Scan pattern class.
This class stores the data spatial shape and two copies of the scan pattern. One of these copies is the initial scan pattern which is given to the class. At the same time, a ratio argument can be given to keep only a portion of the available samples. See Notes for more details.
- Variables
shape (2-length tuple) – The spatial shape (m, n) where m is the number of rows and n is the number of columns.
path (numpy array) – The sampling path to be used in the study.
path_0 (numpy array) – The initial sampling path to be kept in case the ratio is changed.
ratio (float) – The current ratio value, such that path has size ratio*m*n. Changing this attribute automatically updates path.
Note
Consider that only r*m*n pixels have been sampled; then the path_0 attribute has shape (r*m*n,) and its elements lie between 0 and m*n-1.
Meanwhile, if the user wants to consider only ratio percent of the samples, the ratio argument should be given. The path attribute would then have shape (ratio*m*n,). In such a case, path_0[:ratio*m*n] will be equal to path. Be aware that ratio should be lower than r.
Each element of these arrays is the pixel index in row-major order. To recover the row and column index arrays, type the following commands.
i = path // n
j = path % n
-
__init__
(shape, path, ratio=None)¶ Scan pattern constructor.
- Parameters
shape ((m, n) tuple) – The spatial shape where m is the number of rows and n is the number of columns.
path (tuple, numpy array) – The sampling path. See class Notes for more detail.
ratio (optional, float) – The ratio of sampled pixels. This should lie between 0 (excl.) and 1. Default is None for full sampling.
-
property
ratio
¶ Ratio getter.
- Returns
The ratio property value.
- Return type
float
-
classmethod
from_file
(data_file, ratio=None)¶ Creates a scan pattern object from a data file (such as .dm3, .dm4 or npz).
In the case of a .npz file, this one should contain the m, n and path variables, which are resp. the number of rows and columns and the path array.
Concerning the .dm3/.dm4 files, the data storage is specific to the LPS Lab (Orsay, France) implementation.
An additional ratio argument allows you to select only a given ratio of the sampled pixels. This should lie between 0 (excl.) and 1.
- Parameters
data_file (str) – The data file path.
ratio (optional, float) – The ratio of sampled pixels. This should lie between 0 (excl.) and 1. Default is None for full sampling.
- Returns
The scan pattern.
- Return type
Scan object
-
classmethod
random
(shape, ratio=None, seed=None)¶ Creates a random scan pattern object.
- Parameters
shape ((m, n) tuple) – The data spatial shape.
ratio (optional, float) – The ratio of sampled pixels. It should lie between 0 (excluded) and 1. Default is None for full sampling.
seed (optional, int) – Seed for random sampling. Default is None for random seed.
- Returns
The scan pattern.
- Return type
Scan object
-
get_mask
()¶ Returns the sampling mask.
The sampling mask is boolean and True is for sampled pixels.
- Returns
mask – The sampling mask.
- Return type
(m, n) numpy array
-
plot
()¶ Plots the sampling mask.
White (resp. black) pixels are sampled (resp. non-sampled).
The Stem classes¶
-
class
inpystem.signals.
AbstractStem
(hsdata, scan=None, verbose=True)¶ Abstract STEM acquisition class.
This is an abstract class, which means you cannot instantiate such an object.
It defines the structure for a STEM acquisition object.
- Variables
hsdata (hs BaseSignal) – The acquired STEM data hyperspy object.
scan (Scan object) – The sampling scan object associated with the data.
verbose (bool) – If True, information is sent to standard output. Default is True.
-
__init__
(hsdata, scan=None, verbose=True)¶ AbstractStem constructor.
- Parameters
hsdata (hs BaseSignal) – The acquired STEM data hyperspy object.
scan (optional, Scan object) – The sampling scan object associated with the data. Default is None for full sampling.
verbose (bool) – If True, information is sent to standard output. Default is True.
-
correct
(rows=slice(None, None, None), cols=slice(None, None, None), bands=slice(None, None, None), dpixels=None)¶ Correct defective data.
Defective data correspond to:
Rows to remove at the beginning or at the end of the image.
Columns to remove at the beginning or at the end of the image.
Bands to remove at the beginning or at the end of the image.
Located dead pixels at the center of the image.
In the cases 1, 2 or 3, the rows and columns are purely removed. The dead pixels are filled with the mean over a neighbourhood.
A slice object for an object A of length L defines a continuous portion of A such as A[n_1], A[n_1+1], ..., A[n_2-1] with n_1 < n_2. In such a case, the slice object definition is slice(n_1, n_2). If n_1 is 0, then use slice(None, n_2). If n_2 is L, use slice(n_1, None). Last, if all the elements of A should be kept, use slice(None)
.- Parameters
rows (slice object) – The range of rows to keep.
cols (slice object) – The range of columns to keep.
bands (slice object) – The range of bands to keep.
dpixels (list) – The positions of the dead pixels.
-
correct_fromfile
(file, force_ndim=None)¶ force_ndim aims at forcing dimension. Default is None for data dimension.
-
abstract
restore
()¶ Restores corrupted data.
-
plot
()¶ Plots the masked data.
-
class
inpystem.signals.
Stem2D
(hsdata, scan=None, verbose=True)¶ 2D image STEM acquisition.
This defines a 2D STEM image with its associated sampling scan.
- Variables
hsdata (hs BaseSignal) – The acquired STEM data hyperspy object.
scan (Scan object) – The sampling scan object associated with the data.
-
restore
(method='interpolation', parameters={})¶ Restores the acquisition.
It performs denoising in the case of a full scan and performs reconstruction in the case of partial sampling.
-
class
inpystem.signals.
Stem3D
(hsdata, scan=None, verbose=True)¶ 3D image STEM acquisition.
This defines a 3D STEM image with its associated sampling scan.
- Variables
hsdata (hs BaseSignal) – The acquired STEM data hyperspy object.
scan (Scan object) – The sampling scan object associated with the data.
-
restore
(method='interpolation', parameters={}, PCA_transform=True, PCA_th='auto')¶ Restores the acquisition.
It performs denoising in the case of a full scan and performs reconstruction in the case of partial sampling.
- Parameters
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
-
plot_sum
()¶ Shows the sum of the data along the last axis.
-
plot_as2D
()¶ Implements the HyperSpy tool to visualize the image for a given band.
-
plot_as1D
()¶ Implements the HypersSpy tool to visualize the spectrum for a given pixel.
-
plot_roi
()¶ Implements the Hyperspy tool to analyse regions of interest.
dev module¶
This module defines the basic STEM acquisition objects used for processing. These objects are:
The 3D spectrum-image,
The 2D HAADF image,
-
class
inpystem.dev.
AbstractDev
(key, data, mask=None, sigma=None, seed=None, normalize=True, verbose=True)¶ Abstract Dev acquisition class.
This is an abstract class, which means you cannot instantiate such an object.
It defines the structure for a Dev acquisition object.
- key: str
1-word description of the Dev2D image.
- data: (m,n) or (m, n, l) numpy array
The Dev2D image data before the noise step. Its dimension is (m,n).
- ndata: (m,n) or (m, n, l) numpy array
The noised Dev2D image. If sigma is None, ndata is None. Its dimension is (m,n).
The noise standard deviation.
- seed: optional, int
The random noise matrix seed.
- normalize: bool
If normalize is True, the data will be centered and normalized before the corruption steps.
- mean_std: None, 2-tuple
It stores the data mean and std in case normalize is True.
- verbose: bool
If True, information will be displayed. Default is True.
-
__init__
(key, data, mask=None, sigma=None, seed=None, normalize=True, verbose=True)¶ AbstractDev constructor.
- Parameters
key (str) – 1-word description of the Dev2D image. Generally, it’s common to the stem acquisition object.
data ((m, n) or (m, n, l) numpy array) – The noise-free image data.
mask ((m, n) numpy array) – The sampling mask.
sigma (optional, None, float) – The desired standard deviation used to model noise. Default is None for no additional noise.
seed (optional, None, int) – The random noise matrix seed. Default is None for no seed initialization.
normalize (optional, bool) – If normalize is True, the data will be centered and normalized before the corruption steps. Default is True.
verbose (optional, bool) – If True, information will be displayed. Default is True.
-
property
seed
¶ seed property getter.
-
property
sigma
¶ sigma property getter.
-
set_ndata
()¶ Constructs the noised data.
It is also used to draw a new noise matrix.
-
class
inpystem.dev.
Dev2D
(key, hsdata, scan=None, modif_file=None, sigma=None, seed=None, normalize=True, verbose=True)¶ Dev2D Class.
- Variables
key (str) – 1-word description of the Dev2D image.
hsdata (Signal2D hyperspy data) – The hyperspy Signal2D image. Its dimension is denoted (m,n). This is used to communicate with the parent class.
data ((m,n) numpy array) – The Dev2D image data before the noise step. Its dimension is (m,n).
ndata ((m,n) numpy array) – The noised Dev2D image. If sigma is None, ndata is None. Its dimension is (m,n).
scan (optional, Scan object) – The sampling scan object associated with the data. Default is None for full sampling.
sigma (float) – The noise standard deviation.
seed (optional, int) – The random noise matrix seed.
normalize (bool) – If normalize is True, the data will be centered and normalized before the corruption steps.
mean_std (None, 2-tuple) – It stores the data mean and std in case normalize is True.
verbose (bool) – If True, information will be displayed. Default is True.
-
__init__
(key, hsdata, scan=None, modif_file=None, sigma=None, seed=None, normalize=True, verbose=True)¶ Dev2D constructor.
- Parameters
key (str) – 1-word description of the Dev2D image. Generally, it’s common to the stem acquisition object.
hsdata (Signal2D hyperspy data) – The noise-free Dev2D image data. Its dimension is denoted (m,n).
scan (optional, None, Scan object) – The sampling scan object associated with the data. Default is None for full sampling.
modif_file (optional, None, str) – A .conf configuration file to remove rows, columns or dead pixels. Default is None for no modification.
sigma (optional, None, float) – The desired standard deviation used to model noise. Default is None for no additional noise.
seed (optional, None, int) – The random noise matrix seed. Default is None for no seed initialization.
normalize (optional, bool) – If normalize is True, the data will be centered and normalized before the corruption steps. Default is True.
verbose (optional, bool) – If True, information will be displayed. Default is True.
-
restore
(method='interpolation', parameters={}, verbose=None)¶ Restores the acquisition.
-
plot
(noised=False)¶ Plots the HAADF image.
- Parameters
noised (optional, bool) – If True, the noised data is used. If False, the noise-free data is shown. Default is False.
-
class
inpystem.dev.
Dev3D
(key, hsdata, scan=None, modif_file=None, sigma=None, seed=None, normalize=True, PCA_transform=False, PCA_th='auto', verbose=True)¶ Dev3D Class
- Variables
key (str) – 1-word description of the Dev3D.
hsdata (Signal1D hyperspy data) – The hyperspy Signal1D image. Its dimension is denoted (m,n,l). This is used to communicate with the parent class.
data ((m,n,l) numpy array) – The Dev3D data before the noise step. Its dimension is (m,n,l).
ndata ((m,n,l) numpy array) – The noised Dev3D data. If snr is None, ndata is None. Its dimension is (m,n,l).
snr (optional, float) – The desired snr used for the noising step.
sigma (float) – The noise standard deviation.
seed (optional, int) – The random noise matrix seed.
normalize (bool) – If normalize is True, the data will be centered and normalized before the corruption steps.
PCA_transform (bool) – If PCA_transform is True, a PCA transformation has been applied to the data.
PCA_info (None, dictionary) – If PCA_transform is True, PCA_info contains information about the reduction. Otherwise, it is None.
PCA_operator (PcaHandler) – The PCA operator.
verbose (bool) – If True, information will be displayed. Default is True.
-
__init__
(key, hsdata, scan=None, modif_file=None, sigma=None, seed=None, normalize=True, PCA_transform=False, PCA_th='auto', verbose=True)¶ Dev3D __init__ function.
- Parameters
key (str) – 1-word description of the Dev3D image. Generally, it’s common to the stem acquisition object.
hsdata (Signal1D hyperspy data) – The noise-free Dev3D image data. Its dimension is denoted (m, n, l).
scan (optional, None, Scan object) – The sampling scan object associated with the data. Default is None for full sampling.
modif_file (optional, None, str) – A .conf configuration file to remove rows, columns or dead pixels. Default is None for no modification.
sigma (optional, None, float) – The desired standard deviation used to model noise. Default is None for no additional noise.
seed (optional, None, int) – The random noise matrix seed. Default is None for no seed initialization.
normalize (optional, bool) – If normalize is True, the data will be centered and normalized before the corruption steps. Default is True.
PCA_transform (optional, bool) – If PCA_transform is True, a PCA transformation is applied to the data. Default is False.
PCA_th (optional, str, int) –
The desired data dimension after dimension reduction. Possible values are:
’auto’ for automatic choice,
’max’ for maximum value
an int value for user value.
Default is ‘auto’.
verbose (optional, bool) – If True, information will be displayed. Default is True.
-
direct_transform
(data)¶ Applies the Dev3D PCA transformation and normalization steps to data.
- Parameters
data ((m, n, l) numpy array, hs image) – Data whose shape is the same as self.data.
-
inverse_transform
(data)¶ Applies the Dev3D PCA inverse transformation and inverse normalization steps to data.
- Parameters
data ((m, n, l) numpy array, hs image) – Data whose shape is the same as self.data.
-
restore
(method='interpolation', parameters={}, PCA_transform=None, PCA_th='auto', verbose=None)¶ Restores the acquisition.
-
show_sum
(noised=False)¶ Shows the sum of the data along the last axis.
- Parameters
noised (optional, bool) – If True, the noised data is used. If False, the noise-free data is shown. Default is False.
-
plot_as2D
(noised=False)¶ Implements the HyperSpy tool to visualize the image for a given band.
- Parameters
noised (optional, bool) – If True, the noised data is used. If False, the noise-free data is shown. Default is False.
-
plot_as1D
(noised=False)¶ Implements the HyperSpy tool to visualize the spectrum for a given pixel.
- Parameters
noised (optional, bool) – If True, the noised data is used. If False, the noise-free data is shown. Default is False.
-
plot_roi
(noised=False)¶ Implements the HyperSpy tool to analyse regions of interest.
- Parameters
noised (optional, bool) – If True, the noised data is used. If False, the noise-free data is shown. Default is False.
dataset module¶
This module defines important tools to import data from datasets.
Data path functions¶
-
inpystem.dataset.
read_data_path
()¶ Read the saved data folder path.
The inpystem library proposes to store all data in a particular directory with associated configuration files. This folder is saved in inpystem. To access this folder path, use this function.
If no data path is saved, the function returns None. Else, the path is returned.
- Returns
None is returned if no path is saved. Else, the data path is returned.
- Return type
None, str
-
inpystem.dataset.
set_data_path
(path)¶ Sets the saved data folder path.
The inpystem library proposes to store all data in a particular directory with associated configuration files. This folder is saved in inpystem. To set this folder path, use this function.
A boolean is returned to confirm that the change is effective.
- Parameters
path (str) – The desired data path.
- Returns
If the data path has really been changed, the function returns True. Else, it returns False.
- Return type
bool
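For illustration (the folder path below is hypothetical):
>>> import inpystem
>>> inpystem.dataset.set_data_path('/path/to/my/data')
True
>>> inpystem.dataset.read_data_path()
'/path/to/my/data'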
Load functions¶
-
inpystem.dataset.
load_file
(file, ndim, scan_ratio=None, scan_seed=None, dev=None, verbose=True)¶ This function loads a STEM acquisition based on a configuration .conf file path.
The number of dimensions ndim should also be given.
The Path is generated from a scan file given in the configuration file or is randomly drawn. Whatever the case, the Scan object ratio property can be set through the scan_ratio argument. Additionally, in the case where no file is provided for the scan pattern, use the scan_seed argument to have reproducible data.
The function allows the user to ask for development data by setting the dev argument. If dev is None, then the usual Stem2D and Stem3D classes are returned. If dev is a dictionary, then Dev2D and Dev3D classes are returned. This dictionary could contain additional class arguments such as:
snr, seed and normalize for Dev2D,
snr, seed, normalize, PCA_transform and PCA_th for Dev3D.
- Parameters
file (str) – The configuration file path.
ndim (int) – The data dimension. Should be 2 or 3.
scan_ratio (optional, None, float) – The Path object ratio. Default is None for full sampling.
scan_seed (int) – The seed in case of random scan initialization. Default is None for random seed.
dev (optional, None, dictionary) – This argument allows the user to ask for development data. If this is None, usual data is returned. If this argument is a dictionary, then development data will be returned and the dictionary will be given to the data constructors. Default is None for usual data.
verbose (optional, bool) – If True, information will be sent to standard output. Default is True.
- Returns
The inpystem data.
- Return type
Stem2D, Stem3D, Dev2D or Dev3D
Todo
Maybe enable PCA_th in config file for 3D data.
-
inpystem.dataset.
load_key
(key, ndim, scan_ratio=None, scan_seed=None, dev=None, verbose=True)¶ This function loads a STEM acquisition based on a key.
A key is a string which can be:
an example data name,
the name of some data located in the inpystem data path (which is defined with the inpystem.dataset.set_data_path() function).
The key should always be the name of the configuration file without the suffix (.conf). As an example, if a configuration file located in the data folder is named my-sample.conf, then its data could be loaded with the my-sample key.
The number of dimensions ndim should also be given.
The Path is generated from a scan file given in the configuration file or is randomly drawn. Whatever the case, the Scan object ratio property can be set through the scan_ratio argument. Additionally, in the case where no file is provided for the scan pattern, use the scan_seed argument to have reproducible data.
The function allows the user to ask for development data by setting the dev argument. If dev is None, then the usual Stem2D and Stem3D classes are returned. If dev is a dictionary, then Dev2D and Dev3D classes are returned. This dictionary could contain additional class arguments such as:
snr, seed, normalize and verbose for Dev2D,
snr, seed, normalize, PCA_transform, PCA_th and verbose for Dev3D.
This function only searches for the configuration file to use the load_file function afterwards.
- Parameters
key (str) – The data key.
ndim (int) – The data dimension. Should be 2 or 3.
scan_ratio (optional, None, float) – The Path object ratio. Default is None for full sampling.
scan_seed (int) – The seed in case of random scan initialization. Default is None for random seed.
dev (optional, None, dictionary) – This argument allows the user to ask for development data. If this is None, usual data is returned. If this argument is a dictionary, then development data will be returned and the dictionary will be given to the data constructors. Default is None for usual data.
verbose (optional, bool) – If True, information will be sent to standard output. Default is True.
- Returns
The inpystem data.
- Return type
Stem2D, Stem3D, Dev2D or Dev3D
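For illustration, assuming a my-sample.conf file for 3D data lies in the data folder (as in the example above), a partially sampled acquisition could be loaded as:
>>> import inpystem
>>> data = inpystem.dataset.load_key('my-sample', ndim=3, scan_ratio=0.2, scan_seed=0)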
inpystem.tools subpackage¶
This introduces some tools for inpystem.
inpystem.tools.PCA module¶
This module implements tools to perform PCA transformation.
The main element is the PcaHandler class which is a user interface. It performs direct and inverse PCA transformation for 3D data.
Dimension_Reduction is the background function which performs PCA while the EigenEstimate function improves the estimation of PCA eigenvalues.
The PcaHandler interface¶
-
class
inpystem.tools.PCA.
PcaHandler
(Y, mask=None, PCA_transform=True, PCA_th='auto', verbose=True)¶ Interface to perform PCA.
The PCA is applied at class initialization based on the input data. This same operation can be applied afterwards to other data using the direct and inverse methods.
- Variables
Y ((m, n, l) numpy array) – Multi-band data.
Y_PCA ((m, n, PCA_th) numpy array) – The data in PCA space.
mask (optional, (m, n) numpy array) – Spatial sampling mask. Default is full sampling.
PCA_transform (optional, bool) – Flag that sets if PCA should really be applied. This is useful in some cases where PCA has already been applied. Default is True.
verbose (optional, bool) – If True, information is sent to output.
H ((l, PCA_th) numpy array) – The subspace base.
Ym ((m, n, l) numpy array) – Matrix whose spectra are all composed of the data spectral mean.
PCA_th (int) – The estimated data dimension.
InfoOut (dict) – The dictionary containing additional information about the reduction. See Note.
Note
The InfoOut dictionary contains the following keys:
‘H’ which is the base of the reduced subspace. Its shape is (l, PCA_th) where PCA_th is the estimated data dimension.
‘d’ which is the evolution of the PCA-eigenvalues after estimation.
‘PCA_th’ which is the estimated data dimension.
‘sigma’ which is the estimated Gaussian noise standard deviation.
‘Ym’ which is a (m, n, l) numpy array where the data mean over bands is repeated for each spatial location.
-
__init__
(Y, mask=None, PCA_transform=True, PCA_th='auto', verbose=True)¶ PcaHandler constructor.
- Parameters
Y ((m, n, l) numpy array) – Multi-band data.
mask ((m, n) numpy array) – Spatial sampling mask.
PCA_transform (optional, bool) – Flag that sets if PCA should really be applied. This is useful in some cases where PCA has already been applied. Default is True.
verbose (optional, bool) – If True, information is sent to output.
-
direct
(X=None)¶ Performs direct PCA transformation.
The input X array can be data to project into the PCA subspace or None. If the input is None (which is the default), the output will simply be self.Y_PCA.
Caution
The input data to transform should have the same shape as the Y initial data.
- Parameters
X ((m, n, l) numpy array) – The data to transform into PCA space.
- Returns
Multi-band data in reduced space.
- Return type
(m, n, PCA_th) numpy array
-
inverse
(X_PCA)¶ Performs inverse PCA transformation.
Caution
The input data to transform should have the same shape as the self.Y_PCA transformed data.
- Parameters
X_PCA ((m, n, PCA_th) numpy array) – The data to transform into data space.
- Returns
Multi-band data after inverse transformation.
- Return type
(m, n, l) numpy array
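As a minimal sketch with toy data, the handler computes the transformation at initialization and exposes it through direct and inverse:
>>> import numpy as np
>>> from inpystem.tools.PCA import PcaHandler
>>> Y = np.random.rand(32, 32, 100)            # toy multi-band data
>>> handler = PcaHandler(Y, PCA_th=5, verbose=False)
>>> Y_PCA = handler.direct()                   # equals handler.Y_PCA, shape (32, 32, 5)
>>> Y_back = handler.inverse(Y_PCA)            # back to shape (32, 32, 100)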
Background functions¶
-
inpystem.tools.PCA.
Dimension_Reduction
(Y, mask=None, PCA_th='auto', verbose=True)¶ Reduces the dimension of a multi-band image.
- Parameters
Y ((m, n, l) numpy array) – The multi-band image where the last axis is the spectral one.
mask (optional, (m, n) numpy array) – The spatial sampling mask filled with True where pixels are sampled. This is used to correctly remove the data mean. Default is a matrix full of True.
PCA_th (optional, str, int) – The PCA threshold. ‘auto’ for automatic estimation. ‘max’ to keep all components. An integer to choose the threshold. In case there are fewer samples (N) than the data dimension (l), this parameter is overridden to keep a threshold of N-1.
verbose (optional, bool) – Prints output if True. Default is True.
- Returns
(m, n, PCA_th) numpy array – The data in the reduced subspace. Its shape is (m, n, PCA_th) where PCA_th is the estimated data dimension.
dict – The dictionary containing additional information about the reduction. See Note.
Note
The InfoOut dictionary contains the following keys:
‘H’ which is the base of the reduced subspace. Its shape is (l, PCA_th) where PCA_th is the estimated data dimension.
‘d’ which is the evolution of the PCA-eigenvalues after estimation.
‘PCA_th’ which is the estimated data dimension.
‘sigma’ which is the estimated Gaussian noise standard deviation.
‘Ym’ which is a (m, n, l) numpy array where the data mean over bands is repeated for each spatial location.
-
inpystem.tools.PCA.
EigenEstimate
(l, Ns)¶ Computes an estimate of the covariance eigenvalues given the sample covariance eigenvalues. The Stein estimator coupled with isotonic regression has been used here.
For more information, have a look at:
MESTRE, Xavier. Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates. IEEE Transactions on Information Theory, 2008, vol. 54, no 11, p. 5113-5129.
- Parameters
l (numpy array) – Sample eigenvalues
Ns (int) – Number of observations
- Returns
numpy array – Estimated covariance matrix eigenvalues.
float – Estimated Gaussian noise standard deviation.
int – Estimated dimension of the signal subspace.
inpystem.tools.FISTA module¶
This module implements interfacing tools for the FISTA algorithm.
For further information about the FISTA algorithm, have a look at 1.
- 1
BECK, Amir et TEBOULLE, Marc. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM journal on imaging sciences, 2009, vol. 2, no 1, p. 183-202.
-
class
inpystem.tools.FISTA.
FISTA
(f, df, L, g, pg, shape, Nit=None, init=None, verbose=True)¶ Fast Iterative Shrinkage-Thresholding Algorithm implementation.
- Variables
f (function) – \(C^{1,1}\) convex function.
df (function) – derivative function of f.
L (float) – Lipschitz constant of f.
g (function) – Non-smooth function.
pg (function) – Proximal operator of g.
shape (tuple) – The data shape.
Nit (None, int) – Number of iterations. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
init (numpy array) – Initial point whose shape is the same as the data. If None, a random initialization is drawn. Default is None.
verbose (bool) – If True, process information is sent to the output. Default is True.
Nit_max (int) – Maximum number of iterations.
tau (float) – Descent step.
E (numpy array) – Functional evolution across iterations.
lim (float) – Controls the stop condition in case Nit is None. The smaller lim is, the more iterations before stopping. lim is usually 1e-4.
-
__init__
(f, df, L, g, pg, shape, Nit=None, init=None, verbose=True)¶ Initialization function for FISTA.
- Parameters
f (function) – \(C^{1,1}\) convex function.
df (function) – derivative function of f.
L (float) – Lipschitz constant of f.
g (function) – Non-smooth function.
pg (function) – Proximal operator of g.
shape (tuple) – The data shape.
Nit (None, int) – Number of iterations. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
init (numpy array) – Initial point whose shape is the same as the data. If None, a random initialization is drawn. Default is None.
verbose (bool) – If True, process information is sent to the output. Default is True.
-
StopCritera
(n)¶ This function computes a criterion that informs about the algorithm convergence at step n.
- Parameters
n (int) – Current step
- Returns
Value of the criterion.
- Return type
float
-
StopTest
(n)¶ This function chooses whether iterations should be stopped at step n. If Nit is not None, it returns True as long as n is smaller than Nit. If Nit is None, it returns True as long as the functional is evolving fast.
- Parameters
n (int) – Current step.
- Returns
Should the iterations go on?
- Return type
bool
-
execute
()¶ Method that executes the FISTA algorithm.
- Returns
numpy array – The optimum of the optimization problem.
dict – Extra informations about convergence.
Note
Infos in output dictionary:
E: Evolution of the functional along the iterations.
time: Execution time.
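For illustration, here is a minimal sketch that minimizes \(\frac{1}{2}||x - y||_2^2 + \lambda ||x||_1\), whose proximal operator is soft thresholding. The exact call convention expected for pg is an assumption here (the step \(1/L = 1\) is folded into the operator):
>>> import numpy as np
>>> from inpystem.tools.FISTA import FISTA
>>> y = np.random.randn(64, 64)                 # toy noisy image
>>> lam = 0.1
>>> f = lambda x: 0.5 * np.sum((x - y)**2)      # smooth data-fit term
>>> df = lambda x: x - y                        # its gradient, Lipschitz constant L=1
>>> g = lambda x: lam * np.sum(np.abs(x))       # non-smooth term
>>> pg = lambda x: np.sign(x) * np.maximum(np.abs(x) - lam, 0)  # soft thresholding
>>> solver = FISTA(f, df, 1., g, pg, shape=y.shape, Nit=100, verbose=False)
>>> xhat, info = solver.execute()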
inpystem.tools.matlab_interface module¶
This module defines an interface to run matlab codes from python.
-
inpystem.tools.matlab_interface.
matlab_interface
(program, dataDico)¶ Interfaces a Matlab code with Python 3.
The function needs a Matlab program to run and input data to be given to the Matlab program.
The input data should be given in dictionary format where keys are the matlab variable names and values are the variable data.
- Parameters
program (str, Path object) – The program path.
dataDico (dict) – The dictionary containing the data to give to the program.
- Returns
The data returned by the program.
- Return type
dict
Note
A matlab command should be accessible in the command line to make this code work.
If this does not work, please be sure the PATH variable is perfectly set. For example, please add this to your .bashrc for Linux users:
.bashrc¶
export PATH=$PATH:/path/to/matlab/bin
and for Windows users, please search for how to add a location to your PATH (this is a graphical task).
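For illustration (double_it.m is a hypothetical Matlab script that reads the variable A and returns B = 2*A):
>>> import numpy as np
>>> from inpystem.tools.matlab_interface import matlab_interface
>>> out = matlab_interface('double_it.m', {'A': np.eye(2)})
>>> out['B']                                    # hypothetical returned variable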
inpystem.tools.dct module¶
This module defines some functions related to DCT decomposition including:
direct and inverse normalized 2D DCT transform,
direct and inverse band-by-band DCT transform for multi-band data.
2D transformations¶
-
inpystem.tools.dct.
dct2d
(a)¶ Computes the 2D Normalized DCT-II.
- Parameters
a ((m, n) numpy array) – 2D image.
- Returns
DCT coefficient matrix.
- Return type
(m, n) numpy array
-
inpystem.tools.dct.
idct2d
(a)¶ Computes the 2D Normalized Inverse DCT-II.
- Parameters
a ((m, n) numpy array) – DCT coefficient matrix.
- Returns
2D image.
- Return type
(m, n) numpy array
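As the transform pair is normalized, it is a perfect inverse; a minimal sketch:
>>> import numpy as np
>>> from inpystem.tools.dct import dct2d, idct2d
>>> a = np.random.rand(16, 16)
>>> A = dct2d(a)                  # DCT coefficient matrix
>>> np.allclose(idct2d(A), a)
True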
Band-by-band transformations¶
-
inpystem.tools.dct.
dct2d_bb
(x, shape=None)¶ Computes the band-by-band 2D Normalized DCT-II.
If the input x is a 3D data cube, the 2D DCT will be computed for each band.
- Parameters
x ((l, m*n) or (m, n, l) numpy array) – 2D or 3D multi-band data. If the data has 3 dimensions, the last axis is for spectra. If the data is 2D, the first axis is for spectra.
shape (optional, (m, n, l) tuple) – This is the data shape. This parameter is required only if input data are 2D.
- Returns
DCT coefficient matrix.
- Return type
(l, m*n) or (m, n, l) numpy array
-
inpystem.tools.dct.
idct2d_bb
(a, shape=None)¶ Computes the band-by-band inverse 2D Normalized DCT-II.
If the input a is a 3D data cube, the 2D inverse DCT will be computed for each band.
- Parameters
a ((l, m*n) or (m, n, l) numpy array) – 2D or 3D multi-band data DCT decomposition. If the data has 3 dimensions, the last axis is for spectra. If the data is 2D, the first axis is for spectra.
shape (optional, (m, n, l) tuple) – This is the data shape. This parameter is required only if input data are 2D.
- Returns
The image matrix.
- Return type
(l, m*n) or (m, n, l) numpy array
inpystem.tools.sec2str module¶
This small module only contains sec2str, which is a function to display time in a human-readable format.
-
inpystem.tools.sec2str.
sec2str
(t)¶ Returns a human-readable time str from a duration in s.
- Parameters
t (float) – Duration in seconds.
- Returns
Human-readable time str.
- Return type
str
Example
>>> from inpystem.tools.sec2str import sec2str
>>> sec2str(5.2056)
'5.21s'
>>> sec2str(3905)
'1h 5m 5s'
inpystem.tools.metrics module¶
This module contains several metric functions.
-
inpystem.tools.metrics.
SNR
(xhat, xref)¶ Computes the SNR metric.
- Parameters
xhat (numpy array) – The noised data.
xref (numpy array) – The noise-free image.
- Returns
The SNR value in dB.
- Return type
float
-
inpystem.tools.metrics.
NMSE
(xhat, xref)¶ Computes the normalized mean square error metric.
- Parameters
xhat (numpy array) – The noised data.
xref (numpy array) – The noise-free image.
- Returns
The NMSE value.
- Return type
float
-
inpystem.tools.metrics.
aSAD
(xhat, xref)¶ Computes the averaged Spectral Angle Distance metric.
The input data number of dimensions can be:
1: the data are spectra,
2: the data are matrices of shape (n, M),
3: the data are matrices of shape (m, n, M)
where M is the spectrum size.
- Parameters
xhat (numpy array) – The noised data.
xref (numpy array) – The noise-free image.
- Returns
The (mean) aSAD value.
- Return type
float
-
inpystem.tools.metrics.
SSIM
(xhat, xref)¶ Computes the structural similarity index.
- Parameters
xhat (numpy array) – The noised data.
xref (numpy array) – The noise-free image.
- Returns
The (mean) SSIM value.
- Return type
float
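A minimal sketch with toy data (a lightly corrupted copy of a reference image):
>>> import numpy as np
>>> from inpystem.tools import metrics
>>> xref = np.random.rand(32, 32)
>>> xhat = xref + 0.01 * np.random.randn(32, 32)
>>> snr = metrics.SNR(xhat, xref)     # in dB; high for light noise
>>> nmse = metrics.NMSE(xhat, xref)   # close to zero for light noise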
inpystem.tools.misc module¶
This module defines some miscellaneous functions.
-
inpystem.tools.misc.
toslice
(text=None, length=None)¶ Parses a string into a slice.
Input strings can be e.g. ‘5:10’, ‘:10’, ‘1:’. Negative limits are allowed only if the data length is given. In such case, input strings can be e.g. ‘1:-10’. Last, an integer can be given alone such as ‘1’ to select only the 1st element.
If neither text nor length is given, the default slice is slice(None).
- Parameters
text (optional, None, str) – The input text to parse. Default is None.
length (None, int) – The data length. This is not mandatory if no slice limit is negative. Default is None.
- Returns
The parsed slice object.
- Return type
slice
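For illustration (the exact repr of the returned slices is indicative):
>>> from inpystem.tools.misc import toslice
>>> toslice('5:10')
slice(5, 10, None)
>>> toslice(':10')
slice(None, 10, None)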
inpystem.restore subpackage¶
This package implements a variety of reconstruction algorithms for 2D as well as 3D data.
These methods include:
interpolation methods that are known to be fast but with low-quality results,
regularized least-square methods which are slower than interpolation but with higher quality,
dictionary-learning methods which are very efficient at the price of long reconstruction procedures.
inpystem.restore.interpolation module¶
This module implements the interpolate function, which is an interface between 2D or 3D data and the interpolation functions of scipy.
-
inpystem.restore.interpolation.
interpolate
(Y, mask=None, method='nearest', PCA_transform=True, PCA_th='auto', verbose=True)¶ Implements data interpolation.
Three interpolation methods are implemented: nearest neighbor, linear interpolation and cubic interpolation.
Note that cubic interpolation is performed band-by-band in the case of 3D data while other methods perform in 3D directly.
- Parameters
Y ((m, n) or (m, n, l) numpy array) – Input data
mask (optional, None, (m, n) numpy array) – Sampling mask. True for sampled pixel. Default is None for full sampling.
method (optional, 'nearest', 'linear' or 'cubic') – Interpolation method. Default is ‘nearest’.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(m, n) or (m, n, l) numpy array – Interpolated data.
float – Execution time (s).
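A minimal sketch with toy data and a random sampling mask:
>>> import numpy as np
>>> from inpystem.restore.interpolation import interpolate
>>> Y = np.random.rand(64, 64)
>>> mask = np.random.rand(64, 64) < 0.2        # ~20% sampled pixels
>>> Xhat, t = interpolate(Y, mask=mask, method='nearest')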
inpystem.restore.LS_2D module¶
This module gathers regularized least square restoration methods adapted to 2D data.
The only method it implements for the moment is the L1-LS algorithm.
-
inpystem.restore.LS_2D.
L1_LS
(Y, Lambda, mask=None, init=None, Nit=None, verbose=True)¶ L1-LS algorithm.
The L1-LS algorithm denoises or reconstructs an image possibly spatially sub-sampled in the case of spatially sparse content in the DCT basis. It is well adapted to periodic data.
This algorithm solves the following regularization problem:
\[\gdef \x {\mathbf{x}} \gdef \y {\mathbf{y}} \hat{\x} = \mathrm{arg}\min_{ \x\in\mathbb{R}^{m \times n} } \frac{1}{2} ||(\x-\y)\cdot \Phi||_F^2 + \lambda ||\x\Psi||_1\]where \(\mathbf{y}\) are the corrupted data, \(\Phi\) is a subsampling operator and \(\Psi\) is a 2D DCT operator.
Caution
It is strongly recommended to remove the mean before reconstruction. Otherwise, this value could be lost automatically by the algorithm in case of powerful high frequencies.
In the same way, normalizing the data is good practice to make the parameter less sensitive to the data.
These two operations are implemented in this function.
- Parameters
(m, n) numpy array (Y) – An image whose mean has been removed.
Lambda (float) – Regularization parameter.
mask (optional, None, (m, n) numpy array) – A sampling mask which is True if the pixel is sampled. Default is None for full sampling.
init (optional, None, (m, n, l) numpy array) – The algorithm initialization. Default is None for random initialization.
Nit (optional, None, int) – Number of iterations in case of inpainting. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(m, n) numpy array – The reconstructed/denoised image.
dict – A dictionary containing some extra info
Note
Info in the output dictionary:
E: In the case of partial reconstruction, the cost function evolution over iterations.
Gamma: The array of kept coefficients (order is Fortran-style).
nnz_ratio: the ratio Gamma.size/(m*n).
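A minimal sketch with toy data (the mean is removed first, as recommended above; Lambda is an illustrative value):
>>> import numpy as np
>>> from inpystem.restore.LS_2D import L1_LS
>>> Y = np.random.rand(64, 64)
>>> Y = Y - Y.mean()                           # remove the mean
>>> mask = np.random.rand(64, 64) < 0.3        # ~30% sampled pixels
>>> Xhat, info = L1_LS(Y, Lambda=0.1, mask=mask, Nit=50, verbose=False)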
inpystem.restore.LS_3D module¶
This module implements regularized least square restoration methods adapted to 3D data.
The two methods it gathers are
Smoothed SubSpace (3S) algorithm,
Smoothed Nuclear Norm (SNN) algorithm.
-
inpystem.restore.LS_3D.
SSS
(Y, Lambda, mask=None, PCA_transform=True, PCA_th='auto', PCA_info=None, scale=1, init=None, Nit=None, verbose=True)¶ Smoothed SubSpace algorithm.
The 3S algorithm denoises or reconstructs a multi-band image possibly spatially sub-sampled in the case of spatially smooth images. It is well adapted to intermediate scale images.
This algorithm performs a PCA pre-processing operation to estimate:
the data subspace basis \(\mathbf{H}\),
the subspace dimension \(R\),
the associated eigenvalues in decreasing order \(\mathbf{d}\),
the noise level \(\hat{\sigma}\).
After this estimation step, the algorithm solves the following regularization problem in the PCA space:
\[\gdef \S {\mathbf{S}} \gdef \Y {\mathbf{Y}} \gdef \H {\mathbf{H}} \gdef \I {\mathcal{I}} \begin{aligned} \hat{\S} &= \underset{\S\in\mathbb{R}^{m \times n \times R}}{\arg\min} \frac{1}{2R}\left\|\S \mathbf{D}\right\|_\mathrm{F}^2 + \frac{\lambda}{2}\sum_{m=1}^{R} w_{m} |\S_{m,:}|_2^2\\ &\textrm{s.t.}\quad \frac{1}{R}|\H_{1:R}^T\Y_{\I(n)}-\S_{\mathcal{I}(n)}|^2_2 \leq\hat{\sigma}^2,\ \forall n \in \{1, \dots,\ m*n\} \end{aligned}\]where \(\mathbf{Y}\) are the corrupted data, \(\mathbf{D}\) is a spatial finite difference operator and \(\mathcal{I}\) is the set of all sampled pixels.
Caution
It is strongly recommended to perform PCA before running the algorithm core. This operation is integrated in this function.
In case this pre-processing step has already been done, set the PCA_transform parameter to False to disable the PCA step included in the SSS function. If PCA_transform is set to False, the PCA_info parameter is required.
- Parameters
(m, n, l) numpy array (Y) – A 3D multi-band image.
Lambda (float) – Regularization parameter.
mask (optional, None, (m, n) numpy array) – A sampling mask which is True if the pixel is sampled. Default is None for full sampling.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
PCA_info (optional, dict) –
In case PCA_transform is False, some extra info should be given to SSS. The required keys for PCA_info are:
’d’ which are the PCA eigenvalues.
’sigma’ which is an estimate of the data noise std.
scale (optional, float) – Scales the prox operator sphere radius. Should lie in ]0, +inf[. Default is 1.
init (optional, None, (m, n, l) numpy array) – The algorithm initialization. Default is None for random initialization.
Nit (optional, None, int) – Number of iterations in case of inpainting. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(m, n, l) numpy array – The reconstructed/denoised multi-band image.
dict – A dictionary containing some extra info
Note
Info in the output dictionary:
E: in the case of partial reconstruction, the cost function evolution over iterations.
H: the basis of the chosen signal subspace.
References
Monier, E., Oberlin, T., Brun, N., Tencé, M., de Frutos, M., & Dobigeon, N. (2018). Reconstruction of Partially Sampled Multiband Images—Application to STEM-EELS Imaging. IEEE Trans. Comput. Imag., 4(4), 585–598.
-
inpystem.restore.LS_3D.
SNN
(Y, Lambda, Mu, mask=None, PCA_transform=True, PCA_th='auto', init=None, Nit=None, verbose=True)¶ Smoothed Nuclear Norm algorithm.
The SNN algorithm denoises or reconstructs a multi-band image possibly spatially sub-sampled in the case of spatially smooth images. It is well adapted to intermediate scale images.
This algorithm solves the following optimization problem:
\[\gdef \X {\mathbf{X}} \gdef \Y {\mathbf{Y}} \gdef \H {\mathbf{H}} \gdef \I {\mathcal{I}} \hat{\X} = \underset{\X\in\mathbb{R}^{m \times n \times B}}{\arg\min} \frac{1}{2}||\Y_\I - \X_\I||_\mathrm{F}^2 + \frac{\lambda}{2}\left\|\X \mathbf{D}\right\|_\mathrm{F}^2 + \mu ||\X||_*\]where \(\mathbf{Y}\) are the corrupted data, \(\mathbf{D}\) is a spatial finite difference operator and \(\mathcal{I}\) is the set of all sampled pixels.
This algorithm can perform a PCA pre-processing operation to estimate:
the data subspace basis \(\mathbf{H}\),
the subspace dimension \(R\).
This is particularly useful to reduce the data dimension and the execution time and to impose a data low-rank property.
Caution
It is strongly recommended to perform PCA before running the algorithm core. This operation is integrated in this function.
In case this pre-processing step has already been done, set the PCA_transform parameter to False to disable the PCA step included in the SNN function.
- Parameters
(m, n, l) numpy array (Y) – A 3D multi-band image.
Lambda (float) – Regularization parameter #1.
Mu (float) – Regularization parameter #2.
mask (optional, None, (m, n) numpy array) – A sampling mask which is True if the pixel is sampled. Default is None for full sampling.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
init (optional, None, (m, n, l) numpy array) – The algorithm initialization. Default is None for random initialization.
Nit (optional, None, int) – Number of iterations in case of inpainting. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(m, n, l) numpy array – The reconstructed/denoised multi-band image.
dict – A dictionary containing some extra info
Note
Info in the output dictionary:
E: in the case of partial reconstruction, the cost function evolution over iterations.
H: the basis of the chosen signal subspace.
References
Monier, E., Oberlin, T., Brun, N., Tencé, M., de Frutos, M., & Dobigeon, N. (2018). Reconstruction of Partially Sampled Multiband Images—Application to STEM-EELS Imaging. IEEE Trans. Comput. Imag., 4(4), 585–598.
inpystem.restore.LS_CLS module¶
This module implements regularized least square restoration methods adapted to 3D data.
The two methods it gathers are
Cosine Least Square (CLS) algorithm,
Post-LS Cosine Least Square (Post_LS_CLS) algorithm.
-
inpystem.restore.LS_CLS.
CLS
(Y, Lambda, mask=None, PCA_transform=True, PCA_th='auto', init=None, Nit=None, verbose=True)¶ Cosine Least Square algorithm
The CLS algorithm denoises or reconstructs a multi-band image possibly spatially sub-sampled in the case of spatially sparse content in the DCT basis. It is well adapted to periodic data.
This algorithm solves the following optimization problem:
\[\gdef \X {\mathbf{X}} \gdef \Y {\mathbf{Y}} \gdef \H {\mathbf{H}} \gdef \I {\mathcal{I}} \hat{\X} = \underset{\X\in\mathbb{R}^{m \times n \times B}}{\arg\min} \frac{1}{2}||\Y_\I - \X_\I||_\mathrm{F}^2 + \lambda ||\X \Psi||_{2, 1}\]where \(\mathbf{Y}\) are the corrupted data, \(\Psi\) is a 2D DCT operator and \(\mathcal{I}\) is the set of all sampled pixels.
This algorithm can perform a PCA pre-processing operation to estimate:
the data subspace basis \(\mathbf{H}\),
the subspace dimension \(R\).
This is particularly useful to reduce the data dimension and the execution time and to impose a data low-rank property.
Caution
It is strongly recommended to perform PCA before running the algorithm core. This operation is integrated in this function.
In case this pre-processing step has already been done, set the PCA_transform parameter to False to disable the PCA step included in the CLS function. If PCA_transform is set to False, the PCA_info parameter is required.
- Parameters
(m, n, l) numpy array (Y) – A 3D multi-band image.
Lambda (float) – Regularization parameter.
mask (optional, None, (m, n) numpy array) – A sampling mask which is True if the pixel is sampled. Default is None for full sampling.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
init (optional, None, (m, n, l) numpy array) – The algorithm initialization. Default is None for random initialization.
Nit (optional, None, int) – Number of iterations in case of inpainting. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(m, n, l) numpy array – The reconstructed/denoised multi-band image.
dict – A dictionary containing some extra info
Note
Info in the output dictionary:
E: In the case of partial reconstruction, the cost function evolution over iterations.
Gamma: The array of kept coefficients (order is Fortran-style).
nnz_ratio: the ratio Gamma.size/(m*n).
H: the basis of the chosen signal subspace.
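A minimal sketch with toy multi-band data (Lambda and PCA_th are illustrative values):
>>> import numpy as np
>>> from inpystem.restore.LS_CLS import CLS
>>> Y = np.random.rand(32, 32, 50)             # toy multi-band image
>>> mask = np.random.rand(32, 32) < 0.3        # ~30% sampled pixels
>>> Xhat, info = CLS(Y, Lambda=0.1, mask=mask, PCA_th=5, Nit=50, verbose=False)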
-
inpystem.restore.LS_CLS.
Post_LS_CLS
(Y, Lambda, mask=None, PCA_transform=True, PCA_th='auto', init=None, Nit=None, verbose=True)¶ Post-Lasso CLS algorithm.
This algorithm consists in applying CLS to restore the data and determine the data support in the DCT basis. A post-least square optimization is performed to reduce the coefficients bias.
- Parameters
(m, n, l) numpy array (Y) – A 3D multi-band image.
Lambda (float) – Regularization parameter.
mask (optional, None, (m, n) numpy array) – A sampling mask which is True if the pixel is sampled. Default is None for full sampling.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
init (optional, None, (m, n, l) numpy array) – The algorithm initialization. Default is None for random initialization.
Nit (optional, None, int) – Number of iterations in case of inpainting. If None, the iterations will stop as soon as the functional no longer evolves. Default is None.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(m, n, l) numpy array – The reconstructed/denoised multi-band image.
tuple – A 2-tuple whose elements are the CLS and refitting information dictionaries.
Note
Info in the output dictionary:
E_CLS: In the case of partial reconstruction, the cost function evolution over iterations.
E_post_ls: In the case of partial reconstruction, the cost function evolution over iterations.
Gamma: The array of kept coefficients (order is Fortran-style).
nnz_ratio: the ratio Gamma.size/(m*n).
H: the basis of the chosen signal subspace.
inpystem.restore.DL_ITKrMM module¶
This module implements the ITKrMM algorithm.
The ITKrMM algorithm¶
-
inpystem.restore.DL_ITKrMM.
ITKrMM
(Y, mask=None, P=5, K=None, L=1, S=None, Nit_lr=10, Nit=40, init_lr=None, init=None, CLS_init=None, PCA_transform=True, PCA_th='auto', verbose=True)¶ ITKrMM restoration algorithm.
- Parameters
Y ((m, n) or (m, n, l) numpy array) – The input data.
mask (optional, None or (m, n) numpy array) – The acquisition mask. Default is None for full sampling.
P (optional, int) – The width (or height) of the patch. Default is 5.
K (optional, int) – The dictionary dimension. Default is 128.
L (optional, int) – The number of low rank components to learn. Default is 1.
S (optional, int) – The code sparsity level. Default is 20.
Nit_lr (optional, int) – The number of iterations for the low rank estimation. Default is 10.
Nit (optional, int) – The number of iterations. Default is 40.
init ((P**2, K+L) or (P**2*l, K+L) numpy array) – Initialization dictionary.
CLS_init (optional, dict) – CLS initialization information. See Notes for details. Default is None.
xref (optional, (m, n) or (m, n, l) numpy array) – Reference image to compute error evolution. Default is None for input Y data.
verbose (optional, bool) – The verbose parameter. Default is True.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
- Returns
(m, n) or (m, n, l) numpy array – Restored data.
dict – Additional information. See Notes.
Notes
The algorithm can be initialized with CLS as soon as CLS_init is not None. In this case, CLS_init should be a dictionary containing the required Lambda key and possibly the init optional argument.
The output information keys are:
time: Execution time in seconds.
lrc: low rank component.
dico: Estimated dictionary.
E: Evolution of the error.
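A minimal sketch with toy 2D data (parameter values are illustrative; real runs usually need more iterations):
>>> import numpy as np
>>> from inpystem.restore.DL_ITKrMM import ITKrMM
>>> Y = np.random.rand(32, 32)
>>> mask = np.random.rand(32, 32) < 0.3        # ~30% sampled pixels
>>> Xhat, info = ITKrMM(Y, mask=mask, P=5, Nit=10, verbose=False)
>>> D = info['dico']                           # estimated dictionary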
The wKSVD algorithm¶
-
inpystem.restore.DL_ITKrMM.
wKSVD
(Y, mask=None, P=5, K=None, L=1, S=None, Nit_lr=10, Nit=40, init_lr=None, init=None, CLS_init=None, PCA_transform=True, PCA_th='auto', verbose=True)¶ wKSVD restoration algorithm.
- Parameters
Y ((m, n) or (m, n, l) numpy array) – The input data.
mask (optional, None or (m, n) numpy array) – The acquisition mask. Default is None for full sampling.
P (optional, int) – The width (or height) of the patch. Default is 5.
K (optional, int) – The dictionary dimension. Default is 128.
L (optional, int) – The number of low rank components to learn. Default is 1.
S (optional, int) – The code sparsity level. Default is 20.
Nit_lr (optional, int) – The number of iterations for the low rank estimation. Default is 10.
Nit (optional, int) – The number of iterations. Default is 40.
init ((P**2, K+L) or (P**2*l, K+L) numpy array) – Initialization dictionary.
CLS_init (optional, dict) – CLS initialization information. See Notes for details. Default is None.
xref (optional, (m, n) or (m, n, l) numpy array) – Reference image to compute error evolution. Default is None for input Y data.
verbose (optional, bool) – The verbose parameter. Default is True.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
- Returns
(m, n) or (m, n, l) numpy array – Restored data.
dict – Additional information. See Notes.
Notes
The algorithm can be initialized with CLS as soon as CLS_init is not None. In this case, CLS_init should be a dictionary containing the required Lambda key and possibly the init optional argument.
The output information keys are:
time: Execution time in seconds.
lrc: low rank component.
dico: Estimated dictionary.
E: Evolution of the error.
Patch manipulation functions¶
-
inpystem.restore.DL_ITKrMM.
forward_patch_transform
(ref, w)¶ Transforms data from 2D/3D array to array whose shape is (w**2, N) where w is the patch width and N is the number of patches.
- Parameters
ref ((m, n) or (m, n, l) numpy array) – The input image.
w (int) – The width (or height) of the patch.
- Returns
data – The patches stacked version. Its shape is (w**2, N) where N is the number of patches if ref is 2D, or (w**2*l, N) if ref is 3D.
- Return type
(w**2, N) or (w**2*l, N) numpy array
-
inpystem.restore.DL_ITKrMM.
inverse_patch_transform
(data, shape)¶ Transforms data from array of the form (w**2, N) or (w**2*l, N) where w is the patch width, l is the number of bands (in the case of 3D data) and N is the number of patches into 2D/3D array.
- Parameters
data ((w**2, N) or (w**2*l, N) numpy array) – The input data.
shape ((m, n) or (m, n, l)) – The image shape.
- Returns
ref – The reconstructed image.
- Return type
(m, n) or (m, n, l) numpy array
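A minimal sketch of the round trip between image and patch formats:
>>> import numpy as np
>>> from inpystem.restore.DL_ITKrMM import forward_patch_transform, inverse_patch_transform
>>> ref = np.random.rand(32, 32)
>>> data = forward_patch_transform(ref, 5)     # shape (25, N)
>>> rec = inverse_patch_transform(data, (32, 32))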
CLS initialization function¶
-
inpystem.restore.DL_ITKrMM.
CLS_init
(Y, Lambda, K=128, S=None, P=5, mask=None, PCA_transform=False, PCA_th='auto', init=None, verbose=True)¶ Dictionary learning initialization based on CLS restoration algorithm.
- Parameters
Y ((m, n, l) numpy array) – A 3D multi-band image.
Lambda (float) – Regularization parameter.
K (optional, int) – The dictionary size. Default is 128.
S (optional, int) – The code sparsity. Default is 0.1*P*l.
P (optional, int) – The patch size. Default is 5.
mask (optional, None, (m, n) boolean numpy array) – A sampling mask which is True if the pixel is sampled and False otherwise. Default is None for full sampling.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is False, as it should be done in the dictionary learning operator.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
init (optional, None, (m, n, l) numpy array) – The algorithm initialization. Default is None for random initialization.
verbose (optional, bool) – Indicates if information text is desired. Default is True.
- Returns
(K, l*P**2) numpy array – The dictionary for dictionary learning algorithm.
(K, l*P**2) numpy array – The sparse code for dictionary learning algorithm.
(m, n, l) numpy array – CLS restored array.
dict – Dictionary containing some extra info
Note
Info in the output dictionary:
E: In the case of partial reconstruction, the cost function evolution over iterations.
Gamma: The array of kept coefficients (order is Fortran-style).
nnz_ratio: the ratio Gamma.size/(m*n).
H: the basis of the chosen signal subspace.
The dictionary learning interface¶
-
class
inpystem.restore.DL_ITKrMM.
Dico_Learning_Executer
(Y, mask=None, P=5, K=None, L=1, S=None, Nit_lr=10, Nit=40, init_lr=None, init=None, CLS_init=None, PCA_transform=True, PCA_th='auto', verbose=True)¶ Class to define and execute dictionary learning algorithms.
The following class is a common code for most dictionary learning methods. It performs the following tasks:
reshapes the data in patch format,
performs low-rank component estimation,
starts the dictionary learning method,
reshapes the output data,
handles CLS initialization to speed up computation.
- Variables
Y ((m, n) or (m, n, l) numpy array) – The input data.
Y_PCA ((m, n) or (m, n, PCA_th) numpy array) – The input data in PCA space. Its value is Y if Y is 2D.
mask ((m, n) numpy array) – The acquisition mask.
P (int) – The width (or height) of the patch.
K (int) – The dictionary dimension. This dictionary is composed of L low-rank components and K-L non-low-rank components.
L (int) – The number of low rank components to learn.
S (int) – The code sparsity level.
Nit_lr (int) – The number of iterations for the low rank estimation.
Nit (int) – The number of iterations.
CLS_init (dict) – CLS initialization information.
verbose (bool) – The verbose parameter. Default is True.
mean_std (2-tuple) – Tuple of size 2 which contains the data mean and std.
data ((N, D) numpy array) – The Y data in patch format. N (resp. D) is the number of voxels per patch (resp. patches).
mdata ((N, D) numpy array) – The mask in patch format. N (resp. D) is the number of voxels per patch (resp. patches).
init_lr ((N, L) numpy array) – The low-rank estimation initialization in patch format. N is the number of voxels per patch.
init ((N, K-L) numpy array) – The dictionary-learning initialization in patch format. N is the number of voxels per patch.
PCA_operator (PcaHandler object) – The PCA operator.
Note
The algorithm can be initialized with CLS as soon as CLS_init is not None. In this case, CLS_init should be a dictionary containing the required Lambda key and possibly the CLS init optional argument.
The ITKrMM core¶
-
inpystem.restore.DL_ITKrMM.
rec_lratom
(data, masks=None, lrc=None, Nit=10, inatom=None, verbose=True)¶ Recovers a new low rank atom, equivalent to itkrmm with K = S = 1.
- Parameters
data ((d, N) numpy array) – The (corrupted) training signals as its columns.
masks ((d, N) numpy array) – Mask data as its columns. masks(.,.) in {0,1}. Default is masks = 1.
lrc ((d, n) numpy array) – Orthobasis for already recovered low rank component. Default is None.
Nit (int) – Number of iterations. Default is 10.
inatom ((d, ) numpy array) – Initialisation that should be normalized. Default is None for random.
verbose (bool) – If verbose is True, information is sent to the output. Default is True.
- Returns
atom – Estimated low rank component.
- Return type
(d, ) numpy array
-
inpystem.restore.DL_ITKrMM.
OMPm
(D, X, S, Masks=None)¶ Masked OMP.
This is a modified version of OMP to account for corruptions in the signal.
Consider some input data \(\mathbf{X}\) (whose shape is (N, P) where N is the number of signals) which are masked by \(\mathbf{M}\). Given an input dictionary \(\mathbf{D}\) of shape (K, P), this algorithm returns the optimal sparse \(\hat{\mathbf{A}}\) matrix such that:
\[\gdef \A {\mathbf{A}} \gdef \M {\mathbf{M}} \gdef \X {\mathbf{X}} \gdef \D {\mathbf{D}} \begin{aligned} \hat{\A} &= \arg\min_\A \frac{1}{2}||\M\X - \M(\A\D)||_F^2\\ &s.t. \max_k||\A_{k,:}||_{0} \leq S \end{aligned}\]A slightly different modification of Masked OMP is available in “Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing,” the book written by M. Elad in 2010.
- Parameters
D ((K, P) numpy array) – The dictionary. Its rows MUST be normalized, i.e. their norm must be 1.
X ((N, P) numpy array) – The masked signals to represent.
S (int) – The max. number of coefficients for each signal.
Masks (optional, (N, P) numpy array or None) – The sampling masks that should be 1 if sampled and 0 otherwise. Default is None for full sampling.
- Returns
sparse coefficient matrix.
- Return type
(N, K) sparse coo_matrix array
-
inpystem.restore.DL_ITKrMM.
itkrmm_core
(data, masks=None, K=None, S=1, lrc=None, Nit=50, init=None, verbose=True, parent=None)¶ Iterative Thresholding and K residual Means masked.
- Parameters
data ((d, N) numpy array) – The (corrupted) training signals as its columns.
masks (optional, None, (d, N) numpy array) – The masks as its columns. masks(.,.) in {0,1}. Default is None for full sampling.
K (optional, None or int) – Dictionary size. Default is None for d.
S (optional, int) – Desired or estimated sparsity level of the signals. Default is 1.
lrc (optional, None or (d, L) numpy array) – Orthobasis for low rank component. Default is None.
Nit (optional, int) – Number of iterations. Default is 50.
init (optional, None or (d, K-L) numpy array) – Initialisation, unit norm column matrix. Here, L is the number of low rank components. Default is None for random.
verbose (optional, bool) – The verbose parameter. Default is True.
parent (optional, None or Dico_Learning_Executer object) – The Dico_Learning_Executer object that called this function. If this is not None, the SNR between the initial true data (given through the xref argument of Dico_Learning_Executer) and the currently reconstructed data will be computed for each iteration. As this means one more OMPm per iteration, this takes longer. Default is None for faster code and no SNR output.
- Returns
(d, K) numpy array – Estimated dictionary.
dict – Output information. See Note.
Note
The output dictionary contains the following keys.
time (float): Execution time in seconds.
‘SNR’ (None, (Nit, ) array): Evolution of the SNR across the iterations in case parent is not None.
inpystem.restore.DL_ITKrMM_matlab module¶
This module implements the ITKrMM algorithm.
The ITKrMM algorithm¶
-
inpystem.restore.DL_ITKrMM_matlab.
ITKrMM_matlab
(Y, mask=None, P=5, K=None, L=1, S=None, Nit_lr=10, Nit=40, init_lr=None, init=None, CLS_init=None, save_it=False, PCA_transform=True, PCA_th='auto', verbose=True)¶ ITKrMM restoration algorithm with Matlab code.
- Parameters
Y ((m, n) or (m, n, l) numpy array) – The input data.
mask (optional, None, (m, n) numpy array) – The acquisition mask. Default is None for full sampling.
P (optional, int) – The width (or height) of the patch. Default is 5.
K (optional, int) – The dictionary dimension. Default is 2*P**2-1.
L (optional, int) – The number of low rank components to learn. Default is 1.
S (optional, int) – The code sparsity level. Default is P-L. This should be lower than K-L.
Nit_lr (optional, int) – The number of iterations for the low rank estimation. Default is 10.
Nit (optional, int) – The number of iterations. Default is 40.
init_lr (optional, (N, L) numpy array) – Initialization for low-rank component. N is the number of voxel in a patch. Default is random initialization.
init (optional, (N, K-L) numpy array) – Initialization for dictionary learning. N is the number of voxel in a patch. Default is random initialization.
CLS_init (optional, dict) – CLS initialization information. See Note for details. Default is None.
save_it (optional, bool) – Particular parameter to save the estimated reconstruction inside the learning loops. It is recommended to keep this False. Default is False.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
verbose (bool) – The verbose parameter. Default is True.
- Returns
(m, n) or (m, n, l) numpy array – Restored data.
dict – Additional information. See Notes.
Notes
The algorithm can be initialized with CLS as soon as CLS_init is not None. In this case, CLS_init should be a dictionary containing the required Lambda key and possibly the CLS init optional argument.
The output information keys are:
time: Execution time in seconds.
lrc: low rank component.
dico: Estimated dictionary.
E: Evolution of the error.
The wKSVD algorithm¶
-
inpystem.restore.DL_ITKrMM_matlab.
wKSVD_matlab
(Y, mask=None, P=5, K=None, L=1, S=None, Nit_lr=10, Nit=40, init_lr=None, init=None, CLS_init=None, save_it=False, PCA_transform=True, PCA_th='auto', verbose=True)¶ wKSVD restoration algorithm with Matlab code.
- Parameters
Y ((m, n) or (m, n, l) numpy array) – The input data.
mask (optional, None, (m, n) numpy array) – The acquisition mask. Default is None for full sampling.
P (optional, int) – The width (or height) of the patch. Default is 5.
K (optional, int) – The dictionary dimension. Default is 2*P**2-1.
L (optional, int) – The number of low rank components to learn. Default is 1.
S (optional, int) – The code sparsity level. Default is P-L. This should be lower than K-L.
Nit_lr (optional, int) – The number of iterations for the low rank estimation. Default is 10.
Nit (optional, int) – The number of iterations. Default is 40.
init_lr (optional, (N, L) numpy array) – Initialization for the low-rank component. N is the number of voxels in a patch. Default is random initialization.
init (optional, (N, K-L) numpy array) – Initialization for dictionary learning. N is the number of voxels in a patch. Default is random initialization.
CLS_init (optional, dict) – CLS initialization information. See Notes for details. Default is None.
save_it (optional, bool) – Particular parameter to save estimated reconstructions inside the learning loops. It is recommended to leave this False. Default is False.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
verbose (bool) – The verbose parameter. Default is True.
- Returns
(m, n) or (m, n, l) numpy array – Restored data.
dict – Additional information. See Notes.
Notes
The algorithm can be initialized with CLS as soon as CLS_init is not None. In this case, CLS_init should be a dictionary containing the required Lambda key and optionally the CLS init argument.
The output information keys are:
time: Execution time in seconds.
lrc: Low-rank component.
dico: Estimated dictionary.
E: Evolution of the error.
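The call is identical to ITKrMM_matlab; the sketch below additionally shows the CLS initialization (synthetic data, and the Lambda value is purely illustrative):
>>> import numpy as np
>>> import inpystem.restore.DL_ITKrMM_matlab as dl
>>> Y = np.random.randn(50, 50, 100)             # synthetic (m, n, l) data cube
>>> mask = (np.random.rand(50, 50) < 0.3) * 1.0  # ~30% of pixels sampled
>>> Xhat, info = dl.wKSVD_matlab(Y, mask=mask, CLS_init={'Lambda': 0.1})
>>> dico = info['dico']  # estimated dictionary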
The dictionary learning interface¶
class inpystem.restore.DL_ITKrMM_matlab.Matlab_Dico_Learning_Executer(Y, mask=None, P=5, K=None, L=1, S=None, Nit_lr=10, Nit=40, init_lr=None, init=None, CLS_init=None, save_it=False, PCA_transform=True, PCA_th='auto', verbose=True)¶
Class to define and execute dictionary learning algorithms with the Matlab interface.
This class provides code common to most dictionary learning methods. It performs the following tasks:
reshapes the data into patch format,
performs low-rank component estimation,
starts the dictionary learning method,
reshapes the output data,
handles CLS initialization to speed up computation.
- Variables
Y ((m, n) or (m, n, l) numpy array) – The input data.
Y_PCA ((m, n) or (m, n, PCA_th) numpy array) – The input data in PCA space. Its value is Y if Y is 2D.
mask ((m, n) numpy array) – The acquisition mask.
P (int) – The width (or height) of the patch.
K (int) – The dictionary dimension. This dictionary is composed of L low-rank components and K-L non-low-rank components.
L (int) – The number of low rank components to learn.
S (int) – The code sparsity level.
Nit_lr (int) – The number of iterations for the low rank estimation.
Nit (int) – The number of iterations.
CLS_init (dict) – CLS initialization information.
save_it (bool) – Particular parameter to save estimated reconstructions inside the learning loops. It is recommended to leave this False.
verbose (bool) – The verbose parameter. Default is True.
mean_std (2-tuple) – A 2-tuple containing the data mean and standard deviation.
data ((N, D) numpy array) – The Y data in patch format. N (resp. D) is the number of voxels per patch (resp. patches).
mdata ((N, D) numpy array) – The mask in patch format. N (resp. D) is the number of voxels per patch (resp. patches).
init_lr ((N, L) numpy array) – The low-rank estimation initialization in patch format. N is the number of voxels per patch.
init ((N, K-L) numpy array) – The dictionary-learning initialization in patch format. N is the number of voxels per patch.
PCA_operator (PcaHandler object) – The PCA operator.
Note
The algorithm can be initialized with CLS as soon as CLS_init is not None. In this case, CLS_init should be a dictionary containing the required Lambda key and optionally the CLS init argument.
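The two module functions above are the documented entry points and wrap this class. For direct use, a minimal sketch follows; note that the execute() method and its method argument are assumptions for illustration, not documented here:
>>> import numpy as np
>>> from inpystem.restore.DL_ITKrMM_matlab import Matlab_Dico_Learning_Executer
>>> Y = np.random.randn(50, 50, 100)
>>> mask = (np.random.rand(50, 50) < 0.3) * 1.0
>>> executer = Matlab_Dico_Learning_Executer(Y, mask=mask, P=5, L=1)
>>> Xhat, info = executer.execute(method='ITKrMM')  # assumed entry point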
inpystem.restore.DL_BPFA module¶
This module implements the BPFA algorithm.
inpystem.restore.DL_BPFA.BPFA_matlab(Y, mask, P=5, Omega=1, K=128, Nit=100, step=1, PCA_transform=True, PCA_th='auto', verbose=True)¶
Implements the BPFA algorithm for Python.
This function does not execute BPFA itself; it calls the Matlab BPFA code.
- Parameters
Y ((m, n) or (m, n, l) numpy array) – The input data.
mask ((m, n) numpy array) – The acquisition mask.
P (int) – The patch width. Default is 5.
Omega (int) – The Omega parameter. Default is 1.
K (int) – The dictionary dimension. Default is 128.
Nit (int) – The number of iterations. Default is 100.
step (int) – The distance between two consecutive patches. Default is 1 for full overlap.
PCA_transform (optional, bool) – Enables the PCA transformation if True, otherwise, no PCA transformation is processed. Default is True.
PCA_th (optional, int, str) – The desired data dimension after dimension reduction. Possible values are ‘auto’ for automatic choice, ‘max’ for maximum value and an int value for user value. Default is ‘auto’.
verbose (bool) – The verbose parameter. Default is True.
- Returns
(m, n) or (m, n, l) numpy array – Restored data.
dict – Additional information. See Notes.
Note
The output information keys are:
time: Execution time in seconds.
Z
A
S
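A minimal sketch of a call follows (synthetic data; a configured Matlab backend is assumed and the step value is illustrative):
>>> import numpy as np
>>> from inpystem.restore.DL_BPFA import BPFA_matlab
>>> Y = np.random.randn(50, 50, 100)             # synthetic (m, n, l) data cube
>>> mask = (np.random.rand(50, 50) < 0.3) * 1.0  # ~30% of pixels sampled
>>> Xhat, info = BPFA_matlab(Y, mask, P=5, step=2)  # step=2 trades quality for speed
>>> info['time']  # execution time in seconds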