CNNs exhibit inherent equivariance to image translation, leading to efficient parameter and data usage, faster learning, and improved robustness. The concept of translation-equivariant networks has been successfully extended to rotations, using group convolutions for discrete rotation groups and harmonic functions for the continuous rotation group covering the full 360°. We explore the compatibility of the self-attention (SA) mechanism with full rotation equivariance, in contrast to previous studies that focused on discrete rotation. We introduce the Harmformer, a harmonic transformer with a convolutional stem that achieves equivariance to both translation and continuous rotation. Accompanied by an end-to-end equivariance proof, the Harmformer not only outperforms previous equivariant transformers, but also demonstrates inherent stability under any continuous rotation, even without seeing rotated samples during training.
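To make the mechanism concrete, below is a minimal sketch of a circular-harmonic filter, the kind of building block a harmonic convolutional stem relies on: the filter's phase encodes a rotation order m, so rotating the input only multiplies the complex response by e^(imθ). All names, shapes, and the Gaussian radial profile are illustrative assumptions, not the Harmformer's actual implementation.

```python
# Minimal sketch of a circular-harmonic filter (as in harmonic networks).
# The radial profile and sizes below are assumed for illustration only.
import torch
import torch.nn.functional as F

def harmonic_filter(size: int, order: int) -> torch.Tensor:
    """Complex filter W(r, phi) = R(r) * exp(i * order * phi)."""
    c = (size - 1) / 2
    y, x = torch.meshgrid(torch.arange(size) - c, torch.arange(size) - c, indexing="ij")
    r = torch.sqrt(x**2 + y**2)
    phi = torch.atan2(y, x)
    radial = torch.exp(-(r - c / 2) ** 2)        # assumed Gaussian ring profile
    return radial * torch.exp(1j * order * phi)  # rotation order encoded in phase

def harmonic_conv(img: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Convolve a (1,1,H,W) image with a complex filter; returns complex response."""
    real = F.conv2d(img, w.real[None, None])
    imag = F.conv2d(img, w.imag[None, None])
    return torch.complex(real, imag)

# Rotating the input by theta multiplies the response by exp(i*order*theta),
# so |response| stays stable under continuous rotation (up to sampling effects).
img = torch.rand(1, 1, 32, 32)
resp = harmonic_conv(img, harmonic_filter(9, order=1))
print(resp.abs().shape)
```

Because a rotation of the input acts as a pure phase shift on the response, downstream layers can either propagate the phase (equivariance) or take magnitudes (invariance).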
2023
H-NeXt: The next step towards roto-translation invariant networks
Tomáš Karella, Filip Šroubek, Jan Blažek, Jan Flusser, and Václav Košík
In 34th British Machine Vision Conference 2023, BMVC 2023, Aberdeen, UK, November 20-24, 2023
The widespread popularity of equivariant networks underscores the significance of parameter-efficient models and the effective use of training data. At a time when robustness to unseen deformations is becoming increasingly important, we present H-NeXt, which bridges the gap between equivariance and invariance. H-NeXt is a parameter-efficient roto-translation invariant network trained without a single augmented image in the training set. Our network comprises three components: an equivariant backbone for learning roto-translation independent features, an invariant pooling layer for discarding roto-translation information, and a classification layer. H-NeXt outperforms the state of the art in classification on unaugmented training sets and augmented test sets of MNIST and CIFAR-10.
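As an illustration of the second component, here is a minimal sketch of invariant pooling: given feature responses sampled at G orientations, pooling over the orientation axis and then over space discards roto-translation information before classification. The tensor layout and pooling choices are assumptions for illustration, not H-NeXt's exact layer.

```python
# Sketch of invariant pooling over orientation and spatial axes.
# The (B, G, C, H, W) layout is an assumed convention for illustration.
import torch

def invariant_pool(feats: torch.Tensor) -> torch.Tensor:
    """feats: (B, G, C, H, W) -- responses at G sampled orientations.
    Returns (B, C), insensitive to rotations (shifts along the orientation
    axis) and translations (spatial shifts), up to sampling effects."""
    feats = feats.amax(dim=1)        # max over orientations: rotation info gone
    return feats.mean(dim=(-2, -1))  # global average pool: translation info gone

pooled = invariant_pool(torch.rand(8, 16, 64, 12, 12))
print(pooled.shape)  # torch.Size([8, 64])
```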
CNN Ensemble Robust to Rotation Using Radon Transform
Václav Košík, Tomáš Karella, and Jan Flusser
In Proceedings of the 12th International Conference on Image Processing Theory, Tools and Applications (IPTA 2023), 2023
A great deal of attention has been paid in the literature to techniques that serve as alternatives to data augmentation. Their goal is to make convolutional neural networks (CNNs) invariant, or at least robust, to various transformations. In this paper, we present an ensemble model combining a classic CNN with an invariant CNN, where both were trained without any augmentation. The goal is to preserve the performance of the classic CNN on non-deformed images (where it is expected to classify more accurately) and the performance of the invariant CNN on deformed images (where the opposite holds). The combination is controlled by another network, which outputs a coefficient that determines the fusion rule of the two networks. This auxiliary network is trained to output the coefficient depending on the intensity of the image deformation. In the experiments, we focus on rotation as a simple and the most frequently studied transformation. In addition, we present a rotation-invariant network that is fed with the Radon transform of the input images. The performance of this network is tested on rotated MNIST, and the network is further used in the ensemble, whose performance is demonstrated on the CIFAR-10 dataset.
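The reason the Radon transform suits this setting can be sketched in a few lines: rotating an image circularly shifts its sinogram along the angle axis, so the Fourier magnitude taken along that axis is invariant to rotation. The snippet below illustrates this preprocessing idea; it is not the paper's exact network input pipeline.

```python
# Rotation-invariant features via the Radon transform: image rotation is a
# circular shift of the sinogram's angle axis, and the FFT magnitude along
# that axis is shift-invariant. Illustrative preprocessing sketch only.
import numpy as np
from skimage.transform import radon

def radon_invariant_features(image: np.ndarray) -> np.ndarray:
    theta = np.arange(360.0)                       # sample the full circle, 1 deg steps
    sino = radon(image, theta=theta, circle=True)  # shape: (detector bins, angles)
    return np.abs(np.fft.fft(sino, axis=1))        # magnitude spectrum over angles

img = np.zeros((64, 64))
img[20:30, 25:45] = 1.0
feats = radon_invariant_features(img)
print(feats.shape)  # (detector bins, 360)
```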
3D Non-separable Moment Invariants
Jan Flusser, Tomáš Suk, Leonid Bedratyuk, and Tomáš Karella
In this paper, we introduce new 3D rotation moment invariants composed of non-separable Appell moments. The Appell moments can be substituted directly into the 3D rotation invariants in place of the geometric moments without violating their invariance. We show that non-separable moments may outperform separable ones in terms of recognition power and robustness, thanks to a better distribution of their zero surfaces over the image space. We test the numerical properties and discrimination power of the proposed invariants on three real datasets: MRI images of the human brain, 3D scans of statues, and confocal microscope images of worms.
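For context, the sketch below computes the classical 3D geometric moments m_pqr = Σ x^p y^q z^r f(x,y,z), the quantities for which the Appell moments are substituted in the rotation invariants; the Appell polynomial bases themselves are defined in the paper and are not reproduced here.

```python
# 3D geometric moments over a voxel volume f. The volume and orders used in
# the demo below are illustrative, not data from the paper.
import numpy as np

def geometric_moment_3d(f: np.ndarray, p: int, q: int, r: int) -> float:
    """m_pqr = sum over all voxels of x^p * y^q * z^r * f(x, y, z)."""
    x, y, z = np.meshgrid(*(np.arange(s, dtype=float) for s in f.shape),
                          indexing="ij")
    return float(np.sum(x**p * y**q * z**r * f))

vol = np.random.rand(16, 16, 16)
m000 = geometric_moment_3d(vol, 0, 0, 0)               # total "mass"
centroid_x = geometric_moment_3d(vol, 1, 0, 0) / m000  # first-order moment ratio
print(m000, centroid_x)
```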
2022
Convolutional neural network exploiting pixel surroundings to reveal hidden features in artwork NIR reflectograms
Near-infrared reflectography (NIR) is a well-established non-invasive and non-contact imaging technique. NIR methods are able to reveal concealed layers of artwork, such as a painter's sketch or a repainted canvas. The information obtained may help historians study the artist's technique, attribute an artwork, or reconstruct faded details. Our research improves on a previously developed method that reveals hidden features by removing the information content of the visible spectrum from NIR. Based on convolutional neural networks (CNN), our model estimates the transfer function from the visible spectrum to NIR, which is nonlinear and specific to painting materials. Its parameters are learnt for a particular painting on subsamples randomly selected across the canvas, and the model is then utilised to enhance the whole artwork. Going beyond the previous model, our algorithm exploits each pixel's surroundings to estimate its NIR response. This leads to more precise results and increased robustness to various types of noise. We demonstrate higher accuracy than the previous method on historical painting mock-ups and higher performance on well-known artworks such as the Madonna dei Fusi, attributed to Leonardo da Vinci.
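A minimal sketch of the patch-to-pixel idea follows: a small CNN regresses the NIR response of a pixel from the visible-spectrum patch around it, which is how the model can exploit the pixel's surroundings. The architecture, patch size, and layer widths are illustrative assumptions, not the paper's exact model.

```python
# Patch-to-pixel regression: predict one NIR value from the RGB patch
# centred on the pixel. Architecture and sizes are assumed for illustration.
import torch
import torch.nn as nn

class PatchToNIR(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),  # predicted NIR value of the centre pixel
        )

    def forward(self, rgb_patches: torch.Tensor) -> torch.Tensor:
        return self.net(rgb_patches)  # (B, 3, patch, patch) -> (B, 1)

model = PatchToNIR()
print(model(torch.rand(4, 3, 9, 9)).shape)  # torch.Size([4, 1])
```

Training such a model on randomly sampled patches from a single painting, then sliding it over the whole canvas, mirrors the learn-per-painting, enhance-whole-artwork workflow described in the abstract.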