SwePub
Search the SwePub database


Result list for the search "WFRF:(Lindeberg Tony Professor 1964 )"

Search: WFRF:(Lindeberg Tony Professor 1964 )

  • Results 1-28 of 28
1.
  • Friberg, Anders, Professor, et al. (author)
  • Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields
  • 2018
  • In: Journal of the Acoustical Society of America. - : Acoustical Society of America (ASA). - 0001-4966 .- 1520-8524. ; 144:3, pp. 1467-1483
  • Journal article (peer-reviewed) abstract
    • Vocal sound imitations provide a new challenge for understanding the coupling between articulatory mechanisms and the resulting audio. In this study, we have modeled the classification of three articulatory categories, phonation, supraglottal myoelastic vibrations, and turbulence from audio recordings. Two data sets were assembled, consisting of different vocal imitations by four professional imitators and four non-professional speakers in two different experiments. The audio data were manually annotated by two experienced phoneticians using a detailed articulatory description scheme. A separate set of audio features was developed specifically for each category using both time-domain and spectral methods. For all time-frequency transformations, and for some secondary processing, the recently developed Auditory Receptive Fields Toolbox was used. Three different machine learning methods were applied for predicting the final articulatory categories. The result with the best generalization was found using an ensemble of multilayer perceptrons. The cross-validated classification accuracy was 96.8 % for phonation, 90.8 % for supraglottal myoelastic vibrations, and 89.0 % for turbulence using all the 84 developed features. A final feature reduction to 22 features yielded similar results.
2.
  • Finnveden, Lukas, 1998-, et al. (author)
  • The problems with using STNs to align CNN feature maps
  • 2020
  • Conference paper (other academic/artistic) abstract
    • Spatial transformer networks (STNs) were designed to enable CNNs to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image and its original. We present a theoretical argument for this and investigate the practical implications, showing that this inability is coupled with decreased classification accuracy. We advocate taking advantage of more complex features in deeper layers by instead sharing parameters between the classification and the localisation network.
3.
  • Finnveden, Lukas, 1998-, et al. (author)
  • Understanding when spatial transformer networks do not support invariance, and what to do about it
  • 2021
  • In: ICPR 2020: International Conference on Pattern Recognition. - : Institute of Electrical and Electronics Engineers (IEEE). ; pp. 3427-3434
  • Conference paper (peer-reviewed) abstract
    • Spatial transformer networks (STNs) were designed to enable convolutional neural networks (CNNs) to learn invariance to image transformations. STNs were originally proposed to transform CNN feature maps as well as input images. This enables the use of more complex features when predicting transformation parameters. However, since STNs perform a purely spatial transformation, they do not, in the general case, have the ability to align the feature maps of a transformed image with those of its original. STNs are therefore unable to support invariance when transforming CNN feature maps. We present a simple proof for this and study the practical implications, showing that this inability is coupled with decreased classification accuracy. We therefore investigate alternative STN architectures that make use of complex features. We find that while deeper localization networks are difficult to train, localization networks that share parameters with the classification network remain stable as they grow deeper, which allows for higher classification accuracy on difficult datasets. Finally, we explore the interaction between localization network complexity and iterative image alignment.
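A minimal illustration of the alignment problem described in records 2 and 3 above, written for this list rather than taken from the papers: for a generic (non-symmetric) filter, spatially transforming a feature map is not the same as computing the feature map of the spatially transformed image, so a purely spatial warp cannot align the two.

```python
# Minimal numpy sketch (illustration only, not code from the papers): for a generic
# filter K, rot90(conv(image, K)) equals conv(rot90(image), rot90(K)), so unless K is
# itself rotation-symmetric, filtering and spatial transformation do not commute, and
# the feature maps of an image and of its transformed copy cannot be matched by a
# purely spatial warp.
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)
image = rng.random((32, 32))
kernel = rng.random((3, 3))   # stands in for a learned, non-symmetric CNN filter

# Route 1: transform the input image, then filter it.
feat_of_transformed = convolve(np.rot90(image), kernel, mode='constant')

# Route 2: filter the original image, then transform the feature map
# (what an STN acting on feature maps effectively does).
transformed_feat = np.rot90(convolve(image, kernel, mode='constant'))

# The two differ for a generic kernel.
print(np.max(np.abs(feat_of_transformed - transformed_feat)))
```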
4.
  • Jansson, Ylva, 1983-, et al. (author)
  • Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
  • 2018
  • In: Journal of Mathematical Imaging and Vision. - : Springer. - 0924-9907 .- 1573-7683. ; 60:9, pp. 1369-1398
  • Journal article (peer-reviewed) abstract
    • This work presents a first evaluation of using spatio-temporal receptive fields from a recently proposed time-causal spatiotemporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to state of the art. In particular, it is shown that binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.
5.
  • Jansson, Ylva, 1983-, et al. (author)
  • Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges
  • 2020
  • Report (other academic/artistic) abstract
    • The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We, therefore, present a theoretical analysis of invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8 also when training on single scale training data and give improvements in the small sample regime.
6.
  • Jansson, Ylva, 1983-, et al. (author)
  • Exploring the ability of CNNs to generalise to previously unseen scales over wide scale ranges
  • 2021
  • In: ICPR 2020: International Conference on Pattern Recognition. - : Institute of Electrical and Electronics Engineers (IEEE). ; pp. 1181-1188
  • Conference paper (peer-reviewed) abstract
    • The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. We, therefore, present a theoretical analysis of invariance and covariance properties of scale channel networks and perform an experimental evaluation of the ability of different types of scale channel networks to generalise to previously unseen scales. We identify limitations of previous approaches and propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, also when training on single scale training data, and do also give improvements in the small sample regime.
7.
  • Jansson, Ylva, 1983-, et al. (author)
  • Scale-invariant scale-channel networks : Deep networks that generalise to previously unseen scales
  • 2022
  • In: Journal of Mathematical Imaging and Vision. - : Springer Science+Business Media B.V.. - 0924-9907 .- 1573-7683. ; 64:5, pp. 506-536
  • Journal article (peer-reviewed) abstract
    • The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale-channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. In this paper, we present a systematic study of this methodology by implementing different types of scale-channel networks and evaluating their ability to generalise to previously unseen scales. We develop a formalism for analysing the covariance and invariance properties of scale-channel networks, including exploring their relations to scale-space theory, and exploring how different design choices, unique to scaling transformations, affect the overall performance of scale-channel networks. We first show that two previously proposed scale-channel network designs, in one case, generalise no better than a standard CNN to scales not present in the training set, and in the second case, have limited scale generalisation ability. We explain theoretically and demonstrate experimentally why generalisation fails or is limited in these cases. We then propose a new type of foveated scale-channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. This new type of scale-channel network is shown to generalise extremely well, provided sufficient image resolution and the absence of boundary effects. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, also when training on single-scale training data, and do also give improved performance when learning from datasets with large scale variations in the small sample regime.
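The scale-channel construction summarised in records 5-8 lends itself to a compact sketch. The following PyTorch fragment is a hedged illustration written for this list; the layer sizes, the scale set and the use of torch.nn.functional.interpolate for rescaling are assumptions, not the architecture from the papers. The same convolutional weights are applied to several rescaled copies of the input, and max pooling over the scale channels then gives an approximately scale-invariant prediction.

```python
# Hedged sketch of a scale-channel network (illustration only; sizes and scale factors
# are assumptions). The same weights are shared across all scale channels, and max
# pooling over the channels gives approximate scale invariance of the classification.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleChannelNet(nn.Module):
    def __init__(self, scales=(0.5, 1.0, 2.0), num_classes=10):
        super().__init__()
        self.scales = scales
        self.features = nn.Sequential(          # weights shared across scale channels
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        logits_per_scale = []
        for s in self.scales:
            # Each scale channel processes a rescaled copy of the input image.
            xs = F.interpolate(x, scale_factor=s, mode='bilinear', align_corners=False)
            h = self.features(xs).flatten(1)
            logits_per_scale.append(self.classifier(h))
        # Max pooling over the scale channels.
        return torch.stack(logits_per_scale, dim=0).max(dim=0).values

net = ScaleChannelNet()
out = net(torch.randn(2, 1, 64, 64))   # -> logits of shape (2, 10)
```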
8.
  • Jansson, Ylva, 1983-, et al. (author)
  • Scale-invariant scale-channel networks: Deep networks that generalise to previously unseen scales
  • 2021
  • Report (other academic/artistic) abstract
    • The ability to handle large scale variations is crucial for many real world visual tasks. A straightforward approach for handling scale in a deep network is to process an image at several scales simultaneously in a set of scale channels. Scale invariance can then, in principle, be achieved by using weight sharing between the scale channels together with max or average pooling over the outputs from the scale channels. The ability of such scale channel networks to generalise to scales not present in the training set over significant scale ranges has, however, not previously been explored. In this paper, we present a systematic study of this methodology by implementing different types of scale channel networks and evaluating their ability to generalise to previously unseen scales. We develop a formalism for analysing the covariance and invariance properties of scale channel networks, and explore how different design choices, unique to scaling transformations, affect the overall performance of scale channel networks. We first show that two previously proposed scale channel network designs do not generalise well to scales not present in the training set. We explain theoretically and demonstrate experimentally why generalisation fails in these cases. We then propose a new type of foveated scale channel architecture, where the scale channels process increasingly larger parts of the image with decreasing resolution. This new type of scale channel network is shown to generalise extremely well, provided sufficient image resolution and the absence of boundary effects. Our proposed FovMax and FovAvg networks perform almost identically over a scale range of 8, also when training on single scale training data, and do also give improved performance  when learning from datasets with large scale variations in the small sample regime.
9.
  • Lindeberg, Tony, Professor, 1964- (author)
  • A time-causal and time-recursive analogue of the Gabor transform
  • 2023
  • Report (other academic/artistic) abstract
    • This paper presents a time-causal analogue of the Gabor filter, as well as a both time-causal and time-recursive analogue of the Gabor transform, where the proposed time-causal representations obey both temporal scale covariance and a cascade property with a simplifying kernel over temporal scales. The motivation behind these constructions is to enable theoretically well-founded time-frequency analysis over multiple temporal scales for real-time situations, or for physical or biological modelling situations, when the future cannot be accessed, and the non-causal access to the future in Gabor filtering is therefore not viable for a time-frequency analysis of the system. We develop the theory for these representations, obtained by replacing the Gaussian kernel in Gabor filtering with a time-causal kernel, referred to as the time-causal limit kernel, which guarantees simplification properties from finer to coarser levels of scales in a time-causal situation, similar to what the Gaussian kernel can be shown to guarantee over a non-causal temporal domain. In these ways, the proposed time-frequency representations guarantee well-founded treatment over multiple scales, in situations when the characteristic scales in the signals, or physical or biological phenomena, to be analyzed may vary substantially, and additionally all steps in the time-frequency analysis have to be fully time-causal.
10.
  • Lindeberg, Tony, Professor, 1964- (author)
  • A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time
  • 2022
  • Report (other academic/artistic) abstract
    • This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves. For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past. We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. We do also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner. We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time. We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present, where the inherent non-zero temporal delay of such time-causal receptive fields implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, to make it possible for a representation of the perceptual present to constitute a representation of the environment with timing properties closer to the actual present.
11.
  • Lindeberg, Tony, Professor, 1964- (author)
  • A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time
  • 2023
  • In: Biological Cybernetics. - : Springer Nature. - 0340-1200 .- 1432-0770. ; 117:1-2, pp. 21-59
  • Journal article (peer-reviewed) abstract
    • This article presents an overview of a theory for performing temporal smoothing on temporal signals in such a way that: (i) temporally smoothed signals at coarser temporal scales are guaranteed to constitute simplifications of corresponding temporally smoothed signals at any finer temporal scale (including the original signal) and (ii) the temporal smoothing process is both time-causal and time-recursive, in the sense that it does not require access to future information and can be performed with no other temporal memory buffer of the past than the resulting smoothed temporal scale-space representations themselves. For specific subsets of parameter settings for the classes of linear and shift-invariant temporal smoothing operators that obey this property, it is shown how temporal scale covariance can be additionally obtained, guaranteeing that if the temporal input signal is rescaled by a uniform temporal scaling factor, then also the resulting temporal scale-space representations of the rescaled temporal signal will constitute mere rescalings of the temporal scale-space representations of the original input signal, complemented by a shift along the temporal scale dimension. The resulting time-causal limit kernel that obeys this property constitutes a canonical temporal kernel for processing temporal signals in real-time scenarios when the regular Gaussian kernel cannot be used, because of its non-causal access to information from the future, and we cannot additionally require the temporal smoothing process to comprise a complementary memory of the past beyond the information contained in the temporal smoothing process itself, which in this way also serves as a multi-scale temporal memory of the past. We describe how the time-causal limit kernel relates to previously used temporal models, such as Koenderink's scale-time kernels and the ex-Gaussian kernel. We do also give an overview of how the time-causal limit kernel can be used for modelling the temporal processing in models for spatio-temporal and spectro-temporal receptive fields, and how it more generally has a high potential for modelling neural temporal response functions in a purely time-causal and time-recursive way, that can also handle phenomena at multiple temporal scales in a theoretically well-founded manner. We detail how this theory can be efficiently implemented for discrete data, in terms of a set of recursive filters coupled in cascade. Hence, the theory is generally applicable for both: (i) modelling continuous temporal phenomena over multiple temporal scales and (ii) digital processing of measured temporal signals in real time. We conclude by stating implications of the theory for modelling temporal phenomena in biological, perceptual, neural and memory processes by mathematical models, as well as implications regarding the philosophy of time and perceptual agents. Specifically, we propose that for A-type theories of time, as well as for perceptual agents, the notion of a non-infinitesimal inner temporal scale of the temporal receptive fields has to be included in representations of the present, where the inherent non-zero temporal delay of such time-causal receptive fields implies a need for incorporating predictions from the actual time-delayed present in the layers of a perceptual hierarchy, to make it possible for a representation of the perceptual present to constitute a representation of the environment with timing properties closer to the actual present.
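Records 10 and 11 note that the theory can be implemented for discrete data as a set of recursive filters coupled in cascade. The sketch below, written for this list, shows that general idea in Python; the time constants are placeholders and not the parameterisation derived in the papers.

```python
# Hedged sketch of time-causal, time-recursive temporal smoothing as a cascade of
# first-order recursive filters (illustration only; the time constants mu below are
# placeholders, not the distribution derived in the papers).
import numpy as np

def first_order_cascade(signal, mu_values):
    """Causally smooth a 1-D signal by first-order recursive filters run in cascade.
    Each stage updates its state as f_out[t] = f_out[t-1] + (f_in[t] - f_out[t-1]) / (1 + mu),
    so no memory of the past is needed beyond the smoothed representations themselves."""
    out = np.asarray(signal, dtype=float).copy()
    for mu in mu_values:
        state = 0.0
        for t in range(out.size):
            state = state + (out[t] - state) / (1.0 + mu)
            out[t] = state
    return out

rng = np.random.default_rng(0)
noisy = np.sin(np.linspace(0, 6 * np.pi, 400)) + 0.3 * rng.standard_normal(400)
smoothed = first_order_cascade(noisy, mu_values=(0.5, 1.0, 2.0, 4.0))
```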
12.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Covariance properties under natural image transformations for the generalized Gaussian derivative model for visual receptive fields
  • 2023
  • In: Frontiers in Computational Neuroscience. - : Frontiers Media SA. - 1662-5188. ; 17, pp. 1189949-1-1189949-23
  • Journal article (peer-reviewed) abstract
    • The property of covariance, also referred to as equivariance, means that an image operator is well-behaved under image transformations, in the sense that the result of applying the image operator to a transformed input image gives essentially a similar result as applying the same image transformation to the output of applying the image operator to the original image. This paper presents a theory of geometric covariance properties in vision, developed for a generalized Gaussian derivative model of receptive fields in the primary visual cortex and the lateral geniculate nucleus, which, in turn, enable geometric invariance properties at higher levels in the visual hierarchy. It is shown how the studied generalized Gaussian derivative model for visual receptive fields obeys true covariance properties under spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. These covariance properties imply that a vision system, based on image and video measurements in terms of the receptive fields according to the generalized Gaussian derivative model, can, to first order of approximation, handle the image and video deformations between multiple views of objects delimited by smooth surfaces, as well as between multiple views of spatio-temporal events, under varying relative motions between the objects and events in the world and the observer. We conclude by describing implications of the presented theory for biological vision, regarding connections between the variabilities of the shapes of biological visual receptive fields and the variabilities of spatial and spatio-temporal image structures under natural image transformations. Specifically, we formulate experimentally testable biological hypotheses as well as needs for measuring population statistics of receptive field characteristics, originating from predictions from the presented theory, concerning the extent to which the shapes of the biological receptive fields in the primary visual cortex span the variabilities of spatial and spatio-temporal image structures induced by natural image transformations, based on geometric covariance properties.
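As a concrete instance of the covariance notion used in record 12 and several of the following records, the purely spatial scaling case of Gaussian scale space can be stated as follows (standard scale-space relation, in my notation rather than quoted from the paper):

```latex
% Spatial scale covariance of the Gaussian scale-space representation
% L(.; t) = g(.; t) * f underlying the Gaussian derivative model:
\[
  f'(x') = f(x), \quad x' = s\,x
  \;\;\Longrightarrow\;\;
  L'(x';\, s^{2} t) = L(x;\, t),
  \qquad L(\cdot;\, t) = g(\cdot;\, t) * f .
\]
```

In words: rescaling the image by a spatial factor s maps the representation onto itself, provided that the scale parameter is rescaled by the factor s^2.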
13.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Discrete approximations of Gaussian smoothing and Gaussian derivatives
  • 2023
  • Report (other academic/artistic) abstract
    • This paper develops an in-depth treatment concerning the problem of approximating the Gaussian smoothing and Gaussian derivative computations in scale-space theory for application on discrete data. With close connections to previous axiomatic treatments of continuous and discrete scale-space theory, we consider three main ways of discretizing these scale-space operations in terms of explicit discrete convolutions, based on either (i) sampling the Gaussian kernels and the Gaussian derivative kernels, (ii) locally integrating the Gaussian kernels and the Gaussian derivative kernels over each pixel support region, to aim at suppressing some of the severe artefacts of sampled Gaussian kernels and sampled Gaussian derivatives at very fine scales, and (iii) basing the scale-space analysis on the discrete analogue of the Gaussian kernel, and then computing derivative approximations by applying small-support central difference operators to the spatially smoothed image data. We study the properties of these three main discretization methods both theoretically and experimentally, and characterize their performance by quantitative measures, including the results they give rise to with respect to the task of scale selection, investigated for four different use cases, and with emphasis on the behaviour at fine scales. The results show that the sampled Gaussian kernels and the sampled Gaussian derivatives as well as the integrated Gaussian kernels and the integrated Gaussian derivatives perform very poorly at very fine scales. At very fine scales, the discrete analogue of the Gaussian kernel with its corresponding discrete derivative approximations performs substantially better. The sampled Gaussian kernel and the sampled Gaussian derivatives do, on the other hand, lead to numerically very good approximations of the corresponding continuous results when the scale parameter is sufficiently large; in the experiments presented in the paper, when the scale parameter is greater than a value of about 1, in units of the grid spacing. Below a standard deviation of about 0.75, the derivative estimates obtained from convolutions with the sampled Gaussian derivative kernels are, however, not numerically accurate or consistent, while the results obtained from the discrete analogue of the Gaussian kernel, with its associated central difference operators applied to the spatially smoothed image data, are then a much better choice.
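A small numerical illustration of the fine-scale behaviour discussed in record 13, written for this list (the scale values and the truncation radius are arbitrary choices): the sampled Gaussian kernel is compared with the discrete analogue of the Gaussian kernel, T(n; t) = exp(-t) I_n(t), which SciPy exposes through the exponentially scaled modified Bessel function.

```python
# Hedged comparison sketch (illustration only; scale values and truncation are arbitrary):
# at fine scales the sampled Gaussian kernel is poorly normalized, whereas the discrete
# analogue of the Gaussian kernel, T(n; t) = exp(-t) I_n(t), sums to one up to truncation.
import numpy as np
from scipy.special import ive   # ive(n, t) = exp(-t) * I_n(t) for real t > 0

def sampled_gaussian(t, radius=10):
    n = np.arange(-radius, radius + 1)
    return np.exp(-n**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

def discrete_gaussian(t, radius=10):
    n = np.arange(-radius, radius + 1)
    return ive(np.abs(n), t)

for t in (0.1, 0.25, 1.0, 4.0):   # t is the variance, in units of the grid spacing squared
    print(f"t = {t}: sum sampled = {sampled_gaussian(t).sum():.4f}, "
          f"sum discrete = {discrete_gaussian(t).sum():.4f}")
```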
14.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Do the receptive fields in the primary visual cortex span a variability over the degree of elongation of the receptive fields?
  • 2024
  • Report (other academic/artistic) abstract
    • This paper presents the results of combining (i) theoretical analysis regarding connections between the orientation selectivity and the elongation of receptive fields for the affine Gaussian derivative model with (ii) biological measurements of orientation selectivity in the primary visual cortex to investigate if (iii) the receptive fields can be regarded as spanning a variability in the degree of elongation. From an in-depth theoretical analysis of idealized models for the receptive fields of simple and complex cells in the primary visual cortex, we established that the orientation selectivity becomes more narrow with increasing elongation of the receptive fields. Combined with previously established biological results, concerning broad vs. sharp orientation tuning of visual neurons in the primary visual cortex, as well as previous experimental results concerning distributions of the resultant of the orientation selectivity curves for simple and complex cells, we show that these results are consistent with the receptive fields spanning a variability over the degree of elongation of the receptive fields. We also show that our principled theoretical model for visual receptive fields leads to qualitatively similar types of deviations from a uniform histogram of the resultant descriptor of the orientation selectivity curves for simple cells, as can be observed in the results from biological experiments. To firmly determine if the underlying working hypothesis, regarding the receptive fields spanning a variability in the degree of elongation, would truly hold for the receptive fields in the primary visual cortex of higher mammals, we formulate a set of testable predictions, that can be used to investigate this property experimentally, and, if applicable, then also characterize if such a variability would, in a structured way, be related to the pinwheel structure in the visual cortex.
15.
  • Lindeberg, Tony, 1964-, et al. (author)
  • Idealized computational models for auditory receptive fields
  • 2014
  • Report (other academic/artistic) abstract
    • This paper presents a theory by which idealized models of auditory receptive fields can be derived in a principled axiomatic manner, from a set of structural properties to (i) enable invariance of receptive field responses under natural sound transformations and (ii) ensure internal consistency between spectro-temporal receptive fields at different temporal and spectral scales. For defining a time-frequency transformation of a purely temporal sound signal, it is shown that the framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters, with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal temporal window functions. When applied to the definition of a second layer of receptive fields from a spectrogram, it is shown that the framework leads to two canonical families of spectro-temporal receptive fields, in terms of spectro-temporal derivatives of either spectro-temporal Gaussian kernels for non-causal time or the combination of a time-causal generalized Gammatone filter over the temporal domain and a Gaussian filter over the log-spectral domain. For each filter family, the spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Within each domain of either non-causal or time-causal time, these receptive field families are derived by uniqueness from the assumptions. It is demonstrated how the presented framework allows for computation of basic auditory features for audio processing and that it leads to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields measured in the inferior colliculus (ICC) and primary auditory cortex (A1) of mammals.
16.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Joint covariance properties under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields
  • 2024
  • Report (other academic/artistic) abstract
    • The influence of natural image transformations on receptive field responses is crucial for modelling visual operations in computer vision and biological vision. In this regard, covariance properties with respect to geometric image transformations in the earliest layers of the visual hierarchy are essential for expressing robust image operations, and for formulating invariant visual operations at higher levels. This paper defines and proves a set of joint covariance properties under compositions of spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations, which make it possible to characterize how different types of image transformations interact with each other and the associated spatio-temporal receptive field responses. In this regard, we also extend the notion of scale-normalized derivatives to affine-normalized derivatives, to be able to obtain true affine-covariant properties of spatial derivatives, that are computed based on spatial smoothing with affine Gaussian kernels. The derived relations show how the parameters of the receptive fields need to be transformed, in order to match the output from spatio-temporal receptive fields under composed spatio-temporal image transformations. As a side effect, the presented proof for the joint covariance property over the integrated combination of the different geometric image transformations also provides specific proofs for the individual transformation properties, which have not previously been fully reported in the literature. We conclude with a geometric analysis, showing how the derived joint covariance properties make it possible to relate or match spatio-temporal receptive field responses, when observing, possibly moving, local surface patches from different views, under locally linearized perspective or projective transformations, as well as when observing different instances of spatio-temporal events that may occur either faster or slower between different views of similar spatio-temporal events. We do furthermore describe how the parameters in the studied composed spatio-temporal image transformation models directly relate to geometric entities in the image formation process and the 3-D scene structure. In relation to these geometric interpretations, the derived explicit transformation properties for receptive field responses, defined in terms of spatio-temporal derivatives of the underlying covariant spatio-temporal smoothing kernels, do specifically show how to both interpret and relate spatio-temporal receptive field responses, when viewing dynamic scenes under different composed geometric viewing conditions. Specifically, we propose that this theoretical analysis should have direct relevance when interpreting the functional properties of biological receptive fields, both computationally and with regard to the simple cells in the primary visual cortex, whose functional properties we here model with an idealized, axiomatically derived spatio-temporal receptive field model. From the viewpoint of the here presented theory, in combination with previous biological modelling results that demonstrate a very good qualitative agreement between idealized receptive field models according to this theory and neurophysiological recordings of actual biological receptive fields in the primary visual cortex of higher mammals, the shapes of these joint spatio-temporal receptive fields can be regarded as very well adapted to the structural properties of the environment.
17.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Joint covariance property under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields
  • 2023
  • Report (other academic/artistic) abstract
    • The influence of natural image transformations on receptive field responses is crucial for modelling visual operations in computer vision and biological vision. In this regard, covariance properties with respect to geometric image transformations in the earliest layers of the visual hierarchy are essential for expressing robust image operations and for formulating invariant visual operations at higher levels. This paper defines and proves a joint covariance property under compositions of spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations, which makes it possible to characterize how different types of image transformations interact with each other. Specifically, the derived relations show how the receptive field parameters need to be transformed, in order to match the output from spatio-temporal receptive fields with the underlying spatio-temporal image transformations.
18.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Normative theory of visual receptive fields
  • 2021
  • In: Heliyon. - : Elsevier. - 2405-8440. ; 7:1, pp. e05897-1-e05897-20
  • Journal article (peer-reviewed) abstract
    • This article gives an overview of a normative theory of visual receptive fields. We describe how idealized functional models of early spatial, spatio-chromatic and spatio-temporal receptive fields can be derived in a principled way, based on a set of axioms that reflect structural properties of the environment in combination with assumptions about the internal structure of a vision system to guarantee consistent handling of image representations over multiple spatial and temporal scales. Interestingly, this theory leads to predictions about visual receptive field shapes with qualitatively very good similarities to biological receptive fields measured in the retina, the LGN and the primary visual cortex (V1) of mammals.
19.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Orientation selectivity of affine Gaussian derivative based receptive fields
  • 2023
  • Report (other academic/artistic) abstract
    • This paper presents a theoretical analysis of the orientation selectivity of simple and complex cells that can be well modelled by the generalized Gaussian derivative model for visual receptive fields, with the purely spatial component of the receptive fields determined by oriented affine Gaussian derivatives for different orders of spatial differentiation. A detailed mathematical analysis is presented for the three different cases of either: (i) purely spatial receptive fields, (ii) space-time separable spatio-temporal receptive fields and (iii) velocity-adapted spatio-temporal receptive fields. Closed-form theoretical expressions for the orientation selectivity curves for idealized models of simple and complex cells are derived for all these main cases, and it is shown that the degree of orientation selectivity of the receptive fields increases with a scale parameter ratio κ, defined as the ratio between the scale parameters in the directions perpendicular to vs. parallel with the preferred orientation of the receptive field. It is also shown that the degree of orientation selectivity increases with the order of spatial differentiation in the underlying affine Gaussian derivative operators over the spatial domain. We describe biological implications of the derived theoretical results, demonstrating that the predictions from the presented theory are consistent with previously established biological results concerning broad vs. sharp orientation tuning of visual neurons in the primary visual cortex. We also demonstrate that the above theoretical predictions, in combination with these biological results, are consistent with a previously formulated biological hypothesis, stating that the biological receptive field shapes should span the degrees of freedom in affine image transformations, to support affine covariance over the population of receptive fields in the primary visual cortex. Based on the results from the theoretical analysis in the paper, combined with existing results for biological experiments, we formulate a set of testable predictions that could be used to, with neurophysiological experiments, judge if the receptive fields in the primary visual cortex of higher mammals could be regarded as spanning a variability over the eccentricity or the elongation of the receptive fields, and, if so, then also characterize if such a variability would, in a structured way, be related to the pinwheel structure in the visual cortex. For comparison, we also present a corresponding theoretical orientation selectivity analysis for purely spatial receptive fields according to an affine Gabor model. The results from that analysis are consistent with the results obtained from the affine Gaussian derivative model, in the respect that the orientation selectivity becomes more narrow when making the receptive fields wider in the direction perpendicular to the preferred orientation of the receptive field. The affine Gabor model does, however, comprise one more degree of freedom in its parameter space, compared to the affine Gaussian derivative model, where a variability within that additional dimension of the parameter space does also strongly influence the orientation selectivity of the receptive fields. In this respect, the affine Gaussian derivative model leads to more specific predictions concerning relationships between the orientation selectivity and the elongation of the receptive fields, compared to the affine Gabor model.
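For orientation, the purely spatial part of the affine Gaussian derivative model referred to in records 14, 19 and 20 can be written as directional derivatives of an affine Gaussian kernel (standard form, in my notation rather than an excerpt from the papers); the elongation of the kernel is governed by the eigenvalue ratio of the covariance matrix:

```latex
% Affine Gaussian kernel over the 2-D spatial domain, with covariance matrix \Sigma:
\[
  g(x;\, \Sigma) \;=\; \frac{1}{2\pi \sqrt{\det \Sigma}}
      \, e^{-\,x^{T} \Sigma^{-1} x / 2},
  \qquad x \in \mathbb{R}^{2},
\]
% with idealized, simple-cell-like receptive fields modelled as directional
% derivatives \partial_{\varphi}^{m} g(x; \Sigma) of this kernel, where the
% eccentricity of \Sigma determines how elongated the receptive field is.
```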
20.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Orientation selectivity properties for the affine Gaussian derivative and the affine Gabor models for visual receptive fields
  • 2024
  • Report (other academic/artistic) abstract
    • This paper presents an in-depth theoretical analysis of the orientation selectivity properties of simple cells and complex cells, that can be well modelled by the generalized Gaussian derivative model for visual receptive fields, with the purely spatial component of the receptive fields determined by oriented affine Gaussian derivatives for different orders of spatial differentiation. A detailed mathematical analysis is presented for the three different cases of either: (i) purely spatial receptive fields, (ii) space-time separable spatio-temporal receptive fields and (iii) velocity-adapted spatio-temporal receptive fields. Closed-form theoretical expressions for the orientation selectivity curves for idealized models of simple and complex cells are derived for all these main cases, and it is shown that the orientation selectivity of the receptive fields becomes more narrow, as a scale parameter ratio κ, defined as the ratio between the scale parameters in the directions perpendicular to vs. parallel with the preferred orientation of the receptive field, increases. It is also shown that the orientation selectivity becomes more narrow with increasing order of spatial differentiation in the underlying affine Gaussian derivative operators over the spatial domain. Additionally, we also derive closed-form expressions for the resultant and the bandwidth descriptors of the orientation selectivity curves, which have previously been used as compact descriptors of the orientation selectivity properties for biological neurons. These results together show that the properties of the affine Gaussian derivative model for visual receptive fields can be analyzed in closed form, which can be highly useful when relating the results from biological experiments to computational models of the functional properties of simple cells and complex cells in the primary visual cortex. For comparison, we also present a corresponding theoretical orientation selectivity analysis for purely spatial receptive fields according to an affine Gabor model. The results from that analysis are consistent with the results obtained from the affine Gaussian derivative model, in the respect that the orientation selectivity becomes more narrow when making the receptive fields wider in the direction perpendicular to the preferred orientation of the receptive field. The affine Gabor model does, however, comprise one more degree of freedom in its parameter space, compared to the affine Gaussian derivative model, where a variability within that additional dimension of the parameter space does also strongly influence the orientation selectivity of the receptive fields. In this respect, the relationship between the orientation selectivity properties and the degree of elongation of the receptive fields is more direct for the affine Gaussian derivative model than for the affine Gabor model.
21.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Scale-covariant and scale-invariant Gaussian derivative networks
  • 2020
  • Report (other academic/artistic) abstract
    • This article presents a hybrid approach between scale-space theory and deep learning, where a deep learning architecture is constructed by coupling parameterized scale-space operations in cascade. By sharing the learnt parameters between multiple scale channels, and by using the transformation properties of the scale-space primitives under scaling transformations, the resulting network becomes provably scale covariant. By in addition performing max pooling over the multiple scale channels, a resulting network architecture for image classification also becomes provably scale invariant. We investigate the performance of such networks on the MNISTLargeScale dataset, which contains rescaled images from the original MNIST dataset over a factor of 4 concerning training data and over a factor of 16 concerning testing data. It is demonstrated that the resulting approach allows for scale generalization, enabling good performance for classifying patterns at scales not present in the training data.
22.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Scale-covariant and scale-invariant Gaussian derivative networks
  • 2021
  • In: Scale Space and Variational Methods in Computer Vision. - Cham : Springer Nature. ; pp. 3-14
  • Conference paper (peer-reviewed) abstract
    • This paper presents a hybrid approach between scale-space theory and deep learning, where a deep learning architecture is constructed by coupling parameterized scale-space operations in cascade. By sharing the learnt parameters between multiple scale channels, and by using the transformation properties of the scale-space primitives under scaling transformations, the resulting network becomes provably scale covariant. By in addition performing max pooling over the multiple scale channels, a resulting network architecture for image classification also becomes provably scale invariant. We investigate the performance of such networks on the MNISTLargeScale dataset, which contains rescaled images from the original MNIST dataset over a factor of 4 concerning training data and over a factor of 16 concerning testing data. It is demonstrated that the resulting approach allows for scale generalization, enabling good performance for classifying patterns at scales not spanned by the training data.
23.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Scale-covariant and scale-invariant Gaussian derivative networks
  • 2022
  • In: Journal of Mathematical Imaging and Vision. - : Springer Nature. - 0924-9907 .- 1573-7683. ; 64:3, pp. 223-242
  • Journal article (peer-reviewed) abstract
    • This paper presents a hybrid approach between scale-space theory and deep learning, where a deep learning architecture is constructed by coupling parameterized scale-space operations in cascade. By sharing the learnt parameters between multiple scale channels, and by using the transformation properties of the scale-space primitives under scaling transformations, the resulting network becomes provably scale covariant. By in addition performing max pooling over the multiple scale channels, or other permutation-invariant pooling over scales, a resulting network architecture for image classification also becomes provably scale invariant. We investigate the performance of such networks on the MNIST Large Scale dataset, which contains rescaled images from the original MNIST dataset over a factor of 4 concerning training data and over a factor of 16 concerning testing data. It is demonstrated that the resulting approach allows for scale generalization, enabling good performance for classifying patterns at scales not spanned by the training data.
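The following Python fragment is a hedged sketch, written for this list, of the kind of layer described in records 21-23: fixed Gaussian derivative responses at several scales, combined with the same learned coefficients in every scale channel, followed by max pooling over the scale channels. The particular derivative set, the scale normalisation by powers of sigma, and the use of scipy.ndimage.gaussian_filter are illustrative assumptions, not the papers' implementation.

```python
# Hedged sketch of one Gaussian derivative layer with shared weights across scale
# channels and max pooling over scales (illustration only, not the papers' code).
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_derivative_layer(image, sigmas, weights):
    """weights: shared coefficients for the (Lx, Ly, Lxx, Lxy, Lyy) responses."""
    channels = []
    for sigma in sigmas:
        # Scale-normalized first- and second-order Gaussian derivative responses.
        Lx  = sigma    * gaussian_filter(image, sigma, order=(0, 1))
        Ly  = sigma    * gaussian_filter(image, sigma, order=(1, 0))
        Lxx = sigma**2 * gaussian_filter(image, sigma, order=(0, 2))
        Lxy = sigma**2 * gaussian_filter(image, sigma, order=(1, 1))
        Lyy = sigma**2 * gaussian_filter(image, sigma, order=(2, 0))
        responses = np.stack([Lx, Ly, Lxx, Lxy, Lyy])
        # The same learned coefficients are applied in every scale channel.
        channels.append(np.tensordot(weights, responses, axes=1))
    # Max pooling over the scale channels.
    return np.max(np.stack(channels), axis=0)

rng = np.random.default_rng(0)
image = rng.random((64, 64))
out = gaussian_derivative_layer(image, sigmas=(1.0, 2.0, 4.0),
                                weights=rng.standard_normal(5))
```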
24.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Scale selection
  • 2021. - 2
  • In: Computer Vision. - Cham : Springer. ; pp. 1-14
  • Book chapter (peer-reviewed) abstract
    • The notion of scale selection refers to methods for estimating characteristic scales in image data and for automatically determining locally appropriate scales in a scale-space representation, so as to adapt subsequent processing to the local image structure and compute scale invariant image features and image descriptors. An essential aspect of the approach is that it allows for a bottom-up determination of inherent scales of features and objects without first recognizing them or delimiting, alternatively segmenting, them from their surroundings. Scale selection methods have also been developed from other viewpoints of performing noise suppression and exploring top-down information.
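The mechanism described in record 24 can be summarised by the classical scale-selection rule based on scale-normalized derivatives (standard scale-space formulation, stated here for orientation rather than quoted from the chapter): a characteristic scale is a scale at which a scale-normalized differential response attains a local maximum over scales, for instance the scale-normalized Laplacian used for blob detection.

```latex
% Scale selection from local extrema over scales of scale-normalized derivatives,
% here exemplified with the scale-normalized Laplacian (gamma = 1):
\[
  \hat{t}(x) \;=\; \operatorname*{arg\,max}_{t}\;
    \bigl|\, t \,\bigl( L_{xx}(x;\, t) + L_{yy}(x;\, t) \bigr) \bigr| ,
\]
% where L(.; t) denotes the Gaussian scale-space representation of the image.
% Features computed at the selected scale \hat{t}(x) become scale invariant,
% since \hat{t} transforms in a scale-covariant way under image rescalings.
```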
25.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Unified theory for joint covariance properties under geometric image transformations for spatio-temporal receptive fields according to the generalized Gaussian derivative model for visual receptive fields
  • 2024
  • Report (other academic/artistic) abstract
    • The influence of natural image transformations on receptive field responses is crucial for modelling visual operations in computer vision and biological vision. In this regard, covariance properties with respect to geometric image transformations in the earliest layers of the visual hierarchy are essential for expressing robust image operations, and for formulating invariant visual operations at higher levels. This paper defines and proves a set of joint covariance properties for spatio-temporal receptive fields in terms of spatio-temporal derivative operators applied to spatio-temporally smoothed image data under compositions of spatial scaling transformations, spatial affine transformations, Galilean transformations and temporal scaling transformations. Specifically, the derived relations show how the parameters of the receptive fields need to be transformed, in order to match the output from spatio-temporal receptive fields under composed spatio-temporal image transformations. For this purpose, we also fundamentally extend the notion of scale-normalized derivatives to affine-normalized derivatives, that are computed based on spatial smoothing with affine Gaussian kernels, and analyze the covariance properties of the resulting affine-normalized derivatives for the affine group as well as for important subgroups thereof. We conclude with a geometric analysis, showing how the derived joint covariance properties make it possible to relate or match spatio-temporal receptive field responses, when observing, possibly moving, local surface patches from different views, under locally linearized perspective or projective transformations, as well as when observing different instances of spatio-temporal events, that may occur either faster or slower between different views of similar spatio-temporal events. We do furthermore describe how the parameters in the studied composed spatio-temporal image transformation models directly relate to geometric entities in the image formation process and the 3-D scene structure. In these ways, the paper presents a unified theory for the interaction between spatio-temporal receptive field responses and geometric image transformations, with generic implications for both: (i) designing computer vision systems that are to compute image features and image descriptors that are robust under the variabilities in spatio-temporal image structures caused by geometric image transformations, and (ii) understanding fundamental geometric constraints for interpreting and constructing models of biological vision.
26.
  • Lindeberg, Tony, Professor, 1964- (author)
  • Updates to : Scale-Space Theory in Computer Vision
  • 1996
  • Other publication (other academic/artistic) abstract
    • This document contains corrections and additional remarks to: Scale-Space Theory in Computer Vision by Tony Lindeberg, published by Kluwer Academic Publishers, Dordrecht, The Netherlands, 1993. Last update: October 4, 1996.
27.
  • Maki, Atsuto, et al. (author)
  • In Memoriam : Jan-Olof Eklundh
  • 2022
  • In: IEEE Transactions on Pattern Analysis and Machine Intelligence. - : IEEE COMPUTER SOC. - 0162-8828 .- 1939-3539. ; 44:9, pp. 4488-4489
  • Journal article (peer-reviewed)
28.
  • Pedersen, Jens, et al. (author)
  • Covariant spatio-temporal receptive fields for neuromorphic computing
  • 2024
  • Report (other academic/artistic) abstract
    • Biological nervous systems constitute important sources of inspiration towards computers that are faster, cheaper, and more energy efficient. Neuromorphic disciplines view the brain as a coevolved system, simultaneously optimizing the hardware and the algorithms running on it. There are clear efficiency gains when bringing the computations into a physical substrate, but we presently lack theories to guide efficient implementations. Here, we present a principled computational model for neuromorphic systems in terms of spatio-temporal receptive fields, based on affine Gaussian kernels over space and leaky-integrator and leaky integrate-and-fire models over time. Our theory is provably covariant to spatial affine and temporal scaling transformations, and has close similarities to the visual processing in mammalian brains. We use these spatio-temporal receptive fields as a prior in an event-based vision task, and show that this improves the training of spiking networks, which is otherwise known to be problematic for event-based vision. This work combines efforts within scale-space theory and computational neuroscience to identify theoretically well-founded ways to process spatio-temporal signals in neuromorphic systems. Our contributions are immediately relevant for signal processing and event-based vision, and can be extended to other processing tasks over space and time, such as memory and control.