UALG VISION LABORATORY: THE MOVIDE PROJECT

MODELING VISUAL DETECTION

Keywords: visual perception, spatial patterns, detection, thresholds, modeling


NEW: see Results section for predictions of CIFs and ModelFest data with an extremely simple model!


Organisation of this page:

1. Project details
2. Introduction: from retina to cortex
3. Visual detection experiments
4. So what's the problem?
5. Project goals
6. Methodology
7. Results
8. Publications
9. References
10. Links: ModelFest


  • Project details

    Funded by FCT/POCTI/FEDER, project POCTI/36445/PSI/2000
    Area: Psychology
    Start: October 2000
    Duration: 3 years
    Budget: 14000 contos (approx 70000 Euros)
    Executed by: CINTAL
    Scientific leader: Prof. Hans du Buf
    Personnel: Dr. Ulli Bobinger (postdoc)

    In collaboration with the Department of Psychology, University of Muenster (Germany), Prof. Uwe Mortensen and Dr. Guenter Meinhardt


  • Introduction: from retina to cortex

    Our visual system is quite complex. Light enters the eye and the information is preprocessed in the retina by bipolar, horizontal and amacrine cells. Here we also find ON and OFF retinal ganglion cells, which can be seen as isotropic filters (DOG, difference of Gaussians) and whose outputs (axons) leave the eye via the optic nerve. Retinal preprocessing is still only partly understood, but the so-called magno- and parvocellular pathways are already created at the retinal level; they go to the LGN (lateral geniculate nucleus) and continue to the primary visual cortex.
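
    As an illustration of what "isotropic filter (DOG)" means, here is a minimal sketch of a difference-of-Gaussians profile. The function name dog and the two sigma values are assumptions chosen for the example, not parameters from the project.

```python
import math

def dog(x, y, sigma_center=1.0, sigma_surround=2.0):
    """Difference of two unit-volume 2D Gaussians, evaluated at (x, y).

    The sigma values are illustrative assumptions. The result depends
    only on the distance from the origin, which is what makes the
    filter isotropic. An OFF-type profile would simply be -dog(x, y).
    """
    r2 = x * x + y * y

    def gauss(sigma):
        return math.exp(-r2 / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

    return gauss(sigma_center) - gauss(sigma_surround)
```

    With these assumed values the profile is positive in the centre and negative in the surround, and any two points at the same distance from the centre give the same value.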
    At the cortical level we find cells (simple cells) that behave as anisotropic filters and that can be modeled by complex Gabor functions with a real symmetric (even) part and an imaginary antisymmetric (odd) part. This can be seen as the cortical input level, but processing continues in complex and hypercomplex cells. At still higher levels, cells have been found that detect vertices (end-stopped operators), as well as line/edge completion cells (occlusion grouping, see Heitger et al., 1998) and grating and bar cells (Petkov and Kruizinga, 1997). For more information see Gregory (1998) and, for those willing to go into more detail, for example Schwartz (1994). If you go to the VisLab's homepage (address at the bottom of this page), you can find more books on vision.
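
    The complex Gabor function mentioned above can be sketched as follows. This is a generic textbook form with a circular Gaussian envelope; the function name gabor and the default wavelength, orientation and sigma are assumptions for illustration only.

```python
import cmath
import math

def gabor(x, y, wavelength=4.0, theta=0.0, sigma=2.0):
    """Complex Gabor function at (x, y): Gaussian envelope times a
    complex exponential carrier along orientation theta (radians).

    Real part: cosine carrier, symmetric (even) about the origin.
    Imaginary part: sine carrier, antisymmetric (odd).
    All default parameter values are illustrative assumptions.
    """
    # Coordinate along the direction in which the carrier oscillates.
    xr = x * math.cos(theta) + y * math.sin(theta)
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
    return envelope * cmath.exp(2j * math.pi * xr / wavelength)
```

    Unlike the retinal DOG, the value depends on direction as well as distance: gabor(1, 0) and gabor(0, 1) differ, which is the anisotropy referred to in the text.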


  • Visual detection experiments

    In detection experiments the contrast of a pattern is so small that we cannot see the pattern itself; we can only detect that there is a difference with the background on which the pattern is superimposed. However, detection is a probabilistic process because of the "noise" in the system: we measure the contrast at which the observer detected something in 50% of the pattern presentations. Put simply, below this threshold contrast nothing can be seen, and just above it only some parts can be seen. In reality we measure a continuous detection curve which is sigmoidal (a sort of error function, the integral of a Gaussian distribution). Once we have measured the contrast threshold for one pattern, we can repeat the experiment for another pattern and in this way measure a CSF (contrast sensitivity function, for example for sinewave gratings covering frequencies between 1 and 20 cycles per degree of visual angle). This can be done for different patterns and background levels. Psychophysical experiments are explained in many books (e.g. Schwartz, 1994), although the quality does not always match the price of the book (e.g. Farell and Pelli, 1998). For a small introduction and explanation of the method of constant stimuli, see this ASCII text file.
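
    To make the 50% threshold concrete, here is a minimal sketch: a cumulative-Gaussian detection curve, and a simple estimate of the threshold from constant-stimuli data by linear interpolation. The function names and the interpolation scheme are assumptions for illustration; in practice one would usually fit the full sigmoid to the data.

```python
import math

def p_detect(contrast, threshold, sigma):
    """Cumulative-Gaussian detection curve: probability of detection
    at a given contrast. Equals 0.5 when contrast == threshold."""
    z = (contrast - threshold) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def threshold_from_data(contrasts, proportions):
    """Estimate the 50% point from measured detection proportions
    (contrasts in ascending order) by linear interpolation between
    the two neighbouring contrasts that bracket 0.5.
    Returns None if the data never cross 50%."""
    pairs = list(zip(contrasts, proportions))
    for (c0, p0), (c1, p1) in zip(pairs, pairs[1:]):
        if p0 <= 0.5 <= p1 and p1 > p0:
            return c0 + (0.5 - p0) * (c1 - c0) / (p1 - p0)
    return None
```

    Contrast sensitivity is the reciprocal of the threshold contrast, so repeating this estimate for gratings of different spatial frequencies yields the points of a CSF.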


  • So what's the problem?

    The main problem is that there exist many data sets for different patterns that cannot all be described by one and the same model. We are still looking for a model that can explain all data, see for example my own paper in which I tried to develop models that can explain threshold data for both (co)sine gratings and disks (du Buf, 1992). In such "classical" detection models a multiscale approach is common: multiple channels based on different frequency/orientation filters. The filter responses are then nonlinearly summed (Minkowski metric) and detection occurs when the sum exceeds some fixed value. It was also found that some data can be described by matched filters but other data can't (Hauske et al., 1976; Meinhardt and Mortensen, 1998; Meinhardt, 1999). We know that matched filters are optimum for dealing with patterns in noise, but the fact that not all data can be described by assuming such filters poses a serious problem.
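
    The nonlinear (Minkowski) summation used in such classical models can be sketched as below. The exponent beta = 4 and the criterion of 1 are assumed values for the example, and the threshold computation assumes linear filters, so that all responses scale in proportion to pattern contrast.

```python
def minkowski_sum(responses, beta=4.0):
    """Minkowski (power) summation of filter responses.
    beta = 1 gives linear summation of magnitudes; as beta grows the
    result approaches the largest single response.
    The default beta is an illustrative assumption."""
    return sum(abs(r) ** beta for r in responses) ** (1.0 / beta)

def threshold_contrast(unit_responses, beta=4.0, criterion=1.0):
    """Contrast at which the pooled response reaches the fixed
    criterion, given the filter responses at unit contrast.
    Assumes linear filters (responses proportional to contrast)."""
    return criterion / minkowski_sum(unit_responses, beta)
```

    For example, under these assumptions sixteen equally responding filters pooled with beta = 4 halve the threshold relative to a single such filter, which is far less than the sixteen-fold gain of linear summation.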
    But the picture is even more complicated because of other aspects: (1) we simply don't know at which level in the visual system detection takes place: at the simple cells, complex cells, hypercomplex cells, or even at a much higher level? (2) What is the "noise" in the visual system? Is there noise at different levels, and is it of the same type at each level? One must also realise that other factors are involved, such as the observer's concentration and training, what the observer expects to see, and where exactly the focus of attention is in each stimulus presentation...
    Finally, there is something called plasticity: the system can adapt itself to different patterns. This adaptation can occur through training over days and weeks, with effects that can last for years, or it can even occur instantly by changing synaptic weights depending on the pattern. Gilbert (1998) summarizes many findings, amongst others the fact that in the LGN there are more signals going down than there are going up, which means that vision is not only a straightforward bottom-up process: the system may be testing hypotheses at all levels by top-down feedback processes.


  • Project goals

    The main goal of the MOVIDE project is to develop new detection models for spatial patterns (no temporal effects). In the "classical" models mentioned above there is, apart from the filtering, absolutely no functionality. Assuming that the visual system has a very specific functionality, namely to detect and represent structures ranging from very simple (lines and edges) to more complex (for example periodic patterns in textures), we will build this functionality into the models. Different grouping processes will be employed to improve, within individual scales, the line/edge continuity when there is curvature or the selectivity of grating detectors and, across scales, the evidence for stable structures in order to suppress filter interference effects. The new 2D models must be able to predict correct threshold curves for most if not all data sets.

    Note: part of the functionality can be used in the development of brightness models (e.g. du Buf and Fischer, 1995).


  • Methodology

    The first year is devoted to development work on:
    (1) Multiscale line/edge detection
    (2) Curvature grouping
    (3) Grouping over neighbouring scales
    (4) Model calibration (channel sensitivities using the CSF)
    (5) Predictions for other patterns like disks
    (6) Model optimisations by iterating steps 2-5

    Once the first acceptable results have been obtained, grating and bar cell models will be included and other data, notably those of the Muenster group concerning subliminal summation experiments (CIFs or contrast interrelation functions), will be considered.


  • Results

    Pedro Guerreiro (MSc student) prepared a 2.5-page note on periodic gratings; you can download gratings.pdf.gz

    During the last project year we also studied one of the simplest models: a retinal one with nonlinear (Minkowski) summation over space and frequency. We prepared two short reports, each with a brief description and model predictions. In view of the huge effort spent on cortical models, the quality of the predictions of the simple retinal model is embarrassingly good:

    ModelFest data (see below for link) cover a wide variety of different patterns: download modelfest.pdf.gz

    CIFs or Contrast Interrelation Functions are obtained by subthreshold summation of two patterns. Hauske et al. (1976) first modelled CIF data by assuming matched filters in combination with a common "prefilter". Later, Meinhardt and Mortensen (1998) and Meinhardt (1999) studied the matched filter approach using many more patterns. We modeled some data using the simple retinal Minkowski model: download cifs.pdf.gz
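
    As a rough sketch of how a Minkowski-type model can generate a CIF-like prediction: if both patterns pass through the same linear filters, the responses to a compound are the contrast-weighted sums of the responses to each pattern, and the threshold follows from the pooled response. This is not the model from the reports; the function names, beta = 4, the criterion of 1 and the fixed contrast ratio are all assumptions made for the example.

```python
def minkowski_sum(responses, beta):
    """Minkowski (power) summation of filter responses."""
    return sum(abs(r) ** beta for r in responses) ** (1.0 / beta)

def compound_threshold(r1, r2, ratio, beta=4.0, criterion=1.0):
    """Threshold contrasts (c1, c2) of a compound of two patterns whose
    contrasts are kept in the fixed proportion c2 = ratio * c1.

    r1, r2: responses of the same set of linear filters to pattern 1
    and pattern 2 at unit contrast (hypothetical inputs).
    """
    # Filter responses to the compound for c1 = 1, by linearity.
    unit = [a + ratio * b for a, b in zip(r1, r2)]
    c1 = criterion / minkowski_sum(unit, beta)
    return c1, ratio * c1
```

    Sweeping ratio over a range of values traces out a curve of threshold pairs (c1, c2). Under these assumptions, two patterns that drive the same filter sum linearly, whereas patterns that drive different filters sum much less.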


  • Publications

    Bobinger and du Buf (2002) In search of the Holy Grail: a unified spatial-detection model. ECVP Glasgow. Perception Vol. 31, Suppl p. 137b.


  • References

    du Buf (1992) Modeling spatial vision at the threshold level. Spatial Vision 6, 25-60.

    du Buf and Fischer (1995) Modeling brightness perception and syntactical image coding. Optical Engineering 34, 1900-1911.

    van Deemter and du Buf (2000) Simultaneous detection of lines and edges using compound Gabor filters. Int J Pattern Recogn Artif Intelligence 14, 757-777.

    Farell and Pelli (1998) Psychophysical methods, or how to measure a threshold, and why. In: Carpenter and Robson (eds) Vision Research - a practical guide to laboratory methods. Oxford Univ. Press.

    Gilbert (1998) Adult cortical dynamics. Physiological Reviews 78, 467-485.

    Gregory (1998) Eye and brain. Oxford Univ. Press.

    Hauske, Wolf and Lupp (1976) Matched filters in human vision. Biol. Cybernetics 22, 181-188.

    Heitger et al. (1998) Simulation of neural contour mechanisms: representing anomalous contours. Image Vision Computing 16, 407-421.

    Meinhardt and Mortensen (1998) Detection of aperiodic test patterns by pattern specific detectors revealed by subthreshold summation. Biol. Cybernetics 79, 413-425.

    Meinhardt (1999) Evidence for different nonlinear summation schemes for lines and gratings at threshold. Biol. Cybernetics 81, 263-277.

    Petkov and Kruizinga (1997) Computational models of visual neurons specialised in the detection of periodic and aperiodic visual stimuli: bar and grating cells. Biol Cybernetics 76, 83-96.

    Schwartz (1994) Visual perception. Appleton and Lange.


  • Links: ModelFest

    The ModelFest initiative was created with the same goal: developing models that can predict detection data for many patterns, but also for many observers. In the latter respect MOVIDE differs, because it will optimise models to improve the quality of threshold curves for individual observers. See above (Results) for some of MOVIDE's ModelFest data predictions!


    Send comments to: dubuf@ualg.pt

    Go back to the Hans du Buf homepage or visit the UAlg Vision Laboratory.


    Last update: April 2004 - HdB