Projects

Defocus estimation in natural images

We have developed an optimal model of defocus estimation that integrates the statistical structure of natural scenes with the properties of the vision system. We make empirical measurements of natural scenes, account for the optics, sensors, and noise of the vision system, and use Bayesian statistical analyses to obtain accurate defocus estimates. From a small, blurry patch of an individual image, we can tell how out of focus that part of the image is. Our results have implications for vision science, neuroscience, comparative psychology, computer science, and machine vision.
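As a rough illustration of the Bayesian step, the sketch below computes a posterior over a handful of candidate defocus levels from a single scalar feature of an image patch. Everything concrete in it is an assumption made for illustration and is not the model itself: the feature (imagine some measure of high-frequency contrast), the Gaussian form of its distribution at each defocus level, and the numbers in `LEVELS`, `MEANS`, and `SDS` are all hypothetical.

```python
import math

# Hypothetical "training" summary: for each candidate defocus level (diopters),
# the mean and standard deviation of a scalar patch feature. Invented numbers.
LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0]
MEANS = [1.00, 0.70, 0.45, 0.30, 0.20]
SDS = [0.08, 0.08, 0.08, 0.08, 0.08]


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_over_defocus(feature, means=MEANS, sds=SDS, prior=None):
    """Posterior probability of each defocus level given one feature value."""
    if prior is None:
        prior = [1.0 / len(means)] * len(means)  # flat prior (assumed)
    unnorm = [gaussian_pdf(feature, m, s) * p
              for m, s, p in zip(means, sds, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def estimate_defocus(feature, levels=LEVELS):
    """Return (MAP estimate, posterior-mean estimate) in diopters."""
    post = posterior_over_defocus(feature)
    best = max(range(len(levels)), key=lambda i: post[i])
    mean_estimate = sum(l * p for l, p in zip(levels, post))
    return levels[best], mean_estimate


map_estimate, mean_estimate = estimate_defocus(0.46)
```

With these invented statistics, a feature value of 0.46 is best explained by the 1.0 diopter level. The posterior mean lands slightly above that because the 1.5 diopter level also receives some probability.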


We are currently investigating human defocus discrimination in natural images using psychophysical methods and a custom-built three-monitor rig. We are also developing improved algorithms for auto-focusing digital cameras and other digital imaging systems, without the need for specialized hardware or trial-and-error search.

Vision begins with a lens system that focuses light near the retinal photoreceptors. A lens can focus light perfectly from only one distance. Natural scenes have objects at many distances, so regions of nearly all images will be at least a little bit out of focus; that is, defocus blur is present in almost every natural image. Defocus is used for many biological tasks: guiding predatory behavior, estimating depth and scale, controlling accommodation (biological auto-focusing), and regulating eye growth. However, before defocus can be used to perform these tasks, defocus itself must first be estimated.
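The amount of blur follows from simple geometry. Under a standard thin-lens, small-angle approximation (an assumption of this sketch, which ignores diffraction and other aberrations), defocus in diopters is the difference between the reciprocals of the focus and object distances, and the angular diameter of the blur circle is roughly the pupil diameter in meters times that defocus. The function names and example values below are illustrative.

```python
import math


def defocus_diopters(focus_distance_m, object_distance_m):
    """Magnitude of defocus (diopters) for an object away from the focus distance."""
    return abs(1.0 / focus_distance_m - 1.0 / object_distance_m)


def blur_circle_arcmin(pupil_diameter_mm, defocus_d):
    """Approximate angular diameter of the blur circle, in minutes of arc.

    Small-angle geometric approximation: blur (radians) = pupil (m) * defocus (D).
    """
    blur_rad = (pupil_diameter_mm / 1000.0) * defocus_d
    return math.degrees(blur_rad) * 60.0


# Eye focused at 0.5 m, object at 1.0 m, 4 mm pupil (example values).
delta_d = defocus_diopters(0.5, 1.0)      # 1.0 diopter
blur = blur_circle_arcmin(4.0, delta_d)   # roughly 13.75 arcmin
```

Because defocus scales with the difference in reciprocal distances, a fixed separation in meters produces far more blur close to the eye than far from it.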


Surprisingly little is known about how biological vision systems estimate defocus. The computations that make optimal use of available defocus information are not well understood, little is known about the psychophysics of human defocus estimation, and almost nothing is known about its neurophysiological basis. Very little existing work has examined defocus estimation with natural stimuli. Given that defocus may be the most widely available depth cue on the planet, we decided to dig into the problem.


Natural scene statistics & depth perception

Visual adaptation

Visuo-motor adaptation

Visuo-haptic calibration

The human visual system has internalized a counter-intuitive statistical relationship between the shapes of object silhouettes and depth magnitude. In natural scenes, large depth discontinuities are more likely across edges bounding objects with convex than concave silhouettes (e.g. dinner plate vs crescent moon). In controlled experiments, we presented depth steps defined by disparity that were identical in every respect except the shape of the bounding edge. Human judgments of depth magnitude were biased in a manner predicted by the natural scene statistics.

The influence of a visual cue on depth percepts changes after extensive haptic training. The convex side of an edge is more likely to be the near side, and depth percepts are affected accordingly: for the same disparity-defined depth step, more depth is perceived if the near surface has a convex silhouette than if it has a concave one. However, after extensive training in a virtual environment in which concave surfaces are always nearer, the effect of convexity on depth percepts is reversed.

An optimal Bayesian model shows that the relative reliability of error signals and the stability of the visuo-motor mapping predict the rate at which subjects correct mapping errors.
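One standard Bayesian model with this behavior is a scalar Kalman filter, used here as an assumed stand-in since the text does not name a specific model. In the sketch, `mapping_var` is the trial-to-trial variance of the mapping (a less stable mapping has a larger value) and `signal_var` is the variance of the error signal (a less reliable signal has a larger value). Both names and all numbers are hypothetical.

```python
import math


def steady_state_gain(mapping_var, signal_var, n_iter=500):
    """Steady-state Kalman gain for a random-walk mapping observed with noise."""
    p = 1.0       # initial uncertainty about the mapping; its value washes out
    gain = 0.0
    for _ in range(n_iter):
        p_pred = p + mapping_var                # mapping may have drifted
        gain = p_pred / (p_pred + signal_var)   # weight given to the new error
        p = (1.0 - gain) * p_pred               # uncertainty after the update
    return gain


def trials_to_halve_error(gain):
    """Trials needed to halve the expected error after a step change.

    Each trial corrects a fraction `gain` of the remaining expected error,
    so the error decays as (1 - gain) ** n.
    """
    return math.log(0.5) / math.log(1.0 - gain)


baseline = steady_state_gain(mapping_var=0.1, signal_var=1.0)
noisy_signal = steady_state_gain(mapping_var=0.1, signal_var=4.0)
unstable_mapping = steady_state_gain(mapping_var=0.4, signal_var=1.0)
```

In this sketch a noisier error signal lowers the gain, so correction is slower, while a less stable mapping raises it, so correction is faster.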

We are also preparing a manuscript on slant perception that follows similar logic. First, we measured the distribution of slants in natural scenes. Next, we collected human psychophysical data on a custom-built rig. Then we compared the pattern of biases in human slant perception to the pattern predicted by an ideal observer that had internalized the natural scene statistics. We found that humans behave as if they have internalized a close approximation to the naturally occurring distribution of slants.

When visual and haptic cues are miscalibrated (signals disagree on average), the relative amount of recalibration depends on the relative reliability of the two cues.
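A minimal sketch of one such rule follows. It assumes a variance-weighted split in which the less reliable cue absorbs the larger share of the discrepancy. The text says only that the split depends on relative reliability, so the direction and exact form of the rule, the function names, and the numbers are all assumptions.

```python
def recalibration_shares(var_visual, var_haptic):
    """Fraction of the discrepancy absorbed by each cue (shares sum to 1).

    Assumed rule: a cue's share is proportional to its own variance,
    so the noisier cue recalibrates more.
    """
    total = var_visual + var_haptic
    return var_visual / total, var_haptic / total


def recalibrate(visual_bias, haptic_bias, var_visual, var_haptic):
    """Shift both cues toward each other until they agree on average."""
    share_visual, share_haptic = recalibration_shares(var_visual, var_haptic)
    discrepancy = visual_bias - haptic_bias
    new_visual = visual_bias - share_visual * discrepancy
    new_haptic = haptic_bias + share_haptic * discrepancy
    return new_visual, new_haptic


# Vision reads 2 units high relative to touch; touch is three times noisier.
new_visual, new_haptic = recalibrate(2.0, 0.0, var_visual=1.0, var_haptic=3.0)
```

In this example the visual estimate moves by 0.5 units and the haptic estimate by 1.5 units, so the two cues meet at 1.5.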