Research

My research explores how changing common assumptions about visual algorithms leads to new problems and new capabilities. Over the last several years, I have posed three questions whose answers suggest paradigms for collecting and analyzing imagery, with applications to surveillance, robotics, and environmental and medical imaging. Questions that have recently kept me awake at night include:

Passive Vision is the analysis of video taken by cameras that are not moving. Many cameras do not move, and continually watch a specific scene - an airport security desk, a beach, a volcano - for months or years. Much as Active Vision (the ability to intentionally control camera motion) simplifies problems in structure from motion, Passive Vision simplifies statistical image analysis by observing the same scene for very long time periods. These statistics support algorithms for more robust video surveillance, the ability to geo-locate any webcam feed, and the potential to re-purpose webcams for environmental monitoring.

One basic question is: "where is the camera?" There are many live webcams broadcasting online from unknown locations; these cameras can be geo-located because the lighting and weather changes they observe depend on where the camera is. Our paper on webcam geolocation [jacobs2007b] offered the first algorithms to geolocate a time series. The algorithm used tensor factorization of imagery that we found to be consistent across nearly all outdoor camera scenes [jacobs2007a]. Related cues helped to geo-calibrate cameras (i.e., find their orientation and zoom level) [jacobs08].
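To make the dependence of lighting on location concrete, here is a minimal sketch that recovers a rough latitude and longitude from the UTC times at which a camera sees sunrise and sunset. It is not the factorization-based method of [jacobs2007b]; it swaps in the textbook sunrise equation, and the function names, the declination approximation, and the example numbers are all illustrative assumptions.

```python
import math

def solar_declination_deg(day_of_year):
    """Approximate solar declination in degrees (simple cosine model)."""
    return -23.44 * math.cos(2.0 * math.pi * (day_of_year + 10) / 365.0)

def estimate_location(sunrise_utc_h, sunset_utc_h, day_of_year):
    """Rough (latitude, longitude) in degrees from UTC sunrise/sunset hours.

    Assumptions: ignores the equation of time and atmospheric refraction,
    and is ill-conditioned near the equinoxes, where declination is close
    to zero. Sunset may exceed 24.0 if it falls after midnight UTC.
    """
    # Solar noon is midway between sunrise and sunset; the Earth turns
    # 15 degrees per hour, so its offset from 12:00 UTC gives longitude.
    solar_noon = 0.5 * (sunrise_utc_h + sunset_utc_h)
    longitude = 15.0 * (12.0 - solar_noon)

    # Half the day length, expressed as an hour angle H.
    half_day_angle = math.radians(15.0 * 0.5 * (sunset_utc_h - sunrise_utc_h))
    decl = math.radians(solar_declination_deg(day_of_year))

    # Sunrise equation: cos(H) = -tan(latitude) * tan(declination).
    latitude = math.degrees(
        math.atan(-math.cos(half_day_angle) / math.tan(decl))
    )
    return latitude, longitude

# Hypothetical observation near the June solstice (day 172).
lat, lon = estimate_location(10.58, 25.42, 172)
print(f"latitude ~ {lat:.1f}, longitude ~ {lon:.1f}")
```

In practice the sunrise and sunset times would themselves have to be estimated from image brightness, and weather makes that noisy, which is part of why statistics gathered over months of imagery are useful.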

Another natural question is "what is in the scene?" Classical approaches attempt to recognize objects by their appearance in one image, but we have explored what can be learned by measuring the time scale over which things change. Tensor factorization of long-term time-lapses gives an approach to automatically labeling scene locations (like trees) that vary over annual time scales, locations (like east-facing walls) that are consistently brighter in the morning [jacobs2007a], or segmenting objects in a scene based on very small motions [dixon2011]. At shorter time scales, we have begun to explore variations in lighting due to clouds as a form of “stochastically structured light”, and recently derived constraints for building a 3D model of a scene from a time-lapse of clouds passing overhead [jacobs10integral].
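The idea of labeling scene locations by the time scale over which they change can be sketched on synthetic data. The example below is an illustration only: it substitutes a plain matrix SVD for the tensor factorization used in [jacobs2007a], and the function names, the two-component setup, and the synthetic "slow" and "fast" signals are assumptions made for the sketch.

```python
import numpy as np

def factor_time_lapse(frames, n_components=2):
    """Factor a (T, P) time-lapse matrix: one row per frame, one column per pixel.

    Returns temporal bases of shape (T, k) and per-pixel loadings of shape (k, P).
    """
    # Remove each pixel's mean brightness so components capture variation only.
    centered = frames - frames.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components], s[:n_components, None] * vt[:n_components]

def label_pixels(loadings):
    """Assign each pixel to the component with the largest absolute loading."""
    return np.argmax(np.abs(loadings), axis=0)

# Synthetic scene: 400 frames, 10 pixels.
t = np.arange(400)
slow = np.sin(2.0 * np.pi * t / 400.0)  # one cycle over the whole sequence
fast = np.sin(2.0 * np.pi * t / 8.0)    # fifty cycles over the sequence

rng = np.random.default_rng(0)
frames = np.empty((400, 10))
frames[:, :5] = 2.0 * slow[:, None]     # pixels 0-4 vary slowly (e.g. foliage)
frames[:, 5:] = 1.0 * fast[:, None]     # pixels 5-9 vary quickly
frames += 0.01 * rng.standard_normal(frames.shape)  # small sensor noise

bases, loadings = factor_time_lapse(frames, n_components=2)
labels = label_pixels(loadings)
print(labels)
```

Each pixel ends up grouped with the others that share its temporal behavior, without any use of its appearance in a single frame. Real outdoor imagery is far less clean than this, so the sketch should be read as a picture of the principle rather than of the published algorithm.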

To support our research, and the larger community, we have built and actively share the Archive of Many Outdoor Scenes (AMOS). AMOS provides a variety of tools for large-scale data visualization and integration with Google Earth [1], and is widely used as an experimental platform to ground research in webcam geo-location and calibration (e.g. [ cites]). My former student, Nathan Jacobs, maintains pages about parts of this project:

  1. Shape from Clouds
  2. Geo-location

Acknowledgements

This project is supported under NSF IIS 0546383: "CAREER: Passive Vision, What Can Be Learned by a Stationary Observer". Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.