Object Recognition
Object recognition is a topic of ongoing interest to me. An attraction is that we don't really know what it means to be doing object recognition, and the collective view of what the problem is has changed rather sharply on several occasions. One should use image cues to identify image regions as containing objects belonging to categories --- but what are categories? One source of information is language, and we have looked at the way people annotate images as a cue to what is depicted there.
There is a series of themes in my work here, which I break out in the list of papers. I have been fortunate to work with several talented people on this topic, including Joe Mundy, Andrew Zisserman, Charlie Rothwell and Michael Jordan; Kobus Barnard, Nando de Freitas and Erik Learned-Miller, who worked with me when they were post-docs at Berkeley; Pinar Duygulu, who worked with me as a visiting scholar at Berkeley; and Tamara Berg, Alex Berg, Jaety Edwards, Michael Maire and Ryan White, who worked with me when they were students at Berkeley.
Current students working with me on object recognition are Ian Endres, Ali Farhadi, Varsha Hedau, Nicolas Loeff, Alex Sorokin and Gang Wang.
I have done quite a lot of work on human activity recognition with NSF support, which is also described on this page.
Applications of Object Recognition
- Forsyth, D.A., Malik, J. and Wilensky, R., ``Searching for digital pictures,'' Scientific American, June, 1997.
Words and pictures
The words that appear near pictures are very revealing of what might appear in the picture. Co-occurrence cues can be used to link images with words, or image regions with particular words. A nearby picture could disambiguate the sense of a word. Finally, there are language cues that suggest some objects are present.
Image-word co-occurrence
- C.O. Alm, N. Loeff and D.A. Forsyth, Challenges for annotating images for sense disambiguation, Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora 2006, pages 1–4, Sydney, July 2006
- N. Loeff, C.O. Alm, D.A. Forsyth, Discriminating image senses by clustering with multimodal features, Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 547–554, Sydney, July 2006
- T. Berg and D.A. Forsyth, Animals on the Web, Proc CVPR 2006
- Jaety Edwards, Ryan White and David Forsyth, "Words and Pictures in the News", HLT-NAACL03 Workshop on Learning Word Meaning from Non-Linguistic Data, 2003
- K. Barnard and D.A. Forsyth, ``Learning the semantics of words and pictures,'' Proc. IEEE International Conference on Computer Vision, pp. 408-415, July, 2001
- K. Barnard and D.A. Forsyth, ``Clustering Art,'' IEEE Conf. Computer Vision and Pattern Recognition, 2001
Region-word co-occurrence
- Kobus Barnard, Pinar Duygulu, Raghavendra Guru, Prasad Gabbur, and David Forsyth, "The effects of segmentation and feature choice in a translation model of object recognition", in Proceedings of Computer Vision and Pattern Recognition (CVPR), Madison, Wisconsin, June 2003.
- Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan, "Matching Words and Pictures", Journal of Machine Learning Research, Vol 3, pp 1107-1135, 2003
- P. Duygulu, N. de Freitas, K. Barnard and D.A. Forsyth, "Object Recognition as Machine Translation," Proc. European Conf. Computer Vision, 2002. Best paper in cognitive computer vision award
Word senses
- Kobus Barnard, Matthew Johnson and David Forsyth, "Word Sense Disambiguation with Pictures", HLT-NAACL03 Workshop on Learning Word Meaning from Non-Linguistic Data, 2003
Language cues to depiction
- Tamara Miller, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, David A. Forsyth, "Faces and Names in the News", Proceedings of Computer Vision and Pattern Recognition (CVPR), Washington DC, June 2004
- Tamara L. Berg, Alexander C. Berg, Jaety Edwards, David A. Forsyth, ``Who's in the Picture,'' Neural Information Processing Systems (NIPS), 2004
Object Representation
- Himanshu Arora, Nicolas Loeff, David Forsyth, and Narendra Ahuja, “Unsupervised Segmentation of Objects using Efficient Learning,” IEEE Conference on Computer Vision and Pattern Recognition, 2007
- N. Loeff, A. Sorokin, H. Arora and D.A. Forsyth, ``Efficient Unsupervised Learning for Localization and Detection in Object Categories'', NIPS 2005
- D.A. Forsyth; J. Malik; M. M. Fleck; H. Greenspan; T. K. Leung; S. Belongie; C. Carson; C. Bregler "Finding Pictures of Objects in Large Collections of Images" Proceedings of the International Workshop on Object Representation in Computer Vision II, LNCS; Vol. 1144, 335 - 360, 1996
Geometric methods for representing objects
- D.A. Forsyth, ``Recognizing Algebraic Surfaces from their Outlines,'' International J. of Computer Vision, 18, 1, 21-40, 1996.
- Zisserman, A., Mundy, J.L., Forsyth, D.A., Liu, J.S., Pillow, N., Rothwell, C.A. and Utcke, S., ``Class-based grouping in perspective images'', Proceedings of the 5th International Conference on Computer Vision, Boston, MA., June 20-23, 1995
- Rothwell, C.A., Zisserman, A., Forsyth, D.A. and Mundy, J.L., ``Planar Object Recognition using Projective Shape Representation'', International J. of Computer Vision, Vol 16, p. 57-59, 1995.
- Zisserman, A., Forsyth, D.A., Mundy, J.L., Rothwell, C.A., and Liu, J.S., ``3D Object Recognition using Invariance,'' Artificial Intelligence, 78, 239-288, 1995.
- Forsyth, D.A., Mundy, J.L., Rothwell, C.A. and Zisserman, A., ``Using global consistency to recognise Euclidean objects with an uncalibrated camera'', Proc CVPR-94, 502-507, 1994
- Rothwell, C.A., Forsyth, D.A., Zisserman, A. and Mundy, J.L., ``Extracting projective structure from single perspective views of 3D point sets,'' International Conference on Computer Vision, Berlin, 573-582, 1993. Best paper award (Marr prize)
- D.A. Forsyth, ``Recognizing Algebraic Surfaces from their Outlines,'' International Conference on Computer Vision, Berlin, 476-480, 1993.
- J. Liu, J.L. Mundy, D.A. Forsyth, A.P. Zisserman and C.A. Rothwell, ``Efficient Recognition of rotationally symmetric surfaces and straight homogeneous generalized cylinders,'' IEEE conference on Computer Vision and Pattern Recognition '93, 1993.
- Forsyth, D.A. and Rothwell, C.A., ``Recognising extruded surfaces from their outlines'', Proc. Allerton Conference on communication, computing and control, 1993.
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P., ``Transformational Invariance- a primer,'' Image and Vision Computing, 10, 1, 39-45, 1992.
- Rothwell, C.A., A. Zisserman, D.A. Forsyth and J.L. Mundy, ``Canonical Frames for Planar Object Recognition,'' 2nd European Conference on Computer Vision, Springer Verlag Lecture Notes in Computer Science, 757-772, 1992.
- Rothwell, C.A., A. Zisserman, J.L. Mundy and D.A. Forsyth, ``Efficient Model Library Access by Projectively Invariant Indexing Functions,'' Computer Vision and Pattern Recognition 92, 109-114, 1992.
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P. and Rothwell, C.A., ``Recognising rotationally symmetric surfaces from their outlines,'' 2nd European Conference on Computer Vision, 639-648, 1992.
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P. and Brown, C.M., ``Projectively Invariant Representations Using Implicit Algebraic Curves,'' 9, 2, 130-136, 1991.
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P., Coelho, C., Heller, A. and Rothwell, C.A., ``Invariant descriptors for 3D object recognition and pose,'' Special issue of I.E.E.E. Pattern Analysis and Machine Intelligence on 3D object recognition, 971-991, October, 1991.
- Rothwell, C.A., Zisserman, A.P., Forsyth, D.A. and Mundy, J.L., ``Using Projective Invariants for constant time library indexing in model based vision,'' Proc. British Machine Vision Conference, 1991.
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P. and Brown, C.M., ``Projectively Invariant Representations Using Implicit Algebraic Curves,'' 1st European Conference on Machine Vision, 427-436, 1990. (also OU Internal report no: 1829/90)
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P. and Brown, C.M., ``Invariance-a new framework for vision,'' 3rd International Conference on Computer Vision, 598-605, 1990. (also OU Internal report no: 1830/90)
- Forsyth, D.A., Mundy, J.L., Zisserman, A.P. and Brown, C.M., ``Transformational Invariance- a primer,'' British Machine Vision Association Conference, 1-6, 1990
- Zisserman, A.P., Marinos, C.I., Forsyth, D.A., Mundy, J.L. and Rothwell, C.A., ``Relative motion and pose from invariants,'' British Machine Vision Association Conference, 7-12, 1990.