CS 598 Optimization methods in vision and learning

Notes

  1. Week 1: Basic continuous optimization (descent directions, coordinate ascent, EM as coordinate ascent, Newton's method, stabilized Newton's method; a code sketch appears after this list); My notes
  2. Weeks 2 and 3: More basic continuous optimization (trust regions, dogleg method, subspace methods); My notes. (Conjugate gradient, approximate Hessian methods, BFGS, limited-memory methods); My notes.
  3. Weeks 4 and 5: Constrained optimization methods (Lagrangians, Lagrange duals, SVMs, quadratic penalty method, augmented Lagrangian method); My notes. (Inequality constraints; boxes, interior point methods); My notes. (Interior point methods); My notes. Logistic regression as a classifier and stochastic gradient descent (my notes; a code sketch appears after this list).
  4. Week 8: Initial remarks on combinatorial optimization (my notes); Flows and cuts (my notes; a max-flow sketch appears after this list)
  5. Week 9: More flows and cuts; matchings (my notes)
  6. Week 10: Max Cut and SDP (my notes)
  7. Week 15: Clean-up on Max Cut and Sudoku (my notes); Submodular functions (my notes)
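
As a concrete illustration of the week 1 material, here is a minimal sketch of a stabilized Newton iteration: the Hessian is shifted by a multiple of the identity until it is positive definite, and the step length is chosen by a backtracking line search. This is one standard stabilization (cf. Nocedal and Wright, ch. 3), not necessarily the exact variant from the course notes; the test function and all names are placeholders.

```python
import numpy as np

def stabilized_newton(f, grad, hess, x0, tol=1e-8, max_iter=100):
    """Newton's method with two standard stabilizations: shift the
    Hessian by tau*I until it is positive definite, and pick the step
    length by a backtracking (Armijo) line search."""
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        # Increase tau until Cholesky succeeds, so that the Newton
        # direction is guaranteed to be a descent direction.
        tau = 0.0
        while True:
            try:
                L = np.linalg.cholesky(H + tau * np.eye(len(x)))
                break
            except np.linalg.LinAlgError:
                tau = max(10 * tau, 1e-6)
        # Solve (H + tau*I) p = -g using the Cholesky factor.
        p = -np.linalg.solve(L.T, np.linalg.solve(L, g))
        # Backtracking line search on the sufficient-decrease condition.
        t = 1.0
        while f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        x = x + t * p
    return x

# The Rosenbrock function, a standard nonconvex test problem.
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2*(1 - x[0]) - 400*x[0]*(x[1] - x[0]**2),
                           200*(x[1] - x[0]**2)])
hess = lambda x: np.array([[2 - 400*x[1] + 1200*x[0]**2, -400*x[0]],
                           [-400*x[0], 200.0]])
print(stabilized_newton(f, grad, hess, np.array([-1.2, 1.0])))  # ~ [1. 1.]
```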
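
Likewise, for the logistic regression and stochastic gradient descent topic of weeks 4 and 5, a minimal sketch under simplifying assumptions (constant step size, 0/1 labels, no regularization); the data and names are my own toy example:

```python
import numpy as np

def sgd_logistic(X, y, lr=0.1, epochs=20, seed=0):
    """Fit logistic regression by stochastic gradient descent on the
    log loss, one randomly ordered sample at a time; y has 0/1 labels."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            p = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))  # predicted P(y=1)
            w -= lr * (p - y[i]) * X[i]  # per-sample gradient of the log loss
            b -= lr * (p - y[i])
    return w, b

# Toy data: two Gaussian blobs with labels 0 and 1.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.5, 1, (50, 2)), rng.normal(1.5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
w, b = sgd_logistic(X, y)
print(np.mean(((X @ w + b) > 0) == y))  # training accuracy, close to 1.0
```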
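
For the flows and cuts of weeks 8 and 9, a sketch of the Edmonds-Karp algorithm, one standard way to compute a maximum flow (and hence, by max-flow/min-cut duality, the value of a minimum s-t cut). The dict-of-dicts graph representation is my own choice, not the course's.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp: repeatedly push flow along a shortest augmenting
    path (found by BFS) in the residual graph. `capacity` is a dict of
    dicts with capacity[u][v] = edge capacity; returns the flow value."""
    # Build a mutable residual graph, adding zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u in capacity:
        for v in capacity[u]:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest s-t path with positive residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left: flow is maximum
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float('inf'), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Tiny example: two disjoint s-t paths of capacities 3 and 2.
cap = {'s': {'a': 3, 'b': 2}, 'a': {'t': 3}, 'b': {'t': 2}, 't': {}}
print(max_flow(cap, 's', 't'))  # 5
```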

Resources

    Continuous optimization books
  1. Numerical Optimization (Springer Series in Operations Research and Financial Engineering) by Jorge Nocedal and Stephen Wright, 2006
  2. Convex Optimization by Stephen Boyd and Lieven Vandenberghe, Cambridge, 2004
  3. Nonlinear Programming by Dimitri P. Bertsekas, Athena, 1999
  4. Practical Methods of Optimization by R. Fletcher, Wiley, 2000
  5. Practical Optimization by Philip E. Gill, Walter Murray, Margaret H. Wright, Academic, 1982
  6. Jonathan Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994

Papers

Wed 19 Nov paper:

A combinatorial, primal-dual approach to semidefinite programs, S. Arora and S. Kale

Continuous optimization applications

Iterative scaling and the like

  1. A. Berger, The improved iterative scaling algorithm: A gentle introduction, unpublished manuscript, 1997
  2. Iain Bancarz and M. Osborne, Improved iterative scaling can yield multiple globally optimal models with radically differing performance levels, Proceedings of the 19th International Conference on Computational Linguistics, pp. 1-7, 2002
  3. Robert Malouf, A comparison of algorithms for maximum entropy parameter estimation, Proceedings of the 6th Conference on Natural Language Learning, Volume 20, pp. 1-7, 2002
  4. F. Sha and F. Pereira, Shallow parsing with conditional random fields, Proc. HLT-NAACL, main papers, pp. 134-141, 2003
  5. Hanna Wallach, Efficient Training of Conditional Random Fields, University of Edinburgh, 2002

Boosting

  1. Jerome H. Friedman, Greedy Function Approximation: A Gradient Boosting Machine, The Annals of Statistics, Vol. 29, No. 5 (Oct. 2001), pp. 1189-1232

Bundle adjustment

  1. Bill Triggs, Philip F. McLauchlan, Richard I. Hartley and Andrew W. Fitzgibbon, Bundle Adjustment -- A Modern Synthesis, Vision Algorithms: Theory and Practice: International Workshop on Vision Algorithms, Corfu, Greece, September 1999, pp. 153-177

Matrix factorization

  1. A. M. Buchanan and A. W. Fitzgibbon, Damped Newton algorithms for matrix factorization with missing data, Computer Vision and Pattern Recognition, 2005, pp. 316-322
  2. Jason D. M. Rennie and Nati Srebro, Fast Maximum Margin Matrix Factorization for Collaborative Prediction, in Luc De Raedt, Stefan Wrobel (Eds.), Proceedings of the 22nd International Machine Learning Conference, ACM Press, 2005
  3. N. Loeff and A. Farhadi, Scene Discovery by Matrix Factorization, ECCV 2008
  4. T. Finley and T. Joachims, Supervised clustering with support vector machines, Proceedings of the 22nd International Conference on Machine Learning, pp. 217-224, 2005

Registration

  1. A.W. Fitzgibbon, Robust registration of 2D and 3D point sets, Image and Vision Computing, Volume 21, Issues 13-14, 1 December 2003, Pages 1145-1153

SVMs

  1. C. J. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 2, 121-167, 1998
  2. J. Platt, Sequential minimal optimization: A fast algorithm for training support vector machines, Advances in Kernel Methods - Support Vector Learning, 1999
  3. S. S. Keerthi, S. K. Shevade, C. Bhattacharyya and K. R. K. Murthy, Improvements to Platt's SMO Algorithm for SVM Classifier Design, Neural Computation, 13:637-649, 2001
  4. T. Yeh and T. Darrell, Dynamic visual category learning, CVPR, 2008
  5. K. Zhang, I. W. Tsang and J. T. Kwok, Maximum margin clustering made practical, Proceedings of the 24th International Conference on Machine Learning, pp. 1119-1126, 2007
  6. S. Shalev-Shwartz, Y. Singer and N. Srebro, Pegasos: Primal Estimated sub-GrAdient SOlver for SVM, ICML 2007

Structure learning

  1. I. Tsochantaridis, T. Joachims, T. Hofmann and Y. Altun, Large Margin Methods for Structured and Interdependent Output Variables, Journal of Machine Learning Research (JMLR), 6(Sep):1453-1484, 2005
  2. Nathan D. Ratliff, J. Andrew Bagnell and Martin A. Zinkevich, Subgradient Methods for Maximum Margin Structured Learning
  3. M. Blaschko and C. Lampert, Learning to localize objects with structured output regression, ECCV 2008

Discrete Optimization

Resources

Applications