ICS 691 Machine Learning
Class objectives
This course introduces the area of machine learning. Concepts and
typical problems are discussed, and frequently used machine learning
methods and classical work are introduced. The course then focuses on
information theoretic approaches to machine learning, beginning with an
introduction to information theory. Motivations from information
processing in the brain are discussed, followed by applications to
bioinformatics and other areas of interest.
Organization
The course combines lectures with seminars in which students present a
paper. The presentations count for 40% of the final grade. The remaining
60% is determined by a final exam. Students may choose to do a class
project instead of the exam.
Schedule
Week | Subject | Lecture / Seminar: discussed papers
1 | Introduction to machine learning. Supervised learning: Regression. Cross-validation. Bayesian estimation. | Lecture
2 | Neural Networks intro: Perceptrons. | Lecture
3 | Reinforcement Learning. | Guest Lecture by Dr. Chris Watkins, Royal Holloway, University of London.
4 | Neural Networks. Support vector machines. | Lecture
5 | Unsupervised learning and cluster analysis. | Lecture
6 | Quantifying information transmission and learning. Introduction to information theory. | Lecture
7 | Optimality in neural information processing systems. | W. Bialek, F. Rieke, R. de Ruyter van Steveninck, & D. Warland, Reading a neural code, Science 252 1854-57 (1991); Fairhall AL, Lewen GD, Bialek W, de Ruyter Van Steveninck RR, Efficiency and ambiguity in an adaptive neural code, Nature, 412(6849), 787-92, August 2001
8 | Applications of Neural Networks | [Ning et al., 2005]: Toward Automatic Phenotyping of Developing Embryos from Videos (IEEE Trans. Image Processing, 2005)
9 | Lossy compression and unsupervised learning. 1: Rate distortion theory and clustering. | Lecture
10 | 2: Compression and relevance. | Lecture
11 | 3: Complexity control. | Lecture
12 | Behavioral and interactive learning. | Lecture
13 | Advanced topics or applications. | 2 Papers of student choice. Possible examples.
14 | Applications to molecular biology and bioinformatics. | 2 Papers of student choice.
15 | Student project reports. |
Reference books (not required):
- Mitchell, "Machine Learning"
- MacKay, "Information Theory, Inference and Learning Algorithms"
- Duda, Hart and Stork, "Pattern Classification"
- Alpaydin, "Introduction to Machine Learning"
- Hastie, Tibshirani and Friedman, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction"
- Cristianini and Shawe-Taylor, "An Introduction to Support Vector Machines"
- Sutton and Barto, "Reinforcement Learning"
- Cover and Thomas, "Elements of Information Theory". See Cover's website.
- Gordon, "Classification"
- Hertz, Krogh and Palmer, "Introduction to the Theory of Neural Computation" (read Chapter 2 to review the material on Hopfield nets)
Useful web sites:
Journal of Machine Learning Research
http://jmlr.csail.mit.edu/
Kernel Machines
http://www.kernel-machines.org
MacKay, "Information Theory, Inference and Learning Algorithms" online:
http://www.inference.phy.cam.ac.uk/mackay/itila/book.html
Independent Component Analysis
http://www.cnl.salk.edu/~tewon/ica_cnl.html
Citeseer:
http://citeseer.ist.psu.edu/
NIPS Proceedings:
books.nips.cc
Some papers, etc.:
- Kurt Hornik, Maxwell B. Stinchcombe, Halbert White: Multilayer feedforward networks are universal approximators. Neural Networks 2(5): 359-366 (1989)
- George Cybenko. Continuous valued neural networks with two hidden layers are sufficient. Technical report, Department of Computer Science, Tufts University, Medford, MA, 1988. (link to his publications)
- NETtalk
- Digit recognition
- Hopfield net: Neural networks and physical systems with emergent collective computational abilities. PNAS 79, 2554, 1982.
- Kernel PCA: Schölkopf, B., A.J. Smola and K.-R. Müller: Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 10(5), 1299-1319 (1998)
- Clustering: http://www.cs.uwaterloo.ca/~shai/LuxburgBendavid05.pdf; J. Buhmann and M. Held (2000): Model selection in clustering by uniform convergence bounds, NIPS Proceedings.
- Rose et al. (1999)
Mini-quizzes (not graded):
The intention here is to remind you of all the subjects covered in the
lecture. Take a little time at the end of each lecture to write down the
main ideas for each subject. This may help you to assess your
understanding and to formulate questions if necessary.
Notes and announcements:
- (9/7) Guest Lecture by Dr. Chris Watkins, Royal Holloway, University of London.
- (9/7) Students should now start choosing papers to present and a project to work on.