## Computational Intelligence Lab

### Spring Semester 2017

This laboratory course teaches fundamental concepts in computational science and machine learning based on **matrix factorization**,

**X** ≈ **A** · **B**,

where a data matrix **X** is (approximately) factorized into two matrices **A** and **B**. Depending on the choice of approximation quality measure and the constraints on **A** and **B**, the method provides a powerful framework of numerical linear algebra that encompasses many important techniques, such as dimension reduction, clustering and sparse coding.
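As a rough illustration of the idea, the sketch below factorizes a small matrix **X** into **A** and **B** with a truncated singular value decomposition, which gives the best rank-k approximation in the Frobenius norm. The function name `truncated_svd_factorization` and the toy data are assumptions made for this example and are not part of the course material.

```python
import numpy as np

def truncated_svd_factorization(X, k):
    """Return A (m x k) and B (k x n) so that A @ B is the best
    rank-k approximation of X in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    A = U[:, :k] * s[:k]   # first k left singular vectors, scaled by singular values
    B = Vt[:k, :]          # first k right singular vectors, stored as rows
    return A, B

# Hypothetical toy data: a 6 x 5 matrix that has rank 2 by construction.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))

A, B = truncated_svd_factorization(X, k=2)
print(A.shape, B.shape)
print(np.allclose(X, A @ B))  # exact up to rounding, since X has rank 2
```

Other choices of quality measure and constraints (for example non-negativity or sparsity of **A** and **B**) lead to the other techniques covered in the lectures.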

### News

Date | What? |
---|---|
9 February 2017 | Web site is online |
23 February 2017 | Posted lecture 1 |
25 February 2017 | Pictures of the blackboard will be posted here |
1 March 2017 | Posted exercise 1 |
10 March 2017 | Posted lecture 2 |
11 March 2017 | Posted exercise 3 |
11 March 2017 | Collaborative Filtering Kaggle competition is online here. |
17 March 2017 | Posted lecture 3 + exercise 4 |
23 March 2017 | Posted lecture 4 |
27 March 2017 | Posted exercise 5 |
31 March 2017 | Posted lecture 5 + exercise 6 |
7 April 2017 | Posted lecture 6 |
21 April 2017 | Posted exercise 7 |
26 April 2017 | Posted lecture 7 |
1 May 2017 | Posted exercise 8 |
4 May 2017 | Posted lecture 8 |
7 May 2017 | Posted exercise 9 |
11 May 2017 | Posted lecture 9 |
12 May 2017 | Posted exercise 10 |
19 May 2017 | Posted lecture 10 |
23 May 2017 | Posted exercise 11 |
1 June 2017 | Posted lecture 11 |
2 June 2017 | Posted exercise 12 |

### Course Overview

Week | Topic | Lecture | Exercise |
---|---|---|---|
8 | Introduction to the course | CIL2017-00 | Exercise 1 |
9 | Principal Component Analysis | CIL2017-01-PCA | Exercise 2 |
10 | Singular Value Decomposition | CIL2017-02-SVD | Exercise 3 |
11 | Optimization | CIL2017-03-Optimization | Exercise 4 |
12 | NMF | CIL2017-04-Non-Negative | Exercise 5 |
13 | Word embeddings | CIL2017-05-Word-Emeddings | Exercise 6 |
14 | Clustering Mixture Models | CIL2017-06-Clustering Mixture Models | Exercise 7 |
15 | (no lecture, Easter holiday) | | |
16 | (no lecture, Easter holiday) | | |
17 | Neural Networks | CIL2017-07-Neural Networks | Exercise 8 |
18 | Convolutional Neural Networks | CIL2017-08-Convolutional-Neural-Networks | Exercise 9 |
19 | Sparse coding | CIL2017-09-Sparse-Coding | Exercise 10 |
20 | Dictionary learning | CIL2017-10-Dictionary-Learning | Exercise 11 |
21 | (no lecture) | | (no exercise) |
22 | Robust PCA | CIL2017-11-Robust-PCA | Exercise 12 |

### Classes

What | Time | Room |
---|---|---|
Lectures | Fri 10 - 12 | ML D 28 |
Exercises | Thu 15 - 17 | CAB G 51 |
Exercises | Thu 16 - 18 | CAB G 61 |
Exercises | Fri 15 - 17 | CAB G 61 |
Presence hour | Mon 11 - 12 | CAB H 53 |

#### Exercises and Assignments

Each exercise session provides a pen-and-paper problem, and the solution is discussed in the session. These problems help consolidate the theory presented in the lecture and reveal gaps in understanding.

Assignments are larger problems that you work on in groups of **three or four** students. For each assignment, you develop and implement an algorithm that solves one of the three application problems. Submitting the predictions of your method gives you feedback on its accuracy and efficiency.

#### Semester Project

Based on the implementations you developed during the semester, you create a novel solution for one of the application problems, by extending or combining previous work. You write up your methodology and an experimental comparison to baselines in the form of a short scientific paper. You submit your novel solution to the online ranking system for competitive comparison.

Projects are due on Tuesday July 4th, 2017 (midnight).

See detailed description of the projects here

#### Written Exam *(TBD)*

The examination is written and lasts 120 minutes. The language of examination is English. As written aids, you may bring two A4 pages (i.e. one A4 sheet of paper), either handwritten or typed in a font size of at least 11 points.

To help with exam preparation, old CIL exams can be found here:

- cil-exam-2010.pdf
- cil-exam-2012.pdf
- cil-exam-2015.pdf

#### Grade

Your final grade will be determined by the written final exam (2/3 weight) and the semester project (1/3 weight). Your semester project only accounts for a bonus, i.e. it will only be counted if it exceeds your exam grade.
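One way to read this rule is sketched below: the weighted average is used only when the project grade exceeds the exam grade, and otherwise the exam grade stands alone. The function name `final_grade` is made up for this example, and rounding is not specified on this page, so none is applied. This is an interpretation and not an official formula.

```python
def final_grade(exam, project):
    """Final grade under the assumed bonus rule (no rounding applied)."""
    weighted = (2 * exam + project) / 3   # exam 2/3 weight, project 1/3 weight
    # Assumption: the project only counts if it exceeds the exam grade.
    return weighted if project > exam else exam

print(final_grade(4.5, 6.0))  # project is better than the exam, so it counts: 5.0
print(final_grade(5.0, 4.0))  # project is worse than the exam, so it is ignored: 5.0
```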

### Programming Assignments

This lab course has a strong focus on practical assignments. Students can work in groups to develop solutions to one of three application problems.

#### Solving Assignments

For solving assignments, you...

- work in groups of two to three students (no more, no less)
- download the assignment description sheet (see the exercise sheets in the syllabus)
- download Python function skeletons, training data and an evaluation script
- develop, debug and optimise your solution on the training data
- submit your solution to online evaluation on test data
- see where you stand in a ranking of all submissions

### Semester Project

Your semester project is a group effort. It consists of four parts:

- The programming assignments you solve during the semester (not graded!).
- Developing a novel solution for one of the three application problems, e.g. by combining methods from previous programming assignments into a novel solution.
- Comparing your novel solution to previous assignments.
- Writing up your findings in a short scientific paper.

If you don't belong to any group so far, please contact the VIS forum, and inform us by 27 May 2017.

Students repeating the course can either submit the previous year's project as-is or submit a new project in a (new) regular group. If you resubmit the previous year's project, you have to be a group of one person and inform us in advance.

#### Developing a Novel Solution

As your final programming assignment, you develop a novel solution to one of the three application problems. You are free to exploit any idea you have, provided it is not identical to any other group submission or existing implementation of an algorithm on the internet.

Two examples for developing a novel solution:

- You implemented a collaborative filtering algorithm based on dimension reduction as part of an assignment. Now you apply dimension reduction to one of the two other tasks.
- You implemented both a clustering and a sparse coding algorithm for image compression. Now you combine both techniques into a novel compression method.
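
To make the first example more concrete, here is a minimal sketch of collaborative filtering by dimension reduction: missing ratings are filled with item means, and a rank-k SVD of the filled matrix serves as the prediction. The function name `svd_impute`, the use of NaN for missing ratings and the toy data are assumptions for illustration. The actual assignment skeletons and data format may differ.

```python
import numpy as np

def svd_impute(ratings, k):
    """Predict all ratings from a rank-k SVD of the mean-filled matrix.

    `ratings` is a users x items array in which NaN marks a missing rating.
    """
    observed = ~np.isnan(ratings)
    # Fill each missing entry with the mean observed rating of its item (column).
    col_means = np.nanmean(ratings, axis=0)
    filled = np.where(observed, ratings, col_means)
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Hypothetical toy data: 4 users x 3 items.
R = np.array([[5.0, 4.0, np.nan],
              [4.0, np.nan, 1.0],
              [1.0, 2.0, 5.0],
              [np.nan, 1.0, 4.0]])

pred = svd_impute(R, k=2)
print(pred.shape)  # (4, 3)
```

A baseline like this can then be adapted to another task or combined with a second technique to form the novel solution.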

#### Comparison to Baselines

You compare your novel algorithm to at least two baseline algorithms. For the baselines, you can use the implementations you developed as part of the programming assignments.

#### Ranking of Novel Solution

You submit your novel algorithm to the online ranking system.

#### Write Up

Technical report: how to write a scientific paper. [PDF] [source]

The document must be a maximum of **4 pages** (excluding references).

#### Grading

There are two different types of grading criteria applied to your project, with the corresponding weights shown in brackets.

#### Competitive

The ranks in the Kaggle competition system will be converted on a linear scale into a grade between 4 and 6. The competitive part counts for 30% of the project grade.

#### Non-competitive

The following criteria are graded based on an evaluation by the teaching assistants.

- quality of paper (30%)
- quality of implementation (20%)
- creativity of solution (20%)
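
The weighting above can be sketched as follows. The exact rank-to-grade mapping is not given here, so the code assumes that rank 1 maps to grade 6, the last rank maps to grade 4, and ranks in between are interpolated linearly. The function names and the assumption that all criteria are graded on the same scale are illustrative only.

```python
def rank_to_grade(rank, n_groups, best=6.0, worst=4.0):
    """Linearly map rank 1..n_groups to a grade from `best` down to `worst`.

    Assumption: rank 1 receives `best` and rank n_groups receives `worst`.
    """
    if n_groups == 1:
        return best
    return best - (best - worst) * (rank - 1) / (n_groups - 1)

def project_grade(rank, n_groups, paper, implementation, creativity):
    """Combine the competitive and non-competitive parts with the listed weights."""
    competitive = rank_to_grade(rank, n_groups)
    return (0.30 * competitive      # Kaggle ranking
            + 0.30 * paper          # quality of paper
            + 0.20 * implementation # quality of implementation
            + 0.20 * creativity)    # creativity of solution

print(rank_to_grade(1, 21))   # best rank
print(rank_to_grade(21, 21))  # last rank
print(project_grade(11, 21, paper=5.0, implementation=5.5, creativity=4.5))
```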

#### Paper Submission

To submit your report, please go to https://cmt3.research.microsoft.com/CILETHZ2017, register and follow the instructions given there. You can resubmit any number of times until the deadline passes.

#### Instructions

For a successful submission please follow these steps:

- Your group should consist of three or four students registered to the CIL 2017 course.
- Register your group on the Kaggle and paper submission webpages.
- When you upload your final paper, also upload the Python code that generates the exact predictions used for your final Kaggle submission. The code should be well documented and generate the predictions in the required format as uploaded to Kaggle.
  **For reproducibility of your approach (as described in your paper), you should also include the additional code used to produce plots, additional experiments, etc.**
- Prepare your project paper as described on the course webpage. Include the name of your group in the header of the submitted PDF file, e.g.: \author{Hans Mustermann and John Doe\\group: mustermann_doe, Department of Computer Science, ETH Zurich, Switzerland}
- Attach the signed plagiarism form at the end of your paper (scan).
- Submit the paper through the CMT system.

#### Acknowledgements

We are grateful to Kaggle In-Class and to Microsoft Research for allowing us to use their systems for the submission of project predictions and reports.

### Reading Material

Here is a list of additional material if you want to read up on a specific topic of the course.

#### Probability Theory and Statistics

Chapter 1.2 "Probability Theory" in: Christopher M. Bishop (2006). *Pattern Recognition and Machine Learning*. Springer.

Larry Wasserman (2003). *All of Statistics*. Springer.

#### Linear Algebra

Gene Golub and Charles Van Loan (1996). *Matrix Computations*. The Johns Hopkins University Press.

Lloyd Trefethen and David Bau (1997). *Numerical Linear Algebra*. SIAM.

Dave Austin. *We recommend a Singular Value Decomposition*. (SVD tutorial)

Michael Mahoney. *Randomized algorithms for matrices and data*. (Recent review paper)

#### Collaborative Filtering

Yehuda Koren, Robert Bell and Chris Volinsky (2009). *Matrix Factorization Techniques for Recommender Systems*. IEEE Computer.

#### Clustering

Chapter 9 "Mixture Models and EM" in: Christopher M. Bishop (2006). *Pattern Recognition and Machine Learning*. Springer.

#### Neural Networks

TensorFlow tutorials and the Udacity Deep Learning lecture.

#### Sparse Coding

Chapter 1 "Sparse Representations" in: Stephane Mallat (2009). *A Wavelet Tour of Signal Processing - The Sparse Way*. Academic Press.

Chapter 6 "Wavelet Transforms", pp. 244-276; in: Andrew S. Glassner (1995). *Principles of Digital Image Synthesis*, Vol. 1. Morgan Kaufmann Publishers, Inc.

Chapter 13 "Fourier and Spectral Applications", pp. 699-716; in: William H. Press, Saul A. Teukolsky, William T. Vetterling and Brian P. Flannery (2007). *Numerical Recipes. The Art of Scientific Computing*. Cambridge University Press.

Aharon, Elad and Bruckstein (2005). *K-SVD: Design of Dictionaries for Sparse Representation.* Proceedings of SPARS.

Richard Baraniuk (2007). *Compressive sensing*. IEEE Signal Processing Magazine.

### Frequently Asked Questions

How do I run Python and TensorFlow on the Euler cluster? Follow this link.

### Contact

You can ask questions on Piazza. Please post questions there, so others can see them and share in the discussion.

If you have questions which are not of general interest, please don't hesitate to contact us directly.

The main email point of contact for the course is CILAB.

Role | Name |
---|---|
Lecturer | Prof. Thomas Hofmann |
Head Assistant | Aurelien Lucchi |
Assistants | Gary Becigneul, Andrew Bian, Hadi Daneshmand, Octavian Ganea, Paulina Grnarova, Yannic Kilcher, Matthias Hüser, Francesco Locatello, Xinrui Lyu |