
Seminar – Advanced Topics in Machine Learning

In this seminar, recent papers from the pattern recognition and machine learning literature are presented and discussed. Possible topics cover statistical models in computer vision, graphical models, and machine learning.

The seminar "Advanced Topics in Pattern Recognition" familiarizes students with recent developments in pattern recognition and machine learning. Original articles have to be presented and critically reviewed. Students will learn how to structure a scientific presentation in English that covers the key ideas of a scientific paper. An important goal of the seminar presentation is to summarize the essential ideas of the paper in sufficient depth while omitting details that are not essential for understanding the work. Presentation style plays an important role and should reach the level of professional scientific presentations.

The seminar covers a number of recent papers that have emerged as important contributions to the pattern recognition and machine learning literature. The topics vary from year to year but are centered on methodological issues in machine learning, such as new learning algorithms, ensemble methods, or new statistical models for machine learning applications. Papers are frequently selected from computer vision or bioinformatics, two fields which rely more and more on machine learning methodology and statistical models. The papers will be presented in the first session of the seminar.

VVZ information is available here (http://vvz.ethz.ch/Vorlesungsverzeichnis/lerneinheitPre.do?lerneinheitId=100981&semkez=2015W&lang=en).

Contact

Lectures
Tue 16-18, CAB H 52
Thu 16-18, CHN G 22

Professors
Joachim M. Buhmann (http://www.ml.inf.ethz.ch/people/person-detail.html?persid=113456), Thomas Hofmann (http://www.da.inf.ethz.ch/people/ThomasHofmann/), Andreas Krause (/krausea)

Assistants
Hamed Hassani (/people/hamed-hassani), Martin Jaggi (http://www.da.inf.ethz.ch/people/MartinJaggi/), Stefan Bauer (http://www.ml.inf.ethz.ch/people/person-detail.html?persid=158811)

Tuesday Schedule

Date   | Presenter     | Topic
20 Oct | Melis         | Distributed Stochastic Variance Reduced Gradient Methods (2015)
20 Oct | Jagerman      | Communication Efficient Distributed Machine Learning with the Parameter Server (2014)
27 Oct | Pilgerstorfer | Adaptive subgradient methods for online learning and stochastic optimization (2011)
27 Oct | Mutny         | Stochastic dual coordinate ascent methods for regularized loss (2013)
03 Nov | Karaivanov    | Beyond Convexity: Stochastic Quasi-Convex Optimization (2015)
03 Nov | Wang          | Non-convex Robust PCA (2014)
10 Nov | Holmer        | A stochastic PCA and SVD algorithm with an exponential convergence rate (2015)
10 Nov | Herbst        | Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods (2015)
17 Nov | Lianos        | Training Highly Multiclass Classifiers
17 Nov | Helminger     | User Conditional Hashtag Prediction for Images
17 Nov | Bahman        | Sequence to Sequence Learning with Neural Networks
24 Nov | Spurr         | Show and tell: A neural image caption generator
24 Nov | Marti         | A Critical Review of Recurrent Neural Networks for Sequence Learning
01 Dec | Ciganovic     | Neural variational inference and learning in belief networks
01 Dec | Ihnatov       | Neural Turing Machines
15 Dec | Kan           | Zero-shot learning by convex combination of semantic embeddings
15 Dec | Ghosh         | Giraffe: Using Deep Reinforcement Learning to Play Chess

Thursday Schedule

Date   | Presenter     | Topic
15 Oct | Van der Goten | Adaptively Learning the Crowd Kernel (2011)
15 Oct | Vollprecht    | Tuned Models of Peer Assessment in MOOCs (2013)
22 Oct | Calderara     | Spectral Methods meet EM: A Provably Optimal Algorithm for Crowdsourcing (2014)
22 Oct | Hamas         | Probabilistic Programming (ICSE 2014)
29 Oct | Greuter       | A New Approach to Probabilistic Programming Inference (2014)
29 Oct | Porvaznik     | Learning Probabilistic Programs (2014)
05 Nov | Nikolov       | On the convexity of latent social network inference (2010)
05 Nov | Minhaz        | Scalable Influence Estimation in Continuous-Time Diffusion Networks (2013)
12 Nov | Koleva        | Uncovering the Temporal Dynamics of Diffusion Networks (2011)
12 Nov | Carion        | A Tutorial on Bayesian Optimization of Expensive Cost Functions (2010)
19 Nov | Song          | Practical Bayesian Optimization of Machine Learning Algorithms (2012)
19 Nov | Wu            | Input Warping for Bayesian Optimization of Non-Stationary Functions (2014)
26 Nov | Ma            | High Dimensional Bayesian Optimisation and Bandits via Additive Models (2015)
26 Nov | Nishant       | Introduction to causal inference (2010)
03 Dec | Abdelmessih   | Detecting the direction of causal time series (2009)
03 Dec | Wang Jingyi   | Nonlinear causal discovery with additive noise models (2008)
17 Dec | Demitri       | Probabilistic latent variable models for distinguishing between cause and effect (2010)
17 Dec | Raiskin       | Towards a learning theory of cause-effect inference (2015)

Tuesday Topics and Papers

Convex and Non-Convex Optimization
- Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. The Journal of Machine Learning Research, 12, 2121-2159.
- Shalev-Shwartz, S., & Zhang, T. (2013). Stochastic dual coordinate ascent methods for regularized loss. The Journal of Machine Learning Research, 14(1), 567-599.
- Defazio, A., Bach, F., & Lacoste-Julien, S. (2014). SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. NIPS 2014.
- De Sa, C., Re, C., & Olukotun, K. (2015). Global convergence of stochastic gradient descent for some non-convex matrix problems. ICML 2015.
- Shamir, O. (2015). A stochastic PCA and SVD algorithm with an exponential convergence rate. ICML 2015.
- Bhojanapalli, S., Kyrillidis, A., & Sanghavi, S. (2015). Dropping convexity for faster semi-definite optimization. arXiv preprint.
- Netrapalli, P., Niranjan, U. N., & Sanghavi, S. (2014). Non-convex robust PCA. NIPS 2014.
- Hazan, E., Levy, K., & Shalev-Shwartz, S. (2015). Beyond convexity: Stochastic quasi-convex optimization. NIPS 2015.

Deep Learning, Embeddings, Multiclass Classification
- Denton, E., Weston, J., Paluri, M., Bourdev, L., & Fergus, R. (2015). User conditional hashtag prediction for images. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1731-1740). ACM.
- Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2014). Show and tell: A neural image caption generator. arXiv preprint.
- Gupta, M. R., Bengio, S., & Weston, J. (2014). Training highly multiclass classifiers. The Journal of Machine Learning Research, 15(Apr), 1461-1492.
- Norouzi, M., Mikolov, T., Bengio, S., Singer, Y., Shlens, J., Frome, A., … & Dean, J. (2013). Zero-shot learning by convex combination of semantic embeddings. arXiv preprint.
- Lipton, Z. A critical review of recurrent neural networks for sequence learning. arXiv preprint.
- Janzamin, M., Sedghi, H., & Anandkumar, A. (2015). Beating the perils of non-convexity: Guaranteed training of neural networks using tensor methods. arXiv preprint.
- Mnih, A., & Gregor, K. (2014). Neural variational inference and learning in belief networks. ICML 2014.
- Lai, M. (2015). Giraffe: Using deep reinforcement learning to play chess. MSc thesis.

Variational Inference
- Hoffman, M. D., Blei, D. M., Wang, C., & Paisley, J. (2013). Stochastic variational inference. The Journal of Machine Learning Research, 14(1), 1303-1347.
- Salimans, T. (2014). Markov chain Monte Carlo and variational inference: Bridging the gap. arXiv preprint.
- Paisley, J., Blei, D., & Jordan, M. (2012). Variational Bayesian inference with stochastic search. arXiv preprint arXiv:1206.6430.
- Mnih, A., & Gregor, K. (2014). Neural variational inference and learning in belief networks. ICML 2014.
- Djolonga, J., & Krause, A. (2014). From MAP to marginals: Variational inference in Bayesian submodular models. NIPS 2014 (pp. 244-252).
- Winn, J., & Bishop, C. M. (2005). Variational message passing. JMLR 2005.

Distributed Optimization
- Lee, J., Ma, T., & Lin, Q. (2015). Distributed stochastic variance reduced gradient methods. arXiv preprint.
- Mania, H., Pan, X., Papailiopoulos, D., Recht, B., Ramchandran, K., & Jordan, M. I. (2015). Perturbed iterate analysis for asynchronous stochastic optimization.
- Li, M., Andersen, D. G., Smola, A., & Yu, K. (2014). Communication efficient distributed machine learning with the parameter server. NIPS 2014.
- Lee, C.-P., & Roth, D. (2015). Distributed box-constrained quadratic optimization for dual linear SVM. ICML 2015.

Thursday Topics and Papers

Network Inference
- Gomez-Rodriguez, M., Balduzzi, D., & Schölkopf, B. (2011). Uncovering the temporal dynamics of diffusion networks. ICML 2011.
- Myers, S., & Leskovec, J. (2010). On the convexity of latent social network inference. NIPS 2010.
- Du, N., Song, L., Gomez Rodriguez, M., & Zha, H. (2013). Scalable influence estimation in continuous-time diffusion networks. NIPS 2013.

Learning Meets Optimization
- Papandreou, G., & Yuille, A. (2011). Perturb-and-MAP random fields: Using discrete optimization to learn and sample from energy models. ICCV 2011.
- Wainwright, M. J., Jaakkola, T., & Willsky, A. S. (2005). A new class of upper bounds on the log partition function. IEEE Transactions on Information Theory.
- Prasad, A., Jegelka, S., & Batra, D. (2014). Submodular meets structured: Finding diverse subsets in exponentially-large structured item sets. NIPS 2014.
- Guillory, A., Chastain, E., & Bilmes, J. (2009). Active learning as non-convex optimization. AISTATS 2009.

Bayesian Optimization
- Brochu, E., Cora, V. M., & de Freitas, N. (2010). A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599.
- Snoek, J., Swersky, K., Zemel, R. S., & Adams, R. P. (2014). Input warping for Bayesian optimization of non-stationary functions. ICML 2014.
- Gelbart, M. A., Snoek, J., & Adams, R. P. (2014). Bayesian optimization with unknown constraints. UAI 2014.
- Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. NIPS 2012.
- Kandasamy, K., Schneider, J., & Poczos, B. (2015). Bayesian active learning for posterior estimation. IJCAI 2015.
- Kandasamy, K., Schneider, J., & Poczos, B. (2015). High dimensional Bayesian optimisation and bandits via additive models. ICML 2015.

Probabilistic Programming
- Gordon, A., Henzinger, T., Nori, A., & Rajamani, S. (2014). Probabilistic programming. ICSE 2014.
- Wood, F., van de Meent, J. W., & Mansinghka, V. (2014). A new approach to probabilistic programming inference. AISTATS 2014.
- Perov, Y., & Wood, F. (2014). Learning probabilistic programs. arXiv preprint arXiv:1407.2646.

Learning and Economics / Game Theory
- Tamuz, O., Liu, C., Belongie, S., Shamir, O., & Kalai, A. T. (2011). Adaptively learning the crowd kernel. ICML 2011.
- Piech, C., Huang, J., Chen, Z., Do, C., Ng, A., & Koller, D. (2013). Tuned models of peer assessment in MOOCs. EDM 2013.
- Zhang, Y., Chen, X., Zhou, D., & Jordan, M. (2014). Spectral methods meet EM: A provably optimal algorithm for crowdsourcing. NIPS 2014.

Causality
- Spirtes, P. (2010). Introduction to causal inference. The Journal of Machine Learning Research, 11, 1643-1662. With additional examples from Guyon, I. (2008). Practical feature selection: From correlation to causality. NATO Science for Peace and Security, 19, 27-43.
- Hoyer, P. O., et al. (2008). Nonlinear causal discovery with additive noise models. NIPS 2008.
- Stegle, O., et al. (2010). Probabilistic latent variable models for distinguishing between cause and effect. NIPS 2010.
- Peters, J., Janzing, D., Gretton, A., & Schölkopf, B. (2009). Detecting the direction of causal time series. ICML 2009.
- Lopez-Paz, D., et al. (2015). Towards a learning theory of cause-effect inference. ICML 2015, JMLR: W&CP, Lille, France.

Learning & Adaptive Systems Group | Machine Learning Institute | ETH Zurich
