This course provides a general introduction to data science and machine learning, covering topics such as statistical learning theory, supervised learning (parametric and non-parametric), as well as unsupervised methods. In the tutorial, students will implement methods in Python and apply them to real data.

In this seminar, we will read and discuss classic papers as well as seminal recent contributions on two of the currently most interesting topics in data science: deep learning and causality. In terms of deep learning, we will focus on the (largely unsolved) question of why deep nets generalize so well. In terms of causality, we will read several classic papers to get acquainted with various causal concepts, and then discuss recent attempts to infer causal relations from non-experimental data.

The course content is currently available here

This course is centered around software design and systems development. We explore and apply the concept of object-oriented programming, which is (still) the leading programming paradigm for extensive projects. In contrast to other programming paradigms like functional or declarative programming, object-oriented programming languages model objects through their attributes and functionalities. This concept has proven convenient when modeling real-world problems.


Theoretical background will be put into practice in programming exercises and a small programming project. In the first weeks (duration determined according to progress), you will learn the core concepts of object-oriented programming such as classes, inheritance, polymorphism, interfaces and abstract classes. Subsequently, you will build a basic routing service through weekly exercises. Given two geo-locations submitted via a web service, the routing service will compute the shortest path between them and return it to the user, where the result will be displayed.
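
As a rough illustration of how these concepts fit together, the following minimal sketch combines an abstract class, inheritance and polymorphism with a shortest-path computation. It is not the course's actual project code: the course's programming language is not stated here, so Python is used purely for illustration, and the class names `Router` and `DijkstraRouter`, the toy `roads` network and the choice of Dijkstra's algorithm (which assumes non-negative distances) are all assumptions.

```python
import heapq
from abc import ABC, abstractmethod


class Router(ABC):
    """Abstract base class: fixes the interface every router must offer."""

    def __init__(self, graph):
        # graph: dict mapping node -> list of (neighbor, distance) pairs
        self.graph = graph

    @abstractmethod
    def shortest_path(self, start, goal):
        """Return (total_distance, path) or None if goal is unreachable."""


class DijkstraRouter(Router):
    """Concrete subclass implementing the interface with Dijkstra's algorithm."""

    def shortest_path(self, start, goal):
        best = {start: 0.0}      # best known distance to each node
        previous = {}            # predecessor of each node on its best path
        queue = [(0.0, start)]   # priority queue ordered by distance
        while queue:
            dist, node = heapq.heappop(queue)
            if node == goal:
                break
            if dist > best.get(node, float("inf")):
                continue  # stale queue entry, a shorter route was found already
            for neighbor, weight in self.graph.get(node, []):
                candidate = dist + weight
                if candidate < best.get(neighbor, float("inf")):
                    best[neighbor] = candidate
                    previous[neighbor] = node
                    heapq.heappush(queue, (candidate, neighbor))
        if goal not in best:
            return None
        # Walk the predecessor links backwards to reconstruct the path.
        path = [goal]
        while path[-1] != start:
            path.append(previous[path[-1]])
        path.reverse()
        return best[goal], path


# Hypothetical toy road network (node names and distances are made up).
roads = {
    "A": [("B", 4.0), ("C", 1.0)],
    "B": [("D", 1.0)],
    "C": [("B", 2.0), ("D", 5.0)],
    "D": [],
}

# The caller only relies on the abstract Router interface (polymorphism).
router: Router = DijkstraRouter(roads)
result = router.shortest_path("A", "D")
```

Because the caller depends only on the abstract `Router` interface, a different strategy (for example A*) could be swapped in as another subclass without changing the calling code.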


The course gives an overview of basic concepts of data structures and general principles of algorithm design. The general algorithmic paradigms will be examined using the classical problems of searching and sorting as well as classical graph problems like shortest paths or minimum spanning trees. Finally, the course will give an introduction to the general methodology of programming.
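
To give a flavor of the searching and sorting problems mentioned above, here is a minimal sketch of binary search and merge sort, both instances of the divide-and-conquer paradigm. The function names and the sample data are illustrative assumptions and are not taken from the course material.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1    # target can only be in the right half
        else:
            high = mid - 1   # target can only be in the left half
    return -1


def merge_sort(items):
    """Return a new sorted list; the input is left unchanged."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Made-up sample data for illustration.
data = [5, 2, 9, 1, 5, 6]
ordered = merge_sort(data)
position = binary_search(ordered, 6)
```

Binary search halves the search range in every step, so it needs O(log n) comparisons on sorted input, while merge sort runs in O(n log n) time.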

Find the code of the lessons in the forum.


    Statistical core module for Data Science.

    ===

    We would like to get a head count as soon as possible to estimate how many students are interested in this lecture. If you plan to attend, please enroll in this lecture using the key learnDL.

    ===


    In recent years, deep neural networks have steadily increased in popularity, mainly due to their state-of-the-art performance in image and speech recognition, text mining and related tasks. Deep neural networks attempt to automatically learn multi-level representations and features of data and are able to uncover complex underlying data structures.

    The lecture aims to provide a basic theoretical and practical understanding of modern neural network approaches. We will start out by covering the necessary background on traditional artificial neural networks, backpropagation, online learning, and regularization. Then we will cover special methods used in deep learning, like dropout and rectified linear units. We will also talk about further advanced topics like convolutional layers, recurrent neural networks, and autoencoders.

    We will also talk about practical application and open-source deep learning libraries.


    Requirements:

    • English

    • Statistics Master (any) or Data Science Master

    • Some background in modeling, e.g., lecture on GLMs, preferably a lecture on machine learning / predictive modeling

    • Some background in optimization, e.g., Computational Methods I in the statistics master

    • Practical programming knowledge in R or Python


    We will introduce the basic concepts of multivariate statistical methods for data scientists. This includes (subject to change):

    • Random vectors, multivariate distributions, and their inference
    • Visualization of multivariate data
    • Principal Components Analysis (PCA)
    • Multidimensional Scaling
    • Factor analysis
    • Cluster analysis
    • Repeated measures data


    Lecturers:

    • Prof. Bernd Bischl (bernd.bischl@stat.uni-muenchen.de)
    • Janek Thomas (janek.thomas@stat.uni-muenchen.de)

    Login: MVS1718


    Time:


    Thursday 17-19 in Ludwigstrasse 33 Room 254