Associate Professor, Particle Physics, Niels Bohr Institute, University of Copenhagen
Applied machine learning and big data analysis
Machine Learning is now being applied in essentially all data-based fields, and Big Data is omnipresent from private industry to governmental organizations. It is a new approach to problem solving. While its potential is often exaggerated, Machine Learning does open up exciting new opportunities, and it also poses some very real challenges. The ability to analyze and combine large amounts of data from different sources has wide applications. However, poor data quality combined with high variance means that conventional analysis often fails. Countering this requires proper training in the correct application of Machine Learning algorithms.
This course will put you at the forefront of applied machine learning by introducing you to the newest tools and methods in large-scale data analysis based on cutting-edge research and the extensive experience of our teachers.
After the course you will:
- Be able to set up a basic Machine Learning Analysis from beginning to end: from retrieving and cleaning the data, to setting the information level, extracting patterns and finding outliers, to curating the necessary data.
- Get acquainted with a number of advanced tools for data cleaning, statistical analysis of very large datasets, data stream analysis, finding patterns and outliers in Big Data, collecting data from instruments and devices (e.g. the Internet of Things (IoT)), and using hardware systems design for efficient handling and analysis of Big Data.
Throughout the course, we will use examples of structured datasets in a commercial context to demonstrate the different steps in Big Data Analysis. Participants will also have the chance to ask questions about specific data and challenges.
- Data cleaning and statistical methods: Detecting and correcting (or removing) corrupt or inaccurate records, and robust statistical methods for data with very large variance and cross checks.
- Machine Learning algorithms: Introduction to a variety of methods, how they work behind the scenes, their strengths and weaknesses, and their applications.
- Finding patterns and outliers in Big Data: Which methods can be used to identify sparse patterns in very large datasets, and how can we identify data that does not follow the general pattern of a dataset?
- Collecting data from instruments and devices: How to collect, store, and analyze data from a multitude of sources (e.g. apparatus, IoT, etc.).
- Systems for Big Data Analysis: Hadoop, PyDisco, etc., and hardware systems design for efficient analysis.
- Selected machine learning algorithms for large-scale data: Random forests, (deep) neural networks, support vector machines, and large-scale exact nearest neighbour search.
- Data curation: How to select data for long-term curation, and systems, techniques, and standards for data curation.
We will primarily be working with Python; however, all techniques that are covered are easily implemented with all standard data-analysis languages.
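As a flavour of the robust statistical methods listed above, here is a minimal Python sketch of outlier detection based on the median and the median absolute deviation (MAD). It is not taken from the course material: the function name `mad_outliers`, the threshold of 3.5, and the sample readings are all illustrative assumptions. It shows why robust estimators matter when a dataset contains corrupt records, since a single bad value shifts the mean strongly while the median barely moves.

```python
from statistics import mean, median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration; the 3.5 cut-off is a commonly
    used rule of thumb, not a value prescribed by the course.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # At least half of the values equal the median, so there is no
        # usable scale estimate; flag nothing rather than divide by zero.
        return []
    # The factor 0.6745 makes the MAD comparable to a standard deviation
    # for normally distributed data.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up sensor readings with one corrupt record (250.0).
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 250.0, 9.7]

outlier_idx = mad_outliers(readings)
cleaned = [v for i, v in enumerate(readings) if i not in outlier_idx]

print("mean of raw data:  ", mean(readings))
print("median of raw data:", median(readings))
print("outlier indices:   ", outlier_idx)
print("mean after cleaning:", mean(cleaned))
```

The same idea carries over to array libraries for larger datasets; the standard library is used here only to keep the sketch self-contained.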
The course is strictly focused on Machine Learning and Big Data Analysis, so a background in statistics and/or conventional data analysis is a prerequisite. The course assumes that you have studied to at least Bachelor's degree level and/or have several years of data analysis experience.
Brian Vinter, Professor, eScience, Niels Bohr Institute, University of Copenhagen
Joachim Mathiesen, Associate Professor, Biocomplexity, Niels Bohr Institute, University of Copenhagen
"Excellent overview of modern Machine Learning techniques"
Lorenzo Tonelli, Business Analyst, MAN Diesel & Turbo
"Very interesting, well organized, great teachers. I learned a lot! Thank you for a great course!"
Agnethe Larsen, Head of Section, Danish Business Authority
"I found the course to be of very high quality theoretically. The theoretical concepts covered were very detailed & relevant."
Anup Singh, General Manager, Maersk Group
|Dates and time:||
Please note that this course is held twice:
|Price:||EUR 2,680 (DKK 19,900) excl. Danish VAT. The price includes tuition, course material and all meals during course hours.|
|Location:||South Campus, Faculty of Law, Njalsgade 76, DK-2300 Copenhagen S, Denmark|
|Registration deadline:||June 19, 2019|
|Contact:||Copenhagen Summer University
+45 3533 3423