Python for Data Processing
4 hours
6 weeks
Course Description
The Python for Data Processing (Py4DP) course introduces the main tools of the Python stack for data science and machine learning. It focuses on practical skills and on the core packages for data science in general and exploratory data analysis in particular: Jupyter, NumPy, pandas, and Matplotlib. It also gives a basic understanding of SciPy, scikit-learn, Dask, TensorFlow, and other packages and tools, along with a very brief review of tools beyond Python, such as Scala and Spark.

After completing the course, students should be able to configure a working environment, set up a machine learning or data science project strategically, and efficiently perform exploratory analysis of a moderately sized dataset (up to tens of GB), including data cleaning, analysis of individual variables and their relationships, visualization, and feature construction.