The course is dedicated to crowdsourcing as a tool for efficient and scalable data labeling.
Large amounts of data are essential for most AI-based technologies. The better the algorithms are, the more data is needed to make the most of them. This is why efficient data labeling is an in-demand and essential skill for professionals working with ML. Crowdsourcing helps to establish robust and scalable data labeling processes by distributing tasks among a large crowd of users. However, it comes with certain challenges: ensuring data quality, training performers to do the task correctly, presenting the task in a simple and clear manner, and processing the results. In this course we will share our experience and see how it all works in practice.
The course is built around several hands-on crowdsourcing projects. Each week we will discuss a particular ML-related task and launch a data labeling project that supports it. By the end of this course you will:
- understand general principles of preparing and controlling crowdsourcing tasks;
- learn how to work with a major crowdsourcing platform;
- practice different kinds of tasks and build a portfolio of several common crowd-based projects.
The course is designed for professionals and students who work with machine learning, as well as anyone else who needs large amounts of data.
Course prerequisites:
- General understanding of ML
- Experience with HTML, CSS, JS and Python will be an advantage