What is Machine Learning?
Improving performance on some task with experience
Experience: data
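Tasks: lots of them (classification, regression, ranking, translation, ...)
Performance: some measure of how correct you are (e.g. accuracy or error)
Basically: train a model on the data and see how well it works on data it hasn't seen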
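A minimal Python sketch of the three ingredients above (experience, task, performance). Everything here is hypothetical: the data is made up, and the one-feature threshold model and the names `fit_threshold`, `predict`, and `accuracy` are just one simple choice, not a standard method from these notes.

```python
# Experience: made-up (hours_studied, passed) pairs, split into train and test.
train_data = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test_data = [(2.5, 0), (4.0, 0), (5.5, 1), (9.0, 1)]


def fit_threshold(examples):
    """Train: put the decision threshold halfway between the two class means."""
    xs0 = [x for x, y in examples if y == 0]
    xs1 = [x for x, y in examples if y == 1]
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    return (mean0 + mean1) / 2


def predict(threshold, x):
    """Task: predict 1 (pass) if x is above the threshold, else 0 (fail)."""
    return 1 if x > threshold else 0


def accuracy(threshold, examples):
    """Performance: fraction of examples predicted correctly."""
    correct = sum(1 for x, y in examples if predict(threshold, x) == y)
    return correct / len(examples)


threshold = fit_threshold(train_data)
print("threshold:", threshold)
print("test accuracy:", accuracy(threshold, test_data))
```

Performance is measured on `test_data`, which the model never saw during training.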
Paradigms
- supervised learning
- unsupervised learning
- semi-supervised learning
- reinforcement learning
- active learning
- probably some others I'm missing
We'll focus on supervised learning.
Supervised Learning
Given training examples $(x, y)$, train a model $f$ to predict a $y$ given some $x$.
If $y$ is a label or category, this is called classification.
- e.g. $x$ is the words in a document, $y$ is a topic (science, news, business, etc.)
- e.g. $x$ is an image of a cell, $y$ is whether the cell is healthy or anemic
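To make the document-topic example concrete, here is a rough sketch of a word-count classifier in Python. The training documents are made up, and scoring a topic by how often its training documents used each word is an assumption chosen for brevity; a real system would likely use something stronger.

```python
from collections import Counter, defaultdict

# Made-up training examples (x, y): x is the document text, y is its topic.
train_docs = [
    ("cell divides under microscope", "science"),
    ("physicists measure particle energy", "science"),
    ("company reported quarterly profit", "business"),
    ("investors bought company shares", "business"),
]


def train(examples):
    """Count how often each word appears in the documents of each topic."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.split())
    return counts


def f(counts, text):
    """Predict the topic whose training documents share the most word counts with text."""
    words = text.split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


model = train(train_docs)
print(f(model, "company profit fell"))
print(f(model, "particle under microscope"))
```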
If $y$ is a real number, this is called regression.
- e.g. $x$ is a time series, $y$ is a stock price
- e.g. $x$ is data about a person, $y$ is the insurance premium to charge
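A small regression sketch, assuming a single numeric feature and a straight-line model fit by ordinary least squares. The numbers are made up and lie exactly on a line so the result is easy to check by hand; `fit_line` and `predict` are illustrative names.

```python
# Made-up training examples: x is one numeric feature, y is a real-valued target.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """The model f: map an input x to a real-valued prediction."""
    return slope * x + intercept


slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("prediction for x = 5:", predict(slope, intercept, 5.0))
```

Unlike classification, the output can be any real number, so performance is usually measured by an error such as the squared difference between prediction and target.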
If $y$ is a rank or ordered list, this is called learning to rank.
- e.g. $x$ is a query, $y$ is a list of search results (homework 2)
If $y$ is a sequence, this is called sequence-to-sequence learning.
- e.g. $x$ is a Chinese sentence, $y$ is an English sentence
- e.g. $x$ is speech signals, $y$ is a transcription of the speech