Task is a data generating distribution. A distribution over inputs, a distribution over labels given inputs and a loss function.

Goal: Given training data sample from those distribution and minimize the loss function in a way that generalizes to new data points sampled from those distributions
Learning a task means minimizing the loss function and training over the data to learn the parameters of the network

