Carrying Out Error Analysis

If you're trying to get a learning algorithm to do a task that humans can do, and your learning algorithm is not yet at human-level performance, then manually examining the mistakes your algorithm is making can give you insights into what to do next.



For example, no matter how much better you do on dog images or on Instagram images, you can improve performance by at most maybe 8% or 12% in these examples. By contrast, if you do better on great cat images or blurry images, the ceiling on potential improvement is much higher.
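
The ceiling argument above can be sketched in a few lines of Python. This is only an illustration: the helper `improvement_ceilings` is a hypothetical name, and the counts for "great cat" and "blurry" are made-up values, not figures from these notes. The idea is that the share of reviewed mistakes falling into a category bounds how much of the current error could be removed by fixing that category perfectly.

```python
def improvement_ceilings(category_counts, total_reviewed):
    """Return, per category, the fraction of reviewed errors it accounts for.

    That fraction is an upper bound on the share of current errors that
    could be removed by handling the category perfectly.
    """
    if total_reviewed <= 0:
        raise ValueError("total_reviewed must be positive")
    return {name: count / total_reviewed for name, count in category_counts.items()}


# Hypothetical tallies from manually reviewing 100 misclassified dev examples.
# One example can fall into several categories, so counts need not sum to 100.
counts = {"dog": 8, "instagram": 12, "great cat": 43, "blurry": 61}
ceilings = improvement_ceilings(counts, total_reviewed=100)

# Rank categories by the best-case improvement each one could offer.
for name, share in sorted(ceilings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} at most {share:.0%} of current errors")
```

Ranking by this ceiling is a rough way to decide where effort is likely to pay off; it says nothing about how hard each category is to fix.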

Cleaning up Incorrectly Labeled Data

Two terms are distinguished here:

  1. Mislabeled examples: cases where your learning algorithm outputs the wrong value of Y.
  2. Incorrectly labeled examples: cases where, in the training set, the dev set, or the test set, the label for Y that a human labeler assigned to the piece of data is actually incorrect.

What should you do if you find incorrectly labeled examples in your data?


If the errors are reasonably random, then it's probably okay to just leave the errors as they are and not spend too much time fixing them.


Learning algorithms are less robust to systematic errors. For example, if your labeler consistently labels white dogs as cats, that is a problem, because your classifier will learn to classify all white dogs as cats.


During error analysis, add one extra column so that you can also count the number of examples where the label Y was incorrect.
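
As a minimal sketch of that extra column, assume each reviewed example is recorded as a row of yes/no flags, one flag per column of the error-analysis table. The rows, the column names, and the helper `tally_columns` below are all hypothetical and only show the bookkeeping.

```python
from collections import Counter

# Hypothetical review notes: one row per misclassified dev example,
# with a flag for each column of the error-analysis table.
reviewed = [
    {"id": 1, "dog": True,  "blurry": False, "incorrect_label": False},
    {"id": 2, "dog": False, "blurry": True,  "incorrect_label": False},
    {"id": 3, "dog": False, "blurry": False, "incorrect_label": True},
    {"id": 4, "dog": False, "blurry": True,  "incorrect_label": True},
]


def tally_columns(rows, columns):
    """Count how many reviewed examples are flagged in each column."""
    totals = Counter()
    for row in rows:
        for col in columns:
            if row.get(col, False):
                totals[col] += 1
    return totals


columns = ["dog", "blurry", "incorrect_label"]
totals = tally_columns(reviewed, columns)

# Report each column as a count and as a share of the reviewed errors.
for col in columns:
    share = totals[col] / len(reviewed)
    print(f"{col:16s} {totals[col]:3d}  ({share:.0%} of reviewed errors)")
```

Keeping "incorrect_label" as just another column means label problems are counted alongside the other error categories and can be compared with them directly.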
