Example: an input image can be represented by matrices containing the values of its RGB pixels
Notation:
Parameters w and b are learnable parameters
The alternative notation (shown in red on the slide) is not used in this course
$\sigma$ refers to the sigmoid function $\sigma(z) = \frac{1}{1+e^{-z}}$
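As a minimal sketch of the sigmoid and the prediction $\hat{y} = \sigma(w^T x + b)$ (the function and variable names here are mine, not from the notes):

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z}); works elementwise on arrays
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    # yhat = sigma(w^T x + b), the model's estimate of P(y = 1 | x)
    return sigmoid(np.dot(w, x) + b)
```

Note that `sigmoid(0) = 0.5`, and the output saturates toward 1 for large positive inputs and toward 0 for large negative ones.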
To train our logistic regression model we need to define a cost function
In logistic regression we usually do not use the squared error, because it makes the cost non-convex and gradient descent can get stuck in local minima.
We use the loss function: $L(\hat{y},y) = -\big(y\log(\hat{y}) + (1-y)\log(1-\hat{y})\big)$
Cost function: $J(w,b) = -\frac{1}{m}\sum_{i=1}^{m}\big(y^{(i)}\log(\hat{y}^{(i)}) + (1-y^{(i)})\log(1-\hat{y}^{(i)})\big)$
We want to find the $w$ and $b$ that minimize $J(w,b)$, which requires a convex cost function (the cross-entropy cost above is convex, so gradient descent can reach the global minimum)
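The cross-entropy cost above can be sketched as follows (a minimal illustration; the column-per-example layout of `X` and the helper names are my assumptions, not from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b, X, y):
    # X: shape (n_features, m), one training example per column
    # y: shape (m,), labels in {0, 1}
    yhat = sigmoid(w @ X + b)  # predictions for all m examples
    # J(w, b) = -(1/m) * sum_i [ y_i log(yhat_i) + (1 - y_i) log(1 - yhat_i) ]
    return -np.mean(y * np.log(yhat) + (1.0 - y) * np.log(1.0 - yhat))
```

For example, with $w = 0$ and $b = 0$ every prediction is $0.5$, and the cost reduces to $\log 2 \approx 0.693$; as the predictions approach the true labels, the cost approaches 0.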