Linear Regression using Gradient Descent

By Bindeshwar Singh Kushwaha

General Linear Regression Model

We have a collection of labeled examples:

$$
\{(\mathbf{x}_i, y_i)\}_{i=1}^{N}
$$

  • \( \mathbf{x}_i \) is a \( D \)-dimensional feature vector
  • \( y_i \) is a real-valued target
  • Each feature \( x_i^{(j)} \in \mathbb{R} \), where \( j = 1, \dots, D \)
  • The model is: $$ f_{\mathbf{w}, b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b $$
  • \( \mathbf{w} \): weights, \( b \): bias
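The general model \( f_{\mathbf{w}, b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b \) can be sketched in a few lines of NumPy. The weights, bias, and feature vector below are made-up values for illustration only:

```python
import numpy as np

def predict(x, w, b):
    """General linear model: f(x) = w . x + b."""
    return np.dot(w, x) + b

# Hypothetical parameters for a D = 3 feature vector
w = np.array([0.5, -1.0, 2.0])
b = 0.25
x = np.array([1.0, 2.0, 3.0])

print(predict(x, w, b))  # 0.5*1 - 1.0*2 + 2.0*3 + 0.25 = 4.75
```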

Sample Dataset

Company  Spending (M$)  Sales (Units)
1        37.8           22.1
2        39.3           10.4
3        45.9           9.3
4        41.3           18.5
5        50.0           25.0
6        38.5           15.0
7        42.2           19.3
8        48.1           23.7
9        36.4           13.2
10       40.7           17.4
11       43.5           20.2
12       47.3           24.5
13       49.0           26.1
14       35.2           11.3
15       38.0           14.7

Goal: Predict Sales based on Spending.
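For reference, the table can be loaded as two NumPy arrays (values copied from the rows above):

```python
import numpy as np

# Spending (M$) and Sales (Units), one entry per company
x = np.array([37.8, 39.3, 45.9, 41.3, 50.0, 38.5, 42.2, 48.1,
              36.4, 40.7, 43.5, 47.3, 49.0, 35.2, 38.0])
y = np.array([22.1, 10.4, 9.3, 18.5, 25.0, 15.0, 19.3, 23.7,
              13.2, 17.4, 20.2, 24.5, 26.1, 11.3, 14.7])

print(x.shape, y.shape)  # (15,) (15,)
```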

Linear Regression Model

Model: \( f(x) = wx + b \)

Objective: Minimize MSE:

$$
l = \frac{1}{N} \sum_{i=1}^{N} (y_i - (wx_i + b))^2
$$
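The MSE objective translates directly to NumPy. The tiny dataset below is a made-up example in which the fit \( w = 2, b = 0 \) is exact, so the loss is zero:

```python
import numpy as np

def mse(w, b, x, y):
    """Mean squared error of the model f(x) = w*x + b."""
    residuals = y - (w * x + b)
    return np.mean(residuals ** 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # exactly y = 2x

print(mse(2.0, 0.0, x, y))  # 0.0, a perfect fit
```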

Gradient Descent Derivatives

Gradients:

\[ \frac{\partial l}{\partial w} = \frac{1}{N} \sum_{i=1}^{N} -2x_i(y_i - (wx_i + b)) \]

\[ \frac{\partial l}{\partial b} = \frac{1}{N} \sum_{i=1}^{N} -2(y_i - (wx_i + b)) \]
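These derivatives can be sketched in NumPy as below. The perfect-fit check at the end is an illustrative sanity test (at \( w = 2, b = 0 \) the residuals vanish, so both gradients are zero), not part of the derivation:

```python
import numpy as np

def gradients(w, b, x, y):
    """Partial derivatives of the MSE loss with respect to w and b."""
    error = y - (w * x + b)          # residuals y_i - (w*x_i + b)
    dw = np.mean(-2 * x * error)     # dl/dw
    db = np.mean(-2 * error)         # dl/db
    return dw, db

x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x                          # exactly y = 2x
dw, db = gradients(2.0, 0.0, x, y)   # both should be zero at the optimum
```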

Gradient Descent Update Rule

Update equations:

\[
w \leftarrow w + \frac{2\alpha}{N} \sum_{i=1}^{N} x_i(y_i - (wx_i + b))
\]

\[
b \leftarrow b + \frac{2\alpha}{N} \sum_{i=1}^{N} (y_i - (wx_i + b))
\]
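A single update step can be sketched as follows (the function name `gd_step` is my own label, not from the text); one application on a toy dataset already reduces the loss:

```python
import numpy as np

def gd_step(w, b, x, y, alpha):
    """One gradient-descent update of w and b, matching the rules above."""
    error = y - (w * x + b)
    w = w + (2 * alpha / len(x)) * np.sum(x * error)
    b = b + (2 * alpha / len(x)) * np.sum(error)
    return w, b

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
w, b = gd_step(0.0, 0.0, x, y, alpha=0.01)  # one step from w = b = 0
```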

Step-by-Step Python Implementation

  1. Import Libraries: numpy, matplotlib.pyplot, matplotlib.animation
  2. Define Dataset: 15 (x, y) pairs
  3. Initialize Parameters: \( w = 0.0, b = 0.0, \alpha = 0.0005, \text{epochs} = 100 \)
  4. Training Loop:
    • Predict \( \hat{y} = wx + b \)
    • Compute Loss: $$ \text{MSE} = \frac{1}{N} \sum (y - \hat{y})^2 $$
    • Compute Gradients:
    • \( \frac{\partial L}{\partial w} = -\frac{2}{N} \sum x(y - \hat{y}) \)
    • \( \frac{\partial L}{\partial b} = -\frac{2}{N} \sum (y - \hat{y}) \)
    • Update \( w, b \)
  5. Set Up Plots: Left: scatter + line, Right: loss curve
  6. Define Animation: Update line and loss with frame
  7. Run Animation: FuncAnimation()
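Putting steps 1 through 4 together, a minimal training script might look like the sketch below (the plotting and animation of steps 5 to 7 are omitted here; the dataset and hyperparameters are the ones stated above):

```python
import numpy as np

# Dataset from the table above
x = np.array([37.8, 39.3, 45.9, 41.3, 50.0, 38.5, 42.2, 48.1,
              36.4, 40.7, 43.5, 47.3, 49.0, 35.2, 38.0])
y = np.array([22.1, 10.4, 9.3, 18.5, 25.0, 15.0, 19.3, 23.7,
              13.2, 17.4, 20.2, 24.5, 26.1, 11.3, 14.7])

w, b = 0.0, 0.0              # initial parameters
alpha, epochs = 0.0005, 100  # learning rate and iteration count

losses = []
for _ in range(epochs):
    y_hat = w * x + b                   # 1. predict
    error = y - y_hat
    losses.append(np.mean(error ** 2))  # 2. loss (MSE)
    dw = np.mean(-2 * x * error)        # 3. gradients
    db = np.mean(-2 * error)
    w -= alpha * dw                     # 4. update (equivalent to the
    b -= alpha * db                     #    "+" form of the update rule)

print(f"w = {w:.4f}, b = {b:.4f}, final MSE = {losses[-1]:.4f}")
```

Note that with this small learning rate the bias barely moves in 100 epochs; the slope carries most of the fit, which is typical when the feature is not centered.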

Key Takeaways

  • Gradient descent iteratively minimizes error
  • Helps learn optimal parameters from data
  • Animations build intuition about the training process

📣 Reach PostNetwork Academy

🙏 Thank You!

© PostNetwork. All rights reserved.