Label Encoding and One-Hot Encoding in Machine Learning


📘 Label Encoding and One-Hot Encoding

Author: Bindeshwar Singh Kushwaha


🎯 Encoding Categorical Features

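🔹 Label Encoding

  • Assigns each category an integer value
  • Suitable for ordinal data (e.g., size: small, medium, large) and for binary categories
  • Tool: LabelEncoder from sklearn.preprocessing, which numbers categories in alphabetical order and is intended for target labels; OrdinalEncoder lets you specify the order explicitly for ordinal features
  • Example (Titanic): Encoding Sex as 0 (female), 1 (male)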
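
The difference between alphabetical and explicit ordering is easy to miss, so here is a minimal sketch of it. The `sizes` list is a made-up sample, not data from either dataset, and the explicit order passed to `OrdinalEncoder` is an assumption about what "small < medium < large" should mean.

```python
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

sizes = ["small", "large", "medium", "small"]  # hypothetical sample values

# LabelEncoder sorts classes alphabetically: large=0, medium=1, small=2
label_codes = LabelEncoder().fit_transform(sizes)

# OrdinalEncoder accepts an explicit order: small=0, medium=1, large=2
ordinal = OrdinalEncoder(categories=[["small", "medium", "large"]])
ordinal_codes = ordinal.fit_transform([[s] for s in sizes]).ravel()

print(label_codes)    # [2 0 1 2]
print(ordinal_codes)  # [0. 2. 1. 0.]
```

Only the second result preserves the natural size order, which is what a model needs if it is to treat the integers as ranks.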

🔹 One-Hot Encoding

  • Converts categories into binary columns (one per category)
  • Suitable for nominal data (no natural order)
  • Tools: OneHotEncoder, pd.get_dummies()
  • Example (Mushroom dataset): Encoding cap-shape into binary columns
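
To make the mechanics concrete, the following sketch builds one-hot vectors by hand with NumPy instead of a library encoder. The `values` list is a hypothetical three-category sample; in practice OneHotEncoder or pd.get_dummies() would do this for you.

```python
import numpy as np

values = ["x", "f", "b", "x"]            # hypothetical sample of a nominal feature
categories = sorted(set(values))         # ['b', 'f', 'x']
index = {cat: i for i, cat in enumerate(categories)}

# Row i of the identity matrix is the one-hot vector for category i
one_hot = np.eye(len(categories), dtype=int)[[index[v] for v in values]]

print(categories)
print(one_hot)
```

Each row contains exactly one 1, in the column that belongs to that row's category.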

🚒 Titanic Dataset Overview

  • Source: Kaggle’s β€œTitanic: Machine Learning from Disaster”
  • Objective: Predict survival of passengers
  • Target: Survived: 0 = Did not survive, 1 = Survived
  • Key Features: Pclass, Sex, Age, Fare, Embarked, etc.

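🔢 Label Encoding in Titanic

  • Assigns each category an integer
  • Example (Ordinal): small → 0, medium → 1, large → 2 (this order must be specified explicitly, e.g. with OrdinalEncoder; LabelEncoder would sort alphabetically and give large → 0, medium → 1, small → 2)
  • Tool: LabelEncoder from sklearn
  • Example on Titanic:
    • Original: ['male', 'female', 'male']
    • Encoded: [1, 0, 1]
    • Mapping: female → 0, male → 1 (classes are sorted alphabetically)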

🔄 Implementation (Titanic)

import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("titanic.csv")
le = LabelEncoder()
df['Sex_encoded'] = le.fit_transform(df['Sex'])

print(df['Sex'].unique())         # ['male', 'female']
print(df['Sex_encoded'].unique()) # [1, 0]
print(le.classes_)                # ['female', 'male']
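
It is often useful to see the category-to-integer mapping explicitly and to convert codes back to labels. The sketch below uses a small in-memory list as a stand-in for df['Sex'], so it runs without the CSV file; the variable names `mapping` and `restored` are illustrative.

```python
from sklearn.preprocessing import LabelEncoder

sex = ["male", "female", "male"]  # stand-in for df['Sex']; no file needed
le = LabelEncoder()
codes = le.fit_transform(sex)

# classes_ is sorted, and each class's position is its integer code
mapping = {str(label): int(code) for code, label in enumerate(le.classes_)}
print(mapping)  # {'female': 0, 'male': 1}

# inverse_transform recovers the original labels from the codes
restored = le.inverse_transform(codes)
print(list(restored))
```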

🍄 Mushroom Dataset Overview

  • Source: UCI ML Repository
  • Target: class: e = edible, p = poisonous
  • All Features: Categorical (e.g., cap-shape, odor, etc.)
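
Because every feature is categorical, one plausible preparation step is to one-hot encode all feature columns at once and label-encode only the target. The sketch below assumes the UCI column names `class`, `cap-shape`, and `odor`, and uses a made-up three-row sample rather than the real file.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical three-row sample with the UCI column names
df = pd.DataFrame({
    "class": ["p", "e", "e"],
    "cap-shape": ["x", "b", "x"],
    "odor": ["p", "a", "l"],
})

# One-hot encode every feature column; dtype=int gives 0/1 instead of booleans
X = pd.get_dummies(df.drop(columns="class"), dtype=int)

# The target is a label, so LabelEncoder is appropriate here: e=0, p=1
y = LabelEncoder().fit_transform(df["class"])

print(X.columns.tolist())
print(y)
```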

Example: One-Hot Encoding cap-shape

  • Categories: b, c, f, k, s, x (alphabetical, the column order pd.get_dummies() produces)
  • After Encoding: binary vectors like:
    • x → [0, 0, 0, 0, 0, 1]
    • f → [0, 0, 1, 0, 0, 0]

import pandas as pd

df = pd.read_csv("mushrooms.csv")

# One binary column per cap-shape category: cap-shape_b, cap-shape_c, ...
cap_shape_dummies = pd.get_dummies(df['cap-shape'], prefix='cap-shape')
print(cap_shape_dummies.head())
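
pd.get_dummies() encodes whatever categories happen to be in the frame it is given, whereas OneHotEncoder learns the categories during fit and reuses them later. The sketch below illustrates this with hypothetical `train` and `new` arrays; choosing `handle_unknown="ignore"` is an assumption about how unseen categories should be treated, not a requirement.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

train = np.array([["b"], ["x"], ["f"]])   # hypothetical training values
new = np.array([["x"], ["k"]])            # 'k' was not seen during fit

# handle_unknown="ignore" encodes unseen categories as an all-zero row
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# transform returns a sparse matrix by default, so convert it for display
encoded = encoder.transform(new).toarray()
print(encoder.categories_[0])  # ['b' 'f' 'x']
print(encoded)
```

The row for 'x' has a 1 in the last column, and the row for the unseen 'k' is all zeros.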
