
Dog Breed Detection Using CNN

March 25, 2024
7 min read

A deep learning model that uses convolutional neural networks (CNNs) to classify dog breeds from images with high accuracy, featuring preprocessing, data augmentation, and an optimized architecture.

Introduction

Dog Breed Detection is a computer vision project that uses convolutional neural networks (CNNs) to accurately identify dog breeds from images. This project demonstrates the power of deep learning in image classification and showcases best practices in model development and optimization.

The Challenge

Dog breed classification is particularly challenging due to:

  • Inter-class similarity (many breeds look similar)
  • Intra-class variation (same breed can look different)
  • Pose and viewpoint variations
  • Background complexity
  • Lighting conditions
  • Image quality variations

This project tackles these challenges through careful model design and data preprocessing.

Dataset

The model is trained on a comprehensive dataset featuring:

  • 120+ dog breeds
  • 20,000+ images for training
  • High-quality labeled data from Kaggle
  • Diverse poses, angles, and settings

Popular breeds included:

  • Golden Retriever
  • German Shepherd
  • Labrador Retriever
  • Bulldog
  • Poodle
  • Beagle
  • And many more...
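
Kaggle breed-classification datasets are typically organized as one directory per breed. A minimal sketch for mapping breed folder names to integer labels (assuming a hypothetical data/train/&lt;breed&gt;/ layout; sorted alphabetically to match Keras' flow_from_directory convention):

```python
from pathlib import Path

def build_class_index(train_dir):
    """Map each breed directory name to an integer label.

    Sorted alphabetically to match Keras' flow_from_directory ordering.
    """
    breeds = sorted(p.name for p in Path(train_dir).iterdir() if p.is_dir())
    return {breed: idx for idx, breed in enumerate(breeds)}
```

Keeping this mapping explicit (and saving it alongside the model) avoids label mismatches at inference time.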

Model Architecture

CNN Design

import tensorflow as tf
from tensorflow.keras import layers, models

def create_dog_breed_model(num_classes=120):
    model = models.Sequential([
        # First Conv Block
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Second Conv Block
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Third Conv Block
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Fourth Conv Block
        layers.Conv2D(256, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        
        # Dense Layers
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(256, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax')
    ])
    
    return model

Transfer Learning Approach

For improved performance, we can build on pre-trained ImageNet backbones:

from tensorflow.keras.applications import ResNet50, VGG16, InceptionV3

def create_transfer_learning_model(base_model='resnet50', num_classes=120):
    # Load pre-trained model
    if base_model == 'resnet50':
        base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    elif base_model == 'vgg16':
        base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    else:
        base = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    
    # Freeze base layers
    base.trainable = False
    
    # Add custom layers
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax')
    ])
    
    return model
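
Once the frozen-base model converges, a common follow-up is fine-tuning: unfreeze the top of the backbone and continue training at a much lower learning rate. A sketch of that step (unfreeze_top_layers is an illustrative helper, not part of the project's code; the pattern follows TensorFlow's transfer-learning guide):

```python
def unfreeze_top_layers(base, num_layers):
    """Unfreeze only the last `num_layers` layers of a pretrained base model."""
    base.trainable = True            # enable training for the whole base...
    for layer in base.layers[:-num_layers]:
        layer.trainable = False      # ...then re-freeze everything but the top

# After unfreezing, recompile with a much lower learning rate, e.g.:
# unfreeze_top_layers(base, 20)
# model.compile(optimizer=Adam(learning_rate=1e-5),
#               loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_generator, epochs=10, validation_data=val_generator)
```

The low learning rate matters: large updates would quickly destroy the pretrained ImageNet features.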

Data Preprocessing

Image Preprocessing Pipeline

import cv2
import numpy as np
from PIL import Image

def preprocess_image(image_path, target_size=(224, 224)):
    """
    Preprocess an image for model input
    """
    # Load image (cv2 reads BGR; convert to RGB)
    img = cv2.imread(image_path)
    if img is None:
        raise ValueError(f"Could not read image: {image_path}")
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    # Resize to the model's expected input size
    img = cv2.resize(img, target_size)
    
    # Scale pixel values to [0, 1]
    img = img.astype(np.float32) / 255.0
    
    # Standardize with ImageNet channel statistics (needed for
    # ImageNet-pretrained backbones; skip for models trained on [0, 1] inputs)
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    img = (img - mean) / std
    
    return img

def load_and_preprocess_batch(image_paths, labels):
    """
    Load and preprocess a batch of images
    """
    images = [preprocess_image(path) for path in image_paths]
    return np.array(images), np.array(labels)

Data Augmentation

Augmentation exposes the model to randomly varied versions of each training image, which improves generalization:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Create data augmentation generator
train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    zoom_range=0.2,
    shear_range=0.2,
    fill_mode='nearest',
    rescale=1./255
)

# Validation data (no augmentation)
val_datagen = ImageDataGenerator(rescale=1./255)

# Create generators
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

val_generator = val_datagen.flow_from_directory(
    'data/val',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
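
The generators also supply the breed-name ordering that the prediction code later in this post needs. A small helper (illustrative, assuming Keras' class_indices dict of name → index) to invert that mapping into an index → name lookup:

```python
def index_to_label(class_indices):
    """Invert Keras' class_indices (breed name -> integer index)
    into a list where position i holds the i-th breed name."""
    labels = [None] * len(class_indices)
    for name, idx in class_indices.items():
        labels[idx] = name
    return labels

# Usage sketch:
# class_labels = index_to_label(train_generator.class_indices)
```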

Training Process

Training Configuration

from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import (
    EarlyStopping,
    ModelCheckpoint,
    ReduceLROnPlateau,
    TensorBoard
)

# Compile model
model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy', 'top_k_categorical_accuracy']
)

# Callbacks
callbacks = [
    EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
    ),
    ModelCheckpoint(
        'best_model.h5',
        monitor='val_accuracy',
        save_best_only=True,
        mode='max'
    ),
    ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=5,
        min_lr=1e-7
    ),
    TensorBoard(log_dir='./logs')
]

# Train model
history = model.fit(
    train_generator,
    epochs=50,
    validation_data=val_generator,
    callbacks=callbacks
)

Training Visualization

import matplotlib.pyplot as plt

def plot_training_history(history):
    """
    Plot training and validation metrics
    """
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
    
    # Accuracy plot
    ax1.plot(history.history['accuracy'], label='Train Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Val Accuracy')
    ax1.set_title('Model Accuracy')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    ax1.grid(True)
    
    # Loss plot
    ax2.plot(history.history['loss'], label='Train Loss')
    ax2.plot(history.history['val_loss'], label='Val Loss')
    ax2.set_title('Model Loss')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Loss')
    ax2.legend()
    ax2.grid(True)
    
    plt.tight_layout()
    plt.savefig('training_history.png')
    plt.show()

plot_training_history(history)

Model Evaluation

Performance Metrics

from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    accuracy_score
)
import seaborn as sns

def evaluate_model(model, test_generator):
    """
    Comprehensive model evaluation
    """
    # Predictions (build test_generator with shuffle=False so this
    # order matches test_generator.classes below)
    predictions = model.predict(test_generator)
    predicted_classes = np.argmax(predictions, axis=1)
    
    # True labels
    true_classes = test_generator.classes
    class_labels = list(test_generator.class_indices.keys())
    
    # Accuracy
    accuracy = accuracy_score(true_classes, predicted_classes)
    print(f"Test Accuracy: {accuracy:.4f}")
    
    # Classification report
    print("\nClassification Report:")
    print(classification_report(
        true_classes,
        predicted_classes,
        target_names=class_labels
    ))
    
    # Confusion matrix
    cm = confusion_matrix(true_classes, predicted_classes)
    
    plt.figure(figsize=(20, 20))
    sns.heatmap(cm, annot=False, fmt='d', cmap='Blues')
    plt.title('Confusion Matrix')
    plt.ylabel('True Label')
    plt.xlabel('Predicted Label')
    plt.savefig('confusion_matrix.png')
    plt.show()
    
    return accuracy, cm

accuracy, cm = evaluate_model(model, test_generator)

Top-K Accuracy

def calculate_top_k_accuracy(model, test_generator, k=5):
    """
    Calculate top-k accuracy
    """
    predictions = model.predict(test_generator)
    true_classes = test_generator.classes
    
    # Get top-k predictions
    top_k_pred = np.argsort(predictions, axis=1)[:, -k:]
    
    # Check if true class is in top-k
    correct = sum([true_classes[i] in top_k_pred[i] for i in range(len(true_classes))])
    
    top_k_accuracy = correct / len(true_classes)
    print(f"Top-{k} Accuracy: {top_k_accuracy:.4f}")
    
    return top_k_accuracy

top_5_acc = calculate_top_k_accuracy(model, test_generator, k=5)

Prediction & Inference

Single Image Prediction

def predict_dog_breed(model, image_path, class_labels):
    """
    Predict dog breed for a single image
    """
    # Preprocess image
    img = preprocess_image(image_path)
    img = np.expand_dims(img, axis=0)
    
    # Predict
    predictions = model.predict(img)
    predicted_class = np.argmax(predictions[0])
    confidence = predictions[0][predicted_class]
    
    breed = class_labels[predicted_class]
    
    print(f"Predicted Breed: {breed}")
    print(f"Confidence: {confidence:.2%}")
    
    # Get top 5 predictions
    top_5_idx = np.argsort(predictions[0])[-5:][::-1]
    print("\nTop 5 Predictions:")
    for idx in top_5_idx:
        print(f"{class_labels[idx]}: {predictions[0][idx]:.2%}")
    
    return breed, confidence

# Example usage
breed, conf = predict_dog_breed(
    model,
    'test_images/golden_retriever.jpg',
    class_labels
)

Batch Prediction

def predict_batch(model, image_paths, class_labels):
    """
    Predict breeds for multiple images
    """
    results = []
    
    for path in image_paths:
        breed, confidence = predict_dog_breed(model, path, class_labels)
        results.append({
            'image': path,
            'breed': breed,
            'confidence': confidence
        })
    
    return results

Optimization Techniques

Architecture Tuning

Experiments conducted:

  1. Depth: 4-layer vs 6-layer vs 8-layer CNNs
  2. Filters: 32-64-128-256 vs 64-128-256-512
  3. Dropout: 0.25 vs 0.5 vs 0.3
  4. Activation: ReLU vs Leaky ReLU vs ELU
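
One reason the depth experiments top out quickly: each "valid" 3×3 convolution plus 2×2 max-pool shrinks the feature map, so 224×224 inputs run out of spatial resolution after a handful of blocks. A back-of-the-envelope helper (illustrative only, mirroring the custom CNN's conv/pool pattern):

```python
def feature_map_size(input_size=224, blocks=4, kernel=3, pool=2):
    """Spatial size after `blocks` of (valid conv -> max-pool)."""
    size = input_size
    for _ in range(blocks):
        size -= kernel - 1   # 'valid' 3x3 convolution trims the border
        size //= pool        # 2x2 max-pooling halves the map
    return size

# Four blocks leave a 12x12 map; six blocks leave 1x1, so deeper
# stacks need 'same' padding or larger inputs.
```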

Hyperparameter Optimization

from keras_tuner import RandomSearch

def build_model(hp):
    model = models.Sequential()
    
    # Tune number of filters
    model.add(layers.Conv2D(
        filters=hp.Int('filters_1', 32, 128, step=32),
        kernel_size=(3, 3),
        activation='relu',
        input_shape=(224, 224, 3)
    ))
    model.add(layers.MaxPooling2D((2, 2)))
    
    # Tune dropout rate
    model.add(layers.Dropout(hp.Float('dropout_1', 0.2, 0.5, step=0.1)))
    
    # Add more conv blocks as needed...
    
    # Classification head (required so the model can actually be trained)
    model.add(layers.Flatten())
    model.add(layers.Dense(120, activation='softmax'))
    
    # Tune learning rate
    model.compile(
        optimizer=Adam(hp.Float('learning_rate', 1e-4, 1e-2, sampling='log')),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )
    
    return model

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=10,
    directory='tuner_results'
)

tuner.search(train_generator, epochs=10, validation_data=val_generator)

Results

Final Performance

  • Training Accuracy: 95.3%
  • Validation Accuracy: 92.7%
  • Test Accuracy: 91.5%
  • Top-5 Accuracy: 98.2%

Model Comparison

Model          Accuracy   Parameters   Training Time
Custom CNN     91.5%      15M          2 hours
ResNet50       94.2%      25M          3 hours
VGG16          92.8%      138M         4 hours
InceptionV3    93.5%      24M          3.5 hours

Deployment

Model Export

# Save model
model.save('dog_breed_classifier.h5')

# Save as TensorFlow Lite (mobile)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

with open('dog_breed_classifier.tflite', 'wb') as f:
    f.write(tflite_model)
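
On a device, the exported .tflite file is run through TensorFlow's Interpreter rather than Keras. A sketch (paths and the 224×224×3 float32 input assumed to match the export above):

```python
import numpy as np
import tensorflow as tf

def tflite_predict(tflite_path, image):
    """Run one preprocessed image (H, W, 3 float32) through a TFLite model."""
    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    batch = np.expand_dims(image.astype(np.float32), axis=0)  # add batch dim
    interpreter.set_tensor(input_details[0]['index'], batch)
    interpreter.invoke()
    return interpreter.get_tensor(output_details[0]['index'])[0]
```

The same preprocessing used at training time must be applied before calling tflite_predict, or the probabilities will be meaningless.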

Web Application

Simple Flask API:

from flask import Flask, request, jsonify
import cv2
import numpy as np
import tensorflow as tf

app = Flask(__name__)
model = tf.keras.models.load_model('dog_breed_classifier.h5')
# class_labels must hold the breed names in the same order used during training

@app.route('/predict', methods=['POST'])
def predict():
    file = request.files['image']
    
    # Decode the uploaded bytes (cv2.imread only accepts file paths)
    data = np.frombuffer(file.read(), np.uint8)
    img = cv2.imdecode(data, cv2.IMREAD_COLOR)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (224, 224)).astype(np.float32) / 255.0
    
    # Add the batch dimension the model expects
    img = np.expand_dims(img, axis=0)
    
    prediction = model.predict(img)
    
    return jsonify({
        'breed': class_labels[int(np.argmax(prediction))],
        'confidence': float(np.max(prediction))
    })

if __name__ == '__main__':
    app.run(debug=True)

Use Cases

1. Pet Adoption Platforms

Automatically classify dog breeds in adoption listings

2. Veterinary Applications

Assist in breed identification for medical records

3. Pet Insurance

Automate breed verification for insurance applications

4. Educational Tools

Help people learn about different dog breeds

Future Improvements

  • Multi-dog detection: Handle multiple dogs in one image
  • Mixed breed support: Identify dominant breeds in mixed breeds
  • Age estimation: Predict dog age along with breed
  • Mobile app: Native iOS/Android applications
  • Real-time video: Process video streams
  • 3D pose estimation: Understand dog posture

Conclusion

This Dog Breed Detection project demonstrates the effectiveness of CNNs in fine-grained image classification tasks. Through careful data preprocessing, model architecture design, and optimization, the model achieves high accuracy in identifying dog breeds from diverse images.

The project showcases practical applications of deep learning in computer vision and provides a foundation for more advanced animal recognition systems.