13 December 2021 / CS231N

cs231n - Lecture 2. Image Classification

Image Classification: A Core Task in Computer Vision

The Problem: Semantic Gap
A computer sees an image only as a tensor of integers in [0, 255] with 3 channels (RGB), which is far removed from the semantic label a human perceives.
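To make this representation concrete, here is a minimal sketch using NumPy. The 32x32 size and the random pixel values are assumptions chosen only for illustration; they do not come from the lecture.

```python
import numpy as np

# Hypothetical 32x32 RGB image filled with random pixel values.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

print(image.shape)  # (32, 32, 3): height, width, channels
print(image.dtype)  # uint8, i.e. integers in [0, 255]
print(image.size)   # 3072 numbers in total
```

A classifier only ever receives this grid of numbers, never the concept the picture shows.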
Challenges:
Viewpoint variation
Background clutter
Illumination
Occlusion
Deformation
Intraclass variation
An image classifier

def classify_image(image):
    # Some magic here?
    return class_label

ML: Data-Driven Approach
1. Collect a dataset of images and labels
2. Use ML algorithms to train a classifier
3. Evaluate the classifier on new images

def train(images, labels):
    # Machine Learning!
    return model

def predict(model, test_images):
    # Use model to predict labels
    return test_labels

Nearest Neighbor Classifier
Predict the label of the most similar training image
Training data $x$ (with labels) $\leftrightarrow$ query data $x^*$
Distance metric: $d(x, x^*) \in \mathbb{R}$
L1 distance $d_1(I_1,I_2) = \sum_p |I_1^p-I_2^p|$
Take the pixel-wise absolute differences, then sum them to get a single distance score
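As a worked example of the L1 distance, the sketch below compares two made-up 2x2 single-channel images. The helper name `l1_distance` and the pixel values are hypothetical and used only for illustration.

```python
import numpy as np

def l1_distance(img_a, img_b):
    """Sum of absolute pixel-wise differences between two same-shaped images."""
    # Cast to a signed type first: subtracting uint8 arrays would wrap around.
    diff = img_a.astype(np.int64) - img_b.astype(np.int64)
    return int(np.abs(diff).sum())

# Made-up 2x2 single-channel images.
I1 = np.array([[10, 20], [30, 40]], dtype=np.uint8)
I2 = np.array([[12, 15], [30, 100]], dtype=np.uint8)

print(l1_distance(I1, I2))  # |10-12| + |20-15| + |30-30| + |40-100| = 67
```

The cast to a signed integer type matters in practice, because images are usually stored as unsigned 8-bit values.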

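import numpy as np

class NearestNeighbor:
    def __init__(self):
        pass

    def train(self, X, y):
        """Memorize the training data. X is N x D (one example per row); y is 1-D of size N.
        Assumes X has a signed or float dtype; uint8 inputs would wrap on subtraction."""
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """X is M x D (one test example per row). Returns an array of M predicted labels."""
        num_test = X.shape[0]
        # Output array with the same dtype as the training labels
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)

        for i in range(num_test):
            # L1 distance from the i-th test image to every training image
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            min_index = np.argmin(distances)  # index of the closest training image
            Ypred[i] = self.ytr[min_index]  # predict the label of that nearest image
        return Ypred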

Q: With N examples, how fast are training and prediction?
Answer: training is O(1), prediction is O(N)
$\rightarrow$ This is the wrong way round: we want prediction to be fast, while slow training is acceptable.
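K-Nearest Neighbors (KNN): instead of copying the label of the single nearest neighbor, take a majority vote among the k closest training points
Distance metric: L1 (Manhattan), $d_1(I_1,I_2) = \sum_p |I_1^p-I_2^p|$, or L2 (Euclidean), $d_2(I_1,I_2) = \sqrt{\sum_p (I_1^p-I_2^p)^2}$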
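The majority vote and the two distance metrics can be sketched as follows. This is a minimal illustration, not the lecture's code: the function name `knn_predict`, its parameters, and the toy 2-D data are all assumptions.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3, metric="l1"):
    """Predict the label of x_query by majority vote among its k nearest training points."""
    diff = X_train - x_query
    if metric == "l1":
        distances = np.abs(diff).sum(axis=1)          # Manhattan distance
    elif metric == "l2":
        distances = np.sqrt((diff ** 2).sum(axis=1))  # Euclidean distance
    else:
        raise ValueError(f"unknown metric: {metric}")
    nearest = np.argsort(distances)[:k]               # indices of the k closest points
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]                 # most frequent label among them

# Toy 2-D data (hypothetical): three points of class 0, two of class 1.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1])

print(knn_predict(X_train, y_train, np.array([0.5, 0.5]), k=3))               # 0
print(knn_predict(X_train, y_train, np.array([5.5, 5.0]), k=3, metric="l2"))  # 1
```

In the second query the three nearest points carry labels 1, 1, and 0, so the vote returns 1 even though one neighbor disagrees.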
Hyperparameters
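To find the best value of k and the best distance metric, split the data into train, validation, and test sets: choose hyperparameters on the validation set and evaluate on the test set only once, at the very end
However, KNN is rarely used on raw pixels, because pixel-wise distances are not perceptually informative
It is also very slow at test time and suffers from the curse of dimensionality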
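The train/validation/test procedure for choosing k can be sketched as below. Everything here is an assumption made for illustration: the helper `knn_accuracy`, the synthetic two-blob data, the 60/20/20 split, and the candidate values of k.

```python
import numpy as np

def knn_accuracy(X_train, y_train, X_eval, y_eval, k):
    """Accuracy of a k-NN classifier (L1 distance, majority vote) on X_eval."""
    correct = 0
    for x, y in zip(X_eval, y_eval):
        distances = np.abs(X_train - x).sum(axis=1)
        nearest = np.argsort(distances)[:k]
        predicted = np.bincount(y_train[nearest]).argmax()  # majority label
        correct += int(predicted == y)
    return correct / len(y_eval)

# Synthetic two-class data (hypothetical): two Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(3.0, 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Shuffle, then split 60% / 20% / 20% into train / validation / test.
order = rng.permutation(len(y))
X, y = X[order], y[order]
X_train, y_train = X[:120], y[:120]
X_val, y_val = X[120:160], y[120:160]
X_test, y_test = X[160:], y[160:]

# Choose k using the validation set only.
candidate_ks = [1, 3, 5, 7]
val_scores = {k: knn_accuracy(X_train, y_train, X_val, y_val, k) for k in candidate_ks}
best_k = max(val_scores, key=val_scores.get)

# Touch the test set once, at the very end, with the chosen k.
test_accuracy = knn_accuracy(X_train, y_train, X_test, y_test, best_k)
print(best_k, test_accuracy)
```

Keeping the test set out of the selection loop is the point of the design: the final number then estimates performance on data the procedure has never seen.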

Linear Classifier

Parametric Approach
$f(x,W)=Wx+b$, where $x$ is the flattened image, $W$ holds the parameters (weights), and $b$ is the bias; the output is one score per class
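A small numeric sketch of the score function $f(x,W)=Wx+b$ follows. The sizes (3 classes, 4 pixels) and all values of `W`, `b`, and `x` are illustrative assumptions; in practice `W` and `b` are learned from training data and a real image has far more pixels.

```python
import numpy as np

# Hypothetical parameters for 3 classes and a 4-pixel image.
W = np.array([[0.2, -0.5, 0.1, 2.0],
              [1.5, 1.3, 2.1, 0.0],
              [0.0, 0.25, 0.2, -0.3]])  # shape (num_classes, num_pixels)
b = np.array([1.1, 3.2, -1.2])          # one bias per class

x = np.array([56.0, 231.0, 24.0, 2.0])  # flattened image, one value per pixel

scores = W @ x + b                       # one score per class
predicted_class = int(np.argmax(scores)) # class with the highest score
print(scores.shape, predicted_class)     # (3,) 1
```

Each row of `W` acts as a template for one class, and the predicted class is the one whose row gives the highest score.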
Interpreting a linear classifier: Geometric Viewpoint
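Each class score defines a hyperplane in pixel space, so hard cases are those where the classes are not linearly separable (e.g. an XOR-style arrangement), which a single linear boundary cannot classify correctly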
