Quickstart

This page is a quickstart guide to the research papers and resources I find most helpful for learning concepts, algorithms, and techniques in machine learning and computer vision. I highly recommend reading them if you are doing research in those fields, or just out of morbid curiosity! The list focuses mostly on groundbreaking, foundational papers in machine learning, deep learning, and computer vision. They will give you solid context to build upon as you progress in the field, and possibly help you come up with novel ideas of your own!

Robust Real-time Object Detection

A ground-breaking object detection algorithm based on boosting many Haar-based feature detectors. The framework is generic, and the Haar features can be swapped out for other features such as histograms of oriented gradients or local binary patterns. This classical (non-deep-learning) approach to object detection was the state of the art before deep learning, and was widely used in cell phones and cameras for face detection.
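As a rough illustration of what a single Haar-based feature computes, here is a minimal sketch of a two-rectangle feature evaluated with an integral image (summed-area table). The function names and the toy image are assumptions made for this example; the boosting stage that selects and combines many such features is not shown.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_haar(ii, top, left, height, width):
    """Horizontal two-rectangle feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum

# Toy 4x4 image (illustrative only): bright left half, dark right half.
img = np.array([[9, 9, 1, 1]] * 4)
ii = integral_image(img)
feature = two_rect_haar(ii, top=0, left=0, height=4, width=4)
print(feature)
```

Because every rectangle sum costs four table lookups regardless of its size, a large pool of such features can be evaluated cheaply, which is what makes boosting over them practical.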

Distinctive Image Features from Scale-Invariant Keypoints

The original SIFT paper by Lowe. Of all the interest point detection and description methods, SIFT was the most popular, and it was the state-of-the-art approach for many years (it has arguably since been overtaken by KAZE and LIFT). It has seen many applications, including image stitching, image classification, and image search. It is one of the seminal classical computer vision algorithms.

Face Recognition Using Eigenfaces

A foundational face recognition algorithm, hailed for its combination of simplicity and accuracy. The key idea is to model faces using a low-dimensional linear subspace computed with PCA. Only a few principal components are needed to encode much of the structure and salient information of a face (though only in highly controlled settings). This paper inspired much subsequent work in classical vision approaches to face recognition. The basic framework is also generic enough to be applicable to other problems.
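The subspace idea can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's exact procedure: the "faces" are synthetic random vectors, the helper names are made up for this example, and recognition is done with a plain nearest-neighbour match in the projected space.

```python
import numpy as np

def fit_eigenfaces(faces, n_components):
    """faces: (n_samples, n_pixels) array of flattened images."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of vt are the principal directions (the "eigenfaces").
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:n_components]

def project(faces, mean_face, eigenfaces):
    """Encode each face by its coordinates in the PCA subspace."""
    return (faces - mean_face) @ eigenfaces.T

def nearest_identity(query, codes, labels, mean_face, eigenfaces):
    """Label of the gallery face closest to the query in the subspace."""
    code = project(query[None, :], mean_face, eigenfaces)
    dists = np.linalg.norm(codes - code, axis=1)
    return labels[int(np.argmin(dists))]

# Synthetic stand-in for a face gallery: 3 identities, 5 noisy samples each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 64))
labels = np.repeat(np.arange(3), 5)
gallery = prototypes[labels] + 0.05 * rng.normal(size=(15, 64))

mean_face, eigenfaces = fit_eigenfaces(gallery, n_components=2)
codes = project(gallery, mean_face, eigenfaces)

query = prototypes[1] + 0.05 * rng.normal(size=64)
predicted = nearest_identity(query, codes, labels, mean_face, eigenfaces)
print(predicted)
```

Here two components are enough only because the toy data has three well-separated identities; real face images need more components and careful alignment and lighting control.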

An Iterative Image Registration Technique with an Application to Stereo Vision

The most popular image registration / optical flow algorithm, known as Lucas-Kanade. It has been applied in many areas, including video analysis, video compression, and action recognition. It estimates the flow of pixels by least squares, and is derived (as most optical flow methods are) from the brightness constancy constraint.
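To make the least squares step concrete: linearizing brightness constancy gives I_x u + I_y v + I_t = 0 at each pixel, and stacking these equations over a window yields an overdetermined system for the single flow vector (u, v). The sketch below solves that system for one window with NumPy. It is a simplified, single-step version that assumes small motion; the function name and the synthetic Gaussian blob are illustrative assumptions, and the iterative refinement and image pyramids used in practice are left out.

```python
import numpy as np

def lucas_kanade_window(frame0, frame1):
    """Estimate one (u, v) translation, in pixels, for a whole patch."""
    f0 = frame0.astype(np.float64)
    f1 = frame1.astype(np.float64)
    # np.gradient returns the derivative along axis 0 (y), then axis 1 (x).
    iy, ix = np.gradient(f0)
    it = f1 - f0
    # One equation per pixel: ix * u + iy * v = -it.
    A = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow

def blob(cx, cy, size=32, sigma=5.0):
    """Smooth synthetic image: a Gaussian bump centered at (cx, cy)."""
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

frame0 = blob(16.0, 16.0)
frame1 = blob(16.3, 15.8)  # moved +0.3 px in x and -0.2 px in y

u, v = lucas_kanade_window(frame0, frame1)
print(u, v)  # should come out close to the true shift (0.3, -0.2)
```

The estimate is only approximate because the constraint is a first-order linearization; for larger displacements the solve is typically repeated after warping one frame toward the other.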

Robust Real-time Object Detection

Distinctive Image Features from Scale-Invariant Keypoints

Face Recognition Using Eigenfaces

An Iterative Image Registration Technique with an Application to Stereo Vision

Normalized Cuts and Image Segmentation

Visual Categorization with Bags of Keypoints

ImageNet Classification with Deep Convolutional Neural Networks

Deep Residual Learning for Image Recognition

Improving the Fisher Kernel for Large-Scale Image Classification

Support Vector Networks

Random Forests

Reducing the Dimensionality of Data with Neural Networks

Learnability and the Vapnik-Chervonenkis Dimension

Generative Adversarial Networks

Auto-Encoding Variational Bayes

Dynamic Routing Between Capsules

Long Short-Term Memory

Attention Is All You Need