Blog | David Torpey

On the Robustness of Self-Supervised Representations for Multi-view Object Classification

15 January 2023

In this post, I’ll talk about a paper we recently published on the robustness of self-supervised representation with respective to viewpoint variation - one of the core tenants of any capable vision system. At this point, it is known that vision models pretrained using self-supervised objectives outperform standard supervised pretraining on a set of common, standard benchmark datasets such as ImageNet, CIFAR10, COCO, and Birdsnap. However, these datasets all serve to evaluate these models in a very narrow aspect - simple object classification performance.

ssl self-supervised learning representation learning

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

11 June 2021

As is commonly known at this point, transformers have transformed the field of NLP, and sequence modelling in general. However, computer vision has thus far remained dominated by the CNN. Its inductive biases result in unparalleled efficiency in terms of data and parameters for modelling data with a grid-like topology - most often images or video.

transformers vision transformers inductive biases

pydags - A lightweight DAG framework for Python

03 May 2021

I recently released a pre-alpha version of a Python library I’ve been working on. It’s still in the very early stages of development, but this tutorial aims to give an introduction to the library and its purpose.

python dag directed acyclic graph kubeflow airflow python3

A Simple Framework for Contrastive Learning of Visual Representations

12 September 2020

A popular and useful framework for contrastive self-supervised learning known as SimCLR was introduced by Chen et. al.. The framework simplifies previous contrastive methods to self-supervised learning, and at the time was state-of-the-art at unsupervised image representation learning. The main simplification lies in the fact that SimCLR requires no specialised modules or additions to the architecture such as memory banks.

self-supervised learning computer vision deep learning

A Foundation of Mathematics - The Peano Axioms

12 September 2020

In the past mathematicians wished to created a foundation for all of mathematics. The number system can be constructed hierarchically from the set of natural numbers \(\mathbb{N}\). From \(\mathbb{N}\), we can construct the integers \(\mathbb{Z}\), rationals \(\mathbb{Q}\), reals \(\mathbb{R}\), complex numbers \(\mathbb{C}\), and more. However, it is desirable to be able to construct the naturals (\(\mathbb{N}\)) from more basic ingredients, since there is no reason \(\mathbb{N}\) should itself be fundamental.

mathematics peano axoims

Reducing the dimensionality of data with neural networks

26 June 2020

Reducing the dimensionality of data has many valuable potential uses. The low-dimensional version of the data can be used for visualisation, or for further processing in a modelling pipeline. The low-dimensional version should capture only the salient features of the data, and can indeed be seen as a form of compression. Many techniques for dimensionality reduction exists, including PCA (and its kernelized variant Kernel PCA), Locally Linear Embedding, ISOMAP, UMAP, Linear Discriminant Analysis, and t-SNE. Some of these are linear methods, while others are non-linear methods. Many of the non-linear methods falls into a class of algorithms known as manifold learning algorithms.

deep learning autoencoder dimensionality reduction

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

25 June 2020

Labelled data is often either expensive or hard to obtain. As such, there has been a plethora of work to make better use of unlabelled data in machine learning, with paradigms such as unsupervised learning, semi-supervised learning, and more recently, self-supervised learning. FixMatch is an approach to semi-supervised learning (SSL) that combines two common approaches of SSL: 1. consistency regularisation and 2. pseudo-labelling.

semi-supervised learning computer vision deep learning

All About Convex Hulls

24 June 2020

The convex hull is a very important concept in geometry, and has many applications in fields such as computer vision, mathematics, statistics, and economics. Essentially, a convex hull of a shape or set of points is the smallest convex set that contains that shape or set of points. Many algorithms exist to compute a convex hull. Many of these algorithms have focused on the 2D or 3D case, however, the general \(d\)-dimensional case is of big interest in many applications.

computational geometry geometry computer vision

Representation Learning (1)

28 December 2019

For a while I’ve been interested in representation learning in the context of deep learning. Concepts such as self-supervised learning, unsupervised representation learning using GANs or VAEs, or simply through a vanilla supervised learning of some neural network architecture. Upon reading the literature, I had an idea that serves as a nice integration of two very interesting and useful models / techniques - the Fisher vector (which I’ve previously posted about in my blog here), and the variational autoencoder (which I’ve been meaning to write a blog post about!). This blog post just serves to flesh out the idea, should I choose to pursue or revisit it at some point.

representation learning fisher vectors deep learning

SVMs: A Geometric Interpretation

30 March 2019

Example Points

support vector machine svm machine learning

Human Action Recognition

18 March 2019

In this post we will discuss the problem of human action recognition - an application of video analysis / recognition. The task is simply to identify a single action from a video. The typically setting is a dataset consisting of \(N\) action classes, where each class has a set of videos associated with it relating to that action. We will focus on the approaches typically taken in early action recognition research, and then focus on the current state-of-the-art approaches. There is a recurring theme in action recognition of extending conventional two-dimensional algorithms into three dimensions to accommodate for the extra (temporal) dimension when dealing with videos instead of images.

cnn deep learning action recognition

Dimensionality Reduction

31 January 2019

In machine learning, we often work with very high-dimensional data. For example, we might be working in a genome prediction context, in which case our feature vectors would contains thousands of dimensions, or perhaps we’re dealing in another context where the dimensions reach of hundreds of thousands or possibly millions. In such a context, one common way to get a handle on the data - to understand it better - is to visualise the data by reducing its dimensions. The can be done using conventional dimensionality reduction techniques such as PCA and LDA, or using manifold learning techniques such as t-SNE and LLE.

dimensionality reduction pca t-SNE machine learning manifold learning

Optical Flow

23 December 2018

Optical flow is a method for motion analysis and image registration that aims to compute displacement of intensity patterns. Optical flow is used in many different settings in the computer vision realm, such as video recognition and video compression. The key assumption to many optical flow algorithms is known as the brightness constancy constraint, as is defined as:

optical flow lucas kanade dense optical flow computer vision

Ensemble Learning

10 December 2018

Ensemble learning is one of the most useful methods in the machine learning, not least for the fact that it is essentially agnostic to the statistical learning algorithm being used. Ensemble learning techniques are a set of algorithms that define how to combine multiple classifiers to make one strong classifier. There are various ensemble learning techniques, but this post will focus on the two most popular - bagging and boosting. These two approach the same problem in very different ways.

ensemble learning boosting bagging machine learning

Autoencoders

02 December 2018

Autoencoders fall under the unsupervised learning category, and are a special case of neural networks that map the inputs (in the input layer) back to the inputs (in the final layer). This can be seen mathematically as \(f : \mathbb{R}^m \mapsto \mathbb{R}^m\). Autoencoders were originally introduced to address dimensionality reduction. In the original paper, Hinton compares it with PCA, another dimensionality reduction algorithm. He showed that autoencoders outperform PCA when non-linear mappings are needed to represent the data. They are able to learn a more realistic low-dimensional manifold than linear methods due to their non-linear nature.

autoencoder auto-encoders neural networks deep learning dimensionality reduction

Face Recognition: Eigenfaces

25 November 2018

The main idea behind eigenfaces is that we want to learn a low-dimensional space - known as the eigenface subspace - on which we assume the faces intrinsically lie. From there, we can then compare faces within this low-dimensional space in order to perform facial recognition. It’s a relatively simple approach to facial recognition, but indeed one of the most famous and effective ones of the early approaches. It still works well in simple, controlled scenarios.

face recognition eigenfaces pca

Support Vector Machines - Why and How

25 November 2018

Support vector machines (SVMs) are one of the most popular supervised learning algorithms in use today, even with the onslaught of deep learning and neural network take-over. The reason they have remained popular is due to their reliability across a wide variety of problem domains and datasets. They often have great generalisation performance, and this is almost solely due to the clever way in which they work - that is, how they approach the problem of supervised learning and how they formulate the optimisation problem they solve.

svm support vector machine machine learning kernel machines

Local Feature Encoding and Quantisation

25 November 2018

In this post, I will describe local feature encoding and quantisation - why it is useful, where it is used, and some of the popular techniques used to perform it.

fisher vector vlad feature vectors feature encoding computer vision