Learn autoencoders by training one right in your browser!
Autoencoders have many different applications, but most notably they have been used for dimensionality reduction and as generative models.
				
The rest of the article will dive into the structure of an autoencoder in the context of Figure 1. More specifically, it will be broken up into the Encoder, Latent Space, and Decoder sections to explain each piece of the puzzle. Then, the article will end with a Conclusion that extends to applications of autoencoders elsewhere.
			
The Encoder is the first half of the neural network. It takes an input with a higher dimension and outputs a lower dimension, thereby creating a bottleneck. Hence the trapezoid for the Encoder design, starting with a larger base and moving to a smaller one: going from 3 Dimensional (3D) to 2 Dimensional (2D) data.
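To make this concrete, here is a minimal sketch of such an Encoder in TensorFlow.js (the library this article's demos are built with). The 16-unit hidden layer and activation choices are illustrative assumptions, not the exact architecture trained on this page.

```js
import * as tf from '@tensorflow/tfjs';

// A minimal 3D -> 2D Encoder: each dense layer shrinks the
// representation, tracing the trapezoid from larger base to smaller.
const encoder = tf.sequential({
  layers: [
    tf.layers.dense({inputShape: [3], units: 16, activation: 'relu'}),
    tf.layers.dense({units: 2}) // the 2D bottleneck (latent) output
  ]
});

// A batch of 3D points becomes a batch of 2D latent vectors.
const points3d = tf.randomNormal([256, 3]);
const latent2d = encoder.predict(points3d); // shape [256, 2]
```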
			
Figure 2: 3D to 2D.
You can see from Figure 2 that, after training the entire autoencoder, the Encoder is just a learned function that takes the input to a lower dimension. In Figure 2, the 2D Latent Space looks like the 3D input data if we ignored the vertical dimension. This is exactly what we should expect given the bottleneck defined from 3D to 2D.
			
The Latent Space is composed of all outputs from the Encoder. In other words, one output is a latent vector, and all outputs together constitute a Latent Space. Beyond enabling vector arithmetic to find connections and combinations, visualizing this space gives insight into the structure of the data.
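As a rough illustration of that vector arithmetic, here is a sketch in TensorFlow.js; the `encoder` and `decoder` models are assumed to be already trained, and the 3D inputs are made up.

```js
// Encode two (hypothetical) 3D inputs into their latent vectors.
const a = encoder.predict(tf.tensor2d([[1.0, 0.2, 0.5]]));
const b = encoder.predict(tf.tensor2d([[0.1, 0.9, 0.4]]));

// Simple latent arithmetic: the halfway point between A and B.
const midpoint = a.add(b).div(2);
// Decoding the midpoint gives a reconstruction "between" the inputs.
const blended = decoder.predict(midpoint);
```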
One way to visualize the structure that forms is through Opposite Gradients. By understanding what direction each point in the Latent Space is tending towards, we can get an idea of where the training is headed.
After computing the gradient of the loss with respect to the latent output, the opposite (negative) of that gradient is drawn as a trail behind each point.
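Here is a minimal sketch of how those trails could be computed in TensorFlow.js, assuming a trained `decoder`, a batch of 2D `latentPoints`, and the matching `originalInput` batch (all names are illustrative):

```js
// Reconstruction loss as a function of the latent points alone.
const lossFromLatent = latent =>
  tf.losses.meanSquaredError(originalInput, decoder.apply(latent));

// tf.grad gives d(loss)/d(latent); negating it points each trail in
// the direction gradient descent would actually move that point.
const trails = tf.grad(lossFromLatent)(latentPoints).neg();
```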
				
Just by observing the trails, you can see the structure that takes place over training. Notice how each point doesn't move exactly in the direction of its trail; the trail is more of an indicator of the gravity of the structure: larger and more numerous trails will pull the data in that direction, and uniformly distributed trails will not affect the structure at all. This method of visualization can be applied to other outputs, as demonstrated by Kahng et al. in GAN Lab.
After the Encoder deconstructs the original input down to the Latent Space, the Decoder reconstructs it back up to the original input. Hence the trapezoid for the Decoder design, starting with a smaller base and moving to a larger one. The loss function, adequately named "reconstruction loss," is computed with the original input and the reconstructed input. Now we can backpropagate from the reconstruction loss and optimize! All the pieces are now present to train the autoencoder.
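Putting the pieces together, an end-to-end training sketch in TensorFlow.js could look like the following. The mirrored Decoder (2 to 16 to 3) and the optimizer settings are assumptions for illustration; the key point is that the input doubles as the training target.

```js
// Decoder mirroring the encoder: from the 2D latent back up to 3D.
const decoder = tf.sequential({
  layers: [
    tf.layers.dense({inputShape: [2], units: 16, activation: 'relu'}),
    tf.layers.dense({units: 3}) // back to the original 3 dimensions
  ]
});

// Glue the two halves together with the functional API.
const input = tf.input({shape: [3]});
const reconstruction = decoder.apply(encoder.apply(input));
const autoencoder = tf.model({inputs: input, outputs: reconstruction});

// "Reconstruction loss": mean squared error between the original
// input and the reconstructed input.
autoencoder.compile({optimizer: tf.train.adam(0.01), loss: 'meanSquaredError'});

// Train the network to copy its input through the 2D bottleneck.
await autoencoder.fit(points3d, points3d, {epochs: 100, batchSize: 32});
```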
			
Figure 5: 2D to 3D.
You can see from Figure 5 that, after training the entire autoencoder, the Decoder is just a learned function that reconstructs from the bottleneck. In Figure 5, the 3D Reconstruction looks like the 2D Latent Space if we added a dimension.
Autoencoders are not fixed to 3D data like in the previous examples. They can be used on other data too!
In fact, in addition to being applied to many different shapes and sizes of data, the autoencoder structure can be used to tackle real problems like denoising images, removing imperfections or watermarks from images, learning complex or emergent structures in data, and many more.
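As a quick illustration of the denoising use case, only the inputs are corrupted while the targets stay clean. A minimal sketch, reusing the `autoencoder` from above with a hypothetical `cleanData` tensor:

```js
// Corrupt the inputs with Gaussian noise, but keep the clean data as
// the target, so the autoencoder learns to undo the noise.
const noisy = cleanData.add(tf.randomNormal(cleanData.shape, 0, 0.1));
await autoencoder.fit(noisy, cleanData, {epochs: 100});
```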
				
To give one final example, if we wanted to visualize the structure of the MNIST digits dataset, which consists of 28 by 28 pixel handwritten digits (0-9), we could use an autoencoder!
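Here is a sketch of what that could look like in TensorFlow.js, with each image flattened to 784 values and squeezed through a 2D bottleneck; the 128-unit hidden layers are illustrative assumptions, not the exact setup behind Figure 6, and `flatImages` stands in for a tensor of flattened digit images.

```js
// A hypothetical MNIST autoencoder with a 2D bottleneck.
const mnistAutoencoder = tf.sequential({
  layers: [
    // Encoder: 784 (28 by 28, flattened) -> 128 -> 2
    tf.layers.dense({inputShape: [784], units: 128, activation: 'relu'}),
    tf.layers.dense({units: 2, name: 'latent'}),
    // Decoder: 2 -> 128 -> 784, sigmoid to keep pixels in [0, 1]
    tf.layers.dense({units: 128, activation: 'relu'}),
    tf.layers.dense({units: 784, activation: 'sigmoid'})
  ]
});
mnistAutoencoder.compile({optimizer: 'adam', loss: 'meanSquaredError'});

// After training, pull out the 2D latent codes (one point per digit
// image) to scatter-plot the clusters.
const latentModel = tf.model({
  inputs: mnistAutoencoder.inputs,
  outputs: mnistAutoencoder.getLayer('latent').output
});
const points = latentModel.predict(flatImages); // shape [n, 2]
```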
			
				
In Figure 6, after training the autoencoder with a 2D bottleneck, we can see clusters form in the Latent Space! Also notice the similarity between the digits: see how the 9s are mixed in with the 7s and 4s in the Latent Space.
			
The outcome was heavily influenced and inspired by the amazing works of GAN Lab and Understanding UMAP.
The article was styled with the Distill HTML Template.
Understanding UMAP, Communicating with Interactive Articles, and GAN Lab were used as a CSS styling reference for the article and controls.
Libraries used: plotting and visualization done in Svelte, with the help of d3.js and ScatterGL. The autoencoder was created and trained with TensorFlow.js.
Donald "Donny" R. Bertucci implemented all of the visualizations and wrote the article. Donny is an undergraduate student at Oregon State University and a member of the Data Interaction and Visualization (DIV) Lab.