Just went to an awesome talk by Nvidia’s Will Ramey called Deep Learning Demystified. These are my notes – raw, stream of consciousness, and unedited. But interesting! (I hope)
Algorithms that learn have in the past relied on experts writing programs with hand-built classifiers, e.g. to detect edges or look for particular features. First the programmer has to conceive of the classifiers, then connect them together in some kind of logic tree and characterise the features sought, e.g. wheels are round. So you get a system for distinguishing between cars/trucks/buses, but it doesn’t cope with changes such as a rainy day or a foggy day. You then have to manually create new classifiers and work out how to make them work. It doesn’t scale or translate to new problems.
New approach of deep learning uses neural networks to learn from the examples that you give it. You don’t have to manually create classifiers, you can use examples in the data itself so that the network creates the classifiers. The major advantage is that it’s really easy to extend. If you train it on cars in the daytime and then want to use it on a foggy day you just provide more foggy pictures and retrain the network. You don’t have to create new hypotheses for what might work, you leave that to the network.
With GPU accelerators the process of training and retraining the networks as you expand the amount of data is relatively fast and doesn’t require a lot of manual effort.
Stunningly effective for internet services, medicine, media, security, autonomous machines etc
Deep learning is also being applied as a tool to understand and learn from massive amounts of data.
NASA uses deep learning to better understand the content of satellite images and provide feedback to farmers and scientists studying the ecosystem about how to manage their work and increase crop productivity etc.
How do you do deep learning?
Start with an untrained neural network. Deep neural networks have a large number of layers, but the nodes are not necessarily all connected to one another. Some NN topologies are known to be good at image classification, or object recognition, or signal analysis, etc.
In its untrained state the network is just a bunch of math functions and weights that determine how the outputs of those functions are passed on to the next layer. It can’t do anything yet. To train a NN to distinguish between dogs and cats we assemble a training set of labelled images. We use a deep learning framework to feed the images through the NN one at a time and check the output. If we’re only trying to distinguish between cats and dogs there will be two output nodes with confidence levels – how confident is the NN that it might be a dog, and how confident that it might be a cat?
The framework already knows the answer and will evaluate whether the NN infers the correct answer. If so, it rewards the neural net by strengthening the weights of those nodes that contributed most to the correct answer, and reducing the weights of the nodes that didn’t contribute. When the NN infers the incorrect answer, it decreases the weights that contributed most to that wrong answer. You keep showing the same collection of images over and over, continuing the training, and the NN gets very good at it. It’s almost a Skinnerian psychology experiment, but without electric shocks.
Showing the whole dataset once is called an “epoch”. It takes many, many epochs to train the network properly, and managing that is the job of the deep learning framework.
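To make the training loop concrete, here is a minimal sketch in plain Python. It is not from the talk: the two numeric features per example, the tiny dataset, the learning rate and all function names are made up for illustration, and the weight updates use ordinary gradient descent on a softmax output, which is one common way frameworks implement the "strengthen what helped, weaken what didn't" idea described above.

```python
import math
import random

CAT, DOG = 0, 1

# Hypothetical training set: each example is (features, correct label).
# Real images would have thousands of pixel values, not two numbers.
TRAINING_SET = [
    ([0.9, 0.2], CAT), ([0.8, 0.1], CAT), ([0.7, 0.3], CAT),
    ([0.2, 0.9], DOG), ([0.1, 0.8], DOG), ([0.3, 0.7], DOG),
]


def softmax(scores):
    """Turn raw scores into confidence levels that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, biases, features):
    """Feed one example through the network; returns [cat, dog] confidences."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


def train(examples, epochs=200, learning_rate=0.5, seed=0):
    """Show the whole dataset `epochs` times, nudging weights after each example."""
    rng = random.Random(seed)
    n_features = len(examples[0][0])
    # Untrained state: small random weights, one row per output node.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
               for _ in range(2)]
    biases = [0.0, 0.0]
    for _ in range(epochs):  # one pass over the dataset is one epoch
        for features, label in examples:
            confidences = forward(weights, biases, features)
            for k in range(2):
                target = 1.0 if k == label else 0.0
                # Negative error on the correct node strengthens its weights;
                # positive error on the wrong node weakens its weights.
                error = confidences[k] - target
                for j in range(n_features):
                    weights[k][j] -= learning_rate * error * features[j]
                biases[k] -= learning_rate * error
    return weights, biases


if __name__ == "__main__":
    weights, biases = train(TRAINING_SET)
    print(forward(weights, biases, [0.85, 0.15]))  # a cat-like example
```

Because the framework "already knows the answer" for every example, the update only needs the gap between the confidence the network gave and the confidence it should have given.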
This is supervised learning.
Now you have a trained network that can distinguish between dogs and cats, but nothing else. If you were to show it a raccoon it would probably give you a low probability value for both dogs and cats. The trained network still has all the flexibility it would need to keep learning, but in most cases, once you deploy it, it doesn’t need to be able to learn anymore.
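A rough sketch of what the deployed, no-longer-learning network does, again in plain Python. The weights here are hand-picked placeholders rather than the result of any real training, and the sketch assumes each output node reports its own independent confidence (a sigmoid per node). That assumption matters: with a softmax output the two confidences would have to sum to 1, so they could not both be low.

```python
import math

# Hypothetical frozen parameters standing in for a trained network.
# Nothing in this file ever updates them: inference only.
FROZEN_WEIGHTS = {"cat": [4.0, -4.0], "dog": [-4.0, 4.0]}
FROZEN_BIASES = {"cat": -2.0, "dog": -2.0}


def sigmoid(x):
    """Squash a raw score into a confidence between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def infer(features):
    """Return an independent confidence for each known class."""
    confidences = {}
    for label, weights in FROZEN_WEIGHTS.items():
        score = sum(w * x for w, x in zip(weights, features))
        confidences[label] = sigmoid(score + FROZEN_BIASES[label])
    return confidences


if __name__ == "__main__":
    print(infer([0.9, 0.1]))  # cat-like input: high "cat", low "dog"
    print(infer([0.5, 0.5]))  # unfamiliar input: low for both classes
```

The second call plays the role of the raccoon: the input resembles neither class, so neither output node is confident.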
A colleague at Nvidia trained a neural net to recognise cats from a USB camera and set up a system to turn on the sprinklers when there’s a cat on his lawn, to scare it away.
In some cases in the trained network there may be nodes that don’t contribute to the answer. The framework can pay attention to that and automatically remove nodes that don’t make a difference either way, or sometimes fuse layers to save time. And now you can integrate your optimised model into your application.
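One simple way to picture removing nodes that make no difference is magnitude-based pruning: drop any hidden node whose outgoing weights are all (near) zero. This is a sketch of that one heuristic, not necessarily what the framework in the talk does. The function names, the threshold and the example weights are all made up for illustration.

```python
def forward(input_to_hidden, hidden_to_output, features):
    """Tiny two-layer network: ReLU hidden layer, linear outputs, no biases."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(node_weights, features)))
        for node_weights in input_to_hidden
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in hidden_to_output]


def prune_hidden_nodes(input_to_hidden, hidden_to_output, threshold=1e-3):
    """Drop hidden nodes whose outgoing weights are all within the threshold.

    input_to_hidden[h] holds the incoming weights of hidden node h.
    hidden_to_output[o][h] is the weight from hidden node h to output o.
    """
    keep = [
        h for h in range(len(input_to_hidden))
        if any(abs(row[h]) > threshold for row in hidden_to_output)
    ]
    pruned_in = [input_to_hidden[h] for h in keep]
    pruned_out = [[row[h] for h in keep] for row in hidden_to_output]
    return pruned_in, pruned_out, keep


# Hypothetical weights: hidden node 1 feeds nothing into either output.
INPUT_TO_HIDDEN = [[0.5, -0.2], [0.3, 0.8], [-0.6, 0.1]]
HIDDEN_TO_OUTPUT = [[1.2, 0.0, -0.7], [-0.4, 0.0, 0.9]]

if __name__ == "__main__":
    small_in, small_out, kept = prune_hidden_nodes(INPUT_TO_HIDDEN, HIDDEN_TO_OUTPUT)
    print("kept hidden nodes:", kept)
    print(forward(INPUT_TO_HIDDEN, HIDDEN_TO_OUTPUT, [1.0, 2.0]))
    print(forward(small_in, small_out, [1.0, 2.0]))
```

The pruned network does less arithmetic per input while giving the same outputs here, because the removed node contributed exactly nothing. With a non-zero threshold the outputs would change slightly, which is the trade-off an optimiser has to weigh.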
Deep Learning algorithms are evolving very rapidly, and it is challenging to keep up with them. Training NNs is incredibly computationally expensive, and you don’t necessarily know what the correct topology is at the start. So you might need to tweak it many times and train one, learn from that, and then explore different possibilities, which increases the computational needs and makes it more expensive.
You can check out Nvidia’s deep learning resources online: https://developer.nvidia.com/deep-learning