The technology of putting an artificial brain into a machine is fascinating. Thanks to our ingenious intellect, we humans have a natural instinct to go beyond what seems impossible: to create tools and technology that become an extension of our day-to-day lives, that can make decisions on our behalf, and that make our living super-efficient.
It is with this urge to make ourselves super-productive that we started putting artificial intelligence into computing machines, and we have now come a long way toward building machines that can nearly think like the human brain.
Today we will see how deep learning, a branch of artificial intelligence, is doing justice to all the valuable data floating around this universe, processing it efficiently to help us reach rational conclusions in fields such as speech recognition, computer vision, NLP, healthcare, and finance.
What is Deep Learning?
As we know, various machine learning techniques have been used to process raw data: content filtering on social networks, recommendation engines for e-commerce portals, image and pattern recognition, transcribing speech to text, and so on. Most of these tasks are now implemented using one of the most popular classes of ML techniques, called deep learning.
“Deep learning is a class of machine learning algorithms that allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”
A deep learning algorithm uses:
- Multiple layers of non-linear processing units, where each successive layer takes the output of the previous layer as its input.
- Supervised learning for classification and unsupervised learning for pattern analysis.
- Some form of gradient descent for training.
- A hierarchical representation of the data, formed by the multiple levels of processing, in which higher layers learn from the representations of lower layers.
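The points above can be sketched in a few lines of NumPy. The following is a minimal illustrative example, not a production implementation: two non-linear (sigmoid) layers, each consuming the previous layer's output, trained with plain gradient descent on a toy binary target. All sizes, the learning rate, and the synthetic data are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                  # 200 samples, 3 features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy binary target

W1 = rng.normal(scale=0.5, size=(3, 8))        # layer 1 weights: 3 -> 8 units
W2 = rng.normal(scale=0.5, size=(8, 1))        # layer 2 weights: 8 -> 1 unit
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1)   # first non-linear layer
    p = sigmoid(h @ W2)   # second layer consumes the first layer's output
    return h, p

_, p = forward(X)
initial_loss = np.mean((p - y) ** 2)

lr = 1.0
for _ in range(2000):                 # plain (batch) gradient descent
    h, p = forward(X)
    d_p = (p - y) * p * (1 - p)       # chain rule through the output sigmoid
    d_h = (d_p @ W2.T) * h * (1 - h)  # ...and back through the hidden layer
    W2 -= lr * h.T @ d_p / len(X)
    W1 -= lr * X.T @ d_h / len(X)

_, p = forward(X)
final_loss = np.mean((p - y) ** 2)
accuracy = ((p > 0.5) == y).mean()
```

After training, the loss has dropped from its initial value, which is all gradient descent promises: each step nudges the weights of every layer downhill along the error surface.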
Origin of Deep Learning:
Rina Dechter coined the expression “deep learning” in 1986, but the first working learning algorithm for supervised, deep, feedforward, multilayer perceptrons had already been published by Ivakhnenko and Lapa in 1965. Their ideas were implemented in a computer identification system called “Alpha”, which demonstrated the learning process.
Deep Learning Architecture :
To understand how deep learning works we need to understand its architecture in depth.
The mother art is architecture. Without an architecture of our own we have no soul of our own civilisation
–Frank Lloyd Wright
Five of the most popular deep learning network architectures are:
- RNNs: Recurrent neural networks
- LSTM/GRU: Long short-term memory / gated recurrent unit
- DBNs: Deep belief networks
- CNNs: Convolutional neural networks
- DSNs: Deep stacking networks
Deep learning architectures spanning the past 20 years
Applications of Deep Learning Architectures:
- RNN: Mostly employed in speech and handwriting recognition
- LSTM/GRU : Natural language text compression, handwriting recognition, speech recognition, gesture recognition, image captioning
- CNN : Image recognition, video analysis, natural language processing
- DBN : Image recognition, information retrieval, natural language understanding, failure prediction
- DSN : Information retrieval, continuous speech recognition
Let's now go deeper into each deep network architecture.
A. Recurrent Neural Network:
An RNN is a type of artificial neural network (ANN) in which the connections between units form a directed cycle.
When backpropagation was first introduced, its most exciting use was for training recurrent neural networks (RNNs). Juergen Schmidhuber, a leading researcher in ML, gave an excellent explanation of RNNs:
Recurrent neural networks allow for both parallel and sequential computation, and in principle can compute anything a traditional computer can compute. Unlike traditional computers, however, recurrent neural networks are similar to the human brain, which is a large feedback network of connected neurons that somehow can learn to translate a lifelong sensory input stream into a sequence of useful motor outputs. The brain is a remarkable role model as it can solve many problems current machines cannot yet solve.
RNN Application :
For tasks that involve sequential inputs, such as speech and language, it is often better to use RNNs . RNNs process an input sequence one element at a time, maintaining in their hidden units a ‘state vector’ that implicitly contains information about the history of all the past elements of the sequence. When we consider the outputs of the hidden units at different discrete time steps as if they were the outputs of different neurons in a deep multilayer network, it becomes clear how we can apply backpropagation to train RNNs. RNNs are very powerful dynamic systems, but training them has been a bit challenging because backpropagated gradients either grow or shrink at each time step, so over many time steps they typically explode or vanish.
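The "one element at a time" recurrence described above fits in a few lines. Below is a minimal sketch, with arbitrary illustrative sizes and random weights: the hidden state vector `h` is updated from the current input and the previous state, so after each step it implicitly summarizes the history of the sequence so far. Unrolling the loop gives the deep multilayer network to which backpropagation is applied.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden = 4, 5
W_xh = rng.normal(scale=0.1, size=(n_in, n_hidden))      # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # hidden -> hidden (the feedback)
b = np.zeros(n_hidden)

def rnn_forward(sequence):
    """Process the sequence one element at a time, carrying the state vector."""
    h = np.zeros(n_hidden)          # state before anything has been seen
    states = []
    for x in sequence:              # one time step per sequence element
        # New state depends on the current input AND the previous state,
        # so h accumulates information about all past elements.
        h = np.tanh(x @ W_xh + h @ W_hh + b)
        states.append(h)
    return np.stack(states)         # the unrolled states, one per time step

sequence = rng.normal(size=(6, n_in))   # a toy sequence of 6 time steps
states = rnn_forward(sequence)
```

The same weight matrices `W_xh` and `W_hh` are reused at every time step; it is the repeated multiplication by `W_hh` during backpropagation through these steps that makes the gradients grow or shrink, producing the exploding and vanishing gradient problems mentioned above.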
RNN Architecture :
RNNs come in a rich variety of architectures:
- Fully recurrent: Basic RNNs are a network of neuron-like nodes, each with a directed (one-way) connection to every other node. Each node (neuron) has a time-varying, real-valued activation. Each connection (synapse) has a modifiable real-valued weight. Nodes are either input nodes (receiving data from outside the network), output nodes (yielding results), or hidden nodes (modifying the data en route from input to output).
- Recursive: Here an RNN is created by applying the same set of weights recursively over a differentiable graph-like structure, traversing the structure in topological order. A special case of recursive neural networks is the RNN whose structure corresponds to a linear chain. Recursive neural networks have been applied to natural language processing; the Recursive Neural Tensor Network, for example, uses a tensor-based composition function for all nodes in the tree.
- Hopfield network: A type of RNN named after John Hopfield, who popularised it in 1982. The Hopfield network is an RNN in which all connections are symmetric. It requires stationary inputs and does not process sequences of patterns, so it is not a general-purpose RNN, but it is guaranteed to converge. If the connections are trained using Hebbian learning, the Hopfield network can act as a robust content-addressable memory, resistant to connection alteration. As a model of human memory, the Hopfield network accounts for associative memory through the incorporation of memory vectors: presenting a partial or corrupted memory vector sparks the retrieval of the most similar vector stored in the network, although this process can also produce intrusions.
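The content-addressable memory behaviour of the Hopfield network can be sketched in a few lines. This is a toy illustration with hand-picked ±1 patterns: Hebbian learning builds a symmetric weight matrix from the correlations between units, and repeatedly updating the units from a corrupted probe settles on the most similar stored pattern. Note one simplification: the classic Hopfield convergence guarantee assumes asynchronous (one unit at a time) updates, whereas this sketch updates all units synchronously for brevity.

```python
import numpy as np

# Two stored patterns over 6 bipolar (+1/-1) units; illustrative choices.
patterns = np.array([
    [ 1, -1,  1, -1,  1, -1],
    [ 1,  1,  1, -1, -1, -1],
])

# Hebbian learning: W[i, j] accumulates the correlation of units i and j,
# giving a symmetric weight matrix with no self-connections.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

def recall(state, steps=10):
    """Update all units from the weighted input until the state settles."""
    state = state.copy()
    for _ in range(steps):
        new = np.where(W @ state >= 0, 1, -1)  # threshold each unit
        if np.array_equal(new, state):         # fixed point reached
            break
        state = new
    return state

# Probe with a corrupted copy of the first pattern (one flipped unit):
# the network retrieves the most similar stored vector.
probe = patterns[0].copy()
probe[0] = -probe[0]
recovered = recall(probe)
```

Here `recovered` equals the first stored pattern: the flipped unit is pulled back because the symmetric weights encode the pairwise correlations of the original memory.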
RNNs consist of a rich set of architectures (LSTM being one of the most popular RNN topologies). The key differentiator is feedback within the network, which can come from a hidden layer, the output layer, or some combination of the two.
Next, we will go deeper into deep learning techniques such as RNNs, CNNs, and DBNs, see how they work through practical examples, and look at the advantages of deep learning techniques.