Summary | The latest progress of deep learning

Author: Data School THU | Time: 2022.07.06

Source: Machine Learning Institute

This article is about 10,500 words; a 20+ minute read is recommended.

In this article, we will briefly discuss the latest progress in deep learning in recent years.

"Overview is always one of the fastest shortcuts in the new field of entry!"

Recent Advances in Deep Learning: An Overview

Abstract: Deep learning is one of the latest trends in machine learning and artificial intelligence research, and one of the most popular research directions today. Deep learning methods have brought revolutionary progress to computer vision and machine learning. New deep learning techniques are constantly emerging, surpassing state-of-the-art machine learning and even earlier deep learning techniques. In recent years, the field has seen many major breakthroughs worldwide. Because deep learning is evolving so quickly, it is difficult to keep up, especially for new researchers. In this article, we will briefly discuss the recent progress of deep learning.

1 Introduction

The term deep learning (DL) was first introduced into machine learning (ML) in 1986 and was later applied to artificial neural networks (ANN) in 2000. Deep neural networks consist of multiple hidden layers that learn features of the data at multiple levels of abstraction. DL methods allow computers to learn complex concepts from relatively simple ones. For ANNs, deep learning (also known as hierarchical learning) uses deep architectures with multiple levels of abstraction, i.e., stacked non-linear operations, to learn complex functions; for example, ANNs with many hidden layers. To summarize precisely, deep learning is a sub-field of machine learning that uses multiple levels of non-linear information processing and abstraction, in supervised, unsupervised, semi-supervised, self-supervised, weakly supervised and other settings, for representation, classification, regression, and pattern recognition.

In other words, deep learning is a branch or sub-field of machine learning. Most people agree that modern deep learning methods have developed since 2006. This article is a review of recent deep learning techniques, mainly recommended for researchers who are about to enter the field. It covers the basic ideas, main methods, latest progress, and applications of DL.

Review papers are very beneficial, especially for new researchers in a specific field. A research field that holds great value in the near term, together with its related application areas, is usually hard to track in real time. Scientific research is now a very attractive profession, because knowledge and education are easier to share and obtain than ever before. For a trending area of technical research, the only reasonable assumption is that there have been improvements in every aspect. A survey of a field written a few years ago may already be outdated.

Considering the popularity and promotion of deep learning in recent years, we briefly outline deep learning and neural networks (NN), as well as their main progress and major breakthroughs over the years. We hope this article helps many novice researchers gain a full picture of recent deep learning research and techniques, and guides them to start in the right way. At the same time, we wish to pay tribute to the top DL and ANN researchers of this era: Geoffrey Hinton, Juergen Schmidhuber, Yann LeCun, Yoshua Bengio, and many other researchers whose work has built modern artificial intelligence (AI). Following their work to track the best current DL and ML research is also vital to us.

In this paper, we first briefly describe past review papers that studied deep learning models and methods. We then describe the latest progress in this field. We discuss deep learning (DL) methods, deep architectures (i.e., deep neural networks (DNN)), and deep generative models (DGM), followed by important regularization and optimization methods. In addition, two short sections summarize open-source DL frameworks and important DL applications. We discuss the current state and the future of deep learning in the last two sections (i.e., discussion and conclusions).

2 Related Studies

In the past few years, many review papers on deep learning have been published. They describe DL methods and methodologies, as well as their applications and directions for future research, in a clear way. Here, we briefly introduce some excellent reviews of deep learning.

Young et al. (2017) discussed DL models and architectures mainly used for natural language processing (NLP). They showed DL applications in different NLP areas, compared DL models, and discussed possible future trends.

Zhang et al. (2017) discussed the current state-of-the-art deep learning techniques for front-end and back-end speech recognition systems.

Zhu et al. (2017) reviewed recent progress of DL in remote sensing. They also discussed open-source DL frameworks and other technical details of deep learning.

Wang et al. (2017) described the evolution of deep learning models in chronological order. The article briefly introduces the models and breakthroughs in DL research, traces the origins of deep learning through its evolution, and discusses the optimization of neural networks and future research. Goodfellow et al. (2016) discussed deep networks and generative models in detail. Starting from the basics of machine learning (ML) and the advantages and disadvantages of deep architectures, they summarized recent DL research and applications.

LeCun et al. (2015) gave an overview of deep learning (DL) models based on convolutional neural networks (CNN) and recurrent neural networks (RNN). They described DL from the perspective of representation learning, showing how DL techniques work, how they are used successfully in various applications, and how learning might be based on unsupervised learning (UL) in the future. They also pointed out the major DL advances in the bibliography.

Schmidhuber (2015) gave an overview of deep learning covering CNNs, RNNs, and deep reinforcement learning (RL). He emphasized sequence processing with RNNs, pointed out the limitations of basic DL and NNs, and gave tips for improving them.

Nielsen (2015) described neural networks in detail with code and examples. He also discussed deep neural networks and deep learning to some extent.

Schmidhuber (2014) discussed the history and progress of neural networks for time series, the use of machine learning methods for classification, and the use of deep learning in neural networks.

Deng and Yu (2014) described deep learning classes and techniques, as well as DL applications in several areas.

Bengio (2013) gave a brief overview of DL algorithms from the perspective of representation learning, i.e., supervised and unsupervised networks, optimization, and training models. He focused on many challenges of deep learning, such as scaling algorithms to larger models and data, reducing optimization difficulties, and designing efficient scaling methods.

Bengio et al. (2013) discussed representation learning and feature learning, i.e., deep learning. They discussed various methods and models from the perspectives of applications, techniques, and challenges.

Deng (2011) summarized deep structured learning and its architectures from the perspective of information processing and related fields.

Arel et al. (2010) briefly reviewed DL techniques of recent years.

Bengio (2009) discussed deep architectures, i.e., neural networks and generative models for artificial intelligence.

All the recent papers on deep learning (DL) discussed above cover the key points of deep learning from multiple angles, which is very necessary for DL researchers. However, DL is currently a booming field. Many new techniques and architectures have been proposed since those recent DL overview papers were published. In addition, previous papers study the topic from different perspectives. Our paper is mainly aimed at learners and novices who have just entered the field. To this end, we strive to provide a foundation and clear concepts of deep learning for new researchers and anyone interested in this field.

3 The latest progress

In this section, we will discuss the main deep learning (DL) methods derived from recent machine learning and artificial neural networks (ANN). Artificial neural networks are the most commonly used form of deep learning.

3.1 Deep architecture evolution

Artificial neural networks (ANN) have come a long way and have given rise to other deep models. The first generation of ANNs consisted of simple perceptron layers and could only perform limited simple computations. The second generation used backpropagation; the core of the backpropagation algorithm is the chain rule of differentiation, i.e., the derivative (or gradient) of the objective function with respect to the output layer is propagated back, layer by layer, all the way to the first (input) layer. Finally, features are passed through a non-linear activation function to obtain the classification result. (The most popular non-linear activation function today is ReLU. Compared with the previously popular tanh and sigmoid activations, ReLU learns faster and allows deep networks to be trained directly without pre-training. Long ago, the vanishing or exploding gradient problem was handled with pre-training schemes.) Then support vector machines (SVM) appeared and surpassed ANNs for a period of time. To overcome the limitations of backpropagation (vanishing and exploding gradients), restricted Boltzmann machines (RBM) were proposed to make learning easier (in that second generation, concepts such as ReLU and BN layers had not yet been introduced to overcome the problems of backpropagation). Other techniques and neural networks also appeared, for example feedforward neural networks (FNN, i.e., the common DNN; the names differ, and other names include MLP and the like — I find it more convenient to just remember it as DNN), convolutional neural networks (CNN), recurrent neural networks (RNN), etc., as well as deep belief networks, autoencoders, GANs, and so on. Since then, ANNs have been improved and redesigned in different ways for various purposes. Schmidhuber (2014), Bengio (2009), Deng and Yu (2014), Goodfellow et al. (2016), and Wang et al. (2017) give detailed summaries of the evolution and history of deep neural networks (DNN) and deep learning (DL). In most cases, a deep architecture is a multi-layer non-linear repetition of a simple architecture, which makes it possible to obtain a highly complex function of the input.
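To make the remark about activation functions above concrete, here is a minimal NumPy sketch (an illustration added here, not part of the original article; the sample inputs are arbitrary) contrasting the gradient of the saturating sigmoid with that of ReLU, whose gradient is exactly 1 for positive inputs and therefore does not shrink when multiplied through many layers by the chain rule:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # at most 0.25; tends to 0 for large |x|

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    return (x > 0).astype(x.dtype)  # exactly 1 for positive inputs, 0 otherwise

x = np.array([-5.0, -1.0, 0.5, 5.0])
print("sigmoid grad:", sigmoid_grad(x))  # small values -> their products vanish in deep nets
print("relu grad:   ", relu_grad(x))     # 0/1 values -> gradients survive many layers
```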

4 Deep learning methods

Deep neural networks have achieved great success in supervised learning. In addition, deep learning models have also been very successful in unsupervised, hybrid, and reinforcement learning.

4.1 Deep supervised learning

Supervised learning is applied when the data is labeled, using a classifier for classification or a regressor for numerical prediction. LeCun et al. (2015) gave a streamlined explanation of supervised learning methods and how deep structures are formed. Deng and Yu (2014) mentioned and explained many deep networks for supervised and hybrid learning, such as deep stacking networks (DSN) and their variants. Schmidhuber (2014) covers all neural networks, from early neural networks to the recently successful convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory (LSTM), and their improvements.

4.2 Deep unsupervised learning

When input data is unlabeled, unsupervised learning methods can be used to extract features from the data and to classify or label it. LeCun et al. (2015) predicted the future of unsupervised learning in deep learning. Schmidhuber (2014) also described neural networks for unsupervised learning. Deng and Yu (2014) briefly introduced deep architectures for unsupervised learning and explained deep autoencoders in detail.

4.3 Deep reinforcement learning

Reinforcement learning uses a reward-and-penalty system to predict the next step of a learning model. It is mainly used in games and robotics to solve decision-making problems. Schmidhuber (2014) described the progress of deep learning in reinforcement learning (RL), as well as applications of deep feedforward neural networks (FNN) and recurrent neural networks (RNN) in RL. Li (2017) discussed deep reinforcement learning (DRL), its architectures (such as the Deep Q-Network, DQN), and applications in various fields. (For details, see the second edition of "Reinforcement Learning".)

Mnih et al. (2016) proposed a DRL framework that uses asynchronous gradient descent.

Van Hasselt et al. (2015) proposed a DRL architecture that uses deep neural networks (DNN).
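As a toy illustration of the value target that DQN-style deep reinforcement learning methods approximate with a neural network, here is a minimal tabular Q-learning update in NumPy (the environment size, learning rate, and discount factor are assumptions for illustration only, not details from the papers cited above):

```python
import numpy as np

n_states, n_actions = 5, 2
alpha, gamma = 0.1, 0.99              # learning rate and discount factor (assumed)
Q = np.zeros((n_states, n_actions))   # in DQN, this table is replaced by a deep network

def q_update(s, a, r, s_next, done):
    # reward plus discounted value of the best next action (the Bellman target)
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])   # move the estimate toward the target

q_update(s=0, a=1, r=1.0, s_next=2, done=False)
print(Q[0])
```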

5 Deep neural network

In this section, we will briefly discuss deep neural networks (DNN) and their recent improvement and breakthroughs. The function of neural networks is similar to the human brain. They are mainly composed of neurons and connections. When we say deep neural networks, we can assume that there are quite a few hidden layers that can be used to extract features and calculate complex functions from the input.

Bengio (2009) explained deep-architecture neural networks, such as convolutional neural networks (CNN), autoencoders (AE), and their variants. Deng and Yu (2014) introduced some neural network architectures in detail, such as AE and its variants. Goodfellow et al. (2016) introduced deep feedforward networks, convolutional networks, recurrent networks, and their improvements. Schmidhuber (2014) covered the complete history of neural networks, from early neural networks to recently successful techniques.

5.1 Deep autoencoders

An autoencoder (AE) is a neural network (NN) whose output is its input. An AE takes the raw input, encodes it into a compressed representation, and then decodes it to reconstruct the input. In a deep AE, the lower hidden layers are used for encoding, the higher hidden layers are used for decoding, and the reconstruction error is used for training.
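As a concrete illustration of this encode-compress-decode structure, here is a minimal PyTorch sketch of a deep autoencoder (the layer sizes, ReLU activations, and MSE reconstruction loss are illustrative assumptions, not details taken from the article):

```python
import torch
import torch.nn as nn

class DeepAutoencoder(nn.Module):
    def __init__(self, in_dim=784, code_dim=32):
        super().__init__()
        # lower layers encode (compress) the input ...
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim), nn.ReLU(),
        )
        # ... higher layers decode (reconstruct) it
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, in_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = DeepAutoencoder()
x = torch.rand(16, 784)                       # a dummy batch of flattened inputs
loss = nn.functional.mse_loss(model(x), x)    # reconstruction error drives training
loss.backward()
```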

5.1.1 Variational autoencoders

A variational autoencoder (VAE) can be viewed as a decoder. VAEs are built on standard neural networks and can be trained with stochastic gradient descent (Doersch, 2016).

5.1.2 Stacked denoising autoencoders

In early autoencoders (AE), the encoding layer had a smaller (narrower) dimension than the input layer. In stacked denoising autoencoders (SDAE), the encoding layer is wider than the input layer (Deng and Yu, 2014).

5.1.3 Transforming autoencoders

Deep autoencoders (DAE) can be transformational, i.e., the features extracted through multi-layer non-linear processing can be changed according to the learner's needs. A transforming autoencoder (TAE) can use either the input vector or the target output vector to apply transformation-invariance properties and guide the code in a desired direction (Deng and Yu, 2014).

5.2 Deep convolutional neural network

Four basic ideas constitute convolutional neural networks (CNN): local connections, shared weights, pooling, and the use of multiple layers. The first part of a CNN consists of convolutional and pooling layers, and the latter part consists mainly of fully connected layers. The convolutional layers detect local combinations of features, and the pooling layers merge similar features into one. CNNs use convolutions instead of matrix multiplications in the convolutional layers.
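The following minimal PyTorch sketch shows that structure: convolution and pooling layers followed by a fully connected layer (the channel counts, kernel sizes, and input shape are illustrative assumptions, not taken from any particular paper):

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # local connections, shared weights
            nn.ReLU(),
            nn.MaxPool2d(2),                             # pooling merges similar features
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, num_classes)  # fully connected part

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = SimpleCNN()(torch.rand(8, 1, 28, 28))  # e.g. a batch of 28x28 grayscale images
print(logits.shape)                             # torch.Size([8, 10])
```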

Krizhevsky et al. (2012) proposed a deep convolutional neural network (CNN) architecture, known as AlexNet, which was a major breakthrough in deep learning (DL). The network consists of 5 convolutional layers and 3 fully connected layers. The architecture uses graphics processing units (GPU) for the convolution computations, the rectified linear unit (ReLU) as the activation function, and Dropout to reduce overfitting.

Iandola et al. (2016) proposed a small CNN architecture called SqueezeNet.

Szegedy et al. (2014) proposed a deep CNN architecture called Inception. Dai et al. (2017) proposed improvements to Inception-ResNet.

Redmon et al. (2015) proposed a CNN architecture called YOLO (You Only Look Once) for unified, real-time object detection.

Zeiler and Fergus (2013) proposed a method for visualizing and understanding CNNs.

Gehring et al. (2017) proposed a CNN architecture for sequence-to-sequence learning.

Bansal et al. (2017) proposed PixelNet, which uses pixels for representations.

Goodfellow et al. (2016) explained the basic architecture and ideas of CNNs. Gu et al. (2015) gave a good overview of recent progress in CNNs: CNN variants and architectures, regularization methods, loss functions, and applications in various fields.

5.2.1 Deep max-pooling convolutional neural networks

The max-pooling convolutional neural network (MPCNN) mainly operates on convolution and max pooling, especially in digital image processing. MPCNNs usually consist of three types of layers besides the input layer. The convolutional layers take the input images and generate feature maps, after which a non-linear activation function is applied. The max-pooling layers down-sample the images and keep the maximum value of each sub-region. The fully connected layers perform linear multiplications. In a deep MPCNN, convolution and max pooling are used periodically after the input layer, followed by fully connected layers.

5.2.2 Very deep convolutional neural network

Simonyan and Zisserman (2014) proposed the very deep convolutional neural network (VDCNN) architecture, also known as VGG Net. VGG Net uses very small convolutional filters with a depth of 16-19 layers. Conneau et al. (2016) proposed another VDCNN architecture for text classification, which uses small convolutions and pooling. They claim this VDCNN architecture was the first to be used in text processing and that it works at the character level. The architecture consists of 29 convolutional layers.

5.3 Network in network

Lin et al. (2013) proposed Network In Network (NIN). NIN replaces the convolutional layers of traditional convolutional neural networks (CNN) with micro neural networks of more complex structure. It uses multilayer perceptron convolution (MLPConv) as the micro neural network and a global average pooling layer instead of fully connected layers. A deep NIN architecture can be composed of multiple stacked NIN structures.

5.4 Regional convolutional neural network

Girshick et al. (2014) proposed region-based convolutional neural networks (R-CNN), which use regions for recognition. R-CNN uses regions to localize and segment objects. The architecture consists of three modules: category-independent region proposals that define the set of candidate regions, a large convolutional neural network (CNN) that extracts features from the regions, and a set of class-specific linear support vector machines (SVM).

5.4.1 Fast R-CNN

Girshick (2015) proposed the Fast Region-based Convolutional Network (Fast R-CNN). This method builds on the R-CNN architecture to produce results quickly. Fast R-CNN consists of convolutional and pooling layers, region proposals, and a series of fully connected layers.

5.4.2 Faster R-CNN

Ren et al. (2015) proposed Faster R-CNN, a faster region-based convolutional neural network that uses a region proposal network (RPN) for real-time object detection. An RPN is a fully convolutional network that generates region proposals accurately and efficiently (Ren et al., 2015).

5.4.3 Mask R-CNN

He Kaiming et al. (2017) proposed Mask R-CNN for region-based instance segmentation. Mask R-CNN extends the Faster R-CNN architecture and uses an additional branch to predict the object mask.

5.4.4 Multi-EXPERT R-CNN

Lee et al. (2017) proposed the Multi-Expert Region-based convolutional neural network (ME R-CNN), which builds on the Fast R-CNN architecture. ME R-CNN generates regions of interest (RoI) from selective and exhaustive search. It also uses a per-RoI multi-expert network instead of a single per-RoI network. Each expert has the same architecture taken from Fast R-CNN.

5.5 Deep residual network

The residual network (ResNet) proposed by He et al. (2015) consists of 152 layers. ResNet has lower error and is easier to train thanks to residual learning. Deeper ResNets achieve better performance. ResNet is considered an important advance in the field of deep learning.
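The core of residual learning can be sketched in a few lines of PyTorch: the stacked layers learn a residual F(x) and the block outputs F(x) + x, so gradients can flow through the identity shortcut (the channel count and layer choices below are illustrative assumptions, not the exact ResNet configuration):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # identity shortcut: output = F(x) + x

y = ResidualBlock()(torch.rand(2, 64, 32, 32))
print(y.shape)  # torch.Size([2, 64, 32, 32])
```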

5.5.1 ResNet in ResNet

Targ et al. (2016) proposed ResNet in ResNet (RiR), which combines ResNets and standard convolutional neural networks (CNN) in a deep dual-stream architecture.

5.5.2 ResNeXt

Xie et al. (2016) proposed ResNeXt. ResNeXt exploits ResNets and reuses the split-transform-merge strategy.

5.6 Capsule Network

Sabour et al. (2017) proposed capsule networks (CapsNet), an architecture containing two convolutional layers and one fully connected layer. CapsNet usually contains several convolutional layers with a capsule layer at the end. CapsNet is considered one of the latest breakthroughs in deep learning, since it is said to address the limitations of convolutional neural networks. It uses layers of capsules instead of neurons. Activated lower-level capsules make predictions, and a higher-level capsule becomes active once multiple predictions agree. A routing-by-agreement mechanism is used between these capsule layers. Hinton later proposed EM routing, improving CapsNet with the expectation-maximization (EM) algorithm.

5.7 Recurrent neural networks

Recurrent neural networks (RNN) are better suited to sequential inputs such as speech and text, and to generating sequences. A repeated hidden unit, when unrolled in time, can be viewed as a very deep feedforward network with the same weights at every step. RNNs used to be difficult to train because of the vanishing-gradient and exploding-gradient problems; many improvements were later proposed to solve this. Goodfellow et al. (2016) analyzed in detail the architecture of recurrent and recursive neural networks, as well as the related gating and memory networks.
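The weight sharing across time steps mentioned above can be seen in a minimal NumPy sketch of a vanilla recurrent cell (the dimensions and the tanh non-linearity are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 8, 16, 5
W_xh = rng.normal(scale=0.1, size=(hidden_dim, input_dim))   # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))  # hidden-to-hidden weights
b_h = np.zeros(hidden_dim)

x_seq = rng.normal(size=(T, input_dim))   # a toy input sequence of length T
h = np.zeros(hidden_dim)
for t in range(T):
    # the same W_xh and W_hh are reused at every step, so the unrolled
    # computation is a very deep feedforward network with tied weights
    h = np.tanh(W_xh @ x_seq[t] + W_hh @ h + b_h)
print(h.shape)  # (16,)
```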

Karpathy et al. (2015) used character-level language models to analyze and visualize predictions, characterize training dynamics, and study the error types of RNNs and their variants (such as LSTM).

Józefowicz et al. (2016) discussed the limitations of RNN models and language models.

5.7.1 RNN-EM

Peng and Yao (2015) proposed using external memory (RNN-EM) to improve the memory capacity of RNNs. They claim to have achieved state-of-the-art results in language understanding, better than other RNNs.

5.7.2 GF-RNN

Chung et al. (2015) proposed gated-feedback recurrent neural networks (GF-RNN), which extend standard RNNs by stacking multiple recurrent layers with global gating units.

5.7.3 CRF-RNN

Zheng et al. (2015) proposed conditional random fields as recurrent neural networks (CRF-RNN), which combine convolutional neural networks (CNN) and conditional random fields (CRF) for probabilistic graphical modeling.

5.7.4 Quasi-RNN

Bradbury et al. (2016) proposed quasi-recurrent neural networks (QRNN) for neural sequence modeling, applied in parallel across time steps.

5.8 Memory Network

Weston et al. (2014) proposed memory networks for question answering (QA). A memory network consists of a memory, an input feature map, generalization, an output feature map, and a response.

5.8.1 Dynamic memory network

Kumar et al. (2015) proposed dynamic memory networks (DMN) for QA tasks. A DMN has four modules: input, question, episodic memory, and output.

5.9 Augmented neural networks

Olah and Carter (2016) nicely demonstrated attention and augmented recurrent neural networks, i.e., neural Turing machines (NTM), attentional interfaces, neural programmers, and adaptive computation time. Augmented neural networks usually use extra properties, such as logic functions, along with the standard neural network architecture.

5.9.1 Neural Turing machines

Graves et al. (2014) proposed the neural Turing machine (NTM) architecture, consisting of a neural network controller and a memory bank. NTMs usually combine RNNs with external memory.

5.9.2 Neural GPU

Kaiser and Sutskever (2015) proposed the Neural GPU, which addresses the parallelism problem of NTMs.

5.9.3 Neural random-access machines

Kurach et al. (2015) proposed neural random-access machines, which use external variable-size random-access memory.

5.9.4 Neural programmer

Nelakantan et al. (2015) proposed the neural programmer, an augmented neural network with arithmetic and logic functions.

5.9.5 Neural programmer-interpreter

Reed and de Freitas (2015) proposed the neural programmer-interpreter (NPI), which can learn. NPI includes a recurrent core, program memory, and domain-specific encoders.

5.10 Long short-term memory networks

Hochreiter and Schmidhuber (1997) proposed long short-term memory (LSTM), which overcomes the error back-flow problems of recurrent neural networks (RNN). LSTM is based on recurrent networks and gradient-based learning algorithms. LSTM introduces self-loop paths so that gradients can keep flowing.
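A minimal NumPy sketch of one step of a standard LSTM cell is shown below (parameter shapes and initialization are illustrative assumptions). The largely additive cell update c = f*c + i*g is the self-looping path that lets gradients flow over long time spans:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    # x: input, h: hidden state, c: cell state; W, U, b pack all four gates
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)                  # input, forget, output gates + candidate
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c_new = f * c + i * g                        # gated, largely additive cell update
    h_new = o * np.tanh(c_new)
    return h_new, c_new

d, hdim = 8, 16
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * hdim, d))
U = rng.normal(scale=0.1, size=(4 * hdim, hdim))
b = np.zeros(4 * hdim)
h, c = np.zeros(hdim), np.zeros(hdim)
h, c = lstm_step(rng.normal(size=d), h, c, W, U, b)
```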

Greff et al. (2017) conducted a large-scale analysis of the standard LSTM and 8 LSTM variants on speech recognition, handwriting recognition, and polyphonic music modeling. They found that the 8 LSTM variants showed no significant improvement and only the standard LSTM performed well.

Shi et al. (2016b) proposed deep long short-term memory networks (DLSTM), a stack of LSTM units, for feature-mapping learning.

5.10.1 Batch-normalized LSTM

Cooijmans et al. (2016) proposed batch-normalized LSTM (BN-LSTM), which applies batch normalization to the hidden states of recurrent neural networks.

5.10.2 Pixel RNN

Van den Oord et al. (2016b) proposed pixel recurrent neural networks (PixelRNN), consisting of 12 two-dimensional LSTM layers.

5.10.3 Bidirectional LSTM

Wöllmer et al. (2010) proposed bidirectional LSTM (BLSTM) recurrent networks, used together with dynamic Bayesian networks (DBN), for context-sensitive keyword detection.

5.10.4 Variational bi-LSTM

Shabanian et al. (2017) proposed variational bi-LSTM, a variant of the bidirectional LSTM architecture. Variational bi-LSTM uses a variational autoencoder (VAE) to create an information-exchange channel between the LSTMs in order to learn better representations.

5.11 Google neural machine translation

Wu et al. (2016) proposed an automated translation system called Google Neural Machine Translation (GNMT). The system combines an encoder network, a decoder network, and an attention network, following the common sequence-to-sequence learning framework.

5.12 Fader Network

Lample et al. (2017) proposed Fader networks, a new type of encoder-decoder architecture that generates realistic variations of input images by changing attribute values.

5.13 Hypernetworks

Ha et al. (2016) proposed hypernetworks (HyperNetworks), which generate weights for other neural networks, such as static hypernetworks for convolutional networks and dynamic hypernetworks for recurrent networks.

Deutsch (2018) used a hypernetwork to generate neural networks.

5.14 Highway networks

Srivastava et al. (2015) proposed highway networks, which use gating units to learn to regulate the flow of information. Information flow across multiple layers is called an information highway.

5.14.1 Recurrent highway networks

Zilly et al. (2017) proposed recurrent highway networks (RHN), which extend the long short-term memory (LSTM) architecture. RHNs use highway layers inside the recurrent transition.

5.15 Highway LSTM RNN

Zhang et al. (2016) proposed highway long short-term memory (HLSTM) RNNs, which extend deep LSTM networks with gated direct connections (i.e., highways) between memory cells in adjacent layers.

5.16 Long-term recurrent CNN

Donahue et al. (2014) proposed long-term recurrent convolutional networks (LRCN), which use a CNN on the input and then LSTMs for recurrent sequence modeling and generating predictions.

5.17 Deep neural SVM

Zhang et al. (2015) proposed the deep neural SVM (DNSVM), which uses a support vector machine (SVM) as the top layer of a deep neural network.

5.18 Convolutional residual memory networks

Moniz and Pal (2016) proposed convolutional residual memory networks, which incorporate a memory mechanism into convolutional neural networks (CNN). They augment convolutional residual networks with a long short-term memory mechanism.

5.19 Fractal networks

Larsson et al. (2016) proposed fractal networks, i.e., FractalNet, as an alternative to residual networks. They claim that ultra-deep neural networks can be trained without residual learning. A fractal is a repeated architecture generated by a simple expansion rule.

5.20 WaveNet

Van den Oord et al. (2016) proposed WaveNet, a deep neural network for generating raw audio. WaveNet consists of a stack of convolutional layers and a softmax distribution layer for the output.

Rethage et al. (2017) proposed a WaveNet model for speech denoising.

5.21 Pointer networks

Vinyals et al. (2017) proposed pointer networks (Ptr-Nets), which solve the problem of representing variable-size dictionaries by using a softmax probability distribution called a "pointer".

6 Deep generative models

In this section, we will briefly discuss other deep architectures that use multiple levels of abstraction and representation similar to deep neural networks, known as deep generative models (DGM). Bengio (2009) explained deep architectures such as Boltzmann machines (BM) and restricted Boltzmann machines (RBM) and their variants. Goodfellow et al. (2016) explained in detail deep generative models such as restricted and unrestricted Boltzmann machines and their variants, deep Boltzmann machines, deep belief networks (DBN), directed generative networks, and generative stochastic networks.

Maaløe et al. (2016) proposed auxiliary deep generative models, in which they extend deep generative models with auxiliary variables. The auxiliary variables use stochastic layers and skip connections to produce the variational distribution.

Rezende et al. (2016) developed one-shot generalization of deep generative models.

6.1 Boltzmann machines

Boltzmann machines are a connectionist approach for learning arbitrary probability distributions, using the maximum likelihood principle for learning.

6.2 Restricted Boltzmann machines

Restricted Boltzmann machines (RBM) are a special type of Markov random field, containing one layer of stochastic hidden units, i.e., latent variables, and one layer of observable variables.
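The factorized conditionals that make RBMs tractable can be sketched in NumPy: given the visible units, the hidden units are conditionally independent, and vice versa, so block Gibbs sampling alternates between the two layers (the sizes and random initialization below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 4
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))   # visible-hidden coupling weights
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)      # biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

v = rng.integers(0, 2, size=n_visible).astype(float)    # an observed binary visible vector
p_h = sigmoid(v @ W + b_h)                               # P(h_j = 1 | v), independent per unit
h = (rng.random(n_hidden) < p_h).astype(float)           # sample the hidden layer
p_v = sigmoid(h @ W.T + b_v)                             # P(v_i = 1 | h), used for reconstruction
print(p_h.round(2), p_v.round(2))
```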

Hinton and Salakhutdinov (2011) proposed a deep generative model using restricted Boltzmann machines (RBM) for document processing.

6.3 Deep belief networks

Deep belief networks (DBN) are generative models with several layers of latent binary or real-valued variables.

Ranzato et al. (2011) used deep belief networks (DBN) to build deep generative models for image recognition.

6.4 Deep Lambertian networks

Tang et al. (2012) proposed deep Lambertian networks (DLN), a multi-layer generative model in which the latent variables are the albedo, surface normals, and the light source. DLN is a combination of Lambertian reflectance with Gaussian restricted Boltzmann machines and deep belief networks.

6.5 Generative adversarial networks

Goodfellow et al. (2014) proposed generative adversarial networks (GAN), which estimate generative models via an adversarial process. The GAN architecture consists of a generative model pitted against an adversary, i.e., a discriminative model that learns the model or data distribution. Mao et al. (2016) and Kim et al. (2017) proposed further improvements to GANs.
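A minimal PyTorch sketch of the adversarial training loop is given below: a generator maps noise to samples while a discriminator is trained to tell real data from generated data, and the two are updated alternately (the network sizes and the toy "real" distribution are assumptions for illustration only):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))   # generator
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))    # discriminator
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(100):
    real = torch.randn(32, 2) + 3.0          # toy "real" data distribution
    fake = G(torch.randn(32, 16))            # generated samples from noise

    # discriminator: push real toward 1, fake toward 0
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator: try to make the discriminator output 1 for fake samples
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```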

Salimans et al. (2016) proposed several methods for training GANs.

6.5.1 Laplacian generative adversarial networks

Denton et al. (2015) proposed a deep generative model (DGM) called the Laplacian generative adversarial network (LAPGAN), which uses the generative adversarial network (GAN) approach. The model also uses convolutional networks within a Laplacian pyramid framework.

6.6 Recurrent support vector machines

Shi et al. (2016a) proposed recurrent support vector machines (RSVM), which use a recurrent neural network (RNN) to extract features from the input sequence and a standard support vector machine (SVM) for sequence-level target recognition.

7 Training and optimization techniques

In this section, we will briefly outline some of the major techniques for regularizing and optimizing deep neural networks (DNN).

7.1 Dropout

(There are many extensions, such as DropConnect and others.)

Srivastava et al. (2014) proposed Dropout to prevent neural networks from overfitting. Dropout is a model-averaging regularization method for neural networks, achieved by adding noise to the hidden units. During training, it randomly drops units and their connections from the neural network. Dropout can be used in graphical models such as RBMs (Srivastava et al., 2014), as well as in any type of neural network. A recently proposed improvement of Dropout is Fraternal Dropout, for recurrent neural networks (RNN).
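A minimal NumPy sketch of (inverted) dropout illustrates the idea: during training, hidden units are zeroed at random and the survivors are rescaled, while at test time the layer is left unchanged (the keep probability is an illustrative assumption):

```python
import numpy as np

def dropout(activations, keep_prob=0.5, training=True, rng=np.random.default_rng(0)):
    if not training:
        return activations                             # no dropout at inference time
    mask = rng.random(activations.shape) < keep_prob   # randomly keep each unit
    return activations * mask / keep_prob              # rescale so the expectation is unchanged

h = np.ones((4, 8))                                    # pretend hidden-layer activations
print(dropout(h, keep_prob=0.5))
```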

7.2 Maxout

Goodfellow et al. (2013) proposed Maxout, a new activation function to be used with Dropout. The output of Maxout is the maximum of a set of inputs, which benefits Dropout's model averaging.

7.3 Zoneout

Krueger et al. (2016) proposed Zoneout, a regularization method for recurrent neural networks (RNN). Zoneout injects noise during training, similar to Dropout, but preserves hidden units instead of dropping them.

7.4 Deep residual learning

He et al. (2015) proposed the deep residual learning framework, i.e., ResNets, which achieve lower training error.

7.5 Batch normalization

(BN and its various variants...)

Ioffe and Szegedy (2015) proposed batch normalization, which accelerates deep neural network training by reducing internal covariate shift. Ioffe (2017) proposed batch renormalization, extending the earlier method.
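The core computation of batch normalization can be sketched in NumPy: each feature is standardized with the statistics of the current mini-batch and then scaled and shifted by learnable parameters gamma and beta (here fixed to 1 and 0; the batch size and epsilon are the usual illustrative choices):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                       # per-feature mean over the mini-batch
    var = x.var(axis=0)                         # per-feature variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)     # normalize each feature
    return gamma * x_hat + beta                 # learnable scale and shift

x = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=(32, 10))
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
print(y.mean(axis=0).round(3))                  # ~0 per feature
print(y.std(axis=0).round(3))                   # ~1 per feature
```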

7.6 Distillation

Hinton et al. (2015) proposed a method for transferring (distilling) knowledge from a highly regularized model (i.e., a neural network) into a compressed model.

7.7 Layer normalization

Ba et al. (2016) proposed layer normalization, which accelerates the training of deep neural networks, especially RNNs, and addresses the limitations of batch normalization.

8 Deep learning frameworks

There are a large number of open-source libraries and frameworks for deep learning. Most of them are built for the Python programming language, such as Theano, TensorFlow, PyTorch, PyBrain, Caffe, Blocks and Fuel, cuDNN, Honk, ChainerCV, PyLearn2, Chainer, Torch, etc.

9 Deep learning applications

In this section, we will briefly discuss some outstanding recent applications of deep learning. Since the beginning of deep learning (DL), DL methods have been widely used in various fields in the form of supervised, unsupervised, semi-supervised, or reinforcement learning. Starting from classification and detection tasks, DL applications have been expanding rapidly into every field.

For example:

Image classification and recognition

Video classification

Sequence generation

Defect classification

Text, speech, image and video processing

Text classification

Speech processing

Speech recognition and spoken language understanding

Text-to-speech generation

Query classification

Sentence classification

Sentence modeling

Vocabulary processing

Pre-selection

Document and sentence processing

Generating image captions

Photo style transfer

Natural image manifolds

Image colorization

Image question answering

Generating textures and stylized images

Visual and textual question answering

Visual recognition and description

Object recognition

Document processing

Character motion synthesis and editing

Song synthesis

Identification

Face recognition and verification

Video action recognition

Human action recognition

Action recognition

Classification and visualization of motion capture sequences

Handwriting generation and prediction

Automated and machine translation

Named entity recognition

Mobile vision

Dialogue systems

Calling genetic variants

Cancer detection

X-ray CT reconstruction

Seizure prediction

Hardware acceleration

Robotics

And so on.

Deng and Yu (2014) provide a detailed list of DL applications in speech processing, information retrieval, object recognition, computer vision, multi-modal and multi-task learning, and other fields.

Mastering games has become a hot topic. Artificial intelligence agents built with DNNs and DRL have defeated human world champions and grandmasters in strategy and other games, sometimes after only a few hours of training. For example, AlphaGo and AlphaGo Zero for the game of Go.

10 Discussion

Although deep learning has achieved great success in many fields, it still has a long way to go, and there is much room for improvement. As for limitations, there are quite a few examples. For instance, Nguyen et al. showed that deep neural networks (DNN) are easily fooled when recognizing images. There are other issues, such as the transferability of learned features raised by Yosinski et al. Huang et al. proposed an architecture for defending against neural network attacks and argued that future work is needed to defend against such attacks. Zhang et al. proposed an experimental framework for understanding deep learning models; they argue that understanding deep learning requires rethinking generalization.

Marcus gave an important review in 2018 of the role, limitations, and nature of deep learning (DL). He strongly emphasized the limitations of DL methods: they need more data, have limited capacity, cannot handle hierarchical structure, cannot perform open-ended reasoning, are not fully transparent, cannot integrate prior knowledge, and cannot distinguish causation from correlation. He also mentioned that DL assumes a stable world, works only as an approximation, is difficult to engineer, and carries the potential risk of over-hype. Marcus believes that DL needs to be re-conceptualized, that possibilities should be sought in unsupervised learning, symbolic manipulation, and hybrid models, that insights should be drawn from cognitive science and psychology, and that bolder challenges should be embraced.

11 Conclusions

Although deep learning (DL) has advanced the world faster than ever before, there are still many aspects worth studying. We still cannot fully understand deep learning: how we can make machines smarter, closer to or smarter than humans, or learn like humans. DL has been solving many problems and applying its techniques to all aspects of life. But humanity still faces many hard problems, such as people dying from hunger and food crises, cancer, and other fatal diseases. We hope that deep learning and artificial intelligence will be more committed to improving the quality of human life and to tackling the hardest scientific problems. Last but not least, may our world become a better place.

Some parts are left out, but on the whole this is a nice summary.

Article Source:

https://zhuanlan.zhihu.com/p/85625555

Edit: Yu Tengkai

- END -
