Image segmentation based on deep learning
Image segmentation
Saliency detection: finding the object region that most attracts human visual attention
Object segmentation: Graph Cut / GrabCut
Semantic segmentation: hand-crafted features + graphical model; recognizes image content
Three large datasets commonly used in deep learning: Pascal VOC, MS COCO, Cityscapes
1. Saliency detection: uses a modified VGG network
Foreground/background segmentation: the foreground contains the object; user interaction is needed to provide markup.
2. Object segmentation:
Graph Cut segmentation: a graph-based segmentation method (the graph is built by connecting pixels)
GrabCut segmentation: foreground and background color models
Gaussian mixture model (GMM)
K-means algorithm
1. The foreground and background color models are optimized
2. The energy decreases with each iteration
3. The segmentation result gets better and better
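As an illustration of the K-means step listed above: GrabCut implementations commonly use K-means to split the foreground (or background) pixels into clusters that initialize the Gaussian mixture components. The sketch below is a minimal plain-Python version under stated assumptions; the names `sq_dist` and `kmeans_colors` and the farthest-point initialization are choices made here for illustration and do not come from the original notes.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels, k, iters=10):
    """Cluster RGB tuples into k groups; returns (centers, labels)."""
    # Assumption: deterministic farthest-point initialization.
    centers = [tuple(map(float, pixels[0]))]
    while len(centers) < k:
        far = max(pixels, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(tuple(map(float, far)))

    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in pixels]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members)
                                   for ch in zip(*members))
    return centers, labels


pixels = [(250, 0, 0), (255, 5, 5), (0, 0, 250), (5, 5, 255)]
centers, labels = kmeans_colors(pixels, k=2)
```

With these four pixels the two reddish and the two bluish colors end up in separate clusters. In a full GrabCut loop, each cluster would then seed one Gaussian component of the color model.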
3. Semantic segmentation:
Understanding and recognizing image content at the pixel level
Segmentation based on semantic information
Earlier approach (before 2015): hand-crafted features + graphical model (CRF)
From 2015 on: deep neural network models
Problems with a traditional CNN: the later layers carry no spatial information, and the input size is fixed.
DeepLab
1. Optimized DCNN + traditional CRF (conditional random field)
2. New upsampling convolution: dilated (atrous) convolution with a hole structure
3. Multi-scale image representation: spatial pyramid pooling
4. Boundary refinement: iterative optimization with a fully connected CRF
Module 1: the DCNN outputs a coarse segmentation result
Module 2: the fully connected CRF refines the segmentation result
Hole (atrous) algorithm:
1. Solves the problem of non-dense output
2. Reduces the downsampling factor of the pooling layers
VGG-16 pooling stride changed from 2 to 1
3. Reduces the overall downsampling factor: 32 -> 8
4. The receptive field would be affected
Change the structure to convolution with holes (atrous convolution)
Upsampling function
Guarantees a dense final output of the network (only 8x downsampling)
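The stride arithmetic and the hole structure above can be sketched as follows. The sketch assumes VGG-16's five stride-2 pooling layers and assumes that the last two are the ones switched to stride 1. `output_stride` and `atrous_conv1d` are hypothetical helper names, and the convolution is reduced to one dimension for brevity.

```python
from math import prod


def output_stride(pool_strides):
    """Total downsampling factor is the product of all pooling strides."""
    return prod(pool_strides)


def atrous_conv1d(signal, kernel, rate):
    """Valid 1-D cross-correlation with (rate - 1) holes between kernel taps."""
    span = (len(kernel) - 1) * rate + 1  # extent covered by the dilated kernel
    return [
        sum(w * signal[i + t * rate] for t, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


original = output_stride([2, 2, 2, 2, 2])   # all five pools keep stride 2
modified = output_stride([2, 2, 2, 1, 1])   # last two pools set to stride 1

signal = [1, 2, 3, 4, 5, 6, 7]
dense = atrous_conv1d(signal, [1, 0, -1], rate=1)
holes = atrous_conv1d(signal, [1, 0, -1], rate=2)
```

With `rate=2` the same three weights cover five input samples instead of three. This is how the receptive field lost by removing the stride can be recovered without adding parameters.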
Spatial pyramid pooling in the DCNN (atrous spatial pyramid pooling)
Different receptive fields (rates) capture features at different scales
4 dilated convolutions are introduced in the conv6 layer
Rates: 6, 12, 18, 24
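To make the effect of the four rates concrete, here is a small sketch that computes the extent each dilated kernel covers and fuses several rates over one signal. It assumes 3x3 kernels (so `kernel_size=3`) and fuses the branches by summation. Both are assumptions for illustration, and all function names are hypothetical.

```python
def effective_kernel_size(kernel_size, rate):
    """Extent covered by a kernel whose taps are spaced `rate` apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_same_1d(signal, kernel, rate):
    """1-D atrous convolution, zero-padded so output length equals input length."""
    half = len(kernel) // 2  # assumes an odd kernel length
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for t, w in enumerate(kernel):
            j = i + (t - half) * rate
            if 0 <= j < n:  # taps that fall outside the signal read zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates):
    """Run one branch per rate and fuse the branches by element-wise sum."""
    branches = [atrous_same_1d(signal, kernel, r) for r in rates]
    return [sum(vals) for vals in zip(*branches)]


sizes = [effective_kernel_size(3, r) for r in (6, 12, 18, 24)]
fused = aspp_1d([1.0] * 5, [1, 1, 1], rates=[1, 2])
```

Under the 3x3 assumption, the rates 6, 12, 18 and 24 cover windows of 13, 25, 37 and 49 positions per side, so each branch sees context at a different scale.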
Fully convolutional network (FCN): all layers are convolutional layers; addresses the low-resolution problem after downsampling (the FCN is built from AlexNet)
Convolutionalization: all fully connected layers are converted into convolutional layers, so the network accepts input of any size and outputs a low-resolution map
Deconvolution: the low-resolution map is upsampled to an output with the same resolution as the input image
Skip connections: refine the segmentation map (rather than a direct 32x deconvolution alone, the outputs of two earlier convolutional layers are fused in)
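The conversion of a fully connected layer into a convolutional layer can be illustrated with a minimal sketch. A fully connected layer trained on a kh x kw input is treated as a single kh x kw convolution kernel. On an input of the training size it gives one value, and on a larger input it gives a coarse map. The function name `fc_as_conv` and the toy weights are hypothetical.

```python
def fc_as_conv(image, weights):
    """Slide FC weights (reshaped to kh x kw) over a 2-D input, stride 1, no padding."""
    kh, kw = len(weights), len(weights[0])
    h, w = len(image), len(image[0])
    return [
        [
            sum(weights[a][b] * image[i + a][j + b]
                for a in range(kh) for b in range(kw))
            for j in range(w - kw + 1)
        ]
        for i in range(h - kh + 1)
    ]


weights = [[1, 2], [3, 4]]                    # FC layer for a 2x2 input
small = fc_as_conv([[1, 1], [1, 1]], weights)  # same size as training input
large = fc_as_conv([[1] * 3 for _ in range(3)], weights)  # larger input
```

The 2x2 input yields a single score, exactly what the fully connected layer would compute, while the 3x3 input yields a 2x2 score map. That coarse map is what the deconvolution step then upsamples.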
Fully connected CRF
Iteratively refines the segmentation result (restores exact boundaries)
Input: the FCN network output, upsampled 8x by bilinear interpolation
The result of the previous iteration
The energy is computed from the RGB pixel values of the image
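The dependence of the energy on RGB values can be sketched with the pairwise kernel usually used in fully connected CRFs: an appearance term that depends on both position and color, plus a smoothness term that depends on position only. This is a sketch under stated assumptions. The function name `pairwise_kernel` and all weights and sigma values are illustrative placeholders, not values from the original notes.

```python
from math import exp


def pairwise_kernel(pos_i, pos_j, rgb_i, rgb_j,
                    w_appearance=4.0, w_smooth=3.0,
                    sigma_alpha=50.0, sigma_beta=5.0, sigma_gamma=3.0):
    """Affinity between two pixels; larger means 'more likely the same label'."""
    d_pos = sum((a - b) ** 2 for a, b in zip(pos_i, pos_j))
    d_rgb = sum((a - b) ** 2 for a, b in zip(rgb_i, rgb_j))
    # Appearance term: nearby pixels with similar color attract strongly.
    appearance = w_appearance * exp(-d_pos / (2 * sigma_alpha ** 2)
                                    - d_rgb / (2 * sigma_beta ** 2))
    # Smoothness term: nearby pixels attract regardless of color.
    smooth = w_smooth * exp(-d_pos / (2 * sigma_gamma ** 2))
    return appearance + smooth


same_color = pairwise_kernel((0, 0), (0, 1), (10, 10, 10), (10, 10, 10))
diff_color = pairwise_kernel((0, 0), (0, 1), (10, 10, 10), (200, 200, 200))
```

Two neighboring pixels with the same color get a higher affinity than two neighbors with very different colors. Label changes are therefore cheaper across color edges, which is what lets the iterations pull the segmentation boundary back onto the object boundary.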
Deep learning for image segmentation
What are the main research directions of image segmentation at present?
Look at the Caffe model zoo, which includes released FCN parameters; these should be at the level of the current frontier.
There are many newer methods now, DeepLab and so on, though their framework is probably not very different from FCN.