Optics & Optoelectronic Technology, Vol. 17, No. 2 (pp. 26-33)

Natural Scene Text Detection Method by Cascaded CNN

Abstract

CNNs have become one of the mainstream approaches in computer vision, particularly for object detection. Text in natural scenes differs from generic objects, so object detection algorithms are not robust when applied to scene text: small text regions are easily missed, and long, narrow text regions are often detected incompletely. To address this problem, the characteristics of natural scene text are analyzed and a text detection method based on cascaded CNNs is proposed. The method uses a detection model to find as many candidate text regions as possible, then uses a classification model to filter them into the final text regions. First, a detection model suited to natural scene text is designed on the basis of the SSD object detection algorithm. Next, non-maximum suppression and region merging are applied to the candidate regions to eliminate the effect of overlapping detections. Finally, a specifically trained ResNet classification model classifies and filters the candidate regions to produce the final detection result. On the ICDAR 2013 dataset, the method achieves a recall of 0.77, a precision of 0.81, and an F-score of 0.79. It is robust for natural scene images, detects small text regions effectively, and markedly improves the completeness of detection for long, narrow text regions.
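
The overlap-removal step and the reported scores can be illustrated with a short sketch. The abstract gives neither the IoU threshold nor the details of the region merging step, so the code below shows only plain greedy non-maximum suppression with an assumed threshold of 0.5, not the authors' implementation. The names `iou`, `nms`, and `f_score` are hypothetical. The `f_score` helper is the standard harmonic mean of precision and recall.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    The default threshold of 0.5 is an assumed value, not one from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already kept,
        # higher-scoring box by more than the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With the reported precision of 0.81 and recall of 0.77, `f_score(0.81, 0.77)` evaluates to about 0.789, which rounds to the reported F-score of 0.79.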

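Supplementary information

Chinese Library Classification number: TP391.4

Column: Laser Technology and Applications

Funding: National Science and Technology Major Project (2017ZX01030102); Key Laboratory of Satellite Mapping Technology and Application, National Administration of Surveying, Mapping and Geoinformation (KLSMTA-201702)

Received: 2018-08-20

Revised: 2018-10-09

Published online: --

Author affiliations

YI Yao-hua (易尧华): Color Digital Imaging Laboratory, Department of Printing and Packaging, Wuhan University, Wuhan 430079, Hubei, China
LIANG Zheng-yu (梁正宇): Color Digital Imaging Laboratory, Department of Printing and Packaging, Wuhan University, Wuhan 430079, Hubei, China
HU Yue (胡越): Department of Basic Courses, Army Engineering University, Nanjing 210007, Jiangsu, China
LU Li-qiong (卢利琼): Color Digital Imaging Laboratory, Department of Printing and Packaging, Wuhan University, Wuhan 430079, Hubei, China

Corresponding author: YI Yao-hua (yyh@whu.edu.cn)

Note: YI Yao-hua (born 1976), male, professor and doctoral supervisor; main research interest: color digital imaging technology.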

Cite this paper

YI Yao-hua, LIANG Zheng-yu, HU Yue, LU Li-qiong. Natural Scene Text Detection Method by Cascaded CNN[J]. Optics & Optoelectronic Technology, 2019, 17(2): 26-33.
