Author Affiliations
Abstract
1 Jiangsu Key Laboratory of Medical Optics, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou 215163, P. R. China
2 School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, P. R. China
3 Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, P. R. China
The prediction of fundus fluorescein angiography (FFA) images from fundus structure images is a cutting-edge topic in ophthalmological image processing. The prediction comprises three tasks: estimating FFA from fundus camera images, single-phase FFA from scanning laser ophthalmoscopy (SLO), and three-phase FFA, also from SLO. Although many deep learning models are available, a single model can typically perform only one or two of these tasks. To accomplish all three with a unified method, we propose a unified deep learning model that predicts FFA images from fundus structure images using a supervised generative adversarial network. The three prediction tasks proceed in the same way: data preparation, network training under FFA supervision, and FFA image prediction from fundus structure images on a test set. Comparing the FFA images predicted by our model with those of pix2pix and CycleGAN demonstrates the clear improvement achieved by our approach. The high performance of our model is validated in terms of the peak signal-to-noise ratio, structural similarity index, and mean squared error.
fundus fluorescein angiography image; fundus structure image; image translation; unified deep learning model; generative adversarial networks
Journal of Innovative Optical Health Sciences
2024, 17(3): 2450003
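The evaluation metrics named in the abstract are standard and easy to reproduce. As a minimal illustration (not the paper's code), MSE and PSNR between a predicted and a ground-truth image can be computed with NumPy; the toy images here are made up:

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two images stored in [0, 255]."""
    return float(np.mean((pred.astype(np.float64) - target.astype(np.float64)) ** 2))

def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means the prediction is closer to ground truth."""
    err = mse(pred, target)
    if err == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / err))

# toy example: a ground-truth 8x8 patch and a prediction with one pixel off by 10
target = np.full((8, 8), 128, dtype=np.uint8)
pred = target.copy()
pred[0, 0] = 138
print(mse(pred, target))   # 10^2 / 64 = 1.5625
print(round(psnr(pred, target), 2))
```

The structural similarity index (SSIM) involves local means, variances, and covariances and is usually taken from a library such as scikit-image rather than re-implemented.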
1 Institute of Materials and Energy Science and Technology, Xijing University, Xi'an 710123, Shaanxi, China
2 Beijing Xinghang Electro-Mechanical Equipment Co., Ltd., Beijing 100074, China
3 Institute of Optoelectronics and Intelligence, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
In computer vision, Siamese-network-based tracking algorithms improve on the accuracy and speed of traditional algorithms, but their performance still degrades under target occlusion, deformation, and environmental changes. To provide an in-depth view of Siamese-network-based single-object tracking, this paper surveys and analyzes existing Siamese tracking algorithms along three main lines: introducing attention mechanisms into the Siamese network, hyperparameter inference methods, and template update methods. Tracking algorithms of these three kinds are reviewed, with a detailed account of recent domestic and international research and development on Siamese-network-based algorithms. Representative algorithms from the three categories are compared experimentally on the VOT2016, VOT2017, VOT2018, and OTB-2015 datasets, yielding performance results for a variety of Siamese-network-based trackers. Finally, the paper summarizes Siamese-network-based tracking and discusses future research directions.
computer vision; target tracking; Siamese networks; deep learning
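The Siamese trackers surveyed above share one core operation: the template's feature map is cross-correlated with the search region's feature map, and the peak of the resulting score map localizes the target. A minimal single-channel NumPy sketch of that operation (illustrative only, not any specific tracker's implementation):

```python
import numpy as np

def xcorr2d(search, template):
    """Dense cross-correlation of a template feature map over a search
    feature map, producing a SiamFC-style response (score) map."""
    sh, sw = search.shape
    th, tw = template.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(search[i:i + th, j:j + tw] * template)
    return out

# toy single-channel "features": the template pattern is hidden at offset (2, 3)
template = np.array([[1.0, 2.0], [3.0, 4.0]])
search = np.zeros((6, 6))
search[2:4, 3:5] = template
score = xcorr2d(search, template)
peak = np.unravel_index(np.argmax(score), score.shape)
print(peak)  # (2, 3): the predicted target location
```

In a real tracker both inputs are multi-channel feature maps from a shared backbone, and the correlation is implemented as a convolution on the GPU.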
1 Software College, Liaoning Technical University, Huludao 125105, Liaoning, China
2 Department of Computer Science, Shantou Polytechnic, Shantou 515071, Guangdong, China
Existing hierarchical text-to-image methods extract features in the initial image-generation stage using upsampling alone. Upsampling is essentially a convolution, and the locality of convolution causes global information to be ignored and prevents long-range semantic interaction. Although some methods add a self-attention mechanism to the model, problems such as missing image detail and structural errors in the image remain. To address these problems, this paper proposes SAF-GAN, a generative adversarial network based on self-supervised attention and image feature fusion. A CoTNet-based self-supervised module is added to the initial feature-generation stage: an attention mechanism performs autonomous mapping learning between image features, and the contextual relations among features guide a dynamic attention matrix, tightly combining context mining with self-attention learning and improving the quality of the generated low-resolution features; high-resolution images are then refined through alternating training of the networks at different stages. A feature-fusion enhancement module is also added: by fusing the low-resolution features from the previous stage of the model with the features of the current stage, the generator can fully exploit the rich semantics of the lower-resolution features and the fine detail of the higher-resolution features, better ensuring semantic consistency across feature maps of different resolutions and thus producing realistic high-resolution images. Experimental results show that, compared with the baseline model (AttnGAN), SAF-GAN improves both IS and FID: on the CUB dataset the IS score increases by 0.31 and FID decreases by 3.45; on the COCO dataset the IS score increases by 2.68 and FID decreases by 5.18. SAF-GAN generates noticeably more realistic images, demonstrating the effectiveness of the method.
computer vision; generative adversarial networks; text-to-image; CoTNet; image feature fusion
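The feature-fusion enhancement module described above combines the previous stage's low-resolution features with the current stage's features. A toy NumPy sketch of one plausible form of such fusion (nearest-neighbour upsampling plus channel concatenation; the shapes and fusion rule are illustrative assumptions, not SAF-GAN's actual layers):

```python
import numpy as np

def upsample_nearest(feat, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(low_res_feat, cur_feat):
    """Hypothetical fusion step: upsample the previous stage's low-resolution
    features to the current spatial size and concatenate along the channel
    axis, so later layers see both semantic and high-resolution cues."""
    factor = cur_feat.shape[1] // low_res_feat.shape[1]
    up = upsample_nearest(low_res_feat, factor)
    return np.concatenate([up, cur_feat], axis=0)

low = np.random.rand(4, 8, 8)    # previous-stage features: 4 channels, 8x8
cur = np.random.rand(8, 16, 16)  # current-stage features: 8 channels, 16x16
fused = fuse(low, cur)
print(fused.shape)  # (12, 16, 16)
```

In the actual generator the fusion would be followed by learned convolutions; the point here is only the shape bookkeeping that lets the two resolutions be combined.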
Wei Yin1,2,3†, Yuxuan Che1,2,3†, Xinsheng Li1,2,3, Mingyu Li1,2,3, [ ... ], Chao Zuo1,2,3,*
1 Smart Computational Imaging Laboratory (SCILab), School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
2 Smart Computational Imaging Research Institute (SCIRI) of Nanjing University of Science and Technology, Nanjing 210019, China
3 Jiangsu Key Laboratory of Spectral Imaging & Intelligent Sense, Nanjing 210094, China
4 Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR 999077, China
Recently, deep learning has yielded transformative success across optics and photonics, especially in optical metrology. Deep neural networks (DNNs) with a fully convolutional architecture (e.g., U-Net and its derivatives) have been widely implemented in an end-to-end manner to accomplish various optical metrology tasks, such as fringe denoising, phase unwrapping, and fringe analysis. However, training a DNN to accurately identify an image-to-image transform from massive input-output data pairs seems at best naïve, as the physical laws governing image formation and other domain expertise pertaining to the measurement have not yet been fully exploited in current deep learning practice. To this end, we introduce a physics-informed deep learning method for fringe pattern analysis (PI-FPA) that overcomes this limit by integrating a lightweight DNN with a learning-enhanced Fourier transform profilometry (LeFTP) module. By parameterizing conventional phase retrieval methods, the LeFTP module embeds prior knowledge in the network structure and the loss function to directly provide reliable phase results for new types of samples, while circumventing the need to collect the large amounts of high-quality data required by supervised learning methods. Guided by the initial phase from LeFTP, the phase recovery ability of the lightweight DNN is enhanced, further improving phase accuracy at low computational cost compared with existing end-to-end networks. Experimental results demonstrate that PI-FPA enables more accurate and computationally efficient single-shot phase retrieval, exhibiting excellent generalization to objects unseen during training. PI-FPA shows that challenging issues in optical metrology can potentially be overcome through the synergy of physics-priors-based traditional tools and data-driven learning approaches, opening new avenues to fast and accurate single-shot 3D imaging.
optical metrology; deep learning; physics-informed neural networks; fringe analysis; phase retrieval
Opto-Electronic Advances
2024, 7(1): 230034
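The LeFTP module parameterizes conventional Fourier transform profilometry, whose core steps are: isolate the carrier sideband of the fringe pattern in the Fourier domain, shift it to baseband, and take the angle of the inverse transform. A 1-D NumPy sketch of that textbook pipeline (not the paper's learning-enhanced module; the fringe parameters are made up):

```python
import numpy as np

# synthetic 1-D fringe: I(x) = a + b*cos(2*pi*f0*x + phi(x))
n = 256
x = np.arange(n)
f0 = 16 / n                               # carrier: 16 cycles across the window
phi = 1.5 * np.sin(2 * np.pi * x / n)     # ground-truth phase modulation
fringe = 100 + 50 * np.cos(2 * np.pi * f0 * x + phi)

# classical FTP step 1: keep only the +f0 sideband in the Fourier domain
spec = np.fft.fft(fringe)
carrier = 16
band = np.zeros_like(spec)
band[carrier - 6:carrier + 7] = spec[carrier - 6:carrier + 7]  # crude rectangular filter

# step 2: inverse transform, remove the carrier, and take the wrapped angle
analytic = np.fft.ifft(band)
wrapped = np.angle(analytic * np.exp(-2j * np.pi * f0 * x))
recovered = np.unwrap(wrapped)
print(np.max(np.abs(recovered - phi)))  # small residual from the crude filter
```

In PI-FPA the filter and related quantities become learnable parameters, and the lightweight DNN refines the resulting initial phase; the sketch above is only the conventional baseline being parameterized.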
1 School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2 Institute of Intelligent Communications and Network Security, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
A progressive training scheme is proposed to reconfigure the phase shifts of Mach-Zehnder interferometer (MZI) feedforward optical neural networks (ONNs), counteracting MZI phase errors and beam-splitter errors and improving recognition accuracy. To validate the scheme, a three-layer MZI-ONN was built on the Neuroptica Python simulation platform, and the scheme's effectiveness was verified on the Iris and MNIST datasets with MZI phase errors and beam-splitter errors taken into account. Simulation results show that on the Iris dataset the scheme improves the recognition accuracy of a three-layer 4×4 MZI-ONN by up to 64.15 percentage points, and on the MNIST dataset it improves the accuracy of 4×4, 6×6, 8×8, and 16×16 MZI-ONNs by 2.00 to 37.00 percentage points. The scheme greatly improves the error resistance of MZI-ONNs and should aid the future realization of large-scale, high-accuracy MZI-ONNs.
optical computing; Mach-Zehnder interferometer; optical neural network; phase error; beam-splitter error; progressive training; error resistance
Study on Optical Communications
2024, 50(2): 22008801
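The phase and beam-splitter errors that the progressive training scheme combats can be modelled directly in the 2x2 MZI transfer matrix. A NumPy sketch (a generic textbook MZI model with illustrative error values, not the Neuroptica simulation itself):

```python
import numpy as np

def beam_splitter(t):
    """2x2 directional-coupler matrix; t = pi/4 is an ideal 50:50 splitter,
    and a deviation from pi/4 models a fabrication (splitting-ratio) error."""
    return np.array([[np.cos(t), 1j * np.sin(t)],
                     [1j * np.sin(t), np.cos(t)]])

def mzi(theta, phi, t1=np.pi / 4, t2=np.pi / 4):
    """Mach-Zehnder interferometer: external phase phi, internal phase theta,
    sandwiched between two (possibly imperfect) beam splitters."""
    ps_theta = np.diag([np.exp(1j * theta), 1.0])
    ps_phi = np.diag([np.exp(1j * phi), 1.0])
    return beam_splitter(t2) @ ps_theta @ beam_splitter(t1) @ ps_phi

# theta = pi puts the ideal device in the "bar" state: all power stays in its port
u = mzi(np.pi, 0.0)
bar = abs(u[0, 0]) ** 2
# the same phase setting with small splitter errors leaks power to the other port
u_err = mzi(np.pi, 0.0, t1=np.pi / 4 + 0.05, t2=np.pi / 4 - 0.03)
bar_err = abs(u_err[0, 0]) ** 2
print(round(bar, 6), round(bar_err, 6))  # bar ~ 1.0; slightly less with errors
```

Reconfiguring the trained phases (theta, phi) to compensate for such hardware deviations is the knob the progressive training scheme turns.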
1 School of Optoelectronic Engineering, Xi'an Technological University, Xi'an 70032, Shaanxi, China
2 State Key Laboratory for Manufacturing Systems Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
To meet the demand of point-diffraction interferometry for unwrapping algorithms that are accurate, efficient, and robust to disturbances, a phase-unwrapping method for phase-shifting point-diffraction interferograms based on atrous (dilated) spatial convolution is proposed. Combining an autoencoder architecture with atrous spatial convolution yields higher unwrapping accuracy and enables controllable multi-scale feature extraction from wrapped-phase images. The network is trained and optimized on a large, diverse dataset constructed according to the characteristics of point-diffraction images, so that it accurately identifies the fringe order of the wrapped phase and can rapidly process wrapped images into high-precision unwrapped results. The method was applied to real point-diffraction interferograms and compared with the ESDI professional interferogram-processing software and other unwrapping algorithms. The results show that the root-mean-square error between our unwrapping result and the software's branch-cut unwrapping is 0.0222 rad; the peak-to-valley difference between our surface fitting and the software's is only 0.0121λ, and the RMS difference only 0.0042λ. In terms of efficiency, processing one image takes only 0.035 s on average, versus more than 1 s for traditional methods. Compared with other methods, the proposed approach unwraps wrapped phase quickly and accurately, offering a new, feasible scheme for high-precision phase unwrapping in point-diffraction interferogram processing.
interferometry; surface measurement; interference fringe; neural networks; phase unwrapping
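Identifying the fringe order of the wrapped phase, as the network above is trained to do, amounts to predicting the integer k in unwrapped = wrapped + 2πk at each pixel. A 1-D NumPy illustration using classical sequential unwrapping in place of the network (purely illustrative):

```python
import numpy as np

# a smooth "true" 1-D phase ramp and its wrapped version
true_phase = np.linspace(0, 6 * np.pi, 100)
wrapped = np.angle(np.exp(1j * true_phase))   # folds the ramp into (-pi, pi]

# classical sequential unwrapping stands in for the network here
unwrapped = np.unwrap(wrapped)

# the integer fringe-order map a classification network would predict directly
k = np.round((unwrapped - wrapped) / (2 * np.pi))
print(np.max(np.abs(unwrapped - true_phase)))  # ~0: orders recovered exactly
```

Casting unwrapping as per-pixel classification of k is what lets a trained network avoid the error propagation of sequential (path-following) methods on noisy interferograms.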