Spectroscopy and Spectral Analysis, 2016, 36(8): 2651, published online: 2016-12-23

A Method of Stellar Spectral Classification Based on Map/Reduce Distributed Computing
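Author Affiliations
1 School of Mechanical, Electrical and Information Engineering, Shandong University (Weihai), Weihai, Shandong 264209
2 Key Laboratory of Optical Astronomy, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012
3 School of Computer and Control Engineering, Yantai University, Yantai, Shandong 264005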
Abstract
Celestial spectra contain a wealth of astrophysical information. By analyzing spectra, one can derive the physical properties, chemical composition, and atmospheric parameters of celestial bodies. Large-scale surveys such as LAMOST and SDSS will produce massive volumes of spectral data; in particular, once LAMOST is in formal operation, each observing night yields roughly 20 000 to 40 000 spectra. Data on this scale place higher demands on fast and effective spectral processing. Automatic classification of stellar spectra is a basic task in spectral processing, and the main work of this paper is to study automatic classification techniques for massive sets of stellar spectra. Lick line indices are a set of standard indices defined on astronomical spectra to describe the strength of spectral lines. They represent physical characteristics of a spectrum, each index is named after the most prominent absorption line it covers, and each describes a relatively wide spectral feature. In this paper, a Bayesian classification method based on Lick line indices is studied and used to classify stars of types F, G, and K. First, the Lick line indices of each spectrum are computed as its feature vector; then a Bayesian classification algorithm is applied to separate the three stellar types. To handle massive spectral data sets, both the computation of Lick line indices and the spectral classification by Bayesian decision are implemented on the Hadoop platform. By exploiting the high throughput and fault tolerance of Hadoop HDFS, together with the parallelism of the Hadoop MapReduce programming model, the efficiency of analyzing and processing large-scale spectral data is improved. The main contributions of this paper are: (1) stellar spectra are classified with a Bayesian algorithm using Lick line indices as features; (2) the computation of Lick line indices and the Bayesian classification process are parallelized on the Hadoop MapReduce distributed computing framework.
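The two steps named in the abstract (line indices as features, then Bayesian classification, both expressed as map and reduce phases) can be illustrated with a small in-process sketch. It rests on several assumptions that are not taken from the paper: the band limits in `BANDS` are illustrative placeholders rather than the published Lick definitions, the index is computed as a simple equivalent width against a linear pseudo-continuum, the classifier is a Gaussian naive Bayes model standing in for whatever Bayesian decision rule the authors used, and `make_spectrum` produces synthetic spectra purely for demonstration. The map and reduce phases are simulated with plain Python functions instead of Hadoop jobs.

```python
import math
from collections import defaultdict

# Hypothetical (blue sideband, index band, red sideband) limits in Angstroms.
# Placeholders for illustration only, NOT the published Lick band definitions.
BANDS = {
    "idx_a": ((4800.0, 4820.0), (4830.0, 4860.0), (4870.0, 4890.0)),
    "idx_b": ((5100.0, 5120.0), (5130.0, 5160.0), (5170.0, 5190.0)),
}

def band_mean(wave, flux, lo, hi):
    """Mean flux of the samples whose wavelength lies in [lo, hi]."""
    vals = [f for w, f in zip(wave, flux) if lo <= w <= hi]
    return sum(vals) / len(vals)

def line_index(wave, flux, blue, band, red):
    """Equivalent-width style index: integral of (1 - F/F_cont) over the band."""
    fb, fr = band_mean(wave, flux, *blue), band_mean(wave, flux, *red)
    wb, wr = sum(blue) / 2, sum(red) / 2  # sideband centres

    def cont(w):
        # Linear pseudo-continuum through the two sideband means.
        return fb + (fr - fb) * (w - wb) / (wr - wb)

    pts = [(w, f) for w, f in zip(wave, flux) if band[0] <= w <= band[1]]
    total = 0.0
    for (w0, f0), (w1, f1) in zip(pts, pts[1:]):  # trapezoid rule
        total += 0.5 * ((1 - f0 / cont(w0)) + (1 - f1 / cont(w1))) * (w1 - w0)
    return total

def map_spectrum(record):
    """Map: one (label, wave, flux) record -> (label, feature vector)."""
    label, wave, flux = record
    return label, [line_index(wave, flux, *BANDS[k]) for k in sorted(BANDS)]

def reduce_class(vectors):
    """Reduce: all feature vectors of one class -> count and (mean, variance)."""
    n = len(vectors)
    stats = []
    for col in zip(*vectors):
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n + 1e-6  # variance floor
        stats.append((mean, var))
    return n, stats

def train(records):
    """Simulated job: map every record, group by label, reduce each group."""
    groups = defaultdict(list)
    for label, vec in map(map_spectrum, records):
        groups[label].append(vec)
    return {label: reduce_class(vecs) for label, vecs in groups.items()}

def classify(model, wave, flux):
    """Pick the class with the highest Gaussian naive Bayes log-posterior."""
    _, vec = map_spectrum((None, wave, flux))
    total = sum(n for n, _ in model.values())

    def log_post(n, stats):
        lp = math.log(n / total)  # class prior from training counts
        for x, (m, v) in zip(vec, stats):
            lp += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return lp

    return max(model, key=lambda c: log_post(*model[c]))

def make_spectrum(depth):
    """Synthetic flat spectrum with two Gaussian absorption lines (demo only)."""
    wave = [4780.0 + i for i in range(421)]  # 4780 .. 5200 Angstroms
    flux = [1.0 - depth * (math.exp(-0.5 * ((w - 4845.0) / 5.0) ** 2)
                           + math.exp(-0.5 * ((w - 5145.0) / 5.0) ** 2))
            for w in wave]
    return wave, flux

# Toy training set: three synthetic spectra per class, with made-up line depths.
TRAIN = [(label, *make_spectrum(d + e))
         for label, d in (("F", 0.1), ("G", 0.3), ("K", 0.5))
         for e in (-0.02, 0.0, 0.02)]
MODEL = train(TRAIN)
predicted = classify(MODEL, *make_spectrum(0.31))
```

In this sketch each spectrum is processed independently in `map_spectrum`, which is the property that lets the feature extraction be distributed across nodes; only the per-class statistics in `reduce_class` need records to be grouped by key. On a real cluster the same split would be written against the Hadoop MapReduce API, with spectra read from HDFS rather than generated in memory.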

PAN Jing-chang, WANG Jie, JIANG Bin, LUO A-li, WEI Peng, ZHENG Qiang. A Method of Stellar Spectral Classification Based on Map/Reduce Distributed Computing[J]. Spectroscopy and Spectral Analysis, 2016, 36(8): 2651.
