基于轨迹大数据时空分布的索引与查询方法
CSTR:
作者:
作者单位:

1.北方工业大学信息学院,北京 100144;2.大规模流数据集成与分析技术北京市重点实验室(北方工业大学),北京 100144

通讯作者:

赵卓峰,男,研究员,E-mail:edzhao@ncut.edu.cn。

中图分类号:

TP311

基金项目:

北京市自然科学基金(4202021)。


Index and Query Method Based on Spatial-Temporal Distribution of Trajectory Big Data
Author:
Affiliation:

1.School of Information,North China University of Technolgy,Beijing 100144,China;2.Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data (North China University of Technolgy),Beijing 100144,China

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [22]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    由于移动对象自身行为特征和整体规律的不同,使得其产生的轨迹数据具有较大的时空分布不均特点,从而影响轨迹数据索引和查询的效率。针对现有轨迹数据索引方法很少考虑轨迹数据分布不均特性的情况,提出了一种基于历史数据预分区的时空索引方法,其借助轨迹数据时空维度上分布的相似性,首先在空间上根据数据分布情况对Geohash编码进行预分区,进而建立轨迹数据的索引结构和基于HBase的存储模型,并利用该索引结构设计了基于Geohash分区的查询分解算法。基于真实出租车轨迹数据集的实验表明,相较于均匀划分的扩展的HGrid方法与混合编码的ST-hash方法,本文提出的索引结构及其查询方法可以有效提升海量具有不均匀特征轨迹数据的时空查询性能,并且可以在保证查询结果准确性的同时,最大限度地减少子查询的数量。

    Abstract:

    Due to the diversified behavioral characteristics and regular pattern of moving objects, the trajectory data generated by these objects shows obvious uneven distribution feature in time and space, which may lead to worse performance for trajectory data indexing and querying. However, the existing trajectory data indexing methods rarely consider this problem. In this paper, a temporal-spatial distribution based indexing and querying method is proposed. In the method, Geohash code is introduced and pre-partitioned spatially by utilizing the temporal-spatial similarity of the trajectory data distribution. Then, we use the pre-partitioned Geohash code, partition number and the trajectory data timestamp to compose the index structure. With this index structure, a storage model based on HBase and a query algorithm based on Geohash partition are designed respectively. The empirical study using real trajectory dataset shows that the method improves the spatiotemporal query performance of trajectory data by comparing with the Extend_HGrid and ST-hash methods, and effectively reduces the number of sub-queries during query.

    参考文献
    [1] 许佳捷, 郑凯, 池明旻, 等. 轨迹大数据:数据、应用与技术现状[J]. 通信学报, 2015, 36(12): 97-105.XU Jiajie,Zheng Kai,Chi Mingmin, et al. Trajectory big data: Data, applications and techniques[J]. Journal on Communications, 2015, 36(12): 97-105.
    [2] ZHENG Y. Trajectory data mining: An overview[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2015, 6(3): 1-41.
    [3] YUAN J, ZHENG Y, XIE X. Discovering regions of different functions in a city using human mobility and POIs[C]∥Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining-KDD’12. Beijing, China: ACM Press, 2012: 186-194.
    [4] ZHENG Y, XIE X, MA W Y. GeoLife: A collaborative social networking service among user, location and trajectory[J]. IEEE Data Eng Bull, 2010, 33(2): 32-39.
    [5] 夏冬. 基于RFID电子车牌数据的城市热点区域发现[D]. 重庆:重庆大学, 2018.Xia Dong. Discovery of urban hot spots based on RFID electronic license plate data[D]. Chongqing: Chongqing University, 2018.
    [6] HAN D, STROULIA E. Hgrid: A data model for large geospatial data sets in HBase[C]∥Proceedings of 2013 IEEE Sixth International Conference on Cloud Computing. [S.l.]:IEEE, 2013: 910-917.
    [7] THEODORIDIS Y, VAZIRGIANNIS M, SELLIS T. Spatio-temporal indexing for large multimedia applications[C]∥Proceedings of the Third IEEE International Conference on Multimedia Computing and Systems. [S.l.]:IEEE, 1996: 441-448.
    [8] PFOSER D, JENSEN C S, THEODORIDIS Y. Novel approaches to the indexing of moving object trajectories[C]∥Proceedings of International Conference on Very Large Data Bases. Cario,Egypt:the VLDB Endownmen,2000: 395-406.
    [9] XU X, HAN J, LU W. RT-tree: An improved R-tree indexing structure for temporal spatial databases [C]∥Proceedings of The International Symposium on Spatial Data Handling (SDH). Zurich,Switzerland:Department of Geography University of Zurich, 2017: 1040-1049.
    [10] TAO Y, PAPADIAS D. Efficient historical R-trees[C]∥Proceedings of the Thirteenth International Conference on Scientific and Statistical Database Management(SSDBM 2001). [S.l.]:IEEE, 2001: 223-232.
    [11] HAVERKORT H, VAN WALDERVEEN F. Locality and bounding-box quality of two-dimensional space-filling curves[C]∥Proceedings of Algorithms—ESA 2008. Berlin,Heidelberg: Springer, 2008: 515-527.
    [12] MOON B, JAGADISH H V, FALOUTSOS C, et al. Analysis of the clustering properties of the Hilbert space-filling curve[J]. IEEE Transactions on Knowledge and Data Engineering, 2001, 13(1): 124-141.
    [13] HUGHES J N, ANNEX A, EICHELBERGER C N, et al. Geomesa: A distributed architecture for spatio-temporal fusion[C]∥Proceedings of Geospatial Informatics, Fusion, and Motion Video Analytics V. [S.l.]:International Society for Optics and Photonics, 2015: 94730F.
    [14] LI R, HE H, WANG R, et al. Just: Jd urban spatio-temporal data engine[C]∥Proceedings of 2020 IEEE 36th International Conference on Data Engineering (ICDE). [S.l.]:IEEE, 2020: 1558-1569.
    [15] LI R, HE H, WANG R, et al. Trajmesa: A distributed nosql storage engine for big trajectory data[C]∥Proceedings of 2020 IEEE 36th International Conference on Data Engineering (ICDE). [S.l.]:IEEE, 2020: 2002-2005.
    [16] GUAN X, BO C, LI Z, et al. ST-hash: An efficient spatiotemporal index for massive trajectory data in a NoSQL database[C]∥Proceedings of 2017 25th International Conference on Geoinformatics.Buffalo,NY,USA:IEEE,2017: 1-7.
    [17] UDDIN R, RAVISHANKAR C V, TSOTRAS V J. Indexing moving object trajectories with Hilbert curves[C]∥Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. New York, NY, USA: Association for Computing Machinery, 2018: 416-419.
    [18] WU Y, CAO X, AN Z. A Spatiotemporal trajectory data index based on the hilbert curve code[J]. IOP Conference Series: Earth and Environmental Science, 2020, 502: 012005.
    [19] CHEN S, OOI B C, TAN K L, et al. ST2B-tree: A self-tunable spatio-temporal b+-tree index for moving objects[C]∥Proceedings of the 2008 ACM SIGMOD International Conference on Management of data. New York, NY, USA: Association for Computing Machinery, 2008: 29-42.
    [20] GUO S, HUANG Z, JAGADISH H V, et al. Relaxed space bounding for moving objects: A case for the buddy tree[J]. ACM SIGMOD Record, 2006, 35(4): 24-29.
    [21] 邱峰. 基于HBase的时空数据的存储与查询技术研究[D]. 西安:西安电子科技大学, 2019.Qiu Feng. Research on storage and query technology of spatio-temporal data based on HBase[D]. Xi’an:Xidian University, 2019
    [22] 赵馨逸, 黄向东, 乔嘉林, 等. 基于不均匀空间划分和R树的时空索引[J]. 计算机研究与发展, 2019, 56(3): 666-676.Zhao Xinyi, Huang Xiangdong, Qiao Jialin, et al. Spatiotemporal index based on uneven space partition and R-tree[J]. Journal of Computer Research and Development, 2019, 56(3): 666-676.
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

李征宇,赵卓峰.基于轨迹大数据时空分布的索引与查询方法[J].南京航空航天大学学报,2022,54(3):528-536

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2021-10-10
  • 最后修改日期:2022-01-23
  • 在线发布日期: 2022-06-05
文章二维码
您是第位访问者
网站版权 © 南京航空航天大学学报
技术支持:北京勤云科技发展有限公司
请使用 Firefox、Chrome、IE10、IE11、360极速模式、搜狗极速模式、QQ极速模式等浏览器,其他浏览器不建议使用!