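Issue 3 of The Photogrammetric Record for 2024 (Volume 39, Issue 187) has been published. It contains eight research papers. Experts and scholars are welcome to visit the journal website for the latest information, and to browse, discuss and submit manuscripts.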
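The Photogrammetric Record is published by Wiley and is one of the journals of the UK Remote Sensing and Photogrammetry Society. It publishes original, independent articles on photogrammetry, 3D imaging, remote sensing, computer vision, laser scanning, geomatics and other related fields, reflecting progress in modern geomatics. The joint Editors-in-Chief are Professor Yongjun Zhang, Dean of the School of Remote Sensing and Information Engineering at Wuhan University, and Professor Debra F. Laefer of New York University. An editorial board of more than 40 senior international experts supports efficient and fair peer review. The papers in this issue are summarised below: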
Comparison of 3D models with texture before and after restoration.
Real-scene 3D building models were reconstructed using three oblique photogrammetric image datasets from the official website of the Wingtra drone.
Top: Example 1: Comparison of a 3D building model in Solothurn, Switzerland before and after texture restoration. (a) represents the model before optimization. (b) shows a comparison of details. (c) represents the optimized model.
Middle: Example 2: Comparison of a 3D building model in the Digital Twin of Zurich City before and after texture restoration. (d) represents the model before optimization. (e) shows a comparison of details. (f) represents the optimized model.
Bottom: Example 3: Comparison of a 3D building model in Solothurn, Switzerland before and after texture restoration. (g) represents the model before optimization. (h) shows a comparison of details. (i) represents the optimized model.
For a full report see: Lv, K., Chen, L., He, H., Zhou, F. & Yu, S., 2024. Optimisation of real-scene 3D building models based on straight-line constraints. The Photogrammetric Record, 39, 680-704.
A photogrammetric approach for real-time visual SLAM applied to an omnidirectional system
Garcia, T.A.C., Tommaselli, A.M.G., Castanheiro, L.F. & Campos, M.B. (2024) A photogrammetric approach for real-time visual SLAM applied to an omnidirectional system. The Photogrammetric Record, 39, 577–599. Available from: https://doi.org/10.1111/phor.12494
Abstract: The problem of sequential estimation of the exterior orientation of imaging sensors and the three-dimensional environment reconstruction in real time is commonly known as visual simultaneous localisation and mapping (vSLAM). Omnidirectional optical sensors have been increasingly used in vSLAM solutions, mainly for providing a wider view of the scene, allowing the extraction of more features. However, dealing with unmodelled points in the hyperhemispherical field poses challenges, mainly due to the complex lens geometry entailed in the image formation process. To address these challenges, the use of rigorous photogrammetric models that appropriately handle the geometry of fisheye lens cameras can overcome these challenges. Thus, this study presents a real-time vSLAM approach for omnidirectional systems adapting ORB-SLAM with a rigorous projection model (equisolid-angle). The implementation was conducted on the Nvidia Jetson TX2 board, and the approach was evaluated using hyperhemispherical images captured by a dual-fisheye camera (Ricoh Theta S) embedded into a mobile backpack platform. The trajectory covered a distance of 140 m, with the approach demonstrating accuracy better than 0.12 m at the beginning and achieving metre-level accuracy at the end of the trajectory. Additionally, we compared the performance of our proposed approach with a generic model for fisheye lens cameras.
This study presents a real-time vSLAM for omnidirectional systems, adapting ORB-SLAM with a rigorous equisolid-angle projection model. Analysing a 140 m trajectory, it shows fewer discrepancies using hyperhemispherical images, outperforming a generic model like EUCM. Accuracy started at sub-0.12 m, reaching metre-level at the end.
Keywords: backpack systems, fisheye lenses, omnidirectional system, ORB-SLAM, real time, Ricoh Theta S, SLAM
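For readers unfamiliar with the equisolid-angle model named in the abstract, the following is a minimal sketch of the ideal projection, r = 2f sin(θ/2), where θ is the angle between the incoming ray and the optical axis. It assumes a camera frame with +Z along the optical axis and leaves out lens distortion and the other interior orientation terms a rigorous photogrammetric model would include. The function name and parameters are illustrative and are not taken from the paper.

```python
import math


def project_equisolid(x, y, z, focal, cx=0.0, cy=0.0):
    """Project a camera-frame point with the ideal equisolid-angle model.

    Assumption: +Z is the optical axis and no distortion terms are modelled.
    """
    rho = math.hypot(x, y)                     # distance from the optical axis
    if rho == 0.0:
        if z <= 0.0:
            raise ValueError("image direction is undefined for this point")
        return cx, cy                          # on-axis point maps to the principal point
    theta = math.atan2(rho, z)                 # incidence angle; exceeds pi/2 when z < 0
    r = 2.0 * focal * math.sin(theta / 2.0)    # equisolid-angle radial distance
    return cx + r * x / rho, cy + r * y / rho


# A ray 90 degrees off-axis lands at a radius of focal * sqrt(2).
u, v = project_equisolid(1.0, 0.0, 0.0, focal=1.0)
```

Because θ may exceed 90 degrees, points with z < 0 still project to finite image coordinates, which is what allows the model to cover a hyperhemispherical field.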
A disparity-aware Siamese network for building change detection in bi-temporal remote sensing images
Yansheng Li, Xinwei Li, Wei Chen & Yongjun Zhang. (2024) A disparity-aware Siamese network for building change detection in bi-temporal remote sensing images. The Photogrammetric Record, 39, 528–548. Available from: https://doi.org/10.1111/phor.12495
Abstract: Building change detection has various applications, such as urban management and disaster assessment. Along with the exponential growth of remote sensing data and computing power, an increasing number of deep-learning-based remote sensing building change detection methods have been proposed in recent years. Objectively, the overwhelming majority of existing methods can perfectly deal with the change detection of low-rise buildings. By contrast, high-rise buildings often present a large disparity in multitemporal high-resolution remote sensing images, which degrades the performance of existing methods dramatically. To alleviate this problem, we propose a disparity-aware Siamese network for detecting building changes in bi-temporal high-resolution remote sensing images. The proposed network utilises a cycle-alignment module to address the disparity problem at both the image and feature levels. A multi-task learning framework with joint semantic segmentation and change detection loss is used to train the entire deep network, including the cycle-alignment module in an end-to-end manner. Extensive experiments on three publicly open building change detection datasets demonstrate that our method achieves significant improvements on datasets with severe building disparity and state-of-the-art performance on datasets with minimal building disparity simultaneously.
Keywords: bi-temporal high-resolution remote sensing images, building change detection, disparity-aware Siamese network
Detecting change in graffiti using a hybrid framework
Wild, B., Verhoeven, G., Muszyński, R. & Pfeifer, N. (2024) Detecting change in graffiti using a hybrid framework. The Photogrammetric Record, 39, 549–576. Available from: https://doi.org/10.1111/phor.12496
Abstract: Graffiti, by their very nature, are ephemeral, sometimes even vanishing before creators finish them. This transience is part of graffiti's allure yet signifies the continuous loss of this often disputed form of cultural heritage. To counteract this, graffiti documentation efforts have steadily increased over the past decade. One of the primary challenges in any documentation endeavour is identifying and recording new creations. Image-based change detection can greatly help in this process, effectuating more comprehensive documentation, less biased digital safeguarding and improved understanding of graffiti. This paper introduces a novel and largely automated image-based graffiti change detection method. The methodology uses an incremental structure-from-motion approach and synthetic cameras to generate co-registered graffiti images from different areas. These synthetic images are fed into a hybrid change detection pipeline combining a new pixel-based change detection method with a feature-based one. The approach was tested on a large and publicly available reference dataset captured along the Donaukanal (Eng. Danube Canal), one of Vienna's graffiti hotspots. With a precision of 87% and a recall of 77%, the results reveal that the proposed change detection workflow can indicate newly added graffiti in a monitored graffiti-scape, thus supporting a more comprehensive graffiti documentation.
This paper presents an image-based graffiti change detection method, employing an incremental structure-from-motion approach and a novel hybrid change detection method. Tested on Vienna's Donaukanal graffiti hotspot, the approach achieved 87% precision and 77% recall, facilitating more comprehensive documentation of this dynamic form of cultural heritage.
Keywords: 3D modelling, change detection, colour difference, cultural heritage, digital imaging, edge-aware smoothing, feature matching, graffiti
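As a reminder of how the precision and recall figures quoted for this paper are defined, here is a small sketch. The counts are hypothetical and were chosen only so that the rates land near the reported 87% and 77%. They are not the paper's data.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # share of flagged changes that are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # share of real changes that were flagged
    return precision, recall


# Hypothetical counts for illustration only.
precision, recall = precision_recall(tp=87, fp=13, fn=26)
```

A precision above recall, as reported, means the workflow raises relatively few false alarms but misses a larger share of new graffiti.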
A hierarchical occupancy network with multi-height attention for vision-centric 3D occupancy prediction
Can Li, Zhi Gao, Zhipeng Lin, Tonghui Ye & Ziyao Li. (2024) A hierarchical occupancy network with multi-height attention for vision-centric 3D occupancy prediction. The Photogrammetric Record, 39, 600–614. Available from: https://doi.org/10.1111/phor.12500
Abstract: The precise geometric representation and ability to handle long-tail targets have led to the increasing attention towards vision-centric 3D occupancy prediction, which models the real world as a voxel-wise model solely through visual inputs. Despite some notable achievements in this field, many prior or concurrent approaches simply adapt existing spatial cross-attention (SCA) as their 2D–3D transformation module, which may lead to informative coupling or compromise the global receptive field along the height dimension. To overcome these limitations, we propose a hierarchical occupancy (HierOcc) network featuring our innovative height-aware cross-attention (HACA) and hierarchical self-attention (HSA) as its core modules to achieve enhanced precision and completeness in 3D occupancy prediction. The former module enables 2D–3D transformation, while the latter promotes voxels’ intercommunication. The key insight behind both modules is our multi-height attention mechanism which ensures each attention head corresponds explicitly to a specific height, thereby decoupling height information while maintaining global attention across the height dimension. Extensive experiments show that our method brings significant improvements compared to baseline and surpasses all concurrent methods, demonstrating its superiority.
As shown on the left of the graphical abstract, we propose a new 3D occupancy prediction model called the HierOcc network. The inputs of HierOcc are multi-view temporal images. After passing through the backbone, a feature pyramid network is used to obtain multi-scale features of images. The image features and a set of voxel features initialized by learnable parameters are fed into transformer blocks composed of HSA and HACA. After this, the 3D feature volume corresponding to each group of temporal images will be registered and concatenated together, and further fused through a module composed of 3D convolutions. Finally, we up-sample the fused 3D feature volume to the same resolution as the ground truth, and then use a classification head to assign a semantic label to each voxel. The core modules in our HierOcc are height-aware cross-attention and hierarchical self-attention. As shown in the right part of the graphical abstract, HACA transforms visual features from 2D image to 3D space and maintains global perception in the height dimension while decoupling features from different heights, whereas HSA enables dynamic information exchange among voxels on the same height plane, enhancing the results’ completeness for planar categories.
Keywords: 3D occupancy prediction, autonomous driving, transformer
Forest canopy height modelling based on photogrammetric data and machine learning methods
Xingsheng Deng, Yujing Liu & Xingdong Cheng. (2024) Forest canopy height modelling based on photogrammetric data and machine learning methods. The Photogrammetric Record, 39, 615–640. Available from: https://doi.org/10.1111/phor.12507
Abstract: Forest topographic survey is a problem that photogrammetry has not solved for a long time. Forest canopy height is a crucial forest biophysical parameter which is used to derive essential information about forest ecosystems. In order to construct a canopy height model in forest areas, this study extracts spectral feature factors from digital orthophoto map and geometric feature factors from digital surface model, which are generated through aerial photogrammetry and LiDAR (light detection and ranging). The maximum information coefficient, Pearson, Kendall, Spearman correlation coefficients, and a new proposed index of relative importance are employed to assess the correlation between each feature factor and forest vertical heights. Gradient boosting decision tree regression is introduced and utilised to construct a canopy height model, which enables the prediction of unknown canopy height in forest areas. Two additional machine learning techniques, namely random forest regression and support vector machine regression, are employed to construct canopy height model for comparative analysis. The data sets from two study areas have been processed for model training and prediction, yielding encouraging experimental results that demonstrate the potential of canopy height model to achieve prediction accuracies of 0.3 m in forested areas with 50% vegetation coverage and 0.8 m in areas with 99% vegetation coverage, even when only a mere 10% of the available data sets are selected as model training data. The above approaches present techniques for modelling canopy height in forested areas with varying conditions, which have been shown to be both feasible and reliable.
The study employs three machine learning techniques, namely gradient boosting decision tree regression, random forest regression and support vector machine regression, to construct high-resolution canopy height models in forested areas. These models are based on spectral feature factors extracted from digital orthophoto maps and geometric feature factors derived from digital surface models. Experimental results demonstrate the potential of the canopy height models constructed by the gradient boosting decision tree regression to achieve prediction accuracies of 0.2 m in areas with 50% canopy coverage and 0.6 m in areas with 99% canopy coverage, even when only a 20% subset of the available data sets is used for model training.
Keywords: canopy height modelling, gradient boosting decision tree, LiDAR, photogrammetry, random forest
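The abstract screens each feature factor against height using several correlation coefficients. The sketch below shows two of them, Pearson and Spearman, in plain Python. The sample values are hypothetical, and the maximum information coefficient, Kendall coefficient and the paper's own index of relative importance are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical feature values and heights with a monotonic but non-linear relation.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
height = [1.0, 4.0, 9.0, 16.0, 25.0]
r_linear = pearson(feature, height)
r_rank = spearman(feature, height)
```

In this toy case the rank correlation is perfect while the linear one is not, which is one reason to report several coefficients when the relation between a feature and height may be non-linear.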
MoLO: Drift-free lidar odometry using a 3D model
Zhao, H., Zhao, Y., Tomko, M. & Khoshelham, K. (2024) MoLO: Drift-free lidar odometry using a 3D model. The Photogrammetric Record, 39, 641–663. Available from: https://doi.org/10.1111/phor.12509
Abstract: LiDAR odometry enables localising vehicles and robots in the environments where global navigation satellite systems (GNSS) are not available. An inherent limitation of LiDAR odometry is the accumulation of local motion estimation errors. Current approaches heavily rely on loop closure to optimise the estimated sensor poses and to eliminate the drift of the estimated trajectory. Consequently, these systems cannot perform real-time localization and are therefore not practical for a navigation task. This paper presents MoLO, a novel model-based LiDAR odometry approach to achieve real-time and drift-free localization using a 3D model of the environment containing planar surfaces, namely the structural elements of buildings. The proposed approach uses a 3D model of the environment to initialise the LiDAR pose and includes a scan-to-scan registration to estimate the pose for consecutive LiDAR scans. Re-registering LiDAR scans to the 3D model at a certain frequency provides the global sensor pose and eliminates the drift of the trajectory. Pose graphs are built frequently to acquire a smooth and accurate trajectory. A geometry-based method and a learning-based method to register LiDAR scans with the 3D model are tested and compared. Experimental results show that MoLO can eliminate drift and achieve real-time localization while providing an accuracy equivalent to loop closure optimization.
Each LiDAR scan is registered with its previous scan to estimate the transformation. Periodically, a LiDAR scan will be registered with the 3D model and pose graph optimization is then used to optimise the pose of each LiDAR scan between the two scans registered with the 3D model.
Keywords: drift elimination, localization, pose graph optimization, registration, sensor pose
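To illustrate why scan-to-scan registration accumulates drift, the sketch below chains relative motions into a trajectory. It is a simplified 2D (x, y, heading) stand-in for the 3D poses used in LiDAR odometry, with hypothetical motion values. It does not implement the paper's model registration or pose graph optimisation.

```python
import math


def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), expressed in the previous scan's frame, to a 2D pose."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    c, s = math.cos(theta), math.sin(theta)
    # Wrap the heading to (-pi, pi] so it stays comparable across many steps.
    new_theta = math.atan2(math.sin(theta + dtheta), math.cos(theta + dtheta))
    return (x + c * dx - s * dy, y + s * dx + c * dy, new_theta)


def chain(start, deltas):
    """Accumulate scan-to-scan motions into a list of global poses."""
    poses = [start]
    for delta in deltas:
        poses.append(compose(poses[-1], delta))
    return poses


# Hypothetical estimates: four 1 m forward steps, each followed by a 90 degree left turn.
trajectory = chain((0.0, 0.0, 0.0), [(1.0, 0.0, math.pi / 2)] * 4)
```

Every pose depends on all earlier relative estimates, so a small error in one step carries into everything after it. Replacing a chained pose with a globally registered one at intervals bounds that error.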
Optical flow matching with automatically correcting the scale difference of tunnel parallel photogrammetry
Hao Li, Bohao Gao, Xiufeng He & Pengfei Yu. (2024) Optical flow matching with automatically correcting the scale difference of tunnel parallel photogrammetry. The Photogrammetric Record, 39, 664–679. Available from: https://doi.org/10.1111/phor.12511
Abstract: Using parallel photography to model tunnels is an efficient method for real scene modelling. Aiming at the problem that the accuracy of optical flow matching in tunnel parallel photography sequence photos is severely affected by the scale deformation of stereo images, a novel optical flow matching method with automatically correcting the scale difference of tunnel parallel photography stereo images is proposed from the perspective of imaging relationships. By analysing the distribution pattern of scale difference in stereo images, a model is obtained in which the scale difference of image points is symmetrically distributed radially on the image and follows a power function growth. Introduce it into traditional optical flow matching to correct image scale differences based on the model to improve matching accuracy. The mean square error of the optical flow matching after correcting scale difference in the experiment is less than 0.3 pixels, which is at least 34.3% higher than before correction and a maximum improvement of 45.5% in the experimental results. The research result indicates that the proposed optical flow matching method with automatically correcting the scale difference has a significant effect on improving the accuracy of tunnel parallel photography image matching and modelling.
Aiming at the problem that the matching accuracy of the optical flow method is low due to the influence of scale deformation on tunnel parallel photography sequence images, this paper proposes a scale difference correction model. This model establishes the scale difference of the whole image by image matching and automatically corrects the scale of the pre-sequence image. In Lucas Kanade optical flow matching, the feature windows corresponding to the feature points of the pre-sequence image and the post-sequence image are extracted, and the scale deformation of the window is automatically corrected by the scale difference model, which can effectively improve the accuracy of the optical flow matching.
Keywords: optical flow matching, parallel photogrammetry, scale difference model, tunnel image
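The abstract describes the scale difference as radially symmetric and growing as a power function. One simple way to estimate such a relation from matched points is a least-squares fit in log-log space, sketched below. The form d = a · r^b, the variable names and the sample values are assumptions made for illustration, since the paper's exact model and fitting procedure are not given here.

```python
import math


def fit_power_law(radii, scale_diffs):
    """Fit d = a * r**b by linear least squares on (log r, log d); inputs must be positive."""
    lx = [math.log(r) for r in radii]
    ly = [math.log(d) for d in scale_diffs]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = sum((x - mx) * (y - my) for x, y in zip(lx, ly)) / sum((x - mx) ** 2 for x in lx)
    a = math.exp(my - b * mx)
    return a, b


# Hypothetical radial distances (pixels) and scale differences generated from a known power law.
radii = [100.0, 200.0, 400.0, 800.0]
diffs = [1e-6 * r ** 1.5 for r in radii]
a, b = fit_power_law(radii, diffs)
```

With the coefficients in hand, the expected scale difference at any image radius can be evaluated and used to rescale a matching window before the optical flow step.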
Frontispiece (cover image article)
Optimisation of real-scene 3D building models based on straight-line constraints
Kaiyun Lv, Longyu Chen, Haiqing He, Fuyang Zhou, Shixun Yu. (2024) Optimisation of real-scene 3D building models based on straight-line constraints. The Photogrammetric Record, 39, 680–704. Available from: https://doi.org/10.1111/phor.12514
Abstract: Due to the influence of repeated textures or edge perspective transformations on building facades, building modelling based on unmanned aerial vehicle (UAV) photogrammetry often suffers geometric deformation and distortion when using existing methods or commercial software. To address this issue, a real-scene three-dimensional (3D) building model optimisation method based on straight-line constraints is proposed. First, point clouds generated by unmanned aerial vehicle (UAV) photogrammetry are down-sampled based on local curvature characteristics, and structural point clouds located at the edges of buildings are extracted. Subsequently, an improved random sample consensus (RANSAC) algorithm, considering distance and angle constraints on lines, known as co-constrained RANSAC, is applied to further extract point clouds with straight-line features from the structural point clouds. Finally, point clouds with straight-line features are optimised and updated using sampled points on the fitted straight lines. Experimental results demonstrate that the proposed method can effectively eliminate redundant 3D points or noise while retaining the fundamental structure of buildings. Compared to popular methods and commercial software, the proposed method significantly enhances the accuracy of building modelling. The average reduction in error is 59.2%, including the optimisation of deviations in the original model's contour projection.
This study presents a real-scene three-dimensional (3D) building model optimisation method based on straight-line constraints. The main workflow of this method consists of four stages: real-scene 3D building model generation, neighbourhood search and building edge extraction, co-constraints based on distance and angle for 3D line extraction, and building model optimisation.
Keywords: building facade, building modelling, co-constraint RANSAC, point clouds, straight line
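The idea of extracting straight-line points with RANSAC under distance and angle constraints can be sketched as follows. This is not the authors' co-constrained RANSAC: here the distance constraint is an ordinary point-to-line inlier threshold, and the angle constraint is assumed to be a limit on the deviation from a reference direction (for example an expected building edge direction). All names, thresholds and sample points are hypothetical.

```python
import math
import random


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(dot(a, a))
    return (a[0] / n, a[1] / n, a[2] / n)


def distance_to_line(p, origin, direction):
    """Perpendicular distance from p to a line (direction must be a unit vector)."""
    c = cross(sub(p, origin), direction)
    return math.sqrt(dot(c, c))


def project_to_line(p, origin, direction):
    """Closest point on the line to p, used here to snap inliers onto the fitted edge."""
    t = dot(sub(p, origin), direction)
    return (origin[0] + t * direction[0], origin[1] + t * direction[1], origin[2] + t * direction[2])


def constrained_ransac_line(points, ref_dir, max_dist, max_angle_deg, iterations=200, seed=0):
    """RANSAC 3D line fit that rejects candidates deviating too far from ref_dir (assumed constraint)."""
    rng = random.Random(seed)
    ref = unit(ref_dir)
    cos_limit = math.cos(math.radians(max_angle_deg))
    best_origin, best_dir, best_inliers = None, None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        d = sub(q, p)
        if dot(d, d) == 0.0:
            continue                      # coincident samples define no line
        d = unit(d)
        if abs(dot(d, ref)) < cos_limit:
            continue                      # angle constraint: candidate too far from ref_dir
        inliers = [pt for pt in points if distance_to_line(pt, p, d) <= max_dist]
        if len(inliers) > len(best_inliers):
            best_origin, best_dir, best_inliers = p, d, inliers
    return best_origin, best_dir, best_inliers


# Hypothetical edge points jittered around the x-axis, plus three stray points.
edge = [(0.5 * i, 0.01 * (-1) ** i, 0.0) for i in range(20)]
noise = [(1.0, 3.0, 2.0), (4.0, -2.0, 5.0), (7.0, 6.0, -3.0)]
origin, direction, inliers = constrained_ransac_line(
    edge + noise, (1.0, 0.0, 0.0), max_dist=0.05, max_angle_deg=5.0)
snapped = [project_to_line(p, origin, direction) for p in inliers]
```

Projecting the inliers onto the fitted line is a simple stand-in for the update step in which the abstract replaces straight-line points with points sampled on the fitted lines.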