The possibilities of using vision systems for navigation have already been discussed in the cases of mosaicking, station keeping and cable tracking. For the purposes of this review, vision-based navigation will be discussed in relation to mosaic-based localisation, Simultaneous Localisation and Mapping (SLAM) and motion estimation. An image mosaic is a large-area composite view of the seafloor, effectively a map of the area over which the vehicle has passed during the mission. If the mosaic is updated in real time, so that the most recent visual information is available, the current camera frame can be compared against the composite image both to improve the mosaic and to localise the vehicle within it. This technique has been used in both station keeping and mosaicking. Cufi et al. compare the live image with the most recently updated mosaic to allow for greater inter-frame motion and improve the robustness of their station keeping system (Cufi et al. 2002). Gracias et al. used a technique in which the mosaic is created offline and then used in a subsequent mission as a map of the site to aid vehicle navigation (Gracias 2002; Gracias et al. 2003). Negahdaripour and Xu calculate the inter-frame motion to estimate vehicle position and subsequently use the rendered mosaic to improve the placement of images at the mosaic update stage (Negahdaripour & Xu 2002). Simultaneous Localisation and Mapping (SLAM), also known as Concurrent Mapping and Localisation (CML), is the process by which a vehicle, starting at an unknown location in an unknown environment, incrementally builds a map of that environment while concurrently using the map to update its current position.
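The frame-to-mosaic comparison at the heart of mosaic-based localisation can be viewed as template matching: the live camera frame is searched for within the composite image, and the best-matching position gives the vehicle's location in mosaic coordinates. A minimal sketch of this idea (not the implementation of any of the cited systems; the image sizes and data are illustrative) using zero-mean normalised cross-correlation:

```python
import numpy as np

def locate_in_mosaic(frame, mosaic):
    """Return (row, col) of the top-left corner in `mosaic` that best
    matches `frame`, scored by zero-mean normalised cross-correlation."""
    fh, fw = frame.shape
    f = frame - frame.mean()
    fnorm = np.sqrt((f * f).sum())
    best, best_pos = -np.inf, (0, 0)
    for r in range(mosaic.shape[0] - fh + 1):
        for c in range(mosaic.shape[1] - fw + 1):
            w = mosaic[r:r + fh, c:c + fw]
            wz = w - w.mean()
            denom = fnorm * np.sqrt((wz * wz).sum())
            if denom == 0:
                continue                      # skip flat, textureless windows
            score = (f * wz).sum() / denom    # NCC score in [-1, 1]
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

rng = np.random.default_rng(0)
mosaic = rng.random((60, 80))           # stand-in for the composite seabed map
frame = mosaic[20:36, 30:46].copy()     # "live" camera view cut from the mosaic
print(locate_in_mosaic(frame, mosaic))  # → (20, 30)
```

Real systems replace this brute-force scan with feature-based matching or correlation in a predicted search window, since an exhaustive search over a large mosaic is far too slow at frame rate.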
Following vehicle motion, if at the next iteration of map building the measured distance and direction travelled are slightly inaccurate, then any features added to the map will contain corresponding errors. If unchecked, these positional errors accumulate, grossly distorting the map and therefore the robot's ability to know its precise location. Various techniques exist to compensate for this, such as recognising features that have been observed previously and re-skewing recent parts of the map so that the two instances of a feature become one. The SLAM community has focused on optimal Bayesian filtering, and many techniques exist based on laser range scanning (Estrada et al. 2005), sonar (Tardos et al. 2002) and video (Davison et al. 2007). Almost all of the literature is based on terrestrial environments, where vehicle dynamics are more limited and man-made structures provide an abundance of robust scene features. Very little literature has tackled the issues of SLAM-based navigation in an underwater environment, and the vast majority of the research that has taken place underwater has focused on acoustic data (Tena Ruiz et al. 2004; Ribas et al. 2006). The key to successful visual SLAM for underwater vehicle navigation lies in the selection of robust features on the sea floor to allow for accurate correspondence in the presence of changing viewpoints and non-uniform illumination. Another important factor is the likely sparseness of image points due to the environment, which makes the selection of robust features all the more necessary.
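The cumulative nature of these positional errors can be seen in a toy dead-reckoning example: a vehicle commanded to travel in a straight line, whose heading measurement carries a small uncorrected per-step bias (all values illustrative).

```python
import math

def dead_reckon(steps, step_len=1.0, heading_bias_deg=0.5):
    """Integrate odometry with a small constant heading bias and return
    the position error relative to the intended straight-line path."""
    x = y = heading = 0.0
    bias = math.radians(heading_bias_deg)
    for _ in range(steps):
        heading += bias                  # uncorrected per-step heading error
        x += step_len * math.cos(heading)
        y += step_len * math.sin(heading)
    true_x = steps * step_len            # ground truth: straight line along x
    return math.hypot(x - true_x, y)

print(dead_reckon(50), dead_reckon(200))
```

The error grows much faster than linearly in the number of steps, which is precisely why loop-closure corrections from re-observed features are so valuable in SLAM.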
One of the few examples of underwater optical SLAM was developed by Eustice, who implemented a vision-based SLAM algorithm that performs well even with low-overlap imagery (Eustice 2005). The technique also takes advantage of inertial sensors to improve the production of detailed seabed image reconstructions. By using an efficient sparse information filter, the approach scales well to large-scale mapping; in testing, an impressive image mosaic of the RMS Titanic was constructed (Eustice et al. 2005).
Williams and Mahon describe a method of underwater SLAM that takes advantage of both sonar and visual information for feature extraction in reef environments (Williams & Mahon 2004). Unfortunately, the performance of the system during testing is difficult to evaluate, as no ground truth was available for comparison. Saez et al. detail a technique for visual SLAM based on trinocular stereo vision (Saez et al. 2006). A global rectification strategy is employed to maintain the global consistency of the trajectory and improve accuracy. While experiments showed good results, all testing was carried out offline; the global rectification algorithm becomes increasingly computationally complex with time and as a result is unsuitable for large-scale environments. Petillot et al. present an approach to underwater 3D reconstruction of the seabed aided by SLAM techniques and a stereo camera system (Petillot et al. 2008). A Rauch-Tung-Striebel (RTS) smoother is used to improve the trajectory estimate output by the Kalman filter. This paper is unique in its combination of SLAM and RTS techniques for optical 3D reconstruction of the seabed.
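To illustrate the filter-then-smooth idea behind the RTS approach (a generic sketch, not Petillot et al.'s actual formulation), a minimal 1D constant-velocity Kalman filter followed by a Rauch-Tung-Striebel backward pass might look like this; the noise parameters and measurements are illustrative:

```python
import numpy as np

def kf_rts(zs, dt=1.0, q=0.01, r=0.5):
    """Run a constant-velocity Kalman filter over scalar position fixes
    `zs`, then a Rauch-Tung-Striebel backward smoothing pass.
    Returns (filtered states, smoothed states), each of shape (N, 2)."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition: [pos, vel]
    H = np.array([[1.0, 0.0]])              # we measure position only
    Q, R = q * np.eye(2), np.array([[r]])
    x, P = np.zeros(2), np.eye(2)
    xp, Pp, xf, Pf = [], [], [], []         # predicted / filtered histories
    for z in zs:
        x_pred, P_pred = F @ x, F @ P @ F.T + Q          # predict
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain
        x = x_pred + K @ (np.atleast_1d(z) - H @ x_pred) # update
        P = (np.eye(2) - K @ H) @ P_pred
        xp.append(x_pred); Pp.append(P_pred); xf.append(x); Pf.append(P)
    xs, Ps = [xf[-1]], [Pf[-1]]             # backward RTS pass
    for k in range(len(zs) - 2, -1, -1):
        C = Pf[k] @ F.T @ np.linalg.inv(Pp[k + 1])       # smoother gain
        xs.insert(0, xf[k] + C @ (xs[0] - xp[k + 1]))
        Ps.insert(0, Pf[k] + C @ (Ps[0] - Pp[k + 1]) @ C.T)
    return np.array(xf), np.array(xs)

zs = [0.1, 1.05, 2.0, 2.9, 4.1]             # noisy position fixes
xf, xs = kf_rts(zs)
```

Because the smoother conditions every past state on all measurements, it can only be run after the fact, which is why it suits offline seabed reconstruction rather than real-time control; by construction the last smoothed state equals the last filtered state.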
The issues associated with metric motion estimation from vision are dealt with more directly by Caccia (Caccia 2003) and later developed into a more complete system with experimental results from the ocean environment (Caccia 2007). The system is based on optical feature correlation to detect motion between consecutive camera frames. This motion is converted into its metric equivalent by means of a laser triangulation scheme that measures the altitude of the vehicle (Caccia 2006). The current system only allows for horizontal linear translation and does not account for changes in yaw, but promising results were achieved using the Romeo vehicle at constant heading and altitude in the Ligurian Sea. Cufi et al. also calculate direct metric motion estimates for the evaluation of a station keeping algorithm (Cufi et al. 2002). This technique uses altitude measurements from an ultrasonic altimeter to convert offsets between images produced by a calibrated camera into metric displacements.
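The altitude-based scaling underlying both systems follows from the pinhole camera model: for a downward-looking calibrated camera over flat terrain, an image offset in pixels corresponds to a seabed displacement of offset × altitude / focal length. A minimal sketch (the focal length and offsets are illustrative, not taken from the cited systems):

```python
def pixel_to_metric(dx_px, dy_px, altitude_m, focal_px):
    """Convert an image-plane offset (pixels) into a seabed displacement
    (metres) for a downward-looking pinhole camera at known altitude.
    Assumes flat terrain and pure horizontal translation (no yaw)."""
    scale = altitude_m / focal_px        # metres of seabed per pixel
    return dx_px * scale, dy_px * scale

# e.g. a 120-pixel shift seen from 2 m altitude with an 800-pixel focal length
dx_m, dy_m = pixel_to_metric(120, 0, altitude_m=2.0, focal_px=800)
print(dx_m)  # → 0.3
```

This also makes clear why an independent altitude measurement (laser triangulation or an ultrasonic altimeter) is essential: without it, the pixel offsets fix the direction of motion but not its scale.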
Machine vision techniques have proven viable for localisation and motion sensing in unstructured land settings; unfortunately, transferring these techniques to subsea systems is by no means a trivial task. The underwater environment adds the complexity of 3D motion and the inherent difficulties associated with underwater optics. However, recent work on vision-based SLAM and motion estimation has shown that imaging systems can be a complementary sensor to current sonar and inertial motion estimation solutions, offering high accuracy and update rates and proving especially beneficial in near-intervention environments. The SLAM community is focused on improving algorithms to allow real-time mapping of larger environments while improving robustness to sparse features, changing illumination and highly dynamic motion.