Light attenuation and backscatter inhibit the ability of a vision system to capture large area images of the sea floor. Image mosaicking is an attempt to overcome this limitation by aligning short range images of the seabed to create one large composite map. Image mosaicking can be used as an aid to other applications such as navigation, wreckage visualisation and station keeping, and also to promote a better understanding of the sea floor in areas such as biology and geology. Mosaicking requires accurate estimation of vehicle motion in order to position each frame correctly in the composite image (mosaic). The general setup of the vision system remains the same for almost all mosaicking implementations. A single CCD camera is used to acquire images at a right angle to the seabed at an altitude ranging from 1 to 10 metres, depending on water turbidity (see Fig. 2). One of the earliest attempts at fusing underwater images to make a larger composite seafloor picture was published by Haywood (Haywood 1986). The simple method described did not take advantage of any image processing techniques but instead used the known vehicle offsets to merge the images in post processing. This method led to aesthetically poor results and gaps in the mosaic. Early attempts at automated image mosaicking were developed by Marks et al., who proposed a method of measuring offsets and connecting the images using correlation to create an accurate real-time mosaicking system (Marks et al. 1995). This method uses the incoming images themselves to decide the position offset, rather than another type of sensor (such as acoustic), so it guarantees no gaps are encountered in the mosaic. Much like Marks et al.'s method for station keeping, discussed in the previous section, a stored image is correlated with live incoming images to derive the offset in pixels (Marks et al. 1994).
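The idea of correlating a stored image with a live image to obtain a pixel offset can be illustrated with a minimal sketch. This is not the implementation of Marks et al.; it assumes small greyscale images held as lists of rows, a brute-force search over integer shifts, and normalised cross-correlation as the score. The function name `estimate_offset` and the parameter `max_shift` are hypothetical.

```python
from math import sqrt


def estimate_offset(stored, live, max_shift=2):
    """Return the (dx, dy) shift such that live[y][x] ~ stored[y - dy][x - dx].

    Every integer shift within +/- max_shift is scored by the normalised
    cross-correlation of the overlapping pixels. Returns None when no shift
    can be scored, e.g. when the images contain no texture.
    """
    h, w = len(stored), len(stored[0])
    best_score, best_shift = float("-inf"), None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            a, b = [], []
            # Visit only the pixels where the shifted images overlap.
            for y in range(max(0, dy), min(h, h + dy)):
                for x in range(max(0, dx), min(w, w + dx)):
                    a.append(stored[y - dy][x - dx])
                    b.append(live[y][x])
            if not a:
                continue
            mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
            var_a = sum((p - mean_a) ** 2 for p in a)
            var_b = sum((q - mean_b) ** 2 for q in b)
            if var_a == 0 or var_b == 0:
                continue  # flat overlap: correlation is undefined
            cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
            score = cov / sqrt(var_a * var_b)
            if score > best_score:
                best_score, best_shift = score, (dx, dy)
    return best_shift


# Illustrative data (an assumption): a small textured patch on a dark seabed.
stored = [[0] * 8 for _ in range(8)]
stored[2][2], stored[2][3], stored[3][2] = 9, 5, 3

# The live image shows the same scene moved 1 pixel in x and 2 pixels in y.
live = [[0] * 8 for _ in range(8)]
for y in range(2, 8):
    for x in range(1, 8):
        live[y][x] = stored[y - 2][x - 1]

print(estimate_offset(stored, live))  # prints (1, 2)
```

A completely uniform image gives `None`, because the correlation is undefined when the overlap contains no variation. A real system would search a larger window and work on filtered images.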
The images are filtered using a Laplacian of Gaussian filter in order to highlight zero crossings and pronounce the image textures. The filtering reduces the image noise and also the effect of non-uniform illumination from artificial sources. The mosaic is created by repeatedly storing images and using the calculated offset to determine where to place each image in the scene. The images are stored at intervals determined by predefined positional offsets in the x and y directions. Each time an image is stored, the system waits until the x and y change limit has been reached, and the process then repeats. The system produced was capable of creating single column mosaics in real time using special purpose hardware. This correlation based method relies on well contrasted images in order to locate regions of correlation; a lack of texture will inhibit the system from correctly positioning images in the mosaic. A simple motion model is assumed, and the inability of correlation to deal with rotations, scale changes and undersea currents (seen in the results) may hinder its ability to create multiple column mosaics. This method was later extended by Fleischer et al. in order to reduce the effect of error growth due to image misalignments, in a similar fashion to current Simultaneous Localisation and Mapping (SLAM) algorithms (Fleischer 2000). This involved the detection of vehicle trajectory crossover paths in order to register the current images with the stored frames and so constrain the navigation error in real time. The use of either an augmented state Kalman filter or a least-squares batch formulation for image realignment estimation was proposed. The same image registration method is used, so the system continues to rely on a simplistic 2D translation image registration model.
Garcia et al. proposed a method of feature characterisation to improve the correspondences between images in order to create a more accurate mosaic with which to position an underwater vehicle (Garcia 2001; Garcia et al. 2001b). First, regions of high spatial gradient are selected from the image using a corner detector. Image matching is accomplished by taking the textural parameters of the selected areas and correlating them with the next image in the sequence. A colour camera improves the process, as the matching is performed on the hue and saturation components of the image as well as its intensity. A set of displacement vectors for the candidate features from one image to the next is calculated. A transformation matrix can then be constructed to merge the images in the correct location in the final mosaic. The paper also implements a smoother filter, which is an improvement on techniques first proposed by Fleischer et al. (Fleischer 2000). An augmented Kalman filter is used as the optimal estimator for image placement; its advantages over batch methods are that it can handle multiple loops, supports real time dynamic optimisation and provides knowledge of the image position variance.
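As a rough illustration of how a transformation matrix might be built from matched feature positions in two consecutive images, the sketch below fits a least-squares similarity transform (rotation, uniform scale and translation). The similarity model is an assumption made here for brevity and is not necessarily the motion model used by Garcia et al.; outlier rejection is omitted, and the names `fit_similarity` and `as_matrix` are hypothetical.

```python
def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Points are (x, y) tuples treated as complex numbers z = x + iy, so the
    transform is z' = a * z + b, where a encodes rotation and scale and b
    encodes translation.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    mean_z, mean_w = sum(zs) / n, sum(ws) / n
    den = sum(abs(z - mean_z) ** 2 for z in zs)
    if den == 0:
        raise ValueError("need at least two distinct source points")
    num = sum((w - mean_w) * (z - mean_z).conjugate() for z, w in zip(zs, ws))
    a = num / den
    b = mean_w - a * mean_z
    return a, b


def as_matrix(a, b):
    """Write z' = a * z + b as a 3x3 homogeneous transformation matrix."""
    return [[a.real, -a.imag, b.real],
            [a.imag, a.real, b.imag],
            [0.0, 0.0, 1.0]]


# Illustrative matches (an assumption): features rotated by 90 degrees and
# translated by (3, -2) between the two images.
src = [(0, 0), (1, 0), (0, 1), (2, 3)]
dst = [(3, -2), (3, -1), (2, -2), (0, 0)]
a, b = fit_similarity(src, dst)
matrix = as_matrix(a, b)
```

With noisy or mismatched features the fit degrades quickly, which is why the methods surveyed here pair the estimation step with careful feature selection or robust outlier rejection.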
Negahdaripour et al. extend previously discussed work in station keeping (Negahdaripour et al. 1999) and early work in image mosaicking (Negahdaripour et al. 1998) to create a fully automatic mosaicking system to aid submersible vehicle navigation (Negahdaripour & Xu 2002). As with the previously discussed station keeping methods, spatio-temporal image gradients are used to measure inter-frame vehicle motion directly, which is then integrated over time to provide an estimate of vehicle position. Two methods are proposed for reducing the drift inherent in the system. The first method attempts to correct for the biases associated with the optical flow image registration, improving the inter-frame motion estimation and thus reducing accumulated system drift. The second method attempts to bound the drift in the system by correcting errors in position and orientation at each mosaic update. This is performed by comparing the current image to a region extracted from the mosaic according to the current position estimate. The comparison between the expected image and the current image is used to feed back the correct position estimate and update the mosaic, thus constraining the error growth to the mosaic accuracy. Gracias et al. developed another approach to mosaic creation and also implemented it as an aid for navigation (Gracias 2002; Gracias et al. 2003). Motion is estimated by selecting point features in the image using a Harris corner detector (Harris & Stephens 1988) and registering these control points in the succeeding images through a correlation based method. A two step variant of the least median of squares algorithm, referred to as MEDSERE, is used to eliminate outliers. After estimating the inter-frame motion, the parameters are cascaded to form a global registration in which all the frames are mapped to a single reference frame.
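The cascading of inter-frame motion into a global registration can be sketched as a chain of matrix products. The example assumes each inter-frame estimate is a 3x3 homogeneous matrix that maps coordinates of frame k+1 into frame k, with frame 0 as the reference frame; these conventions and the names `cascade`, `apply` and `shift` are hypothetical and not taken from Gracias et al.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cascade(pairwise):
    """Chain pairwise transforms into frame-to-reference transforms.

    pairwise[k] maps frame k+1 coordinates into frame k coordinates.
    The returned list holds, for each frame, the matrix that maps its
    coordinates into frame 0.
    """
    to_reference = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]]
    for transform in pairwise:
        to_reference.append(matmul3(to_reference[-1], transform))
    return to_reference


def apply(h, point):
    """Map an (x, y) point through a homogeneous 3x3 transform."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    s = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / s, v / s)


def shift(tx, ty):
    """Pure translation, used here only as simple illustrative motion."""
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


to_reference = cascade([shift(2, 0), shift(1, 3)])
print(apply(to_reference[2], (0, 0)))  # prints (3.0, 3.0)
```

Because each global transform is a product of all earlier estimates, any error in one inter-frame estimate carries into every later frame, which is the drift that the loop-closing and global registration methods in this section try to limit.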
After registration, the mosaic is created by joining the images using the global registration transformation matrix. Where images overlap, there are multiple contributions to a single point on the output image. The median of the contributors is taken, as it is particularly effective in removing transient data, such as moving fish or algae, that has been captured on camera. The mosaic is created offline and then used for real time vehicle navigation. This technique has been experimentally tested for relatively small coverage areas and may not extend well to more expansive surveys due to the assumption of an extended planar scene. The method does not account for lens distortion, which can have a significant impact at larger scales (Pizarro & Singh 2003).
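The median rule for overlapping contributions described above can be illustrated with a short sketch. It assumes the frames have already been warped onto a common mosaic grid and that a pixel with no contribution is marked `None`; the function name `blend_median` and the data layout are hypothetical.

```python
from statistics import median


def blend_median(warped_frames):
    """Blend frames that share a mosaic grid by taking a per-pixel median.

    Each frame is a list of rows the size of the mosaic, holding a grey
    level where the frame covers the mosaic and None elsewhere.
    """
    height, width = len(warped_frames[0]), len(warped_frames[0][0])
    mosaic = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            samples = [frame[y][x] for frame in warped_frames
                       if frame[y][x] is not None]
            if samples:
                mosaic[y][x] = median(samples)
    return mosaic


# Illustrative data (an assumption): three frames covering a 1 x 4 strip.
# The value 200 stands for a transient object seen in only one frame.
frames = [
    [[10, 12, None, None]],
    [[11, 200, None, None]],
    [[12, 13, 7, None]],
]
print(blend_median(frames))  # prints [[11, 13, 7, None]]
```

The outlying value is discarded because the other two frames agree on the background, whereas a mean would have been pulled towards it.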
Pizarro and Singh attempt to tackle the issues associated with the creation of large scale underwater image mosaics using only image information in a global mosaicking framework (Pizarro & Singh 2003). The problem is broken down into three main parts: radial-distortion compensation, topology estimation and global registration. The proposed method uses feature descriptors invariant to changes in image rotation, scaling and affine changes in intensity, and is capable of dealing with low overlap imagery. Radial distortion is accounted for by image warping in a pre-processing step prior to mosaicking. The mosaicking system uses all overlap information, including overlap from images that are not consecutive in time, in order to create a more accurate mosaic by partially limiting the effects of drift. The mosaic is rendered by multi-frequency blending to form a more globally consistent mosaic. The paper claims to have created the largest known published automatically generated underwater mosaic.
Gracias and Negahdaripour present two methods of creating mosaics using video sequences captured at different altitudes (Gracias & Negahdaripour 2005). The first method relies on a rendered mosaic of higher altitude images to act as a map to guide the positioning of the images in the lower altitude mosaic ('image to mosaic'). The second method does not require rendering of the higher altitude mosaic, only its topology, in order to match each image of the lower altitude sequence against the higher altitude images ('image to image'). Ground truth points were used to compare the two methods. Both methods obtained good results, but while the 'image to image' method showed less distortion, it had the disadvantage of higher computational expense. Unfortunately, the method requires a small amount of user input to select correspondences, and the technique assumes a flat, static environment with constant lighting. Time efficiency is another factor to be considered, since the method requires runs at different altitudes.
It is difficult to compare and evaluate the performance of each of the methods described. Each technique has been tested in scenarios where different assumptions are made regarding the environment, vehicle dynamics and processing power available. Negahdaripour and Firoozfam attempted to compare methods implemented by different institutions, using a common data set, in order to document the approaches and performance of the different techniques in the marine environment (Negahdaripour & Firoozfam 2001). Unfortunately, due to time constraints, only comparative results for feature-based and direct methods are reported. A more comprehensive report would give a better understanding of the strengths and weaknesses of the techniques currently available. Some recent research efforts in the area have investigated the construction of 3D mosaics, a further step forward in the evolution of mosaicking methods (Nicosevici et al. 2005).
Video mosaicking remains a very complex and challenging application because of the inherent difficulties of accounting for 3D vehicle motion and of using optics underwater (Singh et al. 2004). 3D mosaicking offers a glimpse of what the future could hold for this application and of what research institutes will be improving upon as processing capability and vision systems advance.