Depth extraction is one of the important steps of 3D computer vision (CV). Although it has
been researched for many decades and a variety of methods already address it,
no single solution satisfies the needs of all CV algorithms.
Stereo vision is implemented in CV as a matching algorithm in which a region in one
image is matched to a region in the other image, and the disparity between the matched regions
indicates the depth of the region. One of the fundamental issues in stereo matching is the
repeated-pattern problem. When a pattern is repeated along the search path,
stereo algorithms can wrongly match the searched pattern with one of its repetitions in the other image.
We showed that a solution to this problem is to use enhanced features that distinguish the
correct pattern from its repetitions. The search space is thereby limited to the pattern itself
rather than its repetitions, and correct disparities can be found.
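
To make the matching step and the repeated-pattern ambiguity concrete, the following is a minimal sketch and not the method developed in this work. It assumes rectified images, a single scanline, a sum-of-absolute-differences (SAD) window cost, and the standard pinhole relation depth = focal length x baseline / disparity. The names sad_cost, match_scanline and depth_from_disparity are hypothetical.

```python
def sad_cost(left, right, x, d, half_window):
    """SAD between the window centred at x in `left` and at x - d in `right`."""
    return sum(
        abs(left[x + k] - right[x - d + k])
        for k in range(-half_window, half_window + 1)
    )


def match_scanline(left, right, x, half_window=1, max_disparity=4):
    """Return (best disparity, all costs) for pixel x of the left scanline.

    Assumes the window around x fits inside `left`. The search stops as soon
    as the window in `right` would leave the image on the left side.
    """
    costs = []
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break
        costs.append(sad_cost(left, right, x, d, half_window))
    # On ties, the smallest disparity wins; this is an arbitrary choice.
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d; zero disparity maps to infinity."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


# A unique pattern shifted by 2 pixels is matched unambiguously.
left = [0, 0, 0, 0, 9, 5, 1, 0, 0, 0]
right = [0, 0, 9, 5, 1, 0, 0, 0, 0, 0]
disparity, costs = match_scanline(left, right, x=5)

# A periodic pattern yields several equally good candidates.
periodic = [9, 5, 1] * 4
_, periodic_costs = match_scanline(periodic, periodic, x=7, max_disparity=6)
```

With the periodic scanline, three candidate disparities reach a cost of zero, so the plain window cost cannot tell the correct match from its repetitions. This is the ambiguity that more distinctive features are meant to remove.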
Although stereo algorithms require many computations, these computations are independent
of one another, which makes effective parallelization of these algorithms on a GPU possible.
However, their parallelization efficiency depends strongly on their architecture. To
optimize the performance of stereo algorithms, it is important to consider both their accuracy
and their parallelization performance. We showed that certain stereo matching architectures
parallelize better while providing accuracy similar to that of other architectures.
Another way of measuring the depth of a scene is to use depth sensors. Since the release
of the Microsoft Kinect, depth sensors have been increasingly used in CV applications.
The Kinect provides dense, real-time depth measurements of indoor scenes, with sufficient
quality for many CV applications. However, its quality is not sufficient for accurate 3D
reconstruction, especially at the boundaries of objects. Since there is a mismatch between
the RGB image and the depth map of the Kinect, depth refinement algorithms that treat all of their
input depth information as correct fail to refine depth maps accurately. We showed that, to
accurately refine regions around boundaries, refinement algorithms should mark outliers and
base the refinement on the trustworthy part of the depth map.
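
The idea of refining only from trustworthy depth can be sketched as follows. This is an illustrative assumption, not the refinement algorithm of this work: it takes a precomputed trust mask as given, works on a single image row, and fills each untrusted pixel with a joint-bilateral-style average of trusted neighbours weighted by intensity similarity. The function refine_row and its parameters are hypothetical.

```python
import math


def refine_row(depth, intensity, trusted, radius=2, sigma_i=10.0):
    """Re-estimate untrusted depth values from trusted neighbours only.

    Each untrusted pixel receives the average depth of trusted pixels within
    `radius`, weighted by how similar their intensity is to its own.
    Trusted pixels are left unchanged.
    """
    refined = list(depth)
    for x in range(len(depth)):
        if trusted[x]:
            continue
        num = den = 0.0
        lo = max(0, x - radius)
        hi = min(len(depth), x + radius + 1)
        for n in range(lo, hi):
            if not trusted[n]:
                continue  # outliers never contribute to the estimate
            diff = intensity[x] - intensity[n]
            w = math.exp(-(diff * diff) / (2.0 * sigma_i * sigma_i))
            num += w * depth[n]
            den += w
        if den > 0.0:
            refined[x] = num / den
    return refined


# The colour edge lies between pixels 2 and 3, but the depth edge lies
# between pixels 1 and 2, so pixel 2 carries a background depth by mistake.
intensity = [100, 100, 100, 200, 200, 200]
depth = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
trusted = [True, True, False, True, True, True]
refined = refine_row(depth, intensity, trusted)
```

In this toy row the untrusted pixel is pulled to the foreground depth because only trusted neighbours with similar intensity carry significant weight. Had pixel 2 been treated as correct, its wrong value would have been kept and would also have influenced its neighbours.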
Another fundamental problem of depth sensors, including the Kinect, is transparent surfaces.
On transparent regions, the Kinect fails to produce any depth measurements. Since depth
refinement algorithms require at least sparse depth estimates on a surface in order to estimate the
unknown depth, they fail to refine the depth on transparent surfaces correctly. To fully
recover transparent objects in the depth map, we proposed to use stereo matching between the IR
and RGB views of the Kinect in a fully connected energy minimization framework. Our refinement
strategy can fully recover transparent objects, and it can correct errors in both the Kinect
measurements and the stereo matching estimates.
Stereo matching requires distinctive similarity measures to match pixels between two images.
Different similarity measures perform differently depending on the noise and texture of
the regions, so it is important to combine their advantages to increase the accuracy of the matching.
To determine which similarity measure performs best in a local region, we used stereo
confidences. The measures are then fused adaptively according to the confidence of each.
The fused result provides more robust and accurate matching than
any of the individual similarity measures and any fusion of them with static weights.
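
A minimal sketch of confidence-weighted fusion at a single pixel follows. It is an illustration under stated assumptions, not the fusion scheme of this work: each similarity measure is assumed to produce a cost curve over candidate disparities, the confidence is taken to be the relative margin between the two lowest costs, and the weights are the normalized confidences. The functions margin_confidence, normalize and fuse_costs are hypothetical.

```python
def margin_confidence(costs, eps=1e-6):
    """Relative gap between the two lowest costs, in [0, 1).

    A value near 0 means the best match is barely better than the runner-up.
    """
    ordered = sorted(costs)
    return (ordered[1] - ordered[0]) / (ordered[1] + eps)


def normalize(costs):
    """Rescale a cost curve to [0, 1] so different measures are comparable."""
    lo, hi = min(costs), max(costs)
    if hi == lo:
        return [0.0] * len(costs)
    return [(c - lo) / (hi - lo) for c in costs]


def fuse_costs(cost_curves):
    """Fuse per-measure cost curves with weights proportional to confidence."""
    confidences = [margin_confidence(c) for c in cost_curves]
    total = sum(confidences)
    count = len(cost_curves)
    if total > 0:
        weights = [c / total for c in confidences]
    else:
        weights = [1.0 / count] * count  # no measure is confident: equal weights
    curves = [normalize(c) for c in cost_curves]
    fused = [
        sum(w * curve[d] for w, curve in zip(weights, curves))
        for d in range(len(cost_curves[0]))
    ]
    return fused, weights


# Measure A has a clear minimum at disparity 1; measure B is nearly flat
# and slightly prefers disparity 0.
curve_a = [10.0, 2.0, 9.0, 10.0]
curve_b = [4.0, 5.0, 4.1, 5.0]
fused, weights = fuse_costs([curve_a, curve_b])
best = min(range(len(fused)), key=fused.__getitem__)
```

Because the weights are recomputed for every pixel, a measure dominates only where its own cost curve is distinctive. A static weighting would give the ambiguous curve the same influence everywhere.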
Finally, we proposed a novel confidence measure for medical image registration based
on similar confidence measures from stereo matching. The proposed confidence measure is
shown to be correlated with the error measured at expert control points. In addition, our confidence measure
can indicate the error as a continuous score in any region of the image.