Consistent image retrieval by video scanning
Http://blog.csdn.net/zouxy09
Here is a summary of some recent work: consistent image retrieval. In ordinary search-by-image, the goal is to retrieve from a database the images that are consistent with (identical to) or similar to a query image. Here we retrieve only consistent images (images that are identical after conversion to grayscale). It also differs from photo search, where the user takes a photo and the database is searched for the image matching that photo, so the retrieval is triggered by the user. We instead scan the real scene continuously through the camera, with no triggering interaction from the user. This is much like scanning a QR code, except that what we scan is an ordinary image rather than a QR code. In other words: we store many pictures in a database and scan the real scene with the camera; if the scene contains a picture that is consistent with one in the database, that image is the retrieval result.
This technique is widely used in augmented reality (AR), for example to recognize a card and then overlay a video describing that card. Say we want to introduce children to animals: a child places a cat card under the camera, we retrieve it, and then overlay a 3D model of a cat on the card, which can be viewed from any angle through 360 degrees. The cat can even move and be gently teased by the child. If holographic projection ever reaches ordinary consumers, it will be dazzling. Ha ha.
So traces of this technique can be found in many AR SDKs. For example, Qualcomm's well-known Vuforia provides this functionality, but the SDK is not open source and is priced according to library size and number of retrievals. The open-source ARToolKit also implements this function, but it requires the image to be retrieved to have a black square border in the scene, and the border must not be too thin. Here, by contrast, we place no restrictions on the picture. Natural is beautiful.
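To make the "store pictures, scan frames, report the consistent one" idea concrete, here is a toy sketch. It is not the method used in this work, which the post does not describe: an average hash over grayscale values stands in for the real matcher, and the 2x2 "images", the names `retrieve` and `max_distance`, and the threshold value are all made up for illustration. Unlike the real system, this stand-in does not tolerate rotation or scale change.

```python
# Toy sketch only: an average-hash lookup stands in for the real matcher,
# which the post does not describe.

def to_gray(pixel):
    """Convert an (R, G, B) pixel to a luma value using the BT.601 weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def average_hash(image):
    """Return a tuple of bits: 1 where a pixel is brighter than the image mean."""
    gray = [to_gray(p) for row in image for p in row]
    mean = sum(gray) / len(gray)
    return tuple(1 if v > mean else 0 for v in gray)

def hamming(a, b):
    """Count the positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def retrieve(frame, database, max_distance=1):
    """Return the name of the closest stored image, or None if none is close enough."""
    frame_hash = average_hash(frame)
    best_name, best_dist = None, None
    for name, stored_hash in database.items():
        dist = hamming(frame_hash, stored_hash)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist is not None and best_dist <= max_distance:
        return best_name
    return None

# Hypothetical 2x2 "images" made of RGB pixels.
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
cat_card = [[WHITE, BLACK], [BLACK, WHITE]]
dog_card = [[WHITE, WHITE], [BLACK, BLACK]]
database = {"cat": average_hash(cat_card), "dog": average_hash(dog_card)}

# A colour-tinted view of the cat card still matches after grayscale conversion.
tinted_cat = [[(250, 240, 230), (10, 5, 0)], [(0, 10, 5), (240, 250, 235)]]
print(retrieve(tinted_cat, database))  # prints: cat
```

The key point the sketch shares with the description above is that a frame with no sufficiently close match returns nothing at all, so most frames of a live scan produce no result.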
First, the difficulties
This application has several main difficulties:
1, As with most vision tasks, it is affected by image scale, rotation, illumination, occlusion and similar factors. Compared with photo search, video scanning is also far less constrained in how the target appears.
2, Video scanning has a strong real-time requirement.
3, Video scanning has no user-provided retrieval trigger.
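Because there is no user trigger, the system itself has to decide when a frame really shows a database image. One common way to make that decision more stable is to require agreement across several recent frames before reporting a result. The sketch below is only an illustration of that idea, not something the post says this work does; the class name `FrameVoter` and the `window` and `min_votes` values are assumptions.

```python
from collections import Counter, deque

class FrameVoter:
    """Report an image id only after it wins enough of the most recent frames."""

    def __init__(self, window=5, min_votes=3):
        self.min_votes = min_votes
        # Per-frame results for the last `window` frames; None means "no match".
        self.recent = deque(maxlen=window)

    def update(self, frame_result):
        """Add one per-frame result and return the confirmed id, or None."""
        self.recent.append(frame_result)
        votes = Counter(r for r in self.recent if r is not None)
        if not votes:
            return None
        image_id, count = votes.most_common(1)[0]
        return image_id if count >= self.min_votes else None

# Hypothetical per-frame results from a matcher, including one spurious "dog".
voter = FrameVoter(window=5, min_votes=3)
per_frame = [None, "cat", None, "cat", "dog", "cat", None]
confirmed = [voter.update(r) for r in per_frame]
print(confirmed)
```

With these settings the single "dog" frame is never reported, and "cat" is confirmed only once it has appeared in three of the last five frames. The trade-off is extra latency, which matters given the real-time requirement in item 2.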
Second, the results
The initial results are shown in the demo video:
iQiyi: Http://www.iqiyi.com/w_19rs3a6n55.html#vfrm=8-7-0-1
Baidu Cloud: HTTP://PAN.BAIDU.COM/S/1QW1FKJQ
My own assessment of the results is as follows:
1, Accuracy: above 98% (1k+ images in the library; test images were chosen at random and shown on an iPad display).
2, Recall: I do not know how to evaluate this objectively. In video scanning, if the image appears in the video it can basically be retrieved. However, because of lighting effects, it may take several attempts, for example adjusting the viewing angle. Overall, though, the response time is within what users find acceptable.
3, Real-time performance: with 100+ images in the database, retrieval for a single frame takes 300 ms on average (ordinary i5 notebook). With 1k+ images it takes 2 s on average. The code is already highly parallelized (CPU core utilization is above 95% on every run).
4, Working distance: the target image can be 10 cm to 100 cm from the camera (its size in the camera frame ranges from 64x64 pixels to slightly larger than the camera resolution).
5, Tolerates 360-degree in-plane rotation, scale change and partial occlusion, and works under normal lighting.
6, Tolerates different image sources: black-and-white prints, color prints, and images shown on a display (mobile phone, tablet).
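The timing figures in item 3 above can be turned into a per-image cost and a frame rate with a little arithmetic. The snippet below does that, treating "100+" and "1k+" as exactly 100 and 1000 images, which is an approximation; the function name `throughput` is made up for this illustration.

```python
def throughput(database_size, seconds_per_frame):
    """Return (milliseconds per stored image, frames per second)."""
    per_image_ms = seconds_per_frame / database_size * 1000
    frames_per_second = 1 / seconds_per_frame
    return per_image_ms, frames_per_second

# Figures quoted in item 3 (ordinary i5 notebook); sizes rounded to 100 and 1000.
measurements = {100: 0.300, 1000: 2.0}

for size, seconds in measurements.items():
    per_image_ms, fps = throughput(size, seconds)
    print(f"{size} images: {per_image_ms:.1f} ms per image, {fps:.1f} frames/s")
```

This gives roughly 3 ms per stored image at 100 images and 2 ms per stored image at 1000 images, or about 3.3 and 0.5 frames per second. The per-frame cost therefore grows with the size of the database, close to linearly, so larger libraries would need a faster search to stay real-time.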
Some examples:
1, Works with a variety of image sources:
2, Tolerates interference from scale change, rotation, translation, occlusion and slight deformation:
Do you feel the charm of computer vision algorithms?
Note: If any image in the demo video infringes your rights, I apologize. Please contact me and I will remove it promptly.