Integrating a C++ object detection library into Hololens
I am trying to integrate the Darknet YOLO (https://pjreddie.com/darknet/yolo/) object detection computer vision package into Unity for use on the HoloLens. So far I have exported the package as a DLL for Unity and made marshalled function calls into it. I want to feed continuous video through it rather than single pictures. The problem is that processing a single picture and getting the bounding-box information back takes several seconds, whereas I would like it to operate in real time (that is, continuously show me bounding boxes around the objects it can detect). Is there a reason it is so slow when I call a function to read image information from a file, and is there a way I can optimize this?
Also, I'm not sure how (or whether) I can integrate Unity's VideoCapture functionality with the YOLO package, but I would also like to do that so the live video feed can be analyzed on the fly and the results projected onto the HoloLens.
Any steps in the right direction would be much appreciated!
Answer by NorthStar79 · Apr 18, 2018 at 07:45 AM
I don't have any experience with YOLO, but I can say that image processing usually takes a lot of computation, and doing it in real time requires serious hardware. Resolution matters too: higher resolution means more calculations, and video is just a stream of images. With that said, here is some advice based on techniques I have used before to make image processing more performant.
First, if you can stream your video over the internet (assuming your users have a high-speed connection), you can send it to a server, process it there, and return the results to your users. This introduces some delay, but usually not enough to worry about. While this option almost certainly gives you the best results, it will also cost you a significant amount of money.
Another option is capturing your video at a lower resolution and frame rate than usual. For example, 10-12 frames per second usually does not bother users much, and images at around 0.5 K resolution can give enough accuracy. In short, capturing at low resolution can reduce calculation times from seconds to milliseconds, and feeding in fewer frames than normal can minimize (or eliminate) the remaining overhead.
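Both ideas above (lower resolution, fewer frames) are easy to apply on the CPU before the frame ever reaches the detector. Here is a rough sketch, assuming tightly packed RGB frames; the function names and buffer layout are my own, not from Darknet or Unity:

```cpp
#include <vector>
#include <cstdint>
#include <cstddef>

// Nearest-neighbor downscale of a tightly packed RGB buffer.
// Assumption: 3 bytes per pixel, row-major, no row padding.
std::vector<uint8_t> downscale_rgb(const std::vector<uint8_t>& src,
                                   int src_w, int src_h,
                                   int dst_w, int dst_h) {
    std::vector<uint8_t> dst(static_cast<size_t>(dst_w) * dst_h * 3);
    for (int y = 0; y < dst_h; ++y) {
        int sy = y * src_h / dst_h;            // nearest source row
        for (int x = 0; x < dst_w; ++x) {
            int sx = x * src_w / dst_w;        // nearest source column
            size_t s = (static_cast<size_t>(sy) * src_w + sx) * 3;
            size_t d = (static_cast<size_t>(y) * dst_w + x) * 3;
            dst[d]     = src[s];
            dst[d + 1] = src[s + 1];
            dst[d + 2] = src[s + 2];
        }
    }
    return dst;
}

// Frame throttling: only run detection on every Nth frame,
// e.g. interval = 3 turns a 30 fps feed into ~10 detections/sec.
struct FrameSkipper {
    int interval = 1;
    int counter  = 0;
    bool should_process() { return counter++ % interval == 0; }
};
```

The other frames can still be rendered normally; the last known bounding boxes are simply reused until the next detection completes.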
These are the only options I can suggest. I highly encourage you to do your own research and experiments, and if you succeed, please share it with us; we would love to hear about it.
I hope this answer was helpful. If it was, please consider marking it as the correct answer so that anyone else searching for a similar question can find it easily. Best regards.
Answer by apsDev · Apr 03, 2019 at 07:57 AM
Hi @Dorfdude8, can you help me understand how you integrated YOLO with Unity? I'm trying to do the same thing: get object detection working on the HoloLens. I just started, but I'm lost since I can't find anything related to the HoloLens. Thanks.