Multithreading Perceptual Computing Applications in Unity3d

August 21st, 2013 by Steff Kelsey

Once again we are excited to be featured on the Intel Developer Zone blog for a post by Infrared5 Senior Developer Steff Kelsey. This post, Multithreading Perceptual Computing Applications in Unity3d, talks about the challenges the Infrared5 team faced while participating in the Intel Ultimate Coder Challenge. Follow the link to read about the steps we went through to make Kiwi Catapult Revenge a reality! As always, we love hearing your feedback, so feel free to share this with others.

http://software.intel.com/en-us/blogs/2013/07/26/multithreading-perceptual-computing-applications-in-unity3d


Infrared5 Ultimate Coder Week Six: GDC, Mouth Detection Setbacks, Foot Tracking and Optimizations Galore

April 3rd, 2013 by Chris Allen

For week six, Chris and Aaron made the trek out to San Francisco for the annual Game Developers Conference (GDC), where they showed the latest version of our game, Kiwi Catapult Revenge. The feedback we got was amazing! People were blown away by the head tracking performance we’ve achieved, and everyone absolutely loved our unique art style. While the controls were a little difficult for some, that allowed us to gain some much needed insight into how to best fine-tune the face tracking and the smartphone accelerometer inputs to make a truly killer experience. There’s nothing like live playtesting on your product!



Not only did we get a chance for the GDC audience to experience our game, we also got to meet some of the judges and the other Ultimate Coder competitors. There was an incredible amount of mutual respect and collaboration among the teams. The ideas were flowing on how to help improve each and every project in the competition. Chris gave some tips on video streaming protocols to Lee so that he will be able to stream over the internet with some decent quality (using compressed JPEGs would have only wasted valuable time). The guys from Sixense looked into Brass Monkey and how they can leverage that in their future games, and we gave some feedback to the Code Monkeys on how to knock out the background using the depth camera to prevent extra noise that messes with the controls they are implementing. Yes, this is a competition, but the overall feeling was one of wanting to see every team produce their very best.


The judges also had their fair share of positive feedback and enthusiasm. The quality of the projects obviously had impressed them, to the point that Nicole was quoted saying “I don’t know how we are going to decide”. We certainly don’t envy their difficult choice, but we don’t plan on making it any easier for them either. All the teams are taking it further and want to add even more amazing features to their applications before the April 12th deadline.


The staff in the Intel booth were super accommodating, and the exposure we got by being there was invaluable to our business. This is a perfect example of a win-win situation. Intel is getting some incredible demos of their new technology, and the teams are getting exposure and credibility by being in a top technology company’s booth. Not only that, but developers now get to see this technology in action, and can more easily discover more ways to leverage the code and techniques we’ve pioneered. Thank you Intel for being innovative and taking a chance on doing these very unique and experimental contests!


While Aaron and Chris were having a great time at GDC the rest of the team was cranking away. Steff ran into some walls with mouth detection for the breathing fire controls, but John, Rebecca and Elena were able to add more polish to the characters, environment and game play.



John added on a really compelling new feature – playing the game with your feet! We switched the detection algorithm so that it tracks your feet instead of your face. We call it Foot Tracking. It works surprisingly well, and the controls are way easier this way.



Steff worked on optimizing the face tracking algorithms and came up with some interesting techniques to get the job done.


This week’s tech tip and code snippet came to us during integration. We were working hard to combine the head tracking with the Unity game on the Ultrabook, and ZANG we had it working! But, there was a problem. It was slow. It was so slow it was almost unplayable. It was so slow that it definitely wasn’t “fun.” We had about 5 hours until Chris was supposed to go to the airport and we knew that the head tracking algorithms and the camera stream were slowing us down. Did we panic? (Don’t Panic!) No. And you shouldn’t either when faced with any input that is crushing the performance of your application. We simply found a clever way to lower the sampling rate but still have smooth output between frames.


The first step was to reduce the number of times we do a head tracking calculation per second. Our initial (optimistic) attempts were to update in realtime on every frame in Unity. Some computers could handle it, but most could not. Our Lenovo Yoga really bogged down with this. So, we introduced a framesToSkip constant and started sampling on every other frame. Then we hit a smoothing wall. Since the head controls affect every single pixel in the game window (by changing the camera projection matrix based on the head position), we needed to be smoothing the head position on every frame regardless of how often we updated the position from the camera data. Our solution was to sample the data at whatever frame rate we needed to preserve performance, save the head position at that instant as a target, and ease the current position to the new position on every single frame. That way, your sampling rate is down, but you’re still smoothing on every frame and the user feels like the game is reacting to their every movement in a non-jarring way. (For those wondering what smoothing algorithm we selected: Exponential Smoothing handles any bumps in the data between frames.) Code is below.

Feeling good about the result, we went after mouth open/closed detection with a vengeance! We thought we could deviate from our original plan of using AAM and POSIT, and lock onto the mouth using a mouth-specific Haar cascade on the region of interest containing the face. The mouth Haar cascade does a great job finding and locking onto the mouth if the user is smiling – which is not so good for our purposes. We are still battling to get a good lock on the mouth using a method that combines depth data with RGB, but we have seen why AAM exists for feature tracking. It’s not something you can just cobble together and have confidence that it will work well enough to act as an input for game controls.


Overall, this week was a step forward even with part of the team away. We’ve got some interesting and fun new features that we want to add as well. We will be sure to save that surprise for next week. Until then, please let us know if you have any questions and/or comments. May the best team win!


Ultimate Coder Week #5: For Those About to Integrate We Salute You

March 21st, 2013 by Chris Allen

This post was featured on Intel Software’s blog, in conjunction with Intel’s Ultimate Coder Challenge. Keep checking back to read our latest updates!

We definitely lost one of our nine lives this week integrating face tracking into our game, but we still have our cat’s eyes, and we still feel very confident that we will be able to show a stellar game at GDC. On the face tracking end of things we had some big wins. We are finally happy with the speed of the algorithms, and the way things are being tracked will work perfectly for putting into Kiwi Catapult Revenge. We completed some complex math to create very realistic perspective shifting in Unity. Read below for those details, as well as for some C# code to get it working yourself. As we just mentioned, getting a DLL that properly calls update() from Unity and passes in the tracking values isn’t quite there yet. We did get some initial integration with head tracking coming into Unity, but full integration with our game is going to have to wait for this week.

On the C++ side of things, we have successfully found the 3D position of a face in the tracking space. This is huge! By tracking space, we mean the actual (x, y, z) position of the face from the camera in meters. Why do we want the 3D position of the face in tracking space? So that we can determine the perspective projection of the 3D scene (in game) from the player’s location. Two things made this task interesting: 1) the aligned depth data for a given (x, y) from the RGB image is full of holes, and 2) the camera specs only include the diagonal field of view (FOV) and no sensor dimensions.

We got around the holes in the aligned depth data by first checking for a usable value at the exact (x, y) location, and if the depth value was not valid (0 or the upper positive limit), we would walk through the pixels in a rectangle of increasing size until we encountered a usable value. It’s not that difficult to implement, but annoying when you have the weight of other tasks on your back. Another way to put it: It’s a Long Way to the Top on this project.

The z-depth of the face comes back in millimeters right from the depth data; the next trick was to convert the (x, y) position from pixels on the RGB frame to meters in the tracking space. There is a great illustration here of how to break the view pyramid up to derive formulas for x and y in the tracking space. The end result is:
TrackingSpaceX = TrackingSpaceZ * tan(horizontalFOV / 2) * 2 * (RGBSpaceX - RGBWidth / 2) / RGBWidth
TrackingSpaceY = TrackingSpaceZ * tan(verticalFOV / 2) * 2 * (RGBSpaceY - RGBHeight / 2) / RGBHeight

Where TrackingSpaceZ is the lookup from the depth data, and horizontalFOV and verticalFOV are derived from the diagonal FOV in the Creative Gesture Camera specs (here). Now we have the face position in tracking space! We verified the results using a nice metric tape measure (also difficult to find at the local hardware store – get with the metric program, USA!)

From here, we can determine the perspective projection so the player will feel like they are looking through a window into our game. Our first pass at this effect involved just changing the rotation and position of the 3D camera in our Unity scene, but it just didn’t look realistic. We were leaving out adjustment of the projection matrix to compensate for the off-center view of the display. For example: consider two equally sized (in screen pixels) objects at either side of the screen. When the viewer is positioned nearer to one side of the screen, the object at the closer edge appears larger to the viewer than the one at the far edge, and the display outline becomes trapezoidal. To compensate, the projection should be transformed with a shear to maintain the apparent size of the two objects; just like looking out a window! To change up our methods and achieve this effect, we went straight to the ultimate paper on the subject: Robert Kooima’s Generalized Perspective Projection. Our port of his algorithm into C#/Unity is below.

The code follows the mouse pointer to change perspective (not a tracked face) and does not change depth (the way a face would). We are currently in the midst of wrapping our C++ libs into a DLL for Unity to consume, which will enable us to grab the 3D position of the face and then compute the camera projection matrix from the face position and the position of the computer screen relative to the camera.

Last but not least, we leave you with this week’s demo of the game. Some final art for the UI elements is in, levels of increasing difficulty have been implemented, and some initial sound effects are in the game.

As always, please ask if you have any questions on what we are doing, or if you just have something to say we would love to hear from you. Leave us a comment! In the meantime we will be coding All Night Long!
