Wednesday, March 30, 2011

Image Manipulation with Kinect - Two Hands

This is more or less the same as yesterday, but now with two hands. While the app can handle it, it's almost too much for my own multitasking abilities!

Grabbing two separate images moves them independently; grabbing a single image with both hands triggers the resize and rotate mode.

Sunday, March 27, 2011

Multi Image Manipulation

This weekend I extended the image manipulation view to handle multiple images. Moving over the image ("touching" it) with the hand highlights it and moves it to the front. It can then be dragged around by "grabbing" it (closing the hand), or rotated and scaled when the second hand is closed.

And I played a little with the already implemented finger detection. All operations are now possible with one hand: closing the hand drags the image, and extending two fingers switches to rotate and scale mode. When the angle between the fingertip points changes, the image is rotated accordingly. And when the distance between the points changes, the image is scaled (this is still work in progress):


PS. For those who want to know: The pictures were taken in Iceland 2009.

Tuesday, March 22, 2011

Candescent NUI Lab - Preview

This is just a small update to show you what I've been working on. I'm building a "Kinect Lab" that should make it easier to work on different levels of abstraction (for example raw depth data vs. ready to use hand information).

Currently it sits on top of the OpenNI framework. But this should be easy to replace, as the code itself uses only RGB and depth data and none of the other OpenNI features (like skeleton tracking and gesture recognition). This might come in handy when Microsoft releases its SDK, should I decide to use that instead.

As you can see, you can add RGB and depth image streams generated from the raw data. And you can also add "layers" with data of higher abstraction (currently in a separate space, but these can also be put on top of the images).

Basically you define data sources that each take some input data and produce output data in another form. These can then be plugged into each other to form a processing chain.

Here is a code sample:
       // Wrap the raw OpenNI depth generator in a data source
       var openNIDepthDataGenerator = ...;  
       var depthDataSource = new DepthPointerDataSource(openNIDepthDataGenerator);  
       // Cluster the depth points, then derive hand information from the clusters
       var clusterDataSource = new ClusterDataSource(depthDataSource);  
       var handDataSource = new HandDataSource(depthDataSource, clusterDataSource);  
       // Feed the hand data into the mouse controller
       var mouseController = new MouseController(handDataSource);  
       mouseController.Enabled = true;  

The same with some inversion of control magic:
       var mouseController = new MouseController(Ioc.Resolve<HandDataSource>());  
       mouseController.Enabled = true;  

This is the code that initializes the cursor control with hand tracking (see my last post).

Friday, March 18, 2011

"Hand" Writing with Kinect

This is a small demo of what can be done with hand and finger tracking. The depth data is processed into hand movement and gesture events. I send these to Windows via the Win32 P/Invoke method SendInput. This allows you to control any application, for example MSPaint:

The right hand controls the cursor location. Closing the left hand triggers a "mouse down" event, reopening it a "mouse up" event. I had to separate this for the moment, because opening and closing the hand moves the center of the hand cluster too much; it's not really possible to click a small icon this way. I need to spend some time on smoothing the cursor movement.
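One common way to smooth a jittery cursor is an exponential moving average over the raw hand position. Here is a minimal sketch of that idea (in Python rather than the project's C#, and not the project's actual code; the `alpha` value is a made-up starting point):

```python
class CursorSmoother:
    """Exponential moving average over 2D cursor positions.

    alpha close to 1.0 follows the hand quickly but keeps more jitter;
    alpha close to 0.0 is smooth but lags behind the hand.
    """

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.x = None
        self.y = None

    def update(self, raw_x, raw_y):
        if self.x is None:  # first sample: no history to blend with
            self.x, self.y = float(raw_x), float(raw_y)
        else:
            self.x += self.alpha * (raw_x - self.x)
            self.y += self.alpha * (raw_y - self.y)
        return self.x, self.y
```

The trade-off between lag and jitter is all in `alpha`; a fancier option would be a Kalman filter, but an average like this is often enough for cursor control.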

Thursday, March 17, 2011

Comparing with others

Today I did some Google research on what others can or can't do in my current field of interest (hand / finger tracking). Here is a summary of what I found:

1. One of the most impressive demos is the following:
Kinect Hand Detection  By Garrat Gallagher
It features image library scrolling, image selection, translation, scaling and rotation and looks quite like Minority Report.

2. This one is quite similar to what I do:
Palm position tracking [OpenCV]
There is not much description, but it seems to work well. I don't know how much of it is functionality that OpenCV already offers.

3. Not bad either:
kinect - fingertip detection 
Offers a brief description of how it's done. Point 3 "approximate contours" might be something I could try.

4. One of the first videos I had seen. Doesn't detect fingers though:
Multitouch with hacked Kinect

5. Similar technique as in my solution:
Microsoft kinect with Delphi -- realtime hand and fingertip detection
But thinning does not seem to be working too well.

6. Some more links

Tuesday, March 15, 2011

Using hand and finger tracking to move, resize & rotate images - reworked

With the fingertip detection in place, it's now a lot easier to recognize the "grab" gesture. The old version monitored the cluster size and, if it changed quickly, guessed that the hand had been closed. The new version uses the number of fingers that are detected in each frame to decide whether the hand is open or closed.
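A per-frame finger count can be made robust against single misdetected frames by voting over a short window before toggling the state. A sketch of that idea (Python, with made-up window size and threshold; not the project's code):

```python
from collections import deque

class GrabDetector:
    """Decides open vs. closed hand from per-frame finger counts.

    A short history of frames is kept so that a single bad detection
    does not toggle the grab state.
    """

    def __init__(self, window=5, open_threshold=2):
        self.window = deque(maxlen=window)
        self.open_threshold = open_threshold
        self.grabbing = False

    def update(self, finger_count):
        self.window.append(finger_count)
        # Only switch state once the whole window agrees.
        if all(c < self.open_threshold for c in self.window):
            self.grabbing = True
        elif all(c >= self.open_threshold for c in self.window):
            self.grabbing = False
        return self.grabbing
```

With a window of five frames at 30 FPS the state change lags by roughly 170 ms, which is the price paid for ignoring detection noise.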

Plus the hand's shape is now transferred to the image view:

I'm thinking of implementing a simple event based API for this.

Sunday, March 13, 2011

Fingertip Detection Part 2

In this post (Fingertip Detection Part 1) I presented the first version of my fingertip detection algorithm.

Since then I was able to refine it considerably. The key was to use the hand's contour and then combine this information with the points in the convex hull. Here is a video that showcases the current algorithm:

1. For each point in the hull (candidates for fingertips), find the nearest point in the contour curve. Let's call this set C.

2. For each point c in C, take the two points (p1, p2) in the two different directions along the contour that are at a given distance from c (the ideal distance has to be found experimentally and depends on the hand shape's size).

3. If these three points are aligned, then it's not a fingertip point. To find out if they are aligned, find the center of p1 and p2 and calculate the distance to c. If this distance is bigger than a certain value (to be found experimentally), the points are not on a line and the candidate point c is a fingertip point.
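Steps 2 and 3 can be sketched in a few lines (Python for illustration, not the project's C#; `step` and `min_offset` are placeholder values that would need the experimental tuning mentioned above):

```python
import math

def is_fingertip(contour, i, step=15, min_offset=8.0):
    """Candidate contour[i] counts as a fingertip if it sticks out
    from the midpoint of its two contour neighbours taken `step`
    points away in both directions along the (closed) contour."""
    n = len(contour)
    cx, cy = contour[i]
    p1 = contour[(i - step) % n]
    p2 = contour[(i + step) % n]
    mx = (p1[0] + p2[0]) / 2.0
    my = (p1[1] + p2[1]) / 2.0
    # Aligned points -> the midpoint lies close to c -> not a tip.
    return math.hypot(cx - mx, cy - my) > min_offset
```

This naturally works regardless of where the finger points, which matches the observation below that the fingers may also point down.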

c is a fingertip, while c' is not.
This version also works when the fingers are pointing down:

Next I will try to find skin areas with the RGB camera, then use these areas to find hand shapes. And the current code base needs some heavy refactoring.

Thursday, March 10, 2011

Thinning / Skeletonizing

While reading through some pattern recognition lecture notes from my university days, I had the idea that maybe it would be easy to apply a skeletonizing / thinning algorithm and then work with this reduced data. I made some tests with a variation of Hilditch's algorithm. For more details, you can read this article (in German).

The image on the left is the hand cluster area reduced to a foreground / background map. This is the data that the thinning algorithm uses. The image to its right is the result, where only lines remain.
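For illustration, here is a sketch of the closely related Zhang-Suen thinning algorithm (not the Hilditch variant used above, and in Python rather than the project's C#) operating on such a foreground / background map:

```python
def thin(grid):
    """Zhang-Suen thinning on a binary map (list of lists of 0/1).
    Iteratively peels boundary pixels until only a skeleton remains.
    The border of the map is assumed to be background."""
    h, w = len(grid), len(grid[0])

    def neighbours(y, x):
        # p2..p9 in the usual notation, clockwise starting north
        return [grid[y-1][x], grid[y-1][x+1], grid[y][x+1], grid[y+1][x+1],
                grid[y+1][x], grid[y+1][x-1], grid[y][x-1], grid[y-1][x-1]]

    changed = True
    while changed:
        changed = False
        for phase in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if not grid[y][x]:
                        continue
                    p = neighbours(y, x)
                    b = sum(p)  # number of foreground neighbours
                    # number of 0 -> 1 transitions around the ring
                    a = sum(p[k] == 0 and p[(k + 1) % 8] == 1 for k in range(8))
                    if phase == 0:
                        cond = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:
                        cond = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= b <= 6 and a == 1 and cond:
                        to_clear.append((y, x))
            for y, x in to_clear:
                grid[y][x] = 0
                changed = True
    return grid
```

Like the variant in the post, this is O(pixels) per pass with several passes per frame, which hints at why reaching 30 FPS is hard without shrinking the image first.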

My implementation is currently not fast enough to process 30 FPS in real-time. I had to shrink the target image and even now it's not really smooth. I'm not sure if this is the way to go.

Asus WAVI Xtion, a Kinect Alternative for the PC?

At CeBIT 2011 Asus presented the WAVI Xtion (see the press release) device. It seems to only offer depth data, no RGB camera. I haven't found anything about the price yet. It's also not clear whether the Xtion sensor can only be bought along with the WAVI wireless media streaming device.

From the press release: "ASUS plans to launch the Xtion Pro Developer Challenge in March" - Let's see whose SDK comes out earlier, Microsoft's or Asus's.

Tuesday, March 8, 2011

Fingertip Detection Part 1

This is a video of my first version of fingertip detection. It's based on points in the convex hull and the orientation of the hand (see previous post). It's still a bit shaky, especially around the thumb. And it only works when the fingers are fully stretched out. But it looks promising:

In the second video the hull, palm and finger points are transferred onto the canvas window. Here it's more obvious that this version is not quite ready to use.

The next step will be to detect whether the hand is open or closed and only try to find the fingers and orientation if it's at least partly open. And then of course improve the finger detection (maybe even finger direction detection).

Sunday, March 6, 2011

Hand Orientation Detection

On the way to implementing fingertip detection, I thought it would be a good idea to first know the hand's orientation. Here is what I tried:

Step 1: Find the convex hull of the hand shape (with the Graham Scan algorithm). This is the almost invisible white line around the hand in the video.

Step 2: Use linear regression to find the line function that minimizes the sum of distances to the points in the hull. The second, yellow line is just a helper to show all 4 directions.
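The two steps could be sketched like this (Python for illustration; the hull here uses the monotone-chain construction, a close relative of Graham Scan, and the fit is ordinary least squares, which minimizes vertical rather than perpendicular distances):

```python
def convex_hull(points):
    """Andrew's monotone chain: returns the hull of a set of (x, y)
    points in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0]-o[0]) * (b[1]-o[1]) - (a[1]-o[1]) * (b[0]-o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def orientation_line(hull):
    """Least-squares fit y = m*x + b over the hull points; m gives
    the dominant direction of the shape. A nearly vertical hand
    breaks this fit - one reason to switch to Rotating Calipers."""
    n = len(hull)
    sx = sum(p[0] for p in hull)
    sy = sum(p[1] for p in hull)
    sxx = sum(p[0] * p[0] for p in hull)
    sxy = sum(p[0] * p[1] for p in hull)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    return m, b
```

The second yellow helper line from the video would simply be this line rotated by 90 degrees through the same center.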

And here is the result:

For step 2 I will try the Rotating Calipers algorithm when I have time to implement it (I didn't find any free C# math library that would do this for me). This algorithm calculates the minimum area rectangle that contains all points of the hand shape.

Tuesday, March 1, 2011

Using hand tracking to move, resize & rotate images

The current version of my hand tracking program allows me to grab an image with the right hand and move it around. Grabbing with the left hand while resting the right hand on the image will trigger the "rotate and scale" mode.

The distance change between the two cluster centers (hand points) is used to scale the image, changing the angle between the points will rotate the image accordingly.
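That mapping from two tracked points to a transform is a small piece of geometry; a sketch of it (Python for illustration, not the project's C#):

```python
import math

def scale_and_rotation(prev_a, prev_b, cur_a, cur_b):
    """Derives a scale factor and a rotation delta (radians) from the
    previous and current positions of two tracked hand points."""
    def dist(p, q):
        return math.hypot(q[0] - p[0], q[1] - p[1])

    def angle(p, q):
        return math.atan2(q[1] - p[1], q[0] - p[0])

    scale = dist(cur_a, cur_b) / dist(prev_a, prev_b)
    rotation = angle(cur_a, cur_b) - angle(prev_a, prev_b)
    # Normalize to (-pi, pi] so a wrap-around doesn't spin the image.
    rotation = math.atan2(math.sin(rotation), math.cos(rotation))
    return scale, rotation
```

Applying the returned scale and rotation to the image each frame, relative to the last frame, accumulates into the gesture the video shows.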

Currently all objects closer than a certain threshold are tracked. The next step would be to improve the hand recognition, so hands could also be tracked outside this virtual plane. And the algorithm could be refined to track the fingers independently.
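The "everything closer than a threshold" step can be pictured as a depth threshold followed by a connected-component search. A simplified sketch (Python; the real clustering in the project may differ):

```python
from collections import deque

def find_clusters(depth, threshold, min_size=4):
    """Splits pixels closer than `threshold` into connected clusters.
    `depth` is a 2D list of distance values; returns one list of
    (y, x) pixels per cluster, e.g. one per hand in front of the
    virtual plane."""
    h, w = len(depth), len(depth[0])
    seen = [[False] * w for _ in range(h)]
    clusters = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or depth[sy][sx] >= threshold:
                continue
            # Breadth-first flood fill over 4-connected foreground pixels.
            cluster, queue = [], deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] \
                            and depth[ny][nx] < threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(cluster) >= min_size:  # drop speckle noise
                clusters.append(cluster)
    return clusters
```

The centers of the two largest clusters are then the two "hand points" used for moving, scaling and rotating above.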

Microsoft has announced that they will release an SDK for the Kinect sometime this spring (read more). Unfortunately, not many details are known yet, but I guess it will 'only' provide access to the Kinect sensor data and not contain any ready-to-use hand / full body tracking.