Windows Hello Rocks!!! Now Why can’t the Kinect for Windows Do This???

Let me start by repeating “Windows Hello Rocks!!!”.

For those of you who don’t quite know what this is: one of the new features of Windows 10 is the ability to get rid of passwords and use biometric sensors to recognize who you are.

Biometric sensors being fingerprint readers, iris and retina scanners, and of course depth cameras such as the Intel RealSense F200. The Intel depth camera is eerily similar to the Kinect for Windows cameras, v1 and v2, so hopefully those with Kinects can use this feature too in the near future. I know some of you may be asking what’s the big deal… Fingerprint readers have been around for 15 years or more. I know I used to have one when I worked for the Air Force Reserves as a Crystal Reports developer.

Well, the big news is that now you can use embedded cameras like the Intel RealSense F200 to simply have your face recognized securely, so you don’t need your finger anymore!!!

But I digress. Currently the Kinect is not supported, so I ask: why?

My only guess is that doing this requires changes in the driver architecture. The current Kinect driver is designed to run in user mode, and user mode loads AFTER a user is logged in. Using the Kinect for sign-in would therefore require a driver that runs in kernel mode, which loads before the user is logged in, allowing the device to operate outside the realm of user mode and be used for facial recognition at the login screen.

Well, I just got my Intel RealSense Development Kit in the mail. It contains the F200 camera along with the SDK and drivers for Windows 8.1 and Windows 10.

I installed the drivers and SDK on my Windows 8.1 machine and on my Surface Pro 3, which runs Windows 10 build 10240, and then attached the device. Windows recognized it perfectly. I followed the steps here.

Once complete, Windows Hello was working: “Look Ma, no hands,” no more passwords. Windows 10 recognized my face and only my face. I can sign in just by getting in front of the camera.

Great work Microsoft!!!

I’m presenting at the MVP Virtual Conference

image

For those who would like to hear about using the Microsoft Kinect for Windows v2 to detect facial expressions, I will be presenting a 50-minute online session on just that at the first-ever Microsoft MVP Virtual Conference.

Save the Date: May 14th @2pm – 2:50pm EST…

image

There will be many sessions, not just mine, with topics on Windows 10, Microsoft Azure, Microsoft Edge (formerly Project Spartan), SQL Server, Power BI, Office 365, Enterprise Mobility, Surface, PowerShell, Skype for Business, Hyper-V, System Center, OWIN, ASP.NET vNext, Unity 3D, AngularJS, Puppet, Microsoft Band, Xamarin, and much more.

If you’re interested in this free online event, register here:  http://mvp.microsoft.com/en-us/virtualconference.aspx 

Come support me and my fellow MVPs!!! Troll us or simply ask us questions. I hope to hear and see you there.

Denver Kinect + Microsoft Band + Unity Hackathon

DenverHackathon

Last year I went all around Colorado drumming up support for Kinect development by speaking at user groups, breakout sessions, and meetings. I’ve spoken in Fort Collins, Fort Carson, Boulder, Denver, Colorado Springs, Fountain, and many more areas. I’m glad to announce that, with all the support we’ve received, we are now holding a Kinect hackathon in Denver, Colorado, April 24th – 25th, 2015.

This event will bring the Microsoft Kinect engineers down for an all-night event, answering any questions you have about the device and how to program with it. The Microsoft Kinect team will also bring spare Kinect devices, Microsoft Bands, Unity Pro plugins, and the Unreal Engine 4 plugin, along with other surprises.

Kinect for Windows MVPs such as myself will also be there to help, tell you about things we’re working on and our experiences, and possibly even point you toward opportunities to get architecture and coding help.

This event is for designers, gamers, developers, enthusiasts, students, hobbyists, idealists, and medical researchers. If you have an idea you’d like to see built, or a game you want to play that is Kinect-enabled, this is the place to be.

Register for the upcoming #denver #hackathon today at: http://aka.ms/denverhack Space is limited! #kinect #k4wdev #k4wv2

So click on the link above to register. Don’t miss this opportunity to meet the best minds in the NUI, Kinect, and HoloLens world…

My Kinect told me I have Dark Olive Green Skin…

Did you know the Kinect for Windows v2 has the ability to determine your skin pigmentation and your hair color? Yes, I’m telling you the truth. One of the many features of the Kinect device is the ability to read the skin complexion and hair color of a person being tracked by the device.

If you ever need the ability to read a person’s skin complexion or determine the color of the hair on their head, this posting will show you how to do just that.

image

The steps are rather quick and simple. Determining the skin color requires you to access Kinect’s HD Face features.

Kinect has the ability to detect facial features in 3-D. This is known as “HD Face”. It can detect depth, height, and width. The Kinect can also use its high-definition camera to detect the red, green, and blue intensities that reflect back, and infer the actual skin tone of a tracked face. Along with the skin tone, the Kinect can also detect the hair color on top of a person’s head…

So What’s Your Skin Tone? Click Here to download the source code and try it out.

If you want to include this feature inside your application, the steps you must take are:

1. Create a new WPF or Windows 8.1 WPF application

2. Inside the new application, add a reference to the Microsoft.Kinect and Microsoft.Kinect.Face assemblies.

image

3. Let’s also make sure we set this up for the proper processor architecture. HD Face supports both 32-bit and 64-bit; I want to use 64-bit. Change your build settings to use a 64-bit configuration from the project properties in Visual Studio:

image

The above step is very important. You must choose either the x86 (32-bit) or x64 (64-bit) architecture and build accordingly. “Any CPU” won’t work as-is here, because the Kinect assemblies have exactly the same names but are compiled separately for each architecture. You can easily get a “BadFormat” (BadImageFormatException) error if you use the x86 version with a 64-bit build and vice versa.

4. Next, copy the correct version of the NuiDatabase folder from the Kinect Redist folder into your project’s \bin\x64\Debug folder. This step is also important. If you mismatch your versions by copying the x86 NuiDatabase into a 64-bit compiled application, you’ll start to see weird errors at runtime, such as failures to find the Microsoft.Kinect.Face assembly and “BadFormat” errors. So make sure you choose the correct architecture.

image
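If you’d rather not copy the folder by hand, a post-build event can do the copy on every build. Here’s a minimal sketch, assuming the SDK installer set the KINECTSDK20_DIR environment variable and you’re building x64 (swap x64 for x86 otherwise):

xcopy "$(KINECTSDK20_DIR)Redist\Face\x64\NuiDatabase" "$(TargetDir)NuiDatabase" /e /y /i /r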

Note: Optionally, you can also use the Kinect NuGet packages, which will basically do the right thing for you. However, you can’t mix and match: if you manually add references and then go back and add NuGet packages, things will quickly get out of sync:

image

5. Inside your code add the namespaces for Kinect and Kinect HD Face:

using Microsoft.Kinect;
using Microsoft.Kinect.Face;

6. Create some member variables to hold references to the objects we’ll need:

        private KinectSensor m_sensor;
        private BodyFrameReader m_bodyReader;
        private HighDefinitionFaceFrameReader m_hdFaceReader;
        private HighDefinitionFaceFrameSource m_hdFaceSource;
        private FaceModel m_faceModel;
        private FaceAlignment m_faceAlignment;
        private FaceModelBuilder m_faceBuilder;
        private ulong m_trackedBodyId;
        private bool m_faceBuilderStarted;
        private bool m_faceBuildComplete;

The m_sensor variable holds a reference to the Kinect device itself. We’ll use it to get access to the body frames, High Definition face frames, FaceModel, FaceModelBuilder, and the tracked person. The m_bodyReader is a frame reader used to determine when a body is being tracked; the Kinect sends 30 frames per second, and each frame can tell us whether a person is found within that frame of data. The m_hdFaceSource is the HD Face source, which keeps track of body tracking IDs and gives us access to the 30-frames-per-second stream of HD Face frames. The m_hdFaceReader is used as each HD Face frame is processed; it lets us get the 3-D face information (FaceModel) and listen for events that allow us to build a complete 180-degree view of the face. The m_faceModel holds the 3-D face measurements. The m_faceBuilder is used to build the 180-degree HD Face model, which will be stored in m_faceModel; it provides the internal mechanism for building a matrix of 3-D face depth values from the IR and color (RGB) information, which in turn lets us produce the complete face model with skin color and hair color. The m_faceBuilder also lets us listen for events that tell us when the tracked face needs to rotate left, rotate right, or tilt up, to make sure the complete matrix is built. The m_trackedBodyId is a tracking ID that synchronizes the tracked body with the HD Face source; without a synchronized tracked person, HD Face cannot perform its work. Lastly, there are two flag variables that help us keep track of when the face builder process has started and when it has completed.

Game Plan:

Overall, the application is going to initialize the Kinect sensor and set the variables to default values. It will then set up the BodyFrameReader to listen for body frames coming from the Kinect. Once a body frame is generated, we will determine whether a body is within the frame and whether that body is tracked. If the body is tracked, we will get its trackingId and set it on the HD Face source. Once the tracking ID is set on the HD Face source, HD Face frame events will start to fire. Once a valid HD Face frame arrives, we will start the face builder process and ask the face builder to begin building the 180-degree face model matrix. At this point the tracked user needs to turn their head slowly left and back to center, right and back to center, and up and down and back to center, until the face builder notifies us that it has finished building the matrix. Once complete, we ask the face builder to produce the 3-D face model. The 3-D face model then gives us access to the skin color, hair color, and the 3-D depth data and matrices.

7. Initialize the sensor: get an instance of your Kinect sensor, then initialize your bodyReader, hdFaceReader, faceModel, trackingId, and faceAlignment variables:

 public MainWindow()
        {
            InitializeComponent();
            InitializeKinect();
        }

        public void InitializeKinect()
        {
            m_sensor = KinectSensor.GetDefault();
            m_bodyReader = m_sensor.BodyFrameSource.OpenReader();
            m_bodyReader.FrameArrived += m_bodyReader_FrameArrived;
            
            m_hdFaceSource = new HighDefinitionFaceFrameSource(m_sensor);
            m_hdFaceReader = m_hdFaceSource.OpenReader();
            m_hdFaceReader.FrameArrived += m_hdFaceReader_FrameArrived;
            m_faceModel = new FaceModel();
            m_faceBuilder =
                m_hdFaceReader.HighDefinitionFaceFrameSource.OpenModelBuilder(FaceModelBuilderAttributes.HairColor 
                 | FaceModelBuilderAttributes.SkinColor);
            m_faceBuilder.CollectionCompleted += m_faceBuilder_CollectionCompleted;
            m_faceBuilder.CaptureStatusChanged += m_faceBuilder_CaptureStatusChanged;
            m_faceBuilder.CollectionStatusChanged += m_faceBuilder_CollectionStatusChanged;
            m_faceAlignment = new FaceAlignment();
            m_trackedBodyId = 0;
            m_faceBuilderStarted = false;
            m_faceBuildComplete = false;
            m_sensor.Open();
        }

8. Inside the BodyReader_FrameArrived event handler, add code to determine when the Kinect tracks a body. Once the Kinect finds a tracked body, set the TrackingId on the hdFaceReader source:

void m_bodyReader_FrameArrived(object sender, BodyFrameArrivedEventArgs e)
        {
            using (var bodyFrame = e.FrameReference.AcquireFrame())
            {
                if (null != bodyFrame)
                {
                    Body[] bodies = new Body[bodyFrame.BodyCount];
                    bodyFrame.GetAndRefreshBodyData(bodies);
                    foreach (var body in bodies)
                    {
                        if (body.IsTracked)
                        {
                            m_trackedBodyId = body.TrackingId;
                            m_hdFaceReader.HighDefinitionFaceFrameSource.TrackingId = m_trackedBodyId;
                        }
                    }
                }
            }
        }

9. Once the trackingId is set on the HD Face frame source, the HD Face FrameArrived events will start firing. Just check the flag and start the face builder process:

void m_hdFaceReader_FrameArrived(object sender, HighDefinitionFaceFrameArrivedEventArgs e)
        {
            if (!m_faceBuilderStarted)
            {
                // Kick off the face data collection only once
                m_faceBuilder.BeginFaceDataCollection();
                m_faceBuilderStarted = true;
            }
        }

 

10. In the faceBuilder_CollectionStatusChanged event handler, just listen for a Complete status. This lets us set our flag indicating that all the face views have been correctly captured and that we can ask the face builder to give us the model:

        void m_faceBuilder_CollectionStatusChanged(object sender, FaceModelBuilderCollectionStatusChangedEventArgs e)
        {
            var collectionStatus = e.PreviousCollectionStatus;
            switch (collectionStatus)
            {
                case FaceModelBuilderCollectionStatus.Complete:
                    lblCollectionStatus.Text = "CollectionStatus: Complete";
                    m_faceBuildComplete = true;
                    break;
            }
        }

11. In the faceBuilder_CollectionCompleted event handler, check the collection status to make sure it’s complete, check your flag to make sure it’s set, and then ask the face builder to produce the FaceModel using the event argument. The FaceModel provides access to the skin color and hair color as unsigned integers (UINT). To make these actual drawing colors, we’ll need to convert each UINT to a Color structure. The Color structure can be created using some old-school bit shifting; see below.

private void m_faceBuilder_CollectionCompleted(object sender, FaceModelBuilderCollectionCompletedEventArgs e)
        {
            var status = m_faceBuilder.CollectionStatus;
            //var captureStatus = m_faceBuilder.CaptureStatus;
            if (status == FaceModelBuilderCollectionStatus.Complete && m_faceBuildComplete)
            {
                try
                {
                    m_faceModel = e.ModelData.ProduceFaceModel();
                }
                catch (Exception ex)
                {
                    lblCollectionStatus.Text = "Error: " + ex.ToString();
                    lblStatus.Text = "Restarting...";
                    m_faceBuildComplete = false;
                    m_faceBuilderStarted = false;
                    m_sensor.Close();
                    System.Threading.Thread.Sleep(1000);
                    m_sensor.Open();
                    return;
                }
                    var skinColor = UIntToColor(m_faceModel.SkinColor);
                    var hairColor = UIntToColor(m_faceModel.HairColor);

                    var skinBrush = new SolidColorBrush(skinColor);
                    var hairBrush = new SolidColorBrush(hairColor);

                    skinColorCanvas.Background = skinBrush;
                    lblSkinColor.Text += " " + skinBrush.ToString();

                    hairColorCanvas.Background = hairBrush;
                    lblHairColor.Text += " " + hairBrush.ToString();

                    m_faceBuilderStarted = false;
                    m_sensor.Close();
            }
        }
        private Color UIntToColor(uint color)
        {
            // The packed value is read as A, B, G, R from the high byte down,
            // rather than the A, R, G, B order Color.FromArgb expects, so pull
            // each byte out and reorder. Alpha is pinned to 250 (nearly opaque).
            byte a = (byte)(color >> 24);
            byte b = (byte)(color >> 16);
            byte g = (byte)(color >> 8);
            byte r = (byte)(color >> 0);
            return Color.FromArgb(250, r, g, b);
        }

 

12. Lastly, add the WPF labels and Canvas elements to your app so you can actually see something:

<Window x:Class="KinectFindingSkinTone.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="350" Width="525">
    <Window.Resources>
        <SolidColorBrush x:Key="MediumGreyBrush" Color="#ff6e6e6e" />
        <SolidColorBrush x:Key="KinectPurpleBrush" Color="#ff52318f" />
        <SolidColorBrush x:Key="KinectBlueBrush" Color="#ff00BCF2" />
    </Window.Resources>

    <Grid Background="White" Margin="10 0 10 0">

        <StackPanel Margin="20">
            <TextBlock x:Name="lblCollectionStatus"  Text="CollectionStatus: " Foreground="{StaticResource KinectBlueBrush}" FontSize="20" />
            <TextBlock x:Name="lblStatus"  Text="FrameStatus: " Foreground="{StaticResource KinectBlueBrush}" FontSize="20" />

            <TextBlock x:Name="lblSkinColor"  Text="Skin Color: " Foreground="{StaticResource KinectBlueBrush}" FontSize="20" />
                       <Border BorderBrush="Black"><Canvas Width="300" Height="100"  x:Name="skinColorCanvas" Background="DarkGray"></Canvas></Border>
            
            <TextBlock x:Name="lblHairColor"  Text="Hair Color: " Foreground="{StaticResource KinectBlueBrush}" FontSize="20" />
                <Border BorderBrush="Black">
            <Canvas Width="300" Height="100" x:Name="hairColorCanvas" Background="DarkGray"></Canvas>
                </Border>
        </StackPanel>
    </Grid>
</Window>

Once your application runs, it should look similar to this (minus the FrameStatus):

image

Try it out on your own.

Using Kinect HD Face to make the MicroHeadGesture Library

Currently, I am working on a medical project which requires detection of Head Nods (in agreement), Head Shakes (in disagreement), and Head Rolls (Asian/East Indian head gesture for agreement) within a computer application.

Being that I work with the Kinect for Windows device, I figured this device is perfect for this type of application.

This posting serves as an explanation of how I built this library, the algorithm used, and how I used the Kinect device and Kinect for Windows SDK to implement it.

Before we get into the Guts of how this all works, let’s talk about why the Kinect is the device that is perfect for this type of application.

The Kinect v2.0 device has many capabilities, one of which allows it to capture a person’s face in 3-D… that is, three dimensions:

image

Envision the Z-axis arrow pointing straight out towards you in one direction, and out towards the back of the monitor/screen in the other direction.

In Kinect terminology, this feature is called HD Face. In HD Face, the Kinect can track the eyes, mouth, nose, eyebrows, and other specific things about the face when a person looks towards the Kinect camera.

image

So envision a person’s face tracked in 3-D.

image

We can measure the height, width, and depth of a face. Not only can we measure 3-D values and coordinates on various axes; with a little math and engineering we can also measure movements and rotations over time.

Think about normal head movements for a second. We as humans twist and turn our heads for various reasons. One such reason is proper driving technique: we twist and turn our heads when driving, looking for other cars on the road. We look up at the sky on beautiful days. We look down at the floor when we drop things. We even slightly nod our heads in agreement, and shake our heads in disgust.

Question: So from a technical perspective what does this movement look like?

Answer: When a person moves their head, the head rotates around a particular axis: either the X, Y, or Z axis, or even some combination of the three. This rotation is perceived from a point on the head. For our purposes, let’s use the nose as the point of perspective.

image

When a person nods their head, the nose rotates around the X-axis in a small up-and-down manner. A head nod makes the Y-coordinate values of the nose point go up and down.

When a person shakes their head, the nose rotates around the Y-axis in a small left-and-right manner. A head shake makes the X-coordinate values of the nose point go from side to side.

If we were to graph Nods and Shakes over time, their Y and X graphs would look like this:

image

Question: So great, we have a graph of head nods and head shakes… How do we get the Y, X, and rotation values from the head?

Answer: Luckily for us, the Kinect for Windows SDK provides us engineers with the HD Face coordinates in 3-D. That is, we get the X, Y, and Z coordinates of a face. With a little linear algebra and vector math, we can also derive rotational data from this. HD Face gives us the facial orientation, and also head pivot data.
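For example, HD Face exposes the head orientation as a quaternion through FaceAlignment.FaceOrientation, and that quaternion can be converted into pitch (X), yaw (Y), and roll (Z) angles in degrees. Here’s a small sketch along the lines of the conversion used in the Kinect face samples; the method name is my own:

        private static void ExtractFaceRotationInDegrees(Vector4 q, out double pitch, out double yaw, out double roll)
        {
            double x = q.X, y = q.Y, z = q.Z, w = q.W;

            // Convert the face orientation quaternion into Euler angles, in degrees
            pitch = Math.Atan2(2 * ((y * z) + (w * x)), (w * w) - (x * x) - (y * y) + (z * z)) / Math.PI * 180.0;
            yaw = Math.Asin(2 * ((w * y) - (x * z))) / Math.PI * 180.0;
            roll = Math.Atan2(2 * ((x * y) + (w * z)), (w * w) + (x * x) - (y * y) - (z * z)) / Math.PI * 180.0;
        }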

Question: Now we’re getting somewhere, so exactly how do you calculate Head Nods/Shakes/Rolls with the Kinect?

Answer: Well, it takes a little creativity, and some help from researchers in Japan (Shinjiro Kawato and Jun Ohya), who figured out the mathematical formula for deriving head position deviations.

So my implementation is based in part on this paper. Instead of the point “between the eyes”, I decided to use the nose, since the Kinect gives me this information fairly easily.

The implementation concept is simple.

First, let’s assume from the research paper that a typical nod/shake/roll lasts about 1 to 1.4 seconds.

Next, let’s take as fact that the Kinect produces 30 frames per second, and that as long as a person is facing the camera, the majority of those frames will produce an HD Face frame for us (at least approximately 15-20 fps).

Therefore, if I capture about 1-1.5 seconds of frames, I can determine head rotations and coordinates (X, Y, and Z), derive the rotation in angles, and store this data in a state machine for each measured frame.

I can then classify each measured frame as “Extreme”, “Stable”, or “Transient”, based on the algorithm provided by Kawato and Ohya.
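To make that concrete, here’s a rough sketch of how such a per-frame classification could look. The state names come from the paper, but the rule and threshold values below are simplified placeholders of my own, not the exact algorithm:

        // Simplified per-frame state classification (illustrative thresholds only)
        private enum FrameState { Stable, Transient, Extreme }

        private FrameState ClassifyFrame(double deltaAngleInDegrees)
        {
            // deltaAngleInDegrees: change in the nose rotation angle since the previous frame
            double magnitude = Math.Abs(deltaAngleInDegrees);

            if (magnitude < 0.2) return FrameState.Stable;    // head essentially still
            if (magnitude < 1.0) return FrameState.Transient; // head in motion
            return FrameState.Extreme;                        // fast swing / turning point of a nod or shake
        }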

I then use a delayed 5-frame buffer and evaluate the set of states for the last 3 of the 5 buffered frames.

The next thing I do is continue applying Kawato and Ohya’s algorithm to figure out when, and precisely how, to check for head nods/shakes/rolls inside my buffered frame states.

The mechanism for deciding when to check is simple as well: if the current frame state changes from a non-stable state to “Stable”, I go and evaluate for nods/shakes/rolls.

The evaluation is also simple. During the evaluation process, if the preceding frame states contain more than 2 adjacent “Extreme” states, I check whether all of those adjacent states have nose rotation angles greater than a configurable threshold (1 degree by default). Depending on which axis it is (Y for nods, X for shakes, Z for rolls), I raise an event that the appropriate head action occurred.
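Putting the last few paragraphs together, here’s a minimal sketch of that evaluation step. The helper name and parameters are hypothetical (the real library raises per-axis events rather than returning a bool), and it reuses the FrameState enum sketched above:

        // Evaluate the buffered frame states for one axis (Y for nods, X for shakes, Z for rolls)
        private bool IsHeadGestureDetected(IList<FrameState> states, IList<double> angles, double thresholdInDegrees = 1.0)
        {
            // Only evaluate when the newest frame has just settled back into a Stable state
            if (states.Count < 4 || states[states.Count - 1] != FrameState.Stable)
                return false;

            // Walk backwards through the preceding frames, counting adjacent Extreme states
            int adjacentExtremes = 0;
            bool allAboveThreshold = true;
            for (int i = states.Count - 2; i >= 0 && states[i] == FrameState.Extreme; i--)
            {
                adjacentExtremes++;
                allAboveThreshold &= Math.Abs(angles[i]) > thresholdInDegrees;
            }

            // More than 2 adjacent Extreme states, all above the rotation threshold => gesture detected
            return adjacentExtremes > 2 && allAboveThreshold;
        }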

Here’s a graphical view of the process flow:

image

image
image

Frame state depiction:

image

 

If you’re interested in testing out this library, please contact me here through this blog.

Here’s the library and a sample Windows 8.1 Store application using the library in action. In the picture below, I have updated the HD Face Basics XAML sample for visualization. As the HD Face mesh head nods and shakes, I show the confidence of a head nod or head shake. On the left is Kinect Studio with a recorded clip of me testing the application.

image

!Happi Kinecting!

Kinect HIG v2.0 Posted

Just a quick reminder for those who will be developing applications for the Kinect for Windows v2: the Kinect Human Interface Guidelines v2.0 were released back on October 30, 2014. I originally missed this, so I’m posting it here on my blog so I can find it again.

It contains 140 pages of recommendations, specifics, and best practices on how to place, use, develop for, and interact with Kinect-enabled applications.

You can get the latest version of the guide here: http://download.microsoft.com/download/6/7/6/676611B4-1982-47A4-A42E-4CF84E1095A8/KinectHIG.2.0.pdf

Awarded MVP 2015

This morning I awoke to an email:

Microsoft MVP Banner
Dear Dwight Goins,
Congratulations! We are pleased to present you with the 2015 Microsoft® MVP Award! This award is given to exceptional technical community leaders who actively share their high quality, real world expertise with others. We appreciate your outstanding contributions in Kinect for Windows technical communities during the past year.

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 23,000 times in 2014. If it were a concert at Sydney Opera House, it would take about 9 sold-out performances for that many people to see it.

Click here to see the complete report.