Graphics, Vision & Video

Inertial Depth Tracker Dataset (IDT 13)

In recent years, the availability of inexpensive depth cameras, such as the Microsoft Kinect, has boosted research in monocular full-body skeletal pose tracking. Unfortunately, existing trackers often fail to capture poses where a single camera provides insufficient data, such as non-frontal poses and other poses with body part occlusions. With this dataset, we provide the means to evaluate a monocular depth tracker against ground-truth joint positions that were obtained with an optical marker-based system and are calibrated with respect to the depth images. Furthermore, the dataset contains the readings of six inertial sensors worn by the person. This enables the development and testing of trackers that fuse information from these two complementary sensor types.

Publications

If you use this dataset, you are required to cite the following publication.

[1] Thomas Helten, Meinard Müller, Hans-Peter Seidel, Christian Theobalt
Real-time Body Tracking with One Depth Camera and Inertial Sensors
Proceedings of the International Conference on Computer Vision (ICCV), 2013.
[bib] [pdf] [pdf+] [vid]

Terms of Use

The Inertial Depth Tracker Dataset (IDT 13) is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License by MPI Informatik. Any work created using the aforementioned dataset must cite publication [1].

Downloads

Sequences

S1	S2	S3	S4	S5	S6
7z	7z	7z	7z	7z	7z

The sequences are compressed using 7zip [Homepage]. To visualize or convert the sequence files, you can use the Matlab files provided here. Code is also included to calculate the joint tracking error ([1], Section 8.2) yourself. If you want to download the data, please write an email to gvvperfcapeva [at] mpi-inf.mpg.de.

Results

The following table shows the average joint tracking error and, in parentheses, the standard deviation in millimeters for each of the six sequences. The error function used is described in [1], Section 8.2. Please note that sequences S5 and S6 are much harder to track than the other sequences, since they mostly contain non-frontal motions in which large parts of the body are occluded. The best results in each category are shown in bold.

Tracker	S1	S2	S3	S4	S5	S6
Kinect SDK	32.9 (27.2)	39.6 (24.7)	40.3 (28.5)	28.8 (26.5)	76.6 (55.1)	108.9 (108.7)
HeltenICCV2013[1]	35.7 (24.9)	47.4 (31.4)	44.4 (33.8)	34.7 (25.4)	59.1 (45.3)	56.2 (41.6)
BaakICCV2011[2]	35.2 (30.1)	47.7 (32.4)	52.2 (28.2)	49.7 (57.0)	185.6 (145.2)	175.4 (138.1)

[1] Thomas Helten, Meinard Müller, Hans-Peter Seidel, Christian Theobalt
Real-time Body Tracking with One Depth Camera and Inertial Sensors
Proceedings of the International Conference on Computer Vision (ICCV), 2013.
[2] Andreas Baak, Meinard Müller, Gaurav Bharaj, Hans-Peter Seidel, Christian Theobalt
A Data-Driven Approach for Real-Time Full Body Pose Reconstruction from a Depth Camera
IEEE International Conference on Computer Vision (ICCV), 2011.
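
As a rough illustration of how a mean error and standard deviation like those in the table could be computed, the following Python sketch averages the Euclidean distance between tracked and ground-truth joint positions over all joints and frames. This is an assumption made for illustration only: the actual error function is the one defined in [1], Section 8.2, and implemented in the provided Matlab code, and it may differ in which joints are compared and how values are aggregated. The function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def joint_errors(tracked, reference):
    """Return the Euclidean distance for every (frame, joint) pair.

    Both arguments are lists of frames; each frame is a list of (x, y, z)
    tuples in the same joint order and the same unit (e.g. millimeters).
    """
    if len(tracked) != len(reference):
        raise ValueError("sequences differ in number of frames")
    errors = []
    for frame_t, frame_r in zip(tracked, reference):
        if len(frame_t) != len(frame_r):
            raise ValueError("frames differ in number of joints")
        for p, q in zip(frame_t, frame_r):
            errors.append(math.dist(p, q))
    return errors


def summarize(tracked, reference):
    """Return (mean, standard deviation) of the per-joint distances.

    Assumption: population standard deviation over all frames and joints;
    the evaluation in [1] may aggregate differently.
    """
    errors = joint_errors(tracked, reference)
    return mean(errors), pstdev(errors)
```

For official comparisons, the Matlab evaluation code shipped with the sequences remains the reference.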

Upload

If you want to upload your own tracking results, you can use the form below. Please upload your joint tracking result for Sequence S1 as File 1, Sequence S2 as File 2, and so on. The files must be plain text with the following structure:

joint1_x joint1_y joint1_z 1 0 0 0 joint2_x joint2_y joint2_z 1 0 0 0 ... joint16_x joint16_y joint16_z 1 0 0 0

The semantic meaning of joint1, joint2, and so on is listed below. Note that we use the same joint semantics as the Kinect SDK's joint tracker.

joint1	HIP_CENTER
joint2	SPINE
joint3	SHOULDER_CENTER
joint4	HEAD
joint5	SHOULDER_LEFT
joint6	ELBOW_LEFT
joint7	WRIST_LEFT
joint8	SHOULDER_RIGHT
joint9	ELBOW_RIGHT
joint10	WRIST_RIGHT
joint11	HIP_LEFT
joint12	KNEE_LEFT
joint13	ANKLE_LEFT
joint14	HIP_RIGHT
joint15	KNEE_RIGHT
joint16	ANKLE_RIGHT

Each file contains one line per frame of the sequence. Example files can be found in the sequence files above. Please also always state a publication associated with the data; otherwise, the result will not be published under Results.
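
The file structure described above can be sketched in Python as follows. This is a minimal sketch, not an official tool: the function names are hypothetical, single spaces are assumed as separators, and the constant fields 1 0 0 0 after each joint position are copied verbatim from the template without interpreting them. Coordinate units and number formatting should be checked against the example files in the sequence archives.

```python
from typing import List, Sequence, Tuple

# Joint order as listed in the table above (joint1 ... joint16).
JOINT_NAMES = [
    "HIP_CENTER", "SPINE", "SHOULDER_CENTER", "HEAD",
    "SHOULDER_LEFT", "ELBOW_LEFT", "WRIST_LEFT",
    "SHOULDER_RIGHT", "ELBOW_RIGHT", "WRIST_RIGHT",
    "HIP_LEFT", "KNEE_LEFT", "ANKLE_LEFT",
    "HIP_RIGHT", "KNEE_RIGHT", "ANKLE_RIGHT",
]
FIELDS_PER_JOINT = 7  # x y z followed by the constant fields 1 0 0 0

Point = Tuple[float, float, float]


def format_frame(joints: Sequence[Point]) -> str:
    """Format one frame (16 joint positions) as a single text line."""
    if len(joints) != len(JOINT_NAMES):
        raise ValueError(f"expected {len(JOINT_NAMES)} joints, got {len(joints)}")
    fields: List[str] = []
    for x, y, z in joints:
        fields += [repr(float(x)), repr(float(y)), repr(float(z)), "1", "0", "0", "0"]
    return " ".join(fields)


def parse_frame(line: str) -> List[Point]:
    """Read the joint positions back from one line, skipping the constant fields."""
    values = line.split()
    if len(values) != FIELDS_PER_JOINT * len(JOINT_NAMES):
        raise ValueError(f"unexpected number of fields: {len(values)}")
    return [
        (float(values[i]), float(values[i + 1]), float(values[i + 2]))
        for i in range(0, len(values), FIELDS_PER_JOINT)
    ]


def write_result(path: str, frames: Sequence[Sequence[Point]]) -> None:
    """Write a tracking result with one line per frame."""
    with open(path, "w") as f:
        for joints in frames:
            f.write(format_frame(joints) + "\n")
```

A tracker's output for Sequence S1 would then be written with something like `write_result("S1.txt", frames)`, where `frames` holds one list of 16 positions per depth frame; the file name is only an example.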