Graphics, Vision & Video

Inertial Depth Tracker Dataset (IDT 13)

In recent years, the availability of inexpensive depth cameras, such as the Microsoft Kinect, has boosted research in monocular full-body skeletal pose tracking. Unfortunately, existing trackers often fail to capture poses where a single camera provides insufficient data, such as non-frontal poses and other poses with body part occlusions. With this dataset, we provide the means to evaluate a monocular depth tracker against ground-truth joint positions that were obtained with an optical marker-based system and are calibrated with respect to the depth images. Furthermore, the dataset contains the readings of six inertial sensors worn by the person. This enables the development and testing of trackers that fuse information from these two complementary sensor types.


Terms of Use

The Inertial Depth Tracker Dataset (IDT 13) is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License by MPI Informatik. Any work created using this dataset must cite publication [1].



S1 S2 S3 S4 S5 S6
7z 7z 7z 7z 7z 7z

The sequences are compressed using 7zip [Homepage]. To visualize or convert the sequence files, you can use the Matlab files provided here. Code to calculate the joint tracking error ([1], Section 8.2) yourself is also included. If you want to download the data, please write an email to gvvperfcapeva [at]


The following table shows the average joint tracking error and, in parentheses, the standard deviation in millimeters for each of the six sequences. The error function is described in [1], Section 8.2. Please note that sequences S5 and S6 are much harder to track than the other sequences, since they mostly contain non-frontal motions in which large parts of the body are occluded. The best results in each category are shown in bold.

Tracker            S1           S2           S3           S4           S5             S6
Kinect SDK         32.9 (27.2)  39.6 (24.7)  40.3 (28.5)  28.8 (26.5)  76.6 (55.1)    108.9 (108.7)
HeltenICCV2013[1]  35.7 (24.9)  47.4 (31.4)  44.4 (33.8)  34.7 (25.4)  59.1 (45.3)    56.2 (41.6)
BaakICCV2011[2]    35.2 (30.1)  47.7 (32.4)  52.2 (28.2)  49.7 (57.0)  185.6 (145.2)  175.4 (138.1)
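As a rough illustration of how a "mean (standard deviation)" entry like those in the table could be computed, the following Python sketch averages the Euclidean distance between predicted and ground-truth joint positions over all joints and frames. This is only an assumption about the metric: the actual error function is the one defined in [1], Section 8.2, and implemented in the provided Matlab code, and it may differ (for example in which joints are included or how the deviation is normalized). The function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def joint_errors(predicted, ground_truth):
    """Return the Euclidean distance for every joint in every frame.

    Both arguments are lists of frames; each frame is a list of
    (x, y, z) joint positions, assumed here to be in millimeters.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("sequences must have the same number of frames")
    errors = []
    for pred_frame, gt_frame in zip(predicted, ground_truth):
        if len(pred_frame) != len(gt_frame):
            raise ValueError("frames must have the same number of joints")
        for p, g in zip(pred_frame, gt_frame):
            errors.append(math.dist(p, g))
    return errors


def summarize(errors):
    """Return (mean, population standard deviation) of the errors.

    Using the population standard deviation is an assumption; the
    source does not state which variant was used.
    """
    return mean(errors), pstdev(errors)
```

For example, a single frame with one perfectly tracked joint and one joint that is 5 mm off yields a mean error of 2.5 mm with a standard deviation of 2.5 mm.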


If you want to upload your own tracking results, you can use the form below. Please upload your joint tracking result for sequence S1 as File 1, for sequence S2 as File 2, and so on. The files must be plain text with the following structure:

joint1_x joint1_y joint1_z 1 0 0 0 joint2_x joint2_y joint2_z 1 0 0 0 ... joint16_x joint16_y joint16_z 1 0 0 0

The semantic meaning of joint1, joint2, and so on can be seen below. Note that we use the same joint semantics as the Kinect SDK's joint tracker.


Each file contains one line per frame of the sequence. Example files can be found in the sequence files above. Please always state a publication associated with the data; otherwise the result will not be published under results.
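The result file layout described above (16 joints per frame, each written as x y z followed by the constant values 1 0 0 0, one frame per line) could be produced and read back with a short Python sketch like the one below. The function names and the number formatting are assumptions made for illustration; the meaning of the trailing 1 0 0 0 is not explained here, so the sketch simply copies it literally. When in doubt, compare the output against the example files shipped with the sequences.

```python
NUM_JOINTS = 16          # joints per frame, as in the structure above
PADDING = "1 0 0 0"      # constant values that follow each joint position
VALUES_PER_JOINT = 7     # x, y, z plus the four padding values


def format_frame(joints):
    """Format one frame (16 (x, y, z) tuples) as a single text line."""
    if len(joints) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
    return " ".join(f"{x} {y} {z} {PADDING}" for x, y, z in joints)


def parse_frame(line):
    """Parse one line back into a list of 16 (x, y, z) tuples."""
    values = line.split()
    if len(values) != NUM_JOINTS * VALUES_PER_JOINT:
        raise ValueError("unexpected number of values in line")
    return [
        tuple(float(v) for v in values[i:i + 3])
        for i in range(0, len(values), VALUES_PER_JOINT)
    ]


def write_results(path, frames):
    """Write one line per frame to a plain-text result file."""
    with open(path, "w") as f:
        for joints in frames:
            f.write(format_frame(joints) + "\n")
```

Calling write_results once per sequence would then give one text file for each of S1 to S6.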

Last modified: 23 May 2018 13:01:35