>Frame: the frame number as analysed from the video feed. The analysed video is divided
into single frames, and each frame is given a number in the sequence: the first frame
analysed is frame 1, the second is frame 2, and so on. Only frames that produce an analysis
are saved to the CSV; if a frame does not produce an analysis, it is skipped.
>Timestamp: the time when the frame was analysed, according to the PC clock. The time is
expressed as a Unix timestamp, one of the most widely used formats for expressing time.
Online services can convert a Unix timestamp to a human-readable format, and Excel, other
tools, and programming languages can do the same. One frame corresponds to one moment in
time and therefore appears under only one timestamp, but multiple frames can share one
timestamp: since the timestamp is expressed in seconds, at a frame rate of 10 FPS up to 10
frames will carry the same timestamp.
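As a sketch of how the Timestamp field can be converted, the snippet below turns a Unix timestamp (the example value is invented, not taken from a real capture) into a human-readable UTC string using Python's standard library:

```python
from datetime import datetime, timezone

# Hypothetical example timestamp in seconds, as it would appear in the CSV.
ts = 1617184800

# Convert to a human-readable UTC string.
readable = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
print(readable)  # → 2021-03-31 10:00:00
```

Excel can achieve the same with a formula such as `=ts/86400 + DATE(1970,1,1)` formatted as a date.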
>ID: the unique ID assigned by the toolkit to a detected person. If a person with a given ID
appears in different frames, it is the same person, with the exception of ID 0 (zero). ID 0
means that the system does not yet have enough information to determine whether the face
belongs to a person analysed previously. By default, at least 3 frames are needed to assign
an ID to a person; until then, ID 0 is given to all detected faces that cannot be
'recognized'. The EvidenceLevel setting in data/settings.ini can be changed so that an ID is
assigned immediately or after a different number of frames, but lowering this number reduces
the confidence with which the same person is tracked. A unique visitor is identified by its
ID. Keep in mind that there is a time limit within which the system can assign the same ID
to the same person: if a person is detected again after (for instance) one hour, they will
be given a different ID.
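Because ID 0 marks faces that have not yet been identified, it must be excluded when counting unique visitors. A minimal sketch, using a made-up CSV excerpt whose column names follow this document:

```python
import csv
import io

# Hypothetical CSV excerpt; the values are invented for illustration.
sample = """Frame,Timestamp,ID
1,1617184800,0
2,1617184800,0
3,1617184801,7
4,1617184801,7
5,1617184802,8
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# ID 0 means "not yet identified", so it is excluded from the visitor count.
unique_visitors = {row["ID"] for row in rows if row["ID"] != "0"}
print(len(unique_visitors))  # → 2
```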
>Age: the estimated age per detection, ranging from 0 to 99 years.
>Gender: the confidence that a detection belongs to a certain gender category. The
confidence ranges from -99 to 99, where -99 indicates high confidence that the subject is
male and 99 high confidence that the subject is female. The closer the value is to 0, the
more uncertain the estimation.
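One way to map this signed confidence to a label is to treat values near 0 as uncertain. The threshold of 30 below is an arbitrary choice for this sketch, not a value prescribed by the toolkit:

```python
def gender_label(confidence: int, threshold: int = 30) -> str:
    """Map the -99..99 gender confidence to a label.

    The threshold is an arbitrary cut-off for this sketch; values close
    to 0 are treated as uncertain.
    """
    if confidence <= -threshold:
        return "male"
    if confidence >= threshold:
        return "female"
    return "uncertain"

print(gender_label(-85))  # → male
print(gender_label(12))   # → uncertain
print(gender_label(99))   # → female
```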
>Ethnicity: the ethnicity estimations are grouped into four categories: African, Asian,
Caucasian and Hispanic. Each numbered entry represents the estimation for one ethnicity
category per face detection. This feature is no longer part of Toolkit v5.x; however, the
CSV field has been kept for backward compatibility.
>Headpose: the direction in which a person is looking, represented by the head pose as
pitch, yaw and roll in degrees.
>Viewing: indicates whether the subject looked towards the point of interest (e.g. camera,
digital screen, etc.). 1 stands for 'yes' and 0 for 'no'. Subjects who were detected but
never looked towards the point of interest are not counted as viewers.
>Headgaze: the position of a person's gaze on the point of interest, expressed as a point
with the camera as the origin.
>Attention: This number represents the amount of time spent looking toward the point of
interest. The time is expressed in milliseconds.
>Interest: the aggregate value of multiple people looking at the same point of interest.
The higher the number, the more people are looking towards the same point of interest.
>Mood: the amount of positive facial expression, ranging from 0 to 100.
>Happiness: one of the 6 basic facial expressions, ranging from 0 to 100 where 100 is the
maximum happiness index.
>Surprise: one of the 6 basic facial expressions, ranging from 0 to 100 where 100 is the
maximum surprise index.
>Anger: one of the 6 basic facial expressions, ranging from 0 to 100 where 100 is the
maximum anger index.
>Disgust: one of the 6 basic facial expressions, ranging from 0 to 100 where 100 is the
maximum disgust index.
>Fear: one of the 6 basic facial expressions, ranging from 0 to 100 where 100 is the
maximum fear index.
>Sadness: one of the 6 basic facial expressions, ranging from 0 to 100 where 100 is the
maximum sadness index.
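Since all six expression fields share the same 0 to 100 scale, the dominant expression in a row can be found by taking the maximum. A sketch with invented values, using the field names from this document:

```python
# Hypothetical expression values for one CSV row.
row = {"Happiness": 72, "Surprise": 10, "Anger": 3,
       "Disgust": 1, "Fear": 2, "Sadness": 5}

# The dominant expression is the field with the highest index.
dominant = max(row, key=row.get)
print(dominant)  # → Happiness
```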
>Face_x: top left x coordinate in pixels of the face location.
>Face_y: top left y coordinate in pixels of the face location.
>Face_width: the width in pixels of the rectangle enclosing the detected face.
>Face_height: the height in pixels of the rectangle enclosing the detected face.
>left_eye_x: x coordinate in pixels of the left eye position.
>left_eye_y: y coordinate in pixels of the left eye position.
>right_eye_x: x coordinate in pixels of the right eye position.
>right_eye_y: y coordinate in pixels of the right eye position.
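The face rectangle is given as a top-left corner plus width and height, all in pixels. The sketch below (all values hypothetical) derives the bottom-right corner and checks that an eye coordinate falls inside the face rectangle, which is a useful sanity check when post-processing the CSV:

```python
# Hypothetical values for one detection; field names follow this document.
face_x, face_y = 240, 120          # top-left corner of the face box
face_width, face_height = 96, 96   # size of the face box in pixels
left_eye = (268, 156)              # (left_eye_x, left_eye_y)

# Bottom-right corner, useful when drawing the rectangle.
bottom_right = (face_x + face_width, face_y + face_height)
print(bottom_right)  # → (336, 216)

# Sanity check: the eye should fall inside the face rectangle.
inside = (face_x <= left_eye[0] <= face_x + face_width
          and face_y <= left_eye[1] <= face_y + face_height)
print(inside)  # → True
```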
>Mask points: 68 points that indicate different parts of the face, e.g. eyebrows, eyes,
nose, mouth and the jaw line.
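If the 68 mask points are stored as a flat sequence of alternating x and y values (x0, y0, x1, y1, ...), they can be regrouped into coordinate pairs as below. Note that this flattened layout is an assumption for the sketch, not something stated in this document, so check your CSV before relying on it:

```python
# Stand-in for the 136 numbers of one row's mask-point columns; the
# alternating x/y layout is an assumption, not confirmed by the document.
flat = [float(v) for v in range(136)]

# Regroup into 68 (x, y) pairs.
points = list(zip(flat[0::2], flat[1::2]))
print(len(points))  # → 68
print(points[0])    # → (0.0, 1.0)
```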