The 13-point calibration routine for the Kinarm Gaze-Tracker is slow, and a good subject calibration is often difficult to achieve, frequently requiring multiple attempts. Two factors contribute to these problems: the large number of target locations used in the calibration routine and the large size of the region over which the calibration is performed. This study quantified the effect on calibration accuracy of reducing each of these factors. The results show that reducing the number of targets from 13 to 5 has a small impact on calibration accuracy (1.0±0.6° vs 1.1±0.7°) while significantly reducing the total time for the calibration routine. Further, reducing the size of the 5-point calibration region from 100% to 70% produced only a marginal increase in error (1.2±0.9°). It is recommended that most Kinarm users adopt a 5-point calibration routine, which will save time without sacrificing the quality of the collected data.


The Kinarm Gaze-Tracker is an optional component of a Kinarm Lab. It is based upon the Eyelink 1000+ with the Remote option, from SR Research. When first introduced, the calibration routine for the Kinarm Gaze-Tracker was based upon the one recommended for use with the Eyelink 1000+ with Remote option. However, over the past 10 years it has become clear that this routine is slow and that a good calibration is often difficult to achieve with it. Two factors contribute to these problems: the large number of target locations used in the calibration routine (13), and the large size of the region over which the calibration is attempted.

The large number of points used in the calibration routine creates two problems. First, it slows calibration because the subject must fixate each point twice: once during calibration and once during validation. Achieving 26 successful fixations takes a meaningful amount of time. Second, any mistake by the subject (e.g. accidentally not looking at the fixation target) can result in a failed calibration, requiring the entire calibration/validation to be redone, which further slows the overall process.

The large size over which calibration is typically performed only affects some fraction of subjects, typically those whose eyes are difficult to track for proximal target locations. The large, default size of the calibration region was chosen in order to maximize the area over which the calibration would be valid. As verified in this study, when using the 13-point calibration routine with the Eyelink, accuracy of gaze does not extrapolate well outside of the calibrated region, hence it is desirable to have as large a calibration region as possible.

While reducing the number of target locations used by the calibration routine clearly makes it faster, the impact on the accuracy of the data must be assessed. Likewise, reducing the size of the calibration region should increase the likelihood of a successful calibration for subjects who have difficulty with proximal targets, but the impact of that change on accuracy is unclear. The purpose of this study was to quantify these effects on calibration accuracy.


Data were collected from 30 adult subjects, from 6 different sites using Kinarm Labs (1 site was BKIN Technologies). Of these 6 different sites, 4 had Kinarm End-Point Labs and 2 had Kinarm Exoskeleton labs. Dexterit-E 3.9 was used for all data collection. For each subject, four calibration options were tested:

  1. 13-pt 100% scaling
  2. 13-pt 70% scaling
  3. 5-pt 100% scaling
  4. 5-pt 70% scaling

Data collection:

For each calibration option, the following steps were followed:

  1. Manually set the number of calibration points (5 or 13) and the scaling factor used for Gaze Calibration (70% or 100%)
    1. See details at end of this Methods section for how to manually set the number of calibration points.
  2. Calibrate and validate. Repeat until a “good” validation is found.
    1. Make a note if a “good” validation was never found.
  3. Run a custom task:
    1. The task was analogous to a 13-point gaze validation, but with five blocks of data collection (i.e. five repeats of each target location in total). The target locations were selected to cover the nominal gaze-tracking region calibrated by a 100% scaling calibration (note: none of these targets were at the same location as the calibration targets).
    2. The task advanced automatically once a sufficiently long fixation was recorded, but a manual advance button was also available.

Data analysis:

For each subject:

  1. For each trial, the gaze fixation location was identified as follows:
    1. Use fixation events from the Eyelink to identify the start of a fixation; the fixation period is taken as the subsequent 200 ms.
    2. Compute the mean location of gaze during the fixation period.
    3. Compute the mean pupil location during the fixation period.

Figure 1. For each trial, fixation events are used to identify the start of a fixation. Data over the subsequent 200 ms are averaged.

  2. For each target location:
    1. Use the geometric median (across 5 blocks) to calculate best estimates of the gaze fixation location and the pupil location.
      1. Note: the geometric median is more robust than the arithmetic mean (centroid) when dealing with data that may contain outliers.
    2. Calculate gaze-vector (3D vector from pupil location to gaze fixation location) using the geometric median values.
    3. Calculate pupil-target vector (3D vector from pupil location to target location) using the geometric median values.
    4. Calculate the angular error in calibration for that target as the angular difference between the gaze vector and the pupil-target vector.
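The analysis steps above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names, the sample layout (time in ms plus a 3D position), and the use of Weiszfeld's iteration to approximate the geometric median are all assumptions made for the example.

```python
import math

def mean_in_window(samples, t_start_ms, window_ms=200):
    """Mean 3D position of samples with t_start <= t < t_start + window.

    `samples` is a list of (t_ms, (x, y, z)) tuples (hypothetical layout).
    """
    pts = [p for t, p in samples if t_start_ms <= t < t_start_ms + window_ms]
    if not pts:
        raise ValueError("no samples inside the fixation window")
    return tuple(sum(p[i] for p in pts) / len(pts) for i in range(3))

def geometric_median(points, tol=1e-9, eps=1e-12, max_iter=1000):
    """Approximate the geometric median of 3D points (Weiszfeld iteration)."""
    # Start from the centroid (arithmetic mean) of the points.
    est = tuple(sum(p[i] for p in points) / len(points) for i in range(3))
    for _ in range(max_iter):
        num = [0.0, 0.0, 0.0]
        den = 0.0
        for p in points:
            # Clamp the distance so a point that coincides with the
            # current estimate does not cause a division by zero.
            w = 1.0 / max(math.dist(p, est), eps)
            den += w
            for i in range(3):
                num[i] += w * p[i]
        new = tuple(c / den for c in num)
        converged = math.dist(new, est) < tol
        est = new
        if converged:
            break
    return est

def angular_error_deg(pupil, gaze, target):
    """Angle in degrees between the gaze vector and the pupil-target vector."""
    gv = [g - p for g, p in zip(gaze, pupil)]
    tv = [t - p for t, p in zip(target, pupil)]
    dot = sum(a * b for a, b in zip(gv, tv))
    cross = (
        gv[1] * tv[2] - gv[2] * tv[1],
        gv[2] * tv[0] - gv[0] * tv[2],
        gv[0] * tv[1] - gv[1] * tv[0],
    )
    # atan2(|cross|, dot) stays accurate for very small angles.
    return math.degrees(math.atan2(math.hypot(*cross), dot))
```

For one target, the per-block means from `mean_in_window` would be passed to `geometric_median` (once for gaze, once for pupil), and the two medians plus the known target location would be passed to `angular_error_deg`.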


The results of this study are shown in Figures 2 and 3 below.

Figure 2. The mean and p95 values across all 30 subjects were plotted at each target location. The grey shaded area indicates the nominal shape of the region calibrated by the 5 or 13 point calibration routine.

As can be seen in Figure 2, the location of largest errors for all calibration types is in the proximal region. For all calibration types, the best accuracy is found within the calibrated region (i.e. within the grey-shaded area).

Figure 3. The errors from all subjects, at all locations, were combined into separate histograms for each of the four calibration options. The mean ± standard deviation was calculated for each histogram, along with the maximum error recorded.
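The summary statistics reported in Figures 2 and 3 (mean, standard deviation, p95 and maximum) could be computed from a list of angular errors along the lines of the sketch below. The function name is hypothetical, and two details are assumptions because they are not stated here: the sample (n-1) standard deviation and the nearest-rank definition of the 95th percentile.

```python
import statistics

def summarize_errors(errors_deg):
    """Summarize a list of angular errors (degrees); needs at least 2 values."""
    ordered = sorted(errors_deg)
    n = len(ordered)
    # Nearest-rank percentile: rank = ceil(0.95 * n), in integer arithmetic.
    rank = -(-95 * n // 100)
    return {
        "mean": statistics.mean(ordered),
        "sd": statistics.stdev(ordered),  # sample standard deviation (assumed)
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }

# Made-up error values, for illustration only (not data from the study).
summary = summarize_errors([1.0, 2.0, 3.0, 4.0])
```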

As measured by both mean and standard deviation, the most accurate calibration is the 13-point at 100% scaling, followed by the 5-point at 100%, the 5-point at 70% and the 13-point at 70%. When calibrating over a smaller region (i.e. 70% scaling), the 13-point calibration extrapolates poorly compared with the 5-point calibration.


As expected, reducing the number of calibration points results in a less accurate calibration when 100% scaling is used. However, the loss in accuracy is quite small, whereas the gain in calibration speed is dramatic (i.e. ~2.5x faster).

With the 5-pt calibration, decreasing the region over which calibration is performed from 100% to 70% again decreases calibration accuracy, but again the loss in accuracy is quite small. Although not explicitly tested, it seems likely that intermediate scaling (e.g. 90% or 80%) would produce accuracies in between those observed for 100% and 70% scaling.

While perhaps unexpected, the 13-point calibration with 70% scaling produces the worst calibration of all 4 types. The most likely explanation for this result is that the 13-point calibration uses a higher-order polynomial to fit the calibration data than the 5-point calibration does. Higher-order polynomials tend to extrapolate more poorly than lower-order models. As can be seen in Figure 2, the large errors that were observed when testing the 13-point calibration with 70% scaling were all outside of the calibration region, which is where extrapolation occurs.
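The extrapolation argument can be illustrated with a deliberately simple one-dimensional toy example. It is not the Eyelink's calibration model (whose form is not described here): the data, the function names and the choice of a straight line versus a quadratic are all assumptions for illustration. Three samples are taken inside a "70%" region of a truly linear relationship, with a small error on the middle sample; a quadratic that passes exactly through the samples is then compared with a least-squares line, both inside and outside the sampled region.

```python
def fit_line(xs, ys):
    """Least-squares straight line through the samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: slope * (x - mx) + my

def fit_quadratic(xs, ys):
    """Quadratic passing exactly through three samples (Lagrange form)."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    def f(x):
        return (
            y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        )
    return f

# True relationship is y = x; the middle sample carries a small error.
xs = [-0.7, 0.0, 0.7]   # samples confined to a "70%" region (made-up data)
ys = [-0.7, 0.05, 0.7]

line, quad = fit_line(xs, ys), fit_quadratic(xs, ys)

def err(model, x):
    """Absolute error of a fitted model against the true value y = x."""
    return abs(model(x) - x)

inside, outside = 0.35, 1.0  # evaluation points inside / outside the samples
line_err = (err(line, inside), err(line, outside))
quad_err = (err(quad, inside), err(quad, outside))
```

In this toy case the line's error is the same inside and outside the sampled region, while the quadratic's error is larger outside than inside, which is the qualitative behaviour described above.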


Except for the most demanding applications requiring the highest accuracy, most Kinarm users would benefit greatly from switching to the faster 5-point calibration. Furthermore, a smaller calibration region (e.g. 70, 80 or 90%), which would lead to an increased likelihood of a successful calibration, would also most likely be acceptable for most Kinarm users.

Interested in trying the new 5-point calibration routine?

How to Manually Change the Number of Calibration Targets to Either 5 or 13.

With Dexterit-E 3.9.1 and later, the number of calibration points can be changed manually in the DEX.INI file, using the following instructions:

  1. Ensure that hidden files and file name extensions are visible on the Dexterit-E computer:
    1. Open Windows File Explorer and select the View menu
    2. In the Show/Hide section of the View menu:
      1. Check File name extensions.
      2. Check Hidden items.
  2. Manually set the number of calibration points:
    1. Close Dexterit-E.
    2. In Windows Explorer, navigate to “C:\ProgramData\BKIN Technologies\Dexterit-E 3.xx”.
    3. Make a backup copy of the dex.ini file (this step only needs to be done once).
    4. Open Notepad (or Notepad++) as an Administrator
      1. In Windows search type “notepad”.
      2. Right-click on the search result for the Notepad app, and select Run as administrator.
    5. Within Notepad, open the dex.ini file, and then edit it:
      1. In the [eyelink] section, set calibration_points=13 or calibration_points=5.
    6. Save the dex.ini file.
    7. Start Dexterit-E.
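After the edit, the relevant part of dex.ini would look roughly like the fragment below. Only the [eyelink] section name and the calibration_points key are taken from the instructions above; any other entries already present in that section should be left unchanged.

```ini
[eyelink]
calibration_points=5
```

To return to the 13-point routine, set the value back to 13 (or restore the backup copy of dex.ini) and restart Dexterit-E.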