One of the first
entries in this blog dealt with calibrating an magnetometer. This post will be a continuation of that, only for the inertial sensors. The magnetometer calibration can be found here. Most AHRS use an accelerometer, gyroscope and a
magnetometer. Gyroscopes and accelerometer require calibration as
well which will be the subject of this post. Below I'll outline a
process in which calibrates an accelerometer and gyroscope for scale
factors errors, bias and non-orthogonality. The method can be
preformed without any special tools. Source code can be found at the project repository here
Before calibration
can be introduced a model for sensor output needs to be selected. The
procedure outlined in this post is a slightly modified version to that proposed in [1]. The primary sensor model used for calibration:
$s_m = TK(s_t + b + \eta) \tag{1.1}$
(1.1) relates a true signal $s_t$ to an erroneous signal $s_m$ affected by random noise $\eta$, bias $b$, scale factor $K$ and
non-orthogonality $T$. (1.1) gives a model for both accelerometer and gyroscope signals. For complete calibration one must correct for bias, scale factor and non-orthogonality. First a brief description of each:
Bias:
Bias is molded as a 3 element vector containing the zero offset along the $x,y$ and $z$ axis. Bias describes a constant offset present in a signal. For example a gyroscope left sitting still should only output zero. Ignoring noise any amount above or below zero is caused by bias. Typically bias is assumed to be fixed. However, bias in the gyroscope varies with time [1]. This is also why it is not enough to simply zero out this bias when the device is static and simply integrate the gyroscope reading to get angle estimation. Fusing the gyroscope with other sensors, see the AHRS post, help to compensate for this problem by using the magnetometer and, to a lesser extent, the accelerometer to correct time varying bias in the gyroscope.
Bias is molded as a 3 element vector containing the zero offset along the $x,y$ and $z$ axis. Bias describes a constant offset present in a signal. For example a gyroscope left sitting still should only output zero. Ignoring noise any amount above or below zero is caused by bias. Typically bias is assumed to be fixed. However, bias in the gyroscope varies with time [1]. This is also why it is not enough to simply zero out this bias when the device is static and simply integrate the gyroscope reading to get angle estimation. Fusing the gyroscope with other sensors, see the AHRS post, help to compensate for this problem by using the magnetometer and, to a lesser extent, the accelerometer to correct time varying bias in the gyroscope.
Scale factor:
Scale factor is modeled as a diagonal matrix:
s_x & 0 & 0\\
0 & s_y & 0 \\
0 & 0 & s_z
\end{bmatrix} \tag{1.2}$
It is assumed there is no cross axis scale factor errors to compensate for. This is why every element off the diagonal is zero. $s_x,s_y$ and $s_z$ are the values which converts the raw output of each sensor's axis into units of measurement. For example a gyroscope may have a 16
bit integer representing it's raw output. The scale factor converts
that 16 bit integer to radians per second. Most MEMS sensors should have a data sheet or product
manual which gives users this value. However, this is a suggested setting. Scale factors differ from sensor to sensor creating a need to find their unique scale factor.
Non-orthogonality:
The MEMS sensors used for navigation are all 3-axis. Meaning they will have sensor laying on an x,y and z plane. Factory imperfections might not leave each axis perpendicular to one another.
$T^a = \begin{bmatrix}
1 & -\alpha_{yz} & \alpha_{zy}\\
0 & 1 & -\alpha_{zx} \\
0 & 0 & 1
\end{bmatrix} \tag{1.3}$
The gyroscope non-orthagonality matrix is defined as:
$T^g = \begin{bmatrix}
1 & -\gamma_{yz} & \gamma_{zy}\\
\gamma_{xz} & 1 & -\gamma_{zx} \\
-\gamma_{xy} & \gamma_{yx} & 1
\end{bmatrix} \tag{1.4}$
$T^a$ and $T^g$ are actually obtained from regular rotation matricies:
$T = T_z \cdot T_y \cdot T_x$
Where $T_z, T_y$ and $T_x$ are all rotation matrices defining rotations about the $z,y$ and $x$ axis respectively:
$T_z = \begin{bmatrix}
\cos\beta_z & -\sin\beta_z & 0\\
\sin\beta_z & \cos\beta_z &0\\
0 & 0 & 1
\end{bmatrix}$
$T_y = \begin{bmatrix}
\cos\beta_y & 0 & \sin\beta_y\\
0 & 1 & 0\\
-\sin\beta_y & 0 & \cos\beta_y
\end{bmatrix}$
$T_z = \begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\beta_x & -\sin\beta_x\\
0 & \sin\beta_x & \cos\beta_x
\end{bmatrix}$
There are other forms of errors as well. One such source is sensor misalignment whereby each MEMS sensors are not aligned with each other. I'll probably talk more about this in a future post. Being able to align sensors would allow for individual sensors to be bought and mounted as opposed to being restricted to IMUs which already have all three sensors mounted on it. For now though it is ignored and assumed that the sensors are aligned. For those interested [2] gives a method for magnetometer accelerometer alignment. The main reason I've left it out is I don't yet have a good way of aligning the gyroscope since precisely rotating it is challenging without tools.
The task is then to solve for $T$,$K$ and $b$ for the accelerometer and gyroscope. The approach taken by [1] and the one that will be discussed here. First each of the error terms, $T$,$K$,$b$, is parameterized into a single vector $\theta$ called the calibration parameters.
$\theta_{\text{accel}} = [\alpha_{yz}, \alpha_{zy}, \alpha_{zx}, s^a_x, s^a_y, s^a_z,b^a_x, b^a_y,b^a_z] \tag{1.5}$
$\theta_{\text{gyro}} = [\gamma_{yz},\gamma_{zy}, \gamma_{xz}, \gamma_{zx},\gamma_{xy},\gamma_{yx},s^g_x, s^g_y, s^g_z] \tag{1.6}$
$\theta_{\text{accel}}$ and $\theta_{\text{gyro}}$ are solved for by minimizing a cost function over a set of sampled points. Each cost function will be introduced shortly. In order to generate sample points the device must be moved to various orientations and left still for 1-2 seconds before being moved again. Now let me go into a bit more detail about the steps leading up to minimization.
Data Collection:
As mentioned in [1] the device is kept still for an initial interval of 30-50 seconds. This is done to provide a baseline for the variance of a device at rest. This will be used later as a threshold to detect when the device is static and when it is moving.
As mentioned in [1] the device is kept still for an initial interval of 30-50 seconds. This is done to provide a baseline for the variance of a device at rest. This will be used later as a threshold to detect when the device is static and when it is moving.
Next, sample points must
be collected from the device in order to create a space in which
calibration parameters can be minimized in. On the user end this
entails them moving the device to 30 or more different orientations.
In between each movement the device needs to stay still for 1-2
seconds. What this does is create static intervals that
the algorithm will find use
in calibration.
Figure 1: Static interval detection. Black lines will go high to indicated static intervals and low to indicated movement |
Figure 2: Close view of a region in figure 1. |
The program
finds periods of rest shown in figure 1, by calculating the variance over a rolling
window. If the window's variance is higher than a small multiple of the initial resting variance, then the device is considered to have been experiencing movement. It will remain flagged as movement until the window's variance variance falls below the threshold where it will be considered resting. Below is an demonstration:
Figure 3 shows how static intervals are detected. The width of the rectangle is the size of the rolling variance window. When the rectangle turns green no movement is detected so it's left most edge of the rectangle is logged as a static interval start. When it turns red, the right most edge of the rectangle is logged as the interval end. The distance from the start to the end must be larger than 1-2 seconds, this discounts the flickering seen in figure 3 as static intervals.
Variance is kept at a fixed window size in order to prevent mean and variance calculations from becoming over saturated (a new sample has more weight on a mean with 10 elements rather than 100). Over saturation might cause too much movement to be included in a static interval, which affects the accuracy of gyroscope calibration the most.
Figure 3: Demonstration of the moving window |
Variance is kept at a fixed window size in order to prevent mean and variance calculations from becoming over saturated (a new sample has more weight on a mean with 10 elements rather than 100). Over saturation might cause too much movement to be included in a static interval, which affects the accuracy of gyroscope calibration the most.
Static intervals on the accelerometer are
averaged over the user specified 1-2 second window and treated as the measured gravity
vector. These averaged intervals are called static accelerometer values and there will be one for each time the user let the device rest for 1-2 seconds. Essentially, the accelerometer is being used as a gravity
detector. Gravity
has a constant magnitude which provides a reliable
set point to be used for
minimization. Any accelerometer reading that doesn't
have the same magnitude as gravity
creates error, that error then minimized. This relation is seen in
the error function:
$C(\theta_{\text{accel}}) = \sum_{k=1}^{M} (\|{g}\|^2 - \|h(a^s_k,\theta_{\text{accel}})\|^2)^2 \tag{1.7}$
In (1.7) $M$ denotes the total number of static intervals, i.e. the number of times the users leave the device still. $g$ is the local gravity vector, but since $\|{g}\|$ is used it's enough to know the local magnitude of gravity, 9.81$m/s^2$ or 1g for example. $h$ denotes a function which applies the calibration parameters to the raw accelerometer signal using 1.1:
$h(a^s_k,\theta_{\text{accel}}) = TK(a^s_k + b) \tag{1.8}$
$a^s_k$ is the accelerometer value at time $k$ so random noise $\eta$ is already included in the measured value.
Non static intervals (periods where the lines are low in figure 1) are used in gyroscope calibration. A quaternion representing the total rotation between movements is obtained via Runge Kutta 4th order integration (RK4) for better accuracy. This is why it's important to both keep the movements brief and to make sure no movements are included in the static interval detection. Otherwise the quaternion calculation will be biased leading to poor results in calibration. The estimated quaternion is used to rotate the kth static accelerometer value. This predicted value is compared to the k+1 static accelerometer value where error is minimized. This relation is seen in the gyroscope cost function:
$C(\theta_{\text{gyro}}) = \sum_{k=2}^{M} \|u_{a,k} - u_{g,k}\|^2 \tag{1.9}$
$M$ is the same as in 1.7. $u_{a,k}$ denotes the static accelerometer value at time $k$. $u_{g,k}$ is the predicted static accelerometer value computed by rotating $u_{a,k-1}$ (the prior static accelerometer value) by the quaternion estimated from RK4 integration.
Initialization:
This
step has a
slight departure from the reference algorithm. Since minimization
is being done on a non-convex
plane, it is critical to have
an initial starting point close
to the global minimum. Using
the papers algorithm as is, I could never get the accelerometer bias
to come close to what the authors reported. The reasons are primarily because this algorithm minimizes 9 parameters with 1 output, making it underdetermined. So error in bias could have been over compensated for with scale factor.
To
get around this issue I used ellipsoid
fitting, mentioned in this post, on the accelerometer to initialize it. Ellipsoid fitting
gives the center and radii of an ellipsoid fitted to set of data. Since the accelerometer is being used to measure a constant field (gravity only) it will create a cloud of samples roughly on the surface of a sphere. In this case the ellipsoid center
serves as the initial bias offset. The ellipsoid radii serve as the
initial scale factor estimates. It also means that the orientations the user rotates the device to needs to be fairly distinct so that an ellipsoid can be better estimated.
Gyroscope
initialization uses the suggestions in the device's product to initialize the scale factor.
After initialization finishes, the two cost functions introduced in (1.7) and (1.9) are minimized using the Levenberg-Marquardt algorithm. This results in two parameter vectors $\theta_{\text{accel}}$ and $\theta_{\text{gyrol}}$ which can be applied to the accelerometer and gyroscope using (1.1).
Results:
Authors of [1] had
their data publicly available for download. When run on their data
the results are comparable to the original paper. Values listed under "paper" are those reported in [1], values listed under "factory" are the target values (they're taken from a factory calibrated sensor), finally values listed under "proposed" are those obtained using the technique outlined in this blog post. First are the accelerometer results:
Accel. | $S_x$ | $S_y$ | $S_z$ | $b_x$ | $b_y$ | $b_z$ |
factory | 415 | 413 | 415 | 33123 | 33276 | 32360 |
Paper | 414.41 | 412.05 | 414.61 | 32768 | 32466 | 32485 |
Initialization | 419.15 | 413.96 | 409.04 | 33119.93 | 33277.41 | 32364.29 |
Proposed | 414.42 | 412.03 | 414.62 | 33123.84 | 33275.12 | 32364.48 |
Proposed method misalignment results
1.0000 | -0.0065 | -0.0108 |
0.0102 | 1.0001 | 0.0114 |
0.0201 | 0.0098 | 0.9998 |
Paper misalignment results
1.0000 | -0.0066 | -0.0110 |
0.0102 | 1.0001 | 0.0114 |
0.0201 | 0.0098 | 0.9998 |
Factory misalignment
1.00 | 0.00 | -0.01 |
0.01 | 1.00 | 0.01 |
0.02 | 0.01 | 1.00 |
Next are the gyroscope calibration results:
Gyroscope
|
$S_x$
|
$S_y$
|
$S_z$
|
Factory
|
4778
|
4758
|
4766
|
Paper
|
4778.0
|
4764.8
|
4772.6
|
Proposed
|
4779.0
|
4764.9
|
4774.8
|
Proposed method misalignment results
0.9997 | -0.0152 | -0.0203 |
0.0003 | 1.0007 | 0.0429 |
-0.0064 | 0.0125 | 1.0004 |
Paper misalignment results
0.9998 | -0.0149 | -0.0218 |
0.0003 | 1.0007 | 0.0433 |
-0.0048 | 0.0121 | 1.0004 |
Factory misalignment
1.00 | -0.01 | -0.02 |
0.00 | 1.00 | 0.04 |
-0.01 | 0.01 | 1.00 |
Future:
As mentioned above this method still does not take into account misalignment between sensors. At some point this will be addressed, however it might require some type of machinery. The selection of what to multiply the static variance by is still arbitrary. One could preform a second minimization on a threshold multiplier. I chose not to in order to save run time and it isn't too laborious for the user to look at how well static intervals are found and adjust the multiplier if needed.
Thanks for reading,
-Joseph
References:
As mentioned above this method still does not take into account misalignment between sensors. At some point this will be addressed, however it might require some type of machinery. The selection of what to multiply the static variance by is still arbitrary. One could preform a second minimization on a threshold multiplier. I chose not to in order to save run time and it isn't too laborious for the user to look at how well static intervals are found and adjust the multiplier if needed.
Thanks for reading,
-Joseph
References:
[1]
. Tedaldi, David, Alberto Pretto, and Emanuele Menegatti. "A
robust and easy to implement method for IMU calibration without
external e10)
quipments." 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2014.
[2]. Kok, Manon, et al. "Calibration of a magnetometer in combination with inertial sensors." Information Fusion (FUSION), 2012 15th International Conference on. IEEE, 2012.
quipments." 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2014.
[2]. Kok, Manon, et al. "Calibration of a magnetometer in combination with inertial sensors." Information Fusion (FUSION), 2012 15th International Conference on. IEEE, 2012.
Thanks for sharing.
ReplyDelete