Aerial Arithmetic: Accelerometer and Gyroscope calibration: Bias, Non-othagonality and scale factor

One of the first entries in this blog dealt with calibrating an magnetometer. This post will be a continuation of that, only for the inertial sensors. The magnetometer calibration can be found here. Most AHRS use an accelerometer, gyroscope and a magnetometer. Gyroscopes and accelerometer require calibration as well which will be the subject of this post. Below I'll outline a process in which calibrates an accelerometer and gyroscope for scale factors errors, bias and non-orthogonality. The method can be preformed without any special tools. Source code can be found at the project repository here

Before calibration can be introduced a model for sensor output needs to be selected. The procedure outlined in this post is a slightly modified version to that proposed in [1]. The primary sensor model used for calibration:

$s_m = TK(s_t + b + \eta) \tag{1.1}$

(1.1) relates a true signal $s_t$ to an erroneous signal $s_m$ affected by random noise $\eta$, bias $b$, scale factor $K$ and non-orthogonality $T$. (1.1) gives a model for both accelerometer and gyroscope signals. For complete calibration one must correct for bias, scale factor and non-orthogonality. First a brief description of each:

Bias:
Bias is molded as a 3 element vector containing the zero offset along the $x,y$ and $z$ axis. Bias describes a constant offset present in a signal. For example a gyroscope left sitting still should only output zero. Ignoring noise any amount above or below zero is caused by bias. Typically bias is assumed to be fixed. However, bias in the gyroscope varies with time [1]. This is also why it is not enough to simply zero out this bias when the device is static and simply integrate the gyroscope reading to get angle estimation. Fusing the gyroscope with other sensors, see the AHRS post, help to compensate for this problem by using the magnetometer and, to a lesser extent, the accelerometer to correct time varying bias in the gyroscope.

Scale factor:

Scale factor is modeled as a diagonal matrix:

$K = \begin{bmatrix}
s_x & 0 & 0\\
0 & s_y & 0 \\
0 & 0 & s_z
\end{bmatrix} \tag{1.2}$

It is assumed there is no cross axis scale factor errors to compensate for. This is why every element off the diagonal is zero. $s_x,s_y$ and $s_z$ are the values which converts the raw output of each sensor's axis into units of measurement. For example a gyroscope may have a 16 bit integer representing it's raw output. The scale factor converts that 16 bit integer to radians per second. Most MEMS sensors should have a data sheet or product manual which gives users this value. However, this is a suggested setting. Scale factors differ from sensor to sensor creating a need to find their unique scale factor.

Non-orthogonality:

The MEMS sensors used for navigation are all 3-axis. Meaning they will have sensor laying on an x,y and z plane. Factory imperfections might not leave each axis perpendicular to one another.

Non-orthogonality is expressed as a matrix, but with different contents for the accelerometer and gyroscope. The accelerometer non-orthagonality matrix is defined as:

$T^a = \begin{bmatrix}
1 & -\alpha_{yz} & \alpha_{zy}\\
0 & 1 & -\alpha_{zx} \\
0 & 0 & 1
\end{bmatrix} \tag{1.3}$

The gyroscope non-orthagonality matrix is defined as:

$T^g = \begin{bmatrix}
1 & -\gamma_{yz} & \gamma_{zy}\\
\gamma_{xz} & 1 & -\gamma_{zx} \\
-\gamma_{xy} & \gamma_{yx} & 1
\end{bmatrix} \tag{1.4}$

$T^a$ and $T^g$ are actually obtained from regular rotation matricies:

$T = T_z \cdot T_y \cdot T_x$

Where $T_z, T_y$ and $T_x$ are all rotation matrices defining rotations about the $z,y$ and $x$ axis respectively:

$T_z = \begin{bmatrix}
\cos\beta_z & -\sin\beta_z & 0\\
\sin\beta_z & \cos\beta_z &0\\
0 & 0 & 1
\end{bmatrix}$

$T_y = \begin{bmatrix}
\cos\beta_y & 0 & \sin\beta_y\\
0 & 1 & 0\\
-\sin\beta_y & 0 & \cos\beta_y
\end{bmatrix}$

$T_z = \begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\beta_x & -\sin\beta_x\\
0 & \sin\beta_x & \cos\beta_x
\end{bmatrix}$

It is assumed that the angle of misalignment, $\beta_x,\beta_y$ and $\beta_z$ is very small. So small that $\cos\beta \approx 1$. This causes the diagonal on $T$ to only contain ones. The left over off diagonal elements are computed directly from the minimization. As for the accelerometer, the x-axis is chosen to be the basis from which the other axis are made orthogonal to. Hence $T_x$ is the identity matrix and the lower triangular half of matrix $T^a$ are all zeros. $T$ also explains where the negative entries of $T^g$ and $T^a$ came from.

There are other forms of errors as well. One such source is sensor misalignment whereby each MEMS sensors are not aligned with each other. I'll probably talk more about this in a future post. Being able to align sensors would allow for individual sensors to be bought and mounted as opposed to being restricted to IMUs which already have all three sensors mounted on it. For now though it is ignored and assumed that the sensors are aligned. For those interested [2] gives a method for magnetometer accelerometer alignment. The main reason I've left it out is I don't yet have a good way of aligning the gyroscope since precisely rotating it is challenging without tools.

The task is then to solve for $T$,$K$ and $b$ for the accelerometer and gyroscope. The approach taken by [1] and the one that will be discussed here. First each of the error terms, $T$,$K$,$b$, is parameterized into a single vector $\theta$ called the calibration parameters.

$\theta_{\text{accel}} = [\alpha_{yz}, \alpha_{zy}, \alpha_{zx}, s^a_x, s^a_y, s^a_z,b^a_x, b^a_y,b^a_z] \tag{1.5}$

$\theta_{\text{gyro}} = [\gamma_{yz},\gamma_{zy}, \gamma_{xz}, \gamma_{zx},\gamma_{xy},\gamma_{yx},s^g_x, s^g_y, s^g_z] \tag{1.6}$

$\theta_{\text{accel}}$ and $\theta_{\text{gyro}}$ are solved for by minimizing a cost function over a set of sampled points. Each cost function will be introduced shortly. In order to generate sample points the device must be moved to various orientations and left still for 1-2 seconds before being moved again. Now let me go into a bit more detail about the steps leading up to minimization.

Data Collection:
As mentioned in [1] the device is kept still for an initial interval of 30-50 seconds. This is done to provide a baseline for the variance of a device at rest. This will be used later as a threshold to detect when the device is static and when it is moving.

Next, sample points must be collected from the device in order to create a space in which calibration parameters can be minimized in. On the user end this entails them moving the device to 30 or more different orientations. In between each movement the device needs to stay still for 1-2 seconds. What this does is create static intervals that the algorithm will find use in calibration.

Figure 1:
Static interval detection. Black lines will go high to indicated static intervals
and low to indicated movement

Figure 2:
Close view of a region in figure 1.

The program finds periods of rest shown in figure 1, by calculating the variance over a rolling window. If the window's variance is higher than a small multiple of the initial resting variance, then the device is considered to have been experiencing movement. It will remain flagged as movement until the window's variance variance falls below the threshold where it will be considered resting. Below is an demonstration:

Figure 3:
Demonstration of the moving window

Figure 3 shows how static intervals are detected. The width of the rectangle is the size of the rolling variance window. When the rectangle turns green no movement is detected so it's left most edge of the rectangle is logged as a static interval start. When it turns red, the right most edge of the rectangle is logged as the interval end. The distance from the start to the end must be larger than 1-2 seconds, this discounts the flickering seen in figure 3 as static intervals.

Variance is kept at a fixed window size in order to prevent mean and variance calculations from becoming over saturated (a new sample has more weight on a mean with 10 elements rather than 100). Over saturation might cause too much movement to be included in a static interval, which affects the accuracy of gyroscope calibration the most.

Static intervals on the accelerometer are averaged over the user specified 1-2 second window and treated as the measured gravity vector. These averaged intervals are called static accelerometer values and there will be one for each time the user let the device rest for 1-2 seconds. Essentially, the accelerometer is being used as a gravity detector. Gravity has a constant magnitude which provides a reliable set point to be used for minimization. Any accelerometer reading that doesn't have the same magnitude as gravity creates error, that error then minimized. This relation is seen in the error function:

$C(\theta_{\text{accel}}) = \sum_{k=1}^{M} (\|{g}\|^2 - \|h(a^s_k,\theta_{\text{accel}})\|^2)^2 \tag{1.7}$

In (1.7) $M$ denotes the total number of static intervals, i.e. the number of times the users leave the device still. $g$ is the local gravity vector, but since $\|{g}\|$ is used it's enough to know the local magnitude of gravity, 9.81$m/s^2$ or 1g for example. $h$ denotes a function which applies the calibration parameters to the raw accelerometer signal using 1.1:

$h(a^s_k,\theta_{\text{accel}}) = TK(a^s_k + b) \tag{1.8}$

$a^s_k$ is the accelerometer value at time $k$ so random noise $\eta$ is already included in the measured value.

Non static intervals (periods where the lines are low in figure 1) are used in gyroscope calibration. A quaternion representing the total rotation between movements is obtained via Runge Kutta 4th order integration (RK4) for better accuracy. This is why it's important to both keep the movements brief and to make sure no movements are included in the static interval detection. Otherwise the quaternion calculation will be biased leading to poor results in calibration. The estimated quaternion is used to rotate the kth static accelerometer value. This predicted value is compared to the k+1 static accelerometer value where error is minimized. This relation is seen in the gyroscope cost function:

$C(\theta_{\text{gyro}}) = \sum_{k=2}^{M} \|u_{a,k} - u_{g,k}\|^2 \tag{1.9}$

$M$ is the same as in 1.7. $u_{a,k}$ denotes the static accelerometer value at time $k$. $u_{g,k}$ is the predicted static accelerometer value computed by rotating $u_{a,k-1}$ (the prior static accelerometer value) by the quaternion estimated from RK4 integration.

Initialization:

This step has a slight departure from the reference algorithm. Since minimization is being done on a non-convex plane, it is critical to have an initial starting point close to the global minimum. Using the papers algorithm as is, I could never get the accelerometer bias to come close to what the authors reported. The reasons are primarily because this algorithm minimizes 9 parameters with 1 output, making it underdetermined. So error in bias could have been over compensated for with scale factor.

To get around this issue I used ellipsoid fitting, mentioned in this post, on the accelerometer to initialize it. Ellipsoid fitting gives the center and radii of an ellipsoid fitted to set of data. Since the accelerometer is being used to measure a constant field (gravity only) it will create a cloud of samples roughly on the surface of a sphere. In this case the ellipsoid center serves as the initial bias offset. The ellipsoid radii serve as the initial scale factor estimates. It also means that the orientations the user rotates the device to needs to be fairly distinct so that an ellipsoid can be better estimated.

Gyroscope initialization uses the suggestions in the device's product to initialize the scale factor.

After initialization finishes, the two cost functions introduced in (1.7) and (1.9) are minimized using the Levenberg-Marquardt algorithm. This results in two parameter vectors $\theta_{\text{accel}}$ and $\theta_{\text{gyrol}}$ which can be applied to the accelerometer and gyroscope using (1.1).

Results:

Authors of [1] had their data publicly available for download. When run on their data the results are comparable to the original paper. Values listed under "paper" are those reported in [1], values listed under "factory" are the target values (they're taken from a factory calibrated sensor), finally values listed under "proposed" are those obtained using the technique outlined in this blog post. First are the accelerometer results:

Accel.	$S_x$	$S_y$	$S_z$	$b_x$	$b_y$	$b_z$
factory	415	413	415	33123	33276	32360
Paper	414.41	412.05	414.61	32768	32466	32485
Initialization	419.15	413.96	409.04	33119.93	33277.41	32364.29
Proposed	414.42	412.03	414.62	33123.84	33275.12	32364.48

Proposed method misalignment results

1.0000	-0.0065	-0.0108
0.0102	1.0001	0.0114
0.0201	0.0098	0.9998

Paper misalignment results

1.0000	-0.0066	-0.0110
0.0102	1.0001	0.0114
0.0201	0.0098	0.9998

Factory misalignment

1.00	0.00	-0.01
0.01	1.00	0.01
0.02	0.01	1.00

Next are the gyroscope calibration results:

Gyroscope	$S_x$	$S_y$	$S_z$
Factory	4778	4758	4766
Paper	4778.0	4764.8	4772.6
Proposed	4779.0	4764.9	4774.8

Proposed method misalignment results

0.9997	-0.0152	-0.0203
0.0003	1.0007	0.0429
-0.0064	0.0125	1.0004

Paper misalignment results

0.9998	-0.0149	-0.0218
0.0003	1.0007	0.0433
-0.0048	0.0121	1.0004

Factory misalignment

1.00	-0.01	-0.02
0.00	1.00	0.04
-0.01	0.01	1.00

Future:
As mentioned above this method still does not take into account misalignment between sensors. At some point this will be addressed, however it might require some type of machinery. The selection of what to multiply the static variance by is still arbitrary. One could preform a second minimization on a threshold multiplier. I chose not to in order to save run time and it isn't too laborious for the user to look at how well static intervals are found and adjust the multiplier if needed.

Thanks for reading,
-Joseph

References:

[1] . Tedaldi, David, Alberto Pretto, and Emanuele Menegatti. "A robust and easy to implement method for IMU calibration without external e10)
quipments." 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2014.

[2]. Kok, Manon, et al. "Calibration of a magnetometer in combination with inertial sensors." Information Fusion (FUSION), 2012 15th International Conference on. IEEE, 2012.

Aerial Arithmetic

Wednesday, October 12, 2016

Accelerometer and Gyroscope calibration: Bias, Non-othagonality and scale factor

1 comment: