Quaternion estimator algorithm

From HandWiki
Short description: Algorithm to solve Wahba's problem

The quaternion estimator algorithm (QUEST) is an algorithm designed to solve Wahba's problem, that consists of finding a rotation matrix between two coordinate systems from two sets of observations sampled in each system respectively. The key idea behind the algorithm is to find an expression of the loss function for the Wahba's problem as a quadratic form, using the Cayley–Hamilton theorem and the Newton–Raphson method to efficiently solve the eigenvalue problem and construct a numerically stable representation of the solution.

The algorithm was introduced by Malcolm D. Shuster in 1981, while working at Computer Sciences Corporation.[1] While being in principle less robust than other methods such as Davenport's q method or singular value decomposition, the algorithm is significantly faster and reliable in practical applications,[2][3] and it is used for attitude determination problem in fields such as robotics and avionics.[4][5][6]

Formulation of the problem

Wahba's problem consists of finding a rotation matrix [math]\displaystyle{ \mathbf{A}^* }[/math] that minimises the loss function

[math]\displaystyle{ l \left( \mathbf{A} \right) = \frac{1}{2} \sum_{i=1}^{n} a_i \left\| \mathbf{w}_i - \mathbf{A} \mathbf{v}_i \right\|^2 }[/math]

where [math]\displaystyle{ \mathbf{w}_i }[/math] are the vector observations in the reference frame, [math]\displaystyle{ \mathbf{v}_i }[/math] are the vector observations in the body frame, [math]\displaystyle{ \mathbf{A} }[/math] is a rotation matrix between the two frames, and [math]\displaystyle{ a_i }[/math] are a set of weights such that [math]\displaystyle{ \textstyle \sum_i a_i = 1 }[/math]. It is possible to rewrite this as a maximisation problem of a gain function [math]\displaystyle{ g }[/math]

[math]\displaystyle{ g \left( \mathbf{A} \right) = 1 - l \left( \mathbf{A} \right) = \sum_i a_i \mathbf{w}_i^\top \mathbf{A} \mathbf{v}_i }[/math]

defined in such a way that the loss [math]\displaystyle{ l }[/math] attains a minimum when [math]\displaystyle{ g }[/math] is maximised. The gain [math]\displaystyle{ g }[/math] can in turn be rewritten as

[math]\displaystyle{ g \left( \mathbf{A} \right) = \operatorname{tr} \left( \mathbf{A} \mathbf{B}^\top \right) }[/math]

where [math]\displaystyle{ \mathbf{B} = \textstyle \sum_i a_i \mathbf{w}_i \mathbf{v}_i^\top }[/math] is known as the attitude profile matrix.

In order to reduce the number of variables, the problem can be reformulated by parametrising the rotation as a unit quaternion [math]\displaystyle{ \mathbf{q} = \left( v_1, v_2, v_3, q\right) }[/math] with vector part [math]\displaystyle{ \mathbf{v} = \left( v_1, v_2, v_3 \right) }[/math] and scalar part [math]\displaystyle{ q }[/math], representing the rotation of angle [math]\displaystyle{ \theta = 2 \cos^{-1} q }[/math] around an axis whose direction is described by the vector [math]\displaystyle{ \textstyle \frac{1}{\sin \frac{\theta}{2}} \mathbf{v} }[/math], subject to the unity constraint [math]\displaystyle{ \mathbf{q}^\top \mathbf{q} = 1 }[/math]. It is now possible to express [math]\displaystyle{ \mathbf{A} }[/math] in terms of the quaternion parametrisation as

[math]\displaystyle{ \mathbf{A} = \left( q^2 - \mathbf{v} \cdot \mathbf{v} \right) \mathbf{I} + 2 \mathbf{v}\mathbf{v}^\top + 2 q \mathbf{V}_\times }[/math]

where [math]\displaystyle{ \mathbf{V}_\times }[/math] is the skew-symmetric matrix

[math]\displaystyle{ \mathbf{V}_\times = \begin{pmatrix} 0 & v_3 & -v_2 \\ -v_3 & 0 & v_1 \\ v_2 & -v_1 & 0 \\ \end{pmatrix} }[/math].

Substituting [math]\displaystyle{ \mathbf{A} }[/math] with the quaternion representation and simplifying the resulting expression, the gain function can be written as a quadratic form in [math]\displaystyle{ \mathbf{q} }[/math]

[math]\displaystyle{ g(\mathbf{q}) = \mathbf{q}^\top \mathbf{K} \mathbf{q} }[/math]

where the [math]\displaystyle{ 4 \times 4 }[/math] matrix

[math]\displaystyle{ \mathbf{K} = \begin{pmatrix} \mathbf{S} - \sigma \mathbf{I} & \mathbf{z} \\ \mathbf{z}^\top & \sigma \end{pmatrix} }[/math]

is defined from the quantities

[math]\displaystyle{ \begin{align} \mathbf{S} &= \mathbf{B} + \mathbf{B}^\top \\ \mathbf{z} &= \sum_i a_i \left( \mathbf{w}_i \times \mathbf{v}_i \right) \\ \sigma &= \operatorname{tr} \mathbf{B} . \end{align} }[/math]

This quadratic form can be optimised under the unity constraint by adding a Lagrange multiplier [math]\displaystyle{ -\lambda \mathbf{q}^\top \mathbf{q} }[/math], obtaining an unconstrained gain function

[math]\displaystyle{ \hat{g} \left( \mathbf{q} \right) = \mathbf{q}^\top \mathbf{K} \mathbf{q} - \lambda \mathbf{q}^\top \mathbf{q} }[/math]

that attains a maximum when

[math]\displaystyle{ \mathbf{K} \mathbf{q} = \lambda \mathbf{q} }[/math].

This implies that the optimal rotation is parametrised by the quaternion [math]\displaystyle{ \mathbf{q}^* }[/math] that is the eigenvector associated to the largest eigenvalue [math]\displaystyle{ \lambda_{\text{max}} }[/math] of [math]\displaystyle{ \mathbf{K} }[/math].[1][2]

Solution of the characteristic equation

The optimal quaternion can be determined by solving the characteristic equation of [math]\displaystyle{ \mathbf{K} }[/math] and constructing the eigenvector for the largest eigenvalue. From the definition of [math]\displaystyle{ \mathbf{K} }[/math], it is possible to rewrite

[math]\displaystyle{ \mathbf{K} \mathbf{q} = \lambda \mathbf{q} }[/math]

as a system of two equations

[math]\displaystyle{ \begin{align} \mathbf{y} &= \left( (\lambda + \sigma) \mathbf{I} - \mathbf{S} \right)^{-1} \mathbf{z} \\ \lambda &= \sigma + \mathbf{z} \mathbf{y} \end{align} }[/math]

where [math]\displaystyle{ \mathbf{y} = \textstyle \frac{1}{q} \mathbf{v} }[/math] is the Rodrigues vector. Substituting [math]\displaystyle{ \mathbf{y} }[/math] in the second equation with the first, it is possible to derive an expression of the characteristic equation

[math]\displaystyle{ \lambda = \sigma + \mathbf{z}^\top \left( (\lambda + \sigma) \mathbf{I} - \mathbf{S} \right)^{-1} \mathbf{z} }[/math].

Since [math]\displaystyle{ \lambda_{\text{max}} = \max g\left(\mathbf{A}\right) }[/math], it follows that [math]\displaystyle{ \lambda_{\text{max}} = 1 - \min l\left(\mathbf{A}\right) }[/math] and therefore [math]\displaystyle{ \lambda_{\text{max}} \approx 1 }[/math] for an optimal solution (when the loss [math]\displaystyle{ l }[/math] is small). This permits to construct the optimal quaternion [math]\displaystyle{ \mathbf{q}^* }[/math] by replacing [math]\displaystyle{ \lambda_{\text{max}} }[/math] in the Rodrigues vector [math]\displaystyle{ \mathbf{y} }[/math]

[math]\displaystyle{ \mathbf{q}^* = \frac{1}{\sqrt{1 + \left| \mathbf{y}_{\lambda_{\text{max}}} \right|^2}} (\mathbf{y}, 1)^\top }[/math].

The [math]\displaystyle{ \mathbf{y} }[/math] vector is however singular for [math]\displaystyle{ \theta = \pi }[/math]. An alternative expression of the solution that does not involve the Rodrigues vector can be constructed using the Cayley–Hamilton theorem. The characteristic equation of a [math]\displaystyle{ 3\times3 }[/math] matrix [math]\displaystyle{ \mathbf{S} }[/math] is

[math]\displaystyle{ \det \left[ \mathbf{S} - \xi \mathbf{I} \right] = -\xi^3 + 2 \sigma \xi^2 - k \xi + \Delta = 0 }[/math]

where

[math]\displaystyle{ \begin{align} \sigma &= \frac{1}{2} \operatorname{tr}{\mathbf{S}} \\ k &= \operatorname{tr} \left( \operatorname{adj} \mathbf{S} \right) \\ \Delta &= \det \mathbf{S} \end{align} }[/math]

The Cayley–Hamilton theorem states that any square matrix over a commutative ring satisfies its own characteristic equation, therefore

[math]\displaystyle{ -\mathbf{S}^3 + 2 \sigma \mathbf{S}^2 - k \mathbf{S} + \Delta = 0 }[/math]

allowing to write

[math]\displaystyle{ \left( (\omega + \sigma) \mathbf{I} - \mathbf{S} \right)^{-1} = \frac{\alpha \mathbf{I} + \beta \mathbf{S} + \mathbf{S}^2}{\gamma} }[/math]

where

[math]\displaystyle{ \begin{align} \alpha &= \omega^2 - \sigma^2 + k \\ \beta &= \omega - \sigma \\ \gamma &= (\omega + \sigma) \alpha - \Delta \end{align} }[/math]

and for [math]\displaystyle{ \omega = \lambda_{\text{max}} }[/math] this provides a new construction of the optimal vector

[math]\displaystyle{ \begin{align} \mathbf{y}^* &= \left( (\lambda + \sigma) \mathbf{I} - \mathbf{S} \right)^{-1} \mathbf{z} \\ &= \frac{\alpha \mathbf{I} + \beta \mathbf{S} + \mathbf{S}^2}{\gamma} \mathbf{z} \end{align} }[/math]

that gives the conjugate quaternion representation of the optimal rotation as

[math]\displaystyle{ \mathbf{q}^* = \frac{1}{\sqrt{\gamma^2 + \left| \mathbf{x} \right|^2}} (\mathbf{x}, \gamma)^\top }[/math]

where

[math]\displaystyle{ \mathbf{x} = \left( \alpha \mathbf{I} + \beta \mathbf{S} + \mathbf{S}^2 \right) \mathbf{z} }[/math].

The value of [math]\displaystyle{ \lambda_{\text{max}} }[/math] can be determined as a numerical solution of the characteristic equation. Replacing [math]\displaystyle{ \left( (\omega + \sigma) \mathbf{I} - \mathbf{S} \right)^{-1} }[/math] inside the previously obtained characteristic equation

[math]\displaystyle{ \lambda = \sigma + \mathbf{z}^\top \left( (\lambda + \sigma) \mathbf{I} - \mathbf{S} \right)^{-1} \mathbf{z} }[/math].

gives

[math]\displaystyle{ \lambda^4 - (a + b) \lambda^2 - c \lambda + (ab + c \sigma - d) = 0 }[/math]

where

[math]\displaystyle{ \begin{align} a &= \sigma^2 - k \\ b &= \sigma^2 + \mathbf{z}^\top \mathbf{z} \\ c &= \Delta + \mathbf{z}^\top \mathbf{S} \mathbf{z} \\ d &= \mathbf{z}^\top \mathbf{S}^2 \mathbf{z} \end{align} }[/math]

whose root can be efficiently approximated with the Newton–Raphson method, taking 1 as initial guess of the solution in order to converge to the highest eigenvalue (using the fact, shown above, that [math]\displaystyle{ \lambda_{\text{max}} \approx 1 }[/math] when the quaternion is close to the optimal solution).[1][2]

See also

References

  1. 1.0 1.1 1.2 Shuster and Oh (1981)
  2. 2.0 2.1 2.2 Markley and Mortari (2000)
  3. Crassidis (2007)
  4. Psiaki (2000)
  5. Wu et al. (2017)
  6. Xiaoping et al. (2008)

Sources

External links