
Abstract
We consider multivariate regression with manifold-valued output, that is, for each multivariate observation, the output response lies on a manifold.
Moreover, we propose a new regression model to deal with grossly corrupted manifold-valued responses, a bottleneck issue commonly encountered in practical scenarios. Our model first corrects the grossly corrupted responses via geodesic curves on the manifold, and then performs multivariate linear regression on the corrected data. This results in a nonconvex and nonsmooth optimization problem on manifolds.
To solve this problem, we propose a dedicated approach named PALMR, which utilizes and extends proximal alternating linearized minimization techniques. Theoretically, we investigate its convergence and show that it converges to a critical point under mild conditions. Empirically, we test our model on both synthetic and real diffusion tensor imaging data, and show that it outperforms other multivariate regression models when manifold-valued responses contain gross errors, and that it is effective in identifying gross errors.

Our Approach
Our main idea can be summarized as follows: for each manifold-valued response $\bs{y} \in \mathcal{M}$,
we explicitly model the possible gross error in $\bs{y}$. This gives rise to a \emph{corrected}
manifold-valued response $\bs{y}^c$, obtained by removing the identified gross error component from $\bs{y}$, which is
realized via geodesic curves on $\mathcal{M}$. Note that $\bs{y}^c$ can be the same as $\bs{y}$,
which corresponds to no gross error in $\bs{y}$. The corrected manifold-valued data are then used as
the responses in multivariate geodesic regression.
Figure 1: Illustration of our approach, which consists of two main components.
The first obtains the corrected response $\bs{y}_i^c$ by removing its possible gross error,
as illustrated by the directed lines; the second involves the remaining manifold-valued regression
process. Here $x_1^1$ and $x_1^2$ are the components of input $\bs{x}_1$ along tangent vectors
$\bs{v}_1$ and $\bs{v}_2$ at point $\bs{p}\in\mathcal{M}$. Similarly for $x_2^1$ and $x_2^2$.
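
The two components described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated on this page: the manifold $\mathcal{M}$ is taken to be the unit sphere in $\mathbb{R}^3$ purely for simplicity, the gross error is represented by a tangent vector $\bs{g}$ at $\bs{y}$ so that $\bs{y}^c = \mathrm{Exp}(\bs{y}, \bs{g})$, and the function names (`sphere_exp`, `sphere_log`, `predict`, `correct`) are hypothetical. The sketch is not the PALMR algorithm: it does not estimate $\bs{p}$, $\bs{v}_j$, or $\bs{g}$, and the correction used in the demo is an oracle value computed from the known clean response.

```python
import numpy as np

def sphere_exp(p, v):
    """Exponential map on the unit sphere: follow the geodesic from p along tangent v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p.copy()  # zero tangent vector: stay at p
    return np.cos(norm_v) * p + np.sin(norm_v) * (v / norm_v)

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q.

    Its norm equals the geodesic distance between p and q.
    Antipodal points (q = -p) are not handled in this sketch.
    """
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    w = q - cos_theta * p  # component of q orthogonal to p
    norm_w = np.linalg.norm(w)
    if norm_w < 1e-12:
        return np.zeros_like(p)
    return np.arccos(cos_theta) * w / norm_w

def predict(p, V, x):
    """Geodesic regression prediction Exp(p, sum_j x^j v_j); rows of V are tangent at p."""
    return sphere_exp(p, x @ V)

def correct(y, g):
    """Corrected response y^c = Exp(y, g) (assumed form); g = 0 leaves y unchanged."""
    return sphere_exp(y, g)

# Base point p and two tangent vectors v_1, v_2 at p (hypothetical values).
p = np.array([0.0, 0.0, 1.0])
V = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
x = np.array([0.3, -0.2])  # components x^1, x^2 of one input

y_clean = predict(p, V, x)

# Simulate a gross error: move y_clean a geodesic distance of 1.0 along a tangent direction.
u = np.array([1.0, 0.0, 0.0])
e = u - np.dot(u, y_clean) * y_clean  # project u onto the tangent space at y_clean
e = e / np.linalg.norm(e)
y = sphere_exp(y_clean, e)

# Oracle correction for illustration only; a fitted model would have to estimate g.
g = sphere_log(y, y_clean)
y_c = correct(y, g)

print("distance before correction:", np.linalg.norm(sphere_log(y_clean, y)))
print("distance after correction: ", np.linalg.norm(sphere_log(y_clean, y_c)))
```

In this sketch a response without gross error corresponds to $\bs{g} = \bs{0}$, for which `correct` returns $\bs{y}$ unchanged, matching the remark that $\bs{y}^c$ can coincide with $\bs{y}$.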

Publications
 Xiaowei Zhang, Li Cheng, Shudong Shi and Yu Sun. Multivariate Regression with Gross Errors on Manifold-valued Data.
arXiv:1703.08772, 2017. [pdf]

Acknowledgement
 The CMIND dataset used in our experiments contains only six slices of the whole-brain DTI data, which
were obtained from the CMIND database released by
Cincinnati Children's Hospital Medical Center (CCHMC). We claim no ownership of or credit for the dataset;
our only contribution is converting the data to .mat format.
 Our MATLAB code also includes an implementation of a comparison method
(named MGLM in our manuscript) for ease of comparison. This implementation is by the authors of MGLM and is
available here.
