============
Description:
============

This is the main function implementing the Stringing method. Stringing of
high-dimensional data uses distance-based metric Multidimensional Scaling to map
the high-dimensional predictors to locations on a real interval, such that predictors
that are close in a suitable sample metric are also located close to each other on the
interval. Once an underlying stochastic process and the corresponding random trajectory
for each subject have been identified, established techniques from Functional Data
Analysis can be applied for further statistical analysis.

Reference: Chen, K., Chen, K., Muller, H.G., Wang, J.L. (2011).
Stringing high-dimensional data for functional analysis.
J. American Statistical Association 106, 275-284.

========
Usage:
========

[newXmat, stringed, disMatrix, x, t_x, SubId, xx, pred_x, pred_pc] = Stringing(Xmat, dis_option, disMat, isNewSub, isFPCA, isPlot)

=======
Input:
=======
Xmat    : n*p data matrix, where Xmat(i, :) is the row vector of
         measurements for the ith subject, i=1,...,n. It may contain data for subjects
         that are used for prediction only; this is controlled by "isNewSub", which is either
         a vector of 0's and 1's indicating whether each subject is used for estimation (0)
         or prediction (1), or a positive integer nn. In the latter case,
         the last nn data rows are used for prediction and the remaining n-nn
         subjects are used for estimation. When "isNewSub" is set to [], all n subjects
         are used for estimation and no prediction will be calculated;
         see "isNewSub" for more details.
dis_option: one of the following:
         'euclidean'   - Euclidean distance (default)
         'correlation' - One minus the sample linear correlation between
                         observations (treated as sequences of values).
         'spearman'    - One minus the sample Spearman's rank correlation
                         between observations (treated as sequences of values).
         'hamming'     - Hamming distance, percentage of coordinates that differ.
         'user'        - a user provided dissimilarity matrix named 'disMat' will be
                         used; in this case the input 'disMat' must be provided.

disMat:  the user provided dissimilarity matrix, only matters when dis_option
         is set to 'user'. You can specify disMat as either a full p-by-p matrix,
         or in upper triangle form. A full dissimilarity matrix must be
         real and symmetric, and have zeros along the diagonal and non-negative
         elements everywhere else.  A dissimilarity matrix in upper triangle
         form must have real, non-negative entries. NaNs are treated as
         missing values and ignored.  Inf is not accepted.
isNewSub: i) 1*n vector of 0s and 1s, where
             1 : the data for the corresponding subject, i.e.,
                 Xmat(isNewSub == 1, :), are used for prediction only.

             0 : the data for the corresponding subject, i.e.,
                 Xmat(isNewSub == 0, :), are used for estimation only.

          This option is convenient for computing leave-one-out
          prediction if desired.

          ii) If it is a positive integer, say 5, then the last 5 subjects
             are used for prediction. This option is convenient for obtaining predictions 
             for a set of new subjects.

          iii) set isNewSub = [] for the default value, which is no prediction.


isFPCA : a scalar (0 or 1) that indicates whether functional principal component
         analysis (FPCA) should be performed on the stringed data.
         0 : do not perform FPCA
         1 : perform FPCA; the results are stored in xx. (default)
isPlot:  a scalar (0 or 1) that indicates whether to plot the stringing function.
         0 : do not plot (default)
         1 : plot
Details: i) Any unspecified or optional arguments can be set to "[]" for
            default values;
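
Example: a minimal sketch of the input forms described above. The simulated
         data and the variable names n, p and D are illustrative assumptions,
         not part of the package; Stringing.m is assumed to be on the MATLAB path.

         n = 50; p = 20;
         Xmat = randn(n, p);                 % simulated n*p data matrix

         % Euclidean distance, all subjects used for estimation, no FPCA, no plot
         [newXmat, stringed] = Stringing(Xmat, 'euclidean', [], [], 0, 0);

         % User-provided full p*p dissimilarity matrix between predictors
         D = 1 - abs(corrcoef(Xmat));        % non-negative entries
         D = (D + D') / 2;                   % enforce exact symmetry
         D(1:p+1:end) = 0;                   % enforce zeros on the diagonal
         [newXmat, stringed] = Stringing(Xmat, 'user', D, [], 0, 0);

         % Last 5 subjects used for prediction only, given as a 0-1 vector ...
         isNewSub = [zeros(1, n-5), ones(1, 5)];
         [newXmat, stringed] = Stringing(Xmat, 'euclidean', [], isNewSub, 1, 0);
         % ... or as a positive integer
         [newXmat, stringed] = Stringing(Xmat, 'euclidean', [], 5, 1, 0);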
=======
Output:
=======

 newXmat: an n*p matrix containing the stringed data.
 stringed: the estimated stringing function. newXmat = Xmat(:, stringed);
 disMatrix: the dissimilarity matrix used in stringing. If not provided
            by the user, it is calculated based on the estimation subjects,
            Xmat(isNewSub == 0, :).
 x      : 1*n cell array of the corresponding random trajectory for each subject identified
          by the stringing method, where x{i} is the row vector of
          measurements for the ith subject, i=1,...,n. The cells
          x(isNewSub == 1) belong to the subjects used for prediction.
 t_x    : 1*n cell array, t_x{i} is the row vector of time points for the ith
          subject at which corresponding measurements x{i} are taken,
          i= 1,...,n. 
 SubId:  a 0-1 vector indicating which subjects are used for estimation (0) and
         which are used for prediction (1), computed from isNewSub.
 xx     : the output from FPCA.m based on the subjects with isNewSub == 0.
 pred_x:  1*n cell array, pred_x{i} is the row vector of the estimated (or predicted)
           curve for subject i.
 pred_pc: n*K matrix of estimated (or predicted) FPC scores, where K is the number of components selected. 

 Note: If isFPCA == 0, the output xx, pred_x and pred_pc are all [].
        The output x and t_x can be used for further functional data
        analysis as needed,
        such as functional principal component analysis and functional regression.
        See FPCreg, FPCfam and so on.

 See exampleStringing.m
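
 Example: a minimal sketch of how the outputs might be used. The simulated
          data and the names est, pred and ok are illustrative assumptions;
          exampleStringing.m remains the reference example.

          Xmat = randn(50, 20);               % simulated 50*20 data matrix
          [newXmat, stringed, disMatrix, x, t_x, SubId, xx, pred_x, pred_pc] = ...
              Stringing(Xmat, 'correlation', [], 5, 1, 0);

          % Relation stated in the Output section: columns reordered by "stringed"
          ok = isequal(newXmat, Xmat(:, stringed));

          est  = find(SubId == 0);            % subjects used for estimation
          pred = find(SubId == 1);            % subjects used for prediction only

          % Stringed trajectory of the first subject
          plot(t_x{1}, x{1}, 'o-');

          % FPC scores of the prediction subjects (rows of the n*K matrix)
          pred_pc(pred, :)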
