 ===========
 Description:      
 =========== 
 
             This is the main program to perform Functional Principal Component
             Analysis (FPCA) via PACE for estimation of curves and their derivatives. 
             The principal component scores can be estimated through conditional 
             expectation or via classical integration. For the latter, one can 
             choose a shrinkage method for estimated scores.  See Reference
             [2].  For derivatives, see Reference [17].  For shrikage
             estimation, see Reference [1].
 
 ======
 Usage:
 ======
 
  function [no_opt,sigma,lambda,phi,eigen,xi_est,xi_var,mu,bw_mu,xcov,bw_xcov, 
            xcovfit, AIC,BIC,FVE,y_pred, y_predOrig, out1,out21,...
            y,t, rcov, regular, rho_opt, sigmanew, mucopy, phicopy, eigencopy, ...
            out1copy, out21copy, xcovcopy, xcovfitcopy, xcorr, ops]...
            =PCder(y,t,bwmu,bwmu_gcv, bwxcov,bwxcov_gcv, ntest1,ngrid1,selection_k, 
                   FVE_threshold, maxk,control,regular,error, ngrid,method,shrink,...
                   newdata, kernel, npoly, nder, method_int, numBins, yname, screePlot, designPlot, corrPlot, rho, verbose)
 
 ================
 Input Arguments:
 ================
 
 Input y:          1*n cell array, y{i} is the vector of measurements for the 
                   ith subject, i=1,...,n.
 
 Input t:          1*n cell array, t{i} is the vector of time points for the 
                   ith subject for which corresponding measurements y{i} are 
                   available, i=1,...,n.
 
 Input bwmu:       a vector of non-negative bandwidths for mu(t) and/or its
                   derivatives
                   0:  use cross-validation or generalized cross-validation
                       to choose bandwidth automatically. [Default]
                   bwmu(j) > 0: user-specified bandwidth.
                   * bwmu(1) is always the bandwidth for mean function with nder(1) = 0;
                   * bwmu(j), where j > 1 is bandwidth for derivative of mean function
                     with order nder(j).
                   See the description of "nder" below.
 
 Input bwmu_gcv:   a vector of 0's and 1's for mu(t) and/or its derivatives
                   For choice bwmu(j) = 0, two types of cross-validations can 
                   be performed: One-curve-leave-out cross-validation (CV) 
                   or  generalized cross-validation (GCV)
                   0: CV   (may be time-consuming) 
                   1: GCV  (faster)                [Default]                
 
 Input bwxcov:     a matrix of bandwidths for covariance surface used for 
                   smoothing of cov(X(t), X(s)) and/or its partial derivatives
                   w.r.t. t.
                   each row is a 1*2 vector
                   bwxcov(i): i-th coordinate of bandwidth vector, i=1,2.
                   bwxcov(1)==0 & bwxcov(2)==0: use cross-validation (CV)
                   or generalized cross-validation (GCV) for automatic
                   selection.                      [Default]
                   bwxcov(1)>0 & bwxcov(2)>0: user-specified bandwidths.
                   * bwxcov(1,1) and bwxcov(1,2) are always the bandwidths choices
                     for covariance function
                   * bwxcov(j,1) and bwxcov(j,2), where j > 1 are bandwidths for partial derivative
                     of the covariance function, where bwxcov(j,1) is in the derivative 
                     direction and bwxcov(j,2) is in the smoothing direction with order nder(j). 
                   See the description of "nder" below.
                   
 Input bwxcov_gcv: a vector of 0's and 1's for bandwidth choice used for 
                   smoothing of cov(X(t), X(s)) and/or its partial derivatives
                   w.r.t. t. 
                   If setting bwxcov = [0 0], automatic bandwidth selection
                   by CV or GCV choices
                   0: CV method (may be time-consuming, can be accelerated by
                      choosing small values for ntest1 and ngrid1)..
                   1: GCV method (faster)          [Default]
 
 Input ntest1:     integer(<=n), number of curves used for CV when choosing
                   bandwidths for smoothing the covariance surface. The subjects
                   in the test set are randomly selected from n subjects. Small
                   ntest1 will accelerate CV at less accuracy. [Default is 30.]
 
 Input ngrid1:     integer, number of support points for the covariance surface
                   in the CV procedure (selecting bandwidths of covariance).
                   Note that input ntest1 and ngrid1 provide options to save
                   computing time when using CV or GCV.        [Default is 30.]
 
 Input selection_k: the method of choosing the number of principal components K.
                    'AIC1': use AIC criterion with pseudo-likelihood of
                            measurements (marginal likelihood).
                    'AIC2': use AIC criterion with likelihood of measurements
                            conditional on estimated random coeffecients.
                    'BIC1': use BIC criterion with pseudo-likelihood of
                            measurements (marginal likelihood).
                    'BIC2': use BIC criterion with likelihood of measurements
                            conditional on estimated random coeffecients.
                    'FVE' (fraction of variance explained) : use scree plot
                            approach to select number of principal components),
                                                      [Default]
                     see "FVE_threshold" below.
                     'AIC_R': use regression AIC criterion (select number
                              of principal components based on the linear
                              relationship between predictor and response)
                     positive integer K: user-specified number of principal
                                         components
 
                    Note: BIC1 and FVE produce the most parsimonious models.
 
 Input FVE_threshold:  a positive number that is between 0 and 1 [Default is 0.85.] 
                       It is used with the option selection_k = 'FVE' and selects 
                       the number of principal components that explain at least 
                       "FVE_threshold" of total variation, which is the fraction 
                       of variance explained.
 
 Input maxk:      integer, the maximum number of principal components to consider
                  if using automatic methods to choose K, i.e., 'AIC1', 'AIC2',
                  'BIC1' or 'BIC2' defined by selection_k. [Default is 20.]
                  Note, when selection_k = 'FVE' or 'AIC_R', maxk is ignored.
 
 Input control:   'auto', Select K by minimizing AIC or BIC score, or find the 
                          first K such that the FVE_threshold is exceeded.  [Default]
                  'look', a scree plot (FVE% Vs No. of PC) will be generated based 
                          on K <= ngrid/2, where ngrid is the number of support points 
                          in each direction of covariance surface. User will be prompted 
                          to enter user-specified K after viewing scree plot. This can 
                          can be combined with any setting of selection_k.
 
 Input regular:   0, sparse (or irregular) functional data.      
                  1, regular data with missing values
                  2, completely balanced (regular) data.
                  [], automatically chosen based on the design in t.   [Default]
 
 Input error:     0, no additional measurement error assumed.       
                  1, additional measurement error is assumed.          [Default]
 
 Input ngrid:     integer, number of support points in each direction of 
                  covariance surface when performing principal component 
                  analysis ( ngrid > K).
                                                            [Default is 51.]
 Input method:    used for computing random effects \xi_{ik}                 
                  'CE': conditional expectation method      [Default]
                  'IN': classical integration method
                  Note: 'CE' can be applied for sparse data or regular data, but
                        'IN' only in the case of regular data.
 
 Input shrink:    indicator of whether applying shrinkage to estimates of random 
                  coefficients (for regular data only)
                  0:  no shrinkage when method = 'CE' or error = 0   [Default]
                  1:  shrinkage when method = 'IN' and error =1, otherwise, this 
                      will be re-set to 0.
 
 Input newdata:   a row vector of user-defined output time grids for
                  all curves. This corresponds to "out1" in the output argument
                  If newdata = [], then "out1" corresponds to the set of distinct
                  time points from the pooled data.
                  "newdata" is supposed to be a vector in ascending order on
                   the domain of the functions.           [Default is []]
 
 Input kernel:    a 1*length(nder) cell array
                  Each element of the cell array contains a character string to define 
                  the kernel to be used in the 1-D or 2-D smoothing for X(t) and/or
                  its derivatives
                  kernel{j} = 'epan'  ==> Epanechnikov kernel [Default for dense designs with n_i >= 20]  
                              'rect'  ==> Rectangular kernel
                              'gauss'  ==> Gaussian kernel    [Default for sparse designs, regular designs with
                                                               missings, dense designs for n_i < 20]
                              'gausvar' ==> variant of Gaussian kernel
                              'quar'  ==> Quartic kernel
                  Note: The Gaussian kernel is overall best for sparse designs but is slower than the other kernels 
                        and if computational speed is of concern then one may wish to use the Epanechnikov kernel 
                        also for the case of sparse designs.
 
 Input npoly:     a vector of integers that are >= 1, when it is 1, local linear fit is used
                  Each element in the vector, npoly(j), defines the degree of polynomial 
                  to be used to estimate mu(t), G(t,s) and/or their derivatives
                  Note that npoly(j) must be at least nder(j)+1 and npoly(j)- nder(j) should be
                  an odd number, where nder is defined below. 
                  npoly(1) is ALWAYS used for estimation of mu(t) and G(t,s). [Default: npoly = nder+1]
 
 Input nder:      a vector of integers that define the order of derivative of interest.
                  nder(j) = 0,1,2..., etc. When nder(j) = 0, estimation of curve itself.
                  nder(1) MUST be 0 and it is used for estimation of mu(t) and G(t,s).
                                                                              [Default: nder = 0]
 Input method_int: integer value of 1, 2 or 3 that defines the numerical integration methods for
                   for the estimation of derivative of eigenfunction(s)
                   1  trapezoidal rule
                   2  Romberg integration  [Default]
                   3  Gaussian Quadrature
 
 Input numBins:   0: no binning
                  a positive interger (>= 10): prebin the data with user-defined
                  number of bins. When numBins < 10, no binning will be performed.
 		  []:  prebin the data with the following rule     [Default]
                  
                  i) When the input data is regular = 1 or 2
                     m = max of n_i, where n_i is the number of repeated measurements
                     for i-th subject.
                  ii) regular = 0
                     m = median of n_i
                
                  When m <= 20 subjects, no binning.
                  When n <= 5000 subjects and m <= 400, no binning.
                  When n <= 5000 subjects and m > 400, numBins = 400.
                  When n > 5000 subjects, compute
                  
                  m* = max(20, (5000-n)*19/2250+400)  
 
                  if m > m*, numBins = m*
                  if m <= m*, no binning   
                     
                  This option accelerates analysis, especially for data with 
                  many time measurements. 
 
 Input yname:     a character string which denotes the name of the current 
                  function to be estimated.               [Default is []]
                  It is used in the title part of the scree plot output.
                  If it is set to [], then yname is set to be the same 
                  name as the first input argument from PCder() or FPCder(). 
                  For example, if this analysis is for function y,
                  then yname = 'y'.
 
 Input screePlot: indicator of whether to create the scree plot
                  1  a scree plot will be created     
                  0  no scree plot will be created    [Default]    
 
 Input designPlot: indicator of whether to create the design plot
                  1  a design plot will be created
                  0  no design plot will be created   [Default]
                  Interpretation of design plot: All subareas of the 
                  domain x domain support square of the covariance surface need
                  to be covered more or less homogeneously with pairs of design points.
 Input corrPlot:  indicator of whether to create the correlation surface plot
                  1  a correlation surface plot will be created
                  0  no correlation surface plot will be created    [Default]
 
 Input rho:       truncation threshold for the iterative residual that is used
                  in the estimation of FPC scores. (see FPCscore.pdf under Help/ for more details)
                  -1: compute unadjusted FPC scores (as in previous PACE version)
                  >0: user-defined choice of rho
                  0: do not set truncation threshold
                  'cv': use randomized leave-one-measurement-out CV approach to find
                        the optimal value of rho.     [Default]
 
 Input verbose:   a character string
                  'on': display diagnostic messages       [Default]
                  'off': suppress diagnostic messages
 ========
  Details:
 ========
        For j = 1,...,length(nder)
        1)       bwmu(j) = 0 and bwmu_gcv(j) = 1 ==> use GCV method to choose 
                 the bandwidth for mean function or its derivative.
 
                 For Gaussian kernel, the optimal bandwidth from GCV is multiplied
                 by a factor: a) nder(j) = 0, the factor is 1.1
                              b) nder(j) = 1, the factor is 1.2
                              c) nder(j) >= 2, the factor is 2.
 
                 bwmu(j) = 0 and bwmu_gcv(j) = 0 ==> use CV method to choose the 
                 bandwidth for mean function or its derivative
 
                 bwmu(j) > 0 then bwmu_gcv(j) will be ignored, subjective 
                 bandwidth choice
 
        2)       bwxcov(j,:) = [0 0] and bwxcov_gcv(j) = 1 ==> use GCV method to 
                 choose the bandwidth for cov function or its partial derivative
 
                 For Gaussian kernel, the optimal bandwidth from GCV is multiplied
                 by a factor: a) nder(j) = 0, the factor is 1.1
                              b) nder(j) = 1, the factor is 1.2
                              c) nder(j) >= 2, the factor is 2.
 
                 bwxcov(j,:) = [0 0] and bwxcov_gcv(j) = 0 ==> use CV method to 
                 choose the bandwidth for cov function or its partial derivative
 
                 bwxcov(j,1) > 0 and bwxcov(j,2) > 0 then bwxcov_gcv will be 
                 ignored, subjective bandwidth choice
 
                 For nder(j) = 0, if eigenvalues estimation is of primary interest, 
                 you may want to undersmooth the covariance surface for better estimates
                 of eigenvalues.
 
        3)       If error=1, AIC1, BIC1, AIC2, BIC2, FVE all can be used 
                 for choosing K.
 
                 If error=0, only AIC1, BIC1 and FVE can be used by 
                 definition of these criteria.
 
        4)       If some arguments are not needed or you want to use 
                 default values, just use [] for these arguments.
 
                 Alternatively, use setOptions() to define nessary input 
                 arguments and call PCA through FPCA.
 
        5)       In general, any output in the form of 1-dimensional 
                 vectors will be a row vector and any output related to 
                 1-dimensional cell arrays will be a row cell array.
 
 =================
 Output Arguments:
 =================
 Output no_opt:   integer, automatically or subjectively selected value of K, the number of selected components.
 
 Output sigma:    scalar, estimate of measurement error variance if 
                  error=1, while it is [] if error=0.
 
 Output lambda:   1*K vector, estimated eigenvalues (variances of functional principal components scores).
 
 Output phi:      a 1*length(nder) cell array of estimated eigenfunctions for 
                  X(t) and/or its derivatives.
                  Each element of the cell array contains
                  N*K matrix, estimated eigenfunctions 
                  valued at distinct input time points with ascending 
                  order of all subjects, corresponding to out1
                  phi{1} is ALWAYS the estimated eigenfunctions
 
 Output eigen:    a 1*length(nder) cell array of estimated eigenfunctions for X(t) and/or its derivatives.
                  Each element of the cell array contains:
                  ngrid*K matrix, estimated eigenfunctions, 
                  valued at out21, ngrid of the pooled distinct time points 
                  with ascending order of all subjects,
                  phi is an interpolated version of eigen at out1
                  eigen{1} is ALWAYS the estimated eigenfunctions
 
 Output xi_est:   n*K matrix, predictions for random coeffecients (PC 
                  scores) for n subjects.
 
 Output xi_var:   K*K matrix, Var(PC score)-Var(estimated PC score). The 
                  omega matrix in equation (7) of the paper, which is used 
                  to construct the point-wise C.I. for X_i(t)
 
 Output mu:       a 1*length(nder) cell array of estimated mu(t) and/or its derivatives.
                  Each element of the cell array contains:
                  1*N vector, estimated mean functions valued at distinct 
                  input time points (newdata = []), in ascending order from 
                  all subjects, corresponding to out1; when newdata is defined, 
                  corresponds to the time points from newdata, same as 
                  out1.
                  mu{1} is ALWAYS the estimated mean function
 
 Output bw_mu:    a vector of automatically or subjectively slected
                  bandwidth for mu(t) and/or its derivatives.
                  Each element of the vector contains:
                  a scalar(>0), automatically or subjectively selected 
                  bandwidth
 
 Output xcov:     a 1*length(nder) cell array of estimated smoothed G(t,s) and/or its
                  partial derivative w.r.t. t.
                  Each element of the cell array contains:
                  ngrid*ngrid matrix, smoothed covariance surface (diagnal 
                  removed), corresponding to out21
                  xcov{1} is ALWAYS the estimated smoothed G(t,s)
 
 Output bw_xcov:  a matrix of automatically or subjectively selected
                  bandwidths for G(t,s). 
                  Each row of the matrix contains:
                  1*2 vector(>0), automatically or subjectively selected 
                  bandwidths
 
 Output xcovfit:  a 1*length(nder) cell array of fitted covariance surface and/or its
                  partial derivative w.r.t. t. 
                  Each element of the cell array contains:
                  ngrid * ngrid matrix, fitted covariance surface(or its derivatives)
                  , based on truncated estimate of eigenvalues ("lambda") and 
                  eigenfunctions ("eigen") or its derivatives, corresponding 
                  to out21
                  xcovfit{1} is ALWAYS the fitted covariance surface
 
 Output xcorr:    ngrid * ngrid matrix, fitted correlation surface, based
                  on truncated estimate of eigenvalues ("lambda") and
                  principal component functions ("eigen"), corresponding
                  to out21
 
 Output AIC:      1*K vector, AIC values obtained when choosing K from 
                  K=1 to K=maxk, where AIC(K) is the minimum. If AIC 
                  method is not applied, it is []
 
 Output BIC:      1*K vector, BIC values obtained when choosing K from 
                  K=1 to K=maxk, where BIC(K) is the minimum. If BIC 
                  method is not applied, it is []
 
 Output FVE:      1*ngrid vector of fraction of variance explained
 
 Output y_pred:   a 1*length(nder) cell array of cell array for the predicted X(t) and/or
                  its derivatives.
                  Each element of ypred, say y_predj = y_pred{j} is a cell array, such that
                  y_predj{i} is the vector of predictions for the ith subject evaluated at 
                  time points from the output grid vector "out1".
                  j = 1: curve prediction with order nder(1) = 0, 
                  j > 1, derivative of curve prediction with order nder(j).
 
 Output y_predOrig: a 1*length(nder) cell array of cell array for the predicted X(t) and/or
                    its derivatives.
                    Each element of y_predOrig, say y_predOrigj = y_predOrig{j} is a cell array
                    such that y_predOrigj{i} is the vector of predictions for the ith subject 
                    at the same time points as the input.
                    j = 1, curve prediction with order nder(1) = 0, 
                    j > 1, derivative of curve prediction with order nder(j).
 
 Output out1:     1*N vector, distinct input time points with ascending 
                  order from all subjects if newdata = []; otherwise, it 
                  is the same as newdata.
 
 Output out21:    1*ngrid vector, a grid of time points for which the 
                  smoothed covariance surface assumes values, i.e.,
                  ngrids from out1.
 
 Output y:        if no binning is performed, same as the input y
                  if binning is performed, 1 * n cell array, y{i} is a vector 
                  of measurements after binning for subject i, i = 1,...,n
 
 Output t:        if no binning is performed, same as the input t
                  if binning, 1 * n cell array, t{i} is a vector of
                  time points after binning for subject i, i = 1,...,n
                  Each value of t{i} corresponds to the midpoints of
                  each bin.
 Output rcov:     a struct object that contains tpairn, cxxn, indx, win and cyy
                  tpairn:  2 * N  vector denotes the  pairs of time points for subject
                  concatenating as two vectors
                  cxxn:    1 * N vector of raw covariance corresponding to tpairn
                  For tpairn(1,:) == tpairn(2,:), then rcov.cxxn contains measurement
                  errors.
 
 Output regular:  if no binning is performed or regular = 2, this is the 
                  same as the input
                  if binning is performed and regular = 0, it will be
                  reset to regular = 1. In other words, after binning,
                  the sparse and irregular case is analyzed as regular 
                  with missings by sampled data.
 
 Output rho_opt:  if rho is set as 'cv', then rho_opt is the optimal rho obtained
                  from the CV method, otherwise, it is the same as the input rho
 
 Output sigmanew: if rho is set as >0, 0 or 'cv', then sigmanew is the iterative
                  residual sum of squares (see FPCscore.pdf for more details). It
                  can be used as an estimate of the variance of the measurement
                  errors.
                  if rho is set as -1, then sigmanew is set to the same as output
                  sigma. 
 
  See also FPCder, setDerOptions, showDerOptionNames, exampleDer