Title: | Bootstrap on Classical Biplots and Clustering Disjoint Biplot |
---|---|
Description: | A GUI with which the user can construct and interact with Bootstrap methods on Classical Biplots and with Clustering and/or Disjoint Biplot. This GUI is also intended for estimating any numerical data matrix using the Clustering and Disjoint Principal Component Analysis (CDPCA) methodology. |
Authors: | Ana Belen Nieto Librero<[email protected]>, Adelaide Freitas<[email protected]> |
Maintainer: | Ana Belen Nieto Librero <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.3 |
Built: | 2025-02-13 05:44:33 UTC |
Source: | https://github.com/cran/biplotbootGUI |
The biplotbootGUI package is a graphical user interface to construct and interact with Classical Biplots and, combined with Bootstrap methods, provides confidence intervals based on percentiles, t-bootstrap and BCa to measure the accuracy of the estimators of the parameters given by them.
Package: | biplotbootGUI |
Type: | Package |
Version: | 1.3 |
Date: | 2023-12-11 |
License: | GPL>=2 |
Ana Belen Nieto Librero [email protected], Adelaide Freitas [email protected]
Maintainer: Ana Belen Nieto Librero [email protected]
Gabriel, K. R. (1971). The Biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58(3), 453-467.
Galindo, M. P. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio, 10(1), 13-23.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1-26.
Efron, B. (1987). Better bootstrap confidence intervals. Journal of the American Statistical Association, 82, 171-185.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman and Hall.
Macedo, E. and Freitas, A. (2015). The alternating least-squares algorithm for CDPCA. Communications in Computer and Information Science (CCIS), Springer Verlag pp. 173-191.
Nieto, A. B., Galindo, M. P., Leiva, V., & Vicente-Galindo, P. (2014). A methodology for biplots based on bootstrapping with R. Revista Colombiana de Estadistica, 37(2), 367-397.
Vichi, M and Saporta, G. (2009). Clustering and disjoint principal component analysis. Computational Statistics and Data Analysis, 53, 3194-3208.
data(iris)
biplotboot(iris[,-5])
The biplotboot function is a graphical user interface to construct and interact with Classical Biplots and, combined with Bootstrap methods, provides confidence intervals based on percentiles and t-bootstrap to measure the accuracy of the estimators of the parameters given by them.
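The percentile intervals mentioned above can be illustrated outside the GUI. The following is a minimal base R sketch, not the package's internal code: it resamples the rows of the iris measurements and builds percentile intervals for the eigenvalues of the correlation matrix. The number of resamples `B`, the confidence level `conf`, the seed and the helper `eigen_values` are assumptions chosen only for illustration.

```r
# Percentile bootstrap for PCA eigenvalues (base R only, illustrative).
set.seed(1)                         # assumed seed, for reproducibility only
X <- as.matrix(iris[, -5])
B <- 200                            # number of resamples (assumed value)
conf <- 0.95                        # confidence level (assumed value)

# Hypothetical helper: eigenvalues of the correlation matrix of a data set
eigen_values <- function(M) {
  eigen(cor(M), symmetric = TRUE, only.values = TRUE)$values
}

observed <- eigen_values(X)

# Resample rows with replacement and recompute the statistic each time.
# The result has one row per eigenvalue and one column per resample.
boot <- replicate(B, {
  idx <- sample(nrow(X), replace = TRUE)
  eigen_values(X[idx, , drop = FALSE])
})

# Percentile interval: empirical quantiles of the bootstrap replications
alpha <- 1 - conf
ci <- t(apply(boot, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
result <- cbind(observed = observed, lower = ci[, 1], upper = ci[, 2])
print(round(result, 3))
```

The same replications could feed a histogram or QQ-plot per parameter; the t-bootstrap and BCa intervals need additional steps that this sketch does not cover.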
biplotboot(x)
x | A data frame with the information to be analyzed |
When the function is launched, you must first select the number of resamples to draw, the confidence level used for the intervals reported in the results, and the parameters for which inferential results are to be calculated. An options window is then displayed, where you can change the color, size, label and/or symbol of an element or a set of elements; select the kind of Biplot factorization to apply; select the data transformation; change the size of the window containing the graphs; and tick the checkbox to show the axes in the graph. Press the Graph button and then choose the number of axes to retain. Once the graph is shown, you can change the characteristics of the points with the mouse. Press the right mouse button to open a window for changing the color, size, label and/or symbol of the point nearest to the clicked position. Press the left mouse button to open a window offering three options: Change the position label, Remove label or Do nothing. It is also possible to select the dimensions shown in the graph and to change the limits of the axes. The window has five menus with their corresponding submenus:
- File
  - Copy image
  - Save image
    - PDF file
    - Eps file
    - Png file
    - Jpg/Jpeg file
  - Exit
- 3D
  - 3D
- Projections
  - Variables
  - Back to original data
- Options
  - Change title
  - Show/Hide axes
- Cluster
  - Hierarchical cluster with biplot coordinates
  - K-means with biplot coordinates
  - K-medoids with biplot coordinates
  - Back to original graph
The File menu provides options to save the graph and to exit the program. The second menu shows the graph in three dimensions. The third menu lets the user project the individuals onto the direction representing a variable selected from a listbox, and also allows returning to the original graph. The fourth menu allows changing the title and showing or hiding the axes in the graph. The fifth menu lets the user analyze the biplot coordinates with clustering techniques. The results in inferential form are saved to a file, together with graphs containing histograms and QQ-plots generated from the bootstrap replications.
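To make the idea behind the Cluster menu concrete, here is a minimal base R sketch of clustering on biplot coordinates. It is an illustration under stated assumptions, not the code the GUI runs: the data are column-standardised, the row markers are taken as U D from the singular value decomposition (as in the JK and HJ factorizations), two axes are retained, and three clusters are requested.

```r
# Clustering the row markers of a biplot (base R only, illustrative).
X <- scale(as.matrix(iris[, -5]))   # column-standardised data (assumed transformation)
dec <- svd(X)                       # X = U D V'
k <- 2                              # number of retained axes (assumed value)

# Row markers U D on the first k axes
row_coords <- dec$u[, 1:k] %*% diag(dec$d[1:k])

set.seed(1)                         # assumed seed; k-means starts are random
km <- kmeans(row_coords, centers = 3, nstart = 20)

# Hierarchical clustering on the same coordinates, cut into 3 groups
hc <- cutree(hclust(dist(row_coords), method = "ward.D2"), k = 3)

# Cross-tabulate the two partitions to see how far they agree
agreement <- table(kmeans = km$cluster, hierarchical = hc)
print(agreement)
```

K-medoids on the same coordinates could be obtained with `pam()` from the cluster package, which is left out here to keep the sketch free of dependencies.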
A graph showing the data representation, an output file containing the contributions, qualities of representation, goodness of fit, coordinates and eigenvalues, and another output file containing these results in an inferential form.
Ana Belen Nieto Librero [email protected], Purificacion Galindo Villardon [email protected]
Gabriel, K. R. (1971). The Biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58(3), 453-467.
Galindo, M. P. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio, 10(1), 13-23.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1-26.
Efron, B. (1987). Better bootstrap confidence intervals. Journal of the American Statistical Association, 82, 171-185.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. New York: Chapman and Hall.
Nieto, A. B., Galindo, M. P., Leiva, V., & Vicente-Galindo, P. (2014). A methodology for biplots based on bootstrapping with R. Revista Colombiana de Estadistica, 37(2), 367-397.
data(iris)
biplotboot(iris[,-5])
The CDBiplot function is a graphical user interface to construct and interact with Clustering and/or Disjoint Biplot.
CDBiplot(data, clase)
data | A data frame with the information to be analyzed |
clase | A vector containing the real classification of the objects in the data |
When the function is launched, you must first select the kind of analysis to apply to the data. A window is then displayed to select the number of clusters, the number of components, the tolerance, the number of iterations and the number of repetitions of the algorithm. Press the OK button and the graph will be shown. Press the left mouse button to open a window offering three options: Change the position label, Remove label or Do nothing. It is also possible to select the dimensions shown in the graph and to change the limits of the axes. The window has four menus:
- File
  - Copy image
  - Save image
    - PDF file
    - Eps file
    - Png file
    - Jpg/Jpeg file
  - Exit
- 3D
  - 3D
- Options
  - Change title
  - Show/Hide axes
  - Show/Hide variables
  - Show/Hide row labels
- Cluster
  - Convex-hull
The File menu provides options to save the graph and to exit the program. The second menu shows the graph in three dimensions. The third menu allows the user to change the title and to show or hide the axes, the variables and the row labels in the graph. The last menu lets the user draw a (filled or empty) convex hull around each cluster. The program saves a file containing the main results of the analysis.
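The convex hull option can be pictured with base R alone. The sketch below is only an illustration, not the GUI's drawing code: it uses the first two principal components of the iris measurements as stand-in plot coordinates and the Species factor as a stand-in for the clusters, and the helper `draw_hulls` is hypothetical.

```r
# Convex hull of each cluster in a two-dimensional display (base R, illustrative).
pts <- prcomp(iris[, -5], scale. = TRUE)$x[, 1:2]   # stand-in plot coordinates
groups <- iris$Species                              # stand-in cluster membership

# For each group, chull() returns the positions of the hull vertices within
# that group; map them back to row numbers of the full data set.
hulls <- lapply(split(seq_len(nrow(pts)), groups), function(idx) {
  idx[chull(pts[idx, , drop = FALSE])]
})

# Hypothetical helper that draws the points and an empty hull per group
draw_hulls <- function() {
  plot(pts, col = as.integer(groups), pch = 19)
  for (h in hulls) polygon(pts[h, , drop = FALSE], border = "grey30")
}

if (interactive()) draw_hulls()     # draw only when a user session is available
print(sapply(hulls, length))        # number of hull vertices per group
```

Passing a `col` argument to `polygon()` would give the filled variant instead of the empty one.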
A graph showing the data representation and an output file containing the information about the results.
Ana Belen Nieto Librero [email protected], Purificacion Vicente Galindo [email protected], Purificacion Galindo Villardon [email protected]
Gabriel, K. R. (1971). The Biplot graphic display of matrices with applications to principal components analysis. Biometrika, 58(3), 453-467.
Galindo, M. P. (1986). Una alternativa de representacion simultanea: HJ-Biplot. Questiio, 10(1), 13-23.
Vichi, M and Saporta, G. (2009). Clustering and disjoint principal component analysis. Computational Statistics and Data Analysis, 53, 3194-3208.
Macedo, E. and Freitas, A. (2015). The alternating least-squares algorithm for CDPCA. Communications in Computer and Information Science (CCIS), Springer Verlag pp. 173-191.
data(iris)
CDBiplot(iris[,-5], iris[,5])
CDpca performs a clustering and disjoint principal components analysis (CDPCA) on the given numeric data matrix and returns a list of results. Given an (I x J) real data matrix X = [xij], the CDPCA methodology clusters the I objects into P nonempty and nonoverlapping clusters Cp, p = 1,...,P, which are identified by their centroids, and simultaneously partitions the J attributes into Q disjoint components, PCq, q = 1,...,Q. The CDpca function models X by estimating the parameters of the model with an Alternating Least Squares (ALS) procedure originally proposed by Vichi and Saporta (2009) and described in two steps by Macedo and Freitas (2015).
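The quantities involved can be sketched in base R for fixed partitions. This is not the CDpca implementation and performs no ALS iterations: it assumes the model X ≈ U Ybar A', where U (I x P) and V (J x Q) are binary membership matrices, and it assumes that, for given U and V, each column of A is the leading eigenvector of the between-cluster cross-product matrix restricted to the variables of that component. The two partitions used here (`row_cluster`, `col_cluster`) are hypothetical choices made only for illustration.

```r
# One evaluation of a CDPCA-type model for FIXED partitions (base R, illustrative).
X <- scale(as.matrix(iris[, -5]))        # I x J standardised data
J <- ncol(X)
P <- 3; Q <- 2
row_cluster <- as.integer(iris$Species)  # hypothetical partition of the objects
col_cluster <- c(1, 1, 2, 2)             # hypothetical partition of the variables

U <- diag(P)[row_cluster, ]              # I x P binary membership matrix
V <- diag(Q)[col_cluster, ]              # J x Q binary membership matrix

Xbar <- solve(crossprod(U), crossprod(U, X))  # P x J centroids in the original space
HX <- U %*% Xbar                              # each row replaced by its cluster centroid

# Loadings: one unit-norm column per component, nonzero only on its own variables
A <- matrix(0, J, Q)
for (q in seq_len(Q)) {
  vars <- which(V[, q] == 1)
  S <- crossprod(HX[, vars, drop = FALSE])    # between-cluster cross-products
  A[vars, q] <- eigen(S, symmetric = TRUE)$vectors[, 1]
}

Y <- X %*% A                             # component scores
Ybar <- Xbar %*% A                       # centroids in the reduced space
F_value <- sum((U %*% Ybar)^2)           # between cluster deviance (objective)
total <- sum(X^2)                        # total deviance of the centred data
Enorm <- sqrt(sum((X - U %*% Ybar %*% t(A))^2))  # Frobenius norm of the error

cat("Between cluster deviance (%):", round(100 * F_value / total, 2), "\n")
```

Because the columns of A have disjoint supports and unit norm, they are orthonormal by construction, and under these assumptions the squared error norm plus the between cluster deviance equals the total deviance.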
CDpca(data, class = NULL, P, Q, SDPinitial = FALSE, tol = 10^(-5), maxit, r, cdpcaplot = TRUE)
data | A numeric matrix or data frame which provides the data for the CDPCA |
class | A numeric vector containing the real classification of the objects in the data, or NULL if the class of the objects is unknown |
P | An integer value indicating the number of clusters of objects |
Q | An integer value indicating the number of clusters of variables |
SDPinitial | A logical value indicating whether the initial assignment matrices U and V are randomly generated (FALSE, the default) or obtained from an algorithmic framework based on a semidefinite programming approach (TRUE) |
tol | A small positive value giving the maximum allowed difference between two consecutive values of the objective function. The default tolerance is 10^(-5) |
maxit | The maximum number of iterations of one run of the ALS algorithm |
r | Number of runs of the ALS algorithm for the final solution |
cdpcaplot | A logical value indicating whether an additional graphic is created (showing the data projected on the first two CDPCA principal components) |
CDpca returns a list of results containing the following components:
Iter | The total number of iterations used in the best loop for computing the best solution |
loop | The best loop number |
timebestloop | The computation time on the best loop |
timeallloops | The computation time for all loops |
Y | The component score matrix |
Ybar | The object centroids matrix in the reduced space |
A | The component loading matrix |
U | The partition of objects |
V | The partition of variables |
F | The value of the objective function to maximize |
bcdev | The between cluster deviance |
bcdevTotal | The between cluster deviance over the total variability |
tableclass | The cdpca classification |
pseudocm | The pseudo confusion matrix concerning the true (given by class) and cdpca classifications |
Enorm | The error norm for the obtained cdpca model |
Eloisa Macedo [email protected], Adelaide Freitas [email protected], Maurizio Vichi [email protected]
Vichi, M and Saporta, G. (2009). Clustering and disjoint principal component analysis. Computational Statistics and Data Analysis, 53, 3194-3208.
Macedo, E. and Freitas, A. (2015). The alternating least-squares algorithm for CDPCA. Communications in Computer and Information Science (CCIS), Springer Verlag pp. 173-191.
Produce a list of summary measures to evaluate the result of the CDPCA
CDpcaSummary(obj)
obj | An object of the type produced by CDpca |
CDpcaSummary returns the following values associated to the loop where the best result was produced:
Number of the loops
Number of iterations
Value of the objective function F
Frobenius norm of the error matrix
Between cluster deviance (percentage)
Explained variance by CDpca components (percentage)
Pseudo Confusion Matrix (if available)
Eloisa Macedo [email protected], Adelaide Freitas [email protected], Maurizio Vichi [email protected]
Vichi, M and Saporta, G. (2009). Clustering and disjoint principal component analysis. Computational Statistics and Data Analysis, 53, 3194-3208.
Macedo, E. and Freitas, A. (2015). The alternating least-squares algorithm for CDPCA. Communications in Computer and Information Science (CCIS), Springer Verlag pp. 173-191.