Updating the Centroid Decomposition with Applications in LSI
Jason R. Blevins and Moody T. Chu, Unpublished Manuscript, 2004.
- Manuscript: updating.pdf
- BibTeX record: updating.bib
Abstract. The centroid decomposition (CD) is an approximate singular value decomposition (SVD) with applications in factor analysis and latent semantic indexing (LSI). This paper presents updating methods for the centroid decomposition based on recent work in SVD updating methods. A general rank–1 updating framework is developed and then more specific updates used in LSI are examined.
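As a rough illustration of what the centroid decomposition computes, the following Python sketch factors a small matrix A as B V^T. At each step it chooses the sign vector z that maximizes ||A^T z||, forms a unit factor vector v = A^T z / ||A^T z|| and a loading vector b = A v, and deflates A by b v^T. It is not the code in the files listed on this page: it is written in Python rather than Matlab, the function names `centroid_decomposition` and `reconstruct` are hypothetical, and the exhaustive search over all sign vectors is an assumption made for brevity that is only feasible for a handful of rows.

```python
import itertools
import math


def centroid_decomposition(A, tol=1e-12):
    """Return (B_cols, V_cols) such that A is approximately sum_k b_k v_k^T.

    Illustrative sketch only: the sign vector is found by brute force.
    """
    n, m = len(A), len(A[0])
    X = [list(map(float, row)) for row in A]  # working copy that gets deflated
    B_cols, V_cols = [], []
    for _ in range(m):
        best_norm, best_s = 0.0, None
        # Try every z in {-1, +1}^n. Since z and -z give the same norm,
        # the first entry is fixed to +1.
        for tail in itertools.product((1.0, -1.0), repeat=n - 1):
            z = (1.0,) + tail
            # s = X^T z
            s = [sum(z[i] * X[i][j] for i in range(n)) for j in range(m)]
            norm = math.sqrt(sum(x * x for x in s))
            if norm > best_norm:
                best_norm, best_s = norm, s
        if best_s is None or best_norm <= tol:
            break  # the deflated matrix is numerically zero
        v = [x / best_norm for x in best_s]  # unit factor vector
        b = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]  # b = X v
        for i in range(n):
            for j in range(m):
                X[i][j] -= b[i] * v[j]  # rank-1 deflation
        B_cols.append(b)
        V_cols.append(v)
    return B_cols, V_cols


def reconstruct(B_cols, V_cols):
    """Rebuild sum_k b_k v_k^T (assumes at least one factor was found)."""
    n, m = len(B_cols[0]), len(V_cols[0])
    return [[sum(b[i] * v[j] for b, v in zip(B_cols, V_cols))
             for j in range(m)] for i in range(n)]
```

For a nonzero input, `reconstruct(*centroid_decomposition(A))` should reproduce A up to rounding error, because each deflation removes the component of every row along the current factor vector.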
Matlab Code:
centroid.m
: A Matlab implementation of the general centroid algorithm written by M. T. Chu.
ccentroid.m
: A Matlab implementation of the classical centroid method.
centroidsign.m
: A modification of centroid.m which allows specification of initial sign vectors. It also computes the runtime per centroid factor, the total runtime, and the number of sign vector searches.
cd_z_path.m
: An implementation of the general centroid method with sign vector search path tracing. Based on centroid.m by M. T. Chu. Visualization of the search path was planned but has not yet been implemented.
centroid_plot.m
: A modification of centroid.m which produces a compass plot containing the data vectors and the centroid vectors. The data matrix must contain two-dimensional normalized data since a compass plot is used. In other words, A must have two columns with rows of unit length.
cdupdate.m
: A Matlab implementation of the general centroid updating method. Requires centroidsign.m.
updatesvd.m
: A Matlab implementation of M. E. Brand’s subspace-based SVD updating method, which is described in "Incremental singular value decomposition of uncertain data with missing values", European Conference on Computer Vision (ECCV), 2350:707–720, 2002.
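The input requirement stated for centroid_plot.m (two columns, rows of unit length) can be met by rescaling each row before plotting. The snippet below is a minimal Python sketch of that preprocessing step, not part of the Matlab files listed here; the helper name `normalize_rows` is hypothetical.

```python
import math


def normalize_rows(A):
    """Scale each row of an n-by-2 matrix to unit Euclidean length."""
    out = []
    for row in A:
        if len(row) != 2:
            raise ValueError("expected exactly two columns")
        length = math.hypot(row[0], row[1])
        if length == 0.0:
            raise ValueError("cannot normalize a zero row")
        out.append([row[0] / length, row[1] / length])
    return out
```

Rows that are already of unit length pass through unchanged up to rounding, and zero rows are rejected because they have no direction to plot.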