A closed-form theory of word2vec shows it is equivalent to running PCA on word co-occurrence statistics
Berkeley researchers prove that word2vec learns in discrete, sequential rank-incrementing steps, and that the final representations are exactly the top eigenvectors of a matrix defined by corpus co-occurrence probabilities.
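The core claim — that word2vec's embeddings coincide with the top eigenvectors of a matrix built from corpus co-occurrence probabilities — can be sketched on a toy corpus. The exact matrix the researchers analyze is not given here, so this sketch uses a hypothetical PMI-like stand-in, `M_ij = P_ij / (p_i p_j) - 1`, as an illustration of the "PCA on co-occurrence statistics" idea rather than the paper's precise construction.

```python
import numpy as np

# Toy corpus and vocabulary.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Count co-occurrences within a symmetric window of size 2.
window = 2
C = np.zeros((V, V))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            C[idx[w], idx[corpus[j]]] += 1

# Joint co-occurrence probabilities and their marginals.
P = C / C.sum()
p = P.sum(axis=1)

# Hypothetical PMI-like symmetric matrix (stand-in for the paper's matrix).
M = P / np.outer(p, p) - 1.0

# "Embeddings" = top-d eigenvectors; eigh returns eigenvalues in
# ascending order, so the last d columns are the top eigenvectors.
d = 2
eigvals, eigvecs = np.linalg.eigh(M)
embeddings = eigvecs[:, -d:]  # one d-dimensional vector per word
print({w: embeddings[idx[w]].round(3) for w in vocab})
```

Each word's row in `embeddings` plays the role of its learned word2vec vector under the closed-form theory; on a real corpus one would use sparse counts and a truncated eigensolver instead of a dense `eigh`.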