tSNE_df.Rd
tSNE_df makes use of Rtsne::Rtsne, which is a wrapper for the C++ implementation of Barnes-Hut t-Distributed Stochastic Neighbor Embedding. t-SNE is a method for constructing a low-dimensional embedding of high-dimensional data, distances, or similarities. Exact t-SNE can be computed by setting theta = 0.0.
tSNE_df(
  data,
  dims = 2,
  initial_dims = 50,
  perplexity = 3,
  theta = 0.5,
  check_duplicates = TRUE,
  pca = TRUE,
  partial_pca = FALSE,
  max_iter = 1000,
  verbose = FALSE,
  is_distance = FALSE,
  Y_init = NULL,
  pca_center = TRUE,
  pca_scale = FALSE,
  normalize = TRUE,
  stop_lying_iter = ifelse(is.null(Y_init), 250L, 0L),
  mom_switch_iter = ifelse(is.null(Y_init), 250L, 0L),
  momentum = 0.5,
  final_momentum = 0.8,
  eta = 200,
  exaggeration_factor = 12,
  num_threads = 1
)
data | A data frame object or matrix. |
---|---|
dims | integer; Output dimensionality (default: 2) |
initial_dims | integer; the number of dimensions that should be retained in the initial PCA step (default: 50) |
perplexity | numeric; Perplexity parameter; must satisfy 3 * perplexity < nrow(data) - 1, see details for interpretation (default: 3) |
theta | numeric; Speed/accuracy trade-off (increase for less accuracy), set to 0.0 for exact t-SNE (default: 0.5) |
check_duplicates | logical; Checks whether duplicates are present. It is best to make sure there are no duplicates present and set this option to FALSE, especially for large datasets (default: TRUE) |
pca | logical; Whether an initial PCA step should be performed (default: TRUE) |
partial_pca | logical; Whether truncated PCA should be used to calculate principal components (requires the irlba package). This is faster for large input matrices (default: FALSE) |
max_iter | integer; Number of iterations (default: 1000) |
verbose | logical; Whether progress updates should be printed (default: FALSE) |
is_distance | logical; Whether data is a distance matrix (default: FALSE) |
Y_init | matrix; Initial locations of the objects. If NULL, random initialization will be used (default: NULL). Note that when using this, the initial stage with exaggerated perplexity values and a larger momentum term will be skipped. |
pca_center | logical; Should data be centered before pca is applied? (default: TRUE) |
pca_scale | logical; Should data be scaled before pca is applied? (default: FALSE) |
normalize | logical; Should data be normalized internally prior to distance calculations? (default: TRUE) |
stop_lying_iter | integer; Iteration after which the perplexities are no longer exaggerated (default: 250, except when Y_init is used, then 0) |
mom_switch_iter | integer; Iteration after which the final momentum is used (default: 250, except when Y_init is used, then 0) |
momentum | numeric; Momentum used in the first part of the optimization (default: 0.5) |
final_momentum | numeric; Momentum used in the final part of the optimization (default: 0.8) |
eta | numeric; Learning rate (default: 200.0) |
exaggeration_factor | numeric; Exaggeration factor used to multiply the P matrix in the first part of the optimization (default: 12.0) |
num_threads | integer; Number of threads to use when using OpenMP, default is 1. Setting to 0 corresponds to detecting and using all available cores |
index | integer matrix; Each row contains the identity of the nearest neighbors for each observation |
distance | numeric matrix; Each row contains the distance to the nearest neighbors in index for each observation |
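
The perplexity constraint from the argument table can be checked before calling the function. The sketch below uses base R only; `perplexity_ok` and `max_integer_perplexity` are hypothetical helper names, not part of this package or of Rtsne, and they apply the strict inequality exactly as it is documented above.

```r
# Hypothetical helpers (not part of the package): check the documented
# constraint 3 * perplexity < n_obs - 1, where n_obs is nrow(data).
perplexity_ok <- function(perplexity, n_obs) {
  3 * perplexity < n_obs - 1
}

# Largest whole-number perplexity that still satisfies the documented bound.
max_integer_perplexity <- function(n_obs) {
  max(ceiling((n_obs - 1) / 3) - 1, 0)
}

n_obs <- 20                   # e.g. a data set with 20 rows
perplexity_ok(3, n_obs)       # the default perplexity of 3 fits
perplexity_ok(30, n_obs)      # a perplexity of 30 is too large for 20 rows
max_integer_perplexity(n_obs)
```

For small inputs such as a 20-row data frame, this kind of check shows why a low default perplexity is convenient: larger values quickly violate the bound.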
Krijthe, J. H. (2015). Rtsne: T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation, URL: https://github.com/jkrijthe/Rtsne
D. Schmitz
tSNE_df(gdsm_df)
#>            tSNE1       tSNE2
#> var01  44.600084   61.214476
#> var02 -78.204693  -78.844628
#> var03 -71.093507 -112.740061
#> var04  40.288728   82.899071
#> var05 -14.204317   -9.523896
#> var06   5.604320   26.788401
#> var07 -73.706376  -98.416387
#> var08 -51.391025   67.204823
#> var09  14.438828   44.324426
#> var10  92.967784   15.703337
#> var11  33.937197   50.806443
#> var12  20.753213  -47.637960
#> var13 -52.567206   86.943908
#> var14   0.822775  -21.488638
#> var15  92.560289    1.265384
#> var16 -12.479836  -78.672555
#> var17  62.961459   52.551014
#> var18   3.953976  -37.333902
#> var19 -19.133514  -64.451974
#> var20 -40.108181   59.408720
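
As a rough illustration of the exact variant mentioned in the description (theta = 0.0), the sketch below calls Rtsne::Rtsne directly on the built-in iris data and arranges the result in a data frame with tSNE1 and tSNE2 columns. How tSNE_df builds its own output is an assumption here; only the column names are taken from the example above. The seed, the perplexity of 10, and the use of iris are illustrative choices, and the code only runs if the Rtsne package is installed.

```r
# A minimal sketch, assuming the Rtsne package is installed.
if (requireNamespace("Rtsne", quietly = TRUE)) {
  set.seed(42)  # t-SNE starts from a random layout; fix the seed to reproduce a run

  # Drop duplicate rows up front so check_duplicates can be switched off.
  x <- unique(as.matrix(iris[, 1:4]))

  fit <- Rtsne::Rtsne(
    x,
    dims = 2,
    perplexity = 10,          # satisfies 3 * perplexity < nrow(x) - 1
    theta = 0.0,              # 0.0 requests exact t-SNE instead of Barnes-Hut
    check_duplicates = FALSE
  )

  # fit$Y holds the embedding, one row per observation.
  emb <- data.frame(tSNE1 = fit$Y[, 1], tSNE2 = fit$Y[, 2])
  head(emb)
}
```

Exact t-SNE scales poorly with the number of observations, so theta = 0.0 is mainly practical for small data sets like this one.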