Distributed stochastic neighbor embedding

T-distributed Stochastic Neighbor Embedding (t-SNE), which is widely used in machine learning, is a non-linear, non-deterministic algorithm that creates a two-dimensional map of data with thousands of dimensions.

In other words, it transforms data in a high-dimensional space to fit into a 2D plane. t-SNE tries to hold, or preserve, the local neighbors in the data. It is a very popular approach for dimensionality reduction, as it is very flexible and able to find the structure or relationships in the data where other algorithms fail. It does this by calculating the probability of object i picking potential neighbor j. It will pick up the similar object from high dimension as it will have a higher probability ...

Get Natural Language Processing with Java - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.