Clustering models come in many forms, ranging from simple to extremely complex. At the time of writing, Spark's MLlib provides k-means clustering, which is among the simplest approaches available. Despite this, it is often very effective, and its simplicity makes it relatively easy to understand and to scale.
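To make the simplicity of k-means concrete, the following is a minimal single-machine sketch of the standard iterative procedure (often called Lloyd's algorithm): assign each point to its nearest center, then move each center to the mean of its assigned points, and repeat until the assignments stop changing. This is not MLlib's distributed implementation. The names `kmeans` and `squared_distance`, the `max_iterations` and `seed` parameters, and the random choice of initial centers are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=20, seed=42):
    """Illustrative k-means on a list of numeric tuples.

    Returns (centers, assignments), where assignments[i] is the index
    of the center closest to points[i].
    """
    rng = random.Random(seed)
    # Assumption: initialise centers by picking k input points at random.
    centers = rng.sample(points, k)
    assignments = None

    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )

    return centers, assignments


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D points.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

Both steps are simple per-point or per-cluster computations (a nearest-center lookup and an average), which is part of why the approach lends itself to parallel execution over large datasets.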