Parallel clustering algorithm for large-scale biological data sets.