Feature extraction method based on point pair hierarchical clustering