Ethical challenges in collecting pre-existing digital data for linguistic research