Navigating technical, legal, and ethical hurdles to scraping LinkedIn data for academic research
In an era where professional career data is critical for analyzing occupational trends and organizational dynamics, LinkedIn data offers a rich corpus for academic research due to its expansive user base and frequent updates. This paper examines technical, legal, and ethical challenges associated wi...
Main Authors: | André José de Queiroz Padilha, Jesús Pascual Mena-Chalco |
---|---|
Format: | Article |
Language: | English |
Published: | Programa de Pós-Graduação em Ciência da Informação Ibict/UFRJ, 2024-08-01 |
Series: | Liinc em Revista |
Subjects: | linkedin data scraping; data acquisition; legal and ethical challenges; scraping |
Online Access: | https://revista.ibict.br/liinc/article/view/7041 |
_version_ | 1832540558115995648 |
---|---|
author | André José de Queiroz Padilha Jesús Pascual Mena-Chalco |
author_facet | André José de Queiroz Padilha Jesús Pascual Mena-Chalco |
author_sort | André José de Queiroz Padilha |
collection | DOAJ |
description | In an era where professional career data is critical for analyzing occupational trends and organizational dynamics, LinkedIn data offers a rich corpus for academic research due to its expansive user base and frequent updates. This paper examines technical, legal, and ethical challenges associated with scraping LinkedIn profiles for research, arguing that scraping is the most effective method for acquiring comprehensive LinkedIn data compared to direct cooperation, purchasing data, or APIs. Despite prohibitive measures and potential legal issues outlined by LinkedIn, recent court decisions provide favorable precedents for the lawful scraping of public profiles. The paper also compiles prior research studies that leveraged LinkedIn data, highlighting various acquisition methods and their applicability to academic research. It explores strategies to ethically and legally navigate scraping, providing recommendations on how researchers can responsibly collect LinkedIn data, ensuring compliance with evolving privacy laws and ethical standards. Finally, technical considerations are discussed, emphasizing the use of tools like Selenium to overcome LinkedIn's sophisticated anti-scraping measures. |
format | Article |
id | doaj-art-c4464bd1127a4897a54416d6c77629bd |
institution | Kabale University |
issn | 1808-3536 |
language | English |
publishDate | 2024-08-01 |
publisher | Programa de Pós-Graduação em Ciência da Informação Ibict/UFRJ |
record_format | Article |
series | Liinc em Revista |
spelling | doaj-art-c4464bd1127a4897a54416d6c77629bd; 2025-02-04T20:06:03Z; eng; Programa de Pós-Graduação em Ciência da Informação Ibict/UFRJ; Liinc em Revista; 1808-3536; 2024-08-01; 20; 1; e7041; e7041; 10.18617/liinc.v20i1.7041; 9481; Navigating technical, legal, and ethical hurdles to scraping LinkedIn data for academic research; André José de Queiroz Padilha; 0; https://orcid.org/0009-0001-7660-5298; Jesús Pascual Mena-Chalco; 1; https://orcid.org/0000-0001-7509-5532; UFABC; UFABC; In an era where professional career data is critical for analyzing occupational trends and organizational dynamics, LinkedIn data offers a rich corpus for academic research due to its expansive user base and frequent updates. This paper examines technical, legal, and ethical challenges associated with scraping LinkedIn profiles for research, arguing that scraping is the most effective method for acquiring comprehensive LinkedIn data compared to direct cooperation, purchasing data, or APIs. Despite prohibitive measures and potential legal issues outlined by LinkedIn, recent court decisions provide favorable precedents for the lawful scraping of public profiles. The paper also compiles prior research studies that leveraged LinkedIn data, highlighting various acquisition methods and their applicability to academic research. It explores strategies to ethically and legally navigate scraping, providing recommendations on how researchers can responsibly collect LinkedIn data, ensuring compliance with evolving privacy laws and ethical standards. Finally, technical considerations are discussed, emphasizing the use of tools like Selenium to overcome LinkedIn's sophisticated anti-scraping measures.; https://revista.ibict.br/liinc/article/view/7041; linkedin data scraping; data acquisition; legal and ethical challenges; scraping |
spellingShingle | André José de Queiroz Padilha Jesús Pascual Mena-Chalco Navigating technical, legal, and ethical hurdles to scraping LinkedIn data for academic research Liinc em Revista linkedin data scraping data acquisition legal and ethical challenges scraping |
title | Navigating technical, legal, and ethical hurdles to scraping LinkedIn data for academic research |
title_sort | navigating technical legal and ethical hurdles to scraping linkedin data for academic research |
topic | linkedin data scraping data acquisition legal and ethical challenges scraping |
url | https://revista.ibict.br/liinc/article/view/7041 |