Navigating technical, legal, and ethical hurdles to scraping LinkedIn data for academic research

Bibliographic Details
Main Authors: André José de Queiroz Padilha, Jesús Pascual Mena-Chalco
Format: Article
Language: English
Published: Programa de Pós-Graduação em Ciência da Informação Ibict/UFRJ 2024-08-01
Series: Liinc em Revista
Subjects: linkedin data scraping; data acquisition; legal and ethical challenges; scraping
Online Access: https://revista.ibict.br/liinc/article/view/7041
collection DOAJ
description In an era where professional career data is critical for analyzing occupational trends and organizational dynamics, LinkedIn data offers a rich corpus for academic research due to its expansive user base and frequent updates. This paper examines technical, legal, and ethical challenges associated with scraping LinkedIn profiles for research, arguing that scraping is the most effective method for acquiring comprehensive LinkedIn data compared to direct cooperation, purchasing data, or APIs. Despite prohibitive measures and potential legal issues outlined by LinkedIn, recent court decisions provide favorable precedents for the lawful scraping of public profiles. The paper also compiles prior research studies that leveraged LinkedIn data, highlighting various acquisition methods and their applicability to academic research. It explores strategies to ethically and legally navigate scraping, providing recommendations on how researchers can responsibly collect LinkedIn data, ensuring compliance with evolving privacy laws and ethical standards. Finally, technical considerations are discussed, emphasizing the use of tools like Selenium to overcome LinkedIn's sophisticated anti-scraping measures.
id doaj-art-c4464bd1127a4897a54416d6c77629bd
institution Kabale University
issn 1808-3536
spelling doaj-art-c4464bd1127a4897a54416d6c77629bd; 2025-02-04T20:06:03Z; eng; Programa de Pós-Graduação em Ciência da Informação Ibict/UFRJ; Liinc em Revista; 1808-3536; 2024-08-01; 20; 1; e7041; e7041; 10.18617/liinc.v20i1.70419481; André José de Queiroz Padilha (0), https://orcid.org/0009-0001-7660-5298; Jesús Pascual Mena-Chalco (1), https://orcid.org/0000-0001-7509-5532; UFABC; UFABC; https://revista.ibict.br/liinc/article/view/7041
topic linkedin data scraping
data acquisition
legal and ethical challenges
scraping