Data Extraction and Scratching Information Using R

  • G Midhu Bala, Assistant Professor, Department of Computer Science, Mangayarkarasi College of Arts and Science for Women, Madurai, Tamil Nadu, India https://orcid.org/0000-0001-9751-2739
  • K Chitra, Assistant Professor, Department of Computer Science, Government Arts College, Melur, Madurai, Tamil Nadu, India
Keywords: Web scraping, Web mining, Locating files in websites, R programming, rvest, Web crawling

Abstract

Web scraping is the process of automatically extracting data from multiple web pages on the World Wide Web. It is an actively developing field that shares common goals with text processing, the semantic web vision, semantic understanding, machine learning, artificial intelligence and human-computer interaction. Current web scraping solutions range from ad hoc approaches that require human effort to fully automated systems that can extract the required unstructured information and convert it into structured information, with some limitations. This paper describes a method for developing a web scraper in R that locates files on a website, extracts the filtered data and stores it. It presents the modules used and the algorithm for automating navigation of a website via its links. The extracted data can then be used for data analytics.
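
The locate, filter and store steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inline HTML, the base URL `https://example.org/index.html`, the file-extension filter and the output file name are all assumptions made for illustration, and the rvest and xml2 packages are assumed to be installed.

```r
library(rvest)

# Hypothetical page content (assumption). A real run would pass a URL to
# read_html() instead of a literal HTML string.
html <- '<html><body>
  <a href="/reports/2020.pdf">Report 2020</a>
  <a href="about.html">About</a>
  <a href="data/survey.csv">Survey data</a>
</body></html>'
base_url <- "https://example.org/index.html"  # placeholder address

# Locate: collect every link on the page and resolve it against the base URL.
page  <- read_html(html)
links <- html_elements(page, "a")
found <- data.frame(
  text = html_text2(links),
  url  = xml2::url_absolute(html_attr(links, "href"), base_url),
  stringsAsFactors = FALSE
)

# Filter: keep only links that point at files of interest (assumed: PDF, CSV).
files <- found[grepl("\\.(pdf|csv)$", found$url, ignore.case = TRUE), ]

# Store: write the filtered links to a CSV file for later analysis.
out_path <- file.path(tempdir(), "files.csv")
write.csv(files, out_path, row.names = FALSE)
```

To follow links across a site, the same steps could be repeated for each non-file URL in `found`, while keeping a record of pages already visited so that none is fetched twice.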

Published
2021-01-01
How to Cite
Midhu Bala, G., & Chitra, K. (2021). Data Extraction and Scratching Information Using R. Shanlax International Journal of Arts, Science and Humanities, 8(3), 140-144. https://doi.org/10.34293/sijash.v8i3.3588
Section
Articles