Data Extraction and Scratching Information Using R
Web scraping is the process of automatically extracting multiple WebPages from the World Wide Web. It is a field with active developments that shares a common goal with text processing, the semantic web vision, semantic understanding, machine learning, artificial intelligence and human- computer interactions. Current web scraping solutions range from requiring human effort, the ad-hoc, and to fully automated systems that are able to extract the required unstructured information, convert into structured information, with limitations. This paper describes a method for developing a web scraper using R programming that locates files on a website and then extracts the filtered data and stores it. The modules used and the algorithm of automating the navigation of a website via links are mentioned in this paper. Further it can be used for data analytics.
Copyright (c) 2021 G Midhu Bala, K Chitra
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.