The solution was developed to scrape three different websites. For each document, the scraper collects the following data:
- document’s address (URL)
- title of document
- abstract
- full text with some formatting
- author
- publication date
- application date
- backup URL
The data are stored in a predefined JSON file. The project covers developing and testing the code and providing a specification for running it in a standard Python environment.
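
To illustrate how the collected fields could map onto a JSON file, here is a minimal sketch. The class name `DocumentRecord`, the helper `to_json`, and all key names are hypothetical, chosen only for illustration; the project's actual predefined JSON layout is not described here and may differ. Treating every field except the URL and title as optional is also an assumption.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class DocumentRecord:
    """One scraped document. Field names are illustrative, not the project's real keys."""
    url: str                                  # document's address
    title: str                                # title of document
    abstract: Optional[str] = None
    full_text: Optional[str] = None           # full text with some formatting
    author: Optional[str] = None
    publication_date: Optional[str] = None    # assumed to be stored as a string
    application_date: Optional[str] = None    # assumed to be stored as a string
    backup_url: Optional[str] = None


def to_json(records: List[DocumentRecord]) -> str:
    """Serialize records to a JSON array, keeping non-ASCII characters readable."""
    return json.dumps([asdict(r) for r in records], ensure_ascii=False, indent=2)


if __name__ == "__main__":
    example = DocumentRecord(url="https://example.com/doc/1", title="Example document")
    print(to_json([example]))
```

Fields that a site does not provide are written as `null`, so every record keeps the same set of keys regardless of which website it came from.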