Metadata-Version: 2.1
Name: waybackmachine
Version: 0.1.4
Summary: Envelope for archive.org API.
Home-page: https://github.com/martinbenes1996/waybackmachine
Author: Martin Beneš
Author-email: martinbenes1996@gmail.com
License: MPL
Download-URL: https://github.com/martinbenes1996/waybackmachine/archive/0.1.4.tar.gz
Description: 
        # Wayback Machine
        
        This project is a thin wrapper for fetching historical versions of a page from the archive.org API.
        
        The fetched pages can then be used for web scraping.
        
        ## Setup and usage
        
        Install from [PyPI](https://pypi.org/project/waybackmachine/) with
        
        ```bash
        pip install waybackmachine
        ```
        
        Basic usage of the `WaybackMachine` class looks like this:
        
        ```python
        from waybackmachine import WaybackMachine
        
        url = "https://www.gov.pl/web/koronawirus/wykaz-zarazen-koronawirusem-sars-cov-2"
        for response,version_time in WaybackMachine(url):
            # process response
            pass
        ```
        
        Iteration goes from the newest version to progressively older ones, all the way to the end date. The archive is queried at a given step along the date axis.
        
        Each iteration yields
        
        * `response` = the response as a string
        * `version_time` = the `datetime` of the version
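        
        As an illustration of the "process response" step, here is a minimal sketch that pulls the page title out of a response string using only the standard library. It assumes `response` holds HTML text. `TitleParser` and `extract_title` are hypothetical helpers written for this example and are not part of the package.
        
        ```python
        from html.parser import HTMLParser
        
        
        class TitleParser(HTMLParser):
            """Collects the text inside the first <title> element."""
        
            def __init__(self):
                super().__init__()
                self._in_title = False
                self.title = None
        
            def handle_starttag(self, tag, attrs):
                # Only the first <title> is recorded; later ones are ignored.
                if tag == "title" and self.title is None:
                    self._in_title = True
                    self.title = ""
        
            def handle_endtag(self, tag):
                if tag == "title":
                    self._in_title = False
        
            def handle_data(self, data):
                if self._in_title:
                    self.title += data
        
        
        def extract_title(response):
            """Return the stripped <title> text of an HTML string, or None."""
            parser = TitleParser()
            parser.feed(response)
            return parser.title.strip() if parser.title is not None else None
        ```
        
        Inside the loop above, `extract_title(response)` could replace the `pass` placeholder, for example to print the title next to `version_time`.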
        
        Update the package with
        
        ```bash
        pip install --upgrade waybackmachine
        ```
        
        ### Start, end and step configuration
        
        The library lets you set the start date, the end date and the step size (as a `timedelta`).
        
        Since iteration goes backwards in time, **the end date precedes the start date!**
        
        Weekly querying from 1st May back to 1st February 2020 is set up with
        
        ```python
        from datetime import datetime,timedelta
        from waybackmachine import WaybackMachine
        
        url = "https://www.liu.se/"
        for response,version_time in WaybackMachine(url, start = datetime(2020,5,1), end = datetime(2020,2,1), step = timedelta(days = 7)):
            # process response
            pass
        ```
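        
        To make the backwards date axis concrete, the sketch below generates the timestamps that such a configuration would step through. It is not the package's internal code. `date_axis` is a hypothetical helper, and treating the end date as inclusive is an assumption.
        
        ```python
        from datetime import datetime, timedelta
        
        
        def date_axis(start, end, step):
            """Yield timestamps from start back to end, one step apart.
        
            Assumption: the end date itself is included when a step lands on it.
            """
            if step <= timedelta(0):
                raise ValueError("step must be positive")
            current = start
            while current >= end:
                yield current
                current -= step
        
        
        # Weekly axis from 1st May back to 1st February 2020.
        axis = list(date_axis(datetime(2020, 5, 1), datetime(2020, 2, 1), timedelta(days=7)))
        ```
        
        With these values the axis starts at 2020-05-01 and the last weekly point that does not pass the end date is 2020-02-07.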
        
        Dates can also be given as strings in one of the following formats:
        
        * *%Y-%m-%d*
        * *%Y-%m-%d %H:%M*
        * *%Y-%m-%d %H:%M:%S*
        
        ```python
        for response,version_time in WaybackMachine(url, start = "2020-05-01", end = "2020-02-01", step = timedelta(days = 7)):
            # process response
            pass
        ```
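        
        For readers who want to validate date strings before passing them in, here is a minimal sketch that accepts the three formats listed above. How the package parses strings internally is not shown here. `parse_date` and `FORMATS` are hypothetical names used only for this example.
        
        ```python
        from datetime import datetime
        
        # The three string formats listed above.
        FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M", "%Y-%m-%d %H:%M:%S")
        
        
        def parse_date(value):
            """Return a datetime for a datetime or a string in a listed format."""
            if isinstance(value, datetime):
                return value
            for fmt in FORMATS:
                try:
                    return datetime.strptime(value, fmt)
                except ValueError:
                    continue  # try the next format
            raise ValueError("unsupported date format: {!r}".format(value))
        ```
        
        A string that matches none of the formats raises `ValueError`, so a typo surfaces before any request is made.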
        
        *Support for a string representation of the timedelta step will be added.*
        
        
        
        ### Configurations
        
        For frequent use cases, the package ships with predefined configurations.
        
        Each one consists of default parameter values.
        
        So far, the following configurations are available:
        
        * *default* - start is *now()*, end is the beginning of the start's year (hence the span can be 0 - 365 days), step is 1 day
        * *covid* - start is *now()* (might be changed if COVID-19 disappears), end is *2020-01-01*, since COVID-19 spread around the world after that date (*in China, COVID-19 had already occurred before!*), step is 12 h
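        
        The parameter values described above can be written out explicitly, which is useful when you want to start from a configuration and change one value. This is only a sketch of the described defaults, not the package's own mechanism for selecting a configuration. `default_config` and `covid_config` are hypothetical helpers, and the dictionary keys simply mirror the `start`, `end` and `step` arguments shown earlier.
        
        ```python
        from datetime import datetime, timedelta
        
        
        def default_config(now=None):
            """Start now, end at the beginning of the start's year, 1 day step."""
            start = now or datetime.now()
            return {
                "start": start,
                "end": datetime(start.year, 1, 1),
                "step": timedelta(days=1),
            }
        
        
        def covid_config(now=None):
            """Start now, end at 2020-01-01, 12 hour step."""
            start = now or datetime.now()
            return {
                "start": start,
                "end": datetime(2020, 1, 1),
                "step": timedelta(hours=12),
            }
        ```
        
        The resulting dictionary could be unpacked into the constructor, for example `WaybackMachine(url, **covid_config())`.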
         
        ## Contribution
        
        Developed by [Martin Benes](https://github.com/martinbenes1996).
        
        Join on [GitHub](https://github.com/martinbenes1996/waybackmachine).
        
        
        
        
Keywords: waybackmachine,archive,web,html,webscraping
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Other Audience
Classifier: Environment :: Web Environment
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Description-Content-Type: text/markdown
