The Guardian
This module fetches data from The Guardian API.
To use it, first create a TheGuardianCredentials object:
>>> from orangecontrib.text.guardian import TheGuardianCredentials
>>> credentials = TheGuardianCredentials('<your-api-key>')
Then create a TheGuardianAPI object and use it for searching:
>>> from orangecontrib.text.guardian import TheGuardianAPI
>>> api = TheGuardianAPI(credentials)
>>> corpus = api.search('Slovenia', max_documents=10)
>>> len(corpus)
10
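The search call also accepts a date range, which this reference describes as ISO-formatted strings. The sketch below shows one way to build those arguments from `datetime.date` objects. `search_kwargs` is a hypothetical helper introduced only for illustration and is not part of the library; the commented-out call assumes an `api` object created as in the example above.

```python
from datetime import date


def search_kwargs(start, end, max_documents=None, accumulate=False):
    """Build keyword arguments for TheGuardianAPI.search from date objects.

    Hypothetical helper, not part of orangecontrib.text.
    """
    if start > end:
        raise ValueError("start must not be later than end")
    return {
        "from_date": start.isoformat(),  # ISO format, e.g. '2016-01-01'
        "to_date": end.isoformat(),
        "max_documents": max_documents,
        "accumulate": accumulate,
    }


kwargs = search_kwargs(date(2016, 1, 1), date(2016, 12, 31), max_documents=10)

# With an `api` object created as shown above:
# corpus = api.search('Slovenia', **kwargs)
```

Building the strings with `date.isoformat()` avoids hand-typed dates in the wrong format.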
- class orangecontrib.text.guardian.TheGuardianCredentials(key)
The Guardian API credentials.
- property valid
Check whether the given API key is valid.
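A minimal sketch of how the `valid` property might be used to fail early, before any search is attempted. `require_valid` is a hypothetical helper, not part of the library; it relies only on the `valid` property documented above, and the commented-out lines assume the add-on is installed and an API key is available.

```python
def require_valid(credentials):
    """Return the credentials unchanged, or raise if `valid` is falsy.

    Hypothetical helper; works with any object exposing a `valid` attribute.
    """
    if not credentials.valid:
        raise ValueError("The Guardian API key was not accepted")
    return credentials


# Intended use (requires the add-on and a real key):
# from orangecontrib.text.guardian import TheGuardianCredentials
# credentials = require_valid(TheGuardianCredentials('<your-api-key>'))
```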
- class orangecontrib.text.guardian.TheGuardianAPI(credentials, on_progress=None, should_break=None)
- __init__(credentials, on_progress=None, should_break=None)
- Parameters
credentials (TheGuardianCredentials) – The Guardian credentials.
on_progress (callable) – Function for progress reporting.
should_break (callable) – Function for early stopping.
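The two callbacks can be sketched as follows. This reference does not state their signatures, so the sketch assumes that `on_progress` is called with two numbers (progress so far and expected total) and that `should_break` is called with no arguments and stops the search when it returns a true value; check the library source before relying on either assumption. `make_callbacks` and `build_api` are hypothetical helpers introduced for illustration.

```python
import time


def make_callbacks(time_limit_s):
    """Build (on_progress, should_break, log) sharing a log and a deadline."""
    log = []
    deadline = time.monotonic() + time_limit_s

    def on_progress(done, total):
        # Assumed signature: (progress so far, expected total).
        log.append((done, total))

    def should_break():
        # Assumed signature: no arguments; True asks the search to stop.
        return time.monotonic() >= deadline

    return on_progress, should_break, log


def build_api(api_key, time_limit_s=30.0):
    """Create a TheGuardianAPI that records progress and stops at a deadline."""
    # Imported lazily so the callbacks can be tried without the add-on installed.
    from orangecontrib.text.guardian import TheGuardianAPI, TheGuardianCredentials

    on_progress, should_break, log = make_callbacks(time_limit_s)
    api = TheGuardianAPI(
        TheGuardianCredentials(api_key),
        on_progress=on_progress,
        should_break=should_break,
    )
    return api, log
```

Using closures keeps the progress log and the deadline next to the callbacks that use them, so no global state is needed.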
- search(query, from_date=None, to_date=None, max_documents=None, accumulate=False)
Search The Guardian API for articles.
- Parameters
query (str) – The query string used to search for articles.
from_date (str) – Search only articles newer than the date provided. Date should be in ISO format; e.g. ‘2016-12-31’.
to_date (str) – Search only articles older than the date provided. Date should be in ISO format; e.g. ‘2016-12-31’.
max_documents (int) – Maximum number of documents to retrieve. When not given, retrieve all documents.
accumulate (bool) – A flag indicating whether to accumulate the results of multiple consecutive search calls.
- Returns