The Guardian
This module fetches data from The Guardian API.
To use it, first create a TheGuardianCredentials object:
>>> from orangecontrib.text.guardian import TheGuardianCredentials
>>> credentials = TheGuardianCredentials('<your-api-key>')
Then create a TheGuardianAPI object and use it for searching:
>>> from orangecontrib.text.guardian import TheGuardianAPI
>>> api = TheGuardianAPI(credentials)
>>> corpus = api.search('Slovenia', max_documents=10)
>>> len(corpus)
10
class orangecontrib.text.guardian.TheGuardianCredentials(key)
The Guardian API credentials.
valid
Check if the given API key is valid.
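A minimal sketch of how valid might be used to fail fast before searching. It assumes valid is read as an attribute (it is listed above without a call signature); require_valid is a hypothetical helper and not part of the module.

```python
def require_valid(credentials):
    """Return `credentials` unchanged, or raise if the key is rejected.

    Assumption: `credentials.valid` is an attribute/property that is
    truthy when The Guardian API accepts the key.
    """
    if not credentials.valid:
        raise ValueError("The Guardian API key was rejected")
    return credentials


# Hypothetical usage with the real class (performs a network request):
# from orangecontrib.text.guardian import TheGuardianCredentials
# credentials = require_valid(TheGuardianCredentials('<your-api-key>'))
```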
class orangecontrib.text.guardian.TheGuardianAPI(credentials, on_progress=None, should_break=None)
__init__(credentials, on_progress=None, should_break=None)
Parameters:
- credentials (TheGuardianCredentials) – The Guardian credentials.
- on_progress (callable) – Function for progress reporting.
- should_break (callable) – Function for early stopping.
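The two callbacks could be sketched as follows. The exact arguments that TheGuardianAPI passes to them are not documented here, so both callables below accept any arguments; make_deadline and report_progress are hypothetical names, and the assumption is that a truthy return value from should_break stops the search early.

```python
import time


def make_deadline(seconds):
    """Build a callable that returns True once `seconds` have elapsed.

    Any arguments passed by the caller are ignored (assumption: the
    callback signature is not documented).
    """
    deadline = time.monotonic() + seconds
    return lambda *args: time.monotonic() >= deadline


def report_progress(*args):
    """Print whatever progress values the API passes in."""
    print("progress:", *args)


# Hypothetical usage with the real class:
# api = TheGuardianAPI(credentials,
#                      on_progress=report_progress,
#                      should_break=make_deadline(30))
```

A time-based stop condition is only one option; any callable that eventually returns a truthy value would serve the same purpose.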
search(query, from_date=None, to_date=None, max_documents=None, accumulate=False)
Search The Guardian API for articles.
Parameters:
- query (str) – The query used to search for articles.
- from_date (str) – Search only articles newer than the date provided. Date should be in ISO format; e.g. ‘2016-12-31’.
- to_date (str) – Search only articles older than the date provided. Date should be in ISO format; e.g. ‘2016-12-31’.
- max_documents (int) – Maximum number of documents to retrieve. When not given, retrieve all documents.
- accumulate (bool) – A flag indicating whether to accumulate the results of multiple consecutive search calls.
Returns:
-
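Since from_date and to_date expect ISO-formatted strings, the standard library can build them. The sketch below uses a hypothetical helper, last_n_days, to produce such a pair; only the search parameters named above are taken from this page.

```python
from datetime import date, timedelta


def last_n_days(n, today=None):
    """Return (from_date, to_date) as ISO strings covering the last n days."""
    today = today or date.today()
    start = today - timedelta(days=n)
    # date.isoformat() yields 'YYYY-MM-DD', the format shown above.
    return start.isoformat(), today.isoformat()


# Hypothetical usage with an existing TheGuardianAPI instance `api`:
# from_date, to_date = last_n_days(7)
# corpus = api.search('Slovenia', from_date=from_date, to_date=to_date,
#                     max_documents=10)
```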