async_retriever._utils

Core async functions.

Module Contents

class async_retriever._utils.BaseRetriever(urls, file_paths=None, read_method=None, request_kwds=None, request_method='GET', cache_name=None, ssl=None)

Base class for async retriever.

static generate_requests(urls, request_kwds, file_paths)

Generate URLs and keywords.
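
As an illustration of what pairing URLs with keywords could look like, here is a minimal sketch. It is not the library's implementation: pair_requests is a hypothetical name, and the choice to tag each request with its position when no file paths are given is an assumption made for the example.

```python
from typing import Any, Iterator, Optional, Sequence


def pair_requests(
    urls: Sequence[str],
    request_kwds: Sequence[dict[str, Any]],
    file_paths: Optional[Sequence[str]] = None,
) -> Iterator[tuple]:
    """Hypothetical helper: zip each URL with its request keywords."""
    if len(urls) != len(request_kwds):
        raise ValueError("urls and request_kwds must have the same length")
    if file_paths is None:
        # Assumption: with no target files, tag each request with its position
        # so results can be put back in input order later.
        return zip(range(len(urls)), urls, request_kwds)
    if len(file_paths) != len(urls):
        raise ValueError("urls and file_paths must have the same length")
    return zip(urls, request_kwds, file_paths)


requests = list(
    pair_requests(
        ["https://example.com/a", "https://example.com/b"],
        [{"params": {"q": 1}}, {"params": {"q": 2}}],
    )
)
```

Each element of requests is then one self-contained unit of work that can be handed to a coroutine.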

async_retriever._utils.create_cachefile(db_name=None)

Create a cache folder in the current working directory.
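
The idea of creating a cache folder before handing a database path to a caching backend can be sketched with pathlib alone. The helper name make_cache_path, the root parameter, and the default file name cache/http_cache.sqlite are all assumptions for this example, not the library's actual defaults.

```python
import tempfile
from pathlib import Path
from typing import Optional


def make_cache_path(db_name: Optional[str] = None, root: Optional[Path] = None) -> Path:
    """Hypothetical helper: ensure the cache folder exists and return the file path."""
    base = Path.cwd() if root is None else Path(root)
    if db_name is None:
        # Illustrative default only; the real default name is not stated here.
        path = base / "cache" / "http_cache.sqlite"
    else:
        path = Path(db_name)
    # Create the parent folder (and any missing ancestors) if needed.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrate inside a temporary directory to avoid touching the real cwd.
with tempfile.TemporaryDirectory() as tmp:
    cache_file = make_cache_path(root=Path(tmp))
    folder_created = cache_file.parent.is_dir()
```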

async async_retriever._utils.delete_url(url, method, cache_name, **kwargs)

Delete the cached response associated with url.

async_retriever._utils.get_event_loop()

Create an event loop.
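
For readers unfamiliar with managing an event loop by hand, the following sketch shows the general pattern using only the standard library. The names new_loop and answer are hypothetical, and the library's own function may handle more cases (for example, an already running loop) than this sketch does.

```python
import asyncio


def new_loop() -> asyncio.AbstractEventLoop:
    """Hypothetical helper: create a fresh event loop and make it current."""
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    return loop


async def answer() -> int:
    # Yield control once so the coroutine actually goes through the loop.
    await asyncio.sleep(0)
    return 42


loop = new_loop()
try:
    result = loop.run_until_complete(answer())
finally:
    # Always release the loop's resources and unset it as the current loop.
    loop.close()
    asyncio.set_event_loop(None)
```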

async async_retriever._utils.retriever(uid, url, s_kwds, session, read_type, r_kwds, raise_status)

Create an async request and return the response in the format given by read_type.

Parameters:
  • uid (int) – ID of the URL, used to sort the results after they are returned

  • url (str) – URL to be retrieved

  • s_kwds (dict) – Keyword arguments to pass to the request

  • session (ClientSession) – A ClientSession for sending the request

  • read_type (str) – Return response as text, bytes, or json.

  • r_kwds (dict) – Keywords to pass to the response read function. It is {"content_type": None} if read_type is json; otherwise it is an empty dict.

  • raise_status (bool) – Raise an exception if the response status is not 200. If False, return None for the response instead of raising.

Returns:

tuple – The uid paired with the retrieved response, or with None if the request failed and raise_status is False.

Return type:

tuple[int, str | Awaitable[str | bytes | dict[str, Any]] | None]
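
The role of uid is easiest to see when requests finish out of order. The sketch below uses a fake coroutine instead of a real HTTP call; fake_fetch, fetch_all, and the delay and fail parameters are hypothetical names introduced for the example. It returns (uid, payload) pairs, with None standing in for a failed request, and sorts on uid to restore the input order.

```python
import asyncio
from typing import Optional


async def fake_fetch(
    uid: int, url: str, delay: float, fail: bool = False
) -> tuple[int, Optional[str]]:
    """Hypothetical stand-in for a request: returns (uid, payload or None)."""
    await asyncio.sleep(delay)
    if fail:
        # Mirrors the "return None instead of raising" behaviour.
        return uid, None
    return uid, f"body of {url}"


async def fetch_all(urls: list[str]) -> list[Optional[str]]:
    # Earlier URLs get longer delays, so completion order differs from input order.
    tasks = [
        asyncio.create_task(
            fake_fetch(uid, url, delay=0.01 * (len(urls) - uid), fail=(uid == 1))
        )
        for uid, url in enumerate(urls)
    ]
    done = [await task for task in asyncio.as_completed(tasks)]
    # Sorting on the uid alone restores the input order.
    return [body for _, body in sorted(done, key=lambda pair: pair[0])]


results = asyncio.run(fetch_all(["u0", "u1", "u2"]))
```

Sorting with an explicit key avoids comparing payloads, which may be None or of mixed types.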

async async_retriever._utils.stream_session(url, s_kwds, session, filepath, chunk_size=None)

Stream the response to a file.
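
Streaming to a file means writing the body chunk by chunk rather than holding it all in memory. The following is a minimal sketch of that pattern with a simulated response body; fake_chunks and stream_to_file are hypothetical names, and a real implementation would read the chunks from the HTTP session rather than from a bytes object.

```python
import asyncio
import tempfile
from pathlib import Path
from typing import AsyncIterator


async def fake_chunks(payload: bytes, chunk_size: int) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a streamed response body."""
    for start in range(0, len(payload), chunk_size):
        await asyncio.sleep(0)  # simulate waiting on the network
        yield payload[start:start + chunk_size]


async def stream_to_file(chunks: AsyncIterator[bytes], filepath: Path) -> int:
    """Write each chunk as it arrives and return the number of bytes written."""
    written = 0
    # Simplification: the file write itself is blocking in this sketch.
    with filepath.open("wb") as fh:
        async for chunk in chunks:
            written += fh.write(chunk)
    return written


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "out.bin"
    n_bytes = asyncio.run(stream_to_file(fake_chunks(b"abcdefghij", 4), target))
    content = target.read_bytes()
```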