:mod:`ibid.utils` -- Helper functions for plugins
=================================================

.. module:: ibid.utils
   :synopsis: Helper functions for plugins
.. moduleauthor:: Ibid Core Developers

This module contains common functions that many Ibid plugins can use.

String Functions
----------------

.. function:: ago(delta, [units])

   Return a string representation of *delta*, a
   :class:`datetime.timedelta`.

   If *units*, an integer, is specified then only that many units of
   time will be used.

   .. code-block:: pycon

      >>> ago(datetime.utcnow() - datetime(1970, 1, 1))
      '39 years, 6 months, 12 days, 9 hours, 59 minutes and 14 seconds'
      >>> ago(datetime.utcnow() - datetime(1970, 1, 1), 2)
      '39 years and 6 months'

.. function:: format_date(timestamp, [length='datetime', tolocaltime=True])

   Convert :class:`datetime.datetime` *timestamp* to the local timezone
   (*timestamp* is assumed to be UTC if it has no *tzinfo*) and return
   a unicode string representation.

   *length* can be one of ``'datetime'``, ``'date'`` and ``'time'``,
   specifying what to include in the output.

   If *tolocaltime* is ``False``, the time will be left in the original
   time zone.

   The format is specified in the ``plugins.length`` configuration
   subtree, as three values: ``datetime_format``, ``date_format`` and
   ``time_format``.

   Example::

      >>> format_date(datetime.utcnow())
      u'2009-12-14 12:41:55 SAST'

.. function:: parse_timestamp(timestamp)

   Parse string *timestamp*, convert it to UTC, and strip the timezone.

   This function is good at parsing machine timestamps, but can't
   handle "human" times very well. (It uses :mod:`dateutil.parser`.)

   Return a naive :class:`datetime.datetime`.

.. function:: human_join(items, [separator=u',', conjunction=u'and'])

   Turn iterable *items* into a unicode list in the format
   ``a, b, c and d``.
   *separator* separates all values except the last two, which are
   separated by *conjunction*.

   Example::

      >>> human_join(['a', 'b', 'c', 'd'])
      u'a, b, c and d'

.. function:: plural(count, singular, plural)

   If *count* is 1, return *singular*, otherwise *plural*.

   It's recommended to use complete words for *singular* and *plural*
   rather than suffixes.

.. function:: indefinite_article(phrase)

   Use heuristics to determine whether the pronunciation of *phrase*
   starts with a vowel or consonant (assuming it is English) and return
   'an' or 'a' respectively.

.. function:: decode_htmlentities(text)

   Return *text* with all HTML entities, both numeric and string-style,
   decoded into the characters they represent.

.. function:: file_in_path(program)

   Return a boolean indicating whether the program named *program* can
   be found, using the ``PATH`` environment variable.

   Similar to ``which`` on the command line.

.. function:: get_process_output(command, input=None)

   Run *command*, a list of arguments where the first argument is the
   process to run (as in :class:`subprocess.Popen`).
   The command will be fed *input* on standard input, and
   :func:`get_process_output` will block until the command exits.

   Return a tuple of (*output*, *error*, *code*): standard output,
   standard error, and exit code.

.. function:: unicode_output(output, [errors='strict'])

   Decode *output*, a string, to unicode, using the character set
   specified in the ``LANG`` environment variable.
   *errors* has the same behaviour as in the builtin :func:`unicode`.

   Useful for parsing program output.

.. function:: ibid_version()

   Return the current Ibid version or ``None`` if no version can be
   determined.

.. function:: locate_resource(path, filename)

   Locate a resource shipped with Ibid.
   *path* is specified as a python package (e.g. ``'ibid'``).
   *filename* is the relative path within the package (e.g.
   ``'data/something.txt'``).

   Return the filename of the resource.

.. function:: get_country_codes()

   Retrieve and decode a list of ISO-3166-1 country codes.

   Return a dict of code -> country_name.
   The codes are capitalised.

.. function:: identity_name(event, identity)

   Refer to *identity* naturally in response to *event*.

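
The documented behaviour of :func:`human_join` and :func:`plural` can be
illustrated with a small standalone sketch. It is a re-implementation
written for this document under the descriptions above, not Ibid's actual
code: it assumes that a single space follows *separator* and surrounds
*conjunction* (as the ``a, b, c and d`` format suggests), and its handling
of lists with fewer than two items is a guess.

.. code-block:: python

   def human_join(items, separator=',', conjunction='and'):
       """Join items as 'a, b, c and d' (illustrative sketch only)."""
       items = list(items)
       # Assumption: zero or one item is returned unchanged.
       if len(items) <= 1:
           return ''.join(items)
       # All items except the last are joined with the separator ...
       head = (separator + ' ').join(items[:-1])
       # ... and the final item is attached with the conjunction.
       return '%s %s %s' % (head, conjunction, items[-1])


   def plural(count, singular, plural):
       """Return singular when count is exactly 1, otherwise plural."""
       return singular if count == 1 else plural


   count = 3
   nicks = ['alice', 'bob', 'carol']
   # Builds "3 users: alice, bob and carol"
   message = '%i %s: %s' % (count, plural(count, 'user', 'users'),
                            human_join(nicks))

Passing complete words to ``plural``, as recommended above, keeps irregular
forms such as ``'person'`` and ``'people'`` working without special cases.
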
URL Functions
-------------

.. function:: url_regex()

   Return a regular expression string (not a :class:`re.RegexObject`)
   for matching a URL.

.. function:: is_url(url)

   Return whether *url* is a valid URL, according to :func:`url_regex`.

.. function:: iri_to_uri(iri)

   Convert a unicode *iri* to a URI with a punycode host and UTF-8 path.
   This allows IRIs to be opened with :mod:`urllib`.

Web Service Functions
---------------------

.. function:: cacheable_download(url, cachefile, [headers, timeout=60])

   Useful for data files that you don't want to keep re-downloading,
   but that do occasionally change.

   *url* is a URL to download, to a file named *cachefile*.
   *cachefile* should be in the form of ``pluginname/filename``.
   It will be stored in the configured ``plugins.cachedir`` and the
   full filename returned.
   Extra HTTP headers in *headers* can be supplied, if necessary.

   If *cachefile* already exists, :func:`cacheable_download` will do an
   *If-Modified-Since* HTTP request.
   It handles HTTP compression.

   Example::

      filename = cacheable_download(
         'http://www.iso.org/iso/country_codes/iso_3166_code_lists/iso-3166-1_decoding_table.htm',
         'lookup/iso-3166-1_decoding_table.htm')

.. function:: generic_webservice(url, [params, headers])

   Request *url*, with optional dicts of parameters *params* and
   headers *headers*, and return the data.

.. function:: json_webservice(url, [params, headers])

   Request *url*, with optional dicts of parameters *params* and
   headers *headers*, and parse the response as JSON.

   :exc:`JSONException` will be raised if the returned data isn't valid
   JSON.

.. exception:: JSONException(Exception)

   Raised by :func:`json_webservice` if invalid JSON is returned.

:mod:`ibid.utils.html` -- HTML Parsing
--------------------------------------

.. module:: ibid.utils.html
   :synopsis: HTML Parsing helper functions for plugins
.. moduleauthor:: Ibid Core Developers

.. function:: get_html_parse_tree(url, [data, headers, treetype='beautifulsoup'])

   Request *url*, and return a parse-tree of type *treetype*.
   *data* and *headers* are optionally used in the request.

   *treetype* can be any type supported by :mod:`html5lib`, most
   commonly ``'etree'`` or ``'beautifulsoup'``.

   :exc:`ContentTypeException` will be raised if the returned data
   isn't HTML.

.. exception:: ContentTypeException(Exception)

   Raised by :func:`get_html_parse_tree` if the content type isn't
   HTML.

.. vi: set et sta sw=3 ts=3: