ibid.utils – Helper functions for plugins

This module contains common functions that many Ibid plugins can use.

String Functions

ibid.utils.ago(delta[, units])

Return a string representation of delta, a datetime.timedelta.

If units, an integer, is specified then only that many units of time will be used.

>>> ago(datetime.utcnow() - datetime(1970, 1, 1))
'39 years, 6 months, 12 days, 9 hours, 59 minutes and 14 seconds'
>>> ago(datetime.utcnow() - datetime(1970, 1, 1), 2)
'39 years and 6 months'
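
The effect of units can be illustrated with a small stand-alone sketch. It is not Ibid's implementation: the name ago_sketch, the unit lengths (30-day months, 365-day years) and the choice to skip zero-valued units are all assumptions made for the example, and it is written for Python 3.

```python
from datetime import timedelta

# Unit sizes in seconds. Month and year lengths are approximations
# chosen for this sketch; Ibid's own constants may differ.
UNITS = (
    ("year", 365 * 24 * 3600),
    ("month", 30 * 24 * 3600),
    ("day", 24 * 3600),
    ("hour", 3600),
    ("minute", 60),
    ("second", 1),
)


def ago_sketch(delta, units=None):
    """Describe a timedelta using at most `units` non-zero units."""
    remaining = int(delta.total_seconds())
    parts = []
    for name, size in UNITS:
        count, remaining = divmod(remaining, size)
        if count:
            parts.append("%d %s%s" % (count, name, "" if count == 1 else "s"))
    if units is not None:
        # Keep only the largest units, dropping the finer-grained ones.
        parts = parts[:units]
    if not parts:
        return "0 seconds"
    if len(parts) == 1:
        return parts[0]
    return ", ".join(parts[:-1]) + " and " + parts[-1]


print(ago_sketch(timedelta(days=1, hours=2, seconds=5)))     # 1 day, 2 hours and 5 seconds
print(ago_sketch(timedelta(days=1, hours=2, seconds=5), 2))  # 1 day and 2 hours
```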

ibid.utils.format_date(timestamp[, length='datetime', tolocaltime=True])

Convert datetime.datetime timestamp to the local timezone (timestamp is assumed to be UTC if it has no tzinfo) and return a unicode string representation.

length can be one of 'datetime', 'date' and 'time', specifying what to include in the output.

If tolocaltime is False, the time will be left in the original time zone.

The format is specified in the plugins.length configuration subtree, as three values: datetime_format, date_format and time_format.

Example:

>>> format_date(datetime.utcnow())
u'2009-12-14 12:41:55 SAST'
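
The "assumed to be UTC if it has no tzinfo" rule can be sketched with the standard library alone. This is a Python 3 illustration, not Ibid's code: the helper to_zone, the fixed UTC+2 zone and the format string are assumptions (Ibid takes its formats from configuration and uses the machine's local timezone).

```python
from datetime import datetime, timedelta, timezone


def to_zone(timestamp, zone):
    """Treat a naive timestamp as UTC, then convert it to `zone`."""
    if timestamp.tzinfo is None:
        timestamp = timestamp.replace(tzinfo=timezone.utc)
    return timestamp.astimezone(zone)


# A fixed UTC+2 offset named SAST, used only to keep the example deterministic.
sast = timezone(timedelta(hours=2), "SAST")

local = to_zone(datetime(2009, 12, 14, 10, 41, 55), sast)
print(local.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2009-12-14 12:41:55 SAST
```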

ibid.utils.parse_timestamp(timestamp)

Parse string timestamp, convert it to UTC, and strip the timezone.

This function is good at parsing machine timestamps, but can’t handle “human” times very well. (It uses dateutil.parser.)

Return a naive datetime.datetime.
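
The "convert to UTC, then strip the timezone" step is shown below in a minimal Python 3 sketch. To stay dependency-free it swaps dateutil.parser for datetime.fromisoformat, which only accepts ISO 8601 input; the name to_naive_utc is hypothetical.

```python
from datetime import datetime, timezone


def to_naive_utc(text):
    """Parse an ISO 8601 timestamp and return a naive datetime in UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is not None:
        # Shift to UTC first, then drop the tzinfo so the result is naive.
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed


print(to_naive_utc("2009-12-14T14:41:55+02:00"))  # 2009-12-14 12:41:55
```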

ibid.utils.human_join(items[, separator=u', ', conjunction=u'and'])

Turn iterable items into a unicode list in the format a, b, c and d.

separator separates all values except the last two, which are separated by conjunction.

Example:

>>> human_join(['a', 'b', 'c', 'd'])
u'a, b, c and d'
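
The joining rule can be written in a few lines. This is a Python 3 sketch of the behaviour described above, not Ibid's source; the name human_join_sketch and the handling of lists with fewer than two items are assumptions.

```python
def human_join_sketch(items, separator=", ", conjunction="and"):
    """Join items as 'a, b, c and d'."""
    items = list(items)
    if len(items) < 2:
        # Nothing to join: return the single item, or an empty string.
        return "".join(items)
    return "%s %s %s" % (separator.join(items[:-1]), conjunction, items[-1])


print(human_join_sketch(["a", "b", "c", "d"]))            # a, b, c and d
print(human_join_sketch(["tea", "coffee"], conjunction="or"))  # tea or coffee
```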

ibid.utils.plural(count, singular, plural)

If count is 1, return singular, otherwise plural.

It’s recommended to use complete words for singular and plural rather than suffixes.
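
A short illustration of why complete words work better than suffixes: irregular plurals such as "reply"/"replies" cannot be formed by appending a suffix. The function below is a stand-in with the documented behaviour (hypothetical name plural_sketch), written for Python 3.

```python
def plural_sketch(count, singular, plural):
    """Return singular when count is 1, otherwise plural."""
    return singular if count == 1 else plural


for count in (0, 1, 3):
    print("%d %s" % (count, plural_sketch(count, "reply", "replies")))
```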

ibid.utils.indefinite_article(phrase)

Use heuristics to determine whether the pronunciation of phrase starts with a vowel or consonant (assuming it is English) and return ‘an’ or ‘a’ respectively.

ibid.utils.decode_htmlentities(text)

Return text with all HTML entities, both numeric and named, replaced by the characters they represent.
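
For comparison, Python 3's standard library offers html.unescape, which decodes both numeric and named entities. It is shown here only to illustrate the two entity styles; it is not Ibid's implementation and edge cases may differ.

```python
from html import unescape

# &amp; and &pound; are named entities; &#8211; is a numeric one (en dash).
decoded = unescape("Fish &amp; chips &#8211; &pound;5")
print(decoded)
```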

ibid.utils.file_in_path(program)

Returns a boolean indicating whether a program named program can be found in the directories listed in the PATH environment variable.

Similar to which on the command line.
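
In Python 3 the same kind of check can be approximated with shutil.which. The wrapper below is a sketch under that assumption (hypothetical name file_in_path_sketch), useful for guarding a plugin feature that depends on an external program.

```python
import shutil


def file_in_path_sketch(program):
    """Return True if an executable called `program` is found on PATH."""
    return shutil.which(program) is not None


if not file_in_path_sketch("no-such-program-for-this-example"):
    print("program missing, feature disabled")
```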

ibid.utils.get_process_output(command, input=None)

Runs command, a list of arguments where the first argument is the process to run (as in subprocess.Popen). The command will be fed input on standard input, and get_process_output() will block until the command exits.

Returns a tuple of (output, error, code): standard output, standard error, and exit code.
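
The behaviour described above maps closely onto subprocess.Popen.communicate(). The following Python 3 sketch (hypothetical name run) shows the same input and (output, error, code) contract; it is an approximation, not Ibid's code.

```python
import subprocess
import sys


def run(command, input=None):
    """Run command, feed it input, and block until it exits."""
    process = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    output, error = process.communicate(input)
    return output, error, process.returncode


# Use the running interpreter as the child so the example is portable.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
output, error, code = run(child, b"hello")
print(output, code)
```

In Python 3 the streams are bytes, so the output usually needs decoding before use.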

ibid.utils.unicode_output(output[, errors='strict'])

Decodes output, a string, to unicode, using the character set specified in the LANG environment variable. errors has the same behaviour as in the builtin unicode().

Useful for parsing program output.
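
A rough Python 3 sketch of taking the character set from LANG follows. The helper names, the ASCII fallback when LANG names no character set, and the optional lang argument (added so the example does not depend on the environment) are all assumptions.

```python
import os


def charset_from_lang(lang, default="ascii"):
    """Extract the codeset from a value such as 'en_ZA.UTF-8'."""
    if "." not in lang:
        return default
    # Drop the language part and any '@modifier' suffix.
    return lang.split(".", 1)[1].split("@", 1)[0]


def decode_output(output, errors="strict", lang=None):
    """Decode program output (bytes) using the character set named in LANG."""
    if lang is None:
        lang = os.environ.get("LANG", "")
    return output.decode(charset_from_lang(lang), errors)


print(decode_output(b"caf\xc3\xa9", lang="en_ZA.UTF-8"))
```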

ibid.utils.ibid_version()

Return the current Ibid version or None if no version can be determined.

ibid.utils.locate_resource(path, filename)

Locate a resource shipped with Ibid. path is specified as a Python package (e.g. 'ibid'). filename is the relative path within the package (e.g. 'data/something.txt').

Returns the filename of the resource.

ibid.utils.get_country_codes()

Retrieve and decode a list of ISO-3166-1 country codes.

Returns a dict of code -> country_name. The codes are capitalised.

ibid.utils.identity_name(event, identity)

Refer to identity naturally in response to event.

URL Functions

ibid.utils.url_regex()

Returns a regular expression string (not a re.RegexObject) for matching a URL.

ibid.utils.is_url(url)

Is url a valid URL? (according to url_regex())
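
Returning a pattern string rather than a compiled object allows the pattern to be embedded in a larger expression. The Python 3 sketch below demonstrates the idea with a deliberately simplified, hypothetical pattern; it is not the expression that url_regex() returns.

```python
import re

# Hypothetical stand-in pattern, far simpler than a real URL matcher.
URL_PATTERN = r"(?:https?|ftp)://[^\s<>]+"

# A plain string can be spliced into a bigger pattern before compiling.
announce = re.compile(r"^look at (%s)$" % URL_PATTERN, re.I)


def is_url_sketch(text):
    """True if the whole of text matches the stand-in pattern."""
    return re.fullmatch(URL_PATTERN, text, re.I) is not None


match = announce.match("look at http://example.com/page")
print(match.group(1))
print(is_url_sketch("not a url"))
```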

ibid.utils.iri_to_uri(iri)

Convert a unicode iri to punycode host and UTF-8 path. This allows IRIs to be opened with urllib.
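
The two halves of the conversion (punycode for the host, percent-encoded UTF-8 for the path) can be sketched in Python 3 as below. The function name and the handling of ports and query strings are assumptions, and user:password components are ignored for brevity.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri_sketch(iri):
    """Encode the host with IDNA and the path and query as UTF-8 escapes."""
    parts = urlsplit(iri)
    # Python's "idna" codec produces the punycode (xn--) form of each label.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    # quote() percent-encodes non-ASCII characters as UTF-8 by default;
    # '%' is kept safe so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, host, path, query, parts.fragment))


print(iri_to_uri_sketch("http://bücher.example/straße"))
```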

Web Service Functions

ibid.utils.cacheable_download(url, cachefile[, headers, timeout=60])

Useful for data files that you don’t want to keep re-downloading, but do occasionally change.

url is a URL to download, to a file named cachefile. cachefile should be in the form of pluginname/filename. It will be stored in the configured plugins.cachedir and the full filename returned. Extra HTTP headers in headers can be supplied, if necessary.

If cachefile already exists, cacheable_download() will make a conditional HTTP request using If-Modified-Since. It handles HTTP compression.

Example:

filename = cacheable_download(
   'http://www.iso.org/iso/country_codes/iso_3166_code_lists/iso-3166-1_decoding_table.htm',
   'lookup/iso-3166-1_decoding_table.htm')
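
The conditional-request idea can be sketched with the Python 3 standard library. The helper below only builds the request; it does not download anything and is not Ibid's implementation. The name conditional_request and the use of the cache file's modification time for the header are assumptions.

```python
import os
import urllib.request
from email.utils import formatdate


def conditional_request(url, cachefile, headers=None):
    """Build a GET request that asks only for content newer than cachefile."""
    request = urllib.request.Request(url, headers=dict(headers or {}))
    # Advertise gzip support; a real client must then decompress the body.
    request.add_header("Accept-Encoding", "gzip")
    if os.path.exists(cachefile):
        # HTTP dates are always expressed in GMT.
        modified = formatdate(os.path.getmtime(cachefile), usegmt=True)
        request.add_header("If-Modified-Since", modified)
    return request
```

If the server answers 304 Not Modified, the cached copy is still current; urllib.request.urlopen reports that status as an HTTPError with code 304, which a caller would catch before falling back to the cached file.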

ibid.utils.generic_webservice(url[, params, headers])

Request url, with optional dicts of parameters params and headers headers, and return the data.

ibid.utils.json_webservice(url[, params, headers])

Request url, with optional dicts of parameters params and headers headers, and parse as JSON.

JSONException will be raised if the returned data isn’t valid JSON.

exception ibid.utils.JSONException(Exception)

Raised by json_webservice() if invalid JSON is returned.
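
Wrapping the parser's own error in a dedicated exception lets callers handle "the service returned garbage" separately from network failures. The Python 3 sketch below shows that pattern with stand-in names (JSONError, parse_json); it does not reproduce Ibid's code and performs no HTTP request.

```python
import json


class JSONError(Exception):
    """Stand-in for ibid.utils.JSONException in this sketch."""


def parse_json(data):
    """Parse a response body, converting parse failures to JSONError."""
    try:
        return json.loads(data)
    except ValueError as error:  # json.JSONDecodeError subclasses ValueError
        raise JSONError(error)


try:
    parse_json("<html>Service unavailable</html>")
except JSONError:
    print("the service did not return valid JSON")
```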

ibid.utils.html – HTML Parsing

ibid.utils.html.get_html_parse_tree(url[, data, headers, treetype='beautifulsoup'])

Request url, and return a parse-tree of type treetype. data and headers are optionally used in the request.

treetype can be any type supported by html5lib, most commonly 'etree' or 'beautifulsoup'.

ContentTypeException will be raised if the returned data isn’t HTML.

exception ibid.utils.html.ContentTypeException(Exception)

Raised by get_html_parse_tree() if the content type isn’t HTML.
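
A content-type check of this kind might look like the Python 3 sketch below. The exception and function names are stand-ins, and the set of accepted MIME types is an assumption; the source does not say which types get_html_parse_tree() treats as HTML.

```python
class NotHTMLError(Exception):
    """Stand-in for ibid.utils.html.ContentTypeException in this sketch."""


# Assumed set of acceptable types; Ibid's actual list may differ.
HTML_TYPES = ("text/html", "application/xhtml+xml")


def check_html(content_type):
    """Return the bare MIME type, or raise if it is not an HTML type."""
    # Strip parameters such as '; charset=utf-8' before comparing.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime not in HTML_TYPES:
        raise NotHTMLError("Content type is %s, not HTML" % mime)
    return mime


print(check_html("text/html; charset=utf-8"))
```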