arcp

Create/parse arcp (Archive and Package) URIs.

This module provides functions for creating arcp URIs, which can be used for identifying or parsing hypermedia files packaged in an archive like a ZIP file:

>>> from arcp import *

>>> arcp_random()
'arcp://uuid,dcd6b1e8-b3a2-43c9-930b-0119cf0dc538/'

>>> arcp_random("/foaf.ttl", fragment="me")
'arcp://uuid,dcd6b1e8-b3a2-43c9-930b-0119cf0dc538/foaf.ttl#me'

>>> arcp_hash(b"Hello World!", "/folder/")
'arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/'

>>> arcp_location("http://example.com/data.zip", "/file.txt")
'arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt'

arcp URLs can be used with urllib.parse, for instance using urllib.parse.urljoin() to resolve relative references:

>>> import urllib.parse
>>> import arcp
>>> css = arcp.arcp_name("app.example.com", "css/style.css")
>>> urllib.parse.urljoin(css, "../fonts/foo.woff")
'arcp://name,app.example.com/fonts/foo.woff'
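
urllib.parse.urljoin() only resolves relative references for schemes listed in its internal registries, so the example above depends on the arcp scheme being registered. The sketch below reproduces the same join with the standard library alone. That the arcp package performs an equivalent registration on import is an assumption here; this page does not state it.

```python
import urllib.parse

# Assumption: importing the arcp package has an equivalent effect.
# Without this registration, urljoin() returns the relative reference unchanged.
for registry in (urllib.parse.uses_relative, urllib.parse.uses_netloc):
    if "arcp" not in registry:
        registry.append("arcp")

css = "arcp://name,app.example.com/css/style.css"
font = urllib.parse.urljoin(css, "../fonts/foo.woff")
print(font)  # arcp://name,app.example.com/fonts/foo.woff
```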

In addition, this module provides functions for parsing arcp URIs into their constituent fields:

>>> is_arcp_uri("arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
True

>>> is_arcp_uri("http://example.com/t")
False

>>> u = parse_arcp("arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
>>> u
ARCPSplitResult(scheme='arcp',prefix='uuid',name='b7749d0b-0e47-5fc4-999d-f154abe68065',
  uuid='b7749d0b-0e47-5fc4-999d-f154abe68065',path='/file.txt',query='',fragment='')

>>> u.path
'/file.txt'
>>> u.prefix
'uuid'
>>> u.uuid
UUID('b7749d0b-0e47-5fc4-999d-f154abe68065')
>>> u.uuid.version
5

>>> parse_arcp("arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/").hash
('sha-256', '7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069')

The object returned from parse_arcp() is similar to urllib.parse.ParseResult, but contains additional properties prefix, uuid, ni, hash and name, some of which will be None depending on the arcp prefix.

The function arcp.parse.urlparse() can be imported as an alternative to urllib.parse.urlparse(). If the scheme is arcp, the extra arcp fields such as prefix, uuid, hash and name are available just as from parse_arcp(); otherwise the output is the same as from urllib.parse.urlparse():

>>> from arcp.parse import urlparse
>>> urlparse("arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/soup;sads")
ARCPParseResult(scheme='arcp',prefix='ni',
   name='sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk',
   ni='sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk',
   hash=('sha-256', '7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069'),
   path='/folder/soup;sads',query='',fragment='')
>>> urlparse("http://example.com/help?q=a")
ParseResult(scheme='http', netloc='example.com', path='/help', params='', 
  query='q=a', fragment='')

arcp.is_arcp_uri(uri)

Return True if the uri string uses the arcp scheme, otherwise False.

arcp.parse_arcp(uri)

Parse an arcp URI string into its constituent parts.

The returned object is similar to the result of urllib.parse.urlparse(): it is a tuple of (scheme, netloc, path, params, query, fragment) with equally named properties. It also adds properties for the arcp fields:

  • prefix – arcp authority prefix, e.g. “uuid”, “ni” or “name”, or None if prefix is missing
  • name – arcp authority without prefix, e.g. “a4889890-a50a-4f14-b4e7-5fd83683a2b5” or “example.com”
  • uuid – a uuid.UUID object if prefix is “uuid”, otherwise None
  • ni – the arcp alg-val value according to RFC6920 if prefix is “ni”, otherwise None
  • hash – the hash method and hash as a hexstring if prefix is “ni”, otherwise None
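
How these fields relate to the authority component can be sketched with the standard library. The helpers split_authority() and ni_to_hash() below are hypothetical names used for illustration only. They are not part of arcp, and the library's own parsing may differ in detail.

```python
import base64
import uuid
from urllib.parse import urlsplit


def split_authority(uri):
    """Hypothetical helper: return (prefix, name) from an arcp URI's authority."""
    authority = urlsplit(uri).netloc
    prefix, comma, name = authority.partition(",")
    if not comma:
        # No comma means no prefix; the whole authority is the name.
        return None, authority
    return prefix, name


def ni_to_hash(ni):
    """Hypothetical helper: turn 'alg;base64url' into (alg, hex digest)."""
    alg, _, value = ni.partition(";")
    padded = value + "=" * (-len(value) % 4)  # restore the stripped padding
    return alg, base64.urlsafe_b64decode(padded).hex()


prefix, name = split_authority(
    "arcp://uuid,b7749d0b-0e47-5fc4-999d-f154abe68065/file.txt")
archive_id = uuid.UUID(name)  # only meaningful when prefix == "uuid"

ni_prefix, ni = split_authority(
    "arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/")
alg, hexdigest = ni_to_hash(ni)  # only meaningful when prefix == "ni"
```

Here prefix is "uuid" and archive_id.version is 5, while the second URI yields the same ("sha-256", hex digest) pair that the hash property shows in the example further up.
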

arcp.arcp_uuid(uuid, path='/', query=None, fragment=None)

Generate an arcp URI for the given uuid.

Parameters:
  • uuid – A UUID string or uuid.UUID instance identifying the archive, e.g. 58ca7fa6-be2f-48e4-8b69-e63fb0d929fe
  • path – Optional path within archive.
  • query – Optional query component.
  • fragment – Optional fragment component.
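
The shape of the resulting URI can be illustrated without the library. build_arcp_uuid() below is a hypothetical stand-in written for this sketch, not the implementation of arcp_uuid(); it assumes the UUID goes into the authority after a "uuid," prefix, as the examples at the top of this page show.

```python
import uuid
from urllib.parse import urlunsplit


def build_arcp_uuid(archive_id, path="/", query="", fragment=""):
    """Hypothetical stand-in for arcp_uuid(); not the library's code."""
    # uuid.UUID() validates the string form and normalises it to lower case.
    if not isinstance(archive_id, uuid.UUID):
        archive_id = uuid.UUID(archive_id)
    return urlunsplit(("arcp", "uuid,%s" % archive_id, path, query, fragment))


uri = build_arcp_uuid("58CA7FA6-BE2F-48E4-8B69-E63FB0D929FE",
                      "/foaf.ttl", fragment="me")
print(uri)  # arcp://uuid,58ca7fa6-be2f-48e4-8b69-e63fb0d929fe/foaf.ttl#me
```
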

arcp.arcp_random(path='/', query=None, fragment=None, uuid=None)

Generate an arcp URI using a random uuid.

Parameters:
  • path – Optional path within archive.
  • query – Optional query component.
  • fragment – Optional fragment component.
  • uuid – Optional UUID v4 string or UUID instance.

arcp.arcp_location(location, path='/', query=None, fragment=None, namespace=UUID('6ba7b811-9dad-11d1-80b4-00c04fd430c8'))

Generate an arcp URI for a given archive location.

Parameters:
  • location – URL or location of archive, e.g. http://example.com/data.zip
  • path – Optional path within archive.
  • query – Optional query component.
  • fragment – Optional fragment component.
  • namespace – Optional namespace UUID for non-URL location.
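
The default namespace in the signature is the standard URL namespace (uuid.NAMESPACE_URL), and the parsed example earlier reports a version 5 UUID. This suggests, though this page does not say so outright, that the location is turned into a name-based UUID. A minimal sketch under that assumption, with location_uuid() as a hypothetical helper name:

```python
import uuid

# The default namespace in the signature above is the standard URL namespace.
assert uuid.NAMESPACE_URL == uuid.UUID("6ba7b811-9dad-11d1-80b4-00c04fd430c8")


def location_uuid(location, namespace=uuid.NAMESPACE_URL):
    """Hypothetical helper: name-based (version 5, SHA-1) UUID for a location."""
    return uuid.uuid5(namespace, location)


archive_id = location_uuid("http://example.com/data.zip")
uri = "arcp://uuid,%s/file.txt" % archive_id
print(uri)
```

Because the UUID is derived from the location rather than drawn at random, the same location always maps to the same arcp URI.
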

arcp.arcp_name(name, path='/', query=None, fragment=None)

Generate an arcp URI for a given archive name.

Parameters:
  • name – Absolute DNS or package name, e.g. app.example.com
  • path – Optional path within archive.
  • query – Optional query component.
  • fragment – Optional fragment component.

arcp.arcp_hash(bytes=b'', path='/', query=None, fragment=None, hash=None)

Generate an arcp URI for a given archive hash checksum.

Parameters:
  • bytes – Optional bytes of archive to checksum
  • path – Optional path within archive.
  • query – Optional query component.
  • fragment – Optional fragment component.
  • hash – Optional hash instance from hashlib.sha256()

Either bytes or hash must be provided. The hash parameter can be provided to avoid representing the whole archive bytes in memory.
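
The following sketch shows why the hash parameter helps: a hashlib object can be fed in chunks, so the archive never has to be held in memory at once. It uses only the standard library. ni_authority() and hash_stream() are hypothetical helpers for illustration, and the authority layout (SHA-256 digest, base64url-encoded without padding) is inferred from the arcp_hash() example at the top of this page.

```python
import base64
import hashlib
import io


def ni_authority(digest_bytes, alg="sha-256"):
    """Hypothetical helper: 'ni,<alg>;<base64url digest without padding>'."""
    value = base64.urlsafe_b64encode(digest_bytes).rstrip(b"=").decode("ascii")
    return "ni,%s;%s" % (alg, value)


def hash_stream(stream, chunk_size=64 * 1024):
    """Feed a binary file-like object to SHA-256 one chunk at a time."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h


# io.BytesIO stands in for an archive opened with open(path, "rb").
h = hash_stream(io.BytesIO(b"Hello World!"), chunk_size=5)
uri = "arcp://%s/folder/" % ni_authority(h.digest())
print(uri)  # arcp://ni,sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk/folder/
```

With the library itself, the hash object h built this way would be passed as hash=h to arcp_hash() instead of supplying bytes.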