Parameters
module: rspub.core.rs_paras
Parameters for ResourceSync publishing
The class RsParameters validates parameters for ResourceSync publishing that are used throughout the
application. RsParameters can be persisted as configuration.
Multiple sets of parameters can be saved and reused as named configurations. This enables configuring rspub-core to publish metadata on different sets of resources. Each configuration can have its own selection mechanism, metadata directory, strategy, etc. Each set of resources can then be published in its own capability list.
The class RsParameters in this module and the class rspub.core.config.Configurations are
important assets in this endeavour. RsParameters can be associated with a saved rspub.core.selector.Selector.
-
class rspub.core.rs_paras.RsParameters(config_name=None, resource_dir=None, metadata_dir=None, description_dir=None, url_prefix=None, strategy=None, selector_file=None, simple_select_file=None, select_mode=None, plugin_dir=None, history_dir=None, max_items_in_list=None, zero_fill_filename=None, is_saving_pretty_xml=None, is_saving_sitemaps=None, has_wellknown_at_root=None, exp_scp_server=None, exp_scp_port=None, exp_scp_user=None, exp_scp_document_root=None, zip_filename=None, imp_scp_server=None, imp_scp_port=None, imp_scp_user=None, imp_scp_remote_path=None, imp_scp_local_path=None, **kwargs)
Bases: object
Class capturing the core parameters for ResourceSync publishing
Parameters can be set in the __init__() method of this class and as properties. Each parameter is screened for validity and a ValueError will be raised if it is not valid. Parameters can be saved collectively as a configuration. Multiple named configurations can be stored by using the method save_configuration_as(). Named configurations can be restored by giving the config_name at initialisation:
    # paras is an instance of RsParameters with configuration adequately set for collection 1
    # it is saved as 'collection_1_config':
    paras.save_configuration_as("collection_1_config")
    # ...
    # Later on it is restored...
    paras = RsParameters(config_name="collection_1_config")
Note that the class rspub.core.config.Configurations has a method for listing saved configurations by name.
RsParameters can be cloned:
    # paras1 is an instance of RsParameters
    paras2 = RsParameters(**paras1.__dict__)
    paras1 == paras2                    # False
    paras1.__dict__ == paras2.__dict__  # True
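The cloning idiom above relies on two ordinary Python behaviours: instance attributes stored under underscore-prefixed names end up as keys in __dict__, and a class without its own __eq__ compares by identity. The following is a minimal sketch of that idea only. MiniParams is a hypothetical stand-in written for this illustration and is not the rspub implementation.

```python
class MiniParams:
    """Hypothetical stand-in for RsParameters (illustration only)."""

    def __init__(self, url_prefix=None, max_items_in_list=None, **kwargs):
        # Assumption: values are stored under underscore-prefixed names, so
        # the keys of __dict__ can be fed back in through **kwargs.
        self._url_prefix = kwargs.get(
            "_url_prefix", url_prefix or "http://www.example.com")
        self._max_items_in_list = kwargs.get(
            "_max_items_in_list", max_items_in_list or 50000)

    @property
    def url_prefix(self):
        return self._url_prefix

    @property
    def max_items_in_list(self):
        return self._max_items_in_list


paras1 = MiniParams(url_prefix="http://www.example.com/my/resources",
                    max_items_in_list=100)
paras2 = MiniParams(**paras1.__dict__)

# No __eq__ is defined, so == compares object identity.
print(paras1 == paras2)                    # False
print(paras1.__dict__ == paras2.__dict__)  # True
```

The clone carries the same values as the original but is a distinct object, which is why the first comparison is False.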
Besides parameters the RsParameters class also has methods for derived properties.
-
__init__(config_name=None, resource_dir=None, metadata_dir=None, description_dir=None, url_prefix=None, strategy=None, selector_file=None, simple_select_file=None, select_mode=None, plugin_dir=None, history_dir=None, max_items_in_list=None, zero_fill_filename=None, is_saving_pretty_xml=None, is_saving_sitemaps=None, has_wellknown_at_root=None, exp_scp_server=None, exp_scp_port=None, exp_scp_user=None, exp_scp_document_root=None, zip_filename=None, imp_scp_server=None, imp_scp_port=None, imp_scp_user=None, imp_scp_remote_path=None, imp_scp_local_path=None, **kwargs)
Construct an instance of RsParameters
All parameters will get their value from
- the _named argument in **kwargs (this is for cloning instances of RsParameters). If not available:
- the named argument. If not available:
- the parameter as saved in the current configuration. If not available:
- the default configuration value.
Parameters:
- config_name (str) – the name of the configuration to read. If given, sets the current configuration.
- resource_dir (str) – parameter resource_dir()
- metadata_dir (str) – parameter metadata_dir()
- description_dir (str) – parameter description_dir()
- url_prefix (str) – parameter url_prefix()
- strategy (Union[Strategy, int, str]) – parameter strategy()
- selector_file (str) – parameter selector_file()
- simple_select_file (str) – parameter simple_select_file()
- select_mode (SelectMode) – parameter select_mode()
- plugin_dir (str) – parameter plugin_dir()
- history_dir (str) – parameter history_dir()
- max_items_in_list (int) – parameter max_items_in_list()
- zero_fill_filename (int) – parameter zero_fill_filename()
- is_saving_pretty_xml (bool) – parameter is_saving_pretty_xml()
- is_saving_sitemaps (bool) – parameter is_saving_sitemaps()
- has_wellknown_at_root (bool) – parameter has_wellknown_at_root()
- exp_scp_server (str) – parameter exp_scp_server()
- exp_scp_port (int) – parameter exp_scp_port()
- exp_scp_user (str) – parameter exp_scp_user()
- exp_scp_document_root (str) – parameter exp_scp_document_root()
- zip_filename (str) – parameter zip_filename()
- imp_scp_server (str) – parameter imp_scp_server()
- imp_scp_port (int) – parameter imp_scp_port()
- imp_scp_user (str) – parameter imp_scp_user()
- imp_scp_remote_path (str) – parameter imp_scp_remote_path()
- imp_scp_local_path (str) – parameter imp_scp_local_path()
- kwargs – named arguments, same as parameters, but preceded by _
Raises: ValueError if a parameter is not valid or if the configuration with the given config_name is not found
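The order in which __init__ looks up a value can be sketched as a small lookup function. This is an illustration of the documented precedence only. The function resolve_parameter and the two dictionaries standing in for the saved configuration and the defaults are hypothetical and are not part of rspub.

```python
def resolve_parameter(name, kwargs, named_value, saved_config, defaults):
    """Return the first available value in the documented order."""
    underscore_name = "_" + name
    if underscore_name in kwargs:   # 1. the _named argument in **kwargs
        return kwargs[underscore_name]
    if named_value is not None:     # 2. the named argument
        return named_value
    if name in saved_config:        # 3. the value saved in the configuration
        return saved_config[name]
    return defaults[name]           # 4. the default configuration value


# Hypothetical stand-ins for a saved configuration and the defaults.
saved = {"metadata_dir": "md_collection_1"}
defaults = {"metadata_dir": "metadata", "max_items_in_list": 50000}

print(resolve_parameter("metadata_dir", {}, None, saved, defaults))
# md_collection_1
print(resolve_parameter("metadata_dir", {"_metadata_dir": "md_clone"},
                        "md_named", saved, defaults))
# md_clone
```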
-
resource_dir
parameter: The local root directory for ResourceSync publishing (str)
The given value should point to an existing directory. A relative path will be made absolute, calculated from the current working directory (os.getcwd()).
The resource_dir acts as the root of the resources to be published. The urls to the resources are calculated relative to the resource_dir. Example:
    resource_dir: /abs/path/to/resource_dir
    resource:     /abs/path/to/resource_dir/sub/path/to/resource
    url:          url_prefix + /sub/path/to/resource
default: user home directory
See also: url_prefix()
-
metadata_dir
parameter: The directory for ResourceSync documents (str)
The metadata_dir is the directory where sitemap documents will be saved. Names and relative path names are allowed. An absolute path will raise a ValueError.
The metadata directory will be calculated relative to the resource_dir().
If the metadata directory does not exist it will be created during execution of a synchronization.
default: 'metadata'
See also: abs_metadata_dir()
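The rule for metadata_dir (relative names only, resolved against resource_dir) can be sketched with the standard library. This is a minimal sketch under the assumption that validation amounts to rejecting absolute paths. The free function below is hypothetical and only mirrors the name of the derived property.

```python
import os


def abs_metadata_dir(resource_dir, metadata_dir="metadata"):
    """Resolve metadata_dir relative to resource_dir (illustration only)."""
    if os.path.isabs(metadata_dir):
        # The documentation states that an absolute path raises ValueError.
        raise ValueError("metadata_dir should be relative: %s" % metadata_dir)
    # A relative resource_dir is made absolute from the current working directory.
    return os.path.join(os.path.abspath(resource_dir), metadata_dir)


print(abs_metadata_dir(os.getcwd(), "metadata/collection_1"))
```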
-
description_dir
parameter: Directory where a version of the description document is kept (str)
The description document, also known as .well-known/resourcesync, keeps links to the capability list(s) at the site. A local copy of the description document (or the real description document if synchronization takes place at the server) will be updated with newly created capability lists. The description_dir should point to a directory where the .well-known/resourcesync document can be found.
If description_dir is None the abs_metadata_dir() will be taken as description_dir.
If the document {description_dir}/.well-known/resourcesync does not exist it will be created.
default: None
See also: abs_description_path()
-
url_prefix
parameter: The URL-prefix for ResourceSync publishing (str)
The url_prefix substitutes resource_dir() when calculating urls to resources. The url_prefix should be the host name of the server or host name + path that points to the root directory of the resources. url_prefix + relative/path/to/resource should yield a valid url.
Example. Paths to resources are relative to the server host:
    path to resource: {resource_dir}/path/to/resource
    url_prefix:       http://www.example.com
    url to resource:  http://www.example.com/path/to/resource
Example. Paths to resources are relative to some directory on the server:
    path to resource: {resource_dir}/path/to/resource
    url_prefix:       http://www.example.com/my/resources
    url to resource:  http://www.example.com/my/resources/path/to/resource
default: 'http://www.example.com'
See also: resource_dir()
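The substitution of resource_dir by url_prefix can be sketched as follows. The helper url_for_resource is hypothetical and written for this illustration. POSIX-style paths are assumed, and rejecting paths outside resource_dir is a choice made for this sketch rather than documented behaviour.

```python
import posixpath


def url_for_resource(resource_dir, url_prefix, path):
    """Replace the resource_dir part of path with url_prefix (illustration only)."""
    relative = posixpath.relpath(path, resource_dir)
    if relative == ".." or relative.startswith("../"):
        # Assumption for this sketch: resources must live under resource_dir.
        raise ValueError("path is not inside resource_dir: %s" % path)
    return url_prefix.rstrip("/") + "/" + relative


print(url_for_resource("/abs/path/to/resource_dir",
                       "http://www.example.com/my/resources",
                       "/abs/path/to/resource_dir/sub/path/to/resource"))
# http://www.example.com/my/resources/sub/path/to/resource
```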
-
strategy
parameter: Strategy for ResourceSync publishing (str | int | Strategy)
The strategy determines what will be done by ResourceSync upon execution. At the moment valid values for strategy are:
- 0 resourcelist - new resourcelist: create new resourcelist(s)
- 1 new_changelist - new changelist: create a new changelist on every execution
- 2 inc_changelist - incremental changelist: add changes to an existing changelist
If strategies new changelist or incremental changelist are chosen and there is no previous resourcelist found in the metadata directory the strategy resourcelist will be executed.
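Accepting a strategy as name, number or enum member, and falling back to a resourcelist when a changelist has nothing to build on, could look roughly like this. SketchStrategy, to_strategy and effective_strategy are hypothetical names for this illustration. rspub ships its own Strategy class, whose internals are not shown in this document.

```python
from enum import Enum


class SketchStrategy(Enum):
    """Hypothetical stand-in mirroring the documented names and values."""
    resourcelist = 0
    new_changelist = 1
    inc_changelist = 2


def to_strategy(value):
    """Accept a SketchStrategy, an int or a str, as the parameter type suggests."""
    if isinstance(value, SketchStrategy):
        return value
    if isinstance(value, int):
        return SketchStrategy(value)      # raises ValueError for unknown numbers
    if isinstance(value, str):
        try:
            return SketchStrategy[value]
        except KeyError:
            raise ValueError("unknown strategy: %s" % value)
    raise ValueError("unsupported strategy type: %r" % (value,))


def effective_strategy(strategy, has_previous_resourcelist):
    """A changelist needs an earlier resourcelist; otherwise fall back."""
    if strategy is not SketchStrategy.resourcelist and not has_previous_resourcelist:
        return SketchStrategy.resourcelist
    return strategy


print(effective_strategy(to_strategy("inc_changelist"), False))
# SketchStrategy.resourcelist
```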
-
selector_file
parameter: Location of file to construct a Selector (str)
A rspub.core.selector.Selector can be used as input for the execute methods. The selector_file specifies the location of the selector file.
default: None
-
simple_select_file
-
select_mode
-
history_dir
parameter: Directory for storing reports on executed synchronisations (str)
Currently not in use.
-
plugin_dir
parameter: Directory where plugins can be found (str)
The given value should point to an existing directory. A relative path will be made absolute, calculated from the current working directory (os.getcwd()).
At the moment plugins for ResourceGateBuilder can be provided.
default: None
See also: rspub.util.gates
-
max_items_in_list
parameter: The maximum number of records in a sitemap (int, 1 - 50000)
The 'community defined' maximum number of records in a sitemap document is 50000. If on execution the maximum is reached, new sitemaps of the same category will be created with the remaining records.
default: 50000
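Splitting a long list of records over several sitemaps of at most max_items_in_list entries amounts to simple chunking. The function below is a hypothetical sketch of that idea and is not taken from rspub.

```python
def split_into_sitemaps(records, max_items_in_list=50000):
    """Chunk records so that no chunk exceeds max_items_in_list (illustration only)."""
    if not 1 <= max_items_in_list <= 50000:
        raise ValueError("max_items_in_list should be between 1 and 50000")
    return [records[start:start + max_items_in_list]
            for start in range(0, len(records), max_items_in_list)]


chunks = split_into_sitemaps(list(range(7)), max_items_in_list=3)
print(chunks)  # [[0, 1, 2], [3, 4, 5], [6]]
```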
-
zero_fill_filename
parameter: The number of digits in a sitemap filename (int, 1 - 10)
Filenames of resourcelist, changelist etc. are numbered and are post-fixed with this number filled with zeros up to zero_fill_filename. Examples of filenames with zero_fill_filename set at 4:
    changelist_0002.xml
    changelist_0003.xml
default: 4
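The numbering scheme can be reproduced with str.zfill. The helper sitemap_filename is hypothetical. The pattern {category}_{number}.xml is inferred from the example filenames above.

```python
def sitemap_filename(category, ordinal, zero_fill_filename=4):
    """Build a numbered sitemap filename (illustration only)."""
    if not 1 <= zero_fill_filename <= 10:
        raise ValueError("zero_fill_filename should be between 1 and 10")
    return "%s_%s.xml" % (category, str(ordinal).zfill(zero_fill_filename))


print(sitemap_filename("changelist", 2))  # changelist_0002.xml
print(sitemap_filename("changelist", 3))  # changelist_0003.xml
```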
-
is_saving_pretty_xml
parameter: Determines appearance of sitemap xml (bool)
If no humans need to read or inspect sitemaps there is no need for linebreaks etc.
default: True, with linebreaks
-
is_saving_sitemaps
parameter: Determines if sitemaps will be written to disk (bool)
An execution can be a dry-run. With this parameter set to False sitemaps will be generated, but not written to disk.
default: True, write sitemaps to disk
-
has_wellknown_at_root
parameter: Where is the description document .well-known/resourcesync on the server (bool)
The description document is the main entry point for third parties trying to discover resources at a source. Capability lists point toward this document in their rel:up attribute. If for some reason the .well-known/resourcesync cannot be at the root of the server the rel:up link in capability lists will be made to be pointing at .well-known/resourcesync relative to abs_metadata_dir().
default: True, the .well-known/resourcesync is at the root of the server
-
exp_scp_server
-
exp_scp_port
-
exp_scp_user
-
exp_scp_document_root
parameter: The directory from which the web server will serve files (str)
Example. Paths to resources are relative to the server host:
    url_prefix:        http://www.example.com
    url to resource:   http://www.example.com/path/to/resource
    scp_document_root: /var/www/html/
    scp_document_path:
    path on server:    /var/www/html/path/to/resource
Example. Paths to resources are relative to some directory on the server:
    url_prefix:        http://www.example.com/my/resources
    url to resource:   http://www.example.com/my/resources/path/to/resource
    scp_document_root: /var/www/html/
    scp_document_path: my/resources
    path on server:    /var/www/html/my/resources/path/to/resource
default: '/var/www/html/'
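In both examples the scp_document_path matches the path component of url_prefix. Assuming that relationship holds in general, the path on the server can be sketched like this. The helper path_on_server is hypothetical and a POSIX-style server file system is assumed.

```python
import posixpath
from urllib.parse import urlsplit


def path_on_server(url_prefix, document_root, relative_resource_path):
    """Combine document root, url_prefix path and resource path (illustration only)."""
    # Assumption: the document path equals the path part of url_prefix.
    document_path = urlsplit(url_prefix).path.strip("/")
    return posixpath.join(document_root, document_path, relative_resource_path)


print(path_on_server("http://www.example.com",
                     "/var/www/html/", "path/to/resource"))
# /var/www/html/path/to/resource
print(path_on_server("http://www.example.com/my/resources",
                     "/var/www/html/", "path/to/resource"))
# /var/www/html/my/resources/path/to/resource
```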
-
zip_filename
-
imp_scp_server
-
imp_scp_port
-
imp_scp_user
-
imp_scp_remote_path
parameter: The directory at the remote server from which to import files (str)
default: '~'
-
imp_scp_local_path
-
save_configuration(on_disk=True)
function: Save current configuration
Save the current values of parameters to configuration. If on_disk is True (the default) persist the configuration to disk under the current configuration name.
Parameters: on_disk – True if configuration should be saved to disk, False otherwise
See also: current_configuration_name()
-
save_configuration_as(name: str)
function: Save current configuration under name
Save the current configuration under the given name. If a configuration under the given name already exists it will be overwritten without warning.
Parameters: name (str) – the name under which the configuration will be saved
See also: load_configuration()
-
abs_metadata_dir() → str
derived: The absolute path to metadata directory
Returns: absolute path to metadata directory
-
abs_metadata_path(filename)
derived: The absolute path to file in the metadata directory
Parameters: filename (str) – the filename to position relative to the abs_metadata_dir()
Returns: absolute path to file in the metadata directory
-
abs_description_path()
derived: The absolute path to (the local copy of) the file .well-known/resourcesync
Returns: absolute path to (the local copy of) the file .well-known/resourcesync
-
server_root()
derived: The server root (of the web server) as derived from url_prefix
Returns: server root
-
description_url()
derived: The current description url
The current description url either points to {server root}/.well-known/resourcesync or to a file in the metadata directory.
Returns: current description url
See also: has_wellknown_at_root()
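Deriving the server root from url_prefix and choosing between the two possible locations of the description document could be sketched as below. Both functions are hypothetical free-standing versions written for this illustration. The exact URL used when has_wellknown_at_root is False is an assumption based on the description of that parameter.

```python
from urllib.parse import urlsplit


def server_root(url_prefix):
    """Scheme and host of url_prefix, without any path (illustration only)."""
    parts = urlsplit(url_prefix)
    return "%s://%s" % (parts.scheme, parts.netloc)


def description_url(url_prefix, metadata_dir="metadata",
                    has_wellknown_at_root=True):
    """Location of .well-known/resourcesync (illustration only)."""
    if has_wellknown_at_root:
        base = server_root(url_prefix)
    else:
        # Assumption: the document sits under the metadata directory, which
        # itself is published under url_prefix.
        base = url_prefix.rstrip("/") + "/" + metadata_dir.strip("/")
    return base + "/.well-known/resourcesync"


print(description_url("http://www.example.com/my/resources"))
# http://www.example.com/.well-known/resourcesync
print(description_url("http://www.example.com/my/resources",
                      has_wellknown_at_root=False))
# http://www.example.com/my/resources/metadata/.well-known/resourcesync
```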
-
capabilitylist_url() → str
derived: The current capabilitylist url
The current capabilitylist url points to 'capabilitylist.xml' in the metadata directory.
Returns: current capabilitylist url
-
uri_from_path(path)
derived: Calculate the url of a path relative to resource_dir
Parameters: path (str) – the path to calculate the url from
Returns: the url of the path relative to resource_dir
-
abs_history_dir()
derived: The absolute path to directory for reports on synchronizations
Currently not in use.
Returns: absolute path to directory for reports
-
static configuration_name()
function: Current configuration name
Returns: current configuration name
-
describe(as_string=False, fill=23)
function: List parameters and derived values
List parameters, values and derived values as a list of tuples. Each tuple contains:
    n    field  contents
    0    bool   True for parameter, False for derived value
    1    name   The name of the parameter or derived value
    2    value  The value of the parameter or derived value
    3..  ...    Anything else
Parameters:
- as_string – return contents as a printable string
- fill – if as_string: fill column 'name' with fill spaces
Returns: list[list] or str
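Turning such a list of tuples into a printable string, with the name column padded to fill characters, might look like the sketch below. The function format_description and the exact column layout are assumptions made for this illustration. The output of the real describe(as_string=True) may differ.

```python
def format_description(rows, fill=23):
    """Render (is_parameter, name, value, ...) tuples as text (illustration only)."""
    lines = []
    for is_parameter, name, value, *rest in rows:
        kind = "parameter" if is_parameter else "derived"
        # Pad the name column to `fill` characters so the values line up.
        lines.append("%s %s %s" % (kind.ljust(10), str(name).ljust(fill), value))
    return "\n".join(lines)


# Hypothetical rows in the shape documented for describe().
rows = [
    (True, "resource_dir", "/abs/path/to/resource_dir"),
    (True, "metadata_dir", "metadata"),
    (False, "abs_metadata_dir", "/abs/path/to/resource_dir/metadata"),
]
print(format_description(rows))
```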
-