Parameters¶
module: rspub.core.rs_paras
Parameters for ResourceSync publishing
The class RsParameters
validates parameters for ResourceSync publishing that are used throughout the
application. RsParameters can be persisted as configuration.
Multiple sets of parameters can be saved and reused as named configurations. This enables configuring rspub-core to publish metadata on different sets of resources. Each configuration can have its own selection mechanism, metadata directory, strategy, etc. Each set of resources can then be published in its own capability list.
The class RsParameters in this module and the class rspub.core.config.Configurations are important assets in this endeavour. RsParameters can be associated with a saved rspub.core.selector.Selector.
-
class
rspub.core.rs_paras.
RsParameters
(config_name=None, resource_dir=None, metadata_dir=None, description_dir=None, url_prefix=None, strategy=None, selector_file=None, simple_select_file=None, select_mode=None, plugin_dir=None, history_dir=None, max_items_in_list=None, zero_fill_filename=None, is_saving_pretty_xml=None, is_saving_sitemaps=None, has_wellknown_at_root=None, exp_scp_server=None, exp_scp_port=None, exp_scp_user=None, exp_scp_document_root=None, zip_filename=None, imp_scp_server=None, imp_scp_port=None, imp_scp_user=None, imp_scp_remote_path=None, imp_scp_local_path=None, **kwargs)[source]¶ Bases:
object
Class capturing the core parameters for ResourceSync publishing
Parameters can be set in the __init__() method of this class and as properties. Each parameter is screened for validity and a ValueError will be raised if it is not valid. Parameters can be saved collectively as a configuration. Multiple named configurations can be stored by using the method save_configuration_as(). Named configurations can be restored by giving the config_name at initialisation:
    # paras is an instance of RsParameters with configuration adequately set for collection 1
    # it is saved as 'collection_1_config':
    paras.save_configuration_as("collection_1_config")
    # ...
    # Later on it is restored...
    paras = RsParameters(config_name="collection_1_config")
Note that the class rspub.core.config.Configurations has a method for listing saved configurations by name.
RsParameters can be cloned:
    # paras1 is an instance of RsParameters
    paras2 = RsParameters(**paras1.__dict__)
    paras1 == paras2                    # False
    paras1.__dict__ == paras2.__dict__  # True
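Cloning through __dict__ presumably works because the parameter values are stored on the instance under names with a leading underscore, and __init__ accepts those names through **kwargs. The stand-in class below is a minimal sketch of that mechanism. MiniParams and its two fields are hypothetical and much simpler than RsParameters.

```python
class MiniParams:
    """Hypothetical, simplified stand-in for RsParameters."""

    def __init__(self, resource_dir=None, metadata_dir=None, **kwargs):
        # A '_name' entry in kwargs (as found in another instance's __dict__)
        # takes precedence over the plain named argument.
        self._resource_dir = kwargs.get("_resource_dir", resource_dir)
        self._metadata_dir = kwargs.get("_metadata_dir", metadata_dir)

    @property
    def resource_dir(self):
        return self._resource_dir

    @property
    def metadata_dir(self):
        return self._metadata_dir


paras1 = MiniParams(resource_dir="/data/resources", metadata_dir="metadata")
paras2 = MiniParams(**paras1.__dict__)

print(paras1 == paras2)                    # False: no __eq__ defined, identity is compared
print(paras1.__dict__ == paras2.__dict__)  # True: same parameter values
```

The two instances are distinct objects, so `==` is False, while their attribute dictionaries hold equal values.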
Besides parameters the RsParameters class also has methods for derived properties.
See also
-
__init__
(config_name=None, resource_dir=None, metadata_dir=None, description_dir=None, url_prefix=None, strategy=None, selector_file=None, simple_select_file=None, select_mode=None, plugin_dir=None, history_dir=None, max_items_in_list=None, zero_fill_filename=None, is_saving_pretty_xml=None, is_saving_sitemaps=None, has_wellknown_at_root=None, exp_scp_server=None, exp_scp_port=None, exp_scp_user=None, exp_scp_document_root=None, zip_filename=None, imp_scp_server=None, imp_scp_port=None, imp_scp_user=None, imp_scp_remote_path=None, imp_scp_local_path=None, **kwargs)[source]¶ Construct an instance of RsParameters
All parameters will get their value from
- the _named argument in **kwargs (this is for cloning instances of RsParameters). If not available:
- the named argument. If not available:
- the parameter as saved in the current configuration. If not available:
- the default configuration value.
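The lookup order above can be sketched as a small helper. This is an illustration only, not the rspub-core implementation: resolve, saved and defaults are hypothetical names, plain dicts stand in for the current and default configurations, and None is assumed to mean "not available".

```python
def resolve(name, named_value, kwargs, saved, defaults):
    """Return the value for parameter `name` using the documented order.

    Hypothetical helper: `saved` and `defaults` are plain dicts standing in
    for the current configuration and the default configuration.
    """
    candidates = (
        kwargs.get("_" + name),  # 1. the _named argument (used when cloning)
        named_value,             # 2. the named argument
        saved.get(name),         # 3. the value saved in the current configuration
    )
    for value in candidates:
        if value is not None:
            return value
    return defaults[name]        # 4. the default configuration value


defaults = {"metadata_dir": "metadata", "max_items_in_list": 50000}
saved = {"metadata_dir": "md_collection_1"}

print(resolve("metadata_dir", None, {}, saved, defaults))
print(resolve("metadata_dir", "md_new", {}, saved, defaults))
print(resolve("metadata_dir", "md_new", {"_metadata_dir": "md_clone"}, saved, defaults))
print(resolve("max_items_in_list", None, {}, saved, defaults))
```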
Parameters: - config_name (str) – the name of the configuration to read. If given, sets the current configuration.
- resource_dir (str) –
parameter
resource_dir()
- metadata_dir (str) –
parameter
metadata_dir()
- description_dir (str) –
parameter
description_dir()
- url_prefix (str) –
parameter
url_prefix()
- strategy (Union[Strategy, int, str]) –
parameter
strategy()
- selector_file (str) –
parameter
selector_file()
- simple_select_file (str) –
parameter
simple_select_file()
- select_mode (SelectMode) –
parameter
select_mode()
- plugin_dir (str) –
parameter
plugin_dir()
- history_dir (str) –
parameter
history_dir()
- max_items_in_list (int) –
parameter
max_items_in_list()
- zero_fill_filename (int) –
parameter
zero_fill_filename()
- is_saving_pretty_xml (bool) –
parameter
is_saving_pretty_xml()
- is_saving_sitemaps (bool) –
parameter
is_saving_sitemaps()
- has_wellknown_at_root (bool) –
parameter
has_wellknown_at_root()
- exp_scp_server (str) –
parameter
exp_scp_server()
- exp_scp_port (int) –
parameter
exp_scp_port()
- exp_scp_user (str) –
parameter
exp_scp_user()
- exp_scp_document_root (str) –
parameter
exp_scp_document_root()
- zip_filename (str) –
parameter
zip_filename()
- imp_scp_server (str) –
parameter
imp_scp_server()
- imp_scp_port (int) –
parameter
imp_scp_port()
- imp_scp_user (str) –
parameter
imp_scp_user()
- imp_scp_remote_path (str) –
parameter
imp_scp_remote_path()
- imp_scp_local_path (str) –
parameter
imp_scp_local_path()
- kwargs – named arguments, same as parameters, but preceded by _
Raises: ValueError – if a parameter is not valid or if the configuration with the given config_name is not found
-
resource_dir
¶ parameter
The local root directory for ResourceSync publishing
(str)
The given value should point to an existing directory. A relative path will be made absolute, calculated from the current working directory (os.getcwd()).
The resource_dir acts as the root of the resources to be published. The urls to the resources are calculated relative to the resource_dir. Example:
    resource_dir: /abs/path/to/resource_dir
    resource:     /abs/path/to/resource_dir/sub/path/to/resource
    url:          url_prefix + /sub/path/to/resource
default: user home directory
See also: url_prefix()
-
metadata_dir
¶ parameter
The directory for ResourceSync documents
(str)
The metadata_dir is the directory where sitemap documents will be saved. Names and relative path names are allowed. An absolute path will raise a ValueError.
The metadata directory will be calculated relative to the resource_dir().
If the metadata directory does not exist it will be created during execution of a synchronization.
default: ‘metadata’
See also: abs_metadata_dir()
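The rules for metadata_dir can be illustrated with a short sketch. It assumes POSIX-style paths and uses a hypothetical free function; in RsParameters the equivalent logic lives in the metadata_dir property and in abs_metadata_dir(), whose actual code may differ.

```python
import posixpath


def abs_metadata_dir(resource_dir, metadata_dir="metadata"):
    """Hypothetical sketch: resolve metadata_dir relative to resource_dir."""
    if posixpath.isabs(metadata_dir):
        # Only names and relative paths are allowed.
        raise ValueError("metadata_dir must be relative: %s" % metadata_dir)
    return posixpath.normpath(posixpath.join(resource_dir, metadata_dir))


print(abs_metadata_dir("/data/resources"))
print(abs_metadata_dir("/data/resources", "md/collection_1"))
```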
-
description_dir
¶ parameter
Directory where a version of the description document is kept
(str)
The description document, also known as .well-known/resourcesync, keeps links to the capability list(s) at the site. A local copy of the description document (or the real description document if synchronization takes place at the server) will be updated with newly created capability lists. The description_dir should point to a directory where the .well-known/resourcesync document can be found.
If description_dir is None the abs_metadata_dir() will be taken as description_dir.
If the document {description_dir}/.well-known/resourcesync does not exist it will be created.
default: None
See also: abs_description_path()
-
url_prefix
¶ parameter
The URL-prefix for ResourceSync publishing
(str)
The url_prefix substitutes resource_dir() when calculating urls to resources. The url_prefix should be the host name of the server or host name + path that points to the root directory of the resources. url_prefix + relative/path/to/resource should yield a valid url.
Example. Paths to resources are relative to the server host:
    path to resource: {resource_dir}/path/to/resource
    url_prefix:       http://www.example.com
    url to resource:  http://www.example.com/path/to/resource
Example. Paths to resources are relative to some directory on the server:
    path to resource: {resource_dir}/path/to/resource
    url_prefix:       http://www.example.com/my/resources
    url to resource:  http://www.example.com/my/resources/path/to/resource
default: ‘http://www.example.com’
See also: resource_dir()
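The substitution of resource_dir by url_prefix can be sketched as follows. The function name url_from_path is hypothetical (the class itself offers uri_from_path()), and the sketch assumes POSIX-style paths, so it is an illustration of the idea and not the library code.

```python
import posixpath


def url_from_path(resource_dir, url_prefix, path):
    """Hypothetical sketch: map a path under resource_dir to a URL."""
    rel = posixpath.relpath(path, resource_dir)
    if rel == posixpath.pardir or rel.startswith(posixpath.pardir + "/"):
        raise ValueError("path is not inside resource_dir: %s" % path)
    # Join with exactly one slash between prefix and relative path.
    return url_prefix.rstrip("/") + "/" + rel


resource_dir = "/abs/path/to/resource_dir"
resource = "/abs/path/to/resource_dir/path/to/resource"

print(url_from_path(resource_dir, "http://www.example.com", resource))
print(url_from_path(resource_dir, "http://www.example.com/my/resources", resource))
```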
-
strategy
¶ parameter
Strategy for ResourceSync publishing
(str | int | Strategy)
The strategy determines what will be done by ResourceSync upon execution. At the moment valid values for strategy are:
- 0 resourcelist - new resourcelist: create new resourcelist(s)
- 1 new_changelist - new changelist: create a new changelist on every execution
- 2 inc_changelist - incremental changelist: add changes to an existing changelist
If strategies new changelist or incremental changelist are chosen and there is no previous resourcelist found in the metadata directory the strategy resourcelist will be executed.
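The accepted value types and the fallback rule can be sketched with a stand-in enum. The Strategy class below only mirrors the three documented values; to_strategy and effective_strategy are hypothetical helper names, and the real conversion code in rspub-core may look different.

```python
from enum import Enum


class Strategy(Enum):
    """Stand-in enum mirroring the three documented values."""
    resourcelist = 0
    new_changelist = 1
    inc_changelist = 2


def to_strategy(value):
    """Accept a Strategy, an int or a str and return a Strategy."""
    if isinstance(value, Strategy):
        return value
    if isinstance(value, int):
        return Strategy(value)  # raises ValueError for unknown numbers
    if isinstance(value, str):
        try:
            return Strategy[value]
        except KeyError:
            raise ValueError("unknown strategy: %r" % value) from None
    raise ValueError("unsupported strategy type: %r" % type(value))


def effective_strategy(requested, has_previous_resourcelist):
    """Fall back to resourcelist when no previous resourcelist exists."""
    strategy = to_strategy(requested)
    if strategy is not Strategy.resourcelist and not has_previous_resourcelist:
        return Strategy.resourcelist
    return strategy


print(to_strategy("new_changelist"))
print(effective_strategy(2, has_previous_resourcelist=False))
```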
-
selector_file
¶ parameter
Location of file to construct a Selector
(str)
A rspub.core.selector.Selector can be used as input for the execute methods. The selector_file specifies the location of the selector file.
default: None
-
simple_select_file
¶
-
select_mode
¶
-
history_dir
¶ parameter
Directory for storing reports on executed synchronisations
(str)
Currently not in use.
-
plugin_dir
¶ parameter
Directory where plugins can be found
(str)
The given value should point to an existing directory. A relative path will be made absolute, calculated from the current working directory (os.getcwd()).
At the moment plugins for ResourceGateBuilder can be provided.
default: None
See also: rspub.util.gates
-
max_items_in_list
¶ parameter
The maximum number of records in a sitemap
(int, 1 - 50000)
The ‘community defined’ maximum number of records in a sitemap document is 50000. If on execution the maximum is reached, new sitemaps of the same category will be created with the remaining records.
default: 50000
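The effect of max_items_in_list can be pictured as splitting a list of records into consecutive chunks. The helper below is a hypothetical sketch of that idea, not the way rspub-core writes its sitemaps.

```python
def split_records(records, max_items_in_list=50000):
    """Hypothetical sketch: split records into chunks of at most max_items_in_list."""
    if not 1 <= max_items_in_list <= 50000:
        raise ValueError("max_items_in_list must be between 1 and 50000")
    return [
        records[start:start + max_items_in_list]
        for start in range(0, len(records), max_items_in_list)
    ]


chunks = split_records(list(range(7)), max_items_in_list=3)
print(chunks)
```

With seven records and a maximum of three, the sketch yields two full chunks and one chunk holding the remaining record.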
-
zero_fill_filename
¶ parameter
The number of digits in a sitemap filename
(int, 1 - 10)
Filenames of resourcelist, changelist etc. are numbered and are post-fixed with this number, padded with zeros up to zero_fill_filename digits. Examples of filenames with zero_fill_filename set at 4:
    changelist_0002.xml
    changelist_0003.xml
default: 4
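A minimal sketch of how such filenames could be built, assuming the pattern name, underscore, zero-padded number, '.xml' seen in the examples above. sitemap_filename is a hypothetical helper, not a function of rspub-core.

```python
def sitemap_filename(kind, number, zero_fill_filename=4):
    """Hypothetical sketch: build a numbered sitemap filename."""
    if not 1 <= zero_fill_filename <= 10:
        raise ValueError("zero_fill_filename must be between 1 and 10")
    return "%s_%s.xml" % (kind, str(number).zfill(zero_fill_filename))


print(sitemap_filename("changelist", 2))
print(sitemap_filename("changelist", 3, zero_fill_filename=6))
```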
-
is_saving_pretty_xml
¶ parameter
Determines appearance of sitemap xml
(bool)
If no humans need to read or inspect sitemaps there is no need for linebreaks etc.
default: True, with linebreaks
-
is_saving_sitemaps
¶ parameter
Determines if sitemaps will be written to disk
(bool)
An execution can be a dry-run. With this parameter set to False sitemaps will be generated, but not written to disk.
default: True, write sitemaps to disk
-
has_wellknown_at_root
¶ parameter
Where is the description document .well-known/resourcesync on the server
(bool)
The description document is the main entry point for third parties trying to discover resources at a source. Capability lists point toward this document in their rel:up attribute. If for some reason the .well-known/resourcesync cannot be at the root of the server the rel:up link in capability lists will be made to be pointing at .well-known/resourcesync relative to abs_metadata_dir().
default: True, the .well-known/resourcesync is at the root of the server
-
exp_scp_server
¶
-
exp_scp_port
¶
-
exp_scp_user
¶
-
exp_scp_document_root
¶ parameter
The directory from which the web server will serve files
(str)
Example. Paths to resources are relative to the server host:
    url_prefix:        http://www.example.com
    url to resource:   http://www.example.com/path/to/resource
    scp_document_root: /var/www/html/
    scp_document_path:
    path on server:    /var/www/html/path/to/resource
Example. Paths to resources are relative to some directory on the server:
    url_prefix:        http://www.example.com/my/resources
    url to resource:   http://www.example.com/my/resources/path/to/resource
    scp_document_root: /var/www/html/
    scp_document_path: my/resources
    path on server:    /var/www/html/my/resources/path/to/resource
default: ‘/var/www/html/’
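The two examples above suggest that the document path is the path component of url_prefix, appended to the document root. The sketch below reproduces that calculation under this assumption; path_on_server is a hypothetical helper and not part of the RsParameters API.

```python
import posixpath
from urllib.parse import urlparse


def path_on_server(url_prefix, document_root, relative_path):
    """Hypothetical sketch: where a resource ends up on the web server."""
    # The path component of url_prefix, e.g. '' or 'my/resources'.
    document_path = urlparse(url_prefix).path.strip("/")
    return posixpath.join(document_root, document_path, relative_path)


print(path_on_server("http://www.example.com", "/var/www/html/", "path/to/resource"))
print(path_on_server("http://www.example.com/my/resources", "/var/www/html/", "path/to/resource"))
```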
-
zip_filename
¶
-
imp_scp_server
¶
-
imp_scp_port
¶
-
imp_scp_user
¶
-
imp_scp_remote_path
¶ parameter
The directory at the remote server from which to import files
(str)
default: ‘~’
-
imp_scp_local_path
¶
-
save_configuration
(on_disk=True)[source]¶ function
Save current configuration
Save the current values of parameters to configuration. If on_disk is True (the default) persist the configuration to disk under the current configuration name.
Parameters: on_disk – True if configuration should be saved to disk, False otherwise
See also: current_configuration_name()
-
save_configuration_as
(name: str)[source]¶ function
Save current configuration under name
Save the current configuration under the given name. If a configuration under the given name already exists it will be overwritten without warning.
Parameters: name (str) – the name under which the configuration will be saved
See also: load_configuration()
-
abs_metadata_dir
() → str[source]¶ derived
The absolute path to metadata directory
Returns: absolute path to metadata directory
-
abs_metadata_path
(filename)[source]¶ derived
The absolute path to file in the metadata directory
Parameters: filename (str) – the filename to position relative to the abs_metadata_dir()
Returns: absolute path to file in the metadata directory
-
abs_description_path
()[source]¶ derived
The absolute path to (the local copy of) the file .well-known/resourcesync
Returns: absolute path to (the local copy of) the file .well-known/resourcesync
-
server_root
()[source]¶ derived
The server root (of the web server) as derived from url_prefix
Returns: server root
-
description_url
()[source]¶ derived
The current description url
The current description url either points to {server root}/.well-known/resourcesync or to a file in the metadata directory.
Returns: current description url
See also: has_wellknown_at_root()
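How the server root and the description url relate to url_prefix and has_wellknown_at_root can be sketched as below. Both functions are hypothetical stand-ins for the methods of the same name, and the url layout used when the document is not at the server root (url_prefix + metadata_dir + /.well-known/resourcesync) is an assumption based on the description of has_wellknown_at_root.

```python
from urllib.parse import urlparse


def server_root(url_prefix):
    """Hypothetical sketch: scheme and host of url_prefix, without any path."""
    parts = urlparse(url_prefix)
    return "%s://%s" % (parts.scheme, parts.netloc)


def description_url(url_prefix, metadata_dir="metadata", has_wellknown_at_root=True):
    """Hypothetical sketch: url of the .well-known/resourcesync document."""
    if has_wellknown_at_root:
        base = server_root(url_prefix)
    else:
        # Assumed layout: the document sits below the metadata directory.
        base = url_prefix.rstrip("/") + "/" + metadata_dir.strip("/")
    return base + "/.well-known/resourcesync"


prefix = "http://www.example.com/my/resources"
print(server_root(prefix))
print(description_url(prefix))
print(description_url(prefix, has_wellknown_at_root=False))
```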
-
capabilitylist_url
() → str[source]¶ derived
The current capabilitylist url
The current capabilitylist url points to ‘capabilitylist.xml’ in the metadata directory.
Returns: current capabilitylist url
-
uri_from_path
(path)[source]¶ derived
Calculate the url of a path relative to resource_dir
Parameters: path (str) – the path to calculate the url from
Returns: the url of the path relative to resource_dir
-
abs_history_dir
()[source]¶ derived
The absolute path to directory for reports on synchronizations
Currently not in use.
Returns: absolute path to directory for reports
-
static
configuration_name
()[source]¶ function
Current configuration name
Returns: current configuration name
-
describe
(as_string=False, fill=23)[source]¶ function
List parameters and derived values
List parameters, values and derived values as a list of tuples. Each tuple contains:
    n    field  contents
    0    bool   True for parameter, False for derived value
    1    name   The name of the parameter or derived value
    2    value  The value of the parameter or derived value
    3..  ...    Anything else
Parameters: - as_string – return contents as a printable string
- fill – if as_string: fill column ‘name’ with fill spaces
Returns: list[list] or str
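The exact layout produced with as_string=True is not specified here. The sketch below shows one plausible way to turn rows of the shape described above into a printable string, padding the name column to fill characters. describe_as_string and the sample rows are hypothetical.

```python
def describe_as_string(rows, fill=23):
    """Hypothetical sketch: format (is_parameter, name, value, ...) rows."""
    lines = []
    for row in rows:
        name, value = row[1], row[2]  # fields 3.. are ignored in this sketch
        lines.append(name.ljust(fill) + str(value))
    return "\n".join(lines)


rows = [
    (True, "metadata_dir", "metadata"),
    (False, "abs_metadata_dir", "/data/resources/metadata"),
]
print(describe_as_string(rows, fill=20))
```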
-