Endpoints

Basics

To be able to work with this package properly, you first have to establish a connection to the SPARQL endpoint(s) you want extract desired the data from (except you just want to work with DBpedia, see Using Preconfigured Endpoints).

SPARQL endpoints are implemented as kgextension.sparql_helper.Endpoint objects.

Note

For all methods in this package that need to access an Endpoint, the default Endpoint is set to be the DBpedia endpoint. This means, that if you just want to work with the DBPedia endpoint, you don’t have to take care of configuring or selecting an endpoint and can just use this package directly.

Types of Endpoints

This package provides support for two kinds of Endpoints:

  • Remote Endpoints: Remote Endpoint allow you to work with classic hosted SPARQL endpoints.

  • Local Endpoints: Local Endpoints allow you to work with local RDF files as if they were hosted SPARQL endpoints.

Connecting to an Endpoint

Using Preconfigured Endpoints

For user convenience the package comes with a selection of preconfigured SPARQL endpoints, that are “ready to use” out of the box. This collection currently includes:

These Endpoints can simply be imported when needed, as shown in the following example:

from kgextension.endpoints import DBpedia

The loaded Endpoint object can (with the exception of the WikiData Endpoint) then be directly passed to the applicable functions.

Warning

Wikidata requires a custom (!) HTTP User-Agent header for all requests! You can set it as follows:

from kgextension import __agent__
from kgextension.endpoints import WikiData

WikiData.agent = "CoolToolName/0.0 (https://example.org/cool-tool/; cool-tool@example.org) "+__agent__

Setup of own Endpoints

Remote Endpoints

To set up your own Remote Endpoint, you have to create an kgextension.sparql_helper.RemoteEndpoint object, as in the following example:

from kgextension.sparql_helper import RemoteEndpoint

DBpedia = RemoteEndpoint(url = "http://dbpedia.org/sparql", timeout=120, requests_per_min=100*60, retries=10, page_size=10000)

Note

Theoretically the only parameter needed to set up a RemoteEndpoint is the url parameter. However, it is important correctly set the remaining parameters, as they are needed for the automatic Query Limiting done by this package.

After the successful creation, the resulting RemoteEndpoint object can be passed to the applicable functions.

Local Endpoints

Local Endpoints use the serializers provided by the RDFLib package to parse the local RDF files. Therefore, multiple serialisation formats are supported (e.g. RDF/XML, N3 & Turtle). For more information regarding the supported formats, please reference the RDFlib documentation.

To set up your own Local Endpoint, you have to create an kgextension.sparql_helper.LocalEndpoint object, as in the following example:

from kgextension.sparql_helper import LocalEndpoint

Mondial = LocalEndpoint(file_path = "mondial-europe.rdf")

Mondial.initialize()

Note the additional initialization call of the initialize() method, which will load the provided data into local memory. As this can, depending on the size of the dataset, take quite some time and will potentially consume lots of memory, it is not performed automatically. After the initialization, the created LocalEndpoint object can then be passed to the applicable functions.

If you want to remove the data from your local memory, you can call the close() method.

Mondial.close()