Version 6, last updated by Robert Isele at November 21, 2011 17:55 UTC
Data Sources
Overview
Data sources hold the access parameters for local or remote SPARQL endpoints or RDF files. Once defined, a data source can be referenced by its ID. Data sources can be defined using either the Scala API or XML.
Available Data Sources
SPARQL Endpoint Data Source Definitions
For SPARQL endpoints (dataSource type: sparqlEndpoint) the following parameters exist:
| Parameter | Description | Default |
|---|---|---|
| endpointURI (mandatory) | The URI of the SPARQL endpoint. | |
| login | Login required for authentication | No login |
| password | Password required for authentication | No password |
| instanceList | A list of instances to be retrieved. If not given, all instances will be retrieved. Multiple instances can be separated by a space. | Retrieve all instances |
| pageSize | Limits each SPARQL query to a fixed number of results. Silk implements a paging mechanism that translates the pageSize parameter into SPARQL LIMIT and OFFSET clauses. | 1000 |
| graph | Only retrieve instances from a specific graph. | |
| pauseTime | To allow rate limiting of queries to public SPARQL servers, the pauseTime parameter specifies the number of milliseconds to wait between subsequent queries. | 0 |
| retryCount | To recover from intermittent SPARQL endpoint connection failures, the retryCount parameter specifies the number of times to retry connecting. | 3 |
| retryPause | Specifies how long to wait between retries. | 1000 |
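The interplay of pageSize, pauseTime, retryCount and retryPause can be illustrated with a small, self-contained sketch. This is not Silk's actual implementation: the names `pageQuery`, `withRetry`, `fetchAll` and the `execute` callback are hypothetical, retryPause is assumed to be in milliseconds, retryCount is assumed to count retries after the first attempt, and paging is assumed to stop at the first page that returns fewer than pageSize rows.

```scala
import scala.util.{Failure, Success, Try}

object SparqlPagingSketch {
  // Hypothetical helper: appends LIMIT/OFFSET for the given zero-based page.
  def pageQuery(baseQuery: String, pageSize: Int, page: Int): String =
    s"$baseQuery LIMIT $pageSize OFFSET ${page.toLong * pageSize}"

  // Hypothetical helper: runs `op`, retrying up to `retryCount` more times
  // after a failure and sleeping `retryPause` milliseconds before each retry.
  @annotation.tailrec
  def withRetry[T](retryCount: Int, retryPause: Long)(op: () => T): T =
    Try(op()) match {
      case Success(result) => result
      case Failure(_) if retryCount > 0 =>
        Thread.sleep(retryPause)
        withRetry(retryCount - 1, retryPause)(op)
      case Failure(error) => throw error
    }

  // Fetches page after page until one returns fewer than `pageSize` rows.
  // `execute` stands in for whatever actually sends the query to the endpoint.
  def fetchAll(baseQuery: String, pageSize: Int = 1000, pauseTime: Long = 0,
               retryCount: Int = 3, retryPause: Long = 1000)
              (execute: String => Seq[String]): Seq[String] = {
    val results = Seq.newBuilder[String]
    var page = 0
    var done = false
    while (!done) {
      // Rate limiting: wait between subsequent queries, not before the first.
      if (page > 0 && pauseTime > 0) Thread.sleep(pauseTime)
      val rows = withRetry(retryCount, retryPause)(() =>
        execute(pageQuery(baseQuery, pageSize, page)))
      results ++= rows
      done = rows.size < pageSize
      page += 1
    }
    results.result()
  }
}
```

With pageSize = 1000, the third query under these assumptions would end in `LIMIT 1000 OFFSET 2000`.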
Example (XML)
<DataSource id="dbpedia" type="sparqlEndpoint">
  <Param name="endpointURI" value="http://dbpedia.org/sparql" />
  <Param name="retryCount" value="100" />
</DataSource>
Example (Scala API)
Note that all parameters except endpointURI are optional and can be omitted.
Source("dbpedia",
  SparqlDataSource(
    endpointURI = "http://dbpedia.org/sparql",
    login = "user",
    password = "password",
    graph = "http://dbpedia.org",
    pageSize = 1000,
    pauseTime = 0,
    retryCount = 3,
    retryPause = 1000
  )
)
RDF File Data Source Definitions
For RDF files (dataSource type: file) the following parameters exist:
| Parameter | Description | Default |
|---|---|---|
| file (mandatory) | The location of the RDF file. | |
| format (mandatory) | The format of the RDF file. Allowed values: “RDF/XML”, “N-TRIPLE”, “TURTLE”, “TTL”, “N3” | |
Currently the data set is held in memory.
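Because the format parameter only accepts a fixed set of values, it can help to check it before building a data source definition. The following is a minimal sketch, not part of Silk: `RdfFormatSketch` and `validateFormat` are hypothetical names, and the exact, case-sensitive match is an assumption.

```scala
object RdfFormatSketch {
  // Allowed values as listed in the parameter table above.
  val allowedFormats: Set[String] = Set("RDF/XML", "N-TRIPLE", "TURTLE", "TTL", "N3")

  // Hypothetical check: returns the format unchanged if it is allowed,
  // otherwise an error message listing the accepted values.
  def validateFormat(format: String): Either[String, String] =
    if (allowedFormats.contains(format)) Right(format)
    else Left(
      s"Unsupported format '$format'; expected one of: " +
        allowedFormats.toSeq.sorted.mkString(", "))
}
```

For example, `RdfFormatSketch.validateFormat("N-TRIPLE")` yields a `Right`, while an unlisted value such as `"JSON-LD"` yields a `Left` with the error message.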
Example (XML)
<DataSource id="musicbrainz" type="file">
  <Param name="file" value="musicbrainz_dump.nt" />
  <Param name="format" value="N-TRIPLE" />
</DataSource>
Example (Scala API)
Source("musicbrainz",
  FileDataSource(
    file = "musicbrainz_dump.nt",
    format = "N-TRIPLE"
  )
)