# Wikidata JSKOS

[![GitHub release](https://img.shields.io/github/release/gbv/wikidata-jskos.svg)](https://github.com/gbv/wikidata-jskos/releases/latest)
[![API Status](https://coli-conc-status.fly.dev/api/badge/19/status?label=API)](https://coli-conc.gbv.de/services/wikidata/)
[![License](https://img.shields.io/github/license/gbv/wikidata-jskos.svg)](./LICENSE.md)
[![Docker](https://img.shields.io/badge/Docker-ghcr.io%2Fgbv%2Fwikidata--jskos-informational)](./docker/README.md)
[![Test](https://github.com/gbv/wikidata-jskos/actions/workflows/test.yml/badge.svg)](https://github.com/gbv/wikidata-jskos/actions/workflows/test.yml)
[![npm version](http://img.shields.io/npm/v/wikidata-jskos.svg?style=flat)](https://www.npmjs.org/package/wikidata-jskos)
[![standard-readme compliant](https://img.shields.io/badge/readme%20style-standard-brightgreen.svg)](https://github.com/RichardLitt/standard-readme)

> Access [Wikidata] in [JSKOS] format

This node module provides [a web service](#web-service), a [command line client](#command-line-tool), and [a library](#api) to access Wikidata in [JSKOS] format. The data includes Wikidata items as concepts and concept schemes (read) and mappings between Wikidata and other authority files (read and write).

## Table of Contents

- [Background](#background)
- [Install](#install)
  - [Docker](#docker)
  - [Node.js](#nodejs)
- [Configuration](#configuration)
- [Usage](#usage)
- [Web Service](#web-service)
  - [Authentication](#authentication)
  - [GET /status](#get-status)
  - [GET /concepts](#get-concepts)
  - [GET /concepts/suggest](#get-conceptssuggest)
  - [GET /mappings](#get-mappings)
  - [GET /mappings/voc](#get-mappingsvoc)
  - [GET /mappings/:\_id](#get-mappings_id)
  - [POST /mappings](#post-mappings)
  - [PUT /mappings/:\_id](#put-mappings_id)
  - [DELETE /mappings/:\_id](#delete-mappings_id)
- [Command line tool](#command-line-tool)
  - [wdjskos concept](#wdjskos-concept)
  - [wdjskos mappings](#wdjskos-mappings)
  - [wdjskos schemes](#wdjskos-schemes)
  - [wdjskos update](#wdjskos-update)
  - [wdjskos find](#wdjskos-find)
  - [wdjskos mapping-item](#wdjskos-mapping-item)
- [API](#api)
  - [mapEntity](#mapentity)
    - [Map selected parts of a Wikidata entity](#map-selected-parts-of-a-wikidata-entity)
    - [Map simplified Wikidata entities](#map-simplified-wikidata-entities)
  - [mapMapping](#mapmapping)
- [Maintainers](#maintainers)
  - [Publish](#publish)
- [Contributing](#contributing)
- [License](#license)

[mapEntity]: #mapentity
[mapMapping]: #mapmapping

## Background

[Wikidata] is a large knowledge base with detailed information about all kinds of entities. Mapping its data model to the [JSKOS] data format allows simplified reuse of Wikidata as an authority file. This implementation is used in the [Cocoda web application](https://coli-conc.gbv.de/cocoda/) but it can also be used independently.

The mapping between Wikidata and JSKOS format includes:

* Wikidata items expressed as authority records ([JSKOS Concepts](https://gbv.github.io/jskos/jskos.html#concept))
* Selected Wikidata items covering information about authority files ([JSKOS Concept Schemes](https://gbv.github.io/jskos/jskos.html#concept-schemes))
* Selected Wikidata statements linking Wikidata to other authority files ([JSKOS Mappings](https://gbv.github.io/jskos/jskos.html#concept-mappings))

In addition, a search service is provided for selecting a Wikidata item with typeahead.

Editing Wikidata mapping statements to other authority files requires [authentication](#authentication) via OAuth. The following authority files have been tested successfully:

* Basisklassifikation (BK)
* Regensburg Classification (RVK)
* Integrated Authority File (GND)
* Nomisma
* Iconclass

Other systems (not including DDC) may also work but they have not been converted to JSKOS yet, so they are not provided for browsing in Cocoda.

## Install

### Docker

The easiest way to run wikidata-jskos is via Docker. Please refer to the [Docker documentation](./docker/README.md).

### Node.js

Node.js 18 or later is required (Node.js 20 recommended).

```sh
git clone https://github.com/gbv/wikidata-jskos.git
cd wikidata-jskos
npm ci
```

Optionally make the [command line tool](#command-line-tool) `wdjskos` available:

```sh
npm link
```

For development of the web service with hot reload and auto reconnect at <http://localhost:2013/>:

```bash
npm run start
```

For deployment of the web service (if not using Docker) there is a config file to use with [pm2](http://pm2.keymetrics.io/):

```bash
cp ecosystem.example.json ecosystem.config.json
pm2 start ecosystem.config.json
```

## Configuration

You can customize the application settings via a configuration file, e.g. by providing a generic `config.json` file and/or a more specific `config.{env}.json` file (where `{env}` is the environment like `development` or `production`). The latter takes precedence over the former, and all missing keys fall back to the values in `config.default.json`.

All configuration options can also be set via environment variables (`.env` when running via Node.js or using `environment` or `env_file` in Docker Compose).
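
The file precedence described above (environment-specific file over generic file over defaults) can be pictured as a recursive merge. The following is only a sketch and not the loader used by wikidata-jskos: `mergeConfig`, `isPlainObject`, and the three sample objects are made up for illustration, and it is an assumption that nested keys such as `auth` are merged key by key.

```js
// Hypothetical helpers, not part of wikidata-jskos
const isPlainObject = (value) =>
  value !== null && typeof value === "object" && !Array.isArray(value)

// Merge plain objects recursively; later sources win over earlier ones
function mergeConfig(...sources) {
  const result = {}
  for (const source of sources) {
    for (const [key, value] of Object.entries(source)) {
      result[key] = isPlainObject(value)
        ? mergeConfig(isPlainObject(result[key]) ? result[key] : {}, value)
        : value
    }
  }
  return result
}

// Sample objects standing in for config.default.json, config.json,
// and config.production.json (values are illustrative)
const defaults = { port: 2013, verbosity: false, auth: { algorithm: "HS256", key: null } }
const generic = { verbosity: true, auth: { key: "-----BEGIN PUBLIC KEY-----\n..." } }
const production = { port: 8080, auth: { algorithm: "RS256" } }

const config = mergeConfig(defaults, generic, production)
// config.port comes from the production file, config.verbosity and
// config.auth.key from the generic file
```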

Some notes:

- To use a custom Wikibase instance, you can set the subkeys of the `wikibase` property. Both `instance` and `sparqlEndpoint` are necessary. By default, Wikidata is used.
- wikidata-jskos supports saving, editing, and deleting mappings in Wikidata. To enable this, you will need to provide `auth.algorithm` and `auth.key` (algorithm and key to decode the JWT, usually coming from [login-server]), as well as `oauth.consumer_key` and `oauth.consumer_secret` (for your registered OAuth consumer).
- `auth.key`/`AUTH_KEY` contain line breaks. In JSON, these can simply be set as `\n`. When using `.env` or `env_file`, the whole key needs to be double-quoted (`"-----BEGIN PUBLIC KEY-----\n..."`). To set `AUTH_KEY` directly in `docker-compose.yml` via `environment`, please look at the included [`docker-compose.yml`](./docker/docker-compose.yml) file or refer to [this StackOverflow answer](https://stackoverflow.com/a/53198865).
- Please provide a `baseUrl` when used in production. If no baseUrl is provided, `http://localhost:${port}/` will be used.
- List of all available configuration options:

| `config.json` key       | environment variable | default value                       |
| ----------------------- | -------------------- | ----------------------------------- |
| title                   | TITLE                | Wikidata JSKOS Service              |
| wikibase.instance       | WIKIBASE_INSTANCE    | `https://www.wikidata.org`          |
| wikibase.sparqlEndpoint | WIKIBASE_SPARQL      | `https://query.wikidata.org/sparql` |
| wikibase.api            | WIKIBASE_API         | `${wikibase.instance}/w/api.php`    |
| verbosity               | VERBOSITY            | false                               |
| baseUrl                 | BASE_URL             | `http://localhost:${port}/`         |
| port                    | PORT                 | 2013                                |
| auth.algorithm          | AUTH_ALGORITHM       | HS256                               |
| auth.key                | AUTH_KEY             | null                                |
| oauth.consumer_key      | OAUTH_KEY            | null                                |
| oauth.consumer_secret   | OAUTH_SECRET         | null                                |

The list of concept schemes that mappings can be read from and written to is hard-coded in the [assets](assets) directory. To update the concept schemes, regularly run:

```bash
npm run update
```


## Usage

See below for use of the [Web Service](#web-service), the [command line tool](#command-line-tool), and the JavaScript [API](#api).

## Web Service

An instance is available at <https://coli-conc.gbv.de/services/wikidata/>. The service provides selected endpoints of [JSKOS API](https://github.com/gbv/jskos-server#api).

### Authentication

The following endpoints require an authenticated user:

- [POST /mappings](#post-mappings)
- [PUT /mappings/:_id](#put-mappings_id)
- [DELETE /mappings/:_id](#delete-mappings_id)

Authentication works via a JWT (JSON Web Token). The JWT has to be provided as a Bearer token in the `Authorization` header, e.g. `Authorization: Bearer <token>`. It is integrated with [login-server] and the JWT is required to have the same format as the one login-server provides. Specifically, the OAuth token and secret for the user need to be provided as follows:

```json
{
  "user": {
    "identities": {
      "wikidata": {
        "oauth": {
          "token": "..",
          "token_secret": "..."
        }
      }
    }
  }
}
```

There are more properties in the JWT, but those are not used by wikidata-jskos. Note that the JWT needs to be signed with the respective private key for the public key provided in the [configuration](#configuration). Also, the OAuth user token and secret need to come from the same OAuth consumer provided in the config.
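
When debugging a client, it can help to check whether a token carries the OAuth credentials in the expected place before sending it. The sketch below decodes the JWT payload without verifying the signature, so it is only a diagnostic aid and says nothing about whether the service will accept the token. `getWikidataOAuth` and the dummy token are hypothetical and not part of wikidata-jskos or login-server.

```js
// Decode the (unverified) payload of a JWT and return the Wikidata OAuth
// credentials if they are present in the structure shown above, else null.
// This does NOT check the signature.
function getWikidataOAuth(jwt) {
  const parts = String(jwt).split(".")
  if (parts.length !== 3) {
    return null
  }
  let payload
  try {
    payload = JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"))
  } catch (error) {
    return null
  }
  const oauth = payload?.user?.identities?.wikidata?.oauth
  return oauth && oauth.token && oauth.token_secret ? oauth : null
}

// Build an unsigned dummy token just to exercise the function
const encode = (data) => Buffer.from(JSON.stringify(data)).toString("base64url")
const dummy = [
  encode({ alg: "none" }),
  encode({ user: { identities: { wikidata: { oauth: { token: "abc", token_secret: "xyz" } } } } }),
  "signature",
].join(".")

const oauth = getWikidataOAuth(dummy)
```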

### GET /status

Returns a JSKOS API status object. See [JSKOS Server] for details.

### GET /concepts

Look up Wikidata items as [JSKOS Concepts] by their entity URI or QID.

* **URL Params**

  `uri=[uri]` URIs for concepts separated by `|`.

  `language` or `languages`: comma separated list of language codes.

* **Success Response**

  JSON array of [JSKOS Concepts]

Only some Wikidata properties are mapped to JSKOS fields.  The result also
contains `broader` links determined by an additional SPARQL request.

The deprecated alias `/concept` is going to be removed.
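
A request URL for this endpoint might be assembled as follows. This is a minimal sketch: `conceptsUrl` is a hypothetical helper, and the base URL is assumed to end with a slash so that `concepts` is appended to the path rather than replacing its last segment.

```js
// Hypothetical helper: build a GET /concepts URL for a list of URIs
function conceptsUrl(baseUrl, uris, languages = []) {
  const url = new URL("concepts", baseUrl)
  // Multiple URIs are separated by "|", language codes by ","
  url.searchParams.set("uri", uris.join("|"))
  if (languages.length) {
    url.searchParams.set("languages", languages.join(","))
  }
  return url.toString()
}

const url = conceptsUrl(
  "https://coli-conc.gbv.de/services/wikidata/",
  ["http://www.wikidata.org/entity/Q42", "http://www.wikidata.org/entity/Q1"],
  ["en", "de"],
)
// The URL can then be requested with any HTTP client, e.g.
// const concepts = await (await fetch(url)).json()
```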

### GET /concepts/suggest

OpenSearch Suggest endpoint for typeahead search.

[JSKOS Concept Schemes]: https://gbv.github.io/jskos/jskos.html#concept-schemes
[JSKOS Server]: https://github.com/gbv/jskos-server
[JSKOS Concepts]: https://gbv.github.io/jskos/jskos.html#concept
[JSKOS Concept Mappings]: https://gbv.github.io/jskos/jskos.html#concept-mappings
[Wikidata properties for authority control]: http://www.wikidata.org/entity/Q18614948

The deprecated aliases `/concept/suggest` and `/suggest` are going to be removed.
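
An OpenSearch Suggestions response is a JSON array of the form `[query, completions, descriptions, urls]`. The sketch below turns such an array into a list of objects. `parseSuggestions` is a hypothetical helper, the sample array is hand-written rather than real service output, and it is an assumption that the fourth array holds one entity URI per suggestion.

```js
// Turn an OpenSearch Suggestions array into a list of objects.
// Assumption: the fourth array holds one URI per suggestion.
function parseSuggestions(response) {
  const [query, labels = [], descriptions = [], uris = []] = response
  return {
    query,
    suggestions: labels.map((label, i) => ({
      label,
      description: descriptions[i] || "",
      uri: uris[i] || null,
    })),
  }
}

// Hand-written sample in OpenSearch Suggestions layout
const sample = [
  "dougl",
  ["Douglas Adams"],
  ["English writer"],
  ["http://www.wikidata.org/entity/Q42"],
]
const result = parseSuggestions(sample)
```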

### GET /mappings

Look up Wikidata mapping statements as [JSKOS Concept Mappings] between
Wikidata items (query parameter `from`) and external identifiers (query
parameter `to`). At least one of the two parameters must be given.

* **URL Params**

  `from=[uriOrNotation1|uriOrNotation2|...]` specify the source URI or notation (multiple URIs/notations separated by `|`)

  `to=[uriOrNotation1|uriOrNotation2|...]` specify the target URI or notation (multiple URIs/notations separated by `|`)

  `fromScheme=[uri|notation]` only show mappings from this concept scheme (URI or notation)

  `toScheme=[uri|notation]` only show mappings to this concept scheme (URI or notation)

  `language` or `languages` enables inclusion of entity labels. A comma separated list of language codes is used as preference list.

  `mode=[mode]` specify the mode for `from`, `to`, one of `and` (default) and `or`

  `direction=forward|backward|both` searches mappings from `from` to `to` (default), reverse, or in both directions

  `limit=[number]` maximum number of mappings to return (not fully implemented)

  `offset=[number]` start number of mappings to return (not fully implemented)

Concept Schemes are identified by BARTOC IDs (e.g. <http://bartoc.org/en/node/430>).

* **Success Response**

  JSON array of [JSKOS Concept Mappings]

* **Examples**

  `?from=http://www.wikidata.org/entity/Q42`

  `?to=http://d-nb.info/gnd/119033364`

Mapping relation types ([P4390]) are respected, if given, see for example
mapping from Wikidata to <http://d-nb.info/gnd/7527800-5>.
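
The query parameters above can be combined into a request URL as in the following sketch. `mappingsUrl` is a hypothetical helper; it only enforces the rule that at least one of `from` and `to` must be given and passes all other parameters through unchecked. The base URL is assumed to end with a slash.

```js
// Hypothetical helper: build a GET /mappings URL from the parameters listed above
function mappingsUrl(baseUrl, { from = [], to = [], ...options } = {}) {
  if (!from.length && !to.length) {
    throw new Error("At least one of `from` and `to` must be given")
  }
  const url = new URL("mappings", baseUrl)
  if (from.length) {
    url.searchParams.set("from", from.join("|"))
  }
  if (to.length) {
    url.searchParams.set("to", to.join("|"))
  }
  // Pass through optional parameters such as toScheme, direction, mode, languages
  for (const [name, value] of Object.entries(options)) {
    url.searchParams.set(name, value)
  }
  return url.toString()
}

// Mappings from Q42 to the concept scheme with BARTOC ID 430
const url = mappingsUrl("https://coli-conc.gbv.de/services/wikidata/", {
  from: ["http://www.wikidata.org/entity/Q42"],
  toScheme: "http://bartoc.org/en/node/430",
  direction: "forward",
})
```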

[P1921]: http://www.wikidata.org/entity/P1921
[P1793]: http://www.wikidata.org/entity/P1793
[P1629]: http://www.wikidata.org/entity/P1629
[P2689]: http://www.wikidata.org/entity/P2689
[P4390]: http://www.wikidata.org/entity/P4390

### GET /mappings/voc

Look up Wikidata items with [Wikidata properties for authority control] as
[JSKOS Concept Schemes] that are used for mappings. These schemes need to have a
BARTOC ID ([P2689]) and be the subject item ([P1629]) of an external identifier
property with statements [P1921] (URI template) and [P1793] (regular
expression).

* **URL Params**

  None.

* **Success Response**

  JSON array of [JSKOS Concept Schemes]
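
The URI template and the regular expression are what make it possible to translate between identifier values stored in Wikidata and concept URIs of the target scheme. The sketch below illustrates the idea, assuming the template uses `$1` as placeholder for the identifier value. The `scheme` object, its field names, and both functions are hypothetical, and the pattern is simplified for illustration rather than copied from Wikidata.

```js
// Hypothetical scheme description: uriTemplate plays the role of P1921,
// pattern the role of P1793. The values are simplified for illustration.
const scheme = {
  notation: ["GND"],
  uriTemplate: "http://d-nb.info/gnd/$1",
  pattern: "[0-9X-]+",
}

// Return the concept URI for an identifier value, or null if it does not match
function identifierToUri(value, { uriTemplate, pattern }) {
  const valid = new RegExp(`^(?:${pattern})$`).test(value)
  // Replacement function avoids special handling of "$" in the value
  return valid ? uriTemplate.replace("$1", () => value) : null
}

// The reverse: extract the identifier value from a concept URI
function uriToIdentifier(uri, { uriTemplate, pattern }) {
  const [prefix, suffix = ""] = uriTemplate.split("$1")
  if (!uri.startsWith(prefix) || !uri.endsWith(suffix)) {
    return null
  }
  const value = uri.slice(prefix.length, uri.length - suffix.length)
  return new RegExp(`^(?:${pattern})$`).test(value) ? value : null
}

const uri = identifierToUri("119033364", scheme)
const id = uriToIdentifier("http://d-nb.info/gnd/7527800-5", scheme)
```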

### GET /mappings/:_id

Returns a specific mapping for a Wikidata claim/statement.

* **Success Response**

  JSKOS object for mapping.

* **Error Response**

  If no claim with `_id` could be found, it will return a 404 not found error.

* **Sample Call**

  ```bash
  curl https://coli-conc.gbv.de/services/wikidata/mappings/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598
  ```

  ```json
  {
    "uri": "http://localhost:2013/mappings/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
    "identifier": [
      "http://www.wikidata.org/entity/statement/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
      "urn:jskos:mapping:content:2807c55eac85ed8c0c9254ff04b457f89b801ac9",
      "urn:jskos:mapping:members:daafcd8580e6f0304f0b1cee024f65f04da98a3c"
    ],
    "to": {
      "memberSet": [
        {
          "uri": "http://rvk.uni-regensburg.de/nt/VK",
          "notation": [
            "VK"
          ]
        }
      ]
    },
    "type": [
      "http://www.w3.org/2004/02/skos/core#exactMatch"
    ],
    "fromScheme": {
      "uri": "http://bartoc.org/en/node/1940",
      "notation": [
        "WD"
      ]
    },
    "toScheme": {
      "uri": "http://bartoc.org/en/node/533",
      "notation": [
        "RVK"
      ]
    },
    "from": {
      "memberSet": [
        {
          "uri": "http://www.wikidata.org/entity/Q11351",
          "notation": [
            "Q11351"
          ]
        }
      ]
    },
    "@context": "https://gbv.github.io/jskos/context.json"
  }
  ```
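
In the sample above, the statement ID used as `_id` in the request also appears at the end of the first entry in `identifier`. A client could rely on this to get from a mapping object back to its statement, as in the following sketch. `summarizeMapping` is a hypothetical helper and assumes a 1-to-1 mapping shaped like the sample response.

```js
// Summarize a 1-to-1 JSKOS mapping like the one returned above
function summarizeMapping(mapping) {
  const prefix = "http://www.wikidata.org/entity/statement/"
  const statement = (mapping.identifier || []).find((id) => id.startsWith(prefix))
  return {
    // Statement ID as used for `_id` in the sample call
    id: statement ? statement.slice(prefix.length) : null,
    from: mapping.from?.memberSet?.[0]?.uri ?? null,
    to: mapping.to?.memberSet?.[0]?.uri ?? null,
    type: mapping.type?.[0] ?? null,
  }
}

// Abbreviated copy of the sample response
const mapping = {
  identifier: [
    "http://www.wikidata.org/entity/statement/Q11351-9968E140-6CA7-448D-BF0C-D8ED5A9F4598",
  ],
  from: { memberSet: [{ uri: "http://www.wikidata.org/entity/Q11351" }] },
  to: { memberSet: [{ uri: "http://rvk.uni-regensburg.de/nt/VK" }] },
  type: ["http://www.w3.org/2004/02/skos/core#exactMatch"],
}
const summary = summarizeMapping(mapping)
```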

### POST /mappings

Saves a mapping in Wikidata. Requires [authentication](#authentication).

Note that if an existing mapping in Wikidata is found with the exact same members, that mapping will be overwritten by this request.

* **Success Response**

  JSKOS Mapping object as it was saved in Wikidata.
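
A request body for this endpoint might be built as in the sketch below, which follows the structure of the sample response of [GET /mappings/:_id](#get-mappings_id). `buildMapping` is a hypothetical helper, and it is an assumption that `from`, `to`, and `type` are sufficient; the service may expect further fields such as `fromScheme` and `toScheme`.

```js
// Hypothetical helper: build a minimal 1-to-1 JSKOS mapping from a Wikidata
// item to a concept in another concept scheme
function buildMapping(qid, targetUri, type = "http://www.w3.org/2004/02/skos/core#exactMatch") {
  if (!/^Q[1-9][0-9]*$/.test(qid)) {
    throw new Error(`Not a Wikidata item ID: ${qid}`)
  }
  return {
    from: { memberSet: [{ uri: `http://www.wikidata.org/entity/${qid}`, notation: [qid] }] },
    to: { memberSet: [{ uri: targetUri }] },
    type: [type],
  }
}

// JSON string to send as request body
const body = JSON.stringify(buildMapping("Q11351", "http://rvk.uni-regensburg.de/nt/VK"))
```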

### PUT /mappings/:_id

Overwrites a mapping in Wikidata. Requires [authentication](#authentication).

* **Success Response**

  JSKOS Mapping object as it was saved in Wikidata.

### DELETE /mappings/:_id

Deletes a mapping from Wikidata. Requires [authentication](#authentication).

* **Success Response**

  Status 204, no content.

## Command line tool

The command line client `wdjskos` provides roughly the same commands as accessible via
[the web service](#web-service).

Mapping schemes are cached in the subfolder `./cache`. To update the cache
include option `--force` or run command `update`.

### wdjskos concept

Look up Wikidata items as [JSKOS Concepts].

    wdjskos concept Q42

### wdjskos mappings

Look up [JSKOS Concept Mappings].

    wdjskos mappings Q42 | jq '.to.memberSet[].uri'
    wdjskos mappings - http://viaf.org/viaf/113230702

A single hyphen (`-`) can be used in place of the `from` or `to` argument to
leave it unspecified. Mappings can be limited to a target scheme. These are equivalent:

    wdjskos --scheme P227 mappings Q42
    wdjskos --scheme 430 mappings Q42
    wdjskos --scheme http://bartoc.org/en/node/430 mappings Q42

### wdjskos schemes

Look up [JSKOS Concept Schemes] with [Wikidata properties for authority control].

### wdjskos update

Look up concept schemes from Wikidata and update the cache.

### wdjskos find

Search for a Wikidata item by its names and return an OpenSearch Suggestions response.

### wdjskos mapping-item

Convert a JSKOS mapping to a Wikidata item.

    wdjskos mapping-item mapping.json
    wdjskos --simplify mapping-item mapping.json

## API

The node library can be used to convert Wikidata JSON format to JSKOS
([mapEntity]) and to convert JSKOS mappings to Wikidata JSON format
([mapMapping]).

### mapEntity

```js
jskos = wds.mapEntity(entity)
```

Entity data can be retrieved via Wikidata API method [wbgetentities] and from
Wikidata database dumps. See JavaScript libraries [wikidata-sdk] and
[wikidata-filter] for easy access and processing.

#### Map selected parts of a Wikidata entity

All methods return a JSKOS item.

```js
jskos = wds.mapIdentifier(entity.id)
// { uri: "http://www.wikidata.org/entity/...", notation: [ "..." ] }

jskos = wds.mapLabels(entity.labels)
// { prefLabel: { ... } }

jskos = wds.mapAliases(entity.aliases)
// { altLabel: { ... } }

jskos = wds.mapDescriptions(entity.descriptions)
// { scopeNote: { ... } }

jskos = wds.mapSitelinks(entity.sitelinks)
// { occurrences: [ { ... } ], subjectOf: [ { url: ... }, ... ] }

jskos = wds.mapClaims(entity.claims)
// ...

// convert claims with mapping properties
jskos = wds.mapMappingClaims(claims)

jskos = wds.mapInfo(entity)
// ...
```
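
To give an idea of what these methods do, the following standalone sketch reproduces the label and alias mappings indicated in the comments above. It is not the library code: `labelsToPrefLabel` and `aliasesToAltLabel` are hypothetical names, and the input is assumed to follow the Wikidata JSON layout with `language` and `value` fields.

```js
// Standalone sketch of the label mappings (not the library code).
// Wikidata: { en: { language: "en", value: "..." } } -> JSKOS: { prefLabel: { en: "..." } }
function labelsToPrefLabel(labels = {}) {
  const prefLabel = {}
  for (const [language, { value }] of Object.entries(labels)) {
    prefLabel[language] = value
  }
  return { prefLabel }
}

// Wikidata: { en: [ { language, value }, ... ] } -> JSKOS: { altLabel: { en: [ "...", ... ] } }
function aliasesToAltLabel(aliases = {}) {
  const altLabel = {}
  for (const [language, list] of Object.entries(aliases)) {
    altLabel[language] = list.map(({ value }) => value)
  }
  return { altLabel }
}

// Hand-written sample entity fragment
const entity = {
  labels: { en: { language: "en", value: "Douglas Adams" } },
  aliases: { en: [{ language: "en", value: "Douglas Noel Adams" }] },
}
const concept = { ...labelsToPrefLabel(entity.labels), ...aliasesToAltLabel(entity.aliases) }
```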

#### Map simplified Wikidata entities

Each method has a counterpart to map simplified Wikidata entities.

```js
jskos = wds.mapSimpleEntity(entity)
jskos = wds.mapSimpleIdentifier(entity.id)
jskos = wds.mapSimpleLabels(entity.labels)
// ...
```

### mapMapping

Convert a JSKOS mapping into a Wikidata claim. Only respects JSKOS fields
`from`, `to`, `uri`, and `type` (if given) and only supports 1-to-1 mappings
from a single Wikidata item to a concept in another concept scheme.

*This is work in progress!*
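
Given these restrictions, it may be useful to check a mapping before trying to convert it. The sketch below is a hypothetical pre-check and not part of the library; the exact conditions applied by `mapMapping` may differ.

```js
// Hypothetical pre-check for the restrictions listed above: exactly one
// Wikidata item in `from` and exactly one concept with a URI in `to`
function isSupportedMapping(mapping) {
  const from = mapping?.from?.memberSet
  const to = mapping?.to?.memberSet
  return Array.isArray(from) && from.length === 1 &&
    Array.isArray(to) && to.length === 1 &&
    /^http:\/\/www\.wikidata\.org\/entity\/Q[1-9][0-9]*$/.test(from[0]?.uri || "") &&
    typeof to[0]?.uri === "string"
}

const oneToOne = {
  from: { memberSet: [{ uri: "http://www.wikidata.org/entity/Q11351" }] },
  to: { memberSet: [{ uri: "http://rvk.uni-regensburg.de/nt/VK" }] },
}
const oneToMany = {
  from: oneToOne.from,
  to: { memberSet: [{ uri: "http://example.org/a" }, { uri: "http://example.org/b" }] },
}
const supported = [oneToOne, oneToMany].map(isSupportedMapping)
```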

## Maintainers

- [@nichtich](https://github.com/nichtich)

### Publish

**For maintainers only**

Please work on the `dev` branch during development (or better yet, develop in a feature branch and merge into `dev` when ready).

When a new release is ready (i.e. the features are finished, merged into `dev`, and all tests succeed), run the included release script (replace "patch" with "minor" or "major" if necessary):

```bash
npm run release:patch
```

This will:
- Check that we are on `dev`
- Run tests and build to make sure everything works
- Make sure `dev` is up-to-date
- Run `npm version patch` (or "minor"/"major")
- **Ask you to confirm the version**
- Push changes to `dev`
- Switch to `main`
- Merge changes from `dev`
- Push `main` with tags
- Switch back to `dev`

After running this, GitHub Actions will automatically create a new GitHub Release draft. Please edit and publish the release manually.

## Contributing

PRs accepted against the `dev` branch.

To enable debugging output, set the environment variable `DEBUG` to a comma-separated list of components (`sparql`, `http`, `query`).

Please lint JavaScript code (e.g. run `npm run lint` or `npm run fix`).

If editing the README, please conform to the [standard-readme](https://github.com/RichardLitt/standard-readme) specification.

## License

[MIT © 2024 Verbundzentrale des GBV (VZG)](LICENSE.md)


[wbgetentities]: https://www.wikidata.org/w/api.php?action=help&modules=wbgetentities
[wikidata-sdk]: https://github.com/maxlath/wikidata-sdk
[wikidata-cli]: https://github.com/maxlath/wikidata-cli
[wikidata-filter]: https://github.com/maxlath/wikidata-filter
[Wikidata]: https://www.wikidata.org/
[JSKOS]: https://gbv.github.io/jskos/
[login-server]: https://github.com/gbv/login-server
