Wikipedia Recommender

The Wikipedia recommender contains 463,000 Wikipedia pages that had 20 or more page views in 2013.

JSON spec

The recommender responds with an array of objects in the following format:

    [
      {
        "id": "0000362a0b589c908fe01fd7bb302b6a",
        "meta": {
          "url": "",
          "date": "2013-10-20",
          "title": "Arc International",
          "document_id": "wikipedia-12566618"
        },
        "similarity": 1
      },
      {
        "id": "46951eff438212c9df72c303a27cff6b",
        "meta": {
          "url": "",
          "date": "2013-10-20",
          "title": "World Kitchen",
          "document_id": "wikipedia-3546083"
        },
        "similarity": 0.8403602838999999
      }
    ]

(Note: ... has been used to truncate long fields)

Each field of the document object is described below:

Field              Type    Description
id                 String  ID of the document, to be used in /documents/{id}/similar
similarity         Float   The similarity to the input document or text (higher is closer)
meta->url          String  The URL of the Wikipedia page
meta->date         String  The date the page was parsed
meta->title        String  The title of the page
meta->document_id  String  The legacy document ID
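
To make the field descriptions concrete, here is a minimal Python sketch that reads a response and pulls out the ID, title, and similarity of each document. The sample data is copied from the example response above; the function name summarize is a hypothetical helper, not part of the API.

```python
import json

# Sample response copied from the example above (empty "url" values kept as-is).
sample_response = """
[
  {"id": "0000362a0b589c908fe01fd7bb302b6a",
   "meta": {"url": "", "date": "2013-10-20", "title": "Arc International",
            "document_id": "wikipedia-12566618"},
   "similarity": 1},
  {"id": "46951eff438212c9df72c303a27cff6b",
   "meta": {"url": "", "date": "2013-10-20", "title": "World Kitchen",
            "document_id": "wikipedia-3546083"},
   "similarity": 0.8403602838999999}
]
"""


def summarize(response_text):
    """Return (id, title, similarity) tuples, most similar first.

    Hypothetical helper for illustration only.
    """
    documents = json.loads(response_text)
    rows = [
        (doc["id"], doc["meta"]["title"], float(doc["similarity"]))
        for doc in documents
    ]
    # Higher similarity means closer to the input, so sort in descending order.
    return sorted(rows, key=lambda row: row[2], reverse=True)


for doc_id, title, similarity in summarize(sample_response):
    print(f"{similarity:.4f}  {title}  ({doc_id})")
```

The id value of any row can then be used in /documents/{id}/similar.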


Let’s say that you have some text and you want to get Wikipedia pages that are conceptually similar to it. This is possible with /documents/similar-to-text?fields=meta. The first thing you will need is an API key. Once you have one, you can get recommendations in the terminal using the following cURL command:

curl --request POST \
  --url '' \
  --header 'content-type: application/json' \
  --header 'subscription-key: YOUR_WIKIPEDIA_KEY' \
  --data '{"text":"Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the construction and study of algorithms that can learn from and make predictions on data."}'

This will return a JSON array in the format specified above. If you want to pretty-print it for testing (and have Python 2.6+), you can pipe the output of the above command through python -m json.tool by appending | python -m json.tool.
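
The same request can also be assembled from a script. The following is a minimal Python 3 sketch using only the standard library. The base URL is a hypothetical placeholder, because the real host is not given here, and the function names are illustrative assumptions. The headers and JSON body mirror the cURL command above.

```python
import json
import urllib.request

# Hypothetical placeholder: substitute the real API host.
BASE_URL = "https://api.example.invalid"


def build_similar_to_text_request(text, api_key, base_url=BASE_URL):
    """Build (but do not send) the POST request for /documents/similar-to-text."""
    return urllib.request.Request(
        url=f"{base_url}/documents/similar-to-text?fields=meta",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "subscription-key": api_key,
        },
        method="POST",
    )


def fetch_json(request):
    """Send a prepared request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def pretty(response_text):
    """Re-indent a JSON string, similar to piping it through json.tool."""
    return json.dumps(json.loads(response_text), indent=4, ensure_ascii=False)


request = build_similar_to_text_request(
    "Machine learning is a subfield of computer science.",
    "YOUR_WIKIPEDIA_KEY",
)
# Only inspect the request here; fetch_json(request) would actually send it.
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to check the URL, headers, and body before spending an API call.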

Now that you’ve got some results back from the API, you might want to get similar pages for one of them. To do this, you can query /documents/{id}/similar. Given f47a946fb9a375debde783a37f182eff, which is the ID for Artificial intelligence, we can query the API for pages similar to that ID:

curl --request GET \
  --url '' \
  --header 'content-type: application/json' \
  --header 'subscription-key: YOUR_WIKIPEDIA_KEY' 
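
For comparison, a minimal Python 3 sketch of the same GET request follows. As before, the base URL is a hypothetical placeholder and build_similar_request is an illustrative name, not part of the API. The headers mirror the cURL command above.

```python
import urllib.parse
import urllib.request

# Hypothetical placeholder: substitute the real API host.
BASE_URL = "https://api.example.invalid"


def build_similar_request(document_id, api_key, base_url=BASE_URL):
    """Build (but do not send) the GET request for /documents/{id}/similar."""
    # Percent-encode the ID so it is always safe to place in the URL path.
    path_id = urllib.parse.quote(document_id, safe="")
    return urllib.request.Request(
        url=f"{base_url}/documents/{path_id}/similar",
        headers={
            "content-type": "application/json",
            "subscription-key": api_key,
        },
        method="GET",
    )


request = build_similar_request(
    "f47a946fb9a375debde783a37f182eff",  # ID for Artificial intelligence
    "YOUR_WIKIPEDIA_KEY",
)
# Passing this request to urllib.request.urlopen would send it.
print(request.get_method(), request.full_url)
```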

To call the API in your programming language of choice, check out the API specification where there are code samples available.

