Wednesday, August 8, 2018

Building an Oxford Dictionary API Client: Inspecting the JSON Data

If you would like to follow along, the source code for this tutorial is available from this GitHub repo, but remember to apply for your own free app_id and app_key from the Oxford REST API support/admin team.
Before we dive into this tutorial, we will examine the JSON data returned from different REST API endpoints in order to get a better idea of how we should create custom data types and design our data models.  It is highly recommended that you consult the Oxford API REST documentation to better understand the finer details of making API requests, as each project may have different needs, which in turn may necessitate different implementations of a REST API client.  Furthermore, in order to better follow along, it is recommended that you apply for your own app_key and app_id here, as the app_id and app_key used here are outdated and provided for demonstration purposes only.
In order to examine the returned JSON data, we will be making requests via Postman, which provides fields for entering authentication credentials (in this case, app_key and app_id) as request headers.  Since the base URL is https://od-api.oxforddictionaries.com/api/v1, each request URL in Postman begins with that prefix, with app_id and app_key supplied as header keys.
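For readers who prefer a script to Postman, the same request setup can be sketched in Python using only the standard library. This is a rough sketch rather than the client built later in this series: the function names auth_headers and fetch_json are hypothetical, the Accept header is an assumption, and the credentials are placeholders that you must replace with your own.

```python
import json
import urllib.request

# Base URL quoted in this post; whether it is still reachable is an assumption.
BASE_URL = "https://od-api.oxforddictionaries.com/api/v1"


def auth_headers(app_id, app_key):
    """Return the header fields that Postman would send with each request."""
    return {
        "app_id": app_id,
        "app_key": app_key,
        # Assumption: asking for JSON explicitly is harmless but may be optional.
        "Accept": "application/json",
    }


def fetch_json(path, app_id, app_key):
    """Send a GET request to BASE_URL + path and decode the JSON body."""
    request = urllib.request.Request(
        BASE_URL + path, headers=auth_headers(app_id, app_key)
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder credentials; nothing is sent until fetch_json is called.
headers = auth_headers("your_app_id", "your_app_key")
print(headers)
```

Calling fetch_json("/entries/en/love/synonyms", "your_app_id", "your_app_key") would then issue the first request examined below, provided the credentials are valid.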


First, we will take a look at the Thesaurus API, which provides the following endpoints:
  1.  /entries/{source_lang}/{word_id}/synonyms
  2.  /entries/{source_lang}/{word_id}/antonyms
  3.  /entries/{source_lang}/{word_id}/synonyms;antonyms
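The three endpoint patterns above differ only in their final path segment, so a small helper can assemble them. The sketch below is illustrative only: thesaurus_url and THESAURUS_OPERATIONS are hypothetical names, and percent-encoding the word_id is an assumption about how multi-word entries should be handled.

```python
from urllib.parse import quote

BASE_URL = "https://od-api.oxforddictionaries.com/api/v1"

# The final path segments listed for the Thesaurus API.
THESAURUS_OPERATIONS = ("synonyms", "antonyms", "synonyms;antonyms")


def thesaurus_url(word_id, operation="synonyms", source_lang="en"):
    """Build a Thesaurus API URL for one of the three listed endpoints."""
    if operation not in THESAURUS_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation!r}")
    # Assumption: the word is percent-encoded so spaces and slashes are safe.
    encoded_word = quote(word_id, safe="")
    return f"{BASE_URL}/entries/{source_lang}/{encoded_word}/{operation}"


print(thesaurus_url("love"))
print(thesaurus_url("love", "synonyms;antonyms"))
```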
Let’s look at the synonyms that we get for the word “love.” For all of the API requests demonstrated here, English is assumed to be the default language. Please be aware, however, that the language code can be changed to get data for words from other languages.  A sample API request is shown below:
JSON Data 1(a).  Returned JSON Data for URL: https://od-api.oxforddictionaries.com/api/v1/entries/en/love/synonyms
 

{
    "metadata": {
        "provider": "Oxford University Press"
    },
    "results": [
        {
            "id": "love",
            "language": "en",
            "lexicalEntries": [
                {
                    "entries": [
                        {
                            "homographNumber": "000",
                            "senses": [
                                {
                                    "examples": [
                                        {
                                            "text": "his friendship with Helen grew into love"
                                        },
                                        {
                                            "text": "she has a great love for her children"
                                        }
                                    ],
                                    "id": "t_en_gb0008942.001",
                                    "subsenses": [
                                        {
                                            "id": "ide826eadf-07eb-4b26-b5c5-d1be93adc85f",
                                            "synonyms": [
                                                {
                                                    "id": "devotion",
                                                    "language": "en",
                                                    "text": "devotion"
                                                },
                                                {
                                                    "id": "adoration",
                                                    "language": "en",
                                                    "text": "adoration"
                                                },
                                                {
                                                    "id": "doting",
                                                    "language": "en",
                                                    "text": "doting"
                                                },
                                                {
                                                    "id": "idolization",
                                                    "language": "en",
                                                    "text": "idolization"
                                                },
                                                {
                                                    "id": "worship",
                                                    "language": "en",
                                                    "text": "worship"
                                                }
                                            ]
                                        },
                                        {
                                            "id": "id6316b26b-17be-484b-8f64-159043e4fe76",
                                            "synonyms": [
                                                {
                                                    "id": "passion",
                                                    "language": "en",
                                                    "text": "passion"
                                                },
                                                {
                                                    "id": "ardour",
                                                    "language": "en",
                                                    "text": "ardour"
                                                },
                                                {
                                                    "id": "desire",
                                                    "language": "en",
                                                    "text": "desire"
                                                },
                                                {
                                                    "id": "lust",
                                                    "language": "en",
                                                    "text": "lust"
                                                },
                                                {
                                                    "id": "yearning",
                                                    "language": "en",
                                                    "text": "yearning"
                                                },
                          ....
                          ....


As you can see, there are a large number of results, and at the time of this writing, there is no offset or limit parameter whereby we can change the starting point from which we view the results or limit the number of results.  Nor can we filter the results by lexical category or grammatical features.  So the returned JSON data includes synonyms for “love,” irrespective of lexical category (e.g. noun, verb, adjective).
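Because the results cannot be limited on the server side, any trimming has to happen on the client after the response arrives. The sketch below walks the nesting seen in JSON Data 1(a) (results, lexicalEntries, entries, senses, subsenses, synonyms) and flattens it into a list of words. The function name collect_synonyms is hypothetical, and the sample dictionary is a trimmed copy of the response above, not a complete one.

```python
def collect_synonyms(response):
    """Flatten every synonym "text" value in a thesaurus response, in order."""
    words = []
    for result in response.get("results", []):
        for lexical_entry in result.get("lexicalEntries", []):
            for entry in lexical_entry.get("entries", []):
                for sense in entry.get("senses", []):
                    # Look at the sense itself and at each of its subsenses.
                    for node in [sense, *sense.get("subsenses", [])]:
                        for synonym in node.get("synonyms", []):
                            if synonym["text"] not in words:
                                words.append(synonym["text"])
    return words


# Trimmed copy of JSON Data 1(a), kept small for illustration.
sample = {
    "results": [{
        "id": "love",
        "lexicalEntries": [{
            "entries": [{
                "senses": [{
                    "id": "t_en_gb0008942.001",
                    "subsenses": [
                        {"synonyms": [
                            {"id": "devotion", "language": "en", "text": "devotion"},
                            {"id": "adoration", "language": "en", "text": "adoration"},
                        ]},
                        {"synonyms": [
                            {"id": "passion", "language": "en", "text": "passion"},
                        ]},
                    ],
                }],
            }],
        }],
    }],
}

synonyms = collect_synonyms(sample)
print(synonyms[:2])  # a client-side "limit" is just a list slice
```

Using .get with an empty default at each level means a missing key simply yields no synonyms rather than raising an error, which seems a reasonable choice given that not every sense in the response carries the same fields.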
For the Dictionary Entry API, we have the following endpoints:
  1.  /entries/{source_lang}/{word_id}
  2.  /entries/{source_lang}/{word_id}/regions={region}
  3.  /entries/{source_lang}/{word_id}/{filters}
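As with the thesaurus endpoints, these three patterns share a common prefix and differ only in an optional trailing segment. A minimal sketch of a URL builder follows; entry_url is a hypothetical name, and the decision to reject a region and a filter in the same call is an assumption, made only because the list above shows them as separate endpoints.

```python
from urllib.parse import quote

BASE_URL = "https://od-api.oxforddictionaries.com/api/v1"


def entry_url(word_id, source_lang="en", region=None, filters=None):
    """Build a Dictionary Entry URL, optionally with a region or a filter."""
    if region is not None and filters is not None:
        # Assumption: the endpoint list shows these as alternatives.
        raise ValueError("pass either region or filters, not both")
    url = f"{BASE_URL}/entries/{source_lang}/{quote(word_id, safe='')}"
    if region is not None:
        url += f"/regions={region}"
    elif filters is not None:
        url += f"/{filters}"
    return url


print(entry_url("love"))
print(entry_url("love", filters="pronunciations"))
```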
Let’s begin by testing the word “love” again, but this time we will filter by ‘pronunciations.’  That is, we just want the returned JSON data to give us pronunciation data for the dictionary entry (you can also request definitions, etymologies, domains, registers, etc.).  As a small caveat, this filter is more akin to an endpoint than a filter per se, since it doesn’t use key-value pairs to match a given query parameter with a given value, but I guess that is more of a semantic issue regarding the word ‘filter.’
JSON Data 1(b).  For this API request, we’ve specified ‘pronunciations’ as a filter parameter.  Returned JSON Data for URL: https://od-api.oxforddictionaries.com/api/v1/entries/en/love/pronunciations
    {
    "metadata": {
        "provider": "Oxford University Press"
    },
    "results": [
        {
            "id": "love",
            "language": "en",
            "lexicalEntries": [
                {
                    "language": "en",
                    "lexicalCategory": "Noun",
                    "pronunciations": [
                        {
                            "audioFile": "http://audio.oxforddictionaries.com/en/mp3/love_gb_1.mp3",
                            "dialects": [
                                "British English"
                            ],
                            "phoneticNotation": "IPA",
                            "phoneticSpelling": "lʌv"
                        }
                    ],
                    "text": "love"
                },
                {
                    "language": "en",
                    "lexicalCategory": "Verb",
                    "pronunciations": [
                        {
                            "audioFile": "http://audio.oxforddictionaries.com/en/mp3/love_gb_1.mp3",
                            "dialects": [
                                "British English"
                            ],
                            "phoneticNotation": "IPA",
                            "phoneticSpelling": "lʌv"
                        }
                    ],
                    "text": "love"
                }
            ],
            "type": "headword",
            "word": "love"
        }
    ]
}


The data gives us links to audio files with the pronunciation in British English, and also provides the IPA symbols, which can be helpful for students of other languages who use the IPA system to help them learn pronunciation.
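To see how this response might be consumed, the sketch below pulls one row per pronunciation out of the structure shown in the pronunciations response above, keeping the lexical category alongside the IPA spelling and audio link. The function name collect_pronunciations is hypothetical, and the sample is a trimmed copy of that response.

```python
def collect_pronunciations(response):
    """Return one dict per pronunciation, tagged with its lexical category."""
    rows = []
    for result in response.get("results", []):
        for lexical_entry in result.get("lexicalEntries", []):
            for pron in lexical_entry.get("pronunciations", []):
                rows.append({
                    "lexicalCategory": lexical_entry.get("lexicalCategory"),
                    "phoneticNotation": pron.get("phoneticNotation"),
                    "phoneticSpelling": pron.get("phoneticSpelling"),
                    "dialects": pron.get("dialects", []),
                    "audioFile": pron.get("audioFile"),
                })
    return rows


# Trimmed copy of the pronunciations response for "love".
sample = {
    "results": [{
        "id": "love",
        "lexicalEntries": [
            {
                "lexicalCategory": "Noun",
                "pronunciations": [{
                    "audioFile": "http://audio.oxforddictionaries.com/en/mp3/love_gb_1.mp3",
                    "dialects": ["British English"],
                    "phoneticNotation": "IPA",
                    "phoneticSpelling": "lʌv",
                }],
            },
            {
                "lexicalCategory": "Verb",
                "pronunciations": [{
                    "audioFile": "http://audio.oxforddictionaries.com/en/mp3/love_gb_1.mp3",
                    "dialects": ["British English"],
                    "phoneticNotation": "IPA",
                    "phoneticSpelling": "lʌv",
                }],
            },
        ],
    }],
}

for row in collect_pronunciations(sample):
    print(row["lexicalCategory"], row["phoneticSpelling"], row["audioFile"])
```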
For the LexiStats API, we have the following endpoints, each of which we will examine more closely in turn:
  1. /stats/frequency/word/{source_lang}/
  2. /stats/frequency/words/{source_lang}/
  3. /stats/frequency/ngrams/{source_lang}/{corpus}/{ngram-size}/
JSON Data 1(c).  For this API request, we’ve provided query parameters for “lemma” and “lexicalCategory.”  Returned JSON Data for URL: https://od-api.oxforddictionaries.com/api/v1/stats/frequency/word/en/lemma=love;lexicalCategory=noun
{
    "metadata": {
        "language": "en",
        "options": {
            "corpus": "nmc",
            "lemma": "love",
            "lexicalCategory": "noun"
        },
        "provider": "Oxford University Press"
    },
    "result": {
        "frequency": 1133853,
        "lemma": "love",
        "lexicalCategory": "noun",
        "matchCount": 10476,
        "normalizedFrequency": 126.7507038321887
    }
}
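Two details of this request are worth noting: the parameters are joined with semicolons in the URL path, and the payload sits under a single "result" object rather than a "results" array. The sketch below illustrates both. The names frequency_url and WordFrequency are hypothetical, and the sample dictionary copies only the "result" portion of the response above.

```python
from dataclasses import dataclass

BASE_URL = "https://od-api.oxforddictionaries.com/api/v1"


def frequency_url(source_lang="en", **params):
    """Build a single-word LexiStats URL with semicolon-separated parameters."""
    query = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{BASE_URL}/stats/frequency/word/{source_lang}/{query}"


@dataclass
class WordFrequency:
    lemma: str
    lexical_category: str
    frequency: int
    match_count: int
    normalized_frequency: float

    @classmethod
    def from_response(cls, response):
        """Read the single "result" object of a word-frequency response."""
        result = response["result"]
        return cls(
            lemma=result["lemma"],
            lexical_category=result["lexicalCategory"],
            frequency=result["frequency"],
            match_count=result["matchCount"],
            normalized_frequency=result["normalizedFrequency"],
        )


# The "result" portion of the word-frequency response for "love".
sample = {
    "result": {
        "frequency": 1133853,
        "lemma": "love",
        "lexicalCategory": "noun",
        "matchCount": 10476,
        "normalizedFrequency": 126.7507038321887,
    }
}

print(frequency_url(lemma="love", lexicalCategory="noun"))
print(WordFrequency.from_response(sample))
```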



Now that we have an idea of what our returned JSON data looks like, and what kind of query parameters and endpoints must be specified in the URL string, we can proceed to define some custom data types, data models, and helper classes for building the API URL request.  Click here to continue.
