Azure Mobile Services as an API Cache, avoiding API Rate Limits

 

Let’s say you are using a third-party REST API that allows you to cache its data in your mobile application, and everything is working great. Then, all of a sudden, you run into an unexpected challenge: the API calls start to fail and your users get ugly errors. After doing your due diligence, you pinpoint that this happens once your application exceeds a threshold on the number of API calls it makes. If this is the case, you are likely experiencing API throttling or an API rate limit.

Many would argue that this is a good problem to have, since it likely implies that users are using your application. Great, you have a market! On the other hand, you have a challenge that you need to address.

One way would be to get more API transactions (API calls) from your API provider. If the provider has a tiered pricing model, you are probably looking at a higher cost. Many API providers give you a reasonable number of transactions free of charge or at a relatively low price. However, once you reach a given limit, you have to move to another pricing tier. In other cases, your API provider might use API throttling as a preventive measure to avoid overutilization of its back-end resources.

Now, in this context, imagine that you have a way to minimize the number of redundant API calls your app makes to the API provider without relying on complex client-side coding, and that you can do this in a very cost-efficient way.

Well, that’s exactly what this post is all about: showing you how to build a cost-effective solution with these characteristics using Azure Mobile Services (AMS).

Caching Data

In a great number of scenarios, API calls retrieve data that does not change for a period of time. This period could be days, hours, minutes or even seconds. Often, developers rely on this point and create client-side data caching schemes where the application mitigates redundant calls (calls that will retrieve the same data) by storing data on the client device for a period of time. However, if your user population is significant enough, what you can do within your client application is limited. Even if you manage to bring the number of calls to a minimum per user, you could still have many users calling the API for the same data.

One strategy to mitigate the impact of this scenario is to have a middle tier that makes the API call to the API provider on behalf of the user and then caches the data. When another user calls the same API, the middle tier returns the cached data instead of making another call to the API provider. I am going to show you an implementation of this strategy using Azure Mobile Services.
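
As a rough illustration of the idea, here is a minimal in-memory sketch of a caching middle tier. It is not the AMS implementation described later in this post; createCachingTier, fetchFromProvider, ttlMs and now are hypothetical names, and the provider is replaced by a stub function so the snippet runs on its own.

```javascript
// Minimal in-memory sketch of a caching middle tier (cache-aside pattern).
// All names here are illustrative; nothing in this block is an AMS API.
function createCachingTier(fetchFromProvider, ttlMs, now) {
    var cache = Object.create(null); // key -> { data, expiresAt }
    var providerCalls = 0;

    return {
        get: function (key) {
            var entry = cache[key];
            if (entry && entry.expiresAt > now()) {
                return entry.data; // cache hit: no call to the provider
            }
            // Cache miss or expired entry: call the provider and store the result.
            providerCalls++;
            var data = fetchFromProvider(key);
            cache[key] = { data: data, expiresAt: now() + ttlMs };
            return data;
        },
        providerCalls: function () { return providerCalls; }
    };
}

// Usage: three "users" ask for the same topic; only the first reaches the provider.
var clock = 0;
var tier = createCachingTier(
    function (query) { return ['news about ' + query]; }, // stub provider
    5 * 60 * 1000,                                        // 5-minute expiration
    function () { return clock; }                         // injectable clock
);
tier.get('azure');
tier.get('azure');
tier.get('azure');
console.log(tier.providerCalls()); // 1

clock += 6 * 60 * 1000; // advance past the expiration
tier.get('azure');
console.log(tier.providerCalls()); // 2
```

The clock is injected only to make expiration easy to demonstrate; a real middle tier would use the current time and a shared store rather than process memory.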

 

Azure Mobile Services as a Caching Middle Tier

To demonstrate how to implement this strategy using AMS, I am going to use the Bing Search API to retrieve news about a particular topic and a Windows Store App as the client. First, let’s create an Azure Mobile Service and the table where we will store the cached data. This table will have a data structure equivalent to the result set from the Bing Search API.

 

1. Create a new Mobile Service in the Azure Portal

Go to the Azure Management Portal, and click on New and then select Compute, Mobile Service and click Create.

image 

 

2. Create a new Table with columns representing the data that the Bing Search API returns

Once your service is created, go to the mobile service and click on the Data tab and then click on Add a Table.

image

 

Set the permissions for your table to Only Scripts and Admins.

 

image

 

Next, let’s create the columns. When searching for news, the Bing API returns the following properties: ID, Title, Url, Source, Description and Date (for more information about the Bing Search API schema, see here). These are the columns we will be creating.

To add a column to the table, click on the name of your table and then click on the Columns tab and then on Add Column.

 

image

 

Repeat the same procedure for the other properties.

 

3. Create cache management columns

We need a way to uniquely identify the API call so when a user makes the same call, AMS can make the decision on whether to call the Bing API or return the cached data. We will use the actual search query and an expiration timestamp. Let’s call these columns searchquery and expirationdate. We’ll add these columns as well.

At the end, your table should look like this:

 

image
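
To make the role of the two cache management columns concrete, here is a small sketch of one cache row and the freshness check those columns enable. The helper names buildCacheRow and isFresh are hypothetical, and the row only carries a couple of the Bing properties to keep the example short.

```javascript
// Sketch of a cache row and its freshness test.
// buildCacheRow and isFresh are illustrative helpers, not AMS APIs.
function buildCacheRow(item, searchQuery, now, ttlMinutes) {
    return {
        title: item.Title,
        url: item.Url,
        searchquery: searchQuery, // identifies the API call
        expirationdate: new Date(now.getTime() + ttlMinutes * 60000) // when the row goes stale
    };
}

function isFresh(row, searchQuery, now) {
    return row.searchquery === searchQuery && row.expirationdate > now;
}

var cachedAt = new Date(Date.UTC(2014, 0, 1, 12, 0, 0));
var row = buildCacheRow({ Title: 'Example', Url: 'http://example.com' }, 'azure', cachedAt, 5);

console.log(isFresh(row, 'azure', new Date(Date.UTC(2014, 0, 1, 12, 4, 0)))); // true
console.log(isFresh(row, 'azure', new Date(Date.UTC(2014, 0, 1, 12, 6, 0)))); // false
console.log(isFresh(row, 'bing', new Date(Date.UTC(2014, 0, 1, 12, 4, 0))));  // false
```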

Handling Concurrency

*Warning: Geeky Stuff Ahead*

One of the challenges we need to tackle is what happens when two or more clients using the same parameter (search query) call the API at the same time. If the implementation doesn’t handle this scenario, there’s a risk of storing duplicate data in the cache table. We can avoid this situation by implementing a synchronization scheme with a locking mechanism: when we have two concurrent requests, the request that succeeds in acquiring the lock is the one that inserts the data into the cache and returns the data to the client, while the one that fails just returns the data to the client.

To implement this in a simple way, we can create another table where we store the search query (cache key) and add a unique constraint to that column. With the unique constraint in place, when concurrent calls arrive only one request will succeed in inserting the search query as a cache key into the table (i.e. acquiring the lock). This request will be responsible for inserting the cache data as well. The following image shows the schema of the cachekeys table.

 

image
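
The locking idea can be sketched without a database. In the snippet below, createKeyTable is a hypothetical in-memory stand-in for the cachekeys table; its insert method imitates the success/error callback style of AMS table operations and rejects a duplicate key the way a unique constraint would.

```javascript
// In-memory stand-in for the cachekeys table with a unique constraint on cachekey.
// createKeyTable is illustrative only; a real table would enforce this in SQL.
function createKeyTable() {
    var keys = Object.create(null);
    return {
        insert: function (item, callbacks) {
            if (keys[item.cachekey]) {
                // A database would reject the row with a unique-constraint violation.
                callbacks.error(new Error('Duplicate cache key: ' + item.cachekey));
            } else {
                keys[item.cachekey] = true;
                callbacks.success(item);
            }
        }
    };
}

var cacheKeys = createKeyTable();
var writers = [];

['request A', 'request B'].forEach(function (name) {
    cacheKeys.insert({ cachekey: 'azure' }, {
        // Lock acquired: this request is the one that writes the cache rows.
        success: function () { writers.push(name); },
        // Lock lost: another request owns the write, so only return the data.
        error: function () {}
    });
});

console.log(writers); // [ 'request A' ]
```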


Finally, to add a unique constraint to the cachekey column we can use Visual Studio or SQL Management Studio.

 

For more information see: Guidelines for Connecting to Windows SQL Azure Database and How to: Connect to a Database from Server Explorer and Primary and Unique Keys.

 

..and One Script to Rule Them All...

One of the cool features of AMS is the ability to create Custom APIs in addition to the read, insert, update and delete operations on a table.

Since we are only expecting one parameter from the client, it is more reasonable to implement the logic to call the API, manage the results and return the data to the client as a Custom API. And since we are not expecting direct read, insert, update or delete operations on the table, we set the permissions to Only Scripts and Admins for all these operations when we created the table.

To create a custom API, click on the API tab and then click on Create a Custom API.

 

image

 

Enter a name for the API and set the permission to Anybody with the Application Key for the Get method and Only Administrators for the rest.

 

image

 

The first thing we need to do in our Custom API is check for data in the cache. Depending on the results, we will either return the cached data or call the API. The script below shows the implementation of these two steps.

 

Note: Since we are using the Bing Search API, we are sending the application key in the basic authentication header as both the username and the password. Click here to sign up for the Bing Search API and get your application key.

 

exports.get = function(request, response) {

    var cacheTable = request.service.tables.getTable('bingnews');

    console.log('Search query: %s',request.query.searchquery);

    //Search by search query and expiration date.
    cacheTable.where(function(searchquery) {
        return this.searchquery == searchquery &&
            this.expirationdate > new Date();
    }, request.query.searchquery).read({
            success: getAPIData
        });

    function getAPIData(results) {

        if (results.length > 0) {
            // Cached records were found.
            console.log("Cache hit");
            console.log("Returning cached API data to the client");
            response.set('Content-Type', 'application/json');
            request.respond(statusCodes.OK, results);

        } else {
            console.log('No data found. Making an API call');

            var httpRequest = require('request');
            // Encode the query so spaces and reserved characters do not break the URL.
            var url = "https://api.datamarket.azure.com/Bing/Search/v1/News?$format=json&Query='" +
                encodeURIComponent(request.query.searchquery) + "'";

            console.log('API Url: %s', url);

            httpRequest.get({
                url: url,
                auth: "YOUR_BING_SEARCH_APP_KEY:YOUR_BING_SEARCH_APP_KEY"
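            }, function(err, apiResponse, body) {
                    // apiResponse is the reply from the API provider; it is named
                    // differently so it does not shadow the Custom API's response.

                    if (err) {
                        console.log('Error while attempting to call the API: %s', err);
                        request.respond(statusCodes.INTERNAL_SERVER_ERROR,
                            'Unable to connect to the API Provider.');
                    } else if (apiResponse.statusCode !== 200) {
                        // err is null in this branch, so log the HTTP status code instead.
                        console.log('API call failed with status code: %s', apiResponse.statusCode);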
                        request.respond(statusCodes.BAD_REQUEST,
                            'Bad request');
                    } else {
                        console.log('Received response from the API provider');
                        respondAndCacheData(JSON.parse(body).d.results);
                    }
                });
        }
    }
};
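
The request-building part of the script can also be looked at in isolation. The sketch below is one way to assemble the URL and the basic authentication header; buildBingNewsRequest is a hypothetical helper, not part of AMS or the Bing Search API, and it assumes the usual OData convention that a single quote inside a string literal is escaped by doubling it.

```javascript
// buildBingNewsRequest is an illustrative helper, not an AMS or Bing API.
function buildBingNewsRequest(searchQuery, appKey) {
    // OData string literals are wrapped in single quotes; a quote inside the
    // value is escaped by doubling it.
    var literal = "'" + String(searchQuery).replace(/'/g, "''") + "'";
    var url = 'https://api.datamarket.azure.com/Bing/Search/v1/News' +
        '?$format=json&Query=' + encodeURIComponent(literal);

    // HTTP basic authentication: base64("username:password"), with the
    // application key used for both parts.
    var credentials = Buffer.from(appKey + ':' + appKey).toString('base64');

    return { url: url, headers: { Authorization: 'Basic ' + credentials } };
}

var bingRequest = buildBingNewsRequest('azure & mobile', 'YOUR_BING_SEARCH_APP_KEY');
console.log(bingRequest.url);
// https://api.datamarket.azure.com/Bing/Search/v1/News?$format=json&Query='azure%20%26%20mobile'
```

Encoding matters here because an unencoded ampersand in the search query would otherwise be read as the start of a new query-string parameter.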

 

Next, let’s take a look at the code that handles the response back to the client and writes the cached data into the database.

 

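// Define this function inside exports.get so that it can use request, response and cacheTable.
function respondAndCacheData(resultsData) {
    // App settings defined for the mobile service; their values are read as strings.
    var config = require('mobileservice-config');
    var clientResults = [];
    var expirationDate = new Date();
    // Parse the setting so the minutes are added numerically rather than concatenated.
    expirationDate.setMinutes(expirationDate.getMinutes() +
        parseInt(config.appSettings.expirationMins, 10));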

    var cacheLockTable = request.service.tables.getTable('cachekeys');
    var cacheKeyItem = {
        cachekey: request.query.searchquery,
        cacheexpiration: expirationDate
    };

    resultsData.forEach(function (item) {
        var returnItem = {
            bingnewsid: item.ID,
            title: item.Title,
            description: item.Description,
            date: item.Date,
            source: item.Source,
            url: item.Url,
            expirationdate: expirationDate, // cache expiration, not the article date
            searchquery: request.query.searchquery
        };
        clientResults.push(returnItem);
    });

    console.log('Attempting to create a lock for search query: %s', request.query.searchquery);

    cacheLockTable.insert(cacheKeyItem, {
        error: function () {
            //If we are here that means that there was a race condition and the cache records
            //will be inserted by another request, so we just need return the API data...
            console.log('Inserting items into the cache table: false');
            expirationDate = null;
        },
        success: function (resultlock) {
            //If we are here that means we have the lock and will need to insert the cache items into the table                                      
            console.log('Inserting items into the cache table: true');
            console.log('Cache expiration date: %s', expirationDate);

            clientResults.forEach(function (item) {
                cacheTable.insert(item, {
                    error: function () {
                        console.log('Error occurred while writing to the cache table.');
                    }
                });
            })
        }
    });

    //Return data to the client
    console.log('Returning API data to the client');
    response.set('Content-Type', 'application/json');
    request.respond(statusCodes.OK, clientResults);
}
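
One detail in the expiration logic is easy to get wrong: if the expiration setting arrives as a string, adding it to a number with + concatenates instead of adding (30 + '5' gives '305'). The sketch below shows one defensive way to compute the expiration timestamp; computeExpiration and the fallback value are assumptions for illustration, not part of the script above.

```javascript
// computeExpiration is an illustrative helper; defaultMins is an assumed fallback.
function computeExpiration(now, expirationMinsSetting, defaultMins) {
    var minutes = parseInt(expirationMinsSetting, 10);
    if (isNaN(minutes) || minutes <= 0) {
        minutes = defaultMins; // fall back when the setting is missing or invalid
    }
    return new Date(now.getTime() + minutes * 60 * 1000);
}

var start = new Date(Date.UTC(2014, 0, 1, 12, 0, 0));
console.log(computeExpiration(start, '5', 5).toISOString());       // 2014-01-01T12:05:00.000Z
console.log(computeExpiration(start, undefined, 5).toISOString()); // 2014-01-01T12:05:00.000Z
```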

Understanding the Cost Benefits

The benefit of using Azure Mobile Services for this scenario comes from AMS’s pricing model. There are three pricing tiers for AMS: Free, Basic and Standard.

Let’s consider the Basic tier. With the Basic tier you get 1.5 million API calls per month for a fee that is significantly less than the cost of a similar number of transactions from many Open Data API providers. So by leveraging Azure Mobile Services as a middle tier that makes the API calls on behalf of the client application and also decides when to return cached data, you have the potential to get substantial savings. This also means that you can serve more users while making fewer API calls to the API provider.
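
To get a feel for the numbers, here is a back-of-the-envelope sketch. All figures are made up for illustration, the function names are hypothetical, and the cached estimate is only approximate: it assumes every distinct query is requested in every expiration window and ignores concurrent cache misses, which can each reach the provider.

```javascript
// Rough estimates only; the inputs below are illustrative, not measurements.
function estimateUncachedCalls(users, searchesPerUserPerDay, days) {
    // Without a shared cache, every search reaches the API provider.
    return users * searchesPerUserPerDay * days;
}

function estimateCachedCalls(distinctQueries, ttlMinutes, days) {
    // With a shared cache, each distinct query reaches the provider
    // about once per expiration window.
    var windowsPerQuery = Math.ceil((days * 24 * 60) / ttlMinutes);
    return distinctQueries * windowsPerQuery;
}

// 10,000 users, 5 searches a day each, 20 distinct topics, 5-minute expiration, 30 days.
var uncached = estimateUncachedCalls(10000, 5, 30);
var cached = estimateCachedCalls(20, 5, 30);

console.log(uncached); // 1500000
console.log(cached);   // 172800
```

Note that the calls from the clients to the mobile service are not reduced by the cache; what shrinks is the number of calls that the middle tier forwards to the API provider.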

 

Note: The Free tier also provides a significant amount of functionality for dev and test scenarios. It provides you with up to 500K API calls per month. For more info see the pricing calculator.

Client Application

To showcase this approach I created a simple Windows Store App using the Azure Mobile Service SDK. The app displays the results in a GridView. You can download the app from here.

 

 

image

 

After configuring the App Key and the URI for your mobile service, you should be able to use the app. If you call the same query a few times within 5 minutes, you will be hitting the cache. You can see this in the AMS log.
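
For readers working from a JavaScript client rather than the Windows Store App, the call to the Custom API might look like the sketch below. It assumes the AMS JavaScript client's invokeApi(name, options) method and a Custom API named bingnews; that name is hypothetical, since it depends on what you entered when creating the API. A stub client is used so the snippet runs on its own.

```javascript
// searchNews wraps the Custom API call; 'bingnews' is an assumed API name.
function searchNews(client, searchQuery) {
    return client.invokeApi('bingnews', {
        method: 'get',
        parameters: { searchquery: searchQuery } // read as request.query.searchquery on the server
    });
}

// Stub standing in for a real mobile service client, for illustration only.
var calls = [];
var stubClient = {
    invokeApi: function (name, options) {
        calls.push({ name: name, options: options });
        return Promise.resolve({ result: [] });
    }
};

searchNews(stubClient, 'azure').then(function (apiResult) {
    console.log(apiResult.result.length + ' items returned');
});
```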

 

Final Thoughts

The inspiration for this post came from a real scenario that I faced while developing a mobile application using a third-party REST API. But as I was writing this blog post and working on the technical assets, I realized that this approach might be beneficial for other scenarios. For example, you can adapt this approach to create an Internet-facing version of your API for a LOB application hosted on-premises, or to create a central API gateway where you can monitor the runtime behavior of your application using Azure built-in analytics.

Finally, it's important that you check with your API provider to see if they allow data caching in their terms and conditions.

In my next post, I will discuss how you can use the AMS Scheduler to purge the expired cache items from the database.

I hope you find this post useful. Let me know your thoughts!

