How to implement full-text search in your mobile application using Azure Mobile Services

Search is one of those features that users expect but that can be challenging to implement in mobile applications. In this blog post I will show how you can implement full-text search for your mobile application using Azure Mobile Services with a .NET backend.

The approach is quite simple and leverages some of the cool features of Azure Mobile Services, such as custom APIs and scheduled jobs. I will be using Lucene.Net, a stable port of the popular Java-based Lucene search library. To simplify the implementation, I’m also using a really handy component, the AzureDirectory library for Lucene.Net, which makes it super easy to create and manage Lucene’s search index in Azure storage.

At a high level, the REST endpoints that perform the table operations (insert, update, and delete) also add each change to a cloud queue. A scheduled job “listens” for these changes by reading from the same queue and updates the search index accordingly. Finally, we expose the search functionality as a custom API that uses Lucene to perform the search and returns the results to the client.

I am sold, let’s do it!

1. First, let’s create a new Azure Mobile Service from Visual Studio 2013.

This creates a new solution containing an Azure Mobile Service project based on Web API. The project is a fully functional Mobile Service that exposes a “to-do” list, so we will index its data and implement search over it.

Note: You will need Visual Studio 2013 with Update 2 RC and the Azure SDK 2.3.

2. Next, install the AzureDirectory Library for Lucene.Net from NuGet.

3. Queue changes from the table operations.

We want to update the search index whenever an item is updated, inserted, or deleted. These operations correspond to the PATCH, POST, and DELETE methods of our Mobile Service’s REST API, respectively. When a client invokes any of them, the API creates a message containing the data item and adds it to the queue. The following code shows the implementation of the relevant API operations in the TodoItemController.

Note: The AzureDirectory library requires cloud storage, so you will need to create a storage account and provide its connection string via a configuration entry (the code in this post reads a setting named blobStorage). When debugging on your local computer, you can put this setting in Web.config. Once deployed to Azure, you can set it in the configuration section of your Mobile Service in the Azure portal.

We will be using the same storage account for the cloud queue.

For more information, see How To Create a Storage Account.

        // PATCH tables/TodoItem/48D68C86-6EA6-4C25-AA33-223FC9A27959
        public async Task<TodoItem> PatchTodoItem(string id, Delta<TodoItem> patch)
        {
            TodoItem update = await UpdateAsync(id, patch);

            // Queue the updated item for the indexer job. The second argument
            // is the message time-to-live (24 hours).
            var msg = new CloudQueueMessage(JsonConvert.SerializeObject(update));
            _qref.AddMessage(msg, new System.TimeSpan(24, 0, 0));

            return update;
        }

        // POST tables/TodoItem
        public async Task<IHttpActionResult> PostTodoItem(TodoItem item)
        {
            TodoItem current = await InsertAsync(item);

            // Queue the newly inserted item for the indexer job.
            var msg = new CloudQueueMessage(JsonConvert.SerializeObject(current));
            _qref.AddMessage(msg, new System.TimeSpan(24, 0, 0));

            return CreatedAtRoute("Tables", new { id = current.Id }, current);
        }

        // DELETE tables/TodoItem/48D68C86-6EA6-4C25-AA33-223FC9A27959
        public async Task DeleteTodoItem(string id)
        {
            // Delete the row first so that no message is queued if the delete fails.
            await DeleteAsync(id);

            // Queue a tombstone (Deleted = true) so the indexer removes the document.
            var msg = new CloudQueueMessage(JsonConvert.SerializeObject(new TodoItem { Id = id, Text = string.Empty, Deleted = true }));
            _qref.AddMessage(msg, new System.TimeSpan(24, 0, 0));
        }

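Finally, here is the code that initializes the private field in TodoItemController that references the queue.

    // Also requires the project's own namespaces for TodoItem and AMSSearchContext.
    using System.Threading.Tasks;
    using System.Web.Http;
    using System.Web.Http.Controllers;
    using System.Web.Http.OData;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.Mobile.Service;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Queue;
    using Newtonsoft.Json;

    public class TodoItemController : TableController<TodoItem>
    {
        private CloudQueue _qref;

        protected override void Initialize(HttpControllerContext controllerContext)
        {
            base.Initialize(controllerContext);
            AMSSearchContext context = new AMSSearchContext(Services.Settings.Schema);
            DomainManager = new EntityDomainManager<TodoItem>(context, Request, Services);

            // TryParse sets its out parameter to null on failure, so fall back
            // to the storage emulator explicitly.
            CloudStorageAccount cloudStorageAccount;
            if (!CloudStorageAccount.TryParse(CloudConfigurationManager.GetSetting("blobStorage"), out cloudStorageAccount))
            {
                cloudStorageAccount = CloudStorageAccount.DevelopmentStorageAccount;
            }

            var qClient = cloudStorageAccount.CreateCloudQueueClient();
            _qref = qClient.GetQueueReference("searchqueue");
            _qref.CreateIfNotExists();
        }

        // ...
    }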

4. Create the indexer job.

The indexer job reads messages from the queue and updates the index according to the type of changes made to the data.

When you are using .NET as the backend for Azure Mobile Services, you implement a scheduled job by creating a class derived from Microsoft.WindowsAzure.Mobile.Service.ScheduledJob and overriding the method ExecuteAsync() with your job’s details. You must place your class in the ScheduledJobs project folder.

The following code is an implementation of ExecuteAsync() for the indexer job. It reads messages from the queue and, for each one, inserts, updates, or deletes the corresponding document in the search index.

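    // Also requires the project's own namespace for TodoItem.
    using System;
    using System.Threading.Tasks;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.Store.Azure;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.Mobile.Service;
    using Microsoft.WindowsAzure.Storage;
    using Newtonsoft.Json;

    public class IndexerJob : ScheduledJob
    {
        public override Task ExecuteAsync()
        {
            // TryParse sets its out parameter to null on failure, so fall back
            // to the storage emulator explicitly.
            CloudStorageAccount cloudStorageAccount;
            if (!CloudStorageAccount.TryParse(CloudConfigurationManager.GetSetting("blobStorage"), out cloudStorageAccount))
            {
                cloudStorageAccount = CloudStorageAccount.DevelopmentStorageAccount;
            }

            var qClient = cloudStorageAccount.CreateCloudQueueClient();
            var q = qClient.GetQueueReference("searchqueue");

            if (!q.Exists())
            {
                return Task.FromResult(true);
            }

            var azureDirectory = new AzureDirectory(cloudStorageAccount, "AMSCatalog");
            var luceneAnalyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

            // This constructor creates the index only if it does not exist yet.
            // Passing create = true would wipe the existing index on every run.
            using (var indexWriter = new IndexWriter(azureDirectory, luceneAnalyzer, new IndexWriter.MaxFieldLength(IndexWriter.DEFAULT_MAX_FIELD_LENGTH)))
            {
                foreach (var msg in q.GetMessages(20, TimeSpan.FromMinutes(5)))
                {
                    var item = JsonConvert.DeserializeObject<TodoItem>(msg.AsString);

                    // The id is indexed as a single, untokenized term (NOT_ANALYZED below),
                    // so this term matches exactly one document.
                    var idTerm = new Term("id", item.Id);

                    // Delete case...
                    if (item.Deleted)
                    {
                        indexWriter.DeleteDocuments(idTerm);
                    }
                    else
                    {
                        var doc = new Document();
                        doc.Add(new Field("id", item.Id, Field.Store.YES, Field.Index.NOT_ANALYZED, Field.TermVector.NO));
                        doc.Add(new Field("title", item.Text ?? string.Empty, Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));

                        // Insert case: a new item has identical created and updated timestamps.
                        if (item.CreatedAt.ToString() == item.UpdatedAt.ToString())
                        {
                            indexWriter.AddDocument(doc);
                        }
                        // Update case: replace the existing document with the same id.
                        else
                        {
                            indexWriter.UpdateDocument(idTerm, doc);
                        }
                    }

                    q.DeleteMessage(msg);
                }
            }

            return Task.FromResult(true);
        }
    }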
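
The job above tells inserts from updates by comparing the CreatedAt and UpdatedAt timestamps. One possible alternative, shown here only as a sketch, is to put the kind of change in the queue message itself. The types ChangeKind, IndexChange, and ChangeApplier are hypothetical and not part of the original project, and a dictionary stands in for the Lucene index so that the example runs on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; they are not part of the original project.
public enum ChangeKind { Upsert, Delete }

public sealed class IndexChange
{
    public ChangeKind Kind { get; set; }
    public string Id { get; set; }
    public string Text { get; set; }
}

public static class ChangeApplier
{
    // Applies one queued change to an in-memory stand-in for the search index.
    // In the real job, Upsert would map to IndexWriter.UpdateDocument and
    // Delete to IndexWriter.DeleteDocuments, both keyed on the "id" term.
    public static void Apply(IDictionary<string, string> index, IndexChange change)
    {
        if (index == null) throw new ArgumentNullException("index");
        if (change == null) throw new ArgumentNullException("change");

        switch (change.Kind)
        {
            case ChangeKind.Delete:
                index.Remove(change.Id); // no-op if the id is not indexed
                break;
            case ChangeKind.Upsert:
                index[change.Id] = change.Text ?? string.Empty;
                break;
        }
    }
}
```

Treating inserts and updates as a single "upsert" means that a message processed twice (for example, after its visibility timeout expires) does not create a duplicate document.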

 

 

5. Create the search endpoint.

At this point we have table operations that submit changes through a cloud queue and a job that applies those changes to the index. Next, we need an endpoint that the client can call to run a search and get the results.

In Azure Mobile Services with a .NET backend, custom APIs are implemented as custom controllers. To create one, right-click the Controllers folder and select Add, Controller…

The following code shows the implementation of an endpoint that runs the search and returns the results as a list of key-value pairs (document id and title).

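    using System.Collections.Generic;
    using System.Web.Http;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;
    using Lucene.Net.Store.Azure;
    using Microsoft.WindowsAzure;
    using Microsoft.WindowsAzure.Mobile.Service;
    using Microsoft.WindowsAzure.Storage;

    public class SearchController : ApiController
    {
        public ApiServices Services { get; set; }

        // GET api/Search?searchQuery=...
        public IEnumerable<KeyValuePair<string, string>> Get(string searchQuery)
        {
            var results = new List<KeyValuePair<string, string>>();

            // Nothing to search for: return an empty result instead of failing in the parser.
            if (string.IsNullOrWhiteSpace(searchQuery))
            {
                return results;
            }

            // TryParse sets its out parameter to null on failure, so fall back
            // to the storage emulator explicitly.
            CloudStorageAccount cloudStorageAccount;
            if (!CloudStorageAccount.TryParse(CloudConfigurationManager.GetSetting("blobStorage"), out cloudStorageAccount))
            {
                cloudStorageAccount = CloudStorageAccount.DevelopmentStorageAccount;
            }

            var azureDirectory = new AzureDirectory(cloudStorageAccount, "AMSCatalog");

            using (var searcher = new IndexSearcher(azureDirectory))
            {
                // Search the "title" field by default, using the same analyzer as the indexer.
                var analyzer = new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
                QueryParser parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_30, "title", analyzer);
                Query query = parser.Parse(searchQuery);

                // Return at most the 10 best matches.
                var hits = searcher.Search(query, 10);

                foreach (var hit in hits.ScoreDocs)
                {
                    var doc = searcher.Doc(hit.Doc);

                    results.Add(new KeyValuePair<string, string>(doc.Get("id"), doc.Get("title")));
                }
            }

            return results;
        }
    }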
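
To call this endpoint from a client, the search text has to be passed in the searchQuery query-string parameter. The following is a minimal sketch of how a client might build that URL; SearchRequest is a hypothetical helper, not part of the Mobile Services SDK, and the base address in the usage note is only an example.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Mobile Services SDK.
public static class SearchRequest
{
    // Builds the URL for the custom search API. The "searchQuery" parameter
    // name matches the Get(string searchQuery) action of the SearchController.
    public static Uri Build(Uri serviceBaseUri, string searchText)
    {
        if (serviceBaseUri == null) throw new ArgumentNullException("serviceBaseUri");
        if (string.IsNullOrWhiteSpace(searchText)) throw new ArgumentException("Search text is required.", "searchText");

        // Escape the text so that spaces and query operators survive the query string.
        string relative = "api/Search?searchQuery=" + Uri.EscapeDataString(searchText);
        return new Uri(serviceBaseUri, relative);
    }
}
```

For example, SearchRequest.Build(new Uri("http://localhost:59000/"), "buy milk") yields a URI that can be passed to HttpClient.GetAsync. Depending on the authorization level of the API, a deployed service may also expect the application key in the X-ZUMO-APPLICATION request header.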

6. Run and test locally.

One of the benefits of using .NET as the backend for your Mobile Service is that you can run your code locally. At this point, you should be able to press F5 and see your project running!

But the goodies don’t stop here! You can try your mobile service without an app: just click the “try it out” link on the web page to get a page listing all the APIs in your service. After clicking on a specific API, you can call it directly from your browser!

Note: To insert a new data entity in Azure Mobile Services, in most cases, you don’t need to provide values for the id and all the control columns that start with “_”. The Azure Mobile Services’ backend handles these for you.

Testing your scheduled job locally is a similar experience.

Once the job has processed the queued changes, you should be able to call the custom search API and get the search results.

7. Deploy

Before deploying your application to Azure, you need to create your Azure Mobile Service, either in the portal or from Visual Studio. Once your service is ready, you can publish your project directly from Visual Studio.

As the last two steps, make sure that you add a configuration entry for the storage connection string in the Azure portal and that you set up your scheduled job.

Note: The name of the scheduled job must match the name of the class that implements it. Because of the naming-convention-based routing that is fundamental to Web API, the suffix “Job” is not considered part of the name.

In this example, I named the class IndexerJob, so the name of my scheduled job in the Azure portal should be Indexer.
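
The naming rule can be sketched in a few lines of code. This only illustrates the convention described in the note; it is not the framework's own implementation, and JobNames is a hypothetical helper.

```csharp
using System;

// Hypothetical helper illustrating the naming convention; not framework code.
public static class JobNames
{
    // Returns the name to register in the portal for a job class:
    // the class name without a trailing "Job".
    public static string FromClassName(string className)
    {
        if (className == null) throw new ArgumentNullException("className");

        const string suffix = "Job";
        return className.Length > suffix.Length && className.EndsWith(suffix, StringComparison.Ordinal)
            ? className.Substring(0, className.Length - suffix.Length)
            : className;
    }
}
```

With this helper, JobNames.FromClassName("IndexerJob") returns "Indexer", which is the name used for the scheduled job in this post.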

 

Wrapping up

In this blog post I covered how you can implement full-text search using Azure Mobile Services with a .NET backend. I hope this information is helpful; please let me know your thoughts!

