Wednesday, April 22, 2020

Migrating your Mailbox searches in EWS to the Graph API Part 2: KQL and new search endpoints

This is part 2 of my blog post on migrating EWS Search to the Graph API. In this part I'm going to be looking at KQL searches and at the new Microsoft Search API (currently in Beta). The big advantage these types of searches have over SearchFilters is that they use the content indexes, which can improve the performance of searches when folder item counts get high. They also allow you to query the contents of attachments, which are indexed through iFilters on the server.

KQL queries on the Mailbox and Mailbox Folders

In EWS you have been able to use first AQS and now KQL in the FindItems operation from Exchange 2013 up. Migrating these searches to the Microsoft Graph is pretty simple, eg an EWS FindItems query to search for all messages with a pdf attachment

FindItemsResults<Item> fiItems = service.FindItems(QueryFolder, "Attachmentnames:.pdf", iv);

In the Graph you would use something like 'Inbox')/messages
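As a rough illustration of what such a request could look like, here is a minimal Python sketch that only builds the URL and doesn't call the service. The service root, the /me prefix and the helper name folder_search_url are my assumptions and aren't taken from the post; the KQL string is the same one used in the EWS example above.

```python
from urllib.parse import quote

# Assumed service root; adjust for your tenant or API version.
GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def folder_search_url(folder_name: str, kql: str) -> str:
    """Build a GET URL that runs a KQL $search against one mail folder.

    Hypothetical helper: it only formats the string, it sends nothing.
    """
    # The $search value is wrapped in double quotes and then percent-encoded.
    search_value = quote(f'"{kql}"', safe="")
    return f"{GRAPH_ROOT}/me/mailFolders('{folder_name}')/messages?$search={search_value}"


url = folder_search_url("Inbox", "Attachmentnames:.pdf")
print(url)
```

You would still need to send this with an authenticated HTTP client of your choice, which is left out here.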

The slightly disappointing thing with the Graph is that you can't use count along with a search. When you're doing statistical type queries, eg say I wanted to know how many emails that were received in 2019 had a pdf attachment, this is very painful to do in the Graph, whereas in EWS it can be done with one call (it's a real snowball that one).
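One workaround is to page through the search results and count them client-side. The sketch below assumes each response page is a JSON object with a "value" array and an optional "@odata.nextLink" pointing to the next page; get_json is a hypothetical callable you supply that does the authenticated GET and returns the parsed JSON. The fake pages are only there so the sketch runs without a mailbox.

```python
from typing import Callable, Optional


def count_search_results(first_url: str, get_json: Callable[[str], dict]) -> int:
    """Count items by walking every page of a search result.

    get_json is a hypothetical callable: URL in, parsed JSON page out.
    """
    total = 0
    url: Optional[str] = first_url
    while url:
        page = get_json(url)
        total += len(page.get("value", []))
        # A missing nextLink means this was the last page.
        url = page.get("@odata.nextLink")
    return total


# Stand-in for the service: two fake pages keyed by made-up URLs.
fake_pages = {
    "page1": {"value": [{"id": "a"}, {"id": "b"}], "@odata.nextLink": "page2"},
    "page2": {"value": [{"id": "c"}]},
}
print(count_search_results("page1", fake_pages.__getitem__))  # 3
```

This costs one request per page, which is exactly why it is so much more painful than the single EWS call.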

Searching the recipient fields like To and CC: in the forums you see some absolute clangers of search filters that try to search the recipient and from fields of messages. This can easily be done using the participants keyword, which includes all the people fields in an email message. These fields are From, To and Cc. The one thing to be aware of is the note on expansion in the documentation. If you don't want expansion to happen, you need to ensure you use the wildcard character after the participant you're searching for. A simple participants query looks like 'Inbox')/messages?
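To make the wildcard point concrete, here is a small Python sketch that builds the participants clause and appends the wildcard unless you explicitly ask for expansion. The helper name participants_clause and the address fred@contoso.com are made up for illustration, and I'm assuming the wildcard is a trailing asterisk.

```python
def participants_clause(person: str, allow_expansion: bool = False) -> str:
    """Return a KQL participants clause.

    Hypothetical helper: a trailing '*' is added unless expansion is wanted.
    """
    value = person if allow_expansion else f"{person}*"
    return f"participants:{value}"


clause = participants_clause("fred@contoso.com")
search_param = '$search="' + clause + '"'
print(search_param)  # $search="participants:fred@contoso.com*"
```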

Date range queries

One of the good things about KQL with dates is that you can use reserved keywords like today, yesterday and this week, eg 'Inbox')

To get all the messages received between two dates you can use either 'Inbox')/messages?

$search="(received>=2019-01-01 AND received<=2019-02-01)"
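If the dates come from code rather than being typed by hand, a small helper keeps the format consistent. This is a sketch only: received_between is a hypothetical name, and it simply reproduces the date-range syntax shown above from two Python date values.

```python
from datetime import date


def received_between(start: date, end: date) -> str:
    """Build a KQL clause matching messages received from start to end inclusive."""
    if start > end:
        raise ValueError("start must not be after end")
    # date.isoformat() gives YYYY-MM-DD, the format used in the query above.
    return f"(received>={start.isoformat()} AND received<={end.isoformat()})"


kql = received_between(date(2019, 1, 1), date(2019, 2, 1))
print(kql)  # (received>=2019-01-01 AND received<=2019-02-01)
```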

If you want to search the whole of the Mailbox using the Graph, eg if you have used the AllItems Search Folder in EWS to do a search that spans all the MailFolders in a Mailbox, in the Graph you just need to use the /messages endpoint, eg
$search="(received>=2019-01-01 AND received<=2019-02-01)"
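The only thing that changes between a folder search and a mailbox-wide search is the path, which this Python sketch tries to show. The service root, the /me prefix and the helper name messages_search_url are assumptions on my part; leaving the folder out targets the mailbox-wide messages endpoint.

```python
from typing import Optional
from urllib.parse import quote

# Assumed root for the signed-in user's mailbox.
ME_ROOT = "https://graph.microsoft.com/v1.0/me"


def messages_search_url(kql: str, folder_name: Optional[str] = None) -> str:
    """Build a $search URL scoped to one folder, or to the whole mailbox if no folder is given."""
    scope = f"/mailFolders('{folder_name}')" if folder_name else ""
    search_value = quote(f'"{kql}"', safe="")
    return f"{ME_ROOT}{scope}/messages?$search={search_value}"


date_range = "(received>=2019-01-01 AND received<=2019-02-01)"
print(messages_search_url(date_range))           # whole mailbox
print(messages_search_url(date_range, "Inbox"))  # Inbox only
```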

New Search Methods

The traditional search methods in EWS give you the normal narrow refiner search outputs that most mail apps have been providing over the past 10-20 years. While these methods have improved over the years, there haven't been any really great leaps in functionality with Search. So the Microsoft Graph has been adding some newer endpoints that do allow a more modern approach to searching. The first is Microsoft Graph data connect, which has been around for a while now, and the second is the Microsoft Search API, which is still in Beta. As this article is about migrating EWS searches, you probably wouldn't consider either of these for your traditional search migration, as $filter and $search are going to meet those needs. However, if you are looking at overhauling the search functionality in your application or you are building greenfield functionality, then both of these new methods are worth consideration.

Graph Data connect is your go-to endpoint when you want to do any mass processing of Mailbox data. It solves the problem of having to crawl every item in a Mailbox when you want to do any data-mining type operations by basically providing an Azure dataset of this information for you. Data connect is great, however it has a high entry level. First you need a Workplace Analytics licence for every mailbox you wish to analyse, and the costs can mount pretty quickly the larger the Mailbox count you're dealing with. The other requirement is paying for the underlying Azure Storage etc that your dataset ends up consuming. I think it is a bit of a shame that the licencing costs can lock a lot of developers out of using this feature, as it really does provide a great way of working with Mail item data. It leaves some having to resort to doing their own crawling of Mailbox data to avoid these costs (eg that licencing cost is a pretty hard sell for any startup looking to use this).

Microsoft Search API

This is the newest way of searching mailbox data. The underlying mechanism for doing mailbox searches is still KQL, so it's very similar to the $search method described above, but this API does enhance the search results with some more "Search Intelligence" like relevance, bringing AI into the picture. One of the other main benefits of this endpoint is when you want to broaden your search to other Office 365 workflows or even include your own custom data searches. So this really is the endpoint that will provide you with a modern search experience/workflow, which is getting more critical due to the sheer amount of data we have (eg the datageddon). It's still in beta and is a little restricted at the moment, eg

  • It can't be used to search delegate Mailboxes, so only the primary mailbox can be searched
  • It only returns the pageCount for items, not the total number of items found in a search (to be fair $search does this as well, which is really annoying)
  • Searches are scoped across the entire mailbox 
  • Just Messages and Events are searchable at the moment
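For a feel of how a query to this endpoint is put together, here is a Python sketch that only builds the JSON request body. I'm assuming the shape used by the search/query action as it is documented outside of beta (a "requests" array with entityTypes, a query containing queryString, plus from and size for paging). The beta property names may differ, so check the current reference before relying on it. The helper name build_search_request is hypothetical.

```python
import json
from typing import Sequence


def build_search_request(kql: str,
                         entity_types: Sequence[str] = ("message", "event"),
                         start: int = 0,
                         size: int = 25) -> dict:
    """Build an assumed request body to POST to the search/query endpoint.

    Hypothetical helper: the property names are an assumption, see the lead-in.
    """
    return {
        "requests": [
            {
                "entityTypes": list(entity_types),  # messages and events, per the list above
                "query": {"queryString": kql},      # still a KQL string underneath
                "from": start,                      # paging offset
                "size": size,                       # page size
            }
        ]
    }


body = build_search_request("Attachmentnames:.pdf")
print(json.dumps(body, indent=2))
```

The from and size values are what you would step through to page, since only the page count comes back rather than a total.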