Thursday, October 10, 2013

Parsing out URLs in the body of a Message with EWS and Powershell

Sometimes when you're writing an automation script you might want to parse a certain URL from the Body of a Message. One example would be the Lync Meeting URL from an Online Meeting invitation; another might be a DropBox URL for a shared file.

To grab the Body of a Message in the EWS Managed API you need to use either Load() or, if you have a number of messages you're processing, LoadPropertiesForItems(). Using these methods will do a GetItem (or batch GetItem) in EWS.
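For a single message, the Load() route could be sketched roughly as below. This is only an illustration under a few assumptions: the Managed API dll has already been loaded with Add-Type, $Item is an item returned by an earlier FindItems call, and Get-ItemHtmlBody is a hypothetical helper name that isn't part of the API.

```powershell
# Sketch only: assumes the EWS Managed API dll is already loaded with Add-Type
# and that $Item came back from a FindItems call on a connected ExchangeService.
# Get-ItemHtmlBody is a hypothetical helper name used for illustration.
function Get-ItemHtmlBody {
    param(
        $Item   # Microsoft.Exchange.WebServices.Data.Item
    )
    # Request just the Id and the Body rather than all first class properties
    $propSet = New-Object Microsoft.Exchange.WebServices.Data.PropertySet([Microsoft.Exchange.WebServices.Data.BasePropertySet]::IdOnly)
    $propSet.Add([Microsoft.Exchange.WebServices.Data.ItemSchema]::Body)
    # Ask for the HTML version of the body so the links are still in the markup
    $propSet.RequestedBodyType = [Microsoft.Exchange.WebServices.Data.BodyType]::HTML

    # Load() makes one GetItem request for this item
    $Item.Load($propSet)
    $Item.Body.Text
}
```

Calling this once per message means one GetItem request per message, which is why LoadPropertiesForItems() is the better fit when you have a batch to process.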

To parse the URLs from the HTML body markup that EWS returns you can use some RegEx to separate out all the links. Then you can use the URI class from .NET to parse the matches further and identify the host in each URL to see if it's the URL you're looking for. The following sample will loop through the last 100 emails in a mailbox's Inbox and parse any Lync Meeting URLs (for Office365) or Dropbox URLs.

I've put a download of this script here

The code looks like

    ## Get the Mailbox to Access from the 1st commandline argument

    $MailboxName = $args[0]

    ## Load Managed API dll
    Add-Type -Path "C:\Program Files\Microsoft\Exchange\Web Services\2.0\Microsoft.Exchange.WebServices.dll"

    ## Set Exchange Version
    $ExchangeVersion = [Microsoft.Exchange.WebServices.Data.ExchangeVersion]::Exchange2010_SP2

    ## Create Exchange Service Object
    $service = New-Object Microsoft.Exchange.WebServices.Data.ExchangeService($ExchangeVersion)

    ## Set Credentials to use. Two options are available: Option 1 to use explicit credentials or Option 2 to use the Default (logged On) credentials

    #Credentials Option 1 using UPN for the windows Account
    $psCred = Get-Credential
    $creds = New-Object System.Net.NetworkCredential($psCred.UserName.ToString(),$psCred.GetNetworkCredential().password.ToString())
    $service.Credentials = $creds

    #Credentials Option 2
    #$service.UseDefaultCredentials = $true

    ## Choose to ignore any SSL Warning issues caused by Self Signed Certificates

    ## Code From http://poshcode.org/624
    ## Create a compilation environment
    $Provider=New-Object Microsoft.CSharp.CSharpCodeProvider
    $Compiler=$Provider.CreateCompiler()
    $Params=New-Object System.CodeDom.Compiler.CompilerParameters
    $Params.GenerateExecutable=$False
    $Params.GenerateInMemory=$True
    $Params.IncludeDebugInformation=$False
    $Params.ReferencedAssemblies.Add("System.DLL") | Out-Null

    $TASource=@'
      namespace Local.ToolkitExtensions.Net.CertificatePolicy{
        public class TrustAll : System.Net.ICertificatePolicy {
          public TrustAll() {
          }
          public bool CheckValidationResult(System.Net.ServicePoint sp,
            System.Security.Cryptography.X509Certificates.X509Certificate cert,
            System.Net.WebRequest req, int problem) {
            return true;
          }
        }
      }
    '@
    $TAResults=$Provider.CompileAssemblyFromSource($Params,$TASource)
    $TAAssembly=$TAResults.CompiledAssembly

    ## We now create an instance of the TrustAll and attach it to the ServicePointManager
    $TrustAll=$TAAssembly.CreateInstance("Local.ToolkitExtensions.Net.CertificatePolicy.TrustAll")
    [System.Net.ServicePointManager]::CertificatePolicy=$TrustAll

    ## end code from http://poshcode.org/624

    ## Set the URL of the CAS (Client Access Server) to use. Two options are available: use Autodiscover to find the CAS URL or Hardcode the CAS to use

    #CAS URL Option 1 Autodiscover
    $service.AutodiscoverUrl($MailboxName,{$true})
    "Using CAS Server : " + $Service.url

    #CAS URL Option 2 Hardcoded

    #$uri=[system.URI] "https://casservername/ews/exchange.asmx"
    #$service.Url = $uri

    ## Property set used to load the first class properties (including the Body) of each Item
    $psPropset= new-object Microsoft.Exchange.WebServices.Data.PropertySet([Microsoft.Exchange.WebServices.Data.BasePropertySet]::FirstClassProperties)
    # Bind to the Inbox Folder
    $folderid= new-object Microsoft.Exchange.WebServices.Data.FolderId([Microsoft.Exchange.WebServices.Data.WellKnownFolderName]::Inbox,$MailboxName)
    $Inbox = [Microsoft.Exchange.WebServices.Data.Folder]::Bind($service,$folderid)

    #Define ItemView to retrieve just the last 100 Items
    $ivItemView =  New-Object Microsoft.Exchange.WebServices.Data.ItemView(100)
    $fiItems = $service.FindItems($Inbox.Id,$ivItemView)
    #LoadPropertiesForItems fails on an empty collection so only call it if something was found
    if($fiItems.Items.Count -gt 0){
        [Void]$service.LoadPropertiesForItems($fiItems,$psPropset)
    }
    foreach($Item in $fiItems.Items){
        #Process Item
        "Processing : " + $Item.Subject
        #Tracks the URLs already reported for this Item
        $dupChk = @{}
        $RegExHtmlLinks = "<a href=\`"(.*?)\`">"
        #Body.Text holds the HTML markup of the Message body
        $matchedItems = [regex]::matches($Item.Body.Text, $RegExHtmlLinks,[system.Text.RegularExpressions.RegexOptions]::Singleline)
        foreach($Match in $matchedItems){
            #Splitting on the quote leaves the URL in the second element
            $SplitVal = $Match.Value.Split('"')
            if($SplitVal.Count -gt 1){
                $ParsedURI=[system.URI]$SplitVal[1]
                if($ParsedURI.Host -eq "meet.lync.com"){
                    if(!$dupChk.Contains($ParsedURI.AbsoluteUri)){
                        #Parentheses are needed so the concatenated string is passed as one argument
                        Write-Host -ForegroundColor Green ("LyncURL     : " + $ParsedURI.AbsoluteUri)
                        $dupChk.add($ParsedURI.AbsoluteUri,0)
                    }
                }
                if($ParsedURI.Host -eq "www.dropbox.com"){
                    if(!$dupChk.Contains($ParsedURI.AbsoluteUri)){
                        Write-Host -ForegroundColor Blue ("DropBox    : " + $ParsedURI.AbsoluteUri)
                        $dupChk.add($ParsedURI.AbsoluteUri,0)
                    }
                }
            }
        }
    }
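
If you want to try the link parsing on its own without a mailbox, something like the following rough sketch should work. The sample HTML, the contoso URLs and the Get-LinksForHost function name are all made up for illustration. It also swaps in a slightly looser RegEx than the script above (it allows other attributes before the href) and uses [System.Uri]::TryCreate so a malformed link is skipped rather than throwing.

```powershell
# Hypothetical sample body; in the script above this would be $Item.Body.Text
$sampleHtml = @'
<html><body>
<p><a href="https://meet.lync.com/contoso/user/ABC123">Join Lync Meeting</a></p>
<p><a class="link" href="https://www.dropbox.com/s/abc123/report.pdf">Shared file</a></p>
<p><a href="https://meet.lync.com/contoso/user/ABC123">Join again</a></p>
<p><a href="http://example.com/page">Other link</a></p>
</body></html>
'@

# Get-LinksForHost is a hypothetical helper name used for illustration.
# It returns each distinct absolute URL whose host is in $HostNames.
function Get-LinksForHost {
    param(
        [string]$Html,
        [string[]]$HostNames
    )
    $seen = @{}
    # Capture the href value of each anchor tag, allowing other attributes before it
    $pattern = '<a\s[^>]*?href="([^"]*)"'
    $options = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
    foreach ($match in [regex]::Matches($Html, $pattern, $options)) {
        # Attribute values are HTML encoded (eg &amp;) so decode before parsing
        $rawUrl = [System.Net.WebUtility]::HtmlDecode($match.Groups[1].Value)
        $uri = $null
        # TryCreate returns $false instead of throwing on a relative or malformed URL
        $ok = [System.Uri]::TryCreate($rawUrl, [System.UriKind]::Absolute, [ref]$uri)
        if ($ok -and ($HostNames -contains $uri.Host) -and -not $seen.ContainsKey($uri.AbsoluteUri)) {
            $seen[$uri.AbsoluteUri] = $true
            $uri.AbsoluteUri
        }
    }
}

$links = @(Get-LinksForHost -Html $sampleHtml -HostNames 'meet.lync.com', 'www.dropbox.com')
$links
```

With the sample above, the duplicate Lync link and the example.com link are dropped, so only the Lync and Dropbox URLs are written out.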

2 comments:

Laeeq Qazi said...

Hi Glen,

Can we use an AQS search string to get only those emails which have URL(s)?

I tried following AQS strings:

findResults = service.FindItems(folder, "body:(\"https://\" OR \"http://\")", itemView);
findResults = service.FindItems(folder, "body:(\"http://\")", itemView);

findResults = service.FindItems(folder, "http:// OR https://", itemView);

and also tried a SearchFilter with an OR condition, but didn't succeed.

Regards,
Laeeq

Glen Scales said...

Try using

findResults = service.FindItems(folder, "body:\"http?//\"", itemView);

The colon will affect the AQS statement, so this uses ? in its place. That seems to work for me anyway.

Cheers
Glen