Thursday, February 16, 2006

Oversize mail reporting with Monad

These days a spike in email traffic could mean anything: it's Valentine's Day and your email server is being flooded with e-cards, newsletters, or the latest funny video clip that people just had to send to everyone in their contacts list. So when you get the call that email is slow, message tracking is a good place to start. The one bad thing is that while the Message Tracking Center GUI is functional, it lacks the basic ability to export the logs. Which is exactly what I wanted to do this week so I could ship the results off to someone and say "here's your problem, so do you still think having no message restrictions is a good idea?". The good thing about message tracking logs is that they are accessible using WMI, so running quick live reports with a script is pretty easy. Last month Microsoft released Beta 3 of Monad, so I thought it was time to update and give it another test drive with this task.

Msh's ability to present things in a tabular format is very cool; the degree to which you can format the results with format-table and a hash table of column definitions leaves VBS eating its dust.
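For instance, here is a minimal sketch of the technique on its own (using get-childitem on c:\temp just as a stand-in data source): each hash table in the array defines one column's expression, width, and header label.

$cols = @{Expression={$_.Name};width=30;Label="File"},
    @{Expression={$_.Length/1024};width=10;Label="Size KB"}
get-childitem c:\temp | format-table $cols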

The Script

The script itself is very simple: it takes three command-line parameters. The first is the server you want to run it against, the second is the number of hours to look back in the logs, and the third is the size in MB of mail to report on. The script takes these three parameters, queries the server via WMI for any mail in the message tracking logs over the given size within the time frame specified, and produces a report. The first version of the script just displays this at the command line. The second version uses the System.Net.Mail.MailMessage class to send a message over SMTP where the body is an HTML table of the results of the query. The script will try to send the email via port 25 of the server you are reporting on. To use this script you first need to configure a sending and a receiving email address within the script.

To run the script to report on email over 2 MB for the last 8 hours, use something like c:\osizedisp.msh servername 8 2

I've put a downloadable copy of the scripts here.

The script itself looks like this:

param([String] $servername = $(throw "Please specify the Servername"),
      [int32] $timerange = $(throw "Please specify a Time Range in Hours"),
      [int32] $sizeg = $(throw "Please specify the lower size limit"))

# work out the start of the reporting window in UTC and convert it to the
# DMTF date format that WMI filters expect
$dtQueryDT = [DateTime]::UtcNow.AddHours(-$timerange)
$sizeg = $sizeg * 1024 * 1024
$WmidtQueryDT = [System.Management.ManagementDateTimeConverter]::ToDmtfDateTime($dtQueryDT)

# query the message tracking logs for entries of type 1020 or 1028 that are
# larger than the size limit and newer than the start of the window
$WmiNamespace = "ROOT\MicrosoftExchangev2"
$filter = "entrytype = '1020' and OriginationTime >= '" + $WmidtQueryDT + "' and size > " + $sizeg +
    " or entrytype = '1028' and OriginationTime >= '" + $WmidtQueryDT + "' and size > " + $sizeg
$Qresults = get-wmiobject -class Exchange_MessageTrackingEntry -Namespace $WmiNamespace `
    -ComputerName $servername -filter $filter

# hash tables defining the columns for format-table
$format = @{Expression={[System.Management.ManagementDateTimeConverter]::ToDateTime($_.OriginationTime)};width=22;Label="Time"},
    @{Expression={$_.Senderaddress};width=18;Label="Sender"},
    @{Expression={$_.recipientaddress};width=18;Label="Recipient"},
    @{Expression={$_.Subject};width=30;Label="Subject"},
    @{Expression={($_.Size)/1024/1024};Format="{0:N2} MB";Label="Size MB"}
$Qresults | format-table $format

The second script looks like this:

param([String] $servername = $(throw "Please specify the Servername"),
      [int32] $timerange = $(throw "Please specify a Time Range in Hours"),
      [int32] $sizeg = $(throw "Please specify the lower size limit"))

# work out the start of the reporting window in UTC and convert it to the
# DMTF date format that WMI filters expect
$dtQueryDT = [DateTime]::UtcNow.AddHours(-$timerange)
$sizegdiv = $sizeg * 1024 * 1024
$WmidtQueryDT = [System.Management.ManagementDateTimeConverter]::ToDmtfDateTime($dtQueryDT)

# query the message tracking logs for entries of type 1020 or 1028 that are
# larger than the size limit and newer than the start of the window
$WmiNamespace = "ROOT\MicrosoftExchangev2"
$filter = "entrytype = '1020' and OriginationTime >= '" + $WmidtQueryDT + "' and size > " + $sizegdiv +
    " or entrytype = '1028' and OriginationTime >= '" + $WmidtQueryDT + "' and size > " + $sizegdiv
$Qresults = get-wmiobject -class Exchange_MessageTrackingEntry -Namespace $WmiNamespace `
    -ComputerName $servername -filter $filter

# build an HTML table for the message body, starting with the header row
$BodyTable = "<table border=`"1`" cellpadding=`"0`" cellspacing=`"0`" width=`"100%`">`r"
$td = "<td width=`"20%`" align=`"center`">"
$BodyTable = $BodyTable + "<tr>" + $td + "Date/Time</td>" + $td + "From</td>" +
    $td + "Sent-to</td>" + $td + "Subject</td>" + $td + "Size (MB)</td></tr>`r"
# add one row per tracking entry
foreach ($Mentry in $Qresults){
    $BodyTable = $BodyTable + "<tr>`r"
    $BodyTable = $BodyTable + "<td align=`"center`">" + [System.Management.ManagementDateTimeConverter]::ToDateTime($Mentry.OriginationTime) + "</td>`r"
    $BodyTable = $BodyTable + "<td align=`"center`">" + $Mentry.Senderaddress + "</td>`r"
    $BodyTable = $BodyTable + "<td align=`"center`">" + $Mentry.recipientaddress + "</td>`r"
    $BodyTable = $BodyTable + "<td align=`"center`">" + $Mentry.subject + "</td>`r"
    $BodyTable = $BodyTable + "<td align=`"center`">" + ($Mentry.size/1024/1024).tostring("0.00") + "</td>`r"
    $BodyTable = $BodyTable + "</tr>`r"
}
$BodyTable = $BodyTable + "</table>"

# send the report over SMTP via port 25 of the server being reported on
$SmtpClient = new-object System.Net.Mail.SmtpClient
$SmtpClient.host = $servername
$MailMessage = new-object System.Net.Mail.MailMessage
$MailMessage.To.Add("youruser@domain.com")
$MailMessage.From = "source@domain.com"
$MailMessage.Subject = "Messages larger than " + $sizeg + " MB for the past " + $timerange + " Hours on Server " + $servername
$MailMessage.IsBodyHtml = $TRUE
$MailMessage.body = $BodyTable
$SmtpClient.Send($MailMessage)

Tuesday, February 07, 2006

Aggregating RSS feeds into a public folder via a script

Recently a few people have asked me if I had a script that could store the content of RSS feeds in a public folder. Initially I was puzzled as to why you would want to do this, but when you start to look at the problems RSS can cause in large networks it makes a lot more sense. After going through the issues of building the script, testing how some RSS aggregators work, and seeing the different ways people publish feeds, it became a lot clearer that RSS as a standard can cause a lot of problems. I guess the fast pace of RSS adoption has shown up holes in the initial design. If you're interested, do a search on Google for bandwidth usage of RSS; the blogosphere has been bashing this out for the last couple of years.

The Script

An overview of what this script does: it takes an RSS feed and a public folder as command-line parameters and then synchronizes the content of the feed with the public folder by creating or modifying posts. The script uses the Msxml2.XMLHTTP.4.0 object to access the RSS feeds. The main reason this object was used over others is that it supports decompressing gzip content automatically. To create the posts in the public folder CDOEX is used; because of this the script must be run locally on an Exchange server that holds an instance of the public folder you want to create the feed in.

Keeping the Bandwidth Lean

This was the most challenging part of the script. Initially I was just pulling down the whole feed to work out if anything had changed; the problem is that doing this several times a day over a lot of feeds means you start consuming a lot of bandwidth. The solution was twofold. The first part was to use conditional gets. A conditional get is a normal get request with the addition of two headers, If-Modified-Since and If-None-Match, which mean that if the content has not changed since the last request the server returns a status of 304 and no content. To use a conditional get, the values from the previous get request must be stored; to do this the script creates a custom property on the public folder itself, named after the URL of the blog you're aggregating. The values of the Last-Modified and ETag headers are stored within this property and used on future requests.
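To make that concrete, here is a rough MSH sketch of a conditional get, using .NET's HttpWebRequest rather than the Msxml2.XMLHTTP object the actual script uses; $lastModified and $etag stand in for the values the script stores on the folder property:

$req = [System.Net.WebRequest]::Create("http://gsexdev.blogspot.com/atom.xml")
$req.IfModifiedSince = $lastModified      # Last-Modified value saved from the previous poll
$req.Headers.Add("If-None-Match", $etag)  # ETag value saved from the previous poll
trap [System.Net.WebException] {
    # an unchanged feed comes back as a 304, which .NET surfaces as a WebException
    write-host "Feed not modified, nothing to download"
    continue
}
$resp = $req.GetResponse()
# save these for the next conditional get
$lastModified = [DateTime]::Parse($resp.Headers["Last-Modified"])
$etag = $resp.Headers["ETag"]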

The other thing done to keep the bandwidth used to a minimum is to request that HTTP compression be used; for this the Accept-Encoding header is sent. With the amount of bloat in XML feeds this can give quite a large saving during the initial synchronization of feeds.
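In the same .NET-flavoured sketch (the script itself just lets Msxml2.XMLHTTP.4.0 deal with gzip), asking for compression is one property on the request; setting it sends the Accept-Encoding header and decompresses the response transparently:

$req = [System.Net.WebRequest]::Create("http://gsexdev.blogspot.com/atom.xml")
# sends Accept-Encoding: gzip and unzips the reply automatically
$req.AutomaticDecompression = [System.Net.DecompressionMethods]::GZip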

Unfortunately some content providers don't support either of these standards. Most do support conditional gets (although I did find a number that didn't), but only around 40% of the blogs I tried supported compression.

Reading the feed’s XML

This was the second most challenging part of the script: dealing with all the different formats that syndication feeds come in. There are three main feed formats in use: Atom, RSS 2.0, and RSS 1.0 RDF feeds. The real pain comes from the fact that most elements in a feed are optional, so when you're trying to read a lot of feeds from different sources you can never be too sure what elements are going to be used. For example, pubDate is an optional element; most RSS feeds have it but some don't, and without pubDate working out whether an item in a feed is new becomes a bit of a problem. Atom feeds are a lot better, but they still have a lot of optional elements, and the way content is published in an Atom feed can also vary (especially the content fields). To parse all this there are three separate subs in the script that handle the different formats and make a best effort to work out whether a post has changed, based on whether a date can be retrieved. This is one part of the script that may need re-engineering to support other types of feeds you wish to aggregate.
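To give a feel for the differences, here is a rough MSH sketch (the script itself does this in VBScript with the MSXML DOM) of where each format keeps its items and how optional the dates are; $feedText is assumed to hold the raw XML of the feed:

$feed = [xml]$feedText
if ($feed.rss -ne $null) { $items = $feed.rss.channel.item }    # RSS 2.0
elseif ($feed.feed -ne $null) { $items = $feed.feed.entry }     # Atom
elseif ($feed.RDF -ne $null) { $items = $feed.RDF.item }        # RSS 1.0 (RDF)
foreach ($item in $items) {
    # pubDate is optional in RSS, and Atom has its own optional date elements,
    # so fall back to other checks when no date can be read
    if ($item.pubDate -ne $null) { $itemDate = [DateTime]::Parse($item.pubDate) }
}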

The last section of the code does the synchronization with the public folder. The sync works by using one of the unique elements from the feed entry to create an Href value in the public folder. If it comes down to not being able to work out whether an item has been modified, the createpost function will try to open the item at the calculated Href; if this fails it will instead create a new item. If the item does open, the function compares the body text to detect whether any changes have been made and updates the post if necessary.

Running the script

To run the script you need to give it the URL of the blog you want to aggregate as the first command-line parameter and the URL of the public folder as the second, e.g. to aggregate this blog to a public folder called rssFeeds you would do:

cscript readfeed.vbs "http://gsexdev.blogspot.com/atom.xml" http://servername/public/rssFeeds

The script is designed so you can have multiple feeds feeding into one public folder without them affecting each other (I've got up to 15 going into one folder). As the script runs it writes a fairly verbose log to c:\temp\rssfeedlog.txt, which can be used to help diagnose problems with the script.


The script is a little on the large side to post verbatim (around 450 lines), so I've put a downloadable copy of the script here.

If you wish to aggregate a number of blogs there are a few options when running the script. The first is to use a batch file that includes a line for each blog you want to aggregate, as in the example below. Jörg-Stefan Sell has also come up with another great idea: a script that reads an XML config file containing the blogs and public folders you want to aggregate and then shells out to the readfeed script. You can download a copy of Jörg-Stefan's script here.
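For example, the batch file approach is just one readfeed line per feed going into the same folder (the second feed URL here is only a placeholder):

cscript readfeed.vbs "http://gsexdev.blogspot.com/atom.xml" http://servername/public/rssFeeds
cscript readfeed.vbs "http://blogs.example.com/atom.xml" http://servername/public/rssFeeds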

Special thanks to Bill Pogue from Aztec Systems, Inc. for his help with the idea and the code.