Skip to main content

Mailbox Attachment Auditing Script

Mofidifed 15.9.2005 to include not about date format

I had a question from a customer last week about where the folder for attachments was in Exchange so they could go in and delete anything that they thought was a bit old or that they thought they didn’t need. Although these questions at times can make me cringe my explanations of how Exchange goes about storing things and what’s considered best practice when it comes to mail archiving didn’t seem to be getting though so there arose a need to produce some sort of reports so that the customer could at least see what’s going on themselves. This was a small site 10 users so this wasn’t going to be too much information overload so I set to the task of building something to do this. There where two ways I could have gone with this one way would have been to export all the information about the attachments into a database and then do some ASP pages to report on the database. Which wasn’t going to be to practicle because I needed a delivery mechinsm to the client So I decided to go with some data shaping code that would produce a CSV file for each mailbox that was scanned that could then be opened in Excel. Data shaping was good fit because it allowed me to group the data I was going to retrieve in a hierarchal fashion and after some very mundane work or working out attachment classification for each extension type I came up with some relatively useful reports. Also I made the script so it could target specific date periods like attachments more then a year old or you could target attachments from the last week if your store was showing rapid file growth. What did I learn from this well it was interesting to see how many of the same copies of a certain attachments where stored in people’s mailboxes or that one user had decided to store all the NDR’s for a large message that they tried to send (this totaled over a few hundred meg). For me this shows the future of mail storage shouldn’t be an information store or a database but some sort of intelligent content management store. In this way the content you’re storing in more relevant, less redundant and can grow and learn with the user. With everybody storing more and more email data just having a better search engine isn’t going to be good enough. At the coal face you have one person trying to use and control this so having a method where the way in which its stored is more relevant you have half a chance of making it more efficient. I’ve always wondered weather there was a correlation between the amount of data you store and accuracy and efficiency you can achieve when retrieving it. It all shows that no matter how good any system is it’s really down to how you educate the users to weather they will get the best out of it.

The script

I started out using Exoledb and ADO to do this script which did work but because the only way of getting the attachment size via Exoledb (that I know) is to use an ADO stream object which made the code very slow and memory hungry. Once I switched to using Webdav where the X-MS-ENUMATTS method can be used to scan and return all the attachment details on a message this gave a large lift in performance and I was surprised how fast it could actually process very large mailboxes. I’ve included the ADO version in the download of the post but the WebDAV version is the one I’ll talk about because this is the one that I used.

The script itself works by first setting up the data shape that will be used to create the reports. This is done by first setting up some categories for all the files and then setting up child branches for these categories for each file extension. All the extensions are tracked in one variable which is used to then identify any unknown extension types (or ones that I haven’t categorized) and then a category is dynamically created for that. After this the RecurseFolder sub is called this sub is based on the code from the mailbox size KB. With a few exceptions the main one being is that it only checks normal mail folders this was to prevent an issue where searchfolder exist in the mailbox. The ResurseFolder sub job is basically to retrieve all the folder URL’s in the mailbox and then call the procfolder sub. The profolder sub processes all the mail in a folder by looking for all email with attachments between the specified date ranges of the query. The embedattach attachment processes the attachments on each message using the X-MS-ENUMATTS method and also checks and processes any attachments that are on embedded messages attached to the primary message. Because the received time is in UTC when you retrieve it using WebDAV there’s some code to do time conversion and also some other code that cleans the values up a bit so they can be used within the data shape. The script runs pretty verbosely and outputs a lot of stuff to the command-line because this is a script that will take a long period of time to run I find this is important to let you see the progress the script is making. At the end of the script it displays some summaries of all the attachments and also creates a CSV file with the result of all the attachment scans. To make the result more meaningful I added command-line parameters so you could restrict the date range the script will look at and also the minimum size of the attachments to look at. So you can scan for all attachments that are a year old over 500 KB.

I’ve created 4 different version of this script

Mbaudit.vbs in the ADO/Exoldb Version of the script (must be run locally on the Exchange server)

Mbauditwd.vbs is the WebDAV version (using http) can be run locally or remotely

Mbauditwdfba is a WebDAV version if you are using forms based authentication on your Exchange server

Mbauditrep.vbs is the same as Mbauditwd.vbs but has some front end code added that selects all the users on a particular server and will go though each user sequentially and create a report.

To run any of these scripts requires specifying command-line parameters the usage is for (all scripts except Mbauditrep.vbs)

Cscript Mbauditwd.vbs

The start and enddate needs to be in ISO format (year-month-date) eg to scan for all attachments that are a year and old over 500 KB

Cscript Mbauditwd.vbs myserver mymailbox 500 2004-01-01 2005-01-01

The Mbauditrep.vbs script is just run with the servename and size and date parameters

Cscript Mbauditwd.vbs myserver 500 2004-01-01 2005-01-01

A Important note about the date format because this is an ISO date you need to use the date format yyyy-mm-dd (this is different from what your use to if you in the US)

A word of caution with all of these scripts the WebDAV script creates a bit of network traffic and load on the server that it is reporting against and also with the number of requests that are being made if you have logging enabled on the OWA directory it will significantly increase the size of the log file if you scan a number of mailboxes. The amount of testing that has been done with these scripts is very minimal so as always its use at your own risk and make sure you do your own testing.

I’ve put a download copy of the scripts here the script itself looks like

on error resume next
set shell = createobject("wscript.shell")
strValueName = "HKLM\SYSTEM\CurrentControlSet\Control\TimeZoneInformation\ActiveTimeBias"
minTimeOffset = shell.regread(strValueName)
toffset = datediff("h",DateAdd("n", minTimeOffset, now()),now())
set conn1 = createobject("ADODB.Connection")
set req = createobject("microsoft.xmlhttp")
Set fso = CreateObject("Scripting.FileSystemObject")
fname = "c:\" & wscript.arguments(1) & ".csv"
set wfile = fso.opentextfile(fname,2,true)
wfile.writeline("File Group,File Type,File Extension,File Name, File Size, Email
Location,Message Subject,Sent From,Date Sent")
public exlist
public maxsize
public datefrom
public dateto
maxsize = wscript.arguments(2)
datefrom = wscript.arguments(3) & "T00:00:00Z"
dateto = wscript.arguments(4) & "T00:00:00Z"
exlist = ","
strConnString = "Data Provider=NONE; Provider=MSDataShape"
conn1.Open strConnString
set objParentRS = createobject("adodb.recordset")
set objChildRS = createobject("adodb.recordset")
set objgrandchild = createobject("adodb.recordset")
strSQL = "SHAPE APPEND" & _
" NEW adVarChar(255) AS RFileType, " & _
" ((SHAPE APPEND " & _
" NEW adVarChar(255) AS CFileext, " & _
" NEW adVarChar(255) AS CFileType, " & _
" NEW adVarChar(255) AS CFileDesc, " & _
" ((SHAPE APPEND " & _
" NEW adVarChar(255) AS GCFileext, " & _
" NEW adVarChar(255) AS GCFilename, " & _
" NEW adVarChar(255) AS GCFilesize, " & _
" NEW adVarChar(255) AS GCMessLocation, " & _
" NEW adVarChar(255) AS GCMessSubject, " & _
" NEW adVarChar(255) AS GCMessFrom, " & _
" NEW adVarChar(255) AS GCDateSent) " & _
" RELATE CFileext TO GCFileext) AS MOWMI" & _
")" & _
" RELATE RFileType TO CFileType) AS rsSOMO "
objParentRS.LockType = 3
objParentRS.Open strSQL, conn1
rem *********** Add Document Types
call adddoctype(objParentRS,"Microsoft Office Documents")
call adddoctype(objParentRS,"Compressed Files")
call adddoctype(objParentRS,"Acorbat Files")
call adddoctype(objParentRS,"Executables and Installers")
call adddoctype(objParentRS,"Sound Files")
call adddoctype(objParentRS,"Video Files")
call adddoctype(objParentRS,"Image Files")
call adddoctype(objParentRS,"Attached Email Message")
call adddoctype(objParentRS,"Unclassified Files")
rem ********************************************
rem *********** Add Document Extensions"
Set objChildRS = objParentRS("rsSOMO").Value
call addfileexts(objChildRS)
Set objgrandchild = objChildRS("MOWMI").Value
sConnString = "http://" & wscript.arguments(0) & "/exchange/" &
wscript.arguments(1) & "/NON_IPM_SUBTREE"
call RecurseFolder(sConnString,objChildRS,objgrandchild)

objParentRS.MoveFirst
Do While Not objParentRS.EOF
wscript.echo objParentRS(0)
Set objChildRS = objParentRS("rsSOMO").Value
Do While Not objChildRS.EOF
attsum = 0
attnum = 0
Set objgrandchild = objChildRS("MOWMI").Value
attnum = objgrandchild.recordcount
if objgrandchild.recordcount <> 0 then
wscript.echo " " & objChildRS(1) & " " & objChildRS(2)
wfile.writeline(objChildRS(1) & "," & objChildRS(2))
end if
Do While Not objgrandchild.EOF
wscript.echo " " & objgrandchild(1) & " " & objgrandchild(2) & " " &
objgrandchild(3) & " " & objgrandchild(4) & " "
wfile.writeline(",," & objgrandchild(0) & "," & objgrandchild(1) & "," &
objgrandchild(2) & "," & objgrandchild(3) & "," & objgrandchild(4) & "," &
objgrandchild(5) & "," & objgrandchild(6))
attsum = attsum + clng(objgrandchild("GCFilesize"))
objgrandchild.movenext
loop
if attnum <> 0 then
attachmentsumy = attachmentsumy & objChildRS(1) & "," & objChildRS(2) & "," &
objChildRS(0) & "," & attnum & "," & attsum & vbcrlf
end if
objChildRS.movenext
loop
objParentRS.MoveNext
Loop
wscript.echo
wscript.echo "Attachment Summary"
wscript.echo
wscript.echo attachmentsumy
wfile.writeline
wfile.writeline "Attachment Summary"
wfile.writeline
wfile.writeline attachmentsumy

Public Sub RecurseFolder(sUrl,objChildRS,objgrandchild)

Set oXMLHttp = CreateObject("Microsoft.xmlhttp")
oXMLHttp.open "SEARCH", sUrl, False, "", ""
sQuery = "<?xml version=""1.0""?>"
sQuery = sQuery & "<g:searchrequest xmlns:g=""DAV:"">"
sQuery = sQuery & "<g:sql>SELECT ""http://schemas.microsoft.com/"
sQuery = sQuery & "mapi/proptag/x0e080003"", ""DAV:hassubs"" FROM SCOPE "
sQuery = sQuery & "('SHALLOW TRAVERSAL OF """ & sUrl & """') "
sQuery = sQuery & "WHERE ""DAV:isfolder"" = true and ""DAV:ishidden"" = false
and ""http://schemas.microsoft.com/mapi/proptag/x36010003"" = 1"
sQuery = sQuery & "</g:sql>"
sQuery = sQuery & "</g:searchrequest>"
oXMLHttp.setRequestHeader "Content-Type", "text/xml"
oXMLHttp.setRequestHeader "Translate", "f"
oXMLHttp.setRequestHeader "Depth", "0"
oXMLHttp.setRequestHeader "Content-Length", "" & Len(sQuery)
oXMLHttp.send sQuery
Set oXMLDoc = oXMLHttp.responseXML
Set oXMLSizeNodes = oXMLDoc.getElementsByTagName("d:x0e080003")
Set oXMLHREFNodes = oXMLDoc.getElementsByTagName("a:href")
Set oXMLHasSubsNodes = oXMLDoc.getElementsByTagName("a:hassubs")
For i = 0 to oXMLSizeNodes.length - 1
call procfolder(oXMLHREFNodes.Item(i).nodeTypedValue,objgrandchild,objChildRS)
wscript.echo oXMLHREFNodes.Item(i).nodeTypedValue
If oXMLHasSubsNodes.Item(i).nodeTypedValue = True Then
call RecurseFolder(oXMLHREFNodes.Item(i).nodeTypedValue,objChildRS,objgrandchild)
End If
Next
End Sub


sub procfolder(strURL,objgrandchild,objChildRS)

strQuery = "<?xml version=""1.0""?><D:searchrequest xmlns:D = ""DAV:"" xmlns:b=""urn:uuid:c2f41010-65b3-11d1-a29f-00aa00c14882/"">"
strQuery = strQuery & "<D:sql>SELECT ""DAV:displayname"", ""urn:schemas:httpmail:subject"",
"
strQuery = strQuery & """urn:schemas:httpmail:datereceived"", ""urn:schemas:httpmail:fromname"",
"
strQuery = strQuery & """urn:schemas:httpmail:fromemail"""
strQuery = strQuery & " FROM scope('shallow traversal of """
strQuery = strQuery & strURL & """') Where ""DAV:ishidden"" = False AND ""DAV:isfolder""
= False AND "
strQuery = strQuery & """urn:schemas:httpmail:hasattachment"" = True AND "
strQuery = strQuery & """urn:schemas:httpmail:datereceived"" &lt; CAST(""" &
dateto & """ as 'dateTime') AND "
strQuery = strQuery & """urn:schemas:httpmail:datereceived"" &gt; CAST(""" &
datefrom & """ as 'dateTime')</D:sql></D:searchrequest>"
req.open "SEARCH", strURL, false
req.setrequestheader "Content-Type", "text/xml"
req.setRequestHeader "Translate","f"
req.send strQuery
If req.status >= 500 Then
ElseIf req.status = 207 Then
set oResponseDoc = req.responseXML
set oNodeList = oResponseDoc.getElementsByTagName("a:displayname")
set oNodeList1 = oResponseDoc.getElementsByTagName("a:href")
set oSubject = oResponseDoc.getElementsByTagName("d:subject")
set odatereceived = oResponseDoc.getElementsByTagName("d:datereceived")
set ofromemail = oResponseDoc.getElementsByTagName("d:fromemail")
For i = 0 To (oNodeList.length -1)
set oNode = oNodeList.nextNode
set oNode1 = oNodeList1.nextNode
set oNode2 = oSubject.nextNode
set oNode3 = odatereceived.nextNode
set oNode4 = ofromemail.nextNode
call
embedattach(oNode1.Text,oNode2.Text,oNode3.Text,oNode4.Text,oNode.Text,objgrandchild,objChildRS)
Next
Else
End If

end sub

sub
embedattach(objhref,
subject,daterecieved,recievedfrom,davdisplay,objgrandchild,objChildRS)
req.open "X-MS-ENUMATTS", objhref, false, "", ""
req.send
If req.status > 207 Or req.status < 207 Then
Else
set resDoc1 = req.responseXML
Set objHrefNodeList = resDoc1.getElementsByTagName("a:href")
Set objattachmethod = resDoc1.getElementsByTagName("d:x37050003")
Set objcnval = resDoc1.getElementsByTagName("f:cn")
set objattachname = resDoc1.getElementsByTagName("d:x3704001f")
set objattachfilename = resDoc1.getElementsByTagName("e:attachmentfilename")
set objattsize = resDoc1.getElementsByTagName("d:x0e200003")
If objHrefNodeList.length > 0 Then
For f = 0 To (objHrefNodeList.length -1)
set objHrefNode1 = objHrefNodeList.nextNode
set objNodef = objattachmethod.nextnode
if objattachmethod.length <> 0 then
if objNodef.Text = 5 then
call
embedattach(objHrefNode1.Text,subject,daterecieved,recievedfrom,davdisplay,
objgrandchild,objChildRS)
else
settodav = 0
set objNode1f = objattachfilename.nextNode
if objattachfilename.length = 0 then
if objattachname.length = 0 then
set objNode1f = objcnval.nextNode
if objcnval.length = 0 then settodav = 1
else
set objNode1f = objattachname.nextNode
end if
end if
fnFileName = objNode1f.Text
if err.number <> 0 then wscript.echo "error" & objHrefNode1.Text
set objNode3f = objattsize.nextnode
attsize = objNode3f.Text

if fnFileName <> "" and clng(attsize/1024) > clng(maxsize) then
wscript.echo fnFileName
fatt1 = len(fnFileName)
lcMaloc = replace(objhref,"%E2%80%99S%20"," ")
lcMaloc = replace(unescape(lcMaloc),davdisplay,"")
lcMaloc = replace(lcMaloc,"http://" & wscript.arguments(0) & "/exchange/" &
wscript.arguments(1) & "/","")
wscript.echo lcMaloc
fatt2 = fatt1 - 2
attname = UCASE(fnFileName)
rtime =
dateserial(mid(daterecieved,1,4),mid(daterecieved,6,2),mid(daterecieved,9,2)) &
" " & mid(daterecieved,12,8)
objgrandchild.addnew
objgrandchild("GCFileext") = mid(attname,fatt2,3)
objgrandchild("GCFilename") = right(replace(fnFileName,",",""),254)
objgrandchild("GCFilesize") = replace(formatnumber(attsize/1024,2),",","")
objgrandchild("GCMessLocation") = right(replace(lcMaloc,",",""),254)
objgrandchild("GCMessSubject") = right(replace(subject,",",""),254)
objgrandchild("GCMessFrom") = recievedfrom
objgrandchild("GCDateSent") = dateadd("h",toffset,formatdatetime(rtime,0))
objgrandchild.update
elistchk = "," & mid(attname,fatt2,3) & ","
if instr(exlist,elistchk) = 0 then
call adddocext(objChildRS,"Unclassified Files",mid(attname,fatt2,3),"Unknown")

end if
end if
end if
end if
next
Else
End If
End If
end sub

sub adddoctype(objParentRS,doctype)

objParentRS.addnew
objParentRS("RFileType") = doctype
objParentRS.update

end sub


sub adddocext(objChildRS,doctype,ext,fdesc)

objChildRS.addnew
objChildRS("CFileType") = doctype
objChildRS("CFileext") = ext
objChildRS("CFileDesc") = fdesc
exlist = exlist & "," & ext & ","
objChildRS.update

end sub

sub addfileexts(objChildRS)
call adddocext(objChildRS,"Microsoft Office Documents","DOC","Microsoft Word
Document")
call adddocext(objChildRS,"Microsoft Office Documents","DOT","Microsoft Word
Template")
call adddocext(objChildRS,"Microsoft Office Documents","XLS","Microsoft Excel
Spreedsheet")
call adddocext(objChildRS,"Microsoft Office Documents","PPT","Microsoft
Powerpoint Presentation")
call adddocext(objChildRS,"Microsoft Office Documents","PPS","Microsoft
Powerpoint Slide Show")
call adddocext(objChildRS,"Microsoft Office Documents","MDB","Microsoft Access
Database")
call adddocext(objChildRS,"Microsoft Office Documents","ADP","Microsoft Access
Project")
call adddocext(objChildRS,"Microsoft Office Documents","VSD","Microsoft Visio
Diagram")
call adddocext(objChildRS,"Microsoft Office Documents","ONE","Microsoft OneNote
Note")
call adddocext(objChildRS,"Microsoft Office Documents","RTF","Rich Text Format
file")
call adddocext(objChildRS,"Microsoft Office Documents","TXT","Text File")
call adddocext(objChildRS,"Microsoft Office Documents","CSV","Comma seperated
File")
call adddocext(objChildRS,"Compressed Files","ZIP","Zip Compressed File")
call adddocext(objChildRS,"Compressed Files","TAR","nix Tar Compressed File")
call adddocext(objChildRS,"Compressed Files","ARG","Arg Compressed File")
call adddocext(objChildRS,"Compressed Files","RAR","RAR Compressed File")
call adddocext(objChildRS,"Compressed Files","ACE","ACE Compressed File")
call adddocext(objChildRS,"Compressed Files","BHX","Binary Hex Compressed File")
call adddocext(objChildRS,"Acorbat Files","PDF","Adobe Acrobat File")
call adddocext(objChildRS,"Executables and Installers","EXE","Executable File")
call adddocext(objChildRS,"Executables and Installers","MSI","Windows
Installer")
call adddocext(objChildRS,"Sound Files","WAV","Wave File")
call adddocext(objChildRS,"Sound Files","MP3","MPeg3 Sound file")
call adddocext(objChildRS,"Sound Files","WMA","Windows Media file")
call adddocext(objChildRS,"Sound Files","WMV","Windows Media file")
call adddocext(objChildRS,"Sound Files","SND","Windows Sound File")
call adddocext(objChildRS,"Sound Files",".AU","AU Sound File")
call adddocext(objChildRS,"Sound Files","RPM","Real Audio Sound File")
call adddocext(objChildRS,"Sound Files","MID","MIDI Audio Sound File")
call adddocext(objChildRS,"Sound Files",".RM","Real Audio Sound File")
call adddocext(objChildRS,"Sound Files",".RA","Real Audio Sound File")
call adddocext(objChildRS,"Sound Files","ASF","Advanced Streaming format File")
call adddocext(objChildRS,"Video Files","AVI","AVI Video format file")
call adddocext(objChildRS,"Video Files","MPG","MPG Video format file")
call adddocext(objChildRS,"Video Files","MOV","MOV Video format file")
call adddocext(objChildRS,"Video Files","IVX","DIVX Video format file")
call adddocext(objChildRS,"Video Files","PG4","MPG4 Video format file")
call adddocext(objChildRS,"Video Files","SWF","Shockwave format file")
call adddocext(objChildRS,"Image Files","JPG","JPG picture file")
call adddocext(objChildRS,"Image Files","BMP","Bit Map picture file")
call adddocext(objChildRS,"Image Files","GIF","Gif picture file")
call adddocext(objChildRS,"Image Files","PNG","Portable Network graphics picture
file")
call adddocext(objChildRS,"Image Files","TIF","Tag Image picture file")
call adddocext(objChildRS,"Image Files","IFF","Tag Image picture file")
call adddocext(objChildRS,"Image Files","WMF","Windows Metafile file")
call adddocext(objChildRS,"Image Files","EMF","Enhanced Metafile file")
call adddocext(objChildRS,"Image Files","PEG","Enhanced Metafile file")
call adddocext(objChildRS,"Attached Email Message","EML","Attached Email
Message")
call adddocext(objChildRS,"Attached Email Message","ICS","Attached Calendar
file")
end sub

Popular posts from this blog

The MailboxConcurrency limit and using Batching in the Microsoft Graph API

If your getting an error such as Application is over its MailboxConcurrency limit while using the Microsoft Graph API this post may help you understand why. Background   The Mailbox  concurrency limit when your using the Graph API is 4 as per https://docs.microsoft.com/en-us/graph/throttling#outlook-service-limits . This is evaluated for each app ID and mailbox combination so this means you can have different apps running under the same credentials and the poor behavior of one won't cause the other to be throttled. If you compared that to EWS you could have up to 27 concurrent connections but they are shared across all apps on a first come first served basis. Batching Batching in the Graph API is a way of combining multiple requests into a single HTTP request. Batching in the Exchange Mail API's EWS and MAPI has been around for a long time and its common, for email Apps to process large numbers of smaller items for a variety of reasons.  Batching in the Gr...

Sending a Message in Exchange Online via REST from an Arduino MKR1000

This is part 2 of my MKR1000 article, in this previous post  I looked at sending a Message via EWS using Basic Authentication.  In this Post I'll look at using the new Outlook REST API  which requires using OAuth authentication to get an Access Token. The prerequisites for this sketch are the same as in the other post with the addition of the ArduinoJson library  https://github.com/bblanchon/ArduinoJson  which is used to parse the Authentication Results to extract the Access Token. Also the SSL certificates for the login.windows.net  and outlook.office365.com need to be uploaded to the devices using the wifi101 Firmware updater. To use Token Authentication you need to register an Application in Azure https://msdn.microsoft.com/en-us/office/office365/howto/add-common-consent-manually  with the Mail.Send permission. The application should be a Native Client app that use the Out of Band Callback urn:ietf:wg:oauth:2.0:oob. You ...

How to test SMTP using Opportunistic TLS with Powershell and grab the public certificate a SMTP server is using

Most email services these day employ Opportunistic TLS when trying to send Messages which means that wherever possible the Messages will be encrypted rather then the plain text legacy of SMTP.  This method was defined in RFC 3207 "SMTP Service Extension for Secure SMTP over Transport Layer Security" and  there's a quite a good explanation of Opportunistic TLS on Wikipedia  https://en.wikipedia.org/wiki/Opportunistic_TLS .  This is used for both Server to Server (eg MTA to MTA) and Client to server (Eg a Message client like Outlook which acts as a MSA) the later being generally Authenticated. Basically it allows you to have a normal plain text SMTP conversation that is then upgraded to TLS using the STARTTLS verb. Not all servers will support this verb so if its not supported then a message is just sent as Plain text. TLS relies on PKI certificates and the administrative issue s that come around certificate management like expired certificates which is why ...
All sample scripts and source code is provided by for illustrative purposes only. All examples are untested in different environments and therefore, I cannot guarantee or imply reliability, serviceability, or function of these programs.

All code contained herein is provided to you "AS IS" without any warranties of any kind. The implied warranties of non-infringement, merchantability and fitness for a particular purpose are expressly disclaimed.