Skip to main content

Mailbox Attachment Auditing Script

Mofidifed 15.9.2005 to include not about date format

I had a question from a customer last week about where the folder for attachments was in Exchange so they could go in and delete anything that they thought was a bit old or that they thought they didn’t need. Although these questions at times can make me cringe my explanations of how Exchange goes about storing things and what’s considered best practice when it comes to mail archiving didn’t seem to be getting though so there arose a need to produce some sort of reports so that the customer could at least see what’s going on themselves. This was a small site 10 users so this wasn’t going to be too much information overload so I set to the task of building something to do this. There where two ways I could have gone with this one way would have been to export all the information about the attachments into a database and then do some ASP pages to report on the database. Which wasn’t going to be to practicle because I needed a delivery mechinsm to the client So I decided to go with some data shaping code that would produce a CSV file for each mailbox that was scanned that could then be opened in Excel. Data shaping was good fit because it allowed me to group the data I was going to retrieve in a hierarchal fashion and after some very mundane work or working out attachment classification for each extension type I came up with some relatively useful reports. Also I made the script so it could target specific date periods like attachments more then a year old or you could target attachments from the last week if your store was showing rapid file growth. What did I learn from this well it was interesting to see how many of the same copies of a certain attachments where stored in people’s mailboxes or that one user had decided to store all the NDR’s for a large message that they tried to send (this totaled over a few hundred meg). For me this shows the future of mail storage shouldn’t be an information store or a database but some sort of intelligent content management store. In this way the content you’re storing in more relevant, less redundant and can grow and learn with the user. With everybody storing more and more email data just having a better search engine isn’t going to be good enough. At the coal face you have one person trying to use and control this so having a method where the way in which its stored is more relevant you have half a chance of making it more efficient. I’ve always wondered weather there was a correlation between the amount of data you store and accuracy and efficiency you can achieve when retrieving it. It all shows that no matter how good any system is it’s really down to how you educate the users to weather they will get the best out of it.

The script

I started out using Exoledb and ADO to do this script which did work but because the only way of getting the attachment size via Exoledb (that I know) is to use an ADO stream object which made the code very slow and memory hungry. Once I switched to using Webdav where the X-MS-ENUMATTS method can be used to scan and return all the attachment details on a message this gave a large lift in performance and I was surprised how fast it could actually process very large mailboxes. I’ve included the ADO version in the download of the post but the WebDAV version is the one I’ll talk about because this is the one that I used.

The script itself works by first setting up the data shape that will be used to create the reports. This is done by first setting up some categories for all the files and then setting up child branches for these categories for each file extension. All the extensions are tracked in one variable which is used to then identify any unknown extension types (or ones that I haven’t categorized) and then a category is dynamically created for that. After this the RecurseFolder sub is called this sub is based on the code from the mailbox size KB. With a few exceptions the main one being is that it only checks normal mail folders this was to prevent an issue where searchfolder exist in the mailbox. The ResurseFolder sub job is basically to retrieve all the folder URL’s in the mailbox and then call the procfolder sub. The profolder sub processes all the mail in a folder by looking for all email with attachments between the specified date ranges of the query. The embedattach attachment processes the attachments on each message using the X-MS-ENUMATTS method and also checks and processes any attachments that are on embedded messages attached to the primary message. Because the received time is in UTC when you retrieve it using WebDAV there’s some code to do time conversion and also some other code that cleans the values up a bit so they can be used within the data shape. The script runs pretty verbosely and outputs a lot of stuff to the command-line because this is a script that will take a long period of time to run I find this is important to let you see the progress the script is making. At the end of the script it displays some summaries of all the attachments and also creates a CSV file with the result of all the attachment scans. To make the result more meaningful I added command-line parameters so you could restrict the date range the script will look at and also the minimum size of the attachments to look at. So you can scan for all attachments that are a year old over 500 KB.

I’ve created 4 different version of this script

Mbaudit.vbs in the ADO/Exoldb Version of the script (must be run locally on the Exchange server)

Mbauditwd.vbs is the WebDAV version (using http) can be run locally or remotely

Mbauditwdfba is a WebDAV version if you are using forms based authentication on your Exchange server

Mbauditrep.vbs is the same as Mbauditwd.vbs but has some front end code added that selects all the users on a particular server and will go though each user sequentially and create a report.

To run any of these scripts requires specifying command-line parameters the usage is for (all scripts except Mbauditrep.vbs)

Cscript Mbauditwd.vbs

The start and enddate needs to be in ISO format (year-month-date) eg to scan for all attachments that are a year and old over 500 KB

Cscript Mbauditwd.vbs myserver mymailbox 500 2004-01-01 2005-01-01

The Mbauditrep.vbs script is just run with the servename and size and date parameters

Cscript Mbauditwd.vbs myserver 500 2004-01-01 2005-01-01

A Important note about the date format because this is an ISO date you need to use the date format yyyy-mm-dd (this is different from what your use to if you in the US)

A word of caution with all of these scripts the WebDAV script creates a bit of network traffic and load on the server that it is reporting against and also with the number of requests that are being made if you have logging enabled on the OWA directory it will significantly increase the size of the log file if you scan a number of mailboxes. The amount of testing that has been done with these scripts is very minimal so as always its use at your own risk and make sure you do your own testing.

I’ve put a download copy of the scripts here the script itself looks like

on error resume next
set shell = createobject("")
strValueName = "HKLM\SYSTEM\CurrentControlSet\Control\TimeZoneInformation\ActiveTimeBias"
minTimeOffset = shell.regread(strValueName)
toffset = datediff("h",DateAdd("n", minTimeOffset, now()),now())
set conn1 = createobject("ADODB.Connection")
set req = createobject("microsoft.xmlhttp")
Set fso = CreateObject("Scripting.FileSystemObject")
fname = "c:\" & wscript.arguments(1) & ".csv"
set wfile = fso.opentextfile(fname,2,true)
wfile.writeline("File Group,File Type,File Extension,File Name, File Size, Email
Location,Message Subject,Sent From,Date Sent")
public exlist
public maxsize
public datefrom
public dateto
maxsize = wscript.arguments(2)
datefrom = wscript.arguments(3) & "T00:00:00Z"
dateto = wscript.arguments(4) & "T00:00:00Z"
exlist = ","
strConnString = "Data Provider=NONE; Provider=MSDataShape"
conn1.Open strConnString
set objParentRS = createobject("adodb.recordset")
set objChildRS = createobject("adodb.recordset")
set objgrandchild = createobject("adodb.recordset")
" NEW adVarChar(255) AS RFileType, " & _
" ((SHAPE APPEND " & _
" NEW adVarChar(255) AS CFileext, " & _
" NEW adVarChar(255) AS CFileType, " & _
" NEW adVarChar(255) AS CFileDesc, " & _
" ((SHAPE APPEND " & _
" NEW adVarChar(255) AS GCFileext, " & _
" NEW adVarChar(255) AS GCFilename, " & _
" NEW adVarChar(255) AS GCFilesize, " & _
" NEW adVarChar(255) AS GCMessLocation, " & _
" NEW adVarChar(255) AS GCMessSubject, " & _
" NEW adVarChar(255) AS GCMessFrom, " & _
" NEW adVarChar(255) AS GCDateSent) " & _
" RELATE CFileext TO GCFileext) AS MOWMI" & _
")" & _
" RELATE RFileType TO CFileType) AS rsSOMO "
objParentRS.LockType = 3
objParentRS.Open strSQL, conn1
rem *********** Add Document Types
call adddoctype(objParentRS,"Microsoft Office Documents")
call adddoctype(objParentRS,"Compressed Files")
call adddoctype(objParentRS,"Acorbat Files")
call adddoctype(objParentRS,"Executables and Installers")
call adddoctype(objParentRS,"Sound Files")
call adddoctype(objParentRS,"Video Files")
call adddoctype(objParentRS,"Image Files")
call adddoctype(objParentRS,"Attached Email Message")
call adddoctype(objParentRS,"Unclassified Files")
rem ********************************************
rem *********** Add Document Extensions"
Set objChildRS = objParentRS("rsSOMO").Value
call addfileexts(objChildRS)
Set objgrandchild = objChildRS("MOWMI").Value
sConnString = "http://" & wscript.arguments(0) & "/exchange/" &
wscript.arguments(1) & "/NON_IPM_SUBTREE"
call RecurseFolder(sConnString,objChildRS,objgrandchild)

Do While Not objParentRS.EOF
wscript.echo objParentRS(0)
Set objChildRS = objParentRS("rsSOMO").Value
Do While Not objChildRS.EOF
attsum = 0
attnum = 0
Set objgrandchild = objChildRS("MOWMI").Value
attnum = objgrandchild.recordcount
if objgrandchild.recordcount <> 0 then
wscript.echo " " & objChildRS(1) & " " & objChildRS(2)
wfile.writeline(objChildRS(1) & "," & objChildRS(2))
end if
Do While Not objgrandchild.EOF
wscript.echo " " & objgrandchild(1) & " " & objgrandchild(2) & " " &
objgrandchild(3) & " " & objgrandchild(4) & " "
wfile.writeline(",," & objgrandchild(0) & "," & objgrandchild(1) & "," &
objgrandchild(2) & "," & objgrandchild(3) & "," & objgrandchild(4) & "," &
objgrandchild(5) & "," & objgrandchild(6))
attsum = attsum + clng(objgrandchild("GCFilesize"))
if attnum <> 0 then
attachmentsumy = attachmentsumy & objChildRS(1) & "," & objChildRS(2) & "," &
objChildRS(0) & "," & attnum & "," & attsum & vbcrlf
end if
wscript.echo "Attachment Summary"
wscript.echo attachmentsumy
wfile.writeline "Attachment Summary"
wfile.writeline attachmentsumy

Public Sub RecurseFolder(sUrl,objChildRS,objgrandchild)

Set oXMLHttp = CreateObject("Microsoft.xmlhttp") "SEARCH", sUrl, False, "", ""
sQuery = "<?xml version=""1.0""?>"
sQuery = sQuery & "<g:searchrequest xmlns:g=""DAV:"">"
sQuery = sQuery & "<g:sql>SELECT """
sQuery = sQuery & "mapi/proptag/x0e080003"", ""DAV:hassubs"" FROM SCOPE "
sQuery = sQuery & "('SHALLOW TRAVERSAL OF """ & sUrl & """') "
sQuery = sQuery & "WHERE ""DAV:isfolder"" = true and ""DAV:ishidden"" = false
and """" = 1"
sQuery = sQuery & "</g:sql>"
sQuery = sQuery & "</g:searchrequest>"
oXMLHttp.setRequestHeader "Content-Type", "text/xml"
oXMLHttp.setRequestHeader "Translate", "f"
oXMLHttp.setRequestHeader "Depth", "0"
oXMLHttp.setRequestHeader "Content-Length", "" & Len(sQuery)
oXMLHttp.send sQuery
Set oXMLDoc = oXMLHttp.responseXML
Set oXMLSizeNodes = oXMLDoc.getElementsByTagName("d:x0e080003")
Set oXMLHREFNodes = oXMLDoc.getElementsByTagName("a:href")
Set oXMLHasSubsNodes = oXMLDoc.getElementsByTagName("a:hassubs")
For i = 0 to oXMLSizeNodes.length - 1
call procfolder(oXMLHREFNodes.Item(i).nodeTypedValue,objgrandchild,objChildRS)
wscript.echo oXMLHREFNodes.Item(i).nodeTypedValue
If oXMLHasSubsNodes.Item(i).nodeTypedValue = True Then
call RecurseFolder(oXMLHREFNodes.Item(i).nodeTypedValue,objChildRS,objgrandchild)
End If
End Sub

sub procfolder(strURL,objgrandchild,objChildRS)

strQuery = "<?xml version=""1.0""?><D:searchrequest xmlns:D = ""DAV:"" xmlns:b=""urn:uuid:c2f41010-65b3-11d1-a29f-00aa00c14882/"">"
strQuery = strQuery & "<D:sql>SELECT ""DAV:displayname"", ""urn:schemas:httpmail:subject"",
strQuery = strQuery & """urn:schemas:httpmail:datereceived"", ""urn:schemas:httpmail:fromname"",
strQuery = strQuery & """urn:schemas:httpmail:fromemail"""
strQuery = strQuery & " FROM scope('shallow traversal of """
strQuery = strQuery & strURL & """') Where ""DAV:ishidden"" = False AND ""DAV:isfolder""
= False AND "
strQuery = strQuery & """urn:schemas:httpmail:hasattachment"" = True AND "
strQuery = strQuery & """urn:schemas:httpmail:datereceived"" &lt; CAST(""" &
dateto & """ as 'dateTime') AND "
strQuery = strQuery & """urn:schemas:httpmail:datereceived"" &gt; CAST(""" &
datefrom & """ as 'dateTime')</D:sql></D:searchrequest>" "SEARCH", strURL, false
req.setrequestheader "Content-Type", "text/xml"
req.setRequestHeader "Translate","f"
req.send strQuery
If req.status >= 500 Then
ElseIf req.status = 207 Then
set oResponseDoc = req.responseXML
set oNodeList = oResponseDoc.getElementsByTagName("a:displayname")
set oNodeList1 = oResponseDoc.getElementsByTagName("a:href")
set oSubject = oResponseDoc.getElementsByTagName("d:subject")
set odatereceived = oResponseDoc.getElementsByTagName("d:datereceived")
set ofromemail = oResponseDoc.getElementsByTagName("d:fromemail")
For i = 0 To (oNodeList.length -1)
set oNode = oNodeList.nextNode
set oNode1 = oNodeList1.nextNode
set oNode2 = oSubject.nextNode
set oNode3 = odatereceived.nextNode
set oNode4 = ofromemail.nextNode
End If

end sub

subject,daterecieved,recievedfrom,davdisplay,objgrandchild,objChildRS) "X-MS-ENUMATTS", objhref, false, "", ""
If req.status > 207 Or req.status < 207 Then
set resDoc1 = req.responseXML
Set objHrefNodeList = resDoc1.getElementsByTagName("a:href")
Set objattachmethod = resDoc1.getElementsByTagName("d:x37050003")
Set objcnval = resDoc1.getElementsByTagName("f:cn")
set objattachname = resDoc1.getElementsByTagName("d:x3704001f")
set objattachfilename = resDoc1.getElementsByTagName("e:attachmentfilename")
set objattsize = resDoc1.getElementsByTagName("d:x0e200003")
If objHrefNodeList.length > 0 Then
For f = 0 To (objHrefNodeList.length -1)
set objHrefNode1 = objHrefNodeList.nextNode
set objNodef = objattachmethod.nextnode
if objattachmethod.length <> 0 then
if objNodef.Text = 5 then
settodav = 0
set objNode1f = objattachfilename.nextNode
if objattachfilename.length = 0 then
if objattachname.length = 0 then
set objNode1f = objcnval.nextNode
if objcnval.length = 0 then settodav = 1
set objNode1f = objattachname.nextNode
end if
end if
fnFileName = objNode1f.Text
if err.number <> 0 then wscript.echo "error" & objHrefNode1.Text
set objNode3f = objattsize.nextnode
attsize = objNode3f.Text

if fnFileName <> "" and clng(attsize/1024) > clng(maxsize) then
wscript.echo fnFileName
fatt1 = len(fnFileName)
lcMaloc = replace(objhref,"%E2%80%99S%20"," ")
lcMaloc = replace(unescape(lcMaloc),davdisplay,"")
lcMaloc = replace(lcMaloc,"http://" & wscript.arguments(0) & "/exchange/" &
wscript.arguments(1) & "/","")
wscript.echo lcMaloc
fatt2 = fatt1 - 2
attname = UCASE(fnFileName)
rtime =
dateserial(mid(daterecieved,1,4),mid(daterecieved,6,2),mid(daterecieved,9,2)) &
" " & mid(daterecieved,12,8)
objgrandchild("GCFileext") = mid(attname,fatt2,3)
objgrandchild("GCFilename") = right(replace(fnFileName,",",""),254)
objgrandchild("GCFilesize") = replace(formatnumber(attsize/1024,2),",","")
objgrandchild("GCMessLocation") = right(replace(lcMaloc,",",""),254)
objgrandchild("GCMessSubject") = right(replace(subject,",",""),254)
objgrandchild("GCMessFrom") = recievedfrom
objgrandchild("GCDateSent") = dateadd("h",toffset,formatdatetime(rtime,0))
elistchk = "," & mid(attname,fatt2,3) & ","
if instr(exlist,elistchk) = 0 then
call adddocext(objChildRS,"Unclassified Files",mid(attname,fatt2,3),"Unknown")

end if
end if
end if
end if
End If
End If
end sub

sub adddoctype(objParentRS,doctype)

objParentRS("RFileType") = doctype

end sub

sub adddocext(objChildRS,doctype,ext,fdesc)

objChildRS("CFileType") = doctype
objChildRS("CFileext") = ext
objChildRS("CFileDesc") = fdesc
exlist = exlist & "," & ext & ","

end sub

sub addfileexts(objChildRS)
call adddocext(objChildRS,"Microsoft Office Documents","DOC","Microsoft Word
call adddocext(objChildRS,"Microsoft Office Documents","DOT","Microsoft Word
call adddocext(objChildRS,"Microsoft Office Documents","XLS","Microsoft Excel
call adddocext(objChildRS,"Microsoft Office Documents","PPT","Microsoft
Powerpoint Presentation")
call adddocext(objChildRS,"Microsoft Office Documents","PPS","Microsoft
Powerpoint Slide Show")
call adddocext(objChildRS,"Microsoft Office Documents","MDB","Microsoft Access
call adddocext(objChildRS,"Microsoft Office Documents","ADP","Microsoft Access
call adddocext(objChildRS,"Microsoft Office Documents","VSD","Microsoft Visio
call adddocext(objChildRS,"Microsoft Office Documents","ONE","Microsoft OneNote
call adddocext(objChildRS,"Microsoft Office Documents","RTF","Rich Text Format
call adddocext(objChildRS,"Microsoft Office Documents","TXT","Text File")
call adddocext(objChildRS,"Microsoft Office Documents","CSV","Comma seperated
call adddocext(objChildRS,"Compressed Files","ZIP","Zip Compressed File")
call adddocext(objChildRS,"Compressed Files","TAR","nix Tar Compressed File")
call adddocext(objChildRS,"Compressed Files","ARG","Arg Compressed File")
call adddocext(objChildRS,"Compressed Files","RAR","RAR Compressed File")
call adddocext(objChildRS,"Compressed Files","ACE","ACE Compressed File")
call adddocext(objChildRS,"Compressed Files","BHX","Binary Hex Compressed File")
call adddocext(objChildRS,"Acorbat Files","PDF","Adobe Acrobat File")
call adddocext(objChildRS,"Executables and Installers","EXE","Executable File")
call adddocext(objChildRS,"Executables and Installers","MSI","Windows
call adddocext(objChildRS,"Sound Files","WAV","Wave File")
call adddocext(objChildRS,"Sound Files","MP3","MPeg3 Sound file")
call adddocext(objChildRS,"Sound Files","WMA","Windows Media file")
call adddocext(objChildRS,"Sound Files","WMV","Windows Media file")
call adddocext(objChildRS,"Sound Files","SND","Windows Sound File")
call adddocext(objChildRS,"Sound Files",".AU","AU Sound File")
call adddocext(objChildRS,"Sound Files","RPM","Real Audio Sound File")
call adddocext(objChildRS,"Sound Files","MID","MIDI Audio Sound File")
call adddocext(objChildRS,"Sound Files",".RM","Real Audio Sound File")
call adddocext(objChildRS,"Sound Files",".RA","Real Audio Sound File")
call adddocext(objChildRS,"Sound Files","ASF","Advanced Streaming format File")
call adddocext(objChildRS,"Video Files","AVI","AVI Video format file")
call adddocext(objChildRS,"Video Files","MPG","MPG Video format file")
call adddocext(objChildRS,"Video Files","MOV","MOV Video format file")
call adddocext(objChildRS,"Video Files","IVX","DIVX Video format file")
call adddocext(objChildRS,"Video Files","PG4","MPG4 Video format file")
call adddocext(objChildRS,"Video Files","SWF","Shockwave format file")
call adddocext(objChildRS,"Image Files","JPG","JPG picture file")
call adddocext(objChildRS,"Image Files","BMP","Bit Map picture file")
call adddocext(objChildRS,"Image Files","GIF","Gif picture file")
call adddocext(objChildRS,"Image Files","PNG","Portable Network graphics picture
call adddocext(objChildRS,"Image Files","TIF","Tag Image picture file")
call adddocext(objChildRS,"Image Files","IFF","Tag Image picture file")
call adddocext(objChildRS,"Image Files","WMF","Windows Metafile file")
call adddocext(objChildRS,"Image Files","EMF","Enhanced Metafile file")
call adddocext(objChildRS,"Image Files","PEG","Enhanced Metafile file")
call adddocext(objChildRS,"Attached Email Message","EML","Attached Email
call adddocext(objChildRS,"Attached Email Message","ICS","Attached Calendar
end sub

Popular posts from this blog

Using the MSAL (Microsoft Authentication Library) in EWS with Office365

Last July Microsoft announced here they would be disabling basic authentication in EWS on October 13 2020 which is now a little over a year away. Given the amount of time that has passed since the announcement any line of business applications or third party applications that you use that had been using Basic authentication should have been modified or upgraded to support using oAuth. If this isn't the case the time to take action is now. When you need to migrate a .NET app or script you have using EWS and basic Authentication you have two Authentication libraries you can choose from ADAL - Azure AD Authentication Library (uses the v1 Azure AD Endpoint) MSAL - Microsoft Authentication Library (uses the v2 Microsoft Identity Platform Endpoint) the most common library you will come across in use is the ADAL libraries because its been around the longest, has good support across a number of languages and allows complex authentications scenarios with support for SAML etc. The

How to test SMTP using Opportunistic TLS with Powershell and grab the public certificate a SMTP server is using

Most email services these day employ Opportunistic TLS when trying to send Messages which means that wherever possible the Messages will be encrypted rather then the plain text legacy of SMTP.  This method was defined in RFC 3207 "SMTP Service Extension for Secure SMTP over Transport Layer Security" and  there's a quite a good explanation of Opportunistic TLS on Wikipedia .  This is used for both Server to Server (eg MTA to MTA) and Client to server (Eg a Message client like Outlook which acts as a MSA) the later being generally Authenticated. Basically it allows you to have a normal plain text SMTP conversation that is then upgraded to TLS using the STARTTLS verb. Not all servers will support this verb so if its not supported then a message is just sent as Plain text. TLS relies on PKI certificates and the administrative issue s that come around certificate management like expired certificates which is why I wrote th

A walk-though using the Graph API Mailbox reports in Powershell

Quite recently the Reporting side of the Graph API has moved in GA from beta, there are quite a number of reports that can be run across various Office365 surfaces but in this post I'm going to focus on the Mailbox related ones. Accessing Office365 Reports using Powershell is nothing new and has been available in the previous reporting endpoint however from the end of January many of these cmdlets are now being depreciated in favour of the Graph API . Prerequisites  In comparison to using the Remote PowerShell cmdlets where only the correct Office365 Admin permissions where needed, to use the new Graph API reports endpoint you need to use OAuth for authentication so this requires an Application Registration  that is then given the correct oAuth Grants to use the Reports EndPoin
All sample scripts and source code is provided by for illustrative purposes only. All examples are untested in different environments and therefore, I cannot guarantee or imply reliability, serviceability, or function of these programs.

All code contained herein is provided to you "AS IS" without any warranties of any kind. The implied warranties of non-infringement, merchantability and fitness for a particular purpose are expressly disclaimed.