Thursday, September 11, 2008

Freeware - Logparser.ZIP

My last project was an OLEDB Log Parser COM input format plugin. One of the comments on it suggested that ‘we’ (meaning I) should work on a way to query inside ZIP files. When I first read the comment I thought the suggestion was very specific and technically very difficult. Several months later I did some more research and found a few more people asking how to do this...

Ta-da! Introducing a Log Parser COM input format ZIP plugin that lets you execute any Log Parser input query (e.g., -i:CSV, -i:IISW3C, -i:TSV) against one or more Zip, GZip, Tar and BZip2 files. For convenience I’ll refer to this set of files as ZIP files.

Features

  • Freeware with VB.NET 2005 source code included
  • Can query Zip files (GZip, Tar and BZip2 files are supported via an extension written by joelangley). To reduce disk requirements, the LogParser.ZIP plugin extracts one file from the zip, runs the Log Parser query against it, deletes the unpacked file, and then repeats for the next file.
  • Supports multiple ZIP files, including wildcards, so a single Log Parser query can span not only multiple files within one ZIP but also multiple ZIP files, given either as a comma-separated list or as wildcards. E.g. "Select * From '2009*.zip', '200812.zip'"
  • Supports all Log Parser input queries and their associated parameters. Note: because the zipped files are unpacked and then deleted, Log Parser's own iCheckPoint input parameter doesn’t work; Log Parser thinks the files have completely changed.
  • The plugin implements iCheckPoint itself so that only the files most recently added to the ZIP are returned (I'm not sure this will always work). This is supported across multiple ZIP files.
  • Implements 2 custom columns (as per standard LogParser): LogParserZIPFilename and LogParserRecordNumber. They are returned in addition to the query's own columns to provide extra querying features.
  • Logs any errors to console and to EventLog.Application
  • Supports all Log Parser data types except TIMESTAMP_TYPE, see challenges below.
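The unpack, query, delete cycle and the wildcard handling described in the features above can be sketched roughly as follows. This is a Python illustration only, not the plugin's VB.NET source. The function names are hypothetical, a plain CSV reader stands in for the Log Parser query, and the choices to put the ZIP path in LogParserZIPFilename and to restart LogParserRecordNumber for each unpacked file are assumptions.

```python
import csv
import glob
import os
import tempfile
import zipfile


def expand_zip_specs(spec):
    """Expand a comma separated list of quoted paths or wildcards into ZIP paths."""
    paths = []
    for part in spec.split(","):
        pattern = part.strip().strip("'")
        paths.extend(sorted(glob.glob(pattern)))
    return paths


def query_zips(spec, work_dir):
    """Yield one row per record while keeping only one unpacked file on disk."""
    for zip_path in expand_zip_specs(spec):
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if member.endswith("/"):
                    continue  # skip directory entries
                unpacked = archive.extract(member, work_dir)
                try:
                    # Stand-in for running the inner Log Parser query on the file.
                    with open(unpacked, newline="") as handle:
                        for number, fields in enumerate(csv.reader(handle), start=1):
                            yield {
                                "LogParserZIPFilename": zip_path,  # assumed meaning
                                "LogParserRecordNumber": number,   # assumed per-file count
                                "Fields": fields,
                            }
                finally:
                    os.remove(unpacked)  # free the disk space before the next member
```

Calling list(query_zips("'2009*.zip', '200812.zip'", work_dir)) would walk every matching archive in turn. Note that the sketch only removes unpacked files, not any subdirectories created during extraction.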

*************************************************

Download. VB.NET source is included in download.

*************************************************
Note: I've given the code a good test, however there may still be unexpected bugs. Because this code overwrites and deletes files, please make sure you test the software yourself first.

To install:
1. Copy the LogParser.ZIP folder to a location on your hard drive.
2. Run the command .\LogParser.ZIP\InstalldotNETasCOM.bat. This needs GACUTIL.EXE, which is part of the .NET Framework 2.0 SDK (Challenge 1a below explains why the GAC is involved). The batch file installs 3 components into the GAC: Gluegood.LogParser.ZIP, Interop.MSUtil.dll (the LogParser wrapper) and ICSharpCode.SharpZipLib.dll.
3. Change to a working directory where LogParser.ZIP can extract, and then delete, the unpacked files and directories. E.g. cd c:\test
4. Run your Log Parser query, e.g.

LogParser.exe -i:COM "select * From 'C:\Test\Test.zip'" -iProgId:Gluegood.LogParser.ZIP -iCOMParams:iQuery="Select * From *.txt",iInputParameters="-i:CSV -HeaderRow:Off" -o:DataGrid

Syntax

LogParser.exe -i:COM "select * From {ZIPFile(s)}" -iProgId:Gluegood.LogParser.ZIP -iCOMParams:iQuery="{LogParserQuery}",iInputParameters="{LogParser InputParameters}" -o:DataGrid
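If you generate these command lines from a script, the three placeholders in the syntax above can be filled in with plain string formatting. A minimal sketch follows. build_command is a hypothetical helper, not part of the download, and it does no escaping, so it assumes none of the values contain double quotes.

```python
def build_command(zip_files, inner_query, input_parameters, output="DataGrid"):
    """Assemble a LogParser.exe command line in the shape of the syntax above."""
    return (
        'LogParser.exe -i:COM "select * From {zips}" '
        "-iProgId:Gluegood.LogParser.ZIP "
        '-iCOMParams:iQuery="{query}",iInputParameters="{params}" '
        "-o:{output}"
    ).format(zips=zip_files, query=inner_query, params=input_parameters, output=output)


# Rebuilds the install example: one ZIP, CSV input without a header row.
print(build_command("'C:\\Test\\Test.zip'", "Select * From *.txt", "-i:CSV -HeaderRow:Off"))
```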

*************************************************

Challenges
1. Calling Log Parser APIs within VB.NET
The LogParser.ZIP COM plugin is a VB.NET program that is COM interop enabled (so that LogParser can call it). It also calls the LogParser COM API itself to execute LogParser queries, which makes the whole interfacing rather technically challenging.

a) GAC, assemblies and DLLs
To enable LogParser to call the LogParser.ZIP COM input plugin you need to give the plugin a strong name, register it, and then publish it to the Global Assembly Cache (GAC). Because LogParser.ZIP resides in the GAC, you also need to create a dotNet wrapper for the Log Parser COM APIs (LogParser.dll), give it a strong name, and place it in the GAC too. After a number of trials I found that the TLBIMP program did just this:

"C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin\tlbimp" "C:\Program Files\Log Parser 2.2\LogParser.dll" /out:Interop.MSUtil.dll /keyfile:"GluegoodStrongKey.snk"

I then needed to register the dotNet wrapper (Interop.MSUtil.dll) using REGASM:

C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\RegAsm.exe .\Interop.MSUtil.dll

I then needed to place Interop.MSUtil.dll into the GAC:

"C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin\gacutil" /i .\Interop.MSUtil.dll

(The InstalldotNETasCOM.bat does all of the above for you)

I then added a reference to the Interop.MSUtil.dll within my VB.Net project and can now call the LogParser COM API.

b) Late binding to a COM DLL using VB.NET reflection
To give the solution the greatest flexibility, I needed to let users specify any input type, e.g. -i:CSV, -i:TSV etc. That meant calling the COM input context class dynamically. From my research this seems achievable only by using reflection. Unfortunately there isn’t a lot of documentation on COM interop reflection using the GAC (a bit unique, I’d suggest). I found 2 useful articles that provided guidance on what to do; I’d like to thank the authors for documenting this.

Effectively I late bound to the COM input class using the code example on this site. One gotcha was the late-bound GAC name I had to use for Interop.MSUtil.dll. Initially I used just Interop.MSUtil, but the late-bound call didn’t work, so I used the full GAC name for the assembly:

Interop.MSUtil, Version=1.0.0.0, Culture=neutral, PublicKeyToken=048dde0ba838787f, processorArchitecture=MSIL

I then had to set the Input Parameters and I used a function called CallByName, which allows you to set object properties by name, instead of early binding (very cool).
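To show the idea behind setting properties by name, here is a rough Python analogue using setattr. It is not the plugin's VB.NET code. FakeCsvInputContext is a hypothetical stand-in for the late-bound input format object, and its two properties are chosen purely for illustration. The parser assumes parameter values contain no spaces.

```python
class FakeCsvInputContext:
    """Hypothetical stand-in for a late-bound Log Parser input format object."""

    def __init__(self):
        self.headerRow = True
        self.iCodepage = 0


def parse_input_parameters(text):
    """Split '-i:CSV -HeaderRow:Off' into the input format and a settings dict."""
    input_format = None
    settings = {}
    for token in text.split():
        name, _, value = token.lstrip("-").partition(":")
        if name.lower() == "i":
            input_format = value
        else:
            settings[name] = value
    return input_format, settings


def apply_settings(target, settings):
    """Set each property by name, matching property names case-insensitively."""
    known = {attr.lower(): attr for attr in vars(target)}
    for name, value in settings.items():
        attr = known.get(name.lower())
        if attr is None:
            raise AttributeError("unknown input parameter: " + name)
        current = getattr(target, attr)
        if isinstance(current, bool):
            value = value.upper() == "ON"  # ON/OFF switches become booleans
        elif isinstance(current, int):
            value = int(value)
        setattr(target, attr, value)  # the by-name assignment
```

Because nothing here is bound at compile time, the same two functions work for any input format object; only the property names the user types have to exist on it.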

2. Deciding on ZIP API
There are 2 major free ZIP APIs available for dotNET developers: the jslib ZIP methods and the #ZIPLib dotNET library. I decided on the latter because it was open source like my solution, it had a smaller footprint (it wasn’t an extra download), and comments in the community tended to favour it. It also provides more than just ZIP extraction, which I thought might be useful. For the sceptics: it was really easy to implement within my code.

3. TIMESTAMP_TYPE datatype conversion issue
This remains unsolved, so if anyone has any ideas let me know. The LogParser.ZIP program extracts each value in the IRecordSet response and places it into a DataTable. In my code I match the DataTable column data types to the LogParser column data types. When I return the values to LogParser, I convert the DataTable column types back to the appropriate LogParser FieldType (reported via the GetFieldType method). For some very strange reason, delivering TIMESTAMP_TYPE values on the sample data I tried performed extremely poorly compared to STRING_TYPE: roughly 5 minutes compared to 2 seconds. I therefore decided not to convert TIMESTAMP_TYPE, and instead pass any TIMESTAMP columns back as STRING_TYPE.
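The workaround amounts to a type mapping in which timestamp columns are deliberately reported as strings. Here is a minimal Python sketch of that idea, again not the actual VB.NET code. The field type labels other than TIMESTAMP_TYPE and STRING_TYPE, the function names, and the timestamp text format are all assumptions made for illustration.

```python
from datetime import datetime

# Illustrative mapping from a column's value type to the field type reported back.
# datetime maps to STRING_TYPE on purpose: that is the workaround described above.
FIELD_TYPE_BY_VALUE_TYPE = {
    int: "INTEGER_TYPE",
    float: "REAL_TYPE",
    str: "STRING_TYPE",
    datetime: "STRING_TYPE",
}


def field_type_for(value_type):
    """Return the field type label to report for a column of the given type."""
    return FIELD_TYPE_BY_VALUE_TYPE.get(value_type, "STRING_TYPE")


def field_value_for(value):
    """Return the value to hand back, rendering timestamps as text."""
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%d %H:%M:%S")  # assumed text format
    return value
```

The trade-off is that the outer query sees those columns as text, so any date arithmetic there needs an explicit conversion back to a timestamp.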

*************************************************

** Legal **
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

5 Comments:

Anonymous said...

I'm getting a 403 on the Logparser.zip.zip download.

Also, until I signed in and then previewed, I got no CAPTCHA.

February 18, 2009 at 5:10 PM  
Anonymous said...

Have you tried the link http://files.openomy.com/public/gluegood/LogParser.ZIP.zip

Seemed to work for me okay.

February 18, 2009 at 10:26 PM  
Anonymous said...

please explain exactly how to install cos im lost

March 11, 2009 at 5:27 PM  
Blogger Greg Cymbala said...

Isn't this code limited to only working with .zip format files? GZip and other compression formats don't seem to be handled in this code.

July 1, 2009 at 9:36 AM  
Blogger Gluegood Software said...

Yes, this is limited to ZIP only. However JoelAngley has improved on the concept and supports other formats. See site - http://joelangley.blogspot.com/2009/06/logparser-unzip-and-parse.html

July 1, 2009 at 7:09 PM  
