RtvLogan User's Manual

This user manual applies to RtvLogan version 10. If you do not have this version, you can download it from the RtvLogan page.

Contents

The General Idea
The Main Screen
Adding Your Logs To RtvLogan
Setting RtvLogan's Options
Data Analysis
Exporting The Data
Why Is My Data File So Small?
Disclaimer

The General Idea

RtvLogan is designed to produce summary statistics from the raw data logs generated by UNIX web servers. I originally wrote this program solely for use with the Demon Internet web server, because those were the only logs I needed to analyse. Since then, however, I have obtained a web account with Clearlight Communications, and it turns out that their log files are similar to Demon's except for the period each file covers: Demon's logs are weekly, whereas Clearlight's are monthly. RtvLogan can analyse both types of log, and logs covering any period in between.

Statistics are produced in three categories. Firstly, RtvLogan looks at the files served by the web server. This is the primary function of log analysis, because it allows you to figure out which pages are the most popular, and how many times certain files or other pages have been downloaded. RtvLogan can provide as much or as little detail as you need; the entire length of all logs can be examined, or just a specific weekend or set of days.

Secondly, it looks at where the accesses come from. Unlike some perhaps more sophisticated applications, RtvLogan distinguishes accesses only by the final part of the visitor's host name (the top-level domain). For example, if I were to access a web site from my UK provider, my host name would be rtvsoft.demon.co.uk, and hence RtvLogan classes this as a "uk" access.
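As a rough illustration of this classification, the sketch below takes the last dot-separated label of a host name. It is not RtvLogan's actual code; the function name classify_location and the "unresolved" bucket for raw numeric addresses are assumptions made for the example.

```python
def classify_location(host: str) -> str:
    """Return the final dot-separated label of a host name, lower-cased."""
    host = host.strip().rstrip(".")
    last = host.rsplit(".", 1)[-1].lower()
    # A numeric (or empty) final label suggests the log recorded a raw
    # address rather than a name. Grouping these as "unresolved" is an
    # assumption for this sketch, not documented RtvLogan behaviour.
    if not last or last.isdigit():
        return "unresolved"
    return last


print(classify_location("rtvsoft.demon.co.uk"))  # uk
```

Note that this approach cannot tell where a ".com" or ".net" visitor is located; it only reports the label itself.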

Lastly, you can see how many errors are produced when pages are accessed. Normally, a page is accessed successfully and produces the "normal" status code (200), which is not really an error at all. Sometimes, however, the page has been reached through a redirection (code 302), or didn't have the correct permissions attached to it (code 403). Often a link to your pages is outdated and the page no longer exists; this produces the infamous "not found" error (code 404). Error analysis allows you to see how many non-fatal errors are occurring at your site.
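To make the error category concrete, here is a minimal sketch that tallies status codes from log lines. It assumes the logs follow the widely used Common Log Format (host, ident, user, date, request, status, bytes); this manual does not state the exact layout of the Demon or Clearlight logs, and the sample lines and the name count_status_codes are invented for illustration.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
#   host ident user [date] "request" status bytes
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)$')


def count_status_codes(lines):
    """Count how often each status code appears; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[int(match.group(4))] += 1
    return counts


# Invented sample data, for illustration only.
sample = [
    'rtvsoft.demon.co.uk - - [25/Feb/2003:07:25:00 +0000] "GET /index.htm HTTP/1.0" 200 1043',
    'host.example.com - - [25/Feb/2003:07:26:10 +0000] "GET /old.htm HTTP/1.0" 404 210',
    'host.example.com - - [25/Feb/2003:07:27:30 +0000] "GET /index.htm HTTP/1.0" 200 1043',
]

for code, hits in sorted(count_status_codes(sample).items()):
    print(code, hits)
```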

The Main Screen

When you first start RtvLogan, you are presented with a Windows 95-style property sheet which serves as the main screen. The picture below shows what you might expect to see:

[Screenshot: Main Screen]

All of RtvLogan's options are set by navigating through the tabs at the top of the screen. The analysis is performed by pressing the "Analyse Now..." button shown in the main screen. This is discussed in more detail below.

From this screen, you would select the date range your main data file contains (or any sub-period in that range), choose the type of analysis you wish to look at (by file, location or error), and then actually do the analysis.

Adding Your Logs To RtvLogan

The screen below shows the parsing screen in RtvLogan, which is where you add your log files to the main data file. When you first start RtvLogan, it sets up default locations for your log files and your main data file; you might find it easier to change these settings first, before you continue. See "Setting RtvLogan's Options" below for how to do this.

[Screenshot: Adding Files To RtvLogan]

To add a file to the parsing selection, simply press the "Add File..." button. This brings up a standard "file open" dialog box where you can select your file. Since many users will have a collection of log files that haven't been analysed, you may find it easier to select all the files at once and then press the "OK" button in the "file open" dialog. This will take the names of all the files and put them into the list box for you.

It is extremely important that you don't add a log file twice, otherwise your analysis will be wrong. Currently, RtvLogan doesn't have a feature to detect a log file that has been added more than once, which means that it will happily add the analysis to its data file and essentially "double" the counters for the files and locations.

Once you are satisfied with the selection, press the "Add To Main Data File..." button to begin the parsing process. You can't interrupt this process, so RtvLogan tries to provide feedback on how it's getting on. When all the files you have selected have been parsed, RtvLogan shows a quick summary of the time taken and the number of lines parsed.

Setting RtvLogan's Options

If you click on the "Options" tab, the following screen will be displayed:

[Screenshot: Setting Options]

From here, you can set various options including the locations of the files you are going to work with, and the size of the log files.

It is highly recommended that you leave the data file location as the default value. When you start RtvLogan for the first time, the data file location is chosen to be the directory in which you installed RtvLogan. If you change this location, you will need to re-parse all your logs from scratch. The data file itself is called RtvLogan.dat. If you have problems with your logs, perhaps because you have added a file twice, you can delete this file and start again.

As noted above, RtvLogan supports log files of varying length. Unless you are sure your log files are only 1 or 2 weeks in length, it is sensible to leave the "log period" value at "Monthly". Work on the basis that more is better; a 1 week log file obviously fits into a 1 month period, but the converse is not true and can cause serious problems.

The last setting available from the tab is the amount of memory you want to give RtvLogan whilst it parses the files. Try to allocate the maximum amount of memory you can spare from your system. This memory is only used during the parse itself, and is returned to the system as soon as the parse operation finishes, so you don't need to worry about RtvLogan "hogging" the system memory. Parsing is considerably faster with more memory allocated.

Data Analysis

The data analysis box is really the heart of RtvLogan, and can be invoked by pressing the "Analyse Now..." button in the main screen (as described above). Here is a typical analysis box:

[Screenshot: Data Analysis By File]

The analysis is organised in a "report" view; this means that you see the file, location, or error code on the left-hand side, and then the numerical values associated with these items to the right. If you want to sort by any of the columns, simply click the title of the column. To make more space available (for instance, if you can't read the entire name of a column or perhaps one of the items in the column), simply place the mouse over the column divider (in the heading row) and drag the divider to resize the column.

Another example of the analysis would be to do it by location. The picture below shows a typical example of this:

[Screenshot: Data Analysis By Location]

The numerical columns show the total number of "hits" for each item over the period being analysed, the average number of hits per day, and the highest and lowest daily hits over the period. The average and lowest figures can be adjusted to include or exclude days on which no hits were recorded at all, by changing the "Ignore Zero-Hit Days" check box.
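The effect of the "Ignore Zero-Hit Days" check box on these columns can be illustrated with a small sketch. This is one plausible reading of the description above, not RtvLogan's actual code; the function name summarise_hits, its parameters, and the sample figures are assumptions.

```python
from datetime import date, timedelta


def summarise_hits(daily_hits, start, end, ignore_zero_hit_days=True):
    """Summarise one item's hits for every day from start to end inclusive.

    daily_hits maps a date to that day's hit count; missing days count as 0.
    """
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    counts = [daily_hits.get(day, 0) for day in days]
    # With the option on, days without hits are left out of the
    # average and lowest figures; total and highest are unaffected.
    considered = [c for c in counts if c > 0] if ignore_zero_hit_days else counts
    if not considered:
        return {"total": 0, "average": 0.0, "highest": 0, "lowest": 0}
    total = sum(counts)
    return {
        "total": total,
        "average": total / len(considered),
        "highest": max(counts),
        "lowest": min(considered),
    }


# Invented figures: hits on two days out of a four-day period.
hits = {date(2003, 2, 1): 10, date(2003, 2, 3): 20}
print(summarise_hits(hits, date(2003, 2, 1), date(2003, 2, 4), True))
print(summarise_hits(hits, date(2003, 2, 1), date(2003, 2, 4), False))
```

With zero-hit days ignored, the average here is 30 / 2 = 15 and the lowest is 10; with them included, the average is 30 / 4 = 7.5 and the lowest is 0.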

Normally, webmasters are only interested in certain files or pages at their sites, but RtvLogan defaults to displaying all items found in the log files. To display only a subset of the items, press the "Filter..." button. The following dialog box will be displayed:

[Screenshot: Filtering The Analysis]

Initially, all items will be set to display in the analysis. To add items to the "Don't Display" category, either double-click each item in the left-hand list box, or select the item and press the "Add" button. Similarly, to remove items from the filtered box, either double-click the item or press the "Remove" button.

For the filter to be active, be sure you have selected the "Enable Filter" check box as shown above. The filtered items and the state of the "Enable Filter" box are saved between RtvLogan sessions so you can simply add a new log file at a later date and still see only the items you want.
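In effect, the filter is a set of "Don't Display" items that is only consulted while the "Enable Filter" box is ticked. A minimal sketch of that behaviour follows; the name apply_filter and the row layout (item name first) are assumptions for illustration, not RtvLogan's internals.

```python
def apply_filter(rows, hidden_items, filter_enabled):
    """Return the rows to display; each row starts with the item name."""
    if not filter_enabled:
        # Filter switched off: everything is shown, but the hidden
        # list itself is left untouched for later use.
        return list(rows)
    return [row for row in rows if row[0] not in hidden_items]


# Invented sample rows: (item, total hits).
rows = [("/index.htm", 30), ("/images/logo.gif", 28), ("/manual.htm", 12)]
hidden = {"/images/logo.gif"}

print(apply_filter(rows, hidden, True))
print(apply_filter(rows, hidden, False))
```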

Exporting The Data

There are two ways that you can export the data analysis to other applications: copying and saving as a file. A third method, printing the analysis directly from RtvLogan, will be added in a later version. Both existing methods are accessible from the data analysis dialog box shown above.

If you press the "Copy" button, the data is copied to the clipboard in a format that allows you to paste the analysis directly into a spreadsheet program like Microsoft Excel or Lotus 1-2-3. The data is actually copied as tab-delimited text, making it ideal for use in word-processing applications like Microsoft Word as well.

For those who prefer files to be generated, you can choose the "Export..." button in the data analysis dialog box. This will bring up the following dialog box:

[Screenshot: Exporting The Data]

In this dialog box, you can choose the name and directory of the file you wish to export to, and the format of the file. Three formats are available, as shown in the above picture.

Excel 2.1 format is compatible with Excel 2.1 and all later versions, as well as most other spreadsheet programs. It may or may not be compatible with your word-processing application, depending on whether the appropriate file filters have been installed (this is normally an option during the program's setup procedure). Where possible, you should use this format because the receiving application does not have to parse the resulting file; all the columns are predefined in this format.

Comma delimited and tab delimited formats produce standard ASCII text files, with the columns separated by either commas or tabs. Both of these formats are widely understood by a slew of other applications, so use these if nothing else works.
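For readers who want to see what the two text formats look like, the sketch below writes the same rows as comma-delimited and tab-delimited text using Python's standard csv module. The column names and values are invented for the example, and the exact quoting and line endings RtvLogan uses are not documented here, so treat the details as assumptions.

```python
import csv
import io


def export_rows(rows, header, out, delimiter=","):
    """Write a header line and data rows to an open text stream."""
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\r\n")
    writer.writerow(header)
    writer.writerows(rows)


# Invented column names and figures, for illustration only.
header = ["File", "Total", "Average", "Highest", "Lowest"]
rows = [
    ["/index.htm", 30, 15.0, 20, 10],
    ["/files/a,b.zip", 4, 2.0, 3, 1],  # the comma forces quoting in CSV
]

# StringIO stands in for a real file; when writing to disk, open the
# file with newline="" so the csv module controls the line endings.
for delimiter in (",", "\t"):
    buffer = io.StringIO()
    export_rows(rows, header, buffer, delimiter=delimiter)
    print(buffer.getvalue())
```

The second row shows why a tab-delimited file can be the safer choice when item names may themselves contain commas.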

Why Is My Data File So Small?

RtvLogan creates a main list of file names and locations in its data file. As a result, to store the number of accesses for a particular file on a given date, all it needs to know is the "index" into the list for that file, and then the date and number of accesses.
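The idea of a main list plus an index can be sketched as follows. This is only an illustration of the technique, not the real layout of RtvLogan.dat; the class name NameTable, the record tuple, and the sample file names are all assumptions.

```python
class NameTable:
    """Stores each distinct name once and hands out its position in the list."""

    def __init__(self):
        self.names = []    # the main list: one entry per distinct name
        self._index = {}   # name -> position in self.names

    def index_of(self, name):
        if name not in self._index:
            self._index[name] = len(self.names)
            self.names.append(name)
        return self._index[name]


files = NameTable()
records = []  # each record: (file index, date, number of accesses)

records.append((files.index_of("/index.htm"), "2003-02-24", 12))
records.append((files.index_of("/download/example.zip"), "2003-02-24", 3))
records.append((files.index_of("/index.htm"), "2003-02-25", 9))

print(files.names)
print(records)
```

The name "/index.htm" is stored once in the list, and both of its daily records refer to it by the same small integer.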

The storage allocated for a file name is a fixed 200 bytes (i.e. 200 characters). If you have a typical 150 files on your web site, this amounts to 30,000 bytes to store the names of all files once. For one week's worth of data, assuming analysis by day only, storing the full name against every file on every day would amount to 210,000 bytes (30,000 bytes x 7 days). However, using the alternative "list" system, we only need to store a 4-byte index per file name (or location) per day, as opposed to 200 bytes. This amounts to 4,200 bytes using the numbers above (150 x 4 x 7), on top of the 30,000-byte list of names that is stored only once, which is a significant saving.
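The arithmetic above can be checked with a few lines of code. The figures are the ones quoted in the text; the variable names are just labels for this sketch, and the bytes needed for the dates and hit counts themselves are left out, as they are in the text.

```python
NAME_BYTES = 200    # fixed storage per file name
INDEX_BYTES = 4     # storage per index into the name list
FILES = 150         # typical number of files on a site
DAYS = 7            # one week of data, analysed by day

name_list = FILES * NAME_BYTES                  # names stored once
names_every_day = FILES * NAME_BYTES * DAYS     # full name in every record
indexes_every_day = FILES * INDEX_BYTES * DAYS  # index in every record

print(name_list)                      # 30000
print(names_every_day)                # 210000
print(indexes_every_day)              # 4200
print(name_list + indexes_every_day)  # 34200, list plus one week of indexes
```

Even after counting the one-off name list, the index scheme needs 34,200 bytes for the week against 210,000, and each further week adds only another 4,200.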

You can see the lists that RtvLogan uses by pressing the "File Dump..." button in the main screen. This presents you with 3 lists, matching the three types of analysis you can do: by file, by location, and by error. In reality, the location and error lists save a lot less space because the location is normally only 3 or 4 characters long, and the errors are simple numbers anyway (4 bytes of storage). However, by keeping lists of all items RtvLogan is able to manipulate its use of storage to maximum effect.

Disclaimer

RtvLogan is provided "AS IS" without warranty of any kind, either express or implied. In no event shall Jeremy Gelber or RTV Software be held liable for any damages whatsoever including direct, indirect, incidental, consequential, loss of business profits or special damages.

 

Send mail to webmaster@rtvsoft.com with questions or comments about this web site.
Copyright 2003 RTV Software
Last modified: Tuesday, 25 February 2003 07:25