Getting started with StatSVN (0.7.0) and CollabNet Subversion Server

This past week I was looking at advanced statistical information about a couple of Subversion repositories we use at work.

While TortoiseSVN has some basic reporting, the downside is that, out of the box, users must have access to the repository to access this information.

StatSVN, seemingly the most popular solution, works rather well as an alternative to granting this access. The downside (or upside, depending upon your perspective) is that viewers of the report can see what files changed, and how many lines, but not what the actual changes were (outside of the logged message).

At least for our implementation, the lines-of-code statistic, which StatSVN seems to stress, is also fairly useless for certain commits, which in turn throws off the statistics for the entire repository. (We use third-party code, and because I was the user who committed that code at a couple of different points, my numbers were inflated.)

Having worked through part of the process at work, refined it at home, and performed a second implementation at work, I present the following steps to implement StatSVN with a CollabNet Subversion Server installation.

Note that these steps, with minor modifications, will work fine with any Windows and Apache-based installation of Subversion.

The basic steps

  1. If using authentication, create a user (if one doesn't already exist) that has access to the repositories and can be used to check out the repositories to report on.
  2. If necessary, download and install a Java runtime (1.4 or later), which StatSVN requires.
  3. Download and extract StatSVN to a directory of your choosing. For example, C:\statsvn
  4. Create a directory to store the generated content from StatSVN.
  5. Update Apache to allow access to the reports directory (from step 4).
  6. Check out or update a working copy of the repository to report on.
  7. Generate the StatSVN reports for the working copy of the repository (from step 6).

To make this even easier, I'm including the implementation I've set up at home.

Example implementation

StatSVN directory: C:\statsvn

StatSVN outputs directory: C:\statsvn\output

Working copies of repositories are saved to the C:\statsvn directory (C:\statsvn\repositoryName).

From the command line at C:\Program Files (x86)\CollabNet\Subversion Server\httpd\bin I ran the following:

htpasswd -m c:\svn_resources\svn-auth-file james-cq5320y

I was then prompted for a password, which I entered twice. This gives me an account, specific to this server, that I can use within a batch file to check out or update repositories.

Next, the Apache httpd.conf file needed the following two additions (where applicable):

Alias /stats "C:/statsvn/output"

That allows http://server1:8080/stats to point to the appropriate directory.

<Directory "C:/statsvn/output">
    Options Indexes
    Allow from all
</Directory>

The above lets Apache return a directory listing that anyone can access (the alternative is to manually create an index page, either static or dynamic). Since this server is only reachable from my own network, this works fine for my situation.
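
If you would rather not rely on directory listings, a static index page can be generated with a short batch file. What follows is only a minimal sketch. It assumes each report sits in its own folder under C:\statsvn\output and that StatSVN has written an index.html into each of those folders.

```batch
@echo off
REM Sketch only: builds a static index.html that links to each report folder.
REM Assumes the output directory used in this post.
cd /d C:\statsvn\output

REM Start the page, overwriting any previous index.html.
> index.html echo ^<html^>^<body^>^<h1^>StatSVN reports^</h1^>^<ul^>

REM Add one link per subfolder (for example NorthwindExamples.Stats).
for /d %%D in (*) do >> index.html echo ^<li^>^<a href="%%D/index.html"^>%%D^</a^>^</li^>

REM Close the page.
>> index.html echo ^</ul^>^</body^>^</html^>
```

The carets escape the angle brackets so that cmd.exe treats them as text rather than redirection. Running this after each report run keeps the page current.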

Since we've modified the Apache configuration, we need to restart the Apache service.

Next we create the batch file that will generate the StatSVN outputs.

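REM Run from C:\statsvn, where statsvn.jar lives.
set repositoryName=NorthwindExamples
REM Initial checkout; replace PASSWORD with the account's password.
svn checkout http://localhost:8080/svn/%repositoryName% --username james-cq5320y --password PASSWORD
cd %repositoryName%
REM StatSVN reads the verbose XML log.
svn log -v --xml > ..\%repositoryName%.log
cd ..
java -jar statsvn.jar -output-dir output\%repositoryName%.Stats %repositoryName%.log %repositoryName%
REM pause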

After running this a couple of times I felt satisfied that it was working correctly, so you'll see that I commented out the pause.

Of course, that works fine for the initial checkout. Once you have a working copy you can simply update it, as in the following batch file.

set repositoryName=NorthwindExamples
cd %repositoryName%
REM Bring the existing working copy up to date.
svn update
svn log -v --xml > ..\%repositoryName%.log
cd ..
java -jar statsvn.jar -output-dir output\%repositoryName%.Stats %repositoryName%.log %repositoryName%
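
The checkout and update batch files can also be folded into one. This is a sketch rather than something I've run for long: it assumes the file sits in C:\statsvn next to statsvn.jar, and it treats the presence of a .svn folder as the sign that a working copy already exists.

```batch
@echo off
REM Sketch only: combines the checkout and update batch files. Run from C:\statsvn.
set repositoryName=NorthwindExamples

REM A .svn folder means the working copy already exists, so update it;
REM otherwise do the initial checkout.
if exist %repositoryName%\.svn (
    svn update %repositoryName%
) else (
    svn checkout http://localhost:8080/svn/%repositoryName% --username james-cq5320y --password PASSWORD
)

REM Generate the verbose XML log that StatSVN reads.
cd %repositoryName%
svn log -v --xml > ..\%repositoryName%.log
cd ..

REM Build the report from the log and the working copy.
java -jar statsvn.jar -output-dir output\%repositoryName%.Stats %repositoryName%.log %repositoryName%
```

With this in place the same file can be scheduled without caring whether it is the first run.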


Of course, some optimizations could be made, as currently this makes a bit of a mess of the StatSVN directory.

Because of this, I'd recommend creating separate directories for the working copies and for the generated logs.
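
As a rough sketch of that tidier layout, the update batch file could look like the following. The folder names working and logs are my own assumptions, and the sketch assumes the working copy was originally checked out to working\NorthwindExamples rather than directly under C:\statsvn.

```batch
@echo off
REM Sketch only: "working" and "logs" are assumed folder names, not part of the setup above.
REM Run from C:\statsvn. Assumes the working copy lives in working\%repositoryName%.
set repositoryName=NorthwindExamples

REM Create the logs folder on first run.
if not exist logs mkdir logs

REM Passing paths to svn avoids changing directories.
svn update working\%repositoryName%
svn log -v --xml working\%repositoryName% > logs\%repositoryName%.log

REM Reports still go to the directory Apache serves.
java -jar statsvn.jar -output-dir output\%repositoryName%.Stats logs\%repositoryName%.log working\%repositoryName%
```

This leaves C:\statsvn holding only the StatSVN files, the batch files, and the three folders (working, logs, and output).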