This section of my web site is dedicated to one of my pet projects: the Mariner Web Validation Tool.
Brief Story |
The Mariner is a web spider designed to traverse large web sites whilst obeying rules such as Robot Exclusion. It can also be used as a framework for further processing, such as generating statistics or searching the content. The prototype was researched and developed as my main project in my final year at the University of Glamorgan (1997). Its primary objective was academic quality, but I dislike purely theoretical work, so the project was developed to be useful to web developers and site administrators as a link checker for live web sites. On finishing university I re-acquired the copyright and used the project to learn new areas of computing. This effort spawned a development edition that has an improved logical structure and a multithreaded engine. The source was primarily developed on Linux systems; it makes use of glib for its data structure operations and uses POSIX threads heavily. In retrospect the source isn't particularly good: many of the mechanisms are flawed, it probably has some silly bugs in it, and it was still being hacked on when I gave up on it for a year or so. But that isn't the point. In the past I spent a lot of time hacking on sections of the code, and someone might be interested in it. I've used the tool to quickly pull data out of web sites, and someone else may want to do the same. |
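For readers unfamiliar with Robot Exclusion, the idea can be sketched in a few lines. This is an illustration only and is not taken from the Mariner source, which is written in C with glib and POSIX threads. It uses Python's standard library instead, and the names `LinkExtractor`, `allowed_links` and the `example-spider` user agent are hypothetical, chosen for this example. Given one page and the lines of the site's robots.txt, it returns the same-site links a polite spider would be permitted to follow.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def allowed_links(page_url, html, robots_lines, user_agent="example-spider"):
    """Return same-site links from `html` that robots.txt permits.

    `robots_lines` is the robots.txt content as a list of lines. A real
    spider would download it from the site; here it is passed in so the
    sketch needs no network access.
    """
    rules = RobotFileParser()
    rules.parse(robots_lines)

    extractor = LinkExtractor()
    extractor.feed(html)

    site = urlparse(page_url).netloc
    permitted = []
    for href in extractor.links:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlparse(absolute).netloc != site:
            continue  # stay on the site being checked
        if rules.can_fetch(user_agent, absolute):
            permitted.append(absolute)
    return permitted
```

A link checker built on this idea would then request each permitted URL and record any that fail, repeating the process for every page it reaches.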
Download |
As an active supporter of the open source community, I'm going to share my research, and I present three downloads:
I also have my design notes and project report, which I can publish if there is any interest (I have a PDF copy of most of the report). |
Rules |
They are released under the GNU General Public Licence. I would particularly like to hear if the Mariner is referenced in any academic projects. If you use the program (even once, and HATE it) I'd like some feedback, and if you add to, modify or port the code I'd like to merge in the changes. |