New features: Users, Applications, Online status

April 8th, 2008

Over the last few days I did some development on ParaDe. Here are the results:

User support

If you go to the front page of ParaDe, you will be asked to fill in some information about yourself. For now this is only so that it can be displayed somewhere in ParaDe. If you have a Gmail account, one idea is to use it for a ParaDe Gtalk bot, but that is just an idea for the moment.

Application support

This is not directly visible yet, but will be useful later. There’s now an applications.properties file in the same directory as rows.properties, which maps each application to its CVS module. That way ParaDe knows all the files in the application’s CVS repository.

By default, unless specified otherwise in rows.properties, rows use the application configured in parade.properties (the “parade.applications.default” property).
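
To make the fallback rule concrete, here is a minimal sketch of how such a lookup could work. Only the “parade.applications.default” property name comes from ParaDe; the per-row key (“rowdata.<rowName>.application”) and the application names are made up for illustration, and the real code may well look different.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class RowApplicationResolver {
    // Property name taken from the post.
    static final String DEFAULT_KEY = "parade.applications.default";

    // Hypothetical key for a per-row override; the real rows.properties syntax may differ.
    static String rowKey(String rowName) {
        return "rowdata." + rowName + ".application";
    }

    // Returns the row's own application if one is set, otherwise the configured default.
    static String resolve(String rowName, Properties rows, Properties parade) {
        String explicit = rows.getProperty(rowKey(rowName));
        if (explicit != null && !explicit.trim().isEmpty()) {
            return explicit.trim();
        }
        return parade.getProperty(DEFAULT_KEY);
    }

    // Parses properties from text, standing in for reading the files in WEB-INF/classes.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties parade = parse("parade.applications.default=appA\n");
        Properties rows = parse("rowdata.alice.application=appB\n");
        System.out.println(resolve("alice", rows, parade)); // row with its own application
        System.out.println(resolve("bob", rows, parade));   // row falling back to the default
    }
}
```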

Later on this will be used to support CVS update of a context (i.e. ParaDe will show you that you need to update 10 files, and make it possible to update them quickly).

Session tracking (aka “online users”)

I made a small listener that tracks all the sessions opened on ParaDe. If you go to the front page, you can now see who is online (i.e. who has a session). This may not be entirely accurate, since there might be issues with session expiration. I know it doesn’t work super-well yet, but I will improve it, since it’s cool.
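
For the curious, the bookkeeping behind such a list can be sketched in a few lines. This is not ParaDe’s actual listener: the class and method names are invented, and in a servlet container the two hooks would be called from the sessionCreated and sessionDestroyed methods of an HttpSessionListener. It also assumes the user name is already known when the session is registered.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

final class OnlineUsersSketch {
    // session id -> user name; concurrent because sessions come and go on request threads
    private final Map<String, String> userBySession = new ConcurrentHashMap<>();

    // Would be called when a session is created for a known user.
    void sessionOpened(String sessionId, String userName) {
        userBySession.put(sessionId, userName);
    }

    // Would be called when the container destroys the session (logout or expiration).
    void sessionClosed(String sessionId) {
        userBySession.remove(sessionId);
    }

    // A user is "online" as long as at least one of their sessions is still registered.
    Set<String> onlineUsers() {
        return new TreeSet<>(userBySession.values());
    }

    public static void main(String[] args) {
        OnlineUsersSketch online = new OnlineUsersSketch();
        online.sessionOpened("s1", "alice");
        online.sessionOpened("s2", "bob");
        online.sessionClosed("s2");
        System.out.println(online.onlineUsers());
    }
}
```

It also shows where the inaccuracy comes from: a user who simply closes the browser stays in the map until the container expires the session.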

By the way, if someone knows something about ajax-based chats or such, and has some time to give me a hand, we could put a small Gmail-like chat in ParaDe. I don’t have much time for developing this kind of thing now, but I guess it would be a useful feature.

Improved error pages

Well, maybe you saw them already if you tried to access ParaDe while it was restarting. Basically, instead of getting an ugly error page from tomcat, you get something that looks slightly nicer.

CVS hook

This isn’t in place yet, but once it is, ParaDe will be notified of every commit. Besides informing people that something changed, it could for instance be used to generate statistics (since ParaDe knows that if Bob commits something in Alice’s repository, it was still Bob’s commit).

More documentation

February 27th, 2008

Some ideas of other things to document:

  • the ParaDe cache model
  • various Hibernate tweaks, and ParaDe’s optimised cache configuration
  • the templating mechanism
  • ParaDe servlets and ViewManagers
  • Parade model managers: CVSManager, FileManager, RowManager, AntManager, MakumbaManager
  • ParaDe access control
  • the complex logging mechanism and ideas on how to improve it
  • the TriggerFilter mechanism

ParaDe and JNotify

February 27th, 2008

In this post I’ll try to provide some insights on how JNotify is used by ParaDe to keep its cache up-to-date.

Some words about JNotify

JNotify is a Java library that allows Java applications to receive file system events on Windows and Linux. It is basically a Java wrapper around events issued by the OS, making it possible to watch given files.

In order to do so, one needs to add watches (represented by a JNotify listener) on a given path, with several other options (whether to watch files in sub-directories, which kind of actions to watch, …).

The possible actions to watch are:

  • creation of a file
  • deletion of a file
  • modification of a file
  • name change of a file

On registration, JNotify informs the OS that it wants to add a watch on the given directory, or on a file in that directory. Linux needs one watch per file (directory watches don’t inform of changes in the files of a directory), hence it is useful to increase the max_user_watches limit in /proc/sys/fs/inotify.

JNotify listeners in ParaDe

ParaDe registers its watches on startup, through org.makumba.parade.model.Parade#addJNotifyListeners(). It only registers user rows, not the root row (i.e. the ParaDe webapp itself).

Depending on the event, it then performs specific operations on the concerned file, or rather on its cache. Those actions are carried out in a single Hibernate transaction, and before any cache update is carried out, two checks are performed:

File name filtering

The org.makumba.parade.tools.SimpleFileFilter specifies which files are relevant to the ParaDe cache. The following files are excluded:

  • temporary files
  • .class files
  • CVS-related files (directories, temporary files due to merge/update)
  • tomcat-specific directories (work, logs)
  • internal ParaDe files (starting with “_new_”)
  • temporary Unison files
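
A filter along those lines could look roughly like the sketch below. It is not the real SimpleFileFilter: apart from “.class”, “CVS”, “work”, “logs” and the “_new_” prefix, which are mentioned above, the concrete patterns (the “~”, “.tmp” and “.swp” suffixes, the “.#” prefix of CVS merge backups, the “.unison.” prefix of Unison temporaries) are my assumptions. Paths are taken relative to the row root, with “/” as separator.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CacheFileFilterSketch {
    // Directories that never go into the cache: CVS metadata and tomcat's work/logs.
    private static final Set<String> SKIPPED_DIRS =
            new HashSet<>(Arrays.asList("CVS", "work", "logs"));

    // Returns true if the path (relative to the row root) should be cached.
    static boolean isRelevant(String relativePath) {
        String[] segments = relativePath.split("/");
        if (segments.length == 0) {
            return false; // nothing left to look at, e.g. for "/"
        }
        for (String segment : segments) {
            if (SKIPPED_DIRS.contains(segment)) {
                return false; // the skipped directory itself or anything below it
            }
        }
        String name = segments[segments.length - 1];
        if (name.endsWith(".class")) return false;      // compiled classes
        if (name.startsWith("_new_")) return false;     // internal ParaDe files
        if (name.startsWith(".#")) return false;        // backups CVS leaves after a merge
        if (name.startsWith(".unison.")) return false;  // assumed Unison temporary naming
        if (name.endsWith("~") || name.endsWith(".tmp") || name.endsWith(".swp")) {
            return false;                               // assumed temporary-file suffixes
        }
        return true;
    }

    public static void main(String[] args) {
        for (String path : new String[] {"pages/index.jsp", "CVS/Entries", "classes/Foo.class"}) {
            System.out.println(path + " -> " + isRelevant(path));
        }
    }
}
```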

Lock verification

It can happen that some actions lead to a massive modification or change in the file system. In this case, performing a per-file cache update through JNotify is very resource-consuming, since the JNotify listeners don’t have a context and need to query for the cache object on every access, triggering one transaction per file.

For this reason, ParaDe has per-file and per-directory locking mechanisms: when a CVS update is triggered from the ParaDe UI, it automatically creates a lock, i.e. a file that is named after the file or directory to lock and ends with “.parade-lock~”. The lock is created before the operation is performed and removed afterwards. That way the JNotify listener can detect a lock and ignore events occurring at this specific path.

Locks are created and removed through the org.makumba.parade.model.Parade#createFileLock(), org.makumba.parade.model.Parade#removeFileLock(), org.makumba.parade.model.Parade#createDirectoryLock() and org.makumba.parade.model.Parade#removeDirectoryLock() methods. These are called on:

  • local and recursive CVS update
  • CVS add, addBinary, delete, update
  • file creation/editing/deletion through the ParaDe UI
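
To illustrate the idea, here is a small sketch of such a lock file mechanism. Only the “.parade-lock~” suffix comes from ParaDe; the class and method names are invented (they are not the Parade#create…Lock() methods), and placing the lock file next to the locked file or directory is an assumption on my part.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ParadeLockSketch {
    static final String LOCK_SUFFIX = ".parade-lock~";

    // Lock marker for a target: a sibling file named after it (the placement is an assumption).
    static Path lockFor(Path target) {
        return target.resolveSibling(target.getFileName() + LOCK_SUFFIX);
    }

    // What a listener would check: is the path itself or any enclosing directory locked?
    static boolean isLocked(Path path) {
        for (Path p = path.toAbsolutePath(); p != null && p.getFileName() != null; p = p.getParent()) {
            if (Files.exists(lockFor(p))) {
                return true;
            }
        }
        return false;
    }

    // Creates the lock, runs the operation, and always removes the lock afterwards.
    static void withLock(Path target, Runnable operation) throws IOException {
        Path lock = lockFor(target);
        Files.createFile(lock);
        try {
            operation.run();
        } finally {
            Files.deleteIfExists(lock);
        }
    }

    public static void main(String[] args) throws IOException {
        Path row = Files.createTempDirectory("row");
        Path page = Files.createFile(row.resolve("index.jsp"));
        System.out.println("before: " + isLocked(page));
        withLock(row, () -> System.out.println("during: " + isLocked(page)));
        System.out.println("after: " + isLocked(page));
    }
}
```

Removing the lock in a finally block matters: a lock left behind after a failed operation would make the listener ignore that path forever.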

Limitations of the current mechanism

Even though it is heavily optimised for performance, the current mechanism has some drawbacks and limitations. Those are:

  • the lack of context on event notification. Whenever an event is fired, the JNotify listener needs to re-build its context, which leads to an expensive Hibernate transaction. In the end, this puts a high load on the server, especially in case of a massive update (e.g. a Unison synchronisation that brings in many events)
  • the lack of control over which files to watch. It would be much more convenient to be able to select the files to watch by a set of names / patterns, instead of watching every single file and then filtering events out within ParaDe (through the SimpleFileFilter). Furthermore, this kind of “global watching” can lead to misleading error messages, e.g. when a temporary Unison file is created and removed right afterwards (the error message then looks like “registerToSubTree : warning, failed to register /path/to/file: Permission denied”).
  • the high dependency on the OS: JNotify is a workaround for Java’s lack of support for such OS file events. It works differently on Windows and Linux and doesn’t work at all on Mac OS, hence breaking ParaDe’s platform independence.

Those shortcomings may eventually be addressed by the implementation of JSR 203 as part of Java 7.
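
To give an idea of what that could look like, here is a small sketch using the WatchService API from java.nio.file, which is what JSR 203 provides. This is not ParaDe code, and the class and method names are mine. Note that a WatchService registration covers one directory, is not recursive, and filters by event kind rather than by file name, so it would mostly help with the OS dependency.

```java
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;
import static java.nio.file.StandardWatchEventKinds.OVERFLOW;

import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

final class WatchServiceSketch {
    // Waits up to timeoutSeconds for a batch of events and describes each one as "KIND name".
    static List<String> awaitChanges(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        List<String> changes = new ArrayList<>();
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return changes; // nothing happened within the timeout
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == OVERFLOW) {
                continue; // events were lost; a real listener would rescan the directory
            }
            changes.add(event.kind().name() + " " + event.context());
        }
        key.reset(); // re-arm the key so that further events are delivered
        return changes;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("row");
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, ENTRY_CREATE, ENTRY_DELETE, ENTRY_MODIFY);
            Files.createFile(dir.resolve("index.jsp"));
            System.out.println(awaitChanges(watcher, 10));
        }
    }
}
```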

ParaDe start-up

February 24th, 2008

This is a short explanation of what happens on ParaDe startup.

Initialisation

org.makumba.parade.init contains the classes involved in initialisation. Amongst those, the InitServlet plays an important role. It is configured in the global web.xml to run at startup and tries to load the Parade object from the Hibernate cache. If it finds such an object, it loads it; otherwise, it creates a new one, performs a global cache refresh on ParaDe and adds the JNotify listeners to the rows.

Note that if the database is empty at this point, Hibernate will automatically create all the tables. At startup, Hibernate also performs a schema update, i.e. it checks whether the schema of the database corresponds to that of the model and performs the necessary steps if it doesn’t. InitServlet is also the starting point for Hibernate: it provides a configuration and declares which Hibernate model files should be loaded.

The Freemarker templating engine is also configured by the InitServlet.

The initialisation is run in a separate thread, which means that tomcat will probably finish loading way earlier than ParaDe. Wait for the “ParaDe initialisation finished at …” message before accessing ParaDe.
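
The “load it or create it, in a background thread” pattern can be sketched as follows. This is a generic illustration, not the InitServlet: all names are hypothetical, and the two suppliers stand in for the Hibernate lookup and for the creation plus global refresh.

```java
import java.util.Optional;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class BackgroundInitSketch<T> {
    private final CountDownLatch ready = new CountDownLatch(1);
    private volatile T instance;

    // Starts initialisation in its own thread, so the container can finish starting up.
    void start(Supplier<Optional<T>> loadFromCache, Supplier<T> createAndRefresh) {
        Thread thread = new Thread(() -> {
            try {
                // Use the cached object if there is one, otherwise build a fresh one.
                instance = loadFromCache.get().orElseGet(createAndRefresh);
            } finally {
                ready.countDown(); // release waiters even if initialisation failed
            }
        }, "init-sketch");
        thread.setDaemon(true);
        thread.start();
    }

    // Returns the object once initialisation is done, or null on timeout or failure.
    T await(long timeout, TimeUnit unit) throws InterruptedException {
        return ready.await(timeout, unit) ? instance : null;
    }

    public static void main(String[] args) throws InterruptedException {
        BackgroundInitSketch<String> init = new BackgroundInitSketch<>();
        init.start(Optional::empty, () -> "freshly created");
        System.out.println(init.await(5, TimeUnit.SECONDS));
    }
}
```

A request arriving before the latch is released could then show a “still starting up” page instead of failing.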

ParaDe refresh

Global ParaDe refresh is triggered in two ways: either on ParaDe startup, if no ParaDe object can be found, or manually by calling the RowsAction (http://hostname:port/Rows.do?op=parade).

The refresh() method of the org.makumba.parade.model.Parade class does the following:

  1. reads the row definitions file (rows.properties)
  2. updates the cache according to these definitions
  3. calls the refresh() methods of all RowManagers

The RowManagers perform row-wide refresh operations, which differ for each manager. Some of them call a directory-refresh method, which itself calls a file-refresh method etc.

It is during the row refresh() that all the files of a row and their other meta-information (CVS, etc.) are cached.
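
In code, the refresh sequence could be sketched like this. The RowManager interface below is a simplified stand-in for the real managers, the row definitions are assumed to have been read from rows.properties already, and dropping rows that are no longer defined is my assumption about what “updating the cache” means.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class RefreshSketch {
    // Hypothetical stand-in for ParaDe's managers (FileManager, CVSManager, ...).
    interface RowManager {
        void refresh(String rowName, String rowPath);
    }

    private final Map<String, String> cachedRows = new TreeMap<>(); // row name -> path
    private final List<RowManager> managers;

    RefreshSketch(List<RowManager> managers) {
        this.managers = managers;
    }

    // definedRows is the result of step 1, i.e. the parsed row definitions.
    void refresh(Map<String, String> definedRows) {
        // Step 2: bring the cached rows in line with the definitions.
        cachedRows.keySet().retainAll(definedRows.keySet());
        cachedRows.putAll(definedRows);
        // Step 3: let every manager refresh every row.
        for (Map.Entry<String, String> row : cachedRows.entrySet()) {
            for (RowManager manager : managers) {
                manager.refresh(row.getKey(), row.getValue());
            }
        }
    }

    Set<String> rows() {
        return cachedRows.keySet();
    }

    public static void main(String[] args) {
        RowManager files = (name, path) -> System.out.println("caching files of " + name + " in " + path);
        RowManager cvs = (name, path) -> System.out.println("caching CVS state of " + name);
        RefreshSketch parade = new RefreshSketch(Arrays.asList(files, cvs));
        Map<String, String> definitions = new TreeMap<>();
        definitions.put("alice", "/path/to/alice");
        parade.refresh(definitions);
    }
}
```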

Hence the rows.properties file is only read automatically when there’s no previous ParaDe object available. If new rows are added to the file, a ParaDe-wide refresh therefore needs to be performed. This is of course not optimal, and a more direct way should exist, but it is not yet implemented.

JNotify initialisation

At startup, all the rows get a JNotify watch: on Linux, this triggers the creation of an inotify watch for all the files in a row, hence the /proc/sys/fs/inotify/max_user_watches file needs to contain a rather high value if many rows are being watched. If you get a lot of messages from JNotify at startup saying it can’t add watches, this probably means that this limit is too low.
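
If you want to check this before starting ParaDe, a small stand-alone helper like the one below can compare the limit with the number of entries in a row. It is just a sketch with names of my own; it counts every file and directory below the given path as a conservative estimate of the watches needed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalLong;
import java.util.stream.Stream;

final class InotifyLimitCheck {
    static final Path LIMIT_FILE = Paths.get("/proc/sys/fs/inotify/max_user_watches");

    // Reads the limit; empty if the file is missing (e.g. not on Linux) or unreadable.
    static OptionalLong readLimit(Path limitFile) {
        try {
            String text = new String(Files.readAllBytes(limitFile), StandardCharsets.US_ASCII);
            return OptionalLong.of(Long.parseLong(text.trim()));
        } catch (IOException | NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // Counts the row root plus every file and directory below it.
    static long countEntries(Path rowRoot) throws IOException {
        try (Stream<Path> entries = Files.walk(rowRoot)) {
            return entries.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path rowRoot = Paths.get(args.length > 0 ? args[0] : ".");
        long needed = countEntries(rowRoot);
        OptionalLong limit = readLimit(LIMIT_FILE);
        if (!limit.isPresent()) {
            System.out.println("could not read " + LIMIT_FILE);
        } else if (needed > limit.getAsLong()) {
            System.out.println(needed + " entries, but the limit is only " + limit.getAsLong());
        } else {
            System.out.println(needed + " entries fit within the limit of " + limit.getAsLong());
        }
    }
}
```

Raising the limit is done as root by writing a larger number into that file, or through the corresponding fs.inotify.max_user_watches sysctl setting.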

First-time installation

February 24th, 2008

In order to run ParaDe, you’ll need:

  • a tomcat server, preferably version 5.5 (we didn’t test it on 6 yet)
  • an SQL database, such as MySQL
  • an OS that supports JNotify. We recommend using a unix-based OS. Linux supports inotify natively since kernel 2.6.13.
  • the Java Virtual Machine, version 5 or higher
  • the Apache Ant tools

Step 1: Checking out ParaDe

For the moment, the only way of downloading ParaDe is by checking it out from CVS. You can read more about it on the SourceForge project pages.

The easiest method for checking out ParaDe is the anonymous checkout. In a new folder of your choice, type:

cvs -d:pserver:anonymous@parade.cvs.sourceforge.net:/cvsroot/parade login

cvs -z3 -d:pserver:anonymous@parade.cvs.sourceforge.net:/cvsroot/parade co -P parade2

This will check out ParaDe. For further updates from the repository you can type:

cvs -z3 -d:pserver:anonymous@parade.cvs.sourceforge.net:/cvsroot/parade update

Step 2: The configuration files

Let’s say you now have a fresh checkout in the folder parade. You’ll now have to configure ParaDe and its interaction with tomcat and your database server.

The first thing you need to do is to configure the parade.properties file in parade/webapp/WEB-INF/classes. There’s an example file at parade/webapp/WEB-INF/classes/parade.properties.example which you can directly copy.

The next file that matters is tomcat.properties in the folder parade. There’s also a tomcat.properties.example file provided. Here you’ll only need to configure:

  • the path to your tomcat folder (e.g. if you have extracted a tomcat distribution in the same folder as parade, the path will likely be something like ../apache-tomcat-5.5.x)
  • the JVM’s Xms (minimum memory size) and Xmx (maximum memory size) values (by default those are suited to a production environment).

The other parameters are by default configured to fit the tomcat configuration in ParaDe’s “tomcat” directory, and you don’t need to change them unless you want to run ParaDe on a different port (or have a safer manager authentication). In this case, don’t forget to make the necessary changes in ParaDe’s tomcat/server.xml and tomcat/tomcat-users.xml.

Once this is done, all you need to do is configure the database. First, you need to create a database for ParaDe. With MySQL, issuing a

CREATE DATABASE parade;

statement will do it. Then, you also need to configure the connection to the database in parade/webapp/WEB-INF/classes/hibernate.cfg.xml. ParaDe uses Hibernate to cache the state of the files and rows it handles; see the Hibernate website for more information about supported database systems and configuration options.

Step 3: Running ParaDe for the very first time

So, at this point you should be ready to run ParaDe. From your shell or command-line, type

ant tomcat

You should now see tomcat starting up, and then hibernate creating a lot of tables. If everything goes fine, accessing your tomcat (e.g. http://localhost:5050) should show you the ParaDe index page, with only one row (the root row).

Step 4: Configuring the rows

Now you can configure the rows. First, stop the running ParaDe (at this point, hitting Ctrl+C should just stop tomcat). In parade/webapp/WEB-INF/classes/rows.properties you can set up which rows you’d like to deploy. The syntax is as follows:

rowName=/path/to/webapp/
rowdata.rowName.obs=Description of the row
rowdata.rowName.webapp=relative path to the context, leave blank if none
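
To show how these three kinds of entries fit together, here is a small sketch that reads such a file with java.util.Properties. It is not ParaDe’s own parser: the Row class is made up, and it assumes that row names contain no dots and that the description key follows the same rowdata.<rowName>.* pattern as the webapp key.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class RowsPropertiesSketch {
    // Hypothetical holder for one row definition.
    static final class Row {
        final String name;
        final String path;
        final String description;
        final String webapp;

        Row(String name, String path, String description, String webapp) {
            this.name = name;
            this.path = path;
            this.description = description;
            this.webapp = webapp;
        }
    }

    // Parses row definitions; keys without a dot are taken to be row names.
    static Map<String, Row> parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, Row> rows = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.contains(".")) {
                continue; // a rowdata.* entry, picked up through its row below
            }
            String prefix = "rowdata." + key + ".";
            rows.put(key, new Row(key, props.getProperty(key),
                    props.getProperty(prefix + "obs", ""),
                    props.getProperty(prefix + "webapp", "")));
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "alice=/path/to/alice/\n"
                + "rowdata.alice.obs=Test row of Alice\n"
                + "rowdata.alice.webapp=public_html\n";
        for (Row row : parse(text).values()) {
            System.out.println(row.name + " at " + row.path + " (" + row.description
                    + "), context: " + row.webapp);
        }
    }
}
```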

Step 5: Starting ParaDe properly

You can then run ParaDe again in a more robust way by issuing

nohup ant tomcat &

The logs will be in parade/tomcat/logs/catalina.out and are also accessible inside ParaDe by looking at the root row.

If you plan to have ParaDe running regularly, there is also a “parade” script which you can either copy over to e.g. /etc/init.d (for Debian-like distributions) or link from there. Don’t forget to “chmod +x” the script file!

That’s it! Enjoy using your new parade configuration!

ParaDe documentation blog

February 24th, 2008

Yes, there’s now a ParaDe blog! Here you will find various information on ParaDe: how it works, new features, how to configure it…

The main purpose of this blog (for the moment) is to write down documentation on ParaDe in an easy way. Then maybe one day we can turn all those posts into something more stable.