RACE - Modifications and Extensions to ACE from Ravensburg

Main Ideas of RACE - Logging

Background and Requirements

We have Entities, i.e. objects (valuetypes in the CORBA sense), which must be collected in a logfile.
Since they may arrive in bursts or at high frequency, receiving them must not block.
Writing to disk must be as fast as possible.

If these Entities are (key,data)-pairs, and two Entities are logically equal if their keys are equal, snapshots of a collection can be made.

If these Entities have a type, a two-level snapshot is conceivable: each type can have its own snapshot, and all these snapshots are linked in a whole snapshot.

We abstract extensionally; we are not interested in the content (the intension). Thus, these Entities could be:

Text messages for a syslog implementation
Events in the sense of the CORBA Event or Notification Service
Diffs of changed memory pages for a RAMS-Metaobject implementation of a Transaction Processing System
Network packets - one can build a sniffer
Pictures in a video sequence for a multimedia application

The Entities can be represented in binary or in a text format like XML.

The Linking with RACE_LogFileOffsets

The links into the logfiles that are stored in a snapshot are like the relative pointers of RAMS. Since we expect very long logfiles, a logfile consists of multiple parts. Thus, the RACE_LogfileOffset consists of a (part,offset)-pair.
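
The (part,offset)-pair can be pictured with a small sketch. The struct name LogfileOffset, the field widths and the helper bytes_between are assumptions made for illustration; the real RACE_LogfileOffset may look different.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for RACE_LogfileOffset; field widths are assumed.
struct LogfileOffset {
    std::uint32_t part;    // number of the logfile part
    std::uint64_t offset;  // byte offset inside that part
};

// Offsets are ordered first by part, then by position within the part.
inline bool operator<(const LogfileOffset& a, const LogfileOffset& b) {
    return std::tie(a.part, a.offset) < std::tie(b.part, b.offset);
}

inline bool operator==(const LogfileOffset& a, const LogfileOffset& b) {
    return a.part == b.part && a.offset == b.offset;
}

// Distance in bytes between two offsets; only defined within one part.
inline std::uint64_t bytes_between(const LogfileOffset& from,
                                   const LogfileOffset& to) {
    assert(from.part == to.part && from.offset <= to.offset);
    return to.offset - from.offset;
}
```

Because such a pair names a position relative to the logfile and not a memory address, it stays valid after the process that wrote it has ended, which is what makes it comparable to a relative pointer.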

The Role of the RACE_Dumpables

The Entities and the different Snapshots which should be written to the logfile are put into subclasses of RACE_Dumpable.
They encapsulate pieces of memory to be dumped into a logfile.
A RACE_Dumpable extends a RACE_Message_Block with extra state; the possibility to extend the header is a main innovation of RACE compared with ACE.

As in the persistence concept related to the RAMS relative pointers, we distinguish between

the persistent state on disk: this is the data in the RACE_Data_Blocks
transient accessor operations at runtime in main memory: the subclass of a RACE_Dumpable can implement convenient access operations

The workings of the logger

The RACE_Logger is an active object with a thread for writing in parallel with the other work of the application.

It is most efficient to write 4096-byte blocks to disk. Thus, the RACE_Logger has a RACE_Accumulator which collects RACE_Dumpables until it can write a block. It collects the I/O vectors for the scatter-gather pattern.

For snapshots, we have to know where the RACE_Dumpable is written. Thus, the InformOffset callback method of a RACE_Dumpable is called after the offset is fixed.

Subclasses of RACE_Dumpable have state and override/implement the InformOffset method.
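
A minimal sketch of this callback, assuming a simplified interface: the classes Dumpable and TextMessage, the payload() accessor and the commit() helper are hypothetical stand-ins and do not reproduce the real RACE_Dumpable, which builds on RACE_Message_Block.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

struct LogfileOffset {  // assumed (part,offset)-pair
    std::uint32_t part;
    std::uint64_t offset;
};

// Hypothetical stand-in for RACE_Dumpable.
class Dumpable {
public:
    virtual ~Dumpable() = default;
    // Called once the position of this dumpable in the logfile is fixed.
    virtual void InformOffset(const LogfileOffset& where) = 0;
    // The bytes to be dumped.
    virtual const std::string& payload() const = 0;
};

// A subclass with its own state: it remembers where it was written.
class TextMessage : public Dumpable {
public:
    explicit TextMessage(std::string text) : text_(std::move(text)) {}
    void InformOffset(const LogfileOffset& where) override {
        written_at_ = where;
        has_offset_ = true;
    }
    const std::string& payload() const override { return text_; }
    bool has_offset() const { return has_offset_; }
    const LogfileOffset& written_at() const {
        assert(has_offset_);  // only meaningful after the callback
        return written_at_;
    }
private:
    std::string text_;
    LogfileOffset written_at_{0, 0};
    bool has_offset_ = false;
};

// What the accumulator side does in this sketch: fix the offset, inform
// the dumpable, and return the offset for the next dumpable.
LogfileOffset commit(Dumpable& d, LogfileOffset next) {
    d.InformOffset(next);
    next.offset += d.payload().size();
    return next;
}
```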

The RACE_Accumulator does not copy the data. Rather, it uses the writev(2) operation which gathers the block together from multiple RACE_Message_Blocks without copying.
The reference counting of RACE_Data_Blocks in the RACE_Message_Block avoids copying whenever possible.
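
The gathering write can be sketched with plain POSIX calls. The class Accumulator below is a hypothetical simplification: it works on raw buffers instead of RACE_Message_Blocks, and it ignores short writes and the IOV_MAX limit, both of which a real implementation would have to handle.

```cpp
#include <sys/uio.h>  // writev, struct iovec (POSIX)
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accumulator: remembers buffers until one block is full.
class Accumulator {
public:
    explicit Accumulator(int fd, std::size_t block_size = 4096)
        : fd_(fd), block_size_(block_size) {}

    // Remember the buffer without copying it. The caller must keep the
    // memory alive until the data has been flushed.
    // Returns the bytes written by an automatic flush, 0 if nothing was
    // written yet, or -1 on error.
    ssize_t add(const void* data, std::size_t len) {
        assert(data != nullptr && len > 0);
        iovec v;
        v.iov_base = const_cast<void*>(data);
        v.iov_len = len;
        vec_.push_back(v);
        pending_ += len;
        return pending_ >= block_size_ ? flush() : 0;
    }

    // Write all collected buffers with a single writev(2) call.
    ssize_t flush() {
        if (vec_.empty()) return 0;
        ssize_t n = ::writev(fd_, vec_.data(), static_cast<int>(vec_.size()));
        vec_.clear();
        pending_ = 0;
        return n;
    }

    std::size_t pending() const { return pending_; }

private:
    int fd_;
    std::size_t block_size_;
    std::size_t pending_ = 0;
    std::vector<iovec> vec_;
};
```

Only the iovec entries (pointer and length) are stored; the payload bytes are first touched by the kernel inside writev.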

Big dumpables (for snapshots) may have multiple parts; in that case we call them fragmented.

The Queue with a strategy

If the load is too heavy, RACE_Dumpables which are marked as discardable (decided by a virtual function which can be overridden) can be discarded. If the load is still too heavy, enqueueing is blocked. The blocking has a timeout, and the Logger can react to this situation, for instance by terminating the logging.

The policy for when a queue is regarded as full, as well as the timeout values, is controlled by a QueueLevelStrategy which may be overridden.

Overriding is very useful, because the InformOffset callbacks are executed by the LoggerThread.
For example, consider the case that a snapshot is triggered and should be enqueued while the queue is full. The enqueueing is done by the LoggerThread, but the LoggerThread is the only one that consumes the queue and could resolve the situation - thus we have a deadlock.
The solution is a special QueueLevelStrategy which makes the queue appear non-full to the LoggerThread.
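
One way to picture such a strategy is the following sketch. The interface (is_full, the Admission enum, the admit helper) is an assumption and not the actual QueueLevelStrategy API; timeouts are left out, and the two load levels are collapsed into a single "full" test.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical strategy interface: decides when the queue counts as full.
class QueueLevelStrategy {
public:
    explicit QueueLevelStrategy(std::size_t high_water)
        : high_water_(high_water) {
        assert(high_water > 0);
    }
    virtual ~QueueLevelStrategy() = default;

    // Is the queue full from the point of view of `caller`?
    virtual bool is_full(std::size_t queued, std::thread::id caller) const {
        (void)caller;  // the default policy treats all threads alike
        return queued >= high_water_;
    }

private:
    std::size_t high_water_;
};

// Variant that never reports "full" to the logger thread, so the logger
// thread can enqueue a snapshot without waiting for itself.
class LoggerAwareStrategy : public QueueLevelStrategy {
public:
    LoggerAwareStrategy(std::size_t high_water, std::thread::id logger)
        : QueueLevelStrategy(high_water), logger_(logger) {}

    bool is_full(std::size_t queued, std::thread::id caller) const override {
        if (caller == logger_) return false;
        return QueueLevelStrategy::is_full(queued, caller);
    }

private:
    std::thread::id logger_;
};

enum class Admission { Accept, Discard, Block };

// Simplified admission rule: on a full queue, discardable dumpables are
// dropped and all others make the producer block.
Admission admit(const QueueLevelStrategy& s, std::size_t queued,
                std::thread::id caller, bool discardable) {
    if (!s.is_full(queued, caller)) return Admission::Accept;
    return discardable ? Admission::Discard : Admission::Block;
}
```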

The Linking in the Logfile

There may be Snapshots of keyed Entities. The snapshot may be a collection of (key,RACE_LogFileOffset)-pairs, where the offset points to the last instance of the entity in the logfile.

When the RACE_Accumulator has determined the offset in the Logfile, it calls the Inform_Offset-Callback.

The management part of the concrete Dumpable derivation should hold an association with the Snapshot Collector. This snapshot collector implements the access to the Snapshot dumpable.
The concrete Inform_Offset implementation triggers the insertion of the (key,Offset)-pair into the Snapshot.
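
The interplay of the keyed dumpable and the collector might look like the sketch below. SnapshotCollector, KeyedEntity and their methods are hypothetical names, and a std::map stands in for whatever container the Snapshot dumpable really uses.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

struct LogfileOffset {  // assumed (part,offset)-pair
    std::uint32_t part;
    std::uint64_t offset;
};

// Hypothetical collector: maps each key to the offset of its latest instance.
class SnapshotCollector {
public:
    void Insert(const std::string& key, LogfileOffset where) {
        latest_[key] = where;  // a later instance replaces an earlier one
    }
    const std::map<std::string, LogfileOffset>& entries() const {
        return latest_;
    }
private:
    std::map<std::string, LogfileOffset> latest_;
};

// Hypothetical keyed dumpable holding the association with the collector.
class KeyedEntity {
public:
    KeyedEntity(std::string key, SnapshotCollector* collector)
        : key_(std::move(key)), collector_(collector) {
        assert(collector_ != nullptr);
    }
    // Callback invoked once the offset in the logfile is known.
    void Inform_Offset(LogfileOffset where) { collector_->Insert(key_, where); }
private:
    std::string key_;
    SnapshotCollector* collector_;
};
```

With this shape the snapshot never holds entity data, only one offset per key, so writing it is cheap even when the entities themselves are large.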

The Snapshot must be written when the last outstanding Dumpable has been written. This is controlled by the strategy RACE_Snapshot_Trigger.

The RACE_Snapshottrigger

When the time comes to pull a snapshot, there may be RACE_Dumpables in the RACE_Message_Queue or in the RACE_Accumulator (which also behaves as a queue).
Thus we have to wait until the Dumpables in the queue and the accumulator have been committed, and only then trigger the snapshots.

We have discovered two kinds of Snapshots:

A Snapshot knows how many Dumpables are pending in total. When they have all arrived, the snapshot can be triggered. There is no extra copy needed.
The Snapshot knows how many Dumpables are needed for the snapshot. We have to be aware of out-of-sequence snapshots, i.e., one InformOffset is triggered while the previous InformOffset has not been triggered yet (see the next section).
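
The counting idea behind the first kind can be sketched as follows. SnapshotTrigger and dumpable_written are hypothetical names, and the sketch assumes that all callbacks run in the logger thread, so no locking is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical trigger: fires once all outstanding dumpables have
// reported their offsets.
class SnapshotTrigger {
public:
    SnapshotTrigger(std::size_t pending, std::function<void()> fire)
        : pending_(pending), fire_(std::move(fire)) {}

    // To be called from the InformOffset callback of each pending dumpable.
    void dumpable_written() {
        assert(pending_ > 0);
        if (--pending_ == 0) fire_();  // last one in: write the snapshot
    }

    std::size_t pending() const { return pending_; }

private:
    std::size_t pending_;
    std::function<void()> fire_;
};
```

A trigger constructed with zero pending dumpables never fires in this sketch; the caller would write the snapshot directly in that case.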

Why Snapshots may occur out-of-sequence

If we have a 2-level snapshot with different kinds of snapshots at the first level and a whole snapshot at the 2nd level which has links to the snapshots at the first level, the snapshots may occur out of sequence.

We have the following in mind:

One kind of snapshot at the first level can consist of (key,offset)-pairs.
Such a snapshot can only be written once the last Dumpable has called back.
Another kind of snapshot may be a collection of Entities logged periodically.

When it is time to trigger the 2nd-level snapshot, the last offset of a periodic snapshot can simply be inserted into the whole snapshot.
With low traffic and small Entities, it can happen that the last snapshot isn't ready yet. Thus, after beginning the new snapshot, the triggering of a (key,offset)-snapshot is still outstanding.

The Logfile

The logfiles have parts (_number.log). The RACE_LogfileOffset consists of a pair (part number, offset in part).

Note that the layouts of data structures in the dumpables may be the natural binary representations chosen by the compiler. This must be considered when porting RACE to a different architecture.
To be portable, the derivations of the RACE_Dumpable should use marshalling/demarshalling.
A better idea is to convert the logfile to ASCII (logfile2ascii) or to build a special converter.

The main idea is that the logfile consists of variable length entries.

The header of each entry must contain the length and a type identifier. After that, the entry-specific data is dumped.
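
A sketch of such variable-length entries, written to an in-memory buffer instead of a file. The struct EntryHeader, its field widths and the two helper functions are assumptions; the header is copied in native byte order, so the portability caveat above applies to this sketch as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical entry header; the field widths are assumed.
struct EntryHeader {
    std::uint32_t length;  // number of payload bytes following the header
    std::uint32_t type;    // application-defined type identifier
};

// Append one entry (header followed by payload) to an in-memory log image.
void append_entry(std::vector<char>& log, std::uint32_t type,
                  const std::string& payload) {
    EntryHeader h{static_cast<std::uint32_t>(payload.size()), type};
    const char* p = reinterpret_cast<const char*>(&h);
    log.insert(log.end(), p, p + sizeof h);
    log.insert(log.end(), payload.begin(), payload.end());
}

// Read the entry starting at `pos`; returns the position of the next entry.
std::size_t read_entry(const std::vector<char>& log, std::size_t pos,
                       EntryHeader& h, std::string& payload) {
    assert(pos + sizeof h <= log.size());
    std::memcpy(&h, log.data() + pos, sizeof h);
    pos += sizeof h;
    assert(pos + h.length <= log.size());
    payload.assign(log.data() + pos, h.length);
    return pos + h.length;
}
```

Since the length comes first, a reader that does not know a type identifier can still skip the entry and continue with the next one.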

The RACE_LogSwitchingStrategy

The switching of the Logfile is controlled by the RACE_LogSwitchingStrategy. The Logfile can be switched

on the hard limit - it must be switched
on a soft limit - we switch after the next snapshot
...

The RACE_Logger informs the RACE_LogSwitchingStrategy after putting the RACE_Dumpable into the RACE_Accumulator.

Problems to solve:

The Logger should only know the class RACE_Dumpable (and no application-dependent class): the switching of the logfile must be done in an Inform_Offset callback of a snapshot.
The Snapshot should be the last entry in a logfile part: after queuing the Snapshot, a Flush-Command-Dumpable is queued. The InformOffset callback of the Snapshot triggers the SnapshotWritten callback in the RACE_LogSwitchingStrategy, which triggers the switching.
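
The two limits can be sketched as a small size-based strategy. Only the name SnapshotWritten is taken from the text above; the class LogSwitchingStrategy, the other method names, the return values and the byte-based limits are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical size-based switching strategy.
class LogSwitchingStrategy {
public:
    LogSwitchingStrategy(std::uint64_t soft_limit, std::uint64_t hard_limit)
        : soft_(soft_limit), hard_(hard_limit) {
        assert(soft_limit <= hard_limit);
    }

    // Called by the logger after a dumpable of `bytes` bytes was put into
    // the accumulator. Returns true if the hard limit is reached and the
    // part must be switched at once.
    bool DumpableAccumulated(std::uint64_t bytes) {
        size_ += bytes;
        return size_ >= hard_;
    }

    // Called from the InformOffset callback of a snapshot. Returns true if
    // the soft limit is reached, i.e. this is a good moment to switch.
    bool SnapshotWritten() const { return size_ >= soft_; }

    // Start counting for the next part.
    void Switched() {
        size_ = 0;
        ++part_;
    }

    std::uint32_t part() const { return part_; }

private:
    std::uint64_t soft_;
    std::uint64_t hard_;
    std::uint64_t size_ = 0;
    std::uint32_t part_ = 0;
};
```

The logger in this sketch only passes sizes and reacts to the boolean answers, so it needs no knowledge of application classes.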

The Index

The reader of the logfile may need to position itself at a specific time in the logfile.

Thus, the index file contains a sequence of (timestamp, RACE_LogFileOffset) pairs. See RACE_LogIndexLayout.h
There is only one index file for all parts in the logfile.

Management of the index is implemented in the RACE class framework, since it is independent of the application and therefore reusable.
The writing is implemented by RACE_LogIndexWriter, the reading and positioning is handled by RACE_LogIndexPositioner.
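
Positioning by time can be sketched as a binary search over the index, assuming the entries are sorted by timestamp. IndexEntry, its integer timestamp and the function position are hypothetical; the real record layout is the one in RACE_LogIndexLayout.h, and the real interface is that of RACE_LogIndexPositioner.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct LogfileOffset {  // assumed (part,offset)-pair
    std::uint32_t part;
    std::uint64_t offset;
};

// Hypothetical index record.
struct IndexEntry {
    std::int64_t timestamp;
    LogfileOffset where;
};

inline bool earlier(const IndexEntry& a, const IndexEntry& b) {
    return a.timestamp < b.timestamp;
}

// Return the last index entry with timestamp <= t, or nullptr if t lies
// before the first entry. The index must be sorted by timestamp.
const IndexEntry* position(const std::vector<IndexEntry>& index,
                           std::int64_t t) {
    assert(std::is_sorted(index.begin(), index.end(), earlier));
    auto it = std::upper_bound(
        index.begin(), index.end(), t,
        [](std::int64_t value, const IndexEntry& e) {
            return value < e.timestamp;
        });
    if (it == index.begin()) return nullptr;
    return &*(it - 1);
}
```

A reader would seek to the returned (part,offset) and then scan forward entry by entry until it reaches the exact time it is looking for.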

Reference Counter Pattern

Two threads, the main thread and the logger thread, have access to an object.
An object may be deleted only when it is no longer used. For example, think of a Snapshot which waits on InformOffset callbacks triggered by the dumpables. There may be a method CheckRightToExist which checks whether anyone needs an object.
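
A sketch of how such a check could work with an atomic user count. Only the name CheckRightToExist comes from the text; the class Countable, the methods AddUser and RemoveUser, and the helper DeleteIfUnused are assumptions. The sketch further assumes that only the thread calling DeleteIfUnused hands out new references, so no new user can appear between the check and the delete.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical base class for objects used by two threads.
class Countable {
public:
    virtual ~Countable() = default;

    // Someone (e.g. a dumpable that still owes a callback) needs the object.
    void AddUser() { users_.fetch_add(1, std::memory_order_relaxed); }

    // That user is done with the object.
    void RemoveUser() {
        int before = users_.fetch_sub(1, std::memory_order_acq_rel);
        assert(before > 0);
        (void)before;
    }

    // True while at least one user still needs the object.
    bool CheckRightToExist() const {
        return users_.load(std::memory_order_acquire) > 0;
    }

private:
    std::atomic<int> users_{0};
};

// Delete the object if nobody needs it any more and reset the pointer.
// Returns true if the object was deleted.
bool DeleteIfUnused(Countable*& obj) {
    if (obj == nullptr || obj->CheckRightToExist()) return false;
    delete obj;
    obj = nullptr;
    return true;
}
```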

Reading of the Logfile

The interpretation of a logfile depends on the application. Here are a few condensed remarks from an industrial project using RACE:

LogfileReader and LogSemantics

The LogfileReader reads the Logfiles and interprets them with a LogfileSemantic.

The logfile reader has a special buffer which grows as needed for a LogfileEntry. Another possibility would be to work with RACE_Message_Blocks.

logfile2ascii has a special semantic to convert the binary logfile data to human readable text strings.

Using the RACE_LogIndexPositioner we can position ourselves within the Logfile.

The RecoverState is a LogFileSemantic which reads a snapshot. It has an iteration semantic (cf. logfile2ascii). With this combination logfile2ascii can list a snapshot and then the logfile.

Computerscience and Networkassocation Ravensburg e.V Rudolf Weber and friends