Gaudi and Calibration Infrastructure

What it does

To start (and perhaps forever), the infrastructure will provide only integration of read services with standard Gaudi facilities. It will be possible for a Gaudi application to create persistent calibration data by invoking services of the calibUtil package, but not by building an object in the TDS (more specifically the TCDS, Transient Calibration Data Store) and then invoking an (output) converter.

The persistent representation of calibration data will always consist of at least two pieces: the metadata, stored in the MySQL database, and the bulk data. This is not quite in line with the Gaudi behavior of a typical (i.e., event data) TDS object.

Calibration data will be made available for read access according to the usual Gaudi practice for TDS data, except that, invisibly to the client, any conversion which occurs will take two stages: one for the metadata and a second for the bulk data:

  1. An algorithm will request a particular calibration TCDS object by its path in the data store, e.g. /Calib/TKR_HotChan/vanilla.
  2. The Data Service Provider (CalibDataSvc) will check to see if the item is present. [You may also ask it to update a dataset already present in the TCDS; that is, check that the timestamp for the current event is within the validity interval of the data and, if not, attempt to fetch a dataset whose validity interval does cover the current event time. The steps involved in an update are similar to those enumerated here for initial creation. For a somewhat graphical view of the internals, see these slides.]
  3. If so, the data may be returned to the requestor and we're done. Otherwise...
  4. The Data Service Provider gets the registry entry associated with the object (set up at initialization) and delegates to the Persistency Service (DetPersistencySvc), which in turn discovers the conversion service associated with this object. This conversion service satisfies the ICalibMetaCnvSvc interface (CalibMySQLCnvSvc is a concrete service satisfying it). The Persistency Service passes it the request, which by now takes the form of invoking IConversionSvc::createObj(IOpaqueAddress* address, DataObject*& obj).
  5. The conversion service has no converters associated with it. Instead, it retrieves relevant metadata fields (start and stop of valid interval, serial number, bulk data type, bulk data identifier,...) for the calibration data set which is the best match to the request.
  6. It uses this information to form another opaque address, this time one suitable for the conversion service which can handle the bulk data type, then invokes the Persistency service createObj method with this new address.
  7. This time the Persistency service will invoke the createObj method of the conversion service appropriate for the bulk data type (e.g. CalibXMLCnvSvc or CalibROOTCnvSvc).
  8. The bulk data conversion service does have converters, perhaps as many as one per CLID (CLIDs are mostly in one-to-one correspondence with calibration types). It finds the correct converter by CLID. It might already have one; if not, it looks up the factory associated with the CLID. (This association is managed by the ConverterFactory of the ApplicationMgr. ConverterFactory implements the ICnvManager interface, which is what the conversion service queries.)
  9. The converter now can actually fetch and store the bulk data in the TCDS.
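The two-stage chain above can be sketched in simplified C++. Everything here (MetaRecord, MetaCnvSvc, BulkCnvSvc, keying converters by path rather than CLID) is an illustrative stand-in for the real Gaudi/CalibSvc machinery, not its actual API:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

struct DataObject { virtual ~DataObject() {} };

// Stage 1 input: an "opaque address" naming a TCDS path.
struct MetaAddress { std::string path; };   // e.g. "/Calib/TKR_HotChan/vanilla"

// Metadata row as stored in MySQL (simplified).
struct MetaRecord {
    double vstart, vend;        // validity interval
    int serNo;                  // serial number
    std::string bulkType;       // "XML" or "ROOT"
    std::string bulkId;         // bulk data identifier, e.g. a file name
};

// Stage 2 input: an address the bulk conversion service understands.
struct BulkAddress { std::string bulkType; std::string bulkId; };

// One converter per calibration type; keyed here by TCDS path for
// simplicity (the real system keys converters by CLID).
using Converter = std::function<DataObject*(const BulkAddress&)>;

struct BulkCnvSvc {
    std::map<std::string, Converter> converters;
    DataObject* createObj(const std::string& key, const BulkAddress& addr) {
        auto it = converters.find(key);
        return it == converters.end() ? nullptr : it->second(addr);
    }
};

struct MetaCnvSvc {
    std::map<std::string, MetaRecord> metaDb;   // stands in for the MySQL database
    BulkCnvSvc* bulkSvc;
    // Steps 5-7: find the best-match metadata, build a second address
    // suitable for the bulk service, and delegate to it.
    DataObject* createObj(const MetaAddress& addr) {
        auto it = metaDb.find(addr.path);
        if (it == metaDb.end()) return nullptr;
        BulkAddress bulk{it->second.bulkType, it->second.bulkId};
        return bulkSvc->createObj(addr.path, bulk);
    }
};
```

Because the whole sequence is a single synchronous call chain, the caller's `createObj` does not return until the bulk converter has run, which is why the data is guaranteed to be in place when the original request completes.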

For more on the internals of this process:

Q: How does information get back to the data service that the data is now available? A: It's all part of one long call chain, so data will be there, if it exists to be converted, when the original request to the Provider completes.

Initialization


Job options

We will probably need to add a line to the job options analogous to the one now in basicOptions.txt concerning event persistence:

    EventPersistencySvc.CnvServices = {"EventCnvSvc"};

e.g.

    DetectorPersistencySvc.CnvServices = {"CalibCnvSvc"};

See also the FluxSvc defaultOptions.txt for a model of the job options that may be required.

The Players

Each entry below gives the item, a description, its status, and the relevant Gaudi code.

Data Service Provider (CalibDataSvc)
    Satisfies IDataProviderSvc, IDetDataSvc and IInstrumentName (at least). Inherits from DataSvc. Value added includes registering addresses in the TCDS at initialize() time.
    Status: Draft debugged.
    Gaudi code: IDataProviderSvc.h, IDetDataSvc.h, DetDataSvc.h, DetDataSvc.cpp, DataSvc.h, DataSvc.cpp

IInstrumentName
    Simple interface, similar to IDetDataSvc except that the latter keeps track of event time while this class keeps track of the (event) instrument, making it easily accessible to other cooperating classes.
    Status: Debugged.

Persistency Service (DetPersistencySvc)
    One per data provider. Provided by Gaudi.
    Gaudi code: IPersistencySvc.h, PersistencySvc.h, PersistencySvc.cpp, DetPersistencySvc.h, DetPersistencySvc.cpp, IAddressCreator.h

Front-line Conversion Service for calibration data (CalibMySQLCnvSvc)
    Derived from ConversionSvc, so satisfies IConversionSvc; also satisfies the ICalibMetaCnvSvc interface. This is the conversion service called first when calibration data must be fetched from its persistent form. It searches the metadata database for the best-match calibration dataset, then uses information in that dataset's metadata to form a request to another conversion service, which reads the dataset and uses it to create or update the requested TCDS object. It handles datasets whose persistent format is either XML or ROOT, because in both cases the interaction starts with the MySQL metadata database; it acts as a manager which matches each dataset with the conversion service appropriate to its bulk data type.
    Status: Draft debugged.
    Gaudi code: IConversionSvc.h, ConversionSvc.h, ConversionSvc.cpp

Metadata-specific Interface (ICalibMetaCnvSvc)
    Abstract analog of additional utilities needed by a metadata conversion service.
    Status: Debugged.

XML Conversion Service (CalibXmlCnvSvc)
    Conversion service for calibrations whose bulk data format is XML. Also supplies certain services commonly needed by its converters, such as access to an XML parser via the ICalibXmlSvc interface.
    Status: First draft debugged.
    Gaudi code: same as for CalibMySQLCnvSvc above.

XML-specific Interface (ICalibXmlSvc)
    Abstract analog of additional utilities needed by an XML conversion service.
    Status: Draft debugged.

ROOT Conversion Service (CalibROOTCnvSvc)
    Conversion service for calibrations whose bulk data format is ROOT. May also supply certain services commonly needed by its converters.
    Status: To be written.
    Gaudi code: same as for CalibMySQLCnvSvc above.

Converter base class(es) (XmlBaseCnv; may have an analogous base class for ROOT converters)
    Inherit from Converter. Handle the physical-format-specific parts of conversion.
    Status: Draft of XmlBaseCnv debugged.
    Gaudi code: Converter.h, Converter.cpp

Converters
    Need one for each calibration type to be stored in the TCDS. Must satisfy IConverter.
    Status: Toy XmlTest1Cnv and non-toy XmlBadStripsCnv debugged; no ROOT examples yet.
    Gaudi code: IConverter.h, Converter.h, Converter.cpp

TCDS base class (CalibBase)
    Implements IValidity and additional methods facilitating updates.
    Status: Debugged.
    Gaudi code: see links for TCDS classes below.

TCDS classes (CalibBadStrips, CalibLightAtten, etc.)
    Derived from CalibBase, which is in turn derived from DataObject. There will be one TCDS class per calibration type as viewed by recon and analysis applications. For the time being (perhaps forever), there will be one such class for each calibration procedure, so objects in the TCDS will correspond directly to a single calibration dataset.
    Status: Toy CalibTest1 and approximately realistic BadStrips TCDS classes written and debugged; plenty more to go.
    Gaudi code: IValidity.h, DataObject.h, DataObject.cpp

Opaque Address
    Satisfies IOpaqueAddress. Just one such class is needed; the GaudiKernel-supplied class GenericAddress (which isn't really all that generic) appears to suffice.
    Status: Using GenericAddress, supplied by Gaudi.
    Gaudi code: IOpaqueAddress.h, GenericAddress.h

CalibTime
    Something which implements ITime is required; this is it. Practically all the functionality needed already exists in facilities::Timestamp, but that class can't formally implement ITime because the facilities package can't use any Gaudi packages. CalibTime is therefore derived from both facilities::Timestamp and ITime.
    Status: Draft debugged.
    Gaudi code: ITime.h
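The CalibTime arrangement, reusing a non-Gaudi implementation while still satisfying a Gaudi abstract interface, is plain multiple inheritance. A hedged sketch (both class bodies are simplified stand-ins, not the real facilities or Gaudi code, and absoluteTime is a guessed method name):

```cpp
#include <cassert>

// Stand-in for the Gaudi ITime abstract interface (hypothetical signature).
struct ITime {
    virtual ~ITime() {}
    virtual long long absoluteTime() const = 0;
};

// Stand-in for facilities::Timestamp, which cannot depend on Gaudi headers.
class Timestamp {
public:
    explicit Timestamp(long long clock) : m_clock(clock) {}
    long long getClock() const { return m_clock; }
private:
    long long m_clock;
};

// CalibTime bridges the two: it inherits the working implementation from
// Timestamp and exposes it through the Gaudi-facing ITime interface.
class CalibTime : public Timestamp, public ITime {
public:
    explicit CalibTime(long long clock) : Timestamp(clock) {}
    long long absoluteTime() const override { return getClock(); }
};
```

This keeps the facilities package free of any Gaudi dependency while letting Gaudi code handle a CalibTime wherever an ITime is expected.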

TCDS data


Directory structure

As in the Event TDS, TCDS data are organized hierarchically. There are only three levels in the hierarchy: the path for the root node is "/Calib"; the next level adds a string which identifies the calibration type, such as "TKR_HotChan"; and the last level is flavor ("vanilla" for all standard calibrations). All the information will be in the leaves, and there will be no references to be resolved among DataObjects. The proposed organization leads to paths of the form /Calib/subsystem_calibrationType/flavor

[Figure: the three-level TCDS hierarchy]

Another possibility is to introduce another level for instrument type, but even in the unlikely case that a single job processed data from more than one instrument, the processing would be sequential; it should never be necessary to keep constants for multiple instruments in the TCDS simultaneously.

A "middle" node, such as /Calib/TKR_HotChan, has a CalibCLIDNode associated with it: a stunted little thing derived from DataObject which keeps track of the CLID of all its leaf nodes (they all use the same one; flavor does not affect the CLID). It is not yet clear whether this serves any purpose.
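The three-level layout and the per-type CLID bookkeeping can be sketched as follows (CalibCLIDNode exists; the containers and names here are illustrative stand-ins for the real TCDS):

```cpp
#include <cassert>
#include <map>
#include <string>

using CLID = unsigned;   // class identifier, as in Gaudi

// Middle node: remembers the one CLID shared by all its flavor leaves.
struct CalibCLIDNodeLike {
    CLID childClassID;
    std::map<std::string, int> leaves;   // flavor -> stand-in for a DataObject
};

// Stand-in TCDS: "/Calib" -> calibration type -> flavor.
struct TcdsSketch {
    std::map<std::string, CalibCLIDNodeLike> types;   // key: e.g. "TKR_HotChan"

    void registerLeaf(const std::string& type, const std::string& flavor,
                      CLID clid) {
        CalibCLIDNodeLike& node = types[type];
        node.childClassID = clid;   // flavor does not affect the CLID
        node.leaves[flavor] = 1;
    }
    // All information lives in the leaves; paths have exactly three levels.
    std::string pathOf(const std::string& type,
                       const std::string& flavor) const {
        return "/Calib/" + type + "/" + flavor;
    }
};
```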

TCDS classes

All classes will be derived from CalibBase, itself derived from DataObject. CalibBase implements IValidity, as all calibration objects must. It also implements an additional method, getSerNo, so that clients can determine whether or not the object has been updated.
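The CalibBase contract described above can be sketched as follows. getSerNo is from the source; the other method names and the use of doubles for times are guesses at the flavor of the real interface, not the actual CalibData code:

```cpp
#include <cassert>

class CalibBaseSketch {
public:
    CalibBaseSketch(double since, double till)
        : m_since(since), m_till(till), m_serNo(0) {}

    // IValidity-style check: is this object good for the given event time?
    bool isValid(double eventTime) const {
        return eventTime >= m_since && eventTime <= m_till;
    }

    // Called when a new best-match dataset replaces the current contents.
    void update(double since, double till, int serNo) {
        m_since = since;
        m_till = till;
        m_serNo = serNo;
    }

    // Clients compare serial numbers to detect that an update has happened.
    int getSerNo() const { return m_serNo; }

private:
    double m_since, m_till;   // validity interval
    int m_serNo;              // serial number from the metadata database
};
```

A client that cached derived quantities would remember the serial number it last saw and recompute whenever getSerNo returns something different.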

ROOT will be the format for bulk calibration data consisting largely of floating point numbers. XML will be used for tracker strip status bits, possibly for Cal and ACD status information as well.

Bad strips

The BadStrips class, used for both hot and dead strips, is a relatively complicated beast. It is not simply a list of strips; that would be very verbose, since each strip would need additional addressing information (tower, tray, Si plane within tray). Instead it embodies the physical hierarchy to some degree: BadStrips has nested classes defining a bad tower and a bad uniplane. A bad tower is either marked as entirely bad or contains a collection of bad uniplanes. Each such uniplane is identified by tray, top-or-bottom, and degree of badness (e.g., the calibration procedure might distinguish between barely-hot strips and very hot strips). It may be marked as entirely bad, or may contain a collection of strips marked as bad. At each level, objects are created only if necessary; that is, there is no object to contain bad strips in tower 5 if tower 5 has no bad strips. This is space-efficient and makes it relatively easy to fetch all bad strips within certain physical or logical units, but may make it difficult to use the information directly. Applications can, if they wish, copy it to a local cache in a form convenient for their use.

Currently BadStrips has just a couple of query functions. The main, perhaps only, client, the Tracker Service, will use the BadStrips visitor interface to traverse the entire data structure. See page 4 of this set of slides for a diagram of all the collaborating entities.
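The nested tower/uniplane/strip layout and a visitor-style traversal can be sketched like this. The real CalibData class names, fields and visitor signature differ; this only illustrates the shape of the structure:

```cpp
#include <cassert>
#include <vector>

struct BadStripsSketch {
    struct Uniplane {
        int tray;
        bool top;                  // top-or-bottom of the tray
        int badness;               // e.g. barely-hot vs very hot
        bool allBad;               // whole plane marked bad
        std::vector<int> strips;   // only the bad strips, by strip number
    };
    struct Tower {
        int id;
        bool allBad;                       // whole tower marked bad
        std::vector<Uniplane> uniplanes;   // only planes with bad strips
    };
    std::vector<Tower> towers;             // only towers with bad strips

    // Visitor-style traversal: the callback sees each bad strip together
    // with its full addressing information.
    template <typename Visit>
    void traverse(Visit visit) const {
        for (const Tower& t : towers)
            for (const Uniplane& u : t.uniplanes)
                for (int s : u.strips)
                    visit(t.id, u.tray, u.top, s);
    }
};
```

Because empty levels are simply absent, traversal cost scales with the number of bad strips rather than the size of the detector, which is the space efficiency the text describes.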

Calorimeter calibration data should be much more regular (a fixed amount of data per crystal, for example), so the TCDS class(es) will mostly be in a form immediately useful to clients.

Package organization


The classes described above, some still to be written, will live in two packages: CalibSvc and CalibData. CalibSvc contains the data service CalibDataSvc and the conversion services CalibMySQLCnvSvc, CalibXmlCnvSvc and CalibROOTCnvSvc, as well as related abstract interfaces such as ICalibMetaCnvSvc and the individual converters. CalibData will contain at least some, possibly all, TCDS classes and the model of the TCDS organization.

Done so far


The first goal is to write, assemble and debug all the remaining pieces needed to exercise, at least approximately, the full protocol.

As of 16 Jan. this has all been checked out on both Linux and Windows (see some output), at least for the straightforward case involving no heavy-duty error handling.

A few more items have been checked off since:

To do


Additional work for a real functioning system includes

and so forth.

Policies and Tools

There are several calibration infrastructure policy decisions to be made. I have preferences in some cases, but these are all still open questions.

  1. Where should the bulk data be kept? Bulk calibration data should be handled more or less in the same manner as event data; probably this means finding some NFS space for it at SLAC.
    Is it mirrored? Yes.
    Is the MySQL database mirrored? Yes.
  2. Who may add calibration datasets to the standard pool? Some limited collection of people taken from Core software, subsystem groups and perhaps I & T, but to begin with just Joanne and perhaps a couple others.
    Who can register new datasets (that is, write to the metadata database)? A similarly restricted group.
  3. As currently designed, calibration data is geometrically organized so that it will be convenient to associate with event data. Who does the translation from electronics ordering? This will be the responsibility of people associated with each subsystem who know the details of this mapping. Fully supported data types in the calibration database will always be in geometry order.
  4. Should the system be dedicated to calibrations of interest to Gaudi event analysis, or should it also be used as a repository for other, time-dependent detector information? No preference at the moment.
    In the latter case, should metadata for this information be kept in a separate table? Yes
  5. The current draft of the XML persistent form of bad strips data includes several fields intended for diagnostic or historical purposes, such as the locale where the calibration was done, cuts used, etc. Is this useful? If so, is the current set of such fields adequate? Should they be required or just optional?
  6. Each calibration dataset must have associated with it a validity interval (validity start and end), stored in the metadata database row for that dataset. How is this interval, especially the validity end, chosen?
    Does it get updated when a new calibration of the same type is done? Probably.
  7. Should a set of recognized flavors be defined for all calibration types, or do we expect large numbers of flavors which apply only to one or two calibration types? If there is no significant penalty in defining extra unused nodes in the TCDS, or if flavors are not heavily used at all, it is simplest to define all flavors for all calibration types, since this is easier to express in, for example, a job options file.

Depending on decisions made above, will need some or all of the following:



GLAST Software Home

J. Bogart
