MySQL Prototype for Constants Handling

Introduction

GLAST raw data must undergo many transformations before they are useful for science. Among these are appropriate handling of dead and noisy channels and the application of functions to convert from electronics units to physical units. The transformations are parametrized by time-dependent information, the output of several kinds of calibration. A facility is needed to store multiple versions of this information. The contents must be accessible by time period of validity. It also may be the case that several calibrations of a given type will have been made covering the same time period as calibration procedures are developed and tested. The facility must provide a field in which one can mark a given calibration as "production" (or not) and support queries based on this field. Other forms of query and sorting are also desirable. These access requirements suggest the use of a relational database.

MySQL

MySQL is a widely-used open-source relational database implementation. It was selected for study because it is well supported for all platforms of interest and its features and performance characteristics appeared to be a good match to our needs. In particular, it is built for speed and efficiency. It does not support transactions and rollbacks, but makes lower-level locking mechanisms available to applications wishing to use them. Several HEP experiments are using or investigating using MySQL for similar purposes.

System Experience

Documentation for installation and system administration of MySQL is generally complete and clear. The installation of the MySQL server on Linux was uneventful.

Several client programs come with the MySQL distribution, including one for administrators to set up databases and accounts, and another for interactive users to query and update their databases using the usual SQL commands. Though not always straightforward to use, involving as it does combinations of users, databases, and machines to be manipulated via a primitive interface, the model for access control was adequate for our needs. The interactive SQL client was run successfully on all three target platforms: Linux, Solaris and Windows.

Applications Programming Interface

The MySQL distribution comes with a library of c routines supporting programmatic access to MySQL databases. It provides all functions we might conceivably need. Like other aspects of the product, the routines are well-documented.

The Prototype

The prototype was built to handle two forms of calibration output: tracker noisy channel lists and tracker dead channel lists. They were chosen because actual calibration data (for the testbeam instrument) were available, and because they are relatively simple in structure.

Database structure

The prototype defined four MySQL tables in all, two for each calibration type: a summary table and a data table. Other schemes, such as maintaining a separate data table for each calibration run, or combining both summary tables into one, were briefly investigated and dropped.

Each row in a summary table represented a single calibration run. Columns included a unique index, start and end times delineating the interval over which the calibration data were valid, a description of the input data used, and so forth. For the two forms of calibration in the prototype the summary tables were identical in structure. For a larger collection of calibration types this would not be the case, but a core set of columns would belong to any summary table.

Each row in a data table identified a single bad channel for a particular calibration run. Columns included fields sufficient to identify the channel (e.g., layer and strip number) and an index pointing back to the relevant row in the summary table. The dead channel table also had a column fully dead which could take on values 'Y' or 'N' while the noisy channel table had a column percent (i.e., percent of the time the channel was noisy) which took a floating point value. This permitted different client applications to apply their own thresholds in determining which channels were too noisy. Combining information for all calibrations of a given type into a single data table maked tracing the history of a particular channel or set of channels a simple operation.

Application database access

Utilities were written to add data and summary row (in an interlocked fashion) for a new calibration of each type to the tables. Other utilities could fetch the list of bad (noisy or dead) channels discovered by a particular calibration run. The utilities were straightforward to write and behaved properly when run either locally or remotely, on a machine other than the one hosting the database.

The interactive SQL client may be used fully interactively by typing individual SQL commands, or one may invoke a script. Scripts were used to create a destroy the database a whole, relatively common operations during prototyping.

Conclusions

In spite of its limited scope, the prototype demonstrated that the relational database manager MySQL would be a good choice for maintaining directory information on calibrations. No similar conclusion can as yet be drawn for the calibration data (as opposed to meta-data); the prototype model was too specific to the calibration types involved and too simplistic. More work needs to be done on the detailed structure of the summary databases; and on investigating whether MySQL should also be used as a repository for at least some forms of calibration output.


Written 2001-07-27
J. Bogart