Identifier Conversion : July 24, 2001

Identifier Conversion

Definitions

An identifier is a sequence of ordered pairs (fieldname, value). A coherent set of identifiers is one which may be specified by an identifier dictionary (ref?), hence may be viewed as a tree. Each node has an identifier field name, (optional) constraints on values it may take on, (optional) constraints on values of its parent node, and (optional) child nodes.
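
As a minimal sketch of the definition above, an identifier can be held as an ordered sequence of (fieldname, value) pairs. The names IdField, Identifier and toString, the unsigned value type, and the field names used with them are assumptions made for illustration; they are not taken from the detModel code.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation: one (fieldname, value) pair per field,
// ordered from the root of the dictionary tree downward.
// The unsigned value type is an assumption of this sketch.
using IdField = std::pair<std::string, unsigned>;
using Identifier = std::vector<IdField>;

// Render an identifier as "/name=value/name=value..." for inspection.
std::string toString(const Identifier& id) {
    std::string out;
    for (const auto& field : id) {
        out += "/" + field.first + "=" + std::to_string(field.second);
    }
    return out;
}
```

Keeping the fields in a vector preserves their order, which matters because an identifier corresponds to a path through the dictionary tree.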

An identifier conversion acts on a subtree. The result of converting an identifier is another identifier, but need not belong to the same identifier dictionary. An identifier converter consists of a collection of identifier conversions which act on disjoint subtrees of the full set. Of course disjointness can only be verified when the converter is applied to identifiers from a known dictionary, but a path to the root of the subtree may be written down, even when the dictionary description is not available, as a sequence of fieldnames and (optional) value constraints.

Conversions -- Details

A conversion specification has three parts:

A path. In terms of the tree representing an id dictionary, this is a path from the root node to some other node. The subtree hanging off this (destination) node is the domain of the conversion.
A condition. An extra condition on an identifier belonging to the domain, such as "must have field with fieldname X". (This may be the only form of condition we'll ever need.) Identifiers in the domain but not fulfilling the condition are left alone by the conversion.
An operation to be performed on identifiers in the domain which satisfy the condition.

Four conversion operations are defined. All may be thought of as collapsing the collection of identifiers.

single-value: For all identifiers in the domain possessing a particular field, specified in the condition, generate a new identifier identical to the old except that the value for the particular field is set to a specified single value. [Not sure we really need this one.]
truncate: For all identifiers in the domain satisfying condition, truncate all fields following a particular specified field.
disappear: All identifiers in the domain possessing a particular field are dropped. They have no counterpart in the image.
compress: For all identifiers in the domain possessing a particular field, generate new identifiers by eliminating a specified segment of id fields.
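
The four operations can be sketched as free functions on a simple identifier type. This is an illustration only: the Identifier alias, the function names, the use of std::optional (C++17) to stand for a dropped identifier, and the choice to describe the compressed segment by inclusive first and last field names are all assumptions of the sketch.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using Identifier = std::vector<std::pair<std::string, unsigned>>;

// Index of the first field called `name`, or id.size() if absent.
std::size_t indexOf(const Identifier& id, const std::string& name) {
    std::size_t i = 0;
    while (i < id.size() && id[i].first != name) ++i;
    return i;
}

// single-value: overwrite the value of `field` with `value`.
Identifier singleValue(Identifier id, const std::string& field, unsigned value) {
    std::size_t i = indexOf(id, field);
    if (i < id.size()) id[i].second = value;
    return id;
}

// truncate: drop every field that follows `field`.
Identifier truncate(Identifier id, const std::string& field) {
    std::size_t i = indexOf(id, field);
    if (i < id.size()) id.erase(id.begin() + i + 1, id.end());
    return id;
}

// disappear: an identifier having `field` has no counterpart in the image.
std::optional<Identifier> disappear(Identifier id, const std::string& field) {
    if (indexOf(id, field) < id.size()) return std::nullopt;
    return id;
}

// compress: remove the segment running from field `from` through field `to`
// (inclusive bounds are an assumption of this sketch).
Identifier compress(Identifier id, const std::string& from, const std::string& to) {
    std::size_t a = indexOf(id, from);
    std::size_t b = indexOf(id, to);
    if (a <= b && b < id.size()) id.erase(id.begin() + a, id.begin() + b + 1);
    return id;
}
```

Each function leaves an identifier unchanged when the named field is absent, matching the rule that identifiers not fulfilling the condition are left alone.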

Some pictures may make this clearer.

Motivation

Why would anyone care about this and what does it have to do with the LAT, anyway?

The precise way the volumes comprising an instrument description are defined and nested in the XML geometry description is conditioned by the capabilities of the simulator(s). For example, a component like the grid, logically and perhaps even structurally a single object, will be broken into several volumes. Therefore a typical detModel identifier (that is, one coming directly from the XML description of the instrument) may contain many fields which are of no interest to applications other than simulation. Clients of digitized CAL data, for example, will need to know tower number, layer number, and log within layer ("column"?). Other volume fields, such as those denoting LAT (rather than, e.g., spacecraft) and CAL, are implicitly known; still others may just be uninteresting. Clients of the energy-accounting hit data might wish to sum up all energy from the grid, but the output from the simulator will necessarily be divided among several different volumes. Properly defined identifier converters will convert detModel identifiers into a form much closer to that needed by these clients.

Implementation (XML)

XML elements are defined corresponding very closely to the conceptual components described above. There is an XML element for each operation. There is a <path> element and a <hasField> element which defines the condition on an identifier of having a particular specified field. (Ultimately a larger repertoire of conditions may be needed.) The <idConv> element stands for a single conversion. Its content is just a <path> followed by a <hasField> followed by one of the operations. Finally there is an <idConverter> element which contains a list of <idConv>s and optionally other things, such as constants, needed for context.

Implementation (C++)

Classes (denoted, e.g., MyClass) corresponding closely to the XML elements will be used to represent an Id Converter. These classes and the primary services to be provided are as follows.

Path

A path is simply a sequence of field names. It is implemented as a vector of strings.
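
A minimal sketch of such a path and of the domain check it supports follows. The function name inDomain and the rule that an identifier is in the domain when its leading field names match the path one for one are assumptions of this sketch; value constraints are ignored here.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Identifier = std::vector<std::pair<std::string, unsigned>>;
using Path = std::vector<std::string>;  // sequence of field names

// True if the leading field names of `id` match `path` one for one.
bool inDomain(const Path& path, const Identifier& id) {
    if (id.size() < path.size()) return false;
    for (std::size_t i = 0; i < path.size(); ++i) {
        if (id[i].first != path[i]) return false;
    }
    return true;
}
```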

Condition

For now the only supported condition is presence of a specified field name (using <hasField>). The Condition class is simply a typedef of a string, to be filled with the field name. This could change later to a base class with derived classes for each condition type.
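
As a sketch, the typedef and the check it implies might look as follows. The helper name satisfies and the Identifier alias are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using IdField = std::pair<std::string, unsigned>;
using Identifier = std::vector<IdField>;

// The condition is just the field name that must be present.
typedef std::string Condition;

// True if `id` has a field whose name equals the condition.
bool satisfies(const Condition& cond, const Identifier& id) {
    return std::any_of(id.begin(), id.end(),
                       [&cond](const IdField& f) { return f.first == cond; });
}
```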

Operation

IdOperation is a base class; derived classes are defined to correspond to each of the supported operations.
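
One possible shape for this hierarchy is sketched below. Only the name IdOperation comes from the text; the derived class names, the convert signature, and the convention that an empty result stands for a dropped identifier are assumptions of the sketch.

```cpp
#include <string>
#include <utility>
#include <vector>

using Identifier = std::vector<std::pair<std::string, unsigned>>;

// Base class: one derived class per supported operation.
class IdOperation {
public:
    virtual ~IdOperation() = default;
    // Returns the converted identifier; in this sketch an empty
    // result stands for "dropped".
    virtual Identifier convert(const Identifier& id) const = 0;
};

// Hypothetical derived class for the truncate operation.
class IdOpTruncate : public IdOperation {
public:
    explicit IdOpTruncate(std::string field) : m_field(std::move(field)) {}
    Identifier convert(const Identifier& id) const override {
        Identifier out;
        for (const auto& f : id) {
            out.push_back(f);
            if (f.first == m_field) break;  // keep this field, drop the rest
        }
        return out;
    }
private:
    std::string m_field;
};

// Hypothetical derived class for the disappear operation.
class IdOpDisappear : public IdOperation {
public:
    Identifier convert(const Identifier&) const override { return {}; }
};
```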

Id Conversion

The IdConversion class corresponds to <idConv>. It provides services by delegating appropriate pieces to its components. For example:

Check if identifier is in its domain (delegated to its path component).
Convert identifier. (First check that it's in the domain. If so, delegate the check of the condition to the hasField or other condition-type component. Then either clone the identifier, if the condition is not satisfied, or delegate conversion to its IdOperation component if it is.)
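
The delegation described above can be sketched as follows. To keep the example short, a std::function stands in for the IdOperation component, and an empty std::optional (C++17) is returned for identifiers outside the domain; both choices, and all member names, are assumptions of this sketch rather than the actual design.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using Identifier = std::vector<std::pair<std::string, unsigned>>;
using Path = std::vector<std::string>;
using Condition = std::string;  // field name required by hasField
// Stand-in for the IdOperation component.
using Operation = std::function<Identifier(const Identifier&)>;

struct IdConversion {
    Path path;
    Condition condition;
    Operation operation;

    // Domain check: leading field names must match the path.
    bool inDomain(const Identifier& id) const {
        if (id.size() < path.size()) return false;
        for (std::size_t i = 0; i < path.size(); ++i) {
            if (id[i].first != path[i]) return false;
        }
        return true;
    }

    // Condition check: the identifier must have the named field.
    bool satisfies(const Identifier& id) const {
        for (const auto& f : id) {
            if (f.first == condition) return true;
        }
        return false;
    }

    // Empty result: `id` is outside the domain (sketch convention).
    std::optional<Identifier> convert(const Identifier& id) const {
        if (!inDomain(id)) return std::nullopt;
        if (!satisfies(id)) return id;  // clone unchanged
        return operation(id);           // delegate to the operation
    }
};
```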

Id Converter

The IdConverter class corresponds to <idConverter>. It provides the primary, in fact usually the only, interface to unprivileged clients. Its services are:

conversion of an identifier to another
conversion of an entire dictionary to another
consistency check. Conversions belonging to the converter should have non-intersecting domains. (probably invoked at build time)
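
A minimal sketch of the consistency check follows, assuming each domain is described by its path alone (value constraints ignored). Under that assumption two subtrees of the same tree intersect exactly when one path is a prefix of the other. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

using Path = std::vector<std::string>;

// True if `a` is a prefix of `b` (an equal path counts as a prefix).
bool isPrefix(const Path& a, const Path& b) {
    return a.size() <= b.size() && std::equal(a.begin(), a.end(), b.begin());
}

// True if no two paths describe intersecting subtrees.
bool domainsDisjoint(const std::vector<Path>& paths) {
    for (std::size_t i = 0; i < paths.size(); ++i) {
        for (std::size_t j = i + 1; j < paths.size(); ++j) {
            if (isPrefix(paths[i], paths[j]) || isPrefix(paths[j], paths[i])) {
                return false;
            }
        }
    }
    return true;
}
```

The pairwise comparison is quadratic in the number of conversions, which should be acceptable for a check run once when the converter is built.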