D1 (Event) Database Requirements
Overview
The D1 event database contains the data, e.g. photon events, needed
by the beginning stages of any scientific analysis. Access by all
scientific analysis clients will be via the layer known as the
U1 data extractor. U1 verifies that queries are valid, then sends
them to the D1 tool, the internal entity which has direct access to
the data. U1 receives the data from the D1 tool, then reformats it
into FITS files (whether this always happens, or only on request,
is an open question) before returning it, along with status
information, to the client.
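To make this flow concrete, here is a minimal sketch in Python of
how the U1 layer might behave. All names here (extract, D1Tool,
write_fits) are hypothetical illustrations, not actual interfaces:

    # Hypothetical sketch of the U1 extraction flow; not the real API.
    class D1Tool:
        """Stands in for the internal entity with direct data access."""
        def run_query(self, query):
            # In reality this would read event data from the D1 store.
            return [{"photon_id": 1, "energy_mev": 150.0}]

    def write_fits(events):
        # Placeholder: a real implementation might use astropy.io.fits.
        return b"SIMPLE  =                    T"

    def extract(query, d1_tool):
        """U1 entry point: validate, fetch from D1, reformat, return."""
        if not isinstance(query, str) or not query.strip():
            return {"status": "error", "reason": "invalid query"}
        events = d1_tool.run_query(query)   # D1 tool does the direct access
        fits_data = write_fits(events)      # reformat into FITS
        return {"status": "ok", "data": fits_data}

    print(extract("energy > 100 MeV", D1Tool())["status"])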
Open question: should this overview also say something about adding
or replacing data?
This document describes some of the fundamental concepts and procedures
involved in database design.
Definitions
- database
  - Tends to mean one of two things: either just a collection of
    data on some physical medium, lying there passively, or the
    combination of
    - the data as above (e.g., disk files comprising a MySQL database),
    - a protocol for accessing it (e.g., SQL), and
    - some active agent, like a piece of software, which implements
      the protocol (the MySQL server plus the mysql client program,
      which accepts SQL queries as input).

    Usually we use the term in the second sense, which can still be
    ambiguous: there may be multiple layers of protocols and
    associated agents for a single collection of data. For example,
    U1 is an additional layer for the D1 data.
- metadata
- Information about other data such as size, when created,
where to find it, etc.
- bulk data
- The actual stuff users want access to; complement of metadata.
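As an illustration of the metadata/bulk data split, the sketch below
(hypothetical, using Python's built-in sqlite3 module) keeps metadata
in a relational table while the bulk event data lives in separate
files that the metadata points to:

    import sqlite3

    # Hypothetical illustration: metadata in SQL, bulk data elsewhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE event_files (
            file_path  TEXT,     -- where to find the bulk data
            n_events   INTEGER,  -- size, in events
            created    TEXT      -- when created
        )""")
    conn.execute("INSERT INTO event_files VALUES (?, ?, ?)",
                 ("/data/d1/run0042.fits", 120000, "2003-06-01"))

    # A metadata query answers "where is it?" and "how big is it?"
    # without ever touching the bulk data itself.
    for path, n, created in conn.execute(
            "SELECT file_path, n_events, created FROM event_files"):
        print(path, n, created)
    conn.close()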
Determining requirements
As with requirements on practically anything, database requirements
come in two flavors. Functional requirements
are what the
clients want the thing to do. Each protocol layer has its own set
of functional requirements.
Performance requirements concern
constraints on resource usage.
Functional requirements
For any database, at the most abstract level there are just a few things
you can ask it to do:
- Provide read access to data
- Add new data
- Modify existing data
- Monitor itself (keep statistics on usage, log errors, etc.)
Depending on the particulars of the data and the design chosen, it
may make sense to subdivide some of these functions to cover cases
where the data in question is bulk data, metadata, or both.
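One way to picture these abstract functions is as an interface. The
following is a hypothetical Python rendering, not a proposed design:

    from abc import ABC, abstractmethod

    class Database(ABC):
        """Hypothetical interface covering the four abstract functions."""

        @abstractmethod
        def read(self, query): ...            # provide read access to data

        @abstractmethod
        def add(self, records): ...           # add new data

        @abstractmethod
        def modify(self, query, changes): ... # modify existing data

        @abstractmethod
        def usage_stats(self): ...            # monitor itself

    # A concrete design might subdivide these further, e.g. separate
    # read_metadata() and read_bulk() methods instead of a single read().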
Then there may be additional requirements on the protocol,
especially on the client side (i.e., the user interface), such as
- support for interactive/human access
- support for batch/program access
- scripting
- "learn" mode where interactive input gets saved to a script
- security requirements
- verification of input
- error reporting
- support (or not) for remote access
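The "learn" mode is perhaps the least standard item above, so here is
one hypothetical way a client shell might implement it in Python,
recording each interactive command so the session can later be
replayed as a batch script (all names are illustrative):

    # Hypothetical sketch of a client "learn" mode.
    def interactive_session(handle_command, script_path=None):
        """Read commands interactively; optionally record them."""
        recorded = []
        while True:
            line = input("d1> ").strip()
            if line in ("quit", "exit"):
                break
            handle_command(line)        # execute the command as usual
            recorded.append(line)       # ...and remember it
        if script_path is not None:
            with open(script_path, "w") as f:
                f.write("\n".join(recorded) + "\n")

    # Replaying the saved script gives batch access to the same
    # commands that were entered interactively.
    def run_script(handle_command, script_path):
        with open(script_path) as f:
            for line in f:
                handle_command(line.strip())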
For an actual database,
some of these classes of functions will be important (or,
more likely, subsets of some classes will be important) while others
may be uninteresting. Use cases are a good tool to determine which
are which.
Finally, independent of use cases, there usually are requirements
on maintaining integrity and on backup.
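As a minimal illustration of what a backup requirement can look like
in practice, the sketch below uses the online backup API of Python's
built-in sqlite3 module (the choice of SQLite here is purely for
illustration, not an assumption about the D1 design):

    import sqlite3

    # Copy a live database to a backup file while it stays usable.
    src = sqlite3.connect("d1_events.db")   # hypothetical file name
    dst = sqlite3.connect("d1_events.bak")
    src.backup(dst)
    dst.close()
    src.close()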
Performance and resource requirements
These include such things as
- Limits on resources such as disk space, network bandwidth, etc.
- Required throughput for a single transaction
- Required throughput for a series of transactions (limit on
  per-transaction overhead; see the timing sketch after this list)
- Degree of support for concurrency
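Per-transaction overhead in particular is easy to quantify. The
sketch below (hypothetical, again using Python's sqlite3 module)
times N inserts issued as N separate transactions against the same
N inserts grouped into a single transaction; the difference exposes
the fixed cost paid per transaction:

    import sqlite3
    import time

    N = 10_000
    ROWS = [(i, float(i)) for i in range(N)]

    def timed_inserts(batch):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE events (id INTEGER, energy REAL)")
        start = time.perf_counter()
        if batch:
            with conn:                  # one transaction for all rows
                conn.executemany(
                    "INSERT INTO events VALUES (?, ?)", ROWS)
        else:
            for row in ROWS:            # one transaction per row
                with conn:
                    conn.execute(
                        "INSERT INTO events VALUES (?, ?)", row)
        elapsed = time.perf_counter() - start
        conn.close()
        return elapsed

    print("one transaction per row:", timed_inserts(batch=False))
    print("single transaction:     ", timed_inserts(batch=True))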
J. Bogart