D1 (Event) Database Requirements
Overview
The D1 event database contains the data, e.g. photon events, needed
by the beginning stages of any scientific analysis. Access by all
scientific analysis clients will be via the layer known as the
U1 data extractor. U1 verifies that queries are valid, then sends
them to the D1 tool, the internal entity which has direct access to
the data. U1 receives the data from the D1 tool, then reformats it
into FITS files (whether this always happens, or only on request,
is an open question) before returning it, along with status
information, to the client.
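To make this flow concrete, here is a minimal sketch in Python of
how the U1 layer might behave. All names here (extract, D1Tool,
write_fits) are hypothetical illustrations, not actual interfaces:

    # Hypothetical sketch of the U1 extraction flow; not the real API.
    class D1Tool:
        """Stands in for the internal entity with direct data access."""
        def run_query(self, query):
            # In reality this would read event data from the D1 store.
            return [{"photon_id": 1, "energy_mev": 150.0}]

    def write_fits(events):
        # Placeholder: a real implementation might use astropy.io.fits.
        return b"SIMPLE  =                    T"

    def extract(query, d1_tool):
        """U1 entry point: validate, fetch from D1, reformat, return."""
        if not isinstance(query, str) or not query.strip():
            return {"status": "error", "reason": "invalid query"}
        events = d1_tool.run_query(query)   # D1 tool does the direct access
        fits_data = write_fits(events)      # reformat into FITS
        return {"status": "ok", "data": fits_data}

    print(extract("energy > 100 MeV", D1Tool())["status"])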
Open question: should this overview also say something about adding
or replacing data?
This document describes some of the fundamental concepts and procedures
involved in database design.
Definitions
- database
  - Tends to mean one of two things: either just a collection of
    data on some physical medium, lying there passively, or the
    combination of
    - the data as above (e.g., disk files comprising a MySQL database),
    - a protocol for accessing it (e.g., SQL), and
    - some active agent, like a piece of software, which implements
      the protocol (the MySQL server plus the mysql client program,
      which accepts SQL queries as input).

    Usually we use the term in the second sense, which can still be
    ambiguous: there may be multiple layers of protocols and
    associated agents for a single collection of data. For example,
    U1 is an additional layer for the D1 data.
- metadata
- Information about other data such as size, when created,
where to find it, etc.
- bulk data
- The actual stuff users want access to; complement of metadata.
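As an illustration of the metadata/bulk data split, the sketch below
(hypothetical, using Python's built-in sqlite3 module) keeps metadata
in a relational table while the bulk event data lives in separate
files that the metadata points to:

    import sqlite3

    # Hypothetical illustration: metadata in SQL, bulk data elsewhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE event_files (
            file_path  TEXT,     -- where to find the bulk data
            n_events   INTEGER,  -- size, in events
            created    TEXT      -- when created
        )""")
    conn.execute("INSERT INTO event_files VALUES (?, ?, ?)",
                 ("/data/d1/run0042.fits", 120000, "2003-06-01"))

    # A metadata query answers "where is it?" and "how big is it?"
    # without ever touching the bulk data itself.
    for path, n, created in conn.execute(
            "SELECT file_path, n_events, created FROM event_files"):
        print(path, n, created)
    conn.close()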
Determining requirements
As with requirements on practically anything, database requirements
come in two flavors. Functional requirements
are what the
clients want the thing to do. Each protocol layer has its own set
of functional requirements.
Performance requirements concern
constraints on resource usage.
Functional requirements
For any database, at the most abstract level there are just a few things
you can ask it to do:
- Provide read access to data
- Add new data
- Modify existing data
- Monitor itself (keep statistics on usage, log errors, etc.)
Depending on the particulars of the data and the design chosen, it
may make sense to subdivide some of these functions to cover cases
where the data in question is bulk data, metadata, or both.
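One way to picture these abstract functions is as an interface. The
following is a hypothetical Python rendering, not a proposed design:

    from abc import ABC, abstractmethod

    class Database(ABC):
        """Hypothetical interface covering the four abstract functions."""

        @abstractmethod
        def read(self, query): ...            # provide read access to data

        @abstractmethod
        def add(self, records): ...           # add new data

        @abstractmethod
        def modify(self, query, changes): ... # modify existing data

        @abstractmethod
        def usage_stats(self): ...            # monitor itself

    # A concrete design might subdivide these further, e.g. separate
    # read_metadata() and read_bulk() methods instead of a single read().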
Then there may be additional requirements on the protocol,
especially on the client side (i.e., the user interface), such as
- support for interactive/human access
- support for batch/program access
- scripting
- "learn" mode where interactive input gets saved to a script
- security requirements
- verification of input
- error reporting
- support (or not) for remote access
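The "learn" mode is perhaps the least standard item above, so here is
one hypothetical way a client shell might implement it in Python,
recording each interactive command so the session can later be
replayed as a batch script (all names are illustrative):

    # Hypothetical sketch of a client "learn" mode.
    def interactive_session(handle_command, script_path=None):
        """Read commands interactively; optionally record them."""
        recorded = []
        while True:
            line = input("d1> ").strip()
            if line in ("quit", "exit"):
                break
            handle_command(line)        # execute the command as usual
            recorded.append(line)       # ...and remember it
        if script_path is not None:
            with open(script_path, "w") as f:
                f.write("\n".join(recorded) + "\n")

    # Replaying the saved script gives batch access to the same
    # commands that were entered interactively.
    def run_script(handle_command, script_path):
        with open(script_path) as f:
            for line in f:
                handle_command(line.strip())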
For an actual database,
some of these classes of functions will be important (or,
more likely, subsets of some classes will be important) while others
may be uninteresting. Use cases are a good tool to determine which
are which.
Finally, independent of use cases, there usually are requirements
on maintaining integrity and on backup.
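As a minimal illustration of what a backup requirement can look like
in practice, the sketch below uses the online backup API of Python's
built-in sqlite3 module (the choice of SQLite here is purely for
illustration, not an assumption about the D1 design):

    import sqlite3

    # Copy a live database to a backup file while it stays usable.
    src = sqlite3.connect("d1_events.db")   # hypothetical file name
    dst = sqlite3.connect("d1_events.bak")
    src.backup(dst)
    dst.close()
    src.close()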
Performance and resource requirements
These include such things as
- Limits on resources such as disk space, network bandwidth, etc.
- Required throughput for a single transaction
- Required throughput for a series of transactions (limit on
  per-transaction overhead; see the timing sketch after this list)
- Degree of support for concurrency
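Per-transaction overhead in particular is easy to quantify. The
sketch below (hypothetical, again using Python's sqlite3 module)
times N inserts issued as N separate transactions against the same
N inserts grouped into a single transaction; the difference exposes
the fixed cost paid per transaction:

    import sqlite3
    import time

    N = 10_000
    ROWS = [(i, float(i)) for i in range(N)]

    def timed_inserts(batch):
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE events (id INTEGER, energy REAL)")
        start = time.perf_counter()
        if batch:
            with conn:                  # one transaction for all rows
                conn.executemany(
                    "INSERT INTO events VALUES (?, ?)", ROWS)
        else:
            for row in ROWS:            # one transaction per row
                with conn:
                    conn.execute(
                        "INSERT INTO events VALUES (?, ?)", row)
        elapsed = time.perf_counter() - start
        conn.close()
        return elapsed

    print("one transaction per row:", timed_inserts(batch=False))
    print("single transaction:     ", timed_inserts(batch=True))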
J. Bogart