About RUcore: The Data Model
The data model is the heart of the RUcore architecture. It provides the basic description of information so that the information can be effectively discovered and used. A good data model should be:
- Understandable by the information manager and the user. A repository can hold terabytes or petabytes of data. The data model shows everyone how the information is organized and interrelated for use.
- Context independent. A good data model supports all contexts: those that information creators bring to the repository, which can be layered on top of the data model, and those of future generations of scholars, which cannot yet be envisioned. The rapid proliferation of digital information is already revolutionizing research. A good data model should be flexible enough to accommodate major transformations in the research and publication process.
- A representation of "living data." Digital data lives in relationship to other data and may be repurposed many times over its lifecycle. For example, a data set may be created in a physical experiment, published as the appendix to a journal article, and reused in a computational simulation, all within the space of a few months. A good data model tracks the events and interrelationships of data, exposing not only the data for reuse but also the lifecycle and ecology of the data.
- Supportive of the management and preservation of data. A good data model supports not only the discovery and reuse of information but also its preservation for long-term access.
The RUcore Data Model
RUcore tracks the lifecycle and ecology of the data.
The lifecycle of information is the interaction of the information object with place (place of creation, place of storage, etc.) and agent (the person(s) or organization(s) responsible for creating, describing, managing or using the information). When the object interacts with a place or an agent at a specific point in time, an event in the lifecycle of the information is said to occur. Events provide context for information use. For example, a piece of sculpture may be created by a sculptor ("agent") during her impressionist period in Paris ("place") in 1963. This is one event in the life of the sculpture. That same piece of sculpture may be purchased by the head curator ("agent") for permanent display in the garden of a New York museum ("place") in 1978. This is another event in the lifecycle of the data.
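The relationship between object, agent, place and time can be sketched in a few lines of Python. This is only an illustration of the idea, not the RUcore schema: the class names (`Agent`, `Place`, `Event`, `InformationObject`), the field names and the event labels `"creation"` and `"acquisition"` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Agent:
    """A person or organization that acts on the object (hypothetical type)."""
    name: str
    role: str


@dataclass(frozen=True)
class Place:
    """Where the interaction happens (hypothetical type)."""
    name: str


@dataclass(frozen=True)
class Event:
    """One interaction of an object with an agent and a place at a point in time."""
    kind: str
    agent: Agent
    place: Place
    year: int


@dataclass
class InformationObject:
    """An information object together with the events recorded about it."""
    title: str
    events: List[Event] = field(default_factory=list)

    def record(self, event: Event) -> None:
        # Events are appended as they are described, in any order.
        self.events.append(event)

    def lifecycle(self) -> List[Event]:
        # The lifecycle is the recorded events in chronological order.
        return sorted(self.events, key=lambda e: e.year)


# The sculpture example from the text, recorded out of order on purpose.
sculpture = InformationObject("Untitled sculpture")
sculpture.record(
    Event("acquisition", Agent("head curator", "purchaser"),
          Place("New York museum garden"), 1978)
)
sculpture.record(
    Event("creation", Agent("sculptor", "creator"), Place("Paris"), 1963)
)

for event in sculpture.lifecycle():
    print(event.year, event.kind, event.agent.name, event.place.name)
```

Keeping events separate from the object means that new events can be added later without changing the description of the object itself.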
A data model should be designed to meet the user's core information needs, as identified by the International Federation of Library Associations and Institutions (IFLA) [1]:
- Find. Can the user discover the information object?
- Identify. Does the description provide enough information so that the user knows what has been found?
- Select. Can the user select among two or more competing resources that were retrieved with a search?
- Obtain. Can the user obtain the resource?
An event-based data model can meet these core information needs without overloading the object with context specific to one type of scholar or one field of study. A researcher looking for works that demonstrate the influence of impressionism on the artist would discover the sculpture described above, as would the high school art teacher planning a field trip to view the artist's work in situ. Each user finds the information she needs by discovering a different "event" in the lifecycle of the data.
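How two users reach the same object through different events can be illustrated with a small search over event records. This is a minimal sketch under stated assumptions, not RUcore's search implementation: the `Event` fields, the free-text `note` field, the `EVENTS` sample data and the `find_objects` helper are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """A flattened event record (hypothetical layout for this sketch)."""
    object_title: str
    kind: str
    agent: str
    place: str
    year: int
    note: str = ""  # free-text context attached to the event, not to the object


# Sample data based on the sculpture example in the text.
EVENTS = [
    Event("Untitled sculpture", "creation", "sculptor", "Paris", 1963,
          note="impressionist period"),
    Event("Untitled sculpture", "acquisition", "head curator",
          "New York museum garden", 1978, note="permanent display"),
]


def find_objects(events: Iterable[Event], *,
                 place: Optional[str] = None,
                 note: Optional[str] = None) -> List[str]:
    """Return titles of objects having at least one event matching all given filters."""
    hits = set()
    for event in events:
        if place is not None and place.lower() not in event.place.lower():
            continue
        if note is not None and note.lower() not in event.note.lower():
            continue
        hits.add(event.object_title)
    return sorted(hits)


# The researcher matches the creation event; the teacher matches the acquisition event.
researcher_hits = find_objects(EVENTS, note="impressionist")
teacher_hits = find_objects(EVENTS, place="New York")
print(researcher_hits, teacher_hits)
```

Both searches return the same sculpture, but each matches a different event, so neither user's context has to be written into the description of the object itself.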