How does RUcore work?
RUcore includes a sophisticated and flexible information architecture that can support archiving and long-term access to any type of digital information.

RUcore is based on two complementary architectures: the data model, which enables users to understand and find digital information, and the Fedora repository architecture, which supports the preservation, management and use of digital information.

RUcore is a modular repository architecture that can integrate with tools that support the resource creator and the resource user in their collaboration to share information for the public good. A critical tool is the Workflow Management System, a web-based utility that enables Rutgers faculty and collection collaborators to upload digital objects and create metadata for them.

The components of RUcore integrate to form a seamless workflow that takes data from creation to preservation to long-term access for a wide range of potential users.

RUcore - How it Works

The RUcore Data Model

The data model is the heart of the RUcore architecture. The data model provides for the basic description of information so that it can be effectively discovered and used. A good data model should be:
  • Understandable by the information manager and the user. A repository can hold terabytes or petabytes of data. The data model shows everyone how the information is organized and interrelated for use.
  • Context independent. A good data model will support all contexts: those that information creators bring to the repository, which can be layered on top of the data model, and those of future generations of scholars, which cannot yet be envisioned. The rapid proliferation of digital information is already revolutionizing research. A good data model should be flexible enough to accommodate major transformations to the research and publication process.
  • A representation of "living data." Digital data lives in relationship to other data and may be repurposed many times over its lifecycle. For example, a data set may be created in a physical experiment, published as the appendix to a journal article and reused in a computational simulation, all within the space of a few months. A good data model will track the events and interrelationships of data, to expose not only the data for reuse but also the lifecycle and ecology of the data.
  • Supportive of the management and preservation of data. A data model will support not only the discovery and reuse of information but also its preservation for long-term access.
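
As a rough illustration of the "living data" idea, the following Python sketch records lifecycle events against a digital object and derives its history and its links to other objects. The class names, field names and identifiers are hypothetical and are not taken from the RUcore data model; this is a minimal sketch of an event-based structure, not the actual implementation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Event:
    """One step in the lifecycle of a digital object (hypothetical structure)."""
    kind: str                         # e.g. "created", "published", "reused"
    when: date
    related_to: Optional[str] = None  # identifier of another object, if any


@dataclass
class DigitalObject:
    identifier: str
    title: str
    events: List[Event] = field(default_factory=list)

    def record(self, kind, when, related_to=None):
        """Append a lifecycle event to this object."""
        self.events.append(Event(kind, when, related_to))

    def lifecycle(self):
        """Return event kinds in chronological order."""
        return [e.kind for e in sorted(self.events, key=lambda e: e.when)]

    def related_objects(self):
        """Return the sorted identifiers of every object linked by an event."""
        return sorted({e.related_to for e in self.events if e.related_to})


# A data set that is created, published with an article, then reused.
# Events are recorded out of order on purpose; lifecycle() sorts them by date.
dataset = DigitalObject("obj:1", "Experiment data set")
dataset.record("reused", date(2005, 6, 1), related_to="obj:3")
dataset.record("created", date(2005, 1, 10))
dataset.record("published", date(2005, 3, 15), related_to="obj:2")

print(dataset.lifecycle())
print(dataset.related_objects())
```

Keeping events separate from the description of the object means new kinds of reuse can be recorded later without changing the object itself.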

RUcore Data Model

RUcore tracks the lifecycle and ecology of the data.

A data model should be designed to meet the user's core information needs, as identified by the International Federation of Library Associations and Institutions (IFLA) [1]:
  • Find. Can the user discover the information object?
  • Identify. Does the description provide enough information so that the user knows what has been found?
  • Select. Can the user select among two or more competing resources that were retrieved with a search?
  • Obtain. Can the user obtain the resource?
An event-based data model, such as the data model underlying the metadata standard you select, must enable users to meet these core information needs. In addition, it is helpful to think about users according to the roles they play: creators (those who create information and metadata), viewers (those who view the information), evaluators (those who evaluate or critique the information) and repurposers (those who modify the information to create a derivative or revised information object).

Fedora Repository Architecture

The repository architecture is Fedora [2], an open-source repository application developed by Cornell University and the University of Virginia through a Mellon grant. Fedora (Flexible, Extensible Digital Object Repository Architecture) is supported by a community of developers, including the Rutgers University Libraries, Tufts University, OhioLink, Cornell and the University of Virginia.

At its core, Fedora is a modular, extensible repository architecture that can store, manage, and deliver any type of digital information. Fedora makes no assumptions about the type of digital information it contains. Fedora can support pre-prints, books, journals, data sets and multimedia: anything that a participant considers useful information.

Fedora consists of two core modular service suites, exposed as web services and described in Web Services Description Language (WSDL):
  • Fedora Management Service (API-M): open interface for managing the repository, including creating, modifying, and deleting digital objects or components within digital objects.
  • Fedora Access Service (API-A): open interface for accessing digital objects and the behaviors (services) associated with them.
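
The split between the two service suites can be sketched in Python as two interfaces over one shared store. This is an in-memory illustration only: the class and method names are hypothetical and do not reproduce the actual API-M or API-A operations. It simply shows write operations living behind one interface and read operations behind another.

```python
class ObjectStore:
    """In-memory stand-in for the repository's storage layer (hypothetical)."""

    def __init__(self):
        self.objects = {}  # maps object identifier -> content


class ManagementService:
    """Plays the role of API-M: create, modify, and delete digital objects."""

    def __init__(self, store):
        self._store = store

    def create(self, pid, content):
        if pid in self._store.objects:
            raise ValueError(f"object {pid!r} already exists")
        self._store.objects[pid] = content

    def modify(self, pid, content):
        if pid not in self._store.objects:
            raise KeyError(pid)
        self._store.objects[pid] = content

    def delete(self, pid):
        del self._store.objects[pid]


class AccessService:
    """Plays the role of API-A: read-only access to digital objects."""

    def __init__(self, store):
        self._store = store

    def get(self, pid):
        return self._store.objects[pid]

    def list_ids(self):
        return sorted(self._store.objects)


store = ObjectStore()
manage = ManagementService(store)
access = AccessService(store)

manage.create("demo:1", "first draft")
manage.modify("demo:1", "revised draft")
manage.create("demo:2", "data set")
manage.delete("demo:2")

print(access.list_ids())
print(access.get("demo:1"))
```

Because the access interface has no write methods, a public-facing client can be handed it alone, while collection managers work through the management interface.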
Fedora is designed to comply with the emerging information model for digital repositories, the Open Archival Information System (OAIS) Functional Model [3]. OAIS describes the core requirements of a repository that supports access to information for users, as well as the long-term preservation of information:


The OAIS Functional Model
http://nssdc.gsfc.nasa.gov/nost/isoas/

Fedora provides a suite of services that support the preservation of information, particularly the concatenation of all the metadata needed to manage, preserve, safeguard and make accessible the information it holds.

Metadata is "data about data" that is structured for consistency and clarity. Metadata only has meaning and value in relation to the information it describes.

Metadata is used to:
  • Describe information so that a user can find, identify, select and obtain the appropriate resource. (Descriptive metadata)
  • Describe the information source, which is the first generation of information. For a digitized sculpture, for example, the sculpture itself is the source, while the high-resolution digital image in the repository is the digital information object. (Source metadata)
  • Describe the technical characteristics of the digital information object so that the user can display it and so that it can be migrated to a new digital format as better technologies emerge. (Technical metadata)
  • Identify the rights holder, the permissions given to the user to display or reproduce the object and any restrictions or pre-requisites for use of the object, such as enrollment in a course. (Rights metadata)
  • Identify events in the digital lifecycle of information, such as modifications to information or the creation of a new digital copy. (Digital provenance metadata)
Each of these types of metadata provides a wealth of information that must be consistent across all information objects and durably linked to the object itself.

The libraries use METS (Metadata Encoding and Transmission Standard), an open standard maintained by the Library of Congress, to maintain all the metadata necessary to manage, preserve and make available information [4]. METS concatenates all types of metadata with the information object in a standardized XML envelope for easy management and transport. The Rutgers University Libraries provide a web application for any Rutgers information creator to submit digital objects into the repository and to create metadata for their information objects. This tool, the Workflow Management System, is currently in its testing stage and will be available for use by the Rutgers scholarly community in early 2006.
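
To make the idea of a METS envelope concrete, the Python sketch below builds a skeletal METS document with one section for each metadata type listed earlier. The element names (`mets`, `dmdSec`, `amdSec`, `techMD`, `rightsMD`, `sourceMD`, `digiprovMD`, `mdWrap`, `xmlData`) and the namespace come from the METS schema. The `note` element, the identifiers and the sample values are placeholders invented for illustration. The result is not a complete or schema-valid METS record (it omits the file section and the required structural map), and it is not the output of the Workflow Management System.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def add_wrapped(parent, tag, section_id, text):
    """Add one metadata section holding a placeholder note (not real metadata)."""
    section = ET.SubElement(parent, q(tag), {"ID": section_id})
    wrap = ET.SubElement(section, q("mdWrap"), {"MDTYPE": "OTHER"})
    data = ET.SubElement(wrap, q("xmlData"))
    ET.SubElement(data, "note").text = text  # hypothetical element, illustration only
    return section


def build_envelope(object_id, descriptive, technical, rights, source, provenance):
    """Gather the five metadata types into a single METS root element."""
    root = ET.Element(q("mets"), {"OBJID": object_id})
    add_wrapped(root, "dmdSec", "DMD1", descriptive)
    # Administrative metadata groups the remaining four types.
    amd = ET.SubElement(root, q("amdSec"), {"ID": "AMD1"})
    add_wrapped(amd, "techMD", "TECH1", technical)
    add_wrapped(amd, "rightsMD", "RIGHTS1", rights)
    add_wrapped(amd, "sourceMD", "SOURCE1", source)
    add_wrapped(amd, "digiprovMD", "PROV1", provenance)
    return root


envelope = build_envelope(
    "demo:1",
    descriptive="Photograph of a bronze sculpture",
    technical="TIFF image, high resolution",
    rights="Display permitted for enrolled students",
    source="Original bronze sculpture",
    provenance="Digital copy created from the master image",
)
print(ET.tostring(envelope, encoding="unicode"))
```

In a real record each `xmlData` element would carry metadata in an established schema rather than a free-text note.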

The underlying data model and metadata architecture allow faculty to customize metadata to support their context of use, while at the same time conforming to the context-independent data model that will support other researchers who may discover and use the information in a very different area of study.

Fedora supports sophisticated access to information through the creation of "intelligent objects." The information object is linked to behaviors (disseminators), which launch access procedures whenever the digital object is retrieved. Examples of disseminators include:
  • Automatically displaying front, back and side images for three-dimensional objects.
  • Adding a watermark whenever a digital image is printed or downloaded to a hard drive.
  • Synchronizing a video lecture and its written transcript to display side by side.
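
The notion of an object linked to behaviors that run on retrieval can be sketched in Python as follows. This is a conceptual illustration, not Fedora's disseminator mechanism: the class, function names and sample values are hypothetical, and for simplicity the watermark is applied at retrieval rather than at print or download time.

```python
def add_watermark(view):
    """Return a copy of the view stamped with a watermark label."""
    return {**view, "watermark": "Rutgers University Libraries"}


def pair_with_transcript(view):
    """Return a copy of the view laid out beside its transcript, if one exists."""
    if "transcript" in view:
        return {**view, "layout": "side-by-side"}
    return view


class IntelligentObject:
    """A digital object bundled with the behaviors to run when it is retrieved."""

    def __init__(self, pid, content, disseminators=()):
        self.pid = pid
        self.content = dict(content)
        self.disseminators = list(disseminators)

    def retrieve(self):
        """Build a view of the object and pass it through each behavior in order."""
        view = {"pid": self.pid, **self.content}
        for behavior in self.disseminators:
            view = behavior(view)
        return view


lecture = IntelligentObject(
    "demo:lecture",
    {"video": "lecture.mp4", "transcript": "lecture.txt"},
    disseminators=[pair_with_transcript, add_watermark],
)
view = lecture.retrieve()
print(view)
```

Each behavior returns a new view rather than changing the stored content, so the archived object stays untouched however it is presented.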

In a future version of the Fedora architecture, the digital collection owner (a faculty member or publisher at Rutgers) will be able to select and assign disseminators to different types of objects using a web form, test the resulting displays and make changes to disseminators. Every aspect of information management through Fedora will ultimately be personalized, using simple web input forms and menus.

The Rutgers University Libraries and the RUcore repository platform offer a number of tools and standards to support digital publication and preservation of information, including:
  • Workflow Management System (WMS). The WMS is a web tool for adding digital objects and metadata to the repository. The WMS supports multiple strategies for adding objects and metadata, including adding an entire digital object collection and creating metadata for each object, or migrating metadata from the collection owner's database into the WMS. The WMS supports a full METS implementation, with metadata to describe and preserve the digital object. The WMS also supports customized templates to create metadata specific to the collection owner's context, and the export of metadata in Dublin Core for sharing with other repositories, such as arXiv.
  • Open Journal System. An e-journal platform that supports peer review, publication and the archiving of each article in RUcore. It is currently being tested with several locally published scholarly journals.
  • OAI-PMH Gateway. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a widely adopted protocol that uses HTTP requests and Dublin Core metadata to share metadata and links to digital objects across repositories. Support for OAI-PMH is often mandated by federal grants for sharing research results. RUcore includes an OAI-PMH gateway, as well as the ability to automatically limit record transfer to records belonging to a single owning organization or single collection, so that RUcore participants can share their data with selected collaborators automatically, on demand or according to a set schedule.
  • Search and Retrieval Utility. RUcore offers automatic OCR of documents for full-text searching, as well as both XML and fielded record searches, using the open source search engine Amberfish. Participants make their collections available via the repository interface, to provide a "one stop shop" for digital primary source material at Rutgers. However, participants will also be able to add a script to their own website to support search and retrieval of their own collections from the repository. The fact that RUcore is providing the search utility and digital resources will be transparent to visitors at the participant's website.
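
For the OAI-PMH gateway described in the list above, a harvest request is an ordinary HTTP URL. The Python sketch below builds a `ListRecords` request for Dublin Core records, optionally limited to one set and a date range. The verb and the `metadataPrefix`, `set`, `from` and `until` parameters are defined by the OAI-PMH protocol. The base URL and the set name are placeholders, not the RUcore gateway's actual address or set names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     from_date=None, until_date=None):
    """Build an OAI-PMH ListRecords request URL.

    Only the arguments that are supplied are added to the query string.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec       # restrict the harvest to one set
    if from_date:
        params["from"] = from_date     # earliest datestamp to include
    if until_date:
        params["until"] = until_date   # latest datestamp to include
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and set name, for illustration only.
url = list_records_url(
    "https://repository.example.edu/oai",
    set_spec="demo-collection",
    from_date="2005-01-01",
)
print(url)

# Parse the query string back out to confirm what the request asks for.
query = parse_qs(urlsplit(url).query)
print(query["verb"], query["set"])
```

Limiting a harvest by set is one way a gateway can restrict record transfer to a single collection, and a harvester can run the same request on a schedule with an updated `from` date.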

Footnotes

Version 7.3.1
Rutgers University Libraries - Copyright ©2014