To live up to the trust placed in it by the faculty who deposit their work, and by the researchers who use the collections, a repository must meet four critical tests:
A digital resource may be "born digital," such as a database created in Microsoft Access. Unfortunately, the required application may change in later versions, so that over time the database can no longer be opened and read. One or more "application-independent," or canonical, representations must be created to ensure that the born-digital file remains accessible as technologies change. In the case of data sets, the Rutgers University Libraries are developing canonical XML representations that can be read, searched, interpreted, and used without the database application. CSV (comma-separated values) and plain-text representations will also be created to support the seamless migration of the data from one database management system to another.
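As an illustration of what an application-independent export might look like, the following is a minimal sketch rather than the Libraries' actual workflow. It assumes an in-memory SQLite database as a stand-in for the Access file, and the function name `export_table`, the table `samples`, and the element names `dataset`, `row`, and `field` are hypothetical choices made for this example, not a published schema. "Canonical" is used here in the sense of the paragraph above (application-independent), not the W3C Canonical XML serialization.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET


def export_table(conn, table):
    """Return (xml_text, csv_text) for one table of an open database."""
    # Table names cannot be bound as SQL parameters, so validate before quoting.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()

    # XML: one <row> per record, one <field> per column (hypothetical layout).
    root = ET.Element("dataset", {"table": table})
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            field_el = ET.SubElement(row_el, "field", {"name": name})
            field_el.text = "" if value is None else str(value)
    xml_text = ET.tostring(root, encoding="unicode")

    # CSV: a header line followed by the records.
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return xml_text, buffer.getvalue()


# Stand-in data; a real export would read from the deposited database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER, site TEXT, ph REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [(1, "Raritan", 7.2), (2, "Passaic", 6.8)],
)
xml_text, csv_text = export_table(conn, "samples")
print(csv_text)
```

Both outputs are plain text, so they can be read without the originating database application and loaded into a different database management system.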
Modifications to digital information are often useful and important; examples include the correction of a significant error in an e-journal article and the editing of raw data into a data set that can be manipulated by an application such as MATLAB. A trusted repository will maintain the canonical, or "digitally true," representation of the information in perpetuity, as well as an audit trail that documents any changes to the data. The display copy provided to users may be the latest version, with all authorized changes, but the scholar will have the opportunity to view the entire lifecycle, or "digital audit trail," of the resource: the original resource and all iterations and modifications to it. Since scholars often build on the work of others, particularly through citations, it is critical to maintain the integrity of the referential chain, even while providing the most accurate and complete version for active use.
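One way to picture such an audit trail is as an append-only list of versions, each stored with a checksum. The sketch below is only an assumption about how this could be modeled, not a description of the repository's implementation; the class names `Version` and `Resource`, their fields, and the sample text are hypothetical. A production system would also record a timestamp and the authorization for each change.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    number: int
    content: bytes
    sha256: str  # digest recorded when the version was deposited
    editor: str
    note: str


@dataclass
class Resource:
    versions: list = field(default_factory=list)

    def add_version(self, content, editor, note):
        """Append a new version; earlier versions are never overwritten."""
        digest = hashlib.sha256(content).hexdigest()
        version = Version(len(self.versions) + 1, content, digest, editor, note)
        self.versions.append(version)
        return version

    @property
    def original(self):
        """The resource exactly as first deposited."""
        return self.versions[0]

    @property
    def display_copy(self):
        """The latest version, with all authorized changes."""
        return self.versions[-1]

    def audit_trail(self):
        """Summary of every change, oldest first."""
        return [(v.number, v.editor, v.note, v.sha256) for v in self.versions]

    def verify(self):
        """Check that every stored version still matches its recorded digest."""
        return all(
            hashlib.sha256(v.content).hexdigest() == v.sha256
            for v in self.versions
        )


article = Resource()
article.add_version(b"Yield was 12%.", "author", "original deposit")
article.add_version(b"Yield was 21%.", "author", "corrected transposed digits")
for entry in article.audit_trail():
    print(entry)
```

Because a citation can record the version number it refers to, a later correction does not break the referential chain: the cited version stays retrievable alongside the display copy.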