Electronic records management is as critical as, if not more critical than, traditional paper-based records management. While general records management issues are relevant regardless of the form of the record, electronic record keeping has some special considerations, especially for long-term records. The life expectancy of an electronic file is relatively short, compared with the potential life span of the record itself. The media on which the record is kept and even the format it is kept in may be good for only a few years at best, whereas the lifetime of the record may be forever (its retention may be permanent). This paper concentrates on long-term records, defined as those records that have retentions from 10 years to permanent.
Whether it is a personal computer (PC) or an information management system, the system where the electronic record is maintained should be able to manage and account for that record from the time it is created. In most cases, this means taking the file into accountability from the time it becomes a record, that is, when the file is declared to be a record and the information is final. In addition, interim versions may be records if they document company activities or contain information of value, such as historical information or policy decisions. Thus, an electronic record could be the final version that is produced electronically or the electronic file of a document, even if it is published in hard copy.
Taking the file into accountability means identifying the record file, preserving the file for later use (storage and retrieval), maintaining the integrity of the file, ensuring that the file remains accessible, and considering how long the file must be retained. In short, the use of the file during its full life cycle must be accounted and planned for, taking into consideration the following issues:
In addition, a number of other issues must be addressed when the record file is created. First, the file should be retrievable without loss or change. (See discussion on Storage and Retrieval.) Second, access to the file must be controlled such that it cannot be revised (written) or deleted without appropriate approvals. (See Security.) Third, the file must be maintained intact in a form that can be reproduced later, and decisions must be made regarding the media it will be preserved on, the format of the file itself, and any hardware or software applications required for its use. (See discussion on Migration.)
Sources of information for best business practices include:
For further help regarding the creation of electronic records, contact Records Management and Document Control at LMES or Records Management at LMER.
The system should be able to locate all electronic and related hard-copy records (e.g., on paper or microfilm) associated with that system. The best way to do this is to create an index for the record files (electronic and hard-copy) using a PC, client-server, or mainframe software application. The index should contain the metadata for all of the documents and the relevant fields defined by users: for example, title, document number, process number, program, revision number, page number, date, responsible person, document type, classification or group, and keywords.
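As a rough illustration of such an index, the sketch below models one index entry and a simple lookup in Python. The class name IndexEntry, the find helper, the medium and location fields, and all sample values are hypothetical; they only show how the metadata fields listed above might be held for both electronic and hard-copy records, and they do not describe any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """One row of a records index; field names follow the examples in the text."""
    title: str
    document_number: str
    revision_number: str
    date: str                      # ISO date string, e.g. "1996-05-01"
    responsible_person: str
    document_type: str
    medium: str                    # hypothetical field: "electronic", "paper", "microfilm"
    location: str                  # hypothetical field: file path or shelf location
    keywords: list[str] = field(default_factory=list)


def find(index: list[IndexEntry], **criteria: str) -> list[IndexEntry]:
    """Return the entries whose fields equal every given criterion."""
    return [
        entry for entry in index
        if all(getattr(entry, name) == value for name, value in criteria.items())
    ]


# Hypothetical sample data: one electronic record and its hard-copy counterpart.
index = [
    IndexEntry("Sample procedure", "DOC-001", "2", "1996-05-01", "J. Doe",
               "procedure", "electronic", "/records/doc-001.pdf", ["sample"]),
    IndexEntry("Sample procedure", "DOC-001", "2", "1996-05-01", "J. Doe",
               "procedure", "paper", "Vault A, shelf 3", ["sample"]),
]

matches = find(index, document_number="DOC-001")
print([m.medium for m in matches])
```

Searching on the document number returns both the electronic file and the paper copy, which is the point of indexing the two together.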
Full-text indexes provide the most complete form of indexing at this time and the greatest amount of control over what goes into the index. These indexes can be built to control the content and the terminology of the index. Control over content is especially important, since it allows the user to input or receive records external to the system where the index resides (e.g., information from the hard-copy records or records from another electronic system). The user can also specify the data elements to be indexed and can implement controls, such as use of a thesaurus or data dictionary, for data elements to ensure that file classification and indexing are consistent. Full-text indexes employ robust databases such as those used by BASIS. [For example, see the Central Publications and Presentations Registry, which was developed using BASISPlus.]
Automatic indexes use software applications to index electronic files on a single system (including scanned, imported, or external files). These applications work in a given structure (e.g., a hard drive on a PC or a directory on a server) by scanning the files and creating an index. If there are hard-copy records associated with the electronic files, automatic indexes may be inappropriate. These indexes are usually dependent on terms or keywords in their dictionaries, although they can also contain some metadata elements derived from the files themselves (such as the date of creation). The dictionaries can be edited by the user to delete common terms like "and" or to add relevant words. The indexes can be updated automatically when new documents are added. These applications provide search capabilities based on the keywords, and some support full-text retrieval. Search tools can often provide a ranked list of documents retrieved through the search facility. (Applications that use automatic indexing include PageKeeper for the PC and ZyIndex for the PC or server.)
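The general idea of an automatic keyword index can be sketched in a few lines of Python. This is a minimal illustration and not how PageKeeper or ZyIndex work internally; the function names, the stop-word list, and the sample documents are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Common terms excluded from the index; the user can edit this set.
STOP_WORDS = {"and", "the", "of", "a", "to", "in", "for"}


def build_index(documents: dict[str, str],
                stop_words: set[str] = STOP_WORDS) -> dict[str, set[str]]:
    """Map each keyword to the names of the documents that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in stop_words:
                index[word].add(name)
    return dict(index)


def search(index: dict[str, set[str]], term: str) -> list[str]:
    """Return the sorted names of documents containing the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical file contents standing in for files scanned from a directory.
documents = {
    "memo.txt": "Retention of budget records and policy",
    "plan.txt": "Migration plan for the budget database",
}
index = build_index(documents)
print(search(index, "budget"))
```

Because the index is built only from the files it scans, a related hard-copy record never appears in the results, which is the limitation noted above.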
Full-text search tools can also be used to locate unindexed electronic records. As with automatic indexes, it is not recommended that these tools be used alone to search records without an index. They may return unneeded files or fail to find all of the relevant records if the search terms are not defined well enough. Also, there is no linking of related documents through grouping or classification. Full-text searching can be useful if the user is unfamiliar with the indexing system, since these tools employ user-defined search terms, Boolean operators, or "fuzzy" logic searches. "Fuzzy" logic searches look for words that match or are similar to the word being searched (e.g., words with a piece missing or misspelled words). An example full-text search tool is the OpenText Index.
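To make the idea of a "fuzzy" search concrete, the following Python sketch uses the standard library's difflib module to find vocabulary words that resemble a misspelled search term. It is only an illustration of similarity matching in general, not a description of how the OpenText Index or any other product performs it; the function name, the vocabulary, and the 0.8 similarity cutoff are assumptions.

```python
import difflib


def fuzzy_terms(query: str, vocabulary: list[str], cutoff: float = 0.8) -> list[str]:
    """Return vocabulary words that match or closely resemble the query."""
    return difflib.get_close_matches(query.lower(), vocabulary, n=5, cutoff=cutoff)


# Hypothetical vocabulary, e.g. the keywords already present in an index.
vocabulary = ["retention", "migration", "disposition", "microfilm"]

# The misspelled term is matched to the closest word in the vocabulary.
print(fuzzy_terms("retension", vocabulary))
```

Lowering the cutoff returns more candidate words at the price of more unneeded hits, which mirrors the trade-off described above for poorly defined search terms.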
Help for using some type of defined indexing terminology can be found using the following resources:
Contact IMS Project Management for more information regarding indexing systems for electronic records.
Administrative guidance on protecting sensitive information may be found in "Definition, Identification, and Protection of Unclassified Sensitive Information in the Computer Applications Arena." At Lockheed Martin, a certification program for all computing resources in the classified and sensitive-unclassified domains is described in CP-214, Application Software Security Certification. The Technical Architecture Specification (TAS) provides an organized collection of information technology guidelines, standards, and preferred deployment strategies.
Information management groups at LMER and LMES that may be approached with questions concerning system security include:
Preservation of content is at the core of maintaining the integrity of the record, since the record is only as good as what it contains. Maintaining content means preserving all of the information contained in the file. For example, if there are links to information in other files, those links or files must also be preserved. Preserving the content may also mean preserving the format, layout, or structure of the file. For example, if there are legal signatures in the file, these must be preserved intact as they appear. If these are legal documents with page references, the exact page layout may have to be preserved. (See discussion on Document Formats.)
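One common way to confirm that a stored file has not changed is to record a checksum when the file becomes a record and compare it later. This technique is not prescribed by this paper; the Python sketch below is offered only as an assumption about how such a check might look, and the function names and sample file are hypothetical. It detects changes to the bytes of the file and says nothing about whether the format will still be readable.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded earlier."""
    return file_digest(path) == recorded_digest


# Demonstration with a temporary file standing in for a record.
with tempfile.TemporaryDirectory() as folder:
    record = Path(folder) / "record.txt"
    record.write_text("final version")
    recorded = file_digest(record)            # stored when the record is declared
    intact = is_unchanged(record, recorded)   # file not yet touched
    record.write_text("final version, edited")
    altered = not is_unchanged(record, recorded)

print(intact, altered)
```

The recorded digest would normally be kept with the index entry for the record, so that it survives separately from the file it describes.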
In some cases, maintaining the integrity of the record file also means preserving its attributes (i.e., origins, chain-of-custody, migration activity, and format). Electronic records taken out of their context may be less reliable than those with their attributes intact. For example, the uniqueness of a particular record may make it necessary to preserve that file's place in time (such as the logbook that recorded the actions at a certain time and place in a laboratory). Or, the history of the record may need to be preserved for tracking purposes, such as the chain of custody (and approval) of a procedure.
Preserving the integrity of the file also means ensuring that the content of the file can be retrieved and read in the future. For short-term retention, this may be as simple as maintaining the information file on the original source. For long-term storage, however, this may mean preserving the hardware and software application that created the file; migrating the file (and its data) to another format; or saving the file in a standards-based format. (See discussion on Migration.)
Administrative guidance on protecting and preserving information may be found in
Preserving the integrity of a record is fairly complex and may be expensive to ensure. The value of the information must be considered as well as the risk of loss. A decision as to what is necessary to protect to ensure integrity should be made during systems design. For assistance with identifying records and their retention schedules, see Records Management and Document Control at LMES and Information Management and/or the Business Applications & Information Integration Sections of the Computing, Information, & Networking Division at LMER. For help with preserving or migrating files, contact IMS Project Management.
Personal computer backups for record copy are the responsibility of the owner. Backups can be made to a number of different media, including diskettes, magnetic tape, or CD-ROM. Server and mainframe backups can be performed automatically or by the system manager. It is recommended that these files be backed up automatically or by the system manager as part of the maintenance of the system.
For Energy Systems and Energy Research assistance, see Records Management and Document Control at LMES and Information Management and/or the Business Applications & Information Integration Sections of the Computing, Information, & Networking Division at LMER.
With electronic records, it will not only be useful, but necessary, to migrate the files from one system or format to another over time to maintain the files and ensure they will remain accessible over their lifetimes. During the lifetime of an electronic record, the actual media on which the record is kept may deteriorate; the hardware platform on which the record was created may become outdated, replaced, or upgraded; or the software application may need to be upgraded or replaced. The individual owner who creates the record is responsible for ensuring that it is retrievable and accessible during its entire retention period. Most importantly, the owner should ensure that the information contained in the record remains intact during its lifetime (see Integrity) by focusing on the record's retention period, access and retrieval, storage, and costs.
The life expectancies of storage media, hardware, and software are relatively short compared to the lifetimes of records with long-term retention periods (up to permanent). For very long-term retention periods, media may deteriorate. Even for retention periods of 10 years or more, during which the media may remain intact, supporting hardware and software may have to be upgraded well before the retention period has expired. Hardware may be upgraded every 2-5 years and software from 6 months to 2 years; for long-term electronic records, that means it will become necessary to migrate the record files to new media or to new platforms or applications at those same (or similar) intervals.
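The arithmetic behind these intervals can be worked through with a short Python sketch. It assumes, for illustration only, that every upgrade forces exactly one migration and that an upgrade falling on the last day of the retention period does not count; the function name is hypothetical.

```python
import math


def migrations_needed(retention_years: float, interval_years: float) -> int:
    """Count the upgrades that fall strictly inside the retention period."""
    if interval_years <= 0:
        raise ValueError("interval_years must be positive")
    return max(0, math.ceil(retention_years / interval_years) - 1)


# A 10-year retention with hardware replaced every 5 years or every 2 years.
print(migrations_needed(10, 5), migrations_needed(10, 2))  # prints "1 4"

# The same record with software replaced every 6 months.
print(migrations_needed(10, 0.5))  # prints "19"
```

Under these assumptions, even the shortest long-term record (10 years) faces between one and four hardware-driven migrations, and a permanent record faces an unbounded number.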
The frequency of access should be taken into account when planning for the migration of the record files. A flat file stored on magnetic tape is appropriate for data with a low retrieval rate. However, other media forms may have to be considered for information with a higher retrieval rate. For example, if the file is accessed daily, it may simply be preserved on magnetic disk or a CD server that allows simultaneous access to multiple users.
Physical storage of the record can be accomplished in a number of ways. The record can be stored logically (in order or in segments in one system) or distributed across several systems to be recreated by the system into the record. Although both records would look similar to the viewer, the storage mechanisms are quite different, and the method would have to be accounted for in the migration plan.
It is also important to determine what is the record: the data itself or the data and the format of the record. For example, the data associated with a monthly budget sheet tends to be more important than the format or appearance of the budget sheet itself. On the other hand, a piece of correspondence mandating a policy change depends on its format or appearance; it loses much of its authority without the signature block and/or signature. In either case, it may be beneficial to archive the file in one format and create a separate "information" version for retrieval. For example, if a record is "active," it may be desirable to use an easily accessible format, such as hypertext markup language (HTML) for the world-wide web (WWW). However, the record copy may need to be preserved in another format (such as Adobe's portable document format or PDF) to preserve the exact appearance of the original.
No single strategy for migration applies to all forms of electronic records. Migration strategies and their associated costs vary for different media, applications, formats, and degrees of computation, display, and retrieval. The two most common migration strategies are to change the media and to change the format:
Change Media - this strategy involves transferring record files from less stable to more stable media. The most common option has been to print electronic files onto paper or microfilm. Computer output to microfilm (COM) and Computer Output to Laser Disk (COLD) technologies enable data to be written directly from the computer to microfilm or laser disk for storage or distribution, eliminating the need for paper. Migrating electronic files by copying them onto other storage media is another option where the information is unencoded and independent of particular hardware and software. This option is most cost-effective in cases where retaining the content is paramount, but display, indexing, and computational characteristics are not critical. Until the time that more robust and cost-effective migration options are available, printing to paper (or film) or preserving flat files on a more permanent media may remain the preferred method of storage by many.
However, changing media may not be feasible for complex systems. Data relationships embedded in databases as well as computational capabilities may be lost. Further, complex graphic displays and indexing are usually lost in paper or flat files. Where it is important to preserve these capabilities, changing format may be more appropriate.
Change Format - this strategy reduces a great number of formats to a smaller number of standard formats that can still encode the complexity of structure and form of the original. That is, it involves changing the original proprietary format of the record to a standards-based format, such as the standard generalized markup language (SGML) text-based or TIFF graphics format. (See Appendix A for a discussion of document formats.) Changing format has the advantage of preserving the display, dissemination, and computational characteristics of the original in a format that does not have to be repeatedly migrated to a new one. For databases, as with the change media strategy, some data relationships and computational capabilities may be lost using the change format strategy.
Currently, at LMES and LMER, there is no single, company-supported approach for migration. Migration must be approached on a case-by-case basis and it must be addressed in the early stages of system development. When migrating files, the following steps should be addressed:
Cost comparisons for migration also need to be conducted by the owner of the information in conjunction with appropriate technological personnel. Routine factors to be considered include storage costs (device procurement, maintenance, and operation), media costs, and actual costs to move the file as well as access costs (document server procurement, maintenance, and operation), software, printing, and delivery. Long retention periods could require multiple migrations, each with its own associated costs. The owner of the record information most likely will be expected to meet the cost of the migrations.
For help with retention periods, the owner can contact Records Management at LMER and Records Management and Document Control at LMES. For help with access and retrieval, storage, and costs related to migration strategies, the owner can consult with IMS Project Management.
A records retention and disposition schedule is the tool for conducting records management in a systematic, rational manner. It is this documented, systematic approach and its execution that provide the legal basis and credibility for destroying records in the general course of doing business. A retention and disposition schedule identifies the following:
DOE and its contractors (including LMER and LMES) maintain records that are specific to the agency and must propose schedules for those records to be approved by NARA. Because LMES operates under a contract with DOE that carefully defines ownership of records, most, if not all, records generated at LMES are the property of DOE. NARA, by law, has the role in the United States to regulate records generated by government agencies, including DOE. NARA includes in its definition of "disposition" the transfer of eligible records to Federal records centers, transfer of permanent records to the National Archives, and disposal of temporary records.
The records schedule provides the authority for the disposition of agency records and identifies records as either temporary or permanent. Temporary records are those approved for disposal, either immediately or after a specified time or event. Permanent records are those having sufficient value to warrant continued preservation by the government as part of the National Archives of the United States.
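The distinction between temporary and permanent records, together with this paper's definition of long-term records (retentions from 10 years to permanent), can be expressed in a small Python sketch. The class name, the series names, and the retention periods below are hypothetical and are not taken from any approved schedule. The sketch covers only time-based retention and does not model disposal triggered by an event.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ScheduleItem:
    series: str                      # hypothetical record series name
    retention_years: Optional[int]   # None means permanent

    @property
    def is_permanent(self) -> bool:
        return self.retention_years is None

    @property
    def is_long_term(self) -> bool:
        """Long-term as defined in this paper: 10 years up to permanent."""
        return self.is_permanent or self.retention_years >= 10

    def disposal_date(self, cutoff: date) -> Optional[date]:
        """Earliest disposal date, or None for permanent records."""
        if self.is_permanent:
            return None
        year = cutoff.year + self.retention_years
        try:
            return cutoff.replace(year=year)
        except ValueError:           # 29 February in a non-leap target year
            return cutoff.replace(year=year, day=28)


# Hypothetical schedule entries for illustration only.
temporary = ScheduleItem("Sample administrative files", 3)
permanent = ScheduleItem("Sample policy decisions", None)

print(temporary.disposal_date(date(1996, 6, 30)), temporary.is_long_term)
print(permanent.disposal_date(date(1996, 6, 30)), permanent.is_long_term)
```

A permanent record never receives a disposal date, so it always falls into the long-term category that the migration planning in this paper is concerned with.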
Through the General Records Schedules (GRS), the Archivist grants approval for disposal of temporary records that are common to all agencies. The GRS describes temporary records that are common among Federal agencies and authorizes their disposal without further approval from NARA. GRS 20, "Electronic Records," authorizes the disposal of some routinely created records such as processing files; most files relating to administrative functions; and the documentation relating to a disposable master file or database. GRS 23, "Records Common to Most Offices," covers electronic records produced as a result of office automation applications. It authorizes the disposal of word-processing documents for which the paper is the record medium; tracking and control files for disposable records; and databases developed on personal computers to support administrative functions.
The DOE Records Schedule (DOERS) establishes retentions for DOE's programmatic records and is approved by NARA; however, some overlap with GRS occurs. The Approved Comprehensive Records Schedule consists of a subset of the DOERS and GRS schedules currently in use at Energy Systems.
Best practices for the establishment of retention and disposal schedules for electronic records can be outlined as noted below:
Sources of information for best business practices include:
For help with retention and disposition schedules, the owner can contact Records Management at LMER and Records Management and Document Control at LMES.
At a minimum, it is suggested that system documentation should include the following items:
Additional useful information which could be part of the documentation includes: screen layouts, report layouts, code books, and thesauri employed.
Although there is no definitive format utilized by external companies for system documentation, in the local LMES environment, there are guidelines for documentation in the Software Management Framework strategy. Also, there is a very structured approach in the Y-12 Series 80, which includes documentation of system requirements, functional system design, computer system design, user acceptance testing, and software release and configuration control. Creation of system documentation should be commensurate with the complexity of the system; for example, a company-wide system will require more in-depth documentation than a standalone PC application. The minimum steps noted above should be addressed for each system as appropriate.
Electronic records must be accessible for as long as they are needed (i.e., for their retention schedule). The user should be able to view and print a file either by using its native format and application or by using another format that can duplicate the original file. In effect, this means preserving the file in its original format (e.g., word processing format) or in some format that duplicates the file. If the file is preserved in its native format, this means that the native application (and, in some cases, the operating system) must be preserved with it, or the file must be migrated to the appropriate format. For very short retention schedules, this may be the most effective means. However, for most records with long retention schedules, it is not.
The other alternative is to preserve the record in a format that reproduces the original document without losing the integrity of the document. In this case, it means using a standard file format. Until recently, ASCII has been the most recognized and accepted format for electronic records. However, ASCII is limited to text and data; it does not provide for image files or files containing images (which cannot be converted to ASCII). Because of these limitations, other standards are being considered for use, including SGML, HTML, and PDF document files and TIFF image files.
Standard generalized markup language (SGML) is a standards-based markup language that preserves the format and structure of a text-based document (with appropriate links to image files or tables). SGML-tagged documents are becoming more widely accepted as record copy because they preserve the integrity of the document intact and are based on an open-systems, nonproprietary format. They are viewable across a variety of platforms, although they do require an SGML-compatible viewer or application. Hypertext markup language (HTML) is a standards-based format (an SGML document type) that is the de facto standard for world-wide web (WWW) browsers. Although not necessarily appropriate for records storage, WWW servers and internet and intranet technologies do provide a widely used method for accessing documents. HTML should, perhaps, be considered in conjunction with other formats for providing access to records.
Portable document format (PDF) files contain all of the formatting language needed to reproduce the document in its entirety with the text, formatting, graphics, and even color intact. This is the strength of the PDF format---the ability to reproduce the document in almost exactly its paper form in a searchable format. Once retrieved, this file is viewable across a variety of platforms. (Although PDF is a proprietary format of Adobe, Adobe provides a free reader that can be used by anyone who has access to the file, regardless of whether that person has access to Acrobat.) Also, PDF is fairly secure as a format; it is not easily edited, and revising a document requires that a new PDF file be generated. Optical Character Recognition (OCR) can be used to convert hard-copy documents to ASCII text or PDF files (with embedded images). Documents saved as TIFF or PDF files will preserve the images associated with scanned pages converted to ASCII files.
Nontextual material (e.g., 3-D models, drawings, videos) must also be considered in an electronic record-keeping system. As with textual files, the file may be preserved in its original format or in some format that duplicates that format. As previously noted, if the file is preserved in its native format, the native application (and, in some cases, the operating system) must be preserved with it. The alternative to maintaining the native format and operating system is to provide a duplicate format. Some of the text-based formats discussed earlier will also work with nontextual material, such as PDF and TIFF.
IGES (Initial Graphics Exchange Specification) is the current standard format for 2-D drawings and works well for some 3-D information (e.g., surfaces and wire frame drawings). VRML is the standard for Web-based 3-D models. However, for 3-D CAD drawings, users should refer to ISO 10303, which is currently being revised by PDES Development Corporation. ISO 10303 is more commonly known as STEP (STandard for the Exchange of Product Model Data). STEP is the international data description standard that will provide a complete, unambiguous, computer-interpretable definition of the physical and functional characteristics of a product throughout its life cycle.