Digital Preservation Policy
* See glossary at the end of this document *
1.1 Digital preservation issues are critical to many aspects of the work of Hampshire Archives and Local Studies (HALS) at Hampshire Record Office (HRO) and Wessex Film and Sound Archive (WFSA). HALS exists to preserve and provide access to records relating to the history of the county and its inhabitants. Once records are no longer required for legal, administrative, financial or personal purposes, they may be preserved for posterity, to show how people lived and died, what decisions they made and activities they pursued. Part of the Record Office’s holdings is the archive of Hampshire County Council, which shows how the organisation has operated during its 100 year plus life.
1.2 The bulk of the Record Office’s holdings have so far been received in traditional formats, such as paper or parchment. However, an increasing quantity is now being received in electronic format, and access is being provided to the holdings of WFSA in digital format. Digital copies of hard copy originals are also being created in-house for preservation or access reasons, or deposited with the Record Office in place of the original.
1.3 This digital heritage is at risk of being lost to posterity. Contributing factors include the rapid obsolescence of hardware, software and storage media, uncertainties about resources, responsibility and methods for maintenance and preservation, and the lack of supportive legislation.
1.4 Best practice is developing in the area of digital preservation, so this policy can only provide recommendations based on current thinking, and will be reviewed regularly in the light of new research.
2.1 To address the risk of losing digital materials, HALS has developed a digital preservation policy and strategy. This policy outlines the Record Office’s approach to digital preservation, whilst the aim of the strategy is to describe this approach in more detail, including technical specifications where appropriate.
2.2 Together, these will ensure the preservation of digital material held at HRO, whether received from the Records Management Service as part of Hampshire County Council’s electronic corporate memory, deposited or donated by businesses, organisations or individuals, or created as digital surrogates. They will also ensure that this material can be made available to colleagues and customers, both internal and external, now and in the future.
3. Sources and examples of digitised material
3.1 Received internally as Hampshire County Council e-archive, for example databases, CAD (computer-aided design) files, the outputs from Hantsfile, the corporate electronic document and records management system.
3.2 Received externally, as e-accessions, from many and varied sources, including other local authorities, official organisations, groups and individuals, e.g. digital photographs of listed buildings, parish plans, digital surrogates.
3.3 Created in-house as accessible surrogates for use in and beyond HRO, for example for Discovery Centres, exhibitions (latter may be short-term initially) and specific digitisation projects.
3.4 Audio-Visual material held in WFSA and also received from sources as above, for example electronic social care records which might include video or audio recordings of a child’s natural parents, or VR (virtual reality) footage from public enquiries.
4.1 Increasing storage capacity and decreasing costs do not mean that we should keep everything. Storage costs, though reducing, are not insignificant, and it is also important to be able to retrieve appropriate information in a timely manner from a lean and efficient archive. Information may be required to answer a Freedom of Information enquiry or an historical enquiry, and an e-archive should be able to retrieve the information required to meet its legal requirements and as part of its customer service.
4.2 Long-term storage of electronic records covers a variety of methods and media, including online*, near line* and off line* for both magnetic and optical media. The ideal digital preservation programme should ensure that three copies of a born-digital item, and two copies of a digital surrogate are made available on different storage media in different locations.
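As an illustration only, the copy rule in 4.2 could be checked along the following lines. This is a minimal sketch, not part of HALS procedure: the names Copy, REQUIRED_COPIES and meets_copy_rule are hypothetical, and reading "different storage media in different locations" as "at least two distinct media and at least two distinct locations" is an assumption.

```python
from dataclasses import dataclass

# Minimum number of copies from paragraph 4.2; the keys are hypothetical labels.
REQUIRED_COPIES = {"born-digital": 3, "surrogate": 2}


@dataclass(frozen=True)
class Copy:
    medium: str    # e.g. "server", "DLT", "CD-R"
    location: str  # e.g. "preservation server", "strongroom"


def meets_copy_rule(kind, copies, min_distinct=2):
    """Check the copy count, plus spread across media and locations.

    min_distinct is an assumption: the policy does not say how many
    distinct media or locations are required.
    """
    needed = REQUIRED_COPIES[kind]
    media = {c.medium for c in copies}
    locations = {c.location for c in copies}
    return (
        len(copies) >= needed
        and len(media) >= min_distinct
        and len(locations) >= min_distinct
    )
```

For example, a born-digital item held on the preservation server, on DLT and on CD-R would pass this check, while two surrogate copies held only in the strongroom would not, because they share a single location.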
4.3 The original born-digital item, and its bit-stream*, should be stored on a separate server with two hard drives. There should be an air gap* between this server and the rest of a network. Only one hard drive should ever be connected to another, separate hard drive on a server which can be isolated from the network during the time this connection is live.
4.4 Second copies of born-digital items and digital surrogates should be stored on Digital Linear Tape (DLT). Access copies of born-digital and digital surrogates should be available on the network, or CD-R as appropriate.
4.5 The CD-Rs should be stored, along with DLTs, in the HRO strongroom.
5. Preservation and Migration
5.1 CD-R and DVD-R/DVD+R have life expectancies of 5 to 10 years. They will need checking regularly for outward signs of deterioration, and data will also need checking regularly using checksums* to detect signs of corruption and deterioration. The checksums also serve a security purpose, helping to ensure, and demonstrate, that data hasn’t been tampered with.
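The checksum checking described in 5.1 can be sketched as follows. This is an illustrative example under stated assumptions, not a description of HALS's actual tooling: the choice of SHA-256, the function names sha256_of and verify, and the idea of a manifest mapping file paths to digests recorded at accession are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest):
    """Compare each file against its recorded digest.

    manifest maps file paths to the digest recorded when the item was
    accessioned (a hypothetical structure). Returns the paths that are
    missing or whose content no longer matches.
    """
    failures = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != recorded:
            failures.append(path)
    return failures
```

Running verify on a regular schedule and keeping the results gives a record that each item still matches the digest taken when it was received; an empty list means nothing has changed.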
5.2 Born-digital and digital surrogates will need to be migrated to new storage media and accessible versions of software, typically at five year intervals.
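A migration schedule based on the five year interval in 5.2 could be tracked with something as simple as the sketch below. The names MIGRATION_INTERVAL_YEARS, next_migration_due and is_due are hypothetical, and treating the interval as exactly five calendar years is an assumption, since the policy says only "typically".

```python
from datetime import date

# "typically at five year intervals" (paragraph 5.2); exact value is an assumption.
MIGRATION_INTERVAL_YEARS = 5


def next_migration_due(last_migrated):
    """Return the date five years after the last migration."""
    target_year = last_migrated.year + MIGRATION_INTERVAL_YEARS
    try:
        return last_migrated.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return last_migrated.replace(year=target_year, day=28)


def is_due(last_migrated, today):
    """True once the migration interval has elapsed."""
    return today >= next_migration_due(last_migrated)
```

For instance, an item last migrated on 1 August 2010 would fall due on 1 August 2015.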
5.3 Costs of migration can be minimized by adhering to standards that promote open systems and interoperability of data, as well as careful selection of the most useful records to preserve, but especially by being involved as early as possible in advising projects which will result in output in electronic format.
5.4 The migration process is straightforward, and could be automated in the near future. However, at present, it should be done manually, as HALS does not have the technology necessary to automate the procedure.
5.5 The secure server approach described above will help to preserve the original born-digital items and bit-streams. However, a strategy for checking and migrating these files will need to be developed, in discussion with The National Archives and other organisations developing practice in digital preservation.
6. Access and Use
6.1 For images, HALS’s present policy recommends TIFFs (original version for master copy), high resolution JPEGs (second copy for high quality reproduction) and low resolution JPEGs (for access copies).
6.2 For other documents, Microsoft formats are suggested at present, as this is Hampshire County Council’s preferred supplier, and therefore any upgrade or change could be managed as part of work on other County Council files.
6.3 Adobe Acrobat files can be read at present, too, and this format may be useful for preserving e-publications.
6.4 Examples of formats which are unsuitable for long-term preservation are proprietary software, e.g. family history programs. This is because the information value doesn’t warrant the expenditure required to monitor and migrate a large number of very specific programs for which only a limited number of examples may be held at HRO. Family history programs also provide the means to reproduce the results of research in original sources, rather than the original sources themselves. The results could be rendered in an alternative format, for example Adobe, if necessary.
6.5 At present it is possible to provide online access via CALM for images of documents or born-digital image accessions where a catalogue record exists on CALM. However, on the advice of its suppliers, Axiell Ltd, CALM may not be able to provide access to non-image based documents, e.g. PowerPoint, Excel, etc at present. These could be accessed on CD ROM in the short-term, but an access system should be developed or purchased in the near future.
7. Emergency plan
7.1 We will maintain an emergency plan which will be regularly updated. All appropriate staff will be trained in the actions to be taken in the event of a major disaster. We will take all reasonable measures to ensure that no such disaster occurs.
7.2 Back-ups will be needed for e-archives, wherever stored, and a disaster recovery plan agreed with IT Services, implemented and tested regularly.
7.3 Additional security is also provided by having more than one copy in more than one format; if higher level back-ups fail or become corrupted, we can therefore fall back on other copies. These will also, in turn, need to be checked, maintained and migrated as appropriate.
7.4 Separate copies should be stored on separate sites, and HALS copies and IT Service back-ups will cover this.
8. Preservation of Hampshire County Council records
8.1 We will ensure that our own key records are stored and maintained in a manner suitable for long-term preservation. We will also advise other departments of the County Council on appropriate methods for the creation and preservation of their records in non-traditional formats including Internet, intranet, digital media and audio and video tapes. We will also advise on the legal admissibility of scanning and storage of data in digital format.
9. Preservation of records held elsewhere in the county/region
9.1 We will encourage good practice and provide advice to owners of archives on the care of their digital documents, whether or not they form part of the holdings of HALS or WFSA.
9.2 The promotion of effective preservation policies forms a major part of the work of Hampshire Archives Trust and it has provided advice to external bodies for many years on how to keep their archives. This will now be extended to electronic sources through digital preservation.
Agreed by HRO Management Team, August 2005
Revised August 2010
Air gap: space between preservation server and network for increased security
Bit-stream: series of 0s and 1s, usually refers to original file format of a digital item
Checksum: value calculated by an algorithm from the contents of a digital item, used to check that the item hasn’t become corrupted
Near line: data which is stored near line is held in second level storage which is cheaper. Files take slightly longer to retrieve. For example, tape or disk libraries which are connected to a PC would count as near line.
Online: data which is stored online is immediately accessible. It is usually stored on a network or the hard drive of a PC.
Off line: data which is stored off line is not immediately accessible and may take some time to load, maybe from external media.