Full name: SZTAKIMemory - Non Deletable Digital Archive System as Institutional Memory
Start date: 2017.01.01
End date: 2017.12.31
Participants: SZTAKI – Institute for Computer Science and Control, Hungary
Coordinator: SZTAKI / DSD

The aim of the SZTAKIMemory project is to create the infrastructure for the long-term digital preservation of all research and development results and knowledge gathered in the Institute over several decades. With SZTAKIMemory, each MTA SZTAKI researcher will be able to store their research results together with the publications, notes, data, programs, virtual machines, etc. that supported the research, in such a way that all these items can be searched, restored, read and run even after years or decades.

Although the current system can store and restore only internal data, SZTAKIMemory could be a strong basis for putting Science 2.0 principles into practice. Because a research topic and its results are stored in the same place, they could later be shared with the wider scientific community. And because the data underlying a result is available alongside the result itself, other researchers could reproduce it.

 

SZTAKIMemory is capable of storing data such as the following.

Entities of the collection scope

External objects

Patents: Electronic versions of own and external patent descriptions.

Works of art: Works of digital art created in association with MTA SZTAKI.

Media publications under the name of MTA SZTAKI (referring to MTA SZTAKI researchers, employees, projects, etc.): Digital multimedia documents in full-text version, full audio-video recordings.

Internal (in-house) objects

Professional profiles of MTA SZTAKI researchers: CVs and other certificates of professional skills.

Personal research documents, collections of media publications: Digital copies of books, articles and other information from the personal collections of present and former MTA SZTAKI researchers.

PR objects: MTA SZTAKI PR (multimedia) documents and their main versions.

Datasets: Datasets created in MTA SZTAKI, in database format, provided their size allows them to be stored.

Software: Software systems and their main components produced in MTA SZTAKI; the source code and runnable versions (e.g. virtual machines) of their major and final releases.

Hardware specifications: Technical specifications and the complete production documentation of hardware tools produced in MTA SZTAKI.

Project documentation: End results of all projects of the Institute, including the final documentation, software source code, multimedia files, etc.

Relations and annotations between digital objects, defined by the user: Relations and/or annotations that a given user defines between stored digital objects are stored in a personalized way. Annotations can be textual or pictorial, typed or handwritten.

Information and datasets regarding the internal operations of SZTAKI: Periodic dumps of the current full-text intraweb and the major internal databases (e.g. inventory) to ensure long-term preservation.

Infrastructure specifications and traces: Specifications and traces of MTA SZTAKI infrastructure (electricity, water, gas, sewage, IT network, security, fire and entry-exit systems, etc.).

Building, equipment and furniture descriptions: Technical information, including architectural and engineering plans and drawings.


Long-term preservation is ensured by several strategic and technological solutions:

  1. Standards-based metadata handling. In implementing SZTAKIMemory, we applied semantic web technologies and created a solution that complies with the OAIS Reference Model, the international standard for the long-term preservation of digital information (Reference Model for an Open Archival Information System (OAIS), ISO 14721:2012). This enables us to transfer SZTAKIMemory data to other systems that support ISO 14721:2012, so our solution is not tied to one specific implementation.

  2. File-system-based storage. Data and their metadata are all stored in the file system. Metadata is stored in textual form (XML/JSON), and the index needed for online access and search can be (re)generated from the original files whenever necessary. SZTAKIMemory therefore does not rely on a traditional database management system, whose file format can change from time to time and whose supporting software also needs to be maintained. Instead, all descriptive data are both machine- and human-readable, which makes it possible to preserve metadata even in printed form, and to restore the original metadata from the printed version.

  3. Redundant data storage. Data that are accessible online need to be stored simultaneously on multiple servers at geographically separate locations (e.g. SZTAKI premises or external cloud providers).

  4. Archiving. Data also need to be saved from time to time on long-lasting media, even if these are slow to read (Blu-ray, tape, or even paper printouts). Their readability and restorability need to be checked at regular intervals, and the archives should be copied to newer types of media as technology develops.

  5. Device-independent access. SZTAKI employees can access SZTAKIMemory with their SZTAKI credentials from a web browser, mobile phone or tablet. By default, they have access to their personal and department data, as well as to the global figures of the Institute. Thanks to the fine-grained authorisation system, such data can also be shared with other departments, groups or the Institute as a whole.
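
The file-system-based storage (point 2) and the regular readability checks (point 4) can be illustrated with a minimal Python sketch. It is not the SZTAKIMemory implementation. It assumes a hypothetical layout in which every stored file has a JSON sidecar named `<file>.meta.json`, and the `keywords` and `sha256` fields, like all function names, are invented for the example. The search index is rebuilt from the sidecar files alone, and a separate pass reports files whose content no longer matches the recorded checksum.

```python
import hashlib
import json
from pathlib import Path

SIDECAR_SUFFIX = ".meta.json"  # hypothetical naming convention


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def described_file(sidecar: Path) -> Path:
    """Map 'report.pdf.meta.json' to the file it describes, 'report.pdf'."""
    return sidecar.with_name(sidecar.name[: -len(SIDECAR_SUFFIX)])


def rebuild_index(archive_root: Path) -> dict[str, list[str]]:
    """Rebuild a keyword -> relative-path index using only the sidecar files."""
    index: dict[str, list[str]] = {}
    for sidecar in sorted(archive_root.rglob("*" + SIDECAR_SUFFIX)):
        metadata = json.loads(sidecar.read_text(encoding="utf-8"))
        relative = described_file(sidecar).relative_to(archive_root).as_posix()
        for keyword in metadata.get("keywords", []):
            index.setdefault(keyword.lower(), []).append(relative)
    return index


def find_damaged(archive_root: Path) -> list[str]:
    """List files that are missing or whose checksum differs from the sidecar."""
    damaged = []
    for sidecar in sorted(archive_root.rglob("*" + SIDECAR_SUFFIX)):
        metadata = json.loads(sidecar.read_text(encoding="utf-8"))
        data_file = described_file(sidecar)
        if not data_file.is_file() or sha256_of(data_file) != metadata.get("sha256"):
            damaged.append(data_file.relative_to(archive_root).as_posix())
    return damaged
```

Because the index is derived data, it can be deleted and regenerated at any time with `rebuild_index(Path("/path/to/archive"))`, while `find_damaged` could be run periodically on each redundant copy or restored archive medium.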