13th European Conference on eGovernment – ECEG 2013 1 | Page 651

Flavio Costa, Jean‐Yves Le Meur and Tim Smith
2.1 Standard document repository features useful for e‐Government
Invenio was designed for document repositories. Many of the needs of document repositories are shared by e‐Government applications and e‐services, including long term preservation, ergonomic and quick search, flexible and powerful classification and taxonomy of documents, social and cooperative tools, interoperability, scalability, and high performance with large numbers of documents and users.
The search and performance capabilities of Invenio apply to the full content stored in the system, meaning both the metadata and the text of up to millions of documents, through full text indexing of all items. At the user's choice, a search can be performed on selected metadata (title, abstract, date, etc.), on the text contained in the documents, or on both. Because all items are indexed, Invenio is designed to return results near‐instantaneously.
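The choice of search scope described above can be illustrated with a minimal sketch. It is a toy inverted index, not Invenio's indexing code: the class `TinyIndex`, its methods and the sample records are all hypothetical names introduced only for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index (hypothetical) with separate postings per scope."""

    def __init__(self):
        # scope name -> token -> set of record ids
        self.postings = {
            "metadata": defaultdict(set),
            "fulltext": defaultdict(set),
        }

    def add(self, record_id, metadata, fulltext):
        """Index every metadata value and the document text of one record."""
        for value in metadata.values():
            for token in tokenize(value):
                self.postings["metadata"][token].add(record_id)
        for token in tokenize(fulltext):
            self.postings["fulltext"][token].add(record_id)

    def search(self, query, scope="both"):
        """Return sorted ids of records matching every query token.

        scope is "metadata", "fulltext" or "both"; with "both", a token
        may be found in either part of the record.
        """
        scopes = ["metadata", "fulltext"] if scope == "both" else [scope]
        result = None
        for token in tokenize(query):
            hits = set()
            for name in scopes:
                hits |= self.postings[name].get(token, set())
            result = hits if result is None else result & hits
        return sorted(result or [])


index = TinyIndex()
index.add(1, {"title": "Long term preservation of records", "date": "2012"},
          "PDF/A is used for archiving.")
index.add(2, {"title": "Metadata harvesting"},
          "Preservation relies on open standards.")

print(index.search("preservation", scope="metadata"))
print(index.search("preservation", scope="fulltext"))
print(index.search("preservation", scope="both"))
```

Keeping one set of postings per scope means the same query can be answered against metadata, text or both without re-indexing, which is the behaviour the paragraph attributes to the user-facing search.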
Invenio provides functions to automatically convert and standardise file formats at submission (upload) time. These and other features are already available in Invenio, a very advanced document repository software (Andro, Assalin, Maisonneuve 2012).
2.2 Long term preservation
Invenio assures long term preservation by relying on advanced and proven concepts and practices. The target format for archived documents is PDF/A‐1 (Portable Document Format, based on PDF 1.4), the version of the PDF format conceived for long term archiving, in its most constraining but also most conservative sub‐version.
Although documents are stored in a very simple and direct way in the file system, a dedicated module (BibDocFile) provides the more complex functions associated with fulltext files. In this way Invenio can group the different documents associated with the same record (including the different versions that may be needed during the lifecycle of a document) as well as their different formats, and can attach various information to documents, such as the type of document, a description, comments, etc.
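The grouping described above (one record, several named documents, each with versions and formats plus descriptive information) can be sketched as a small data model. This is an illustrative assumption, not the BibDocFile API: the classes `StoredFile` and `RecordFiles`, their fields and the file paths are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    """One physical file on disk (hypothetical model)."""
    path: str
    version: int
    fmt: str
    doctype: str = "Main"
    description: str = ""
    comment: str = ""


@dataclass
class RecordFiles:
    """All files attached to one record, grouped by document name."""
    record_id: int
    files: dict = field(default_factory=dict)  # docname -> list of StoredFile

    def add(self, docname, fmt, path, new_version=False, **info):
        """Attach a file; reuse the latest version unless new_version is set."""
        existing = self.files.setdefault(docname, [])
        latest = max((f.version for f in existing), default=0)
        version = latest + 1 if (new_version or latest == 0) else latest
        stored = StoredFile(path=path, version=version, fmt=fmt, **info)
        existing.append(stored)
        return stored

    def latest(self, docname):
        """Return every format of the most recent version of a document."""
        existing = self.files.get(docname, [])
        top = max((f.version for f in existing), default=0)
        return [f for f in existing if f.version == top]


rf = RecordFiles(record_id=42)
rf.add("report", "pdf", "/data/42/report_v1.pdf", description="Annual report")
rf.add("report", "docx", "/data/42/report_v1.docx")
rf.add("report", "pdf", "/data/42/report_v2.pdf", new_version=True,
       comment="Corrected figures")

for f in rf.latest("report"):
    print(f.version, f.fmt, f.path)
```

The files themselves stay as plain paths in the file system; only the grouping, versioning and annotations live in the model, which mirrors the division of labour the paragraph describes.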
2.2.1 Standards for long term preservation
The standardization of the metadata is probably more important than the file format when aiming at long term preservation. Using a standard that is as widely recognized as possible provides a practical advantage: wide use favours a longer life cycle for the standard, and hence reduces the risk that a system becomes obsolete because the standard it is based on reaches end of life. Wide use also favours interoperability (which in turn promotes a longer life for the standard and for the systems based on it). Invenio uses MARC 21 (MARC 21 2012) as its metadata standard. Although many more recent standards exist, MARC is widely used and rates better than the best alternative standards in several respects, typically the granularity and consistency of metadata and extensibility, all of which are relevant for data preservation. Hence MARC 21 is both a conservative choice and well suited to long term preservation. Also for future proofing, Invenio is in the process of becoming independent of any particular metadata standard, enabling it to adopt any standard.
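The granularity of MARC 21 mentioned above comes from its structure of tagged fields, indicators and coded subfields. The sketch below shows one possible in-memory representation and a mapping to a few coarser, Dublin Core style element names. It is an illustration under assumptions: the `Field` type, the helper functions, the mapping table and the sample values are hypothetical and do not reflect Invenio's internal record format; only the MARC tags used (100 $a, 245 $a, 260 $c, 520 $a) come from the standard.

```python
from typing import NamedTuple


class Field(NamedTuple):
    """One MARC-style field: tag, two indicators, coded subfields."""
    tag: str
    ind1: str
    ind2: str
    subfields: tuple  # tuple of (code, value) pairs


def get_values(record, tag, code):
    """Collect the values of one subfield code across all fields with a tag."""
    return [value
            for f in record if f.tag == tag
            for c, value in f.subfields if c == code]


# Sample record with invented values.
record = [
    Field("100", "1", " ", (("a", "Doe, Jane"),)),
    Field("245", "1", "0", (("a", "An example title"), ("b", "a subtitle"))),
    Field("260", " ", " ", (("c", "2012"),)),
    Field("520", " ", " ", (("a", "A short abstract."),)),
]

# Illustrative mapping from coarse element names to (tag, subfield code).
MARC_TO_SIMPLE = {
    "title": ("245", "a"),
    "creator": ("100", "a"),
    "date": ("260", "c"),
    "description": ("520", "a"),
}


def to_simple(record):
    """Project a MARC-style record onto the coarser element names."""
    return {name: get_values(record, tag, code)
            for name, (tag, code) in MARC_TO_SIMPLE.items()}


print(get_values(record, "245", "a"))
print(to_simple(record))
```

Going from the fine-grained representation to a coarser one is a simple projection, whereas the reverse would require information the coarser format does not carry; this is one way to read the claim that granularity matters for preservation.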
2.3 Invenio: Open solution
Invenio deals with any type of document and can be considered compatible even with future document types and formats: the documents referenced by the metadata are stored simply and directly in the file system and linked via pointers, so any document that can be stored in the file system can potentially be handled by Invenio. Invenio’s modular structure also makes it easy to integrate modules performing tasks specific to each document type.
Invenio uses widely recognized document management standards to favour openness to other systems and documents, such as:
• the US Library of Congress standards for bibliographic information description (MARC 21) to store metadata (an option under development will allow metadata to be stored in any other standard)
• OAI‐PMH (Open Archives Initiative Protocol for Metadata Harvesting) as the protocol to exchange records with other systems, which allows Invenio to interoperate with other institutional repositories
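As an illustration of how a harvester addresses an OAI‐PMH repository, the sketch below builds request URLs from the six verbs and the standard arguments defined by the protocol. The helper `oai_request_url` and the base URL `https://repository.example.org/oai` are hypothetical; no request is actually sent.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL (hypothetical helper).

    Keyword arguments map to protocol arguments; a trailing underscore is
    stripped so that the Python keyword `from` can be passed as `from_`.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for key, value in arguments.items():
        if value is not None:
            params[key.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


# Assumed endpoint, for illustration only.
BASE = "https://repository.example.org/oai"

print(oai_request_url(BASE, "Identify"))
print(oai_request_url(BASE, "ListRecords",
                      metadataPrefix="oai_dc", from_="2013-01-01"))
```

The `oai_dc` metadata prefix is used here because every OAI‐PMH repository must support it; a repository may additionally expose richer formats under other prefixes.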