Daniel Gomes
Service Manager
Arquivo.pt celebrated 15 years on November 8, 2022. Daniel Gomes, Manager of Arquivo.pt, retraces the path of this FCCN Unit service and previews some of the new features expected in the future.

The official Arquivo.pt project started in January 2007 at FCCN. However, the initial idea dates back to 2001 and the tumba! project ("we have an alternative search engine!"), developed at the University of Lisbon and supported by FCCN.

In the wake of tumba!, and still in an academic environment, came Tomba, the first prototype of a Portuguese web archive, whose name was derived from an analogy with the Torre do Tombo. Arquivo.pt (or, at the time, Arquivo da Web Portuguesa) would emerge a few years later. These are some of the main milestones of its 15 years of existence.

Arquivo.pt Chronology

The Challenges of the Present

Today, there is a widespread lack of awareness of the value of information published online. Most of the information used to organize our digital societies is published exclusively online, yet it disappears quickly. Most people do not realize that online information, although widely available, is extremely ephemeral.

In this context, one of the main challenges is to find ways to draw the attention of ordinary citizens and decision-makers to the problems that arise when online information is not preserved.

Lack of awareness about the value of preserving information published online

In 2022, Arquivo.pt preserves 28 million websites published since the 1990s. It is estimated that the investment made to build these websites was about 540 billion euros. 240 billion in 2021. 

The 540 billion euro figure is a rough estimate of the investment made to produce the web data preserved by Arquivo.pt, an investment that would otherwise have been lost forever.

The real economic value of preserved web information is complex to estimate. Some information loses value as it becomes obsolete. Other information becomes more valuable over time by providing unique historical perspectives from which trends can be derived. Either way, it is undeniable that digital societies invest heavily in producing online information.

The important question that must be constantly raised is: can societies afford to keep wasting their online information?

Hire, train, and retain Web archivists

A web archive is an information system that collects information online, stores it, and provides access to it, typically through web user interfaces. For this reason, it requires hiring web technology specialists, who are scarce in the job market.

In addition, web archiving is a Big Data activity, so web archives have to compete with the Internet giants (e.g. Google, Facebook, Amazon) to hire some of these professionals.

Since Web technologies are constantly evolving, Web archivists must be knowledgeable about both current and obsolete Web technologies. 

Archiving the web requires experts in cutting-edge technologies with a historian's mindset, and this mindset is not taught in any course. Hiring qualified personnel to develop and operate a web archive is a major challenge. Retaining these specialists for a long period of time is even more difficult. 

The Future: Arquivo.pt as an infrastructure for society

Web archives have the noble mission of preserving historical web documents for future access. 

Scientific researchers have been using web archives since their inception. But as the Internet penetrates all levels of daily life, web archives must assume their role as useful memory infrastructures for all citizens and organizations in order to serve digital societies. 

To this end, it is crucial that web archives offer general-purpose services that make it easy to take advantage of the historical information they preserve. With this in mind, Arquivo.pt has launched innovative services such as SavePageNow, Complete the page, Memorial and Arquivo404. Spread the word about them so that they can be useful to more citizens!
