

Arquivo.pt celebrated its 15th anniversary on November 8, 2022. Daniel Gomes, manager of Arquivo.pt, recounts the journey of this FCCN Unit service and previews some of the new features planned for the future.
The official Arquivo.pt project began in January 2007 at FCCN. However, the initial idea dates back to 2001 and the project known as tumba! (Temos Um Motor de Busca Alternativo, "we have an alternative search engine"), developed at the University of Lisbon and supported by FCCN.
Following tumba!, Tomba also emerged in an academic setting. Its name derives from an analogy with the Torre do Tombo, and it was the first prototype of a Portuguese web archive. Arquivo.pt (or, at the time, Arquivo da Web Portuguesa) would emerge a few years later. These are some of the main milestones of its 15 years of existence.
Arquivo.pt Timeline
- 2007: Launch of the Portuguese Web Archive (AWP) project at FCCN
- 2008: First collection carried out by AWP
- 2009: Publication of recommendations for publishing preservable web content
- 2010: Launch of the first version of the experimental prototype of the search and access service
- 2011: Source code made available as open source
- 2012: Launch of API for accessing Arquivo.pt
- 2013: Decree-Law mandating that FCT-FCCN preserve content available on the national Internet
- 2017: Start of the training program
- 2018: First edition of the Arquivo.pt Award
- 2019: Launch of Arquivo.pt Memorial
- 2020: Arquivo.pt officially preserves the websites of national scientific projects
- 2021: Launch of the largest search service for images archived from the web
- 2022: Release of derived open data; launch of SavePageNow and Arquivo404
The Challenges of the Present
Currently, there is a widespread lack of awareness of the value of information published online. Most of the information used to organize our digital societies is published exclusively online but disappears quickly, and most people are unaware that online information, while widely available, is extremely ephemeral.
In this context, one of the main challenges is finding ways to draw the attention of ordinary citizens and decision-makers to the problems that arise when online information is not preserved.
Lack of awareness about the value of preserving information published online
As of 2022, Arquivo.pt preserves 28 million websites published since the 1990s. The investment made to build these websites is estimated at around 540 billion euros. For comparison, Portugal's Gross Domestic Product in 2021 was 240 billion euros.
This figure of 540 billion euros is a rough estimate of the investment made to produce the web data that Arquivo.pt preserves and that would otherwise have been lost forever.
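To put these figures side by side, the short sketch below works through the arithmetic. It uses only the numbers quoted above. The per-website average is a derived figure that the article does not state, and it assumes the estimate can be spread evenly across all preserved websites, which is a simplification.

```python
# Figures as quoted in the article; all of them are rough estimates.
ESTIMATED_INVESTMENT_EUR = 540_000_000_000  # estimated cost of building the preserved websites
PORTUGAL_GDP_2021_EUR = 240_000_000_000     # GDP figure cited in the text
PRESERVED_WEBSITES = 28_000_000             # websites preserved as of 2022

# How many times Portugal's 2021 GDP the estimated investment represents.
gdp_multiple = ESTIMATED_INVESTMENT_EUR / PORTUGAL_GDP_2021_EUR

# Implied average investment per preserved website.
# Assumption: the total is spread evenly across sites (not stated in the article).
average_per_site_eur = ESTIMATED_INVESTMENT_EUR / PRESERVED_WEBSITES

print(f"Estimated investment is {gdp_multiple:.2f} times the 2021 GDP")
print(f"Implied average per website: about {average_per_site_eur:,.0f} euros")
```

On these figures, the estimated investment is a little more than twice the cited annual GDP, and the implied average is roughly 19,000 euros per preserved website.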
The true economic value of information preserved on the web is difficult to estimate. Some information loses value as it becomes obsolete, while other information becomes more valuable over time, providing unique historical perspectives that allow trends to be identified. What is undeniable is that digital societies invest heavily in the production of online information.
The important question that must be constantly asked is: Can societies afford to keep wasting the information they publish online?
Hire, train, and retain web archivists
A web archive is an information system that collects online information, stores it, and provides access to it, typically through web user interfaces. It therefore requires hiring web technology specialists, who are not readily available in the job market.
Moreover, archiving the web is a Big Data problem, so web archives have to compete with Internet giants (e.g. Google, Facebook, Amazon) to hire some of these professionals.
Since web technologies are constantly evolving, web archivists must be knowledgeable about both current and obsolete web technologies.
Archiving the web requires cutting-edge technology experts with a historian's mindset, and this mindset isn't taught in any course. Hiring qualified staff to develop and operate a web archive is a major challenge. Retaining these specialists for a long period of time is even more difficult.
The Future: Arquivo.pt as a societal infrastructure
Web archives have the noble mission of preserving historical web documents for future access.
Scientific researchers have been using web archives since their inception. However, as the internet penetrates every aspect of daily life, web archives must assume their role as useful memory infrastructures for all citizens and organizations so that they can serve digital societies.
To this end, it is crucial that web archives offer general-purpose services that allow users to easily leverage preserved historical information. Arquivo.pt has therefore launched innovative services such as SavePageNow, Complete the page, Memorial, and Arquivo404. Share them so they can be useful to more citizens!