Basic information and data

  • The web archiving project started at the beginning of 2017 in the National Széchényi Library; the test period lasted until 2019.
  • The aim is to preserve and make searchable documents and information sources produced and distributed in digital form.
  • The primary scope of collection is scientific, cultural, educational, and public sphere web content.
  • Types of archiving:
    – periodic harvests of selected Hungarian websites (by theme, genre, institution);
    – event-related harvests (news portal sections, relevant websites and blogs);
    – snapshots of the Hungarian web space (web servers under the .hu domain and other Hungarian-related web content).

    (Figure: Registration and harvest data of e-periodicals)
  • The web archive uses free, open-source software.
  • Only a small part of the collection is public, for legal reasons.
  • Statistics at the end of 2019:

Closed archive:
12 thematic sub-collections (e.g. literature, art, culture, religion, higher education, research, government, public collections)
1 sub-collection by genre (e-periodicals – over 4,600 websites)
5 event-based sub-collections (e.g. elections, Olympics)
approx. 25,000 selected websites saved 1-5 times, with front page screenshots
approx. 250,000 automatically collected sites saved twice, with front page screenshots
more than 600 million files / URLs saved
35 terabytes total size

Public archive:
186 selected and licensed sites saved 1-3 times with front page screenshots and metadata
44 NSZL (National Széchényi Library) websites saved 1-2 times with front page screenshots
1 event-based sub-collection (Rákóczi Memorial Year)
0.5 terabyte total size