This small collection was established as part of the pilot web-archiving project of the National Széchényi Library. It contains the archived websites whose copyright owners have permitted them to be made publicly available. (We always welcome further suggestions on this template. For public service, the copyright owner must sign this contract.) Our aim is to demonstrate the capabilities and limits of current web archiving technology. Although the websites selected for this archive can be harvested well in an automated way, the archived copies may still contain errors or lack some content elements. Some of these errors can be corrected by applying the recommendations for creating crawler-friendly and archive-friendly websites.
The archived items marked with red arrows were made with the Heritrix, Brozzler, Webrecorder or HTTrack software, usually to a limited depth of the original website. The archived items are displayed with the OpenWayback and PyWb software and/or with Conifer, the online version of Webrecorder. The archived items that HTTrack saved in a file system structure can be viewed through the web server. (In each case, you have to click on the saving date of the item you want to see.) The capabilities of the individual harvesting and display software differ. Where several red arrows appear, it is worth trying each of them.
The other arrows point to screenshots of the original homepages, to archived copies made by the Internet Archive, and to the original site (if this arrow is gray, the website is already dead). Clicking an arrow opens the website in a new browser page, so the archived versions can easily be compared with the original versions. The yellow button leads to a graph interface that displays the outgoing and incoming links from and to the archived items. (As this demo archive is small, you will find rather few incoming links.) The brown arrows in the last column point to the metadata record of the corresponding archived website (pressing CTRL+U in the browser shows the original XML source code). For the latest items, metadata records will be made later, so the brown arrow is missing. A metadata record describing the sub-collection itself is available here.
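To illustrate what a graph of outgoing and incoming links contains, the following is a minimal sketch in Python. It is not how the graph interface of this archive is implemented. The function name `link_sets` and the `*.example` host names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def link_sets(links):
    """Group (source, target) link pairs into outgoing and incoming sets.

    `links` is an iterable of (source, target) pairs, for example the
    hyperlinks extracted from archived pages. Illustrative helper only.
    """
    outgoing = defaultdict(set)  # site -> sites it links to
    incoming = defaultdict(set)  # site -> sites that link to it
    for source, target in links:
        outgoing[source].add(target)
        incoming[target].add(source)
    return outgoing, incoming


# Hypothetical sample data: three sites and three links between them.
links = [
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("b.example", "c.example"),
]
outgoing, incoming = link_sets(links)

print(sorted(outgoing["a.example"]))  # sites that a.example links to
print(sorted(incoming["c.example"]))  # sites that link to c.example
print(sorted(incoming["a.example"]))  # nothing links to a.example here
```

In a small sample like this one, a site such as `a.example` has no incoming links at all, which is the same effect the small size of the demo archive has on the real graph.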
The SolrWayback software provides full-text search of the archived websites. The lists of hits can be customized further by domain name, file type and year of archiving. In these lists, clicking the capitalized name of a website or file opens the archived version. The address on the Url: line refers to the original website or file. Clicking See raw data shows the details of the selected archived item. After selecting an item in the list of hits, clicking Toolbar in the upper left displays further information about the corresponding website or domain.
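The effect of narrowing a list of hits by domain name, file type and year of archiving can be sketched in Python as follows. This sketch does not use the SolrWayback API. The `Hit` record, its field names, the `narrow` function and the sample data are all hypothetical and only mimic the idea of the filters.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Hit:
    """One search hit (hypothetical structure, not a SolrWayback type)."""
    title: str
    url: str        # address of the original website or file
    domain: str
    file_type: str
    year: int       # year of archiving


def narrow(hits: Iterable[Hit],
           domain: Optional[str] = None,
           file_type: Optional[str] = None,
           year: Optional[int] = None) -> List[Hit]:
    """Keep only the hits that match every filter that was given."""
    return [
        h for h in hits
        if (domain is None or h.domain == domain)
        and (file_type is None or h.file_type == file_type)
        and (year is None or h.year == year)
    ]


# Hypothetical sample hits.
hits = [
    Hit("HOME", "http://a.example/", "a.example", "html", 2018),
    Hit("REPORT", "http://a.example/report.pdf", "a.example", "pdf", 2019),
    Hit("NEWS", "http://b.example/news", "b.example", "html", 2019),
]

by_domain = narrow(hits, domain="a.example")
by_type_and_year = narrow(hits, file_type="html", year=2019)

print([h.title for h in by_domain])
print([h.title for h in by_type_and_year])
```

Each filter that is left out is simply ignored, so the filters can be combined in any order.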