Every day a vast amount of information is uploaded to the internet through content published on websites worldwide. This information is readily available to researchers and can prove valuable given the large data samples it can provide. Data can be gathered not only from a website's content but also from information about its origin, its language, or even its aesthetics: color, typography, layout, and many other parameters contribute to a website's look and feel. Moreover, online internet archives maintain snapshots of websites from the past, giving researchers the opportunity to study trends and patterns over specific time periods.
The present study provides a practical methodology for gathering data from internet archives using open-source digital tools. Using web browser plugins, a database of websites, and PHP code, a researcher can query an online archive's API and retrieve data. The process is described in detail and is divided into distinct sections: a) the structure and implementation of the database, as well as its data design schema; b) the application responsible for retrieving data from the database and generating the corresponding code to be injected into the browser plugin; and c) the process through which the plugin sends query requests to the archive API, extracts information from the responses, and organizes the collected data into a digital filing system. The data gathered consists of each website's source code and a homepage screenshot and, as shown in this study, contains a surprisingly large amount of information about the parameters that contribute to the website's aesthetics.
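The abstract does not name the specific archive or its API, so as a minimal sketch we assume the Internet Archive's Wayback Machine "availability" API, a common choice for this kind of study. The sketch (in Python rather than the study's PHP, for brevity) shows the shape of step c): building a query for a site at a given timestamp and extracting the closest archived snapshot from the JSON response. The function names and the choice of archive are illustrative assumptions, not the study's actual implementation.

```python
from urllib.parse import urlencode

# Assumption: the Internet Archive's Wayback Machine availability API,
# used here only to illustrate the query/response cycle described in the text.
API_ENDPOINT = "https://archive.org/wayback/available"

def build_query(site_url, timestamp):
    """Build an availability-API request URL for a site and a YYYYMMDD timestamp."""
    return API_ENDPOINT + "?" + urlencode({"url": site_url, "timestamp": timestamp})

def closest_snapshot(response_json):
    """Return the URL of the closest archived snapshot in an API response,
    or None when no snapshot is available near the requested timestamp."""
    snap = response_json.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap["url"]
    return None

if __name__ == "__main__":
    # The plugin would fetch this URL and pass the JSON body to closest_snapshot().
    print(build_query("example.com", "20100101"))
    # Sample response of the shape the API returns (not fetched live here):
    sample = {"archived_snapshots": {"closest": {
        "available": True,
        "url": "http://web.archive.org/web/20100101003500/http://example.com/",
        "timestamp": "20100101003500",
        "status": "200"}}}
    print(closest_snapshot(sample))
```

In a full pipeline the snapshot URL returned here would then be fetched to store the page's source code, and a headless browser would capture the homepage screenshot for the filing system described above.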