Digital Humanities

Digital Humanities (DH) is an interdisciplinary field that brings together people from various disciplines and deals with the application of technology, computing methods and tools in the arts, humanities and social sciences. It is a relatively new field of research and teaching; it started in the linguistic and literary domains and, until a few years ago, was known as humanities computing. Consequently, one can find many definitions, depending on the domain and on whether digital humanists focus on the use or the production of digital resources and services. In general terms, digital humanities can be considered the field where humanists meet and use digital resources, methods, tools and technologies.

 

Digital humanists use digital technologies either to conduct and present their research in a practical way, or to transform knowledge and answer research questions derived from and related to the source material. Scholars usually start by formulating research questions and examining the source material, in either digital or physical form. Fortunately, the expensive task of digitisation is not necessary for all scholars, or at least does not play a crucial role in their research, as for those in economic history or demographic studies. Moreover, they collect, analyse, identify, relate and document information about persons and other agents, events, places, time periods, context, subjects and other qualitative and quantitative attributes. The information landscape of humanities scholars becomes more complicated because, in order to give meaningful interpretations and so transform knowledge, they need to deal with vague concepts, abstract terms, complex relationships and structures, ambiguous identities, uncertainty, unknown places and names, obsolete languages, exceptions, irregularities, contradictions, changes and movement in time and space, unique attributes of the source material and subject domain, and even with various manifestations, different editions, commentaries and translations of the same work. In general, humanities data is full of complexities, inconsistencies, irregularities, errors, messy information, missing values and misspellings, but also of abbreviations, synonyms, polysemy, homophony and various other elements, inherent in human languages, whose meaning machines cannot understand or interpret.

 

Metadata and controlled vocabularies help to resolve ambiguity and are essential for documenting, exchanging and discovering humanities data explicitly, precisely and meaningfully. Scholars very often deal with ambiguity, and more generally with terms and concepts whose meaning is not clear, so it is very common for different words, even in the same language, to be used to describe the same person or event. Knowledge organisation and data modelling in the humanities play a crucial role in the structuring, interrelation, representation, contextualisation, analysis, visualisation and interpretation of complex information, and also in enabling data interoperability, discovery and sharing. Moreover, text and data mining technologies play a decisive role in automatically analysing vast amounts of information, or ‘big data’, and therefore in recognising and interpreting patterns and, consequently, transforming knowledge in a way that would be impossible with traditional scholarship.
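
As a rough illustration of how a controlled vocabulary can resolve different names to the same person, the following Python sketch looks up name variants in a small authority list. The identifiers, the list itself and the function names are assumptions made for this example only; a real project would normally rely on an established authority file rather than a hand-written dictionary.

```python
import unicodedata

# Hypothetical authority list: one identifier per person, with a preferred
# label and the variant names under which that person appears in sources.
AUTHORITY = {
    "person:0001": {
        "preferred": "Homer",
        "variants": ["Homer", "Homeros", "Homère", "Omero"],
    },
    "person:0002": {
        "preferred": "Virgil",
        "variants": ["Virgil", "Vergil", "Publius Vergilius Maro"],
    },
}


def normalise(name):
    """Strip accents, ignore case and collapse whitespace in a name."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.casefold().split())


# Index every normalised variant to the identifier it belongs to.
VARIANT_INDEX = {
    normalise(variant): identifier
    for identifier, entry in AUTHORITY.items()
    for variant in entry["variants"]
}


def resolve(name):
    """Return the identifier for a name variant, or None if it is unknown."""
    return VARIANT_INDEX.get(normalise(name))


def preferred_label(name):
    """Return the preferred label for a name variant, or None if unknown."""
    identifier = resolve(name)
    return AUTHORITY[identifier]["preferred"] if identifier else None
```

With this list, preferred_label("Homère") and preferred_label("Omero") both return "Homer", while a name that is not in the list returns None, which leaves the decision to the scholar instead of guessing.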

 

Digital scholarship does not rely only on the use of computational methods and tools for documenting, analysing and interpreting data, but also on access to full text, high-resolution images and other quality digital material from which new types of questions, methods and results can arise. Humanities researchers immerse themselves in texts, even when text is not their primary source, and they certainly make extensive use of secondary literature. Moreover, they need to find, access, reuse and cite information resources easily to support their research, and they also need their research outputs to be peer-reviewed and published, and then to be discovered, accessed, reused, disseminated and cited properly in an easy and valid manner. However, integrating, discovering, accessing and reusing digital resources is very difficult, both because of the chaotic landscape of the World Wide Web and because of the nature of humanities data. Specifically, data in the arts, humanities and cultural heritage are scattered, represent different types of content, are expressed in semantically and syntactically heterogeneous languages, come from various individual digital libraries and are maintained by different applications and systems in various formats. As a consequence, there is a need for providers of high-quality digital resources and services that facilitate digital scholarship, not only through the discovery of open access primary and secondary research material, but also by integrating, representing and disseminating content efficiently and meaningfully.

 

Information portals and repositories offer scholars enormous benefits by providing single points of access to heterogeneous content from various sources. However, to exchange and publish all these scattered, interdisciplinary, multilingual and rich information resources, which differ in type of content and format, exist in various manifestations and originate from different operating systems and computer programs, there is an absolute need for quality data and interoperability between systems. Computers need to interpret data meaningfully and efficiently, so that it can be further processed automatically and combined with other data, despite the differences in the systems and technologies used. Thus, semantic interoperability plays a fundamental role in maintaining the meaning of humanities data and all of the unique intellectual characteristics that are important for research inquiry.
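
To make the idea of combining data from different systems more concrete, the following Python sketch maps records from two imaginary providers onto one shared set of field names. The provider names, the field names and the mapping table are all assumptions made for illustration. The sketch covers only the alignment of field names, which is a small part of what semantic interoperability involves.

```python
# Hypothetical mappings from each provider's own field names onto a shared
# set of field names (title, creator, date). All names are illustrative.
FIELD_MAPPINGS = {
    "library_a": {"titel": "title", "verfasser": "creator", "jahr": "date"},
    "archive_b": {"name": "title", "author": "creator", "year": "date"},
}


def to_common_record(source, record):
    """Rename the fields of one provider's record to the shared field names.

    Fields without a mapping are collected under "unmapped" so that no
    information from the original record is silently discarded.
    """
    mapping = FIELD_MAPPINGS[source]
    common = {"source": source, "unmapped": {}}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            common["unmapped"][field] = value
        else:
            # Store values as trimmed strings so records can be compared.
            common[target] = str(value).strip()
    return common


def merge_sources(records_by_source):
    """Convert the records of every provider and return them as one list."""
    merged = []
    for source, records in records_by_source.items():
        for record in records:
            merged.append(to_common_record(source, record))
    return merged
```

For example, a record {"titel": "Faust", "verfasser": "Goethe", "jahr": 1808} from the imaginary "library_a" and a record {"name": "Faust", "author": "Goethe", "year": "1808"} from "archive_b" end up with the same title, creator and date fields, so they can be searched and compared together.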