SwePub

A data-driven approach to studying changing vocabularies in historical newspaper collections

Hengchen, Simon, 1988 (author)
Gothenburg University (Göteborgs universitet), Department of Swedish (Institutionen för svenska språket)
Ros, Ruben (author)
Marjanen, Jani (author)
Tolonen, Mikko (author)
2021-11-05
2021
English.
In: Digital Scholarship in the Humanities. Oxford University Press (OUP). ISSN 2055-7671, 2055-768X. 36:Supplement 2, pp. 109-126
  • Journal article (peer-reviewed)
Abstract
  • Nation and nationhood are among the most frequently studied concepts in the field of intellectual history. At the same time, the word ‘nation’ and its historical usage are very vague. The aim in this article was to develop a data-driven method using dependency parsing and neural word embeddings to clarify some of the vagueness in the evolution of this concept. To this end, we propose the following two-step method. First, using linguistic processing, we create a large set of words pertaining to the topic of nation. Second, we train diachronic word embeddings and use them to quantify the strength of the semantic similarity between these words and thereby create meaningful clusters, which are then aligned diachronically. To illustrate the robustness of the study across languages, time spans, as well as large datasets, we apply it to the entirety of five historical newspaper archives in Dutch, Swedish, Finnish, and English. To our knowledge, thus far there have been no large-scale comparative studies of this kind that purport to grasp long-term developments in as many as four different languages in a data-driven way. A particular strength of the method we describe in this article is that, by design, it is not limited to the study of nationhood, but rather expands beyond it to other research questions and is reusable in different contexts.
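The second step described in the abstract (measuring semantic similarity between word vectors and grouping similar words) can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors in `toy_vectors` are invented for illustration, the `threshold` value is arbitrary, and the grouping rule (words linked by a cosine similarity at or above the threshold end up in the same cluster) is a simple stand-in for whatever clustering procedure the article actually uses. Real diachronic embeddings would be trained per time period on the newspaper text.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster(vectors, threshold=0.8):
    """Group words into connected components of the graph whose edges
    join word pairs with cosine similarity >= threshold.
    Both the rule and the threshold are illustrative assumptions."""
    words = sorted(vectors)
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the representative of w's group.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(vectors[a], vectors[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)
    return sorted(groups.values())


# Hypothetical toy vectors, not taken from any trained model.
toy_vectors = {
    "nation": [0.9, 0.1, 0.0],
    "people": [0.8, 0.2, 0.1],
    "state": [0.1, 0.9, 0.1],
    "government": [0.2, 0.8, 0.0],
}

clusters = cluster(toy_vectors)
print(clusters)
```

With these toy vectors, "nation" and "people" fall into one group and "state" and "government" into another. Running the same grouping on embeddings from successive time slices and comparing the resulting clusters is one way to picture the diachronic alignment the abstract mentions.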

Subject headings

HUMANITIES -- History and Archaeology -- History (hsv//eng)
HUMANITIES -- Other Humanities (hsv//eng)
NATURAL SCIENCES -- Computer and Information Sciences -- Language Technology (hsv//eng)

Publication and Content Type

ref (subject category)
art (subject category)
