What's New
corpus

Description:
LITUND contains two comparable corpora:
1. Unreliable news texts: 147 full-text articles (100,678 words) identified as misleading by professional fact-checkers. The corpus includes a metadata file with the following ...
This item contains 3 files (3.34 MB).
Academic Use


corpus

Description:
This corpus consists of (1) examples of hate speech based on ethnicity, nationality, or race, and (2) a collection of neutral comments, including both general comments and comments mentioning nationality in a positive or ...
This item contains 4 files (803.65 KB).
Publicly Available
lexicalConceptualResource

Description:
The dataset was extracted from publicly available online sources, primarily Lithuanian news portal publications from the period 2014–2020 (~500M words). It includes patterns using the following Perl-style regular expression:
...
This item contains 1 file (749.56 KB).
Publicly Available
Most Viewed Items
Top Last Week
lexicalConceptualResource

Description:
Dabartinės lietuvių kalbos tekstyno žodžių formų dažniniai sąrašai
Wordlists of Wordforms of the Contemporary Corpus of the Lithuanian Language
Tekstyno struktūra/Corpus Structure
Patekstynis/Subcorpus Words, m Proporti ...
This item contains 2 files (33.16 MB).
Publicly Available
toolService

Description:
The automatic speech-to-text transcriber for Lithuanian is a containerized application deployed as 17 containers. It covers four domains: administrative, legal, medical, and general spoken language. For the installation of ...
This item contains no files.
lexicalConceptualResource

Description:
The Lithuanian Hunspell dictionary consists of two files: an affix file (.aff) and a dictionary file (.dic). The data can be used for spell checking, morphological analysis, or synthesis of a Lithuanian word (e.g., ...
This item contains 4 files (1.63 MB).
Publicly Available