Resources Developed

LINGUISTIC RESOURCES DEVELOPED

1.  Digital E-book of Sabar Shabdamala (শবর শব্দমালা)

2.  Complete database of Bengali Spatio-Temporal Expressions

3.  20K English-Bengali Scientific and Technical Terms and Multiword Expressions

4.  20K English-Bengali transliterated lexical database

5.  25K English-Bengali bilingual-bidirectional translational equivalents

6.  BIS POS-Tagged 60K multidisciplinary and benchmarked Bengali sentences

7.  1.5 lakh multidisciplinary and benchmarked normalized Bengali sentences

8.  A large list of Bengali ‘words of multitude’ with examples

9.  A large database of Scientific and Technical Terms in Bengali

10.  Full list of consonant Grapheme Clusters used in Bengali text

11.  Exhaustive list of Bengali Basic Vocabulary

12.  Exhaustive list of Bengali place names

13.  Full list of Bengali Postpositions with examples

14.  Full list of Bengali verbal suffixes and conjugation markers

15.  Full list of Bengali nominal suffixes and case markers

16.  Hindi News Text Corpus of 2 million words

17.  POS tagged News Text Corpus of Indian English of 10 million words

18.  News Text Corpus of Indian English of 10 million words

19.  English-Bengali bilingual database of 1000 idiomatic expressions

20.  Complete list of annotated Bengali pronouns

21.  Full list of Bengali adjectival suffixes with examples

22.  Lexical database of 5000 Bengali prefixed words

23.  Full list of Bengali Prefixes with examples

24.  Chunked Bengali monolingual corpus of 30K sentences

25.  POS tagged Bengali monolingual corpus of 30K sentences

26.  Lexical Dataset of 5000 words from medical domain

27.  News Text Bengali Corpus of 2 million words

28.  Hindi-Bengali chunked corpus of 1 lakh sentences

29.  Hindi-Bengali POS tagged corpus of 1 lakh sentences

30.  Hindi-Bengali Parallel Translation Corpus of 70 thousand sentences

31.  Frequency-based Bengali lexical database of 2 lakh words

32.  Tokenized Bengali lexical database of 2 lakh words

33.  A Unicode-compatible normalized version of the TDIL corpus

34.  Modern Bengali text corpus of 5 million words from 65 disciplines