AILLA is a digital archive of recordings and texts in and about the indigenous languages of Latin America. Access to archive resources is free of charge. Most of the resources in the AILLA database are available to the public, but some have special access restrictions.
The heart of the collection is recordings of naturally-occurring discourse in a wide range of genres, including narratives, ceremonies, oratory, conversations, and songs. Many of these recordings are accompanied by transcriptions and translations in either Spanish, English, or Portuguese. These works contain a wealth of information about Latin American indigenous cultures as well as knowledge about the natural environments that the people live in.
The archive also collects materials about these languages, such as grammars, dictionaries, ethnographies, and research notes. The collection includes teaching materials for bilingual education and language revitalization programs in indigenous communities, such as primers, readers, and textbooks on a variety of subjects, written in indigenous languages.
The Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) holds computer-based (digital) materials about Australian Indigenous languages in the Aboriginal Studies Electronic Data Archive (ASEDA). ASEDA's materials include dictionaries, grammars, and teaching materials, and represent about 300 languages. ASEDA offers a free service of secure storage, maintenance, and distribution of electronic texts relating to these languages.
This collection contains linguistic data, stories, songs/chants and other material in about 90 languages, mostly endangered or rare languages. The largest group of languages represented is the result of field work sponsored by the Survey of California and Other Indian Languages (UCB, Department of Linguistics).
Cascadilla Proceedings Project is an imprint of Cascadilla Press. We created CPP as a new model for proceedings of linguistics conferences and workshops. All proceedings published by CPP are available both in print and on the web. Web access is free and unrestricted, and the copy available on the web is the same as the book version in content, formatting, and pagination. The print edition is a hardback which meets library binding standards. This combination allows for the best of both worlds: free and quick access for researchers looking for a proceedings paper, with all the advantages of being published in book form.
Type in a word or phrase in one of seven languages (English, French, German, Spanish, Hebrew, Russian, Chinese) and see how its usage frequency has been changing throughout the past few centuries. Addictive.
The Centre for East European Language-Based Area Studies (CEELBAS) Language Repository is an open-access digital resource for students of the languages of Central and Eastern Europe and Russia. It currently covers 13 languages and houses almost 40 different sets of specially-designed language training materials, including materials for several languages where online resources are scarce. Currently the following languages are covered: Bulgarian; Croatian/Serbian; Czech; Estonian; Finnish; Georgian; Hungarian; Polish; Romanian; Russian; Slovak; Ukrainian.
ELSNET is a Europe-based forum dedicated to human language technologies. It operates in an international context, and will consider, across discipline boundaries, all human communication research areas related to language and speech.
Its main objective is to advance human language technologies in a broad sense by bringing together Europe's key players in research, development, integration or deployment in the field of language and speech technology and neighbouring areas.
This page describes archives which house materials that are intended to document and describe human languages, such as wordlists, lexicons, annotated signals, interlinear texts, paradigms, field notes, and linguistic descriptions.
The LINGUIST List is dedicated to providing information on language and language analysis, and to providing the discipline of linguistics with the infrastructure necessary to function in the digital world. LINGUIST maintains a web-site with over 2000 pages and runs a mailing list with over 25,000 subscribers worldwide. LINGUIST also hosts searchable archives of over 100 other linguistic mailing lists and runs research projects which develop tools for the field, e.g., a peer-reviewed database of language and language-family information, and recommendations of best practice for digitizing endangered languages data.
LINGUIST is a free resource, run by linguistics professors and graduate students, and supported entirely by your donations.
The National Gallery of the Spoken Word (NGSW) is an ongoing five-year research project funded under the Digital Library Initiative II spearheaded by the National Science Foundation. The NGSW is creating an online fully-searchable digital library of spoken word collections spanning the 20th century at HistoricalVoice.org. NGSW provides storage for these digital holdings and public exhibit "space" for the most evocative collections. From Thomas Edison's first cylinder recordings and the voices of Babe Ruth and Florence Nightingale to Studs Terkel's timeless interviews and the oral arguments of the US Supreme Court, the collections of the NGSW digital library cover a variety of interests and topics.
Omniglot is a guide to the writing systems and languages of the world.
It also contains tips on learning languages, language-related articles, quite a large collection of useful phrases in many languages, multilingual texts, a multilingual book store and an ever-growing collection of links to language-related resources.
This catalog, developed by the Open Language Archives Community (OLAC), provides access to a wealth of information about thousands of languages, including details of text collections, audio recordings, dictionaries, and software, sourced from dozens of digital and traditional archives.
The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read the same paragraph, and their readings are carefully transcribed. The archive is used by people who wish to compare and analyze the accents of different English speakers.
WALS is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of more than 40 authors (many of them the leading authorities on the subject).
WALS consists of 141 maps with accompanying texts on diverse features (such as vowel inventory size, noun-genitive order, passive constructions, and "hand"/"arm" polysemy), each of which is the responsibility of a single author (or team of authors). Each map shows between 120 and 1370 languages, each language being represented by a symbol, and different symbols showing different values of the feature. Altogether 2,650 languages are shown on the maps, and more than 58,000 datapoints give information on features in particular languages.
It provides vocabularies (mini-dictionaries of about 1000-2000 entries) of 41 languages from around the world, with comprehensive information about the loanword status of each word. It allows users to find loanwords, source words and donor languages in each of the 41 languages, but also makes it easy to compare loanwords across languages.
ELRA is the driving force to make available the language resources for language engineering and to evaluate language engineering technologies. In order to achieve this goal, ELRA is active in identification, distribution, collection, validation, standardisation, improvement, in promoting the production of language resources, in supporting the infrastructure to perform evaluation campaigns and in developing a scientific field of language resources and evaluation.
Haskins Laboratories is an independent, international, multidisciplinary community of researchers conducting basic research on spoken and written language. Exchanging ideas, fostering collaborations, and forging partnerships across the sciences, it produces groundbreaking research that enhances our understanding of, and reveals ways to improve or remediate, speech perception and production, reading and reading disabilities, and human communication.
Conlang.org is a site for conlangers, would-be conlangers, those interested in or curious about conlangs, and anything else to do with conlanging.
A "conlanger" is someone who creates or constructs languages, or "conlangs." Conlangs come in a wide variety, although they can be divided primarily into three general areas: auxlangs or international auxiliary languages like Esperanto, engelangs or engineered languages like Ithkuil and Lojban, and artlangs or artistic languages like Sindarin or Klingon.
The Universal Esperanto Association was founded in 1908 as an organization of individual Esperantists. Currently UEA is the largest international organization for Esperanto speakers and has members in 118 countries. UEA works not only to promote Esperanto, but to stimulate discussion of the world language problem and to call attention to the necessity of equality among languages.
ANVILL (A National Virtual Language Lab) is a speech-based toolbox for language teachers. Like the language lab console of old, it's focused on the practice of oral/aural language, but at its core are very modern web-based audio and video tools from duber dot com and the University of Oregon: Voiceboards, LiveChat, and Quizzes and Surveys. Our newest tool, TCast, allows teachers to record and place audio or video files anywhere in a lesson--in 3 easy steps. Each of these tools really opens up the scope and sequence of lessons centered around spoken language tasks.
iLoveLanguages is a comprehensive catalog of language-related Internet resources. The more than 2400 links at iLoveLanguages have been hand-reviewed to bring you the best language links the Web has to offer.
The goal of TalkBank is to foster fundamental research in the study of human and animal communication. It will construct sample databases within each of the subfields studying communication. It will use these databases to advance the development of standards and tools for creating, sharing, searching, and commenting upon primary materials via networked computers.
Wordnik is the world's biggest online English dictionary, by number of words.
Wordnik.com is an online English dictionary and language resource that provides dictionary and thesaurus content, some of it based on print dictionaries such as the Century Dictionary, the American Heritage Dictionary, WordNet, and GCIDE. Wordnik has collected a corpus of billions of words which it uses to display example sentences, allowing it to provide information on a much larger set of words than a typical dictionary. Wordnik shows definitions from multiple sources, so you can see as many different takes on a word's meaning as possible.