
Text and Data Mining

Information on text and data mining resources available through the Library

Gale (Including the Economist Historical Archive)

Researchers may request text mining access to content from most Gale Primary Sources collections. The content is delivered to the Library as XML and PDF files. The Library has acquired the collections listed below; contact us about others.
Note: The Financial Times will not license its historical archive for text mining at this time.
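
Because this content is delivered as XML files, a common first step is to pull the plain text out of each file before analysis. The following is a minimal Python sketch under stated assumptions, not a description of Gale's actual file format: the sample document and its element names (`article`, `title`, `p`) are hypothetical, and the helper functions `xml_to_text` and `word_counts` are illustrative names only. It deliberately ignores element names so that it does not depend on any particular schema.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def xml_to_text(xml_string: str) -> str:
    """Return all text content of an XML document, ignoring element names."""
    root = ET.fromstring(xml_string)
    # itertext() yields the text of every element in document order.
    chunks = (chunk.strip() for chunk in root.itertext())
    return " ".join(chunk for chunk in chunks if chunk)


def word_counts(text: str) -> Counter:
    """Count lowercase word tokens (letters and apostrophes only)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Hypothetical sample: real delivered files will follow the vendor's own schema.
sample = (
    "<article><title>Trade News</title>"
    "<p>Trade grew. Trade slowed.</p></article>"
)

text = xml_to_text(sample)
counts = word_counts(text)
print(text)
print(counts.most_common(3))
```

For files on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`. Check the structure of the delivered files first, since metadata elements may need to be excluded from the extracted text.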

Adam Matthew

The Library has secured text mining rights from the publisher, Adam Matthew. The Library has purchased multiple archives from Adam Matthew, including the collections below.

Although the rights are secured, the researcher must submit a project description before Adam Matthew will release the data. Additional costs may be involved. Adam Matthew prohibits any automated searching and downloading of content from its website. Please contact the Library for assistance if you have a text and data mining project that would use content from Adam Matthew.

Readex

Readex has introduced a new tool called Readex Text Explorer, which allows online text analysis using Voyant within selected Readex databases. Text Explorer is currently available only in our Foreign Broadcast Information Service (FBIS) database. Connect to FBIS and select Text Explorer in the menu bar to use this tool.

Text Creation Partnership

The Text Creation Partnership (TCP) is a joint effort to transcribe historical texts from three major databases:

  • Early English Books Online (published by ProQuest)
  • Eighteenth Century Collections Online (published by Gale Cengage)
  • Evans Early American Imprints (published by the Readex division of NewsBank)

The Library subscribes to the full versions of these databases, which provide page images but not easy access to machine-readable text. The TCP has made machine-readable text available for researchers to download for selected titles from two of these databases.