Text and Data Mining

Information on text and data mining resources available through the Library

Gale (Including the Economist Historical Archive)

The Library subscribes to the Gale Digital Scholar Lab, an interactive interface that supports text mining for the majority of the Gale Primary Sources the Library has acquired. The Lab limits each dataset to 10,000 documents.

Researchers may request text mining access to content from most Gale Primary Sources. The content is delivered to the Library as XML and PDF files. The Library has acquired the collections listed below; contact us about others.
Note: The Financial Times will not license its historical archive for text mining at this time.

Adam Matthew

The Library has secured text mining rights from the publisher, Adam Matthew, and has purchased multiple archives from it, including the collections below.

Although rights are secured, an additional project description will be required from the researcher before Adam Matthew will release the data. Additional costs may be involved. Adam Matthew prohibits any automated searching and downloading of content from the website. Please contact the Library for assistance if you have a text and data mining project where you would like to use content from Adam Matthew.

Text Creation Partnership

The Text Creation Partnership (TCP) is a joint effort to transcribe historical texts from three major databases:

  • Early English Books Online (published by ProQuest)
  • Eighteenth Century Collections Online (published by Gale Cengage)
  • Evans Early American Imprints (published by the Readex division of Newsbank)

The Library subscribes to the full versions of these databases, which provide page images but not easy access to machine-readable text. The TCP has made its transcribed text available for researchers to download for selected titles from two of these databases.