Text and Data Mining

Information on text and data mining resources available through the Library


The Collaborative Archive & Data Research Environment, or CADRE, provides access to standardized versions of the following datasets:

  • Web of Science: a commercial dataset that includes 73 million papers and 1.7 billion citations.
  • Microsoft Academic Graph: an open bibliometric dataset that holds 250 million documents and 2.4 billion citations.
  • U.S. Patent and Trademark Office: an open government dataset that includes 9 million patent application documents.

More information on CADRE can be found at

To access CADRE, click the link below, then click "Log In to CADRE" and then "cilogon." In the ORCID dropdown box, search for Chicago, select University of Chicago, and click "Log On" to authenticate.

CADRE access


Citation metadata, including citation counts, can be accessed through the Scopus/ScienceDirect API.
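
As a rough illustration of retrieving citation counts, the sketch below builds a request for the Scopus Search API and pulls the count out of each result. It assumes you have obtained an API key from Elsevier (the `api_key` argument is a placeholder), and the function names are hypothetical helpers written for this guide, not part of any official client. Check the current API documentation for field names and quota limits before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Scopus Search API endpoint (verify against Elsevier's developer documentation).
SCOPUS_SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_request(query, api_key, count=25):
    """Build (but do not send) a Scopus Search API request."""
    params = urlencode({"query": query, "count": count})
    return Request(
        f"{SCOPUS_SEARCH_URL}?{params}",
        headers={"X-ELS-APIKey": api_key, "Accept": "application/json"},
    )


def extract_citation_counts(payload):
    """Return (title, citation count) pairs from a decoded JSON response."""
    entries = payload.get("search-results", {}).get("entry", [])
    return [
        (entry.get("dc:title", ""), int(entry.get("citedby-count", 0)))
        for entry in entries
    ]


def fetch_citation_counts(query, api_key):
    """Send the request and parse the result. Requires network access and a valid key."""
    with urlopen(build_scopus_request(query, api_key)) as response:
        return extract_citation_counts(json.load(response))
```

A call such as `fetch_citation_counts("TITLE(text mining)", "YOUR_API_KEY")` would return a list of title and count pairs for the first page of results; larger result sets need paging, which this sketch leaves out.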

Web of Science

Web of Science provides citation data for conducting research performance analysis. Additional information on the XML files can be found in the Web of Science XML Users Guide. Web of Science datasets can also be accessed in CADRE.

EBSCO Bulk Download

EBSCO databases allow researchers to download up to 25,000 citations at a time. This can only be done from a results screen, and you cannot select individual articles. Note that the download contains only citation information, not the full text of the articles; in most cases it does include abstracts.
Follow these steps:

  1. Conduct your search
  2. Click on Share
    [Screenshot: EBSCO results sharing link]
  3. Select "Email a link to download exported results" and choose your desired format
    [Screenshot: export options in EBSCO]
  4. You'll receive an email when the file is ready. This usually takes just a few minutes.
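
Once the exported file arrives, it can be loaded for analysis. The sketch below assumes you chose RIS as the export format (an assumption; the guide does not say which formats are offered) and parses the tagged lines into one dictionary per citation. The function name `parse_ris` is a hypothetical helper, and the parser is deliberately minimal: lines that do not start with a RIS tag, such as wrapped continuation lines, are skipped.

```python
from collections import defaultdict


def parse_ris(text):
    """Parse RIS-formatted text into a list of {tag: [values]} records."""
    records = []
    current = defaultdict(list)
    for line in text.splitlines():
        # A RIS line looks like "TY  - JOUR": a two-character tag,
        # two spaces, a hyphen, then a space and the value.
        if len(line) < 5 or line[2:5] != "  -":
            continue  # not a tagged line; ignored in this sketch
        tag = line[:2]
        value = line[6:].strip()
        if tag == "ER":
            # "ER" marks the end of a record.
            if current:
                records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value)
    if current:
        # Keep a final record even if its "ER" line is missing.
        records.append(dict(current))
    return records
```

For example, after reading the downloaded file with `open(path, encoding="utf-8").read()`, `parse_ris` returns a list in which each record maps tags such as `TI` (title), `AU` (author), and `AB` (abstract) to lists of values, since tags like `AU` can repeat within one citation.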

PubMed API & Download

The entire PubMed bibliographic database can be downloaded for local use or accessed via API. This is citation and abstract information only, not the text of the cited articles.
Learn more at the National Library of Medicine site.
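
For API access, a minimal sketch using the NCBI E-utilities `esearch` endpoint is shown below. It builds a search URL for the PubMed database and extracts the total hit count and the list of PubMed IDs from the JSON response. The helper names are hypothetical, and the sketch omits the API key and rate-limit handling that NCBI asks heavy users to adopt, so treat it as a starting point only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL for NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, retmax=20):
    """Build an esearch URL that queries PubMed and asks for JSON output."""
    params = urlencode(
        {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    )
    return f"{EUTILS_BASE}/esearch.fcgi?{params}"


def parse_esearch(payload):
    """Return (total hit count, list of PubMed IDs) from a decoded response."""
    result = payload["esearchresult"]
    return int(result["count"]), list(result["idlist"])


def search_pubmed(term, retmax=20):
    """Run the search. Requires network access."""
    with urlopen(build_esearch_url(term, retmax)) as response:
        return parse_esearch(json.load(response))
```

Calling `search_pubmed("text mining")` would return the number of matching records and up to 20 PubMed IDs; the IDs can then be passed to the `efetch` endpoint to retrieve citation and abstract data.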