This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Please attribute this work to the NYU Libraries Scholarly Communications and Information Policy Department.
Data mining is a research technique that uses computational analysis to uncover patterns in large data sets. Data mining techniques range from machine learning applications to GIS and mapping to business intelligence. Because these techniques are applied to so many different types of data, data mining is hard to pin down as a single method.
Text mining is the process of deriving information from textual data. Text mining techniques might include sentiment analysis, network analysis, word frequency distributions, pattern recognition, tagging/annotation, information extraction, and the production of granular taxonomies or ontologies.
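To make one of these techniques concrete, here is a minimal sketch of a word frequency distribution in Python. It uses only the standard library; the function name `word_frequencies`, the simple tokenizing rule, and the sample sentence are illustrative assumptions, not a recommended workflow or a tool provided by the Libraries.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most common word tokens in text as (word, count) pairs."""
    # Assumption: a "word" is any run of lowercase letters or apostrophes.
    # Real projects usually need a more careful tokenizer.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)


# Hypothetical sample text, used only for illustration.
sample = "Text mining derives information from text. Mining text reveals patterns in text."
print(word_frequencies(sample, top_n=2))
```

In practice you would read the text from files or a database you have permission to mine, and you might remove very common words (such as "the" or "in") before counting.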
These analytic techniques are valuable in numerous scholarly fields, from the humanities to the sciences, where data can be "mined" from large non-text datasets and from text databases of the published literature (Source: UMass Amherst Libraries).
Questions? Contact us by emailing data.services@nyu.edu, or fill out our consultation request form and we'll get back to you.
Before you collect any data, check the copyright terms and policies of the database, website, or social media platform you plan to mine. Many platforms, including library systems, do NOT allow users to mine their materials. For more information, check out the Libraries' guide on Applying Fair Use.