Evaluating Generative AI Tools for Academic Research

Support for critical approaches to GenAI tools for academic research.

AI Tools for Transcription

AI transcription tools are quickly becoming the norm in academic settings. Sometimes referred to as speech-to-text (STT) or automatic speech recognition (ASR) tools, they may be used in qualitative research to help a researcher capture, reflect on, and analyze spoken aspects of audio data, and/or to support accessibility in class, helping students transcribe, take notes, and/or summarize a lecture. No matter the purpose, you should think carefully about the ethical and legal implications of using these types of tools and get consent ahead of time from all parties to record, transcribe, and store their comments. Although New York State is a one-party consent state, researchers need to follow guidelines from the NYU Institutional Review Board when using these tools for their project(s).

Please note that AI transcription tools have been known to add text that was not part of the source audio, attribute text to the wrong speaker, and/or incorrectly transcribe phrases or full sentences. In some cases, these errors can be harmful and offensive (see: Careless Whisper: Speech-to-Text Hallucination Harms by Allison Koenecke et al., 2024), or can create confusion and misrepresent speakers. It is your responsibility to review outputs to ensure accuracy.

Evaluating AI Tools for Transcription

Many researchers select a tool based on what is commonly used by peers in a department or field. As a researcher or student, it is your responsibility to investigate tools independently. In addition to identifying your own research or accessibility requirements, there are a number of other things to consider.

The table below provides a rubric for evaluating AI tools for transcription.
Criteria Considerations
Privacy and Security

Whether a recording or transcription contains personally identifying information (PII) or the intellectual property of someone else, the privacy and security of your data should be carefully considered.

  • Do you know how the platform will store and secure your data?
  • What is the company’s data retention and privacy policy? (see Did You Read the Terms of Service below)
  • Are they transparent about any use or sale of your data?
  • Do they re-use your data to help train their tool?
Data Storage

Data storage may or may not relate to privacy and security issues, but where your data is stored has further implications. For instance, do you want to store data on the cloud, or locally on your device?

  • If local, does your computer have the storage space and processing power to use an automated transcription tool? Note that high-quality audio will lead to a better transcription, but will require more processing power.
  • If on the cloud, can you find information about where the company’s servers are located? Does the service have limits on the size and length of audio files?
Accuracy

Accuracy may depend on factors such as language and sound quality, but knowing how accurate a transcription is likely to be can help you assess how much correction it will need and how easily that can be done.

  • Can you find information about the “word error rate” (WER), or the proportion of words the tool substitutes, omits, or inserts compared to a human transcription?
  • Does the tool provide an editor you can use to correct the transcript?
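
If a tool does not publish a word error rate, you can estimate one yourself by comparing a short passage of its output against a transcript you have corrected by hand. The Python sketch below is one minimal way to do that, not a standard implementation: the function name `word_error_rate` and the sample sentences are hypothetical, and the only text clean-up it assumes is lowercasing and splitting on whitespace, so punctuation differences will count as errors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / words in reference.

    reference  -- the human-corrected transcript
    hypothesis -- the tool's automatic transcript
    """
    # Assumption: lowercasing and whitespace splitting is enough normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level edit distance, computed one row at a time.
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the tool's output
                curr[j - 1] + 1,     # extra word added by the tool
                prev[j - 1] + cost,  # match or substituted word
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one omitted word.
human = "the interview began at noon on tuesday"
tool = "the interview begin at noon tuesday"
print(f"WER: {word_error_rate(human, tool):.2f}")
```

A lower value means fewer corrections. Because added words count as errors, the rate can exceed 1.0 when a tool inserts a lot of text that was not in the audio.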

As stated in the introduction above, AI transcription tools can not only transcribe incorrectly, but can also add text that was not part of the source audio. In some cases, these errors can be harmful and offensive (see: Careless Whisper: Speech-to-Text Hallucination Harms by Allison Koenecke et al., 2024). It is your responsibility to review outputs to ensure accuracy.

Features

Does the tool provide features useful to your research such as:

  • Languages - transcription quality may vary dramatically across languages
  • Commenting or highlighting features
  • Time stamping or indexing
  • Identifying speakers
  • Summarizing
  • Sharing transcripts or transcription credits with collaborators
User Experience
  • Do you find the tool’s interface easy to navigate and understand?
  • If you have a problem, can you access tech support? 
  • How long does the tool take to transcribe your audio? 
  • How does the tool export text, and can you customize the format of the transcript?
Cost
  • Is the tool free and/or open source? Note that free tools might come with trade-offs (e.g., a lack of transparency, privacy, and/or usability). If it is free and proprietary, be sure to read the terms of service especially closely. 
  • If there is a cost, how is the cost calculated and how much is it? What happens to your account and data if you stop paying? Does the service have a limited free plan you can use?
Access

NYU provides access to some platforms and tools that help with transcription, including Zoom, NYU Stream, and the online version of Microsoft Word. Details on these platforms and more can be found in our Qualitative Data Analysis Research Guide.

Did You Read the Terms of Service?

Cloud-based services often have lengthy and dense terms of service. When evaluating transcription tools, you may be interested in finding information about the service’s data security, retention, and privacy practices. You may also want to know if the service can use your data to improve its AI, and if you can opt out of this practice.

Third-Party AI Assistants

Third-party AI assistants like OtterPilot (from otter.ai), Read.ai, or Fireflies.ai may automatically record and transcribe your online meetings. Acting as a user, these “assistants” will automatically join an online meeting and record the conversation. While these tools may be used for legitimate purposes, it’s good practice to ask for explicit consent from meeting attendees to record and transcribe the meeting. We also encourage users to review the Terms of Service carefully before signing up for these tools, which are often free but lack transparency in their privacy policies and data use.

Additional Resources