Evaluating Generative AI Tools for Academic Research

Support for critical approaches to GenAI tools for academic research.

AI Tools for Translation

Translation, which is distinct from transliteration (writing words of one language using the alphabet of another), sits at the intersection of several fields: “sociology, psychology, computer science, information science, and linguistics, its birthplace” (Erton, 2020, p. 1910). Although it’s easy to think of translation as merely working out the “same meaning” between two texts, meaning can be idiomatic or embedded in context (temporal, political, social, and so on). These overlapping factors make translation difficult, and the standard by which a translation is judged is ultimately set by humans. As a result, translation is more an interpretive practice than a definitive one, which makes developing and evaluating machine translation tools particularly challenging.

Machine Translation (MT) has been around for some time. As an umbrella term, it refers to the use of computers to translate words and phrases from one language to another. There are a number of ways this relationship can occur:

  • “one-to-one, i.e. from one source language (SL) to one target language (TL), known as bi-lingual translation
  • one-to-many, i.e. from one SL into many TLs
  • many-to-many translation, i.e. from many SLs to many TLs known as Multilingual Machine Translation (MMT)” (Sitender et al., 2023, p. 3441).

Translation can be unidirectional (e.g. from Spanish to Korean only) or bidirectional (e.g. from Spanish to Korean AND from Korean to Spanish). 

The source or input format for MT is typically audio or text; this can also include the audio or transcription of the audio in a video, or images where optical character recognition (OCR) can be used to turn text that appears in images into machine-readable characters.

Please note that GenAI translation tools can generate inaccurate outputs including errors in translating individual words and context. It is your responsibility to review outputs to ensure accuracy. Researchers using these tools for their project(s) also need to follow guidelines from the NYU Institutional Review Board.

Evaluating AI Tools for Translation

Many researchers select a tool based on what is commonly used by peers in a department or field. As a researcher or student, it is your responsibility to investigate tools independently. In addition to identifying your own research or accessibility requirements, there are a number of other things to consider.

The table below provides a rubric for evaluating AI tools for translation.
Criteria Considerations
Privacy and Security

Whether a recording or transcription contains personally identifying information (PII) or the intellectual property of someone else, the privacy and security of your data should be carefully considered.

  • Do you know how the platform will store and secure your data?
  • What is the company’s data retention and privacy policy? (see Did You Read the Terms of Service below)
  • Are they transparent about any use or sale of your data?
  • Do they re-use your data to help train their tool?
Data Storage

Data storage may or may not raise privacy and security issues, but where your data is stored has further implications. For instance, do you want to store data on the cloud, or locally on your device?

  • If local, does your computer have the storage space and processing power to run an automated translation tool? Note that high-quality audio or video will require more processing power.
  • If on the cloud, can you find information about where the company’s servers are located? Does the service have limits on the size and length of audio files?
Accuracy

Accuracy may depend on the source material or languages available, but an understanding of how accurate a translation is can help you assess how much work you may need to do to correct outputs and how easily that can be done.

  • Can you find information about the tool’s “word error rate” (WER), i.e. the number of word-level errors in its output relative to a human reference translation?
  • Does the tool provide an editor you can use to correct the output?

The accuracy of any AI translation tool can sometimes be debated because of the interpretative nature of the task as described above. It is your responsibility to review outputs to ensure accuracy.
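
If a vendor does not publish a word error rate, you can estimate one on a small sample of your own material. The sketch below is one common way to compute it: count the word-level substitutions, insertions, and deletions needed to turn the tool's output into a human reference, then divide by the number of words in the reference. The function name, the whitespace tokenization, and the sample sentences are illustrative assumptions, not part of any particular tool. Because several translations of the same text can be equally valid, a score against a single reference is only a rough indicator.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Estimate WER as word-level edit distance / number of reference words.

    Assumes simple whitespace tokenization and a single human reference;
    both are simplifications made for illustration.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # reference word missing from output
                curr[j - 1] + 1,                  # extra word in output
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref_words)


# Hypothetical sample: one substituted word and one dropped word
# against a six-word reference.
human = "the cat sat on the mat"
machine = "the cat sit on mat"
print(f"{word_error_rate(human, machine):.2f}")  # prints 0.33
```

A lower score means fewer corrections per reference word. The comparison is case- and punctuation-sensitive as written, so normalize both texts first if those differences do not matter for your project.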

Features

Does the tool provide features useful to your research such as:

  • Languages - quality may vary dramatically across languages
  • Commenting or highlighting features
  • Time stamping or indexing
  • Identifying speakers
  • Summarizing
  • Sharing transcripts or translation credits with collaborators
User Experience
  • Do you find the tool’s interface easy to navigate and understand?
  • If you have a problem, can you access tech support? 
  • How long does the tool take to translate your source material? 
  • How does the tool export outputs, and can you customize the format?
Cost
  • Is the tool free and/or open source? Note that free tools might come with trade-offs (e.g., a lack of transparency, privacy, and/or usability). If it is free and proprietary, be sure to read the terms of service especially closely. 
  • If there is a cost, how is the cost calculated and how much is it? What happens to your account and data if you stop paying? Does the service have a limited free plan you can use?

Did You Read the Terms of Service?

Cloud-based services often have lengthy and dense terms of service. When evaluating translation tools, look for information about the service’s data security, retention, and privacy practices. You may also want to know whether the service can use your data to improve its AI, and whether you can opt out of this practice.

  • Some companies have dedicated policies separate from the Terms of Service. Look for a Data Retention Policy or Privacy Policy.
  • You can often request more information about a company’s security and privacy practices.
  • In some cases you can ensure your data is not used to train a platform’s AI model.

Additional Resources