Research Guides: Machines and Society: Setting Up Local Generative AI Tools

Introduction

Most popular AI tools are available online, but hosted platforms come with limitations. Privacy is one concern: everything you submit to an online platform is visible to its operators, which is unacceptable for some use cases. Availability is another: a platform may be offline when you need it, or it may be blocked in certain countries because of geopolitical restrictions.

An increasingly popular solution to these problems is to run generative AI tools on your own computer. This section outlines tools that enable offline image and text generation and points to their quickstart guides.

Image Generation

For local image generation, Stable Diffusion is currently the most viable option. There are several popular tools that provide access to Stable Diffusion:

Stable Diffusion WebUI: This is the most popular option for using Stable Diffusion locally. It offers a wide variety of extensions and features. However, the interface may be challenging for some users to navigate.
InvokeAI: This option gives professionals and enthusiasts a familiar, intuitive interface for creating images with Stable Diffusion, including features such as outpainting, inpainting, color sketches, and prompt matrices.
ComfyUI: ComfyUI provides a node/graph-based interface that allows users to construct complex image generation workflows without the need for coding.

Instead of installing these tools directly, you can also run them in Docker containers using this repository: https://github.com/AbdBarho/stable-diffusion-webui-docker.
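
Once one of these tools is running locally, it can also be driven from a script rather than the browser. The following is a minimal sketch, not an official client. It assumes a Stable Diffusion WebUI instance that was started with its API enabled (the --api flag) and is listening at http://127.0.0.1:7860; the address, the helper names, and the default parameter values are assumptions made for illustration.

```python
import base64
import json
import urllib.request

# Assumption: a Stable Diffusion WebUI instance started with its API enabled
# (the --api flag) is listening on this address.
WEBUI_URL = "http://127.0.0.1:7860"


def build_txt2img_payload(prompt, steps=20, width=512, height=512):
    """Return the JSON body for a text-to-image request."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}


def decode_images(response_json):
    """Decode the base64-encoded images in an API response into raw bytes."""
    return [base64.b64decode(img) for img in response_json.get("images", [])]


def generate_image(prompt, out_path="output.png", base_url=WEBUI_URL):
    """Send the prompt to the local server and save the first image returned."""
    body = json.dumps(build_txt2img_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/sdapi/v1/txt2img",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        images = decode_images(json.load(response))
    if not images:
        raise RuntimeError("The server returned no images.")
    with open(out_path, "wb") as f:
        f.write(images[0])
    return out_path
```

With the server running, calling generate_image("a watercolor painting of a lighthouse at dawn") should write output.png to the current directory. Because the request never leaves your machine, the prompt and the resulting image stay private.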

Text Generation

Local text generation is still maturing, and its output may be less stable and coherent than that of hosted platforms. It can nevertheless be a good alternative for certain use cases. Here are a few options for running a ChatGPT-style assistant locally:

GPT4All: A platform that provides pre-trained language models in various sizes, with downloads ranging from 3GB to 8GB. The models run with GPT4All's open-source software and have been trained on a large amount of data to generate responses that resemble human conversation. GPT4All is useful for tasks like writing assistance, answering questions, and building conversational AI. The platform also offers a desktop chat client, Python bindings, and a command-line interface to make the models easier to work with.
Text Generation WebUI: An open-source project that provides a web-based user interface for running large language models such as GPT-J, LLaMA, and GALACTICA. It offers three interface modes (default, notebook, and chat) and aims to be an easy-to-use, feature-rich, and extensible text generation tool.
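
As an illustration of the Python bindings mentioned above, here is a minimal sketch that assumes the gpt4all package is installed (for example with pip install gpt4all). The model file name is only a placeholder for whichever model you choose, and the helper names load_model and ask are hypothetical, not part of GPT4All.

```python
def load_model(model_name="orca-mini-3b-gguf2-q4_0.gguf"):
    """Load a local model; the file name here is a placeholder assumption."""
    # Imported inside the function so the rest of this module can be used
    # even when the gpt4all package is not installed.
    from gpt4all import GPT4All

    # By default the bindings download the model file on first use
    # if it is not already on disk.
    return GPT4All(model_name)


def ask(model, question, max_tokens=200):
    """Return the model's reply to one question, with outer whitespace removed."""
    return model.generate(question, max_tokens=max_tokens).strip()
```

A typical session would be model = load_model() followed by print(ask(model, "Summarize the benefits of running AI tools locally.")). After the model file has been downloaded, generation runs entirely on your own machine.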

To use these tools, you will need to download a language model. You can find language models on websites like Hugging Face and ModelScope.
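
Fetching a model file can be scripted as well. The sketch below relies only on the Python standard library and on the public download URL pattern that Hugging Face uses for files in a repository. The repository and file names in the usage note are placeholders, and the helper names are hypothetical. For gated or very large models, Hugging Face's own huggingface_hub library is likely the more robust choice.

```python
import shutil
import urllib.parse
import urllib.request


def hf_file_url(repo_id, filename, revision="main"):
    """Build the direct download URL for one file in a Hugging Face repository."""
    quoted = urllib.parse.quote(filename)  # escapes spaces, keeps "/" in subpaths
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quoted}"


def download_model(repo_id, filename, dest_path, revision="main"):
    """Stream the file to dest_path without loading it all into memory."""
    url = hf_file_url(repo_id, filename, revision)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

For example, download_model("some-org/some-model", "model.gguf", "model.gguf") would save the file next to your script, where "some-org/some-model" stands in for a real repository name. Since model files are often several gigabytes, check your free disk space before starting.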

Contact

Utku Ege Tuluk
Senior Associate of Emerging Technologies
uet200@nyu.edu