Generative AI is rapidly transforming the landscape of visual media, design, and art. Because tools and capabilities evolve quickly, this guide offers a snapshot of a constantly changing field: it introduces key concepts, categories of tools, common applications, access methods, and important considerations when using generative AI for visual creation.
Page last updated: June 2025
Generative AI is becoming an increasingly integral part of the daily creative workflows of artists and designers. While the possibilities are constantly expanding, this section highlights some of the most common and practical applications across various fields, from art and storytelling to branding, architecture, and beyond. These examples illustrate new ways to ideate, prototype, and communicate visually. Keep in mind that these are just a few of the many creative use cases emerging in this rapidly evolving space.
Note: This field is rapidly changing. The tools listed here serve as examples and are not exhaustive. Users should always check the terms of service for any tool they use.
A. Pixel-Based (Raster) Image Generation
Overview: Creating or modifying images composed of pixels, typically from text prompts (text-to-image) or based on existing images (image-to-image).
Key Features and Techniques:
Example Platforms and Models:
B. Multimodal AI Platforms with Visual Capabilities
Overview: AI systems that understand and generate information across multiple modalities (text, images, audio, code). They often provide conversational interfaces for performing visual tasks.
Example Platforms:
C. Vector Graphics Generation
Overview: Creating scalable vector graphics (SVGs, etc.) that are resolution-independent, suitable for logos, icons, and illustrations.
Example Platforms:
D. Video Generation
Overview: Generating video clips from text prompts, images, or existing video footage.
Example Platforms and Models:
E. 3D Model Generation
Overview: Creating three-dimensional models from text prompts or 2D images.
Example Platforms and Models:
F. Code Generation for Visuals
Overview: Using AI to generate code for web design (HTML, CSS, JavaScript), data visualizations (e.g., D3.js, Python libraries), generative art (e.g., Processing, p5.js), shaders, and other visual applications. A short example appears at the end of this subsection.
Example Platforms (General LLMs capable of this):
Applications that can make use of LLMs through code:
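To make this concrete, below is the kind of short generative-art script such a tool might produce from a plain-language prompt. It is written in Python with matplotlib; the prompt, library choice, and all parameter values are illustrative assumptions, not output from any particular tool.

# A sketch of the generative-art code an LLM might produce for the prompt
# "draw a field of randomly colored, overlapping circles".
import random
import matplotlib.pyplot as plt

random.seed(42)  # fixed seed so the composition is reproducible
fig, ax = plt.subplots(figsize=(6, 6))

for _ in range(200):
    # Random position, size, and color for each circle
    x, y = random.uniform(0, 1), random.uniform(0, 1)
    radius = random.uniform(0.01, 0.08)
    color = (random.random(), random.random(), random.random())
    ax.add_patch(plt.Circle((x, y), radius, color=color, alpha=0.5))

ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_aspect("equal")
ax.axis("off")
plt.savefig("generative_circles.png", dpi=150, bbox_inches="tight")

In practice, you would iterate on a script like this conversationally, asking the model to adjust palettes, shapes, or composition rules until the output matches your intent.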
G. Presentation and Document Design Assistance
Overview: AI tools that assist in generating, structuring, and designing presentations, reports, and other visual documents.
Example Platforms:
H. AI-Powered Design Platforms and "Agents"
Overview: Integrated design platforms that leverage AI across various workflows, sometimes acting as "agents" to automate or assist in complex visual tasks from start to finish.
Example Platforms:
Generative AI tools for visual creation can be accessed in several ways, each with its own advantages, cost implications, and technical requirements. Understanding these options can help you choose the best approach for your specific needs, whether you're a beginner looking for ease of use or an expert needing deep customization. Key factors to consider when selecting a tool include its primary function (e.g., image, video, 3D), ease of use, cost, available features, desired output quality, and the level of control you require over the generation process.
A. Online Platforms and Web-Based Services
Description: Ready-to-use tools accessible via a web browser, often subscription-based (SaaS) or offering freemium tiers. This is the most common and user-friendly entry point.
Examples: Midjourney (via Discord), ChatGPT (with DALL·E 3), Canva, Adobe Firefly website, Recraft, Runway.
B. Local Deployment
Description: Running AI models directly on your own computer hardware. This offers maximum control, customization, and privacy, but also requires more technical expertise and powerful hardware.
Examples: Stable Diffusion (using interfaces like Automatic1111 WebUI, ComfyUI, Fooocus), open-source models from Hugging Face.
Stable Diffusion DIY:
Configuring Stable Diffusion on your own device is a fairly technical undertaking. At the very least, you need to be comfortable using command-line tools.
Additionally, you need a computer with a dedicated Nvidia GPU to set up Stable Diffusion. Support for Apple silicon (M-series) devices and AMD GPUs is currently not widespread, but you can use these resources as a starting point to find tools that do support your hardware.
You need to install Docker before using these resources. You can get Docker for free here.
The Stable Diffusion web UI can be installed using this GitHub repository; simply follow its setup guide.
The commands below are for your reference.
# Clone the repository containing the Dockerized web UI
git clone https://github.com/AbdBarho/stable-diffusion-webui-docker
# Download the required model files (this can take a while)
docker compose --profile download up --build
# Build and launch the web UI using the "auto" profile
docker compose --profile auto up --build
The "auto" profile is the most feature-rich option and is the suggested one.
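If you prefer a scriptable alternative to the web UI, open-source models from Hugging Face can also be run locally through the diffusers Python library. The sketch below assumes an Nvidia GPU and that the diffusers, transformers, and torch packages are installed; the checkpoint name is one commonly used example and may change over time, so check the Hugging Face hub for current model IDs.

# Minimal local text-to-image generation with Hugging Face diffusers.
import torch
from diffusers import StableDiffusionPipeline

# Load a publicly available Stable Diffusion checkpoint (illustrative choice).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision reduces GPU memory use
)
pipe = pipe.to("cuda")  # requires a CUDA-capable Nvidia GPU

# Generate one image from a text prompt and save it to disk.
image = pipe("a watercolor sketch of a lighthouse at dawn").images[0]
image.save("lighthouse.png")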
C. APIs (Application Programming Interfaces)
Description: For developers and businesses to integrate generative AI capabilities into their own custom applications, websites, or workflows.
Examples: OpenAI API (DALL·E, GPT-4V), Stability AI API, Google Cloud Vertex AI (Gemini, Imagen), Anthropic API (Claude).
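As a brief illustration of API-based integration, the sketch below calls OpenAI's image-generation endpoint from Python. It assumes the openai package is installed and an OPENAI_API_KEY environment variable is set; model names, sizes, and response formats change over time, so consult the current API documentation.

# Minimal image-generation request via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.images.generate(
    model="dall-e-3",  # illustrative model name; check the current docs
    prompt="an isometric illustration of a sunlit reading room",
    size="1024x1024",
    n=1,
)
print(result.data[0].url)  # URL of the generated image

The same pattern (authenticate, send a prompt plus parameters, receive an image or a URL) applies to most providers' image APIs, which is what makes them straightforward to embed in custom applications and workflows.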
D. Plugins and Software Integrations
Description: Generative AI features embedded within existing software applications (e.g., design tools, productivity suites, browsers), extending their functionality.
Examples: Adobe Photoshop (Generative Fill), Microsoft 365 Copilot, Figma plugins, VS Code, etc.
A. AI Art and Copyright Infringement
Awareness of copyright: Under many generators' terms, you are likely not the exclusive owner of the images you create, and there could be legal risks if the AI was trained on copyrighted material without permission.
Varying terms: Different AI generators have different rules for commercial use, ownership, and licensing. AI art generation is an emerging field; it’s crucial to read the terms and conditions before using or distributing any artwork created through AI.
Example with Midjourney: Under Midjourney's copyright terms (as of this guide's last review; always check the current terms), non-paying users are often granted an asset license under the Creative Commons Noncommercial 4.0 Attribution license, while paid subscribers may have different, often more permissive, commercial rights.
Responsible AI and content verification: Tools like SynthID support ethical AI use by embedding invisible watermarks in generated images, making it easier to verify and track AI-created content without altering its appearance.
B. Concerns Regarding AI-Generated Images and Content
Inappropriate content: Like human-created imagery, AI-generated content can be inappropriate or harmful, and AI can amplify the problem through its speed of production and its ability to mimic specific characteristics.
Examples of inappropriate content (based on common guidelines like Midjourney's):
Artist protests and originality: Concerns from artists about AI diminishing the value of traditional art forms, originality, and human creativity. Debates around AI models being trained on artists' work without consent or compensation.
Bias perpetuation: AI models can learn and perpetuate harmful stereotypes or biases present in their training data.
C. How to Protect Your Data
Data privacy: Be cautious when uploading personal photos, proprietary artwork, or other sensitive visual data to online AI tools. Understand how the service might use your uploaded data.
Limiting exposure: The most secure way to protect data is not to upload it to public or untrusted services.
Protective technologies: Researchers are working on methods to protect visual data from unauthorized AI training. For instance, tools like Fawkes or Glaze make tiny, often human-imperceptible pixel-level changes to images to "cloak" them, confusing AI models trying to learn artistic styles or identify individuals.
A project would usually follow the basic steps below:
The rise of AI is reshaping how visual creation works. In a traditional schema, a one-to-one correspondence prevailed between a given creative task and its supporting tool; for example, a three-dimensional (3D) rendering was generated exclusively with 3D modeling software. AI disrupts this clear mapping of tasks to tools and predetermined workflows. Moreover, the velocity at which new AI functionalities emerge has shortened the iteration cycle for workflow innovation from months or years to days, thereby demanding continuous recalibration of methodological frameworks.
Ways to Integrate AI
Selective augmentation of conventional workflows
Under this approach, the conventional visual creation pipeline remains intact, and AI tools are introduced opportunistically to expedite or enhance discrete stages of the process. For instance, if an AI-based image-generation model demonstrably reduces the time required for preliminary concept sketches, designers may opt to incorporate it solely for that segment of the workflow (e.g., brainstorming or prototyping).
AI-centric workflow design
In contrast, a more radical approach involves reconceiving the entire project workflow around AI capabilities from the outset. This may entail identifying tasks that AI currently performs with superior efficiency—such as rapid style exploration or automated layout generation—and structuring the sequence of operations so that these AI-driven tasks become foundational. Conversely, components that rely on uniquely human faculties (e.g., aesthetic judgment, cultural contextualization, or highly nuanced brand messaging) are delineated explicitly as manual interventions.
End-to-end AI-driven workflows
A third extreme envisions relegating the entire production pipeline to AI—potentially requiring minimal human oversight beyond specifying high-level objectives. In this mode, prompts, iterative refinements, and final quality checks are all managed through AI agents capable of autonomously executing discrete subtasks (e.g., generating iteration variants, evaluating visual coherence, or adjusting technical specifications).
Aside from setting the starting point, AI can potentially handle most stages of creation. The challenge is deciding which approach fits each project.
Choosing an Approach: Quality, Time, Resources
When selecting an approach, consider three factors:
Quality
Time
Resources
Often, AI saves time on routine tasks but may still require manual edits. If AI cannot meet a project’s specific needs, human-led work may be more efficient and cost-effective.
Balancing Trade-offs and Looking Ahead
From a business standpoint, the ideal workflow maximizes quality for the lowest cost. Currently, many specialized demands still favor human effort—especially when originality matters most. In contrast, tasks like initial mockups or basic background generation are well suited to AI.
In practice, most teams adopt a hybrid approach, letting AI handle repetitive or volume-driven tasks while humans focus on creative judgment and final refinements. Over time, these hybrid workflows may become standard, particularly in areas where AI matches or exceeds human speed and consistency.
In the long term, AI is expected to permeate every phase of the visual creation process. As this integration deepens, new demands, such as authentic creativity, handcrafted techniques, and genuine originality, are likely to emerge. Time will tell.
Xinyi Zhu
Motion Graphic Designer/Animator
xz3366@nyu.edu