
Introduction to Generative AI

A new step towards creativity


Last updated 2 years ago

A powerful new class of foundation AI models is making it possible for machines to write, code, draw, animate, compose, and create with credible and sometimes superhuman results.

AI has long been, and still is, very good at analyzing things, in some tasks even better than humans. AI models excel at analyzing huge sets of data and finding patterns in them for a multitude of use cases:

  • Classifying or predicting values from input data of all kinds

    • Structured/tabular data

    • Unstructured data (text, signals, etc.)

    • Images, video, sound, perceptual signals

This could be called Analytical AI. But humans are also good at creating: literature, art, music, and more.

Up until recently, machines had no chance of competing with humans at creative work—they were relegated to analysis and rote cognitive labor. But machines are just starting to get good at creating sensical and beautiful things. This new category is called “Generative AI,” meaning the machine is generating something new rather than analyzing something that already exists.

Every industry that requires humans to create original work—from social media to gaming, advertising to architecture, coding to graphic design, product design to law, marketing to sales—is up for reinvention.

Certain functions may be completely replaced by generative AI, while others are more likely to thrive from a tight iterative creative cycle between human and machine—but generative AI should unlock better, faster and cheaper creation across a wide range of end markets.

The inflection points

Neural networks are not new at all. Their foundations date back to the late 1940s, and many of their challenges, training in particular, were gradually resolved.

A first turning point came in 2012, when the AlexNet network, trained using GPUs, far outperformed other approaches on the difficult problem of classifying images.

The move from analytical uses, such as classification, to creative tasks began early with architectures such as autoencoders, which made it possible to generate new images and content by injecting random numbers into their latent space.
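The idea of generating content by injecting random numbers into a latent space can be sketched in a few lines. This is only an illustration under stated assumptions: the decoder here is a single linear layer with random (untrained) weights standing in for the decoder half of a trained autoencoder, and the names `LATENT_DIM`, `IMAGE_SIDE`, `decode` and `generate` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 2   # size of the latent space (assumed for illustration)
IMAGE_SIDE = 8   # toy 8x8 "images" (assumed for illustration)

# Stand-in for a trained decoder: one linear layer followed by a sigmoid.
# In a real autoencoder these weights would come from training.
W = rng.normal(size=(LATENT_DIM, IMAGE_SIDE * IMAGE_SIDE))
b = np.zeros(IMAGE_SIDE * IMAGE_SIDE)

def decode(z):
    """Map latent vectors of shape (n, LATENT_DIM) to images with pixels in [0, 1]."""
    logits = z @ W + b
    pixels = 1.0 / (1.0 + np.exp(-logits))
    return pixels.reshape(-1, IMAGE_SIDE, IMAGE_SIDE)

def generate(n):
    """Create n new images by injecting random numbers into the latent space."""
    z = rng.normal(size=(n, LATENT_DIM))
    return decode(z)

images = generate(4)
```

With a trained decoder, nearby latent vectors tend to decode to similar content, which is what makes random sampling a usable way to produce new variations.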

Another turning point was the GAN (generative adversarial network), introduced by Ian Goodfellow and colleagues in 2014. Two networks compete: one creates new content from random numbers, while the other judges the adequacy of the generated content and penalizes the first, which is retrained accordingly. This competition makes it possible to generate new images, styles, and other types of content, either from totally random information or assisted by a creator.
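The competition between the two networks can be sketched as an alternating training loop. The following is a deliberately tiny toy, not a practical GAN: both "networks" are single-parameter-pair functions on one-dimensional data, the gradients are written by hand, and the target distribution, learning rate, batch size and step count are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    # Clipping keeps exp() numerically safe for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(s, -30.0, 30.0)))

def sample_real(n):
    # Toy "real content": numbers drawn from N(4, 1) (assumed).
    return rng.normal(loc=4.0, scale=1.0, size=n)

# Generator G(z) = a * z + c turns random numbers into candidate content.
a, c = 1.0, 0.0
# Discriminator D(x) = sigmoid(w * x + b) scores how "real" x looks.
w, b = 0.1, 0.0

LR, BATCH, STEPS = 0.01, 64, 200

for _ in range(STEPS):
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0,
    # i.e. descend on -log D(x_real) - log(1 - D(x_fake)).
    x_real = sample_real(BATCH)
    z = rng.normal(size=BATCH)
    x_fake = a * z + c
    d_real = sigmoid(w * x_real + b)
    d_fake = sigmoid(w * x_fake + b)
    grad_w = np.mean(-(1.0 - d_real) * x_real + d_fake * x_fake)
    grad_b = np.mean(-(1.0 - d_real) + d_fake)
    w -= LR * grad_w
    b -= LR * grad_b

    # Generator step: the discriminator's verdict is the training signal.
    # Descend on -log D(G(z)) with respect to a and c.
    z = rng.normal(size=BATCH)
    x_fake = a * z + c
    d_fake = sigmoid(w * x_fake + b)
    grad_a = np.mean(-(1.0 - d_fake) * w * z)
    grad_c = np.mean(-(1.0 - d_fake) * w)
    a -= LR * grad_a
    c -= LR * grad_c

# New content generated from fresh random numbers.
samples = a * rng.normal(size=5) + c
```

Real GANs replace both functions with deep networks and rely on automatic differentiation, but the alternation between a discriminator update and a generator update is the same.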

The transformer architecture, introduced by Google in 2017, took text generation and its creativity to new heights. The technique was soon applied to other domains beyond text.

The combination of techniques such as transformers and diffusion models led to a real game changer in 2022, with the development of powerful foundation models for AI content generation in almost any field: image, video, animation, text, music, sound, etc.

The current Generative AI landscape

The always difficult task of predicting the future

A general view of current Generative AI possibilities and main actors

OpenAI created and introduced GPT-3 (Generative Pre-trained Transformer 3) in 2020. This LLM leverages deep learning to generate text, code, images, etc.

DALL-E, an AI image generator by OpenAI, is a “neural network that creates images from text captions for a wide range of concepts expressible in natural language. DALL·E is a 12-billion parameter version of GPT-3 trained to generate images from text descriptions, using a dataset of text–image pairs.”

Microsoft invested $1B in OpenAI, and its new Microsoft Designer uses AI-generated images by DALL-E 2.

GitHub Copilot, essentially autocomplete for coders, is another application of this technology. GitHub Copilot uses the “OpenAI Codex to suggest code and entire functions in real-time, right from your editor.”

Stable Diffusion is a “latent text-to-image diffusion model capable of generating photo-realistic images given any text input, cultivates autonomous freedom to produce incredible imagery, empowers billions of people to create stunning art within seconds.”

Stability AI just raised a $101M funding round and has over 5,000 A100 graphics processors, making it one of the biggest supercomputers in the world, trained with 2B images. Stability AI is open source and is made up of a developer community “with over 200,00 members who are building AI for the future.”

“DreamStudio by Stability AI is a new AI system powered by Stable Diffusion that can create realistic images, art and animation from a description in natural language.”

Other tech giants like Google and Meta have their own generative AI models:

Google has Parti (Pathways Autoregressive Text-to-Image model), “an autoregressive text-to-image generation model that achieves high-fidelity photorealistic image generation and supports content-rich synthesis involving complex compositions and world knowledge.”

Meta has ‘Make-a-scene’, which takes not only text prompts but also sketches to create high-definition visual masterpieces on a digital canvas.

I am excited about the future of content that can be personalized and contextualized. I believe that this technology will not replace but rather enhance human creativity and am excited to see many different and unique use cases of generative AI.

Here are a couple of projects that I’ve come across that are using this technology:

Lex, a word processor with artificial intelligence baked in, so you can write faster.

Chula, an AI assistant that creates graphics for presentations.

Metaphor, a search engine based on generative AI, the same sorts of techniques behind DALL-E 2 and GPT-3.

Runway, next-generation content creation with artificial intelligence.

For more, MagicTools has a collection of 200+ AI tools that you can explore!

Assisting tools

References

https://www.sequoiacap.com/article/generative-ai-a-creative-new-world/
Generative AI is knocking many doors
Search for 'Generative AI' on Google