
GPT is something really wide


In the OpenAI API (), GPT-3 is not a single model but a fleet of models that differ in size, in the data used, and in the training strategy. The core GPT-3 model () is the source of all the derived models.

In terms of size, OpenAI offers several models that trade the quality of natural language generation against inference speed; a minimal call through the completions endpoint is sketched below. The models are named after scientists and inventors:

- Davinci: 175B parameters
- Curie: 6.7B parameters
- Babbage: 1B parameters
- Cushman: 12B parameters

Ada is missing from this list; it is supposed to be faster (so presumably smaller), but OpenAI does not document its size.

The models can be fine-tuned in a supervised manner on different datasets. The Codex family is specifically designed to generate code by fine-tuning on data from public GitHub repositories (). Most of the text generation models (text-davinci-001, text-davinci-002, text-curie-001, text-babbage-001) are actually GPT-3 models fine-tuned with human-labeled data as well as with a distillation of the best completions from all of their models. OpenAI describes those models as InstructGPT models (), although the training process differs slightly from the one described in the paper. Text-davinci-002 is specifically described by OpenAI as the Codex model code-davinci-002 fine-tuned with text data, so it presumably performs well on both code and text. Text-davinci-003 is a full InstructGPT model: it is text-davinci-002 further refined with Proximal Policy Optimization (PPO) (), a reinforcement learning algorithm. The "GPT-3.5" label refers to models trained on a blend of text and code dating up to Q4 2021, as opposed to October 2019 for the other models.

OpenAI has been using GPT-3 for many specific applications. For example, they trained text and code alignment models (text-similarity-davinci-001, text-similarity-curie-001) to learn embedding representations of those data (), in a similar manner to the CLIP model () powering DALL-E 2 and Stable Diffusion. They developed a model to summarize text with labeled data in a very similar manner to how InstructGPT was developed (). They also provide a way to extract the latent representation produced by GPT-like models (text-embedding-ada-002); a short embedding example is sketched below. Finally, ChatGPT is a sibling model to InstructGPT trained from GPT-3.5, so it is probably using text-davinci-003 as a seed.
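To see how the different members of the GPT-3 fleet are selected through the same interface, here is a minimal sketch using the legacy `openai` Python package (pre-1.0 interface) and the completions endpoint. The model names, prompt, and the `OPENAI_API_KEY` environment variable are only illustrative assumptions, not part of the text above.

```python
# Minimal sketch: querying two GPT-3 family models through the (legacy) completions endpoint.
# Assumes the pre-1.0 `openai` package and an OPENAI_API_KEY environment variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = "Explain in one sentence what a language model is."

for model in ["text-davinci-003", "text-curie-001"]:  # larger vs. smaller/faster model
    response = openai.Completion.create(
        model=model,
        prompt=prompt,
        max_tokens=64,
        temperature=0.7,
    )
    print(model, "->", response["choices"][0]["text"].strip())
```

Running the same prompt against Davinci and Curie makes the quality/speed trade-off mentioned above easy to observe in practice.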
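The supervised fine-tuning mentioned above was also exposed to users through a fine-tunes endpoint. The sketch below assumes the legacy fine-tuning API (prompt/completion JSONL files) and a hypothetical `my_dataset.jsonl` file; it is an illustration of the workflow, not OpenAI's internal training procedure.

```python
# Minimal sketch of the legacy supervised fine-tuning workflow.
# Assumptions: pre-1.0 `openai` package, a hypothetical my_dataset.jsonl file with
# {"prompt": ..., "completion": ...} lines, and the base "davinci" model.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# 1. Upload the training file.
training_file = openai.File.create(
    file=open("my_dataset.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Launch the fine-tuning job on a base GPT-3 model.
job = openai.FineTune.create(
    training_file=training_file["id"],
    model="davinci",
)
print("Fine-tune job id:", job["id"])
```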
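Finally, the embedding model mentioned above (text-embedding-ada-002) can be used to extract latent representations and compare texts by cosine similarity, in the same spirit as the CLIP-style alignment models. Here is a small sketch, again assuming the legacy `openai` package plus numpy; the two input sentences are arbitrary examples.

```python
# Minimal sketch: extracting latent representations with text-embedding-ada-002
# and comparing two texts by cosine similarity (assumes pre-1.0 `openai` and numpy).
import os
import numpy as np
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

resp = openai.Embedding.create(
    model="text-embedding-ada-002",
    input=["A photo of a cat", "An image showing a small kitten"],
)
a = np.array(resp["data"][0]["embedding"])
b = np.array(resp["data"][1]["embedding"])

cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print("Cosine similarity:", cosine)  # closer to 1.0 means more semantically similar texts
```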

https://lnkd.in/gxbtuBk3
https://lnkd.in/g8MEAuUi
https://lnkd.in/g2HF42gi
https://lnkd.in/gnt9K9pu
https://lnkd.in/gsDTWtga
https://lnkd.in/gCwdbUdd
https://lnkd.in/eHbmBb2t
https://lnkd.in/gdrzdWu3