Generative AI

Information and resources on generative artificial intelligence

Introduction

This guide provides information and resources to support safe, ethical, and effective engagement with generative AI. Read below and use the tabs above to learn about gen AI and how to use it in your work, understand best practices and controversies, and find sources to expand your knowledge.

Gen AI is an especially dynamic field: its technology, issues, and related guidance are changing rapidly. To help our community keep current, we welcome your suggestions for these pages; please contact our librarians.

What is Generative Artificial Intelligence (Gen AI)?

"Generative AI can be thought of as a machine-learning model that is trained to create new data, rather than making a prediction about a specific dataset. A generative AI system is one that learns to generate more objects that look like the data it was trained on." The article "Explained: Generative AI" from MIT News offers a breakdown of this popular technology.

Things to Know

REMEMBER: It is YOUR responsibility to be accountable for content generated by AI that you incorporate into your work! Generative AI output may be inaccurate and biased. It is up to you to verify anything AI generates.

REMEMBER: For security reasons, students, faculty, and staff shall not input Controlled Unclassified Information (CUI), personally identifiable information (PII), classified information, or any otherwise restricted information into generative AI tools. 

REMEMBER: Transparency about gen AI use is a DoD priority and is central to emerging academic integrity norms. Ask before using gen AI in coursework, research, writing, and publishing. Make it a habit to disclose use.

Read the NPS Interim Guiding Principles for use of Generative Artificial Intelligence (AI) Tools to understand NPS expectations. 


How Does Generative AI Work?

"Generative AI refers to deep-learning models (also called deep neural networks) that can create new content, rather than making a prediction about specific existing datasets. Gen AI systems are fed a VAST amount of data (web pages, books, organizational information, etc.) and “learn” to generate statistically probable outputs when prompted. At a high level, generative models encode a simplified representation of their training data and draw from it to create a new work that’s similar, but not identical, to the original data" (IBM).
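The idea of "learning to generate statistically probable outputs" can be illustrated with a deliberately tiny sketch. The code below is not a neural network; it is a bigram model that counts which word tends to follow which in a short, made-up training text (the text, the start word, and the length limit are assumptions for illustration) and then generates new text by sampling likely continuations.

```python
import random
from collections import defaultdict

# Toy illustration only: a bigram model "learns" word-to-word statistics
# from its training text, then samples a probable next word repeatedly.
training_text = (
    "the model learns patterns from data and the model generates "
    "new text from the patterns it learns from data"
)

# "Training": record every word that follows each word in the text.
counts = defaultdict(list)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    counts[current].append(nxt)

# "Generation": start from a word and repeatedly sample an observed successor.
def generate(start, length, seed=0):
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:          # dead end: this word was never followed
            break
        out.append(rng.choice(followers))
    return " ".join(out)

print(generate("the", 8))
```

Like a gen AI system, the model can only recombine patterns present in its training data; unlike a real system, it has no deeper representation than raw word-pair counts.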

An article from the World Economic Forum explains:

Generative AI refers to a category of artificial intelligence (AI) algorithms that generate new outputs based on the data they have been trained on. Unlike traditional AI systems that are designed to recognize patterns and make predictions, generative AI creates new content in the form of images, text, audio, and more.

Generative AI uses a type of deep learning called generative adversarial networks (GANs) to create new content. A GAN consists of two neural networks: a generator that creates new data and a discriminator that evaluates the data. The generator and discriminator work together, with the generator improving its outputs based on the feedback it receives from the discriminator until it generates content that is indistinguishable from real data.
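The generator-discriminator feedback loop described above can be sketched in miniature. The code below is a toy one-dimensional illustration, not a production GAN: the "real" data distribution (a normal distribution centered at 3), the one-parameter generator, the logistic discriminator, and the learning-rate settings are all assumptions chosen for simplicity.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

rng = random.Random(42)

# Toy 1-D GAN: real data ~ Normal(3, 1). The generator shifts standard
# normal noise by a learnable offset b; the discriminator is a logistic
# unit D(x) = sigmoid(a * (x - c)) scoring how "real" a sample looks.
a, c, b = 1.0, 0.0, 0.0     # discriminator params a, c; generator param b
lr, steps, batch = 0.02, 4000, 16

for _ in range(steps):
    da = dc = db = 0.0
    for _ in range(batch):
        xr = rng.gauss(3.0, 1.0)        # real sample
        xf = rng.gauss(0.0, 1.0) + b    # fake sample from the generator
        # Discriminator gradients: minimize -log D(xr) - log(1 - D(xf))
        g_r = sigmoid(a * (xr - c)) - 1.0
        g_f = sigmoid(a * (xf - c))
        da += g_r * (xr - c) + g_f * (xf - c)
        dc += -a * (g_r + g_f)
        # Generator gradient: minimize -log D(xf), i.e. try to fool D
        db += (sigmoid(a * (xf - c)) - 1.0) * a
    a -= lr * da / batch
    c -= lr * dc / batch
    b -= lr * db / batch

print(f"generator offset b = {b:.2f} (real data centered at 3)")
```

As the loop runs, the discriminator's feedback pushes the generator's offset toward the center of the real data, after which the two distributions overlap and the discriminator can no longer tell them apart, mirroring the dynamic the quoted passage describes.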

Generative AI Glossary

  • Algorithm: A set of rules or instructions that tell a machine what to do with the data input into the system.

  • Artificial Intelligence (AI): Artificial intelligence (AI) refers to computer systems capable of performing complex tasks that historically only a human could do, such as reasoning, making decisions, or solving problems. 

  • Bayesian Networks: Probabilistic graphical models used in generative AI to represent uncertain relationships between variables and make probabilistic predictions.

  • Conditional Generation: Refers to generating data samples conditioned on specific input or context, allowing for targeted and controlled generation.

  • Deep Belief Networks (DBNs): Generative models composed of multiple layers of stochastic, latent variables, used for unsupervised learning tasks.

  • Encoder-Decoder Architecture: A generative AI model comprising an encoder to convert input data into a latent representation and a decoder to generate output data from the latent representation.

  • Fuzzy Logic: A mathematical framework used in generative AI to handle uncertainty and imprecise reasoning in decision-making processes.

  • Generative AI: Generative artificial intelligence is artificial intelligence capable of generating text, images or other data using generative models, often in response to prompts.

  • Generative Models: Algorithms that learn to generate data similar to the training data, providing valuable insights into the underlying distribution of the data.

  • Hallucination: A situation where an AI system produces fabricated, nonsensical, or inaccurate information. The wrong information is presented with confidence, which can make it difficult for the human user to know whether the answer is reliable.

  • Hierarchical Models: Architectures that represent data in multiple levels of abstraction, capturing complex patterns and dependencies in the data.

  • Inference: Refers to using trained models to make predictions or generate new data based on observed input.

  • Joint Probability Distribution: Represents the probability of multiple random variables occurring together, enabling the modeling of complex relationships between variables.

  • Kullback-Leibler Divergence: Quantifies the difference between two probability distributions; it is commonly used in training generative AI models.

  • Large Language Model (LLM): A computer program that has been trained on massive amounts of text data such as books, articles, website content, etc. An LLM is designed to understand and generate human-like text based on the patterns and information it has learned from its training.

  • Latent Space: Represents a lower-dimensional space where data is mapped, allowing for efficient representation and generation of complex data.

  • Markov Chain Monte Carlo Methods (MCMC): Used to sample from complex probability distributions and estimate properties of the distribution.

  • Natural Language Generation (NLG): An application of generative AI that produces human-like text from structured data or other input forms.

  • Natural Language Processing (NLP): The ability of machines to use algorithms to analyze large quantities of text, allowing the machines to simulate human conversation and to understand and work with human language.

  • Overfitting: Occurs when a model performs well on the training data but fails to generalize to unseen data, resulting in poor performance.

  • Probability Density Function: Used to describe the likelihood of different outcomes occurring.

  • Prompt: Instructions or queries entered into an artificial intelligence (AI) interface to elicit a response; prompts often consist of keywords and phrases.

  • Prompt Chaining: The ability of an AI to use information from previous prompts and responses to inform its subsequent responses.

  • Quantum Generative Models: Leverage principles from quantum computing to generate data samples.

  • Recurrent Neural Networks (RNNs): A generative AI model with feedback connections, allowing them to process sequential data and generate sequences.

  • Sampling: Generates new data points from a trained generative model to explore the learned distribution.

  • Temperature: A parameter that controls how random a language model's output is. A higher temperature means the model takes more risks.

  • Token: The building block of text that a chatbot uses to process and generate a response. For example, the sentence "How are you today?" might be separated into the following tokens: ["How", "are", "you", "today", "?"]. Tokenization helps the chatbot understand the structure and meaning of the input.

  • Topic Modeling: Discovering latent topics or themes in a collection of text documents.

  • Unsupervised Learning: Training models on data without explicit labels, enabling them to discover patterns and structures in the data.

  • Variational Autoencoders: Generative models that combine autoencoders and variational inference elements to learn a compact latent representation of data and generate new samples.

  • Wasserstein Generative Adversarial Networks (WGANs): Variant of GANs that use the Wasserstein distance as a metric to improve training stability and generate high-quality samples.

  • Zero-Shot Learning: Involves training models to perform tasks on which they have not been explicitly trained, allowing them to generalize to new tasks.
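Two glossary entries above, Token and Temperature, can be made concrete with a short sketch. The splitting rule, the candidate-token scores, and the temperature values below are invented for illustration; real chatbots use learned subword tokenizers and scores produced by the model itself.

```python
import math

# Token: split text into the pieces a model processes individually.
def naive_tokenize(text):
    # Separates words and trailing punctuation (far cruder than real tokenizers).
    tokens = []
    for word in text.split():
        if word and word[-1] in ".?!":
            tokens.extend([word[:-1], word[-1]])
        else:
            tokens.append(word)
    return tokens

print(naive_tokenize("How are you today?"))  # ['How', 'are', 'you', 'today', '?']

# Temperature: rescale a model's raw scores before sampling the next token.
def temperature_probs(scores, temperature):
    scaled = [s / temperature for s in scores]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.2]                 # made-up scores for three candidate tokens
low = temperature_probs(scores, 0.5)     # low temperature: nearly deterministic
high = temperature_probs(scores, 2.0)    # high temperature: closer to uniform
print(max(low), max(high))
```

At low temperature the top-scoring token dominates the probabilities; at high temperature the probabilities flatten, so the model "takes more risks" when sampling.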


Sources:

Carnegie Mellon University, Heinz College: Artificial Intelligence, Explained

Coursera: Generative AI Definitions