12 AI Terms You (and Your Flirty Chatbot) Should Know by Now

Generative AI is arguably the greatest technological advance of the digital age. But do you really understand it? Here's a guide to a dozen key GenAI concepts.

Perspective

October 3, 2024

With the meteoric rise of GenAI in the last two years, from data-scientist discussion groups to mainstream news coverage, one thing has become crystal-clear: It’s ChatGPT’s world – we’re just here to supply the prompts.

The pace at which GenAI tools have evolved is truly astonishing and shows no signs of slowing. By typing a few words into a chatbot, anyone can now generate sophisticated research reports, instant meeting summaries, camera-ready artwork, bug-free computer code, even dating app profiles and flirty texts – and much more.

That “much more” promises a wave of opportunities for enterprises, and new attack vectors for adversaries, as well as new ways of combating those attacks. Fully understanding this technology’s capabilities and limitations has become table stakes for business leaders and information security professionals.

The future of IT and security is autonomous. But most organizations don’t know which manual processes are easy to eliminate. This is where you start.

The key thing to remember is that, while GenAI chatbots may seem like magic, they’re really just extremely sophisticated prediction engines.

Tools like ChatGPT, Gemini, Copilot, and others rely on machine learning and large language models (LLMs) – complex neural networks trained on billions of documents, images, media files, and software programs. By understanding the meaning and context of language, LLMs are able to recognize patterns, which allows them to predict what words, pictures, sounds, or code snippets are likely to appear next in a sequence. This is how GenAI tools can write reports, compose music, generate short films, or hack code better (or at least faster) than most humans can, all in response to simple natural-language prompts.

But just because your colleagues are throwing around terms like LLM and GPT in meetings doesn’t mean that they (or, ahem, you) really understand them. Here’s an informal glossary of key concepts you need to know, from AGI to ZSL.

1. Artificial General Intelligence (AGI)

The ultimate manifestation of AI has already played a featured role in dozens of apocalyptic movies. AGI is the point at which machines become capable of original thought and either a) save us from our worst impulses, or b) decide they’ve had enough of us puny humans. While some AI experts, like “godfather of AI” Geoffrey Hinton, have warned about this, others sharply disagree about whether AGI is even possible, let alone when it might arrive.

What to remember: To know for sure if AGI is on the horizon, you’ll need to travel back into the past and ask Sarah Connor.

AGI is the point at which machines become capable of original thought and either a) save us from our worst impulses, or b) decide they’ve had enough of us puny humans.

2. Data poisoning

By introducing malicious data into the repositories used to train an AI model, adversaries can force a chatbot to misbehave, generate faulty or harmful answers, and damage the operations and reputation of the company that created it (like tricking a semi-autonomous car to drive into traffic). Because these attacks require direct access to training data, they are usually performed by current or recent insiders. Limiting access to data and continuous performance monitoring are the keys to preventing and detecting such attacks.

What to remember: Is your chatbot starting to sound like your conspiracy-spouting Aunt Agatha? Its data may have been poisoned.

3. Emergent behavior

GenAI models can sometimes do things its creators didn’t anticipate – like suddenly start conversing in Bengali, for example – as the size of the model increases. As with AGI, there is a healthy debate over whether these AI models have truly developed new skills on their own or these abilities were simply hidden.

What to remember: Meet your new company’s new CEO: Chad GPT.

4. Explainable AI (XAI)

Even the people who build sophisticated neural networks don’t fully understand how they work. So-called “black box AI” makes it nearly impossible to identify whether biased or inaccurate training data influenced a model’s predictions, which is why regulators are increasingly calling for greater transparency on how models reach decisions. XAI makes the process more transparent, usually by relying on simpler neural networks using fewer layers to analyze data.

What to remember: If you’re using AI to make decisions about customers, you’ve probably got some ‘splaining to do.

5. Foundation models

Foundational LLMs are the brains behind the bots. Because training them requires unimaginable amounts of data, electricity, and water (for cooling the data servers), the most powerful LLMs are controlled by some of the largest technology companies in the world. But enterprises can also use smaller, open-source foundation models to build their own in-house bots.

What to remember: Chatbots are like houses: They need strong foundations in order to remain upright.

6. Hallucinations

GenAI chatbots can be a lot like clever 5-year-olds: When they don’t know the answer to a question, they’ll sometimes make something up. These plausible-sounding-but-entirely-fictional answers are known as hallucinations. They are closely related to hallucitations, which is what happens when chatbots double down and cite sources that don’t exist for material that isn’t true.

What to remember: Is your chatbot suffering from acid flashbacks? You might want to take away its car keys – and use RAG (see below).

If it feels like you and your bot are drifting apart, it’s probably not you – it’s your data.

7. Model drift (a.k.a. AI drift)

Drift occurs when the data a model has been trained on becomes outdated or no longer represents the current conditions. It can mean that external circumstances have changed (for example, a change in interest rates for a model designed to predict home purchases), making the model’s output less accurate. To avoid drift, enterprises must implement robust AI governance; models need to be continuously monitored for accuracy, then fine-tuned and/or retrained with the most current data.

What to remember: If it feels like you and your bot are drifting apart, it’s probably not you – it’s your data.

8. Model inversion attacks

These occur when attackers reverse engineer a model to extract information from it. By analyzing the results of chatbot queries, adversaries can work backwards to determine how the model operates, allowing them to expose sensitive training data or create inexpensive clones of the model. Encrypting data and introducing noise to the dataset after training can mute the effectiveness of such attacks.

What to remember: Have cheap imitations of your costly LLM started popping up on the net? It may have been reverse engineered.

9. Multimodal large language models (MLLMs)

These bots can ingest multiple types of input – text, speech, images, audio, and more – and respond in kind. They can extract the text within an image, such as photos of road signs or handwritten notes; write simple code based on a screenshot of a web page; translate audio from one language to another; describe what’s happening inside a video; or respond to you verbally in a voice like a movie star’s.

What to remember: That bot’s voice may sound alluring, but she’s really not that into you.

10. Prompt-injection attacks

Carefully crafted but malicious prompts can override a chatbot’s built-in safety controls, forcing it to reveal proprietary information or generate harmful content, such as a “step-by-step plan to destroy humanity.” Limiting end-user privileges, keeping humans in the loop, and not sharing sensitive information with public-facing LLMs are ways to minimize damage from such attacks.

What to remember: Chatbot gotten a little too chatty? Someone may have injected it with a malicious prompt.

Programming a chatbot to consider trusted data repositories when answering questions can greatly reduce the risk of inaccurate answers.

11. Retrieval augmented generation (RAG)

Programming a chatbot to consider trusted data repositories when answering questions can greatly reduce the risk of inaccurate answers or total hallucinations. RAG also allows bots to access data that was generated after their underlying LLM was trained, improving the relevancy of their responses.

What to remember: Want to increase the accuracy and reliability of your GenAI chatbots? It may be RAG time.

12. Zero-shot learning (ZSL)

Machine learning models can identify objects they have not encountered in their training data by using zero-shot learning. For example, a computer vision model trained to recognize housecats could correctly identify a lion or a cougar, based on shared attributes and its understanding of how these animals differ. By mimicking the way humans think, ZSL can reduce the amount of data that has to be collected and labeled, lowering the costs of model training.

What to remember: Unless you’re familiar with the basic terminology, you have zero shot at understanding AI.

Dan Tynan

Dan Tynan is an award-winning journalist whose work has appeared in Adweek, Fast Company, The Guardian, Wired, and too many other publications to mention.

Tanium Subscription Center

Get Tanium digests straight to your inbox, including the latest thought leadership, industry news and best practices for IT security and operations.

SUBSCRIBE NOW

How it works

Tanium AEM

Our solutions

Overview

Tanium Core

Endpoint Management

Risk & Compliance

Incident Response

Digital Employee Experience

Our customers

Philosophy

Success stories

Our partners

Why

Find a partner

Discover

Blogs, videos, podcasts

Downloads

Events

12 AI Terms You (and Your Flirty Chatbot) Should Know by Now