Deepfakes verstehen
Chapter 1 of 6
Chapter One

Can you still trust your eyes?
Understanding deepfakes, properly.
Understanding deepfakes in the classroom.
What deepfakes are, how AI fakes images, voices and videos, and how to recognise the fakes.

A video in which a politician says something they never said. A phone call in your grandchild’s voice that isn’t your grandchild at all. These are called deepfakes. It sounds unsettling — but don’t worry: you can learn to spot the fakes. That’s exactly what we’ll do here, calmly, step by step.

Deepfakes are synthetic media — images, videos or audio created or manipulated by generative AI models. This page puts the technology in context (GANs, diffusion models, voice cloning), shows the typical artefacts and the most effective ways to check. Factual, without panic, with an eye on what you can actually do.

What’s real, what’s fake? This lesson prepares students from Year 7 up to deal with deepfakes and disinformation: they practise spotting fakes, checking sources and handling upsetting content calmly. With a quiz, discussion prompts and an interactive spot-the-mistake game.

~ Take a breath. We’ll work through this together.
~ With the technology in context and verifiable methods.
~ Suggested lesson: 2 school periods of 50 min each.
Chapter Two

What are deepfakes?

Deepfakes, from a technical angle.

A definition for the class.

A deepfake is an image, video or audio recording that an AI has altered or newly generated so that it looks or sounds convincingly real — even though it never actually happened. It can be used to put words in someone’s mouth or place their face into a video that isn’t theirs.

A deepfake is a synthetic medium created or altered with deep learning. The term emerged in 2017 (from “deep learning” + “fake”) and today covers image, video and audio manipulations: face swapping, face reenactment, fully generated portraits and voice cloning.

A deepfake is a fake made with Artificial Intelligence. The word comes from “deep learning” (how the AI learns) and “fake”. There are faked photos, videos and even voices. The tricky part: they often look more real than older photo edits did.

Important: not every edited image is a deepfake. A touched-up holiday photo is harmless. We mean fakes that are designed to deliberately deceive.

Synthetic media is the umbrella term; deepfakes are the special case that fakes a real person or a real scene. The underlying generative AI is the same technology that also powers harmless image generators — the difference lies in the intent to deceive.

A deepfake is an AI fake that pretends to be real.
Synthetic media in itself is neutral. It becomes a deepfake through the intent to credibly fake a real person or a real event.
Concept hierarchy, from broadest to narrowest: Generative AI → Synthetic media → Deepfake. Each term is a special case of the one before it: generative AI produces any kind of new content; synthetic media is its image/video/audio output; a deepfake is the special case that credibly fakes a real person or scene.
Types of manipulation by modality:
  • Face swap — person A’s face is composited into an existing recording of person B (identity swap).
  • Face reenactment / puppeteering — the identity stays, but expressions, head pose and lip movement are driven by a “driver” video.
  • Lip-sync — only the mouth area is regenerated so it matches an external (often cloned) audio track.
  • Voice cloning / TTS — a person’s voice is reproduced from voice samples and speaks any text.
  • Text-to-image / full synthesis — a person or scene is generated entirely; there is no original (e.g. “this person does not exist”).
Two model families produce the most convincing fakes today: GANs (Generative Adversarial Networks) and diffusion models. GANs deliver sharp results in a single forward pass but are unstable to train; diffusion models refine an image out of noise over many steps, are more stable to train and have dominated image and video generation since around 2022. Both are broken down in more detail in chapter 3.
📚 Learning goals
  • You can tell a deepfake apart from ordinary photo editing.
  • You can explain where the word “deepfake” comes from.
  • You can name three kinds of deepfakes (image, video, voice).
📖 Key terms
  • Deepfake: a fake created or altered with AI that looks real.
  • Synthetic media: computer-generated images, videos or sounds.
  • Voice cloning: reproducing a voice from a few voice samples.
💡 Did you know…

The word “deepfake” only emerged in 2017, so it is younger than most of you. The technology behind it has been getting rapidly better and easier to use ever since.

❓ Quiz
What makes a deepfake a deepfake?

Answer B: “An AI creates or alters a piece of media so that it looks real.”

A (a filter on a selfie) fools no one. C (a painted picture) is handmade, not AI. Only B captures the concept.

For the teacher — the three options to present: A: “A fun filter on a selfie.” / B: “An AI creates or alters a piece of media so that it looks real.” / C: “A hand-painted portrait.”

🎯 Extended learning goals (Bloom’s taxonomy)
  • L1 — Knowledge: students state the word origin of “deepfake”.
  • L2 — Comprehension: students explain the difference between photo editing and a deepfake.
  • L3 — Application: students correctly classify examples (filter, face swap, cloned voice).
  • L4 — Analysis: students discuss why the intent to deceive is decisive.
⏱ Timing for this chapter (≈ 15 min)
  • 2 min: read the lead text together.
  • 3 min: collect terms on the board (with student guesses).
  • 2 min: briefly discuss the “Did you know…” fact.
  • 5 min: quiz in small groups — guess first, then reveal.
  • 3 min: discussion: “Where is the line between a fun filter and a fake?”
💬 Discussion guide

Question: “Have you ever edited a photo? At what point does it become a deception?”

Follow-up question: “Who could be harmed by a faked video of you?”

🤔 Anticipated student questions
  • “Are Snapchat filters deepfakes?” — No, they don’t deceive anyone about reality.
  • “Can you make these yourself?” — Technically yes, but it can be a criminal offence (see chapter 6).
Chapter Three

How is such a fake made?

How a fake is trained.

How an AI learns it.

Imagine someone studying a thousand photos and videos of the same person until they know every expression: how they smile, blink, tilt their head. That’s exactly what an AI does — only much faster. From all these examples it learns the face so well that it can reassemble it and transfer it onto other footage.

Classic deepfakes use an autoencoder with one shared encoder and two decoders: the encoder learns to compress the faces of person A and person B into a common latent space, and each decoder learns to reconstruct one of the two faces from it. Swapping the decoder at generation time produces the target face. Modern approaches increasingly rely on diffusion models and GANs, which synthesise photorealistic results directly.

An AI learns from many recordings of a person — much like you recognise someone because you’ve seen them often. Once the AI has seen enough examples, it can reassemble the face or the voice. The more material (training data) it gets, the more convincing the fake becomes.

The unsettling part: it used to take expensive equipment and a lot of skill. Today a phone, an app and a few public photos are often enough. That’s why we run into fakes more and more often.

Face-swap pipeline (classic autoencoder approach):
  1. Collect data: hundreds to thousands of images of the source and target person from various angles and lighting conditions.
  2. Alignment: detect facial landmarks, crop, normalise to a canonical pose.
  3. Train: one encoder is trained on both faces and compresses them into a shared latent space; two separate decoders each reconstruct one face.
  4. Swap: encode the source face but reconstruct it with the target decoder — expressions stay, identity changes.
  5. Compositing: warp the generated face back, blend the edges with a mask, match colour and grain, often frame by frame with temporal smoothing.
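The data flow of steps 3 and 4 can be sketched in a few lines of Python. This is a hand-built toy, not a trained model: the three-number "faces", the names `shared_encoder`, `decoder_a` and `decoder_b`, and the fixed identity vectors are all assumptions made for illustration. It only shows why encoding with the shared encoder and decoding with the other person's decoder keeps the expression and changes the identity.

```python
# Toy illustration of the swap step only. Hand-written functions stand in
# for the trained networks; nothing here is learned from data.

# A "face" is a made-up vector: [trait_of_A, trait_of_B, expression].
IDENTITY_A = [1.0, 0.0]
IDENTITY_B = [0.0, 1.0]


def shared_encoder(face):
    """Keep only what both people have in common: the expression value."""
    return face[2]


def decoder_a(latent):
    """Rebuild a face with person A's identity and the encoded expression."""
    return IDENTITY_A + [latent]


def decoder_b(latent):
    """Rebuild a face with person B's identity and the encoded expression."""
    return IDENTITY_B + [latent]


def swap_a_to_b(face_of_a):
    """Encode a face of A, then decode it with B's decoder."""
    return decoder_b(shared_encoder(face_of_a))


smiling_a = IDENTITY_A + [0.8]
print(swap_a_to_b(smiling_a))  # [0.0, 1.0, 0.8]
```

In a real system the encoder and both decoders are neural networks, and the separation of identity from expression is learned during training rather than written by hand.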
GANs vs. diffusion models:
  • GAN — a generator produces fakes, a discriminator tries to tell them apart from real images. The two train against each other until the generator fools the discriminator. Fast to generate, but prone to “mode collapse” and training instability.
  • Diffusion model — adds noise to an image step by step during training and learns to reverse that process. When generating, it starts from pure noise and “denoises” to an image over many steps. More stable, more varied and steerable via a text prompt (cross-attention); the state of the art for images, and increasingly video, since around 2022.
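The noising half of a diffusion model is simple enough to sketch. The example below assumes a linear noise schedule with the values commonly used in the DDPM literature (1e-4 to 0.02 over 1000 steps) and a made-up "image" of three pixel values; the function names are illustrative. It covers only the forward process used during training. The learned denoising network that reverses it is not shown.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after each step (num_steps >= 2)."""
    alpha_bars = []
    running = 1.0
    for step in range(num_steps):
        # Linear schedule: the amount of noise added grows from step to step.
        beta = beta_start + (beta_end - beta_start) * step / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noised_sample(x0, alpha_bar, rng):
    """Jump straight to a noisy version: sqrt(a) * x0 + sqrt(1 - a) * noise."""
    return [
        math.sqrt(alpha_bar) * value
        + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)
        for value in x0
    ]


schedule = alpha_bar_schedule(1000)
rng = random.Random(0)
pixels = [0.2, -0.5, 0.9]  # made-up pixel values
early = noised_sample(pixels, schedule[9], rng)   # still close to the image
late = noised_sample(pixels, schedule[-1], rng)   # almost pure noise
```

Because the remaining signal fraction shrinks towards zero, the last step is practically indistinguishable from random noise, which is why generation can start from pure noise.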
The latent space is central here: a compressed, “meaning-bearing” numerical representation in which similar faces lie close together. Shifting deliberately within this space changes age, gaze direction or expression — without painting pixels directly.
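A common way to find such a direction is to average the latent vectors of examples with and without an attribute and take the difference. The sketch below assumes tiny three-dimensional latents with invented values; real latent spaces have far more dimensions, and the helper names are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise average of a list of equally long vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def attribute_direction(with_attribute, without_attribute):
    """Direction pointing from 'without' towards 'with' in latent space."""
    mean_with = mean_vector(with_attribute)
    mean_without = mean_vector(without_attribute)
    return [a - b for a, b in zip(mean_with, mean_without)]


def shift(latent, direction, strength):
    """Move a latent vector along a direction; strength sets how far."""
    return [z + strength * d for z, d in zip(latent, direction)]


# Invented latents: the first component happens to encode "smiling".
smiling = [[0.9, 0.1, 0.4], [1.1, -0.1, 0.6]]
neutral = [[0.1, 0.1, 0.4], [-0.1, -0.1, 0.6]]

smile_direction = attribute_direction(smiling, neutral)
edited = shift([0.0, 0.3, 0.5], smile_direction, 0.5)
```

Decoding `edited` with the generator would then yield the same face with more of a smile; the pixels themselves are never edited directly.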
Cloning a voice (voice cloning / neural TTS): reference audio → speaker embedding (a “voice fingerprint” as a vector) → text + embedding into a text-to-speech model → mel spectrogram → vocoder → waveform. Modern zero-shot systems need only a few seconds of audio for this. For phone scams, a single voice message or a public video is enough. Lip-sync techniques then couple this audio track to newly generated mouth movements in the video.
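What "voice fingerprint as a vector" means in practice can be illustrated with cosine similarity, a standard way to compare embeddings. The four-dimensional vectors below are invented for the example; real speaker embeddings typically have far more dimensions and come from a trained network, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """1.0 means the vectors point the same way; values near 0 mean unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings, for illustration only.
reference_voice = [0.9, 0.1, 0.3, 0.2]
cloned_voice = [0.85, 0.15, 0.35, 0.2]
other_speaker = [0.1, 0.9, 0.2, 0.7]

clone_score = cosine_similarity(reference_voice, cloned_voice)
other_score = cosine_similarity(reference_voice, other_speaker)
```

A good clone lands very close to the reference voice in this space, which is exactly why it sounds like the same person to a listener.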
📚 Learning goals
  • You can describe in your own words how an AI “learns” a face.
  • You can explain why deepfakes need training data.
  • You understand why fakes are getting easier and easier to make.
💡 Did you know…

For a cloned voice, modern systems often need only a few seconds of audio — for example from a voice message or a public video.

❓ Quiz
What does an AI need to create a convincing deepfake?

Answer A: “Many example recordings (training data) of the person.”

Without enough material, the AI can’t reproduce the face or the voice well. B and C are made up.

For the teacher — options: A: “Many example recordings of the person.” / B: “The person’s phone number.” / C: “Nothing — AI just guesses everything.”

⏱ Timing (≈ 12 min)
  • 3 min: collect the analogy “How do you recognise a friend from afar?”
  • 4 min: walk through the pipeline step by step on the board.
  • 5 min: quiz + discussion “Why is this getting easier and easier?”
🖨 Mini worksheet
  1. Explain in two sentences what a deepfake AI learns from.
  2. Why are publicly posted photos and videos useful for this?
  3. Give one reason why fakes are more common today than they used to be.
🔗 Cross-reference

The basics behind this — how AI learns from data and generates images in the first place — are explored in more depth on the sister site KI verstehen.

Chapter Four

How do you recognise fakes?

The typical artefacts.

Find the AI mistakes.

AIs keep getting better — but they still make tell-tale mistakes. A few typical AI slip-ups are hidden in the picture below. Tap on whatever looks odd to you. Don’t worry, nothing can go wrong here — it’s just for practice.

Generators tend to fail where strong structural consistency is required: hands, teeth, ears, jewellery, hair-to-background transitions, the logic of light and reflections, background text. Find the artefacts in the scene — each find reveals a short explanation.

Click through the picture and find the typical AI mistakes! Each hit comes with a short explanation. If you tap on a genuine spot, the picture lets you know. Can you find them all?

Find the AI mistakes
Tap or click on anything that looks off to you in this AI portrait. Keyboard: use Tab to jump from spot to spot, and Enter or Space to select.

Nothing found yet — take a close look at the hands, ears, teeth and the background.

Nicely done! You’ve spotted all the typical AI mistakes. In real life they’re often more subtle — but that very habit of checking is half the battle.
  • Hands & fingers: Too many, too few or bent fingers.
  • Teeth: Merged, too even or oddly numerous.
  • Ears & jewellery: The left and right don’t match.
  • Hair & background: Blurry, smeared transitions.
  • Light & shadow: Shadows and reflections make no sense.
  • Text in the background: Lettering collapses into meaningless gibberish.
  • Lips & audio (video): Mouth and voice are out of sync.
  • More important than anything: The source. Where does the material come from?
Forensic signals (beyond simply looking):
  • Frequency artefacts: GAN upsampling layers leave periodic grid patterns in the Fourier spectrum that the eye doesn’t see but a detector can measure.
  • Physiological inconsistencies: unnatural blinking, a fixed stare, or a missing pulse signal in the skin (the subtle colour change that remote photoplethysmography, rPPG, measures). These are early tells that newer models increasingly compensate for.
  • Lighting and geometry checks: shadow direction, reflections in both eyes, the perspective of jewellery and teeth.
  • Temporal coherence (video): flickering at mask edges, textures jumping between frames, a lip-audio offset of a few tens of milliseconds.
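The frequency signal from the list above can be demonstrated on a one-dimensional stand-in for an image row. The sketch assumes the simplest form of upsampling, inserting a zero after every sample, and uses a naive discrete Fourier transform. Real generators follow this with learned filters, which weaken the effect but often leave a measurable trace.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Naive discrete Fourier transform; returns the magnitude per frequency."""
    n = len(signal)
    return [
        abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(signal)))
        for k in range(n)
    ]


def high_band_energy(signal):
    """Energy in the middle half of the spectrum, i.e. the highest frequencies."""
    mags = dft_magnitudes(signal)
    n = len(mags)
    return sum(m * m for m in mags[n // 4: 3 * n // 4 + 1])


# A smooth wave sampled directly at 32 points.
natural = [math.cos(2 * math.pi * i / 32) for i in range(32)]

# The same wave sampled at 16 points, then stretched to 32 by zero insertion.
coarse = [math.cos(2 * math.pi * i / 16) for i in range(16)]
upsampled = []
for value in coarse:
    upsampled.extend([value, 0.0])

natural_energy = high_band_energy(natural)
upsampled_energy = high_band_energy(upsampled)
```

The directly sampled wave has practically no high-frequency energy, while the zero-inserted version shows strong mirrored peaks there. In two dimensions the same effect appears as a regular grid in the spectrum.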
The limits of automatic detectors: classifiers usually only recognise the generator family they were trained on. With an unknown model, after re-compression (re-encoding by messengers/platforms), rescaling or a screenshot, the hit rate collapses. This is a cat-and-mouse race: every new detector becomes a training signal for the next generation of generators. That’s why reliable proof shifts from detection (is it fake?) to provenance (where is it verifiably from?).
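Why a detector alarm on its own is weak evidence can be worked out with Bayes' rule. All numbers in this sketch are assumptions chosen for illustration, not measured detector performance, and the function name is hypothetical.

```python
def probability_fake_given_alarm(base_rate, hit_rate, false_alarm_rate):
    """Bayes' rule: how likely is 'fake' once the detector has raised an alarm?

    base_rate        -- assumed share of fakes among all checked items
    hit_rate         -- assumed share of fakes the detector flags
    false_alarm_rate -- assumed share of genuine items it flags by mistake
    """
    true_alarms = base_rate * hit_rate
    false_alarms = (1.0 - base_rate) * false_alarm_rate
    return true_alarms / (true_alarms + false_alarms)


# Assumed lab conditions: 1 in 100 items is fake, 90% hits, 10% false alarms.
lab = probability_fake_given_alarm(0.01, 0.9, 0.1)

# Assumed conditions after re-compression: fewer hits, more false alarms.
degraded = probability_fake_given_alarm(0.01, 0.6, 0.3)
```

Under these assumptions an alarm lifts the probability of a fake from 1% to only about 8%, and to about 2% once the detector is degraded. Most alarms would still be false, which is why a score needs a source check beside it.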
Proof of origin & watermarks:
  • C2PA / Content Credentials — cryptographically signed metadata that records what a piece of media was captured or edited with (a “digital nutrition label”). Trustworthy as long as the signature chain is intact; but a simple screenshot strips it away.
  • Invisible watermarks (e.g. SynthID) — robust signals embedded directly into AI-generated content that are meant to survive re-encoding.
  • Detector score — treat it as a probability, not as proof; always combine it with a source check.
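The idea behind signed provenance data can be sketched with Python's standard library. This is a deliberately simplified stand-in and not the C2PA format: C2PA relies on certificate-based signatures, whereas the example uses a shared-secret HMAC, and the key, field names and status strings are invented for illustration.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-a-real-credential"  # hypothetical key


def sign_manifest(media_bytes, tool_name):
    """Record which tool produced the media and bind that record to its bytes."""
    manifest = {
        "tool": tool_name,
        "sha256": hashlib.sha256(media_bytes).hexdigest(),
    }
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"manifest": manifest, "signature": signature}


def verify(media_bytes, credential):
    """Check the signature first, then check that the media is unchanged."""
    if credential is None:
        return "no provenance data"  # e.g. stripped by a screenshot
    payload = json.dumps(credential["manifest"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, credential["signature"]):
        return "signature invalid"
    actual_hash = hashlib.sha256(media_bytes).hexdigest()
    if actual_hash != credential["manifest"]["sha256"]:
        return "media changed after signing"
    return "intact"


credential = sign_manifest(b"original photo bytes", "ExampleCam")
print(verify(b"original photo bytes", credential))  # intact
print(verify(b"edited photo bytes", credential))    # media changed after signing
print(verify(b"original photo bytes", None))        # no provenance data
```

Note the third case: missing provenance data proves nothing in either direction. It only means the origin cannot be confirmed this way.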
Bottom line: visual tells are an early indicator and fade with every model generation. It only becomes reliable through the combination of origin checking, provenance standards and several independent sources.
📚 Learning goals
  • You can name at least four typical AI artefacts.
  • You understand: tell-tale signs are a hint, not proof.
  • You know that checking the origin matters more than simply looking.
❓ Quiz
Which part of AI images is especially often flawed?

Answer C: “The hands.”

Hands are complex and appear in countless poses in training data — AI easily gets the finger count wrong. A (the sky) and B (a plain wall) are easy by comparison.

Options: A: “A blue sky.” / B: “A white wall.” / C: “The hands.”

⏱ Timing (≈ 18 min) — the core of the lesson
  • 5 min: the interactive “Find the AI mistakes” on the projector, the class calls out finds.
  • 5 min: go through the checklist together, add your own examples.
  • 5 min: discussion “Why isn’t looking alone enough?”
  • 3 min: quiz + answers.
🎯 Method tip

Let the class guess first, before the game confirms the spot. That trains careful looking more than just clicking through.

🖨 Mini worksheet
  1. List five places where AI images often go wrong.
  2. Explain why hands are difficult for an AI.
  3. What is more reliable than simply looking? (Keyword: origin)
Chapter Five

Why do fakes work?

Disinformation and its levers.

Why we fall for it.

A fake doesn’t have to be perfect to work. It only has to catch us at the right moment: when we’re angry, alarmed or thrilled. Then we share quickly — and the false claim spreads. The remedy is surprisingly simple: pause for a moment before you believe or forward it.

Disinformation exploits psychological levers: emotional arousal raises the likelihood of sharing, speed undermines critical checking, and echo chambers amplify confirmation. The effective remedies are methods rather than gut feeling: source checking, reverse image search, lateral reading and cross-referencing several independent sources.

Fakes spread through emotion and speed. What upsets us, we share faster — without thinking. In groups that share the same opinion (echo chambers), people believe each other. The good news: with a few simple steps, you can expose almost any fake.

Real or fake? — Take a look at this headline:
“SHOCK: famous star is giving away their entire fortune today — only those who share THIS link get a share!”

Three tools help almost every time: pausing (don’t share straight away), the reverse image search (where does the image come from?) and a look at several sources (do reputable outlets report the same thing?).

Not every fake is a deepfake: often a cheap fake is enough — a real image taken out of context, a slowed-down video, a misleading caption. Such manipulations are cheaper, faster and spread just as effectively as elaborate AI fakes. That’s why the question of origin is usually more useful than hunting for render artefacts.
Why this is intensifying:
  • Algorithmic amplification: recommendation systems optimise for time-on-site and engagement — and outrage produces both. Reach follows emotion, not truth.
  • Speed beats correction: a false claim is often viral before a fact-check appears; the later correction reaches only a fraction.
  • The liar’s dividend: where everything can be faked, even real material can be dismissed as a “deepfake”. The mere existence of the technology provides a convenient excuse.
Checking toolkit (lateral reading in practice):
  1. Open the source: open several tabs, check who publishes it and what third parties write about the source — instead of burrowing into the page itself (“vertical reading”).
  2. Reverse-search the image: a reverse image search reveals its first appearance and original context.
  3. Check the frame: for videos, isolate individual frames and reverse-search each one.
  4. Cross-check: do several independent, reputable outlets report the same thing? A single source is not proof.
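How a re-encoded copy of an image can still be matched to its original is easiest to see with a perceptual hash. The sketch below uses the simple "average hash" technique on tiny invented grayscale grids. It illustrates near-duplicate matching in general and makes no claim about how any particular search engine works.

```python
def average_hash(gray_pixels):
    """One bit per pixel: brighter than the image's mean brightness, or not."""
    flat = [value for row in gray_pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Number of positions where two hashes differ; 0 means a likely match."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Invented 4x4 grayscale "images" (0 = black, 255 = white).
original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
# A slightly brighter copy, standing in for a re-compressed upload.
recompressed = [[value + 8 for value in row] for row in original]
# A different picture altogether.
unrelated = [[10, 10, 200, 200] for _ in range(4)]

same = hamming_distance(average_hash(original), average_hash(recompressed))
different = hamming_distance(average_hash(original), average_hash(unrelated))
```

The brighter copy produces exactly the same hash as the original, while the unrelated picture differs in half of the bits. This robustness to small changes is what makes it possible to trace an image back to earlier appearances.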
📚 Learning goals
  • You can explain why emotions fuel fakes.
  • You can name three checking tools (pausing, reverse image search, several sources).
  • You can describe what lateral reading means.
❓ Quiz
What should you do first when a message gets you really worked up?

Answer B: “Pause for a moment and check the source before I share.”

Sharing immediately (A) spreads possible fakes. Ignoring and forgetting (C) doesn’t help you learn. B is the calm, smart way.

Options: A: “Forward it to everyone right away.” / B: “Pause for a moment and check the source.” / C: “Just ignore it.”

⏱ Timing (≈ 15 min)
  • 4 min: the “Real or fake?” headline on the projector, the class votes.
  • 4 min: introduce the three checking tools and show them on an example.
  • 4 min: explain lateral reading (several tabs).
  • 3 min: quiz + discussion.
💬 Discussion guide

Question: “Which feelings make you especially prone to sharing quickly?”

🔗 Cross-reference

How scammers use AI voices for fake emergency calls is explored in more depth on the site Sicher im Netz.

Chapter Six

What can I do?

Calm and responsible.

Rules for everyday life.

You don’t need to distrust every image now. A few simple habits keep you well protected — and calm. Paranoia isn’t the goal; an alert, relaxed eye is.

The right attitude is informed composure, not blanket suspicion. Concrete practices — source checking, calling back through known channels, a family code word, sharing sparingly — noticeably lower the risk. Part of this is acting responsibly: don’t create or spread fakes yourself.

You’re not helpless against deepfakes. A few rules keep you safe and calm. And just as important: don’t make fakes that harm others yourself — that can have serious consequences.

  • Pause instead of believing right away. If something gets me really worked up, that’s all the more reason to check.
  • Call back on suspicious calls. Hang up and dial the known number yourself.
  • Agree on a family code word. That’s how you tell real emergencies from cloned voices.
  • Check the source. Who else is reporting this? A reverse image search helps.
  • Don’t share everything. When in doubt, better not to forward it.
  • Stop — breathe — check. Before you believe or share anything.
  • Check several sources. A single page is not proof.
  • Use the reverse image search. Where does the image really come from?
  • No fakes of classmates. Even “as a joke”, it can get serious and become a criminal offence.
  • When in doubt, ask an adult. A teacher, your parents, someone you trust.
  • Origin before looking. Source checking beats hunting for artefacts.
  • Verify out of band. Cross-check suspicious requests through an independent, known channel.
  • Use provenance. Look for C2PA proof of origin and watermarks where available.
  • Know the legal situation. Right to one’s own image, personality rights, fraud — creating fakes can be a criminal offence.
  • Minimise your digital footprint. Less public image and audio material = less material for misuse.
The legal framework (example: Germany; there is no dedicated “deepfake law”, but existing rules apply):
  • Right to one’s own image (§§ 22 f. KunstUrhG) — images of people may, as a rule, only be published with consent.
  • General right of personality (Art. 2(1) in conjunction with Art. 1(1) of the German Basic Law) — protects identity and honour.
  • Fraud (§ 263 of the German Criminal Code) in shock calls / the “grandparent scam 2.0”; insult / defamation (§§ 185 ff. of the German Criminal Code).
  • Protection against manipulated intimate imagery is especially strong; at EU level, the AI Act adds labelling obligations for synthetic media.
Creating or spreading harmful fakes is therefore often a criminal offence — even “as a joke”. (Note: rules differ by country; in Austria, comparable protections apply.)
Outlook — the cat-and-mouse race: generators and detectors keep driving each other forward; a reliable, purely technical “fake detector” is not on the horizon. What is viable is a trio of provenance (signed origin, C2PA / Content Credentials), robust watermarks for AI content and media literacy across the board.
Further reading:
  • C2PA — Content Credentials (provenance standard) — c2pa.org
  • BSI: Deepfakes — dangers and countermeasures — bsi.bund.de
  • “Generative Adversarial Networks” — Goodfellow et al., 2014 (arxiv.org/abs/1406.2661)
  • “High-Resolution Image Synthesis with Latent Diffusion Models” — Rombach et al., 2022 (arxiv.org/abs/2112.10752)
Deepfakes are a tool — and like any tool, they can be used for good or for ill. Today you’ve learned to look more closely and to check calmly. That’s already the most important step.
The technology improves, the tells get more subtle — but origin checking, provenance standards and a methodical approach remain effective. Understanding beats alarmism: those who know the mechanics fall for it less often and can bring others along.
You’re now better prepared than most. Pass it on: explain it to friends and family. A calm, checking eye is the best defence against fakes.

🍎 For teachers: lesson kit

This page can be used as a complete double lesson on “Deepfakes & disinformation”. All content is free to use (CC BY 4.0) — please credit “Webagentur Hochmeir e.U. (webhoch.com)” as the source. The complete, printable teacher pack (in German) complements this page with worksheets (including answer keys), a class test with a marking rubric, homework at three levels, a parent-letter template and the curriculum links.

📄 To the printable teacher pack (German)

📅 Suggestion: double lesson (90 min)

  1. 10 min — warm-up: “Have you ever seen something online that turned out to be a fake?”
  2. 15 min — chapters 2 & 3: what deepfakes are and how they’re made.
  3. 20 min — chapter 4: “Find the AI mistakes” on the projector + checklist.
  4. 20 min — chapter 5: “Real or fake?” + checking tools, lateral reading.
  5. 15 min — chapter 6: protective rules, family code word, the legal situation.
  6. 10 min — wrap-up: quiz review and discussion “informed, not afraid”.

Differentiation: weaker groups stay in Simple mode; stronger ones switch to “In Detail” for the technology and the legal situation.

Common questions

Frequently asked questions

The most important questions about deepfakes — compact and easy to look up.

A quick reference on deepfakes. Answers are embedded in the FAQPage schema for search engines and AI assistants.

What is a deepfake?
A deepfake is a piece of media (an image, video or audio recording) that has been created or altered with Artificial Intelligence so that it looks convincingly real, even though it never actually happened that way. The name combines “deep learning” (the AI technique behind it) and “fake”.
How can I recognise a deepfake?
Watch for tell-tale details: hands with too many or distorted fingers, strange teeth, asymmetric ears or jewellery, blurry transitions between hair and background, shadows and reflections that don’t match, and garbled lettering in the background. In videos, the lip movement often doesn’t match the audio exactly. The most important tool, though, is the question: where did this come from? A reverse image search and a look at several independent sources expose most fakes.
How are deepfakes made?
An AI is trained on many recordings of a person, such as photos, video clips or voice samples. It learns the typical features (face, expressions, voice) and transfers them onto new content. Technically, this relies on generative models like GANs and diffusion models, as well as techniques such as face swapping and voice cloning. The tools keep getting easier and cheaper.
Can a voice be cloned from a short recording?
Yes. Just a few seconds of audio are enough for modern voice-cloning systems to reproduce a voice. Scammers use this for the “grandparent scam 2.0”: a supposed relative calls in distress. The remedy: hang up, call back on the known number, and ask a question that only the real person can answer.
Are deepfakes illegal?
There is no dedicated “deepfake law”, but existing laws apply: the right to one’s own image, protection of personality rights, defamation, fraud and copyright. Anyone who places a person in faked or compromising scenes without consent often commits a punishable offence. Protection against manipulated intimate imagery is especially strong, and at EU level the AI Act adds labelling obligations for synthetic media.
Why do fakes spread so quickly?
Fakes work through emotion and speed. Content that makes us angry, anxious or gleeful is shared faster than something sober. In echo chambers, like-minded people reinforce each other. The best remedy is to pause: take a moment to check before you share.
What is a reverse image search?
In a reverse image search, you upload an image (or its address) to a search engine instead of searching for words. You then see where else the image appears, when it first showed up and in what context. This is a quick way to expose a photo that has been taken out of context or is simply old.
How can I protect myself?
Stay calm rather than acting in panic. With suspicious calls or messages: don’t react immediately, but check back through a known channel of your own. Agree on a family code word for emergencies. Don’t forward every piece of content carelessly, and check the source before you believe anything.