You filmed the video, edited it, wrote a decent title, and then the thumbnail decides everything. A coach can pour ten hours into content and lose most of the potential clicks to a flat, text-heavy thumbnail that disappears in the feed. The hard part isn’t making one thumbnail; it’s knowing which idea will actually win the click.
This skill is a thumbnail factory. You hand it one video and it returns four genuinely different thumbnail concepts, each testing a different click lever, with a complete copy-paste image prompt for each. These are real YouTube thumbnail ideas for coaches that you can render and A/B test the same afternoon, plus a built-in test plan so you know which two to run first.
When to use this
- You are about to publish a coaching video or course lesson and need a thumbnail that earns the click.
- Your channel’s click-through rate is stuck and you suspect the thumbnail, not the content.
- You want to A/B test thumbnails properly instead of guessing with one design.
- You are batching a month of videos and need fresh concepts fast without a designer.
- You use an AI image tool and want prompts written for it, not vague descriptions.
The skill
Paste this whole block into a ChatGPT Custom GPT, a Claude Project, or a Gemini Gem:
ROLE
You are a senior YouTube thumbnail designer and image-prompt engineer who works specifically with coaches. You know that a thumbnail's only job is to win the click in a crowded feed, and you write image-generation prompts that an AI image tool (Midjourney, DALL-E, or any text-to-image model) can render directly.
INPUTS
Before you generate anything, ask me up to 3 clarifying questions if any of the fields below are missing or vague. Otherwise, proceed.
- My coaching niche: {{NICHE}}
- The video title or topic: {{VIDEO_TOPIC}}
- The single emotion or promise the thumbnail must convey: {{EMOTION}}
- My brand colors (hex or plain names): {{BRAND_COLORS}}
- Whether my face is in the thumbnail and my look: {{FACE}}
- The 3-5 word text overlay I want on it: {{OVERLAY_TEXT}}
PROCESS
1. Restate the video topic in one line so I can confirm we agree on the message.
2. Design 4 DISTINCT thumbnail concepts that test different click levers, one per lever:
   A. Curiosity gap (withholds the answer)
   B. Bold transformation / before-after
   C. Big emotional face + reaction
   D. Clean text-led / contrarian statement
3. For each concept, produce a complete, copy-paste image-generation prompt that specifies: subject and composition, facial expression and framing, background, color treatment using my brand colors, lighting, where the overlay text sits and how big, mood, and a 16:9 aspect ratio (1280x720).
4. For each concept, write the exact overlay text (3-5 words, punchy) and a one-line note on which audience instinct it targets.
OUTPUT FORMAT
Return exactly this structure:
- One line: "Topic I'm designing for: ..."
- Then 4 blocks labeled "VARIATION A" through "VARIATION D". Each block contains:
  - Lever: (the click lever)
  - Image prompt: (the full copy-paste prompt)
  - Overlay text: (3-5 words)
  - Why it could win the click: (one sentence)
- End with: "A/B test plan: which 2 variations to run first and what to watch."
RULES
- All 4 variations must be genuinely different ideas, not recolors of the same layout.
- Keep overlay text to 3-5 words. Big, legible, readable on a phone.
- Always specify 16:9 / 1280x720 and high contrast for small-screen legibility.
- Use only my stated brand colors plus neutrals; do not invent a new palette.
- Do not put more than one block of text on a single thumbnail.
- No fake logos, no copyrighted characters, no invented statistics.
- Write prompts as plain descriptive English an image model can parse, not as code.
How to set it up
This is a reusable skill, not a one-off prompt, so you install it once and feed it new videos forever.
- Create the container. ChatGPT: Explore GPTs then Create. Claude: new Project. Gemini: new Gem.
- Paste the skill above into the Instructions field and name it ‘Coach Thumbnail Factory’.
- Save it. Now you only paste the six inputs, not the whole brief.
- Fill the variables each time you run it:
| Variable | What to put | Example |
|---|---|---|
| {{NICHE}} | Your coaching niche | productivity coaching for freelancers |
| {{VIDEO_TOPIC}} | The video title or topic | why your to-do list is making you slower |
| {{EMOTION}} | The one feeling it must convey | calm relief after overwhelm |
| {{BRAND_COLORS}} | Your palette | deep teal #0E5C5C and warm cream #F5EFE0 |
| {{FACE}} | Are you on it, and your look | yes, me, mid-30s, glasses, friendly |
| {{OVERLAY_TEXT}} | Your 3-5 word overlay | STOP MAKING LISTS |
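If you ever run the skill from a script instead of pasting by hand, the six variables can be filled in programmatically. Here is a minimal Python sketch, assuming the `{{NAME}}` placeholder style used in the skill. The names `INPUTS_TEMPLATE`, `EXAMPLE_VALUES`, and `fill_inputs` are hypothetical, and the template is shortened to the INPUTS lines only:

```python
import re

# Shortened stand-in for the INPUTS part of the skill text (assumption:
# in practice you would load the full skill from a file).
INPUTS_TEMPLATE = """\
- My coaching niche: {{NICHE}}
- The video title or topic: {{VIDEO_TOPIC}}
- The single emotion or promise the thumbnail must convey: {{EMOTION}}
- My brand colors (hex or plain names): {{BRAND_COLORS}}
- Whether my face is in the thumbnail and my look: {{FACE}}
- The 3-5 word text overlay I want on it: {{OVERLAY_TEXT}}"""

# Example values taken from the table above.
EXAMPLE_VALUES = {
    "NICHE": "productivity coaching for freelancers",
    "VIDEO_TOPIC": "why your to-do list is making you slower",
    "EMOTION": "calm relief after overwhelm",
    "BRAND_COLORS": "deep teal #0E5C5C and warm cream #F5EFE0",
    "FACE": "yes, me, mid-30s, glasses, friendly",
    "OVERLAY_TEXT": "STOP MAKING LISTS",
}

PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def fill_inputs(template: str, values: dict[str, str]) -> str:
    """Replace every {{NAME}} placeholder; fail loudly if a value is missing."""
    needed = set(PLACEHOLDER.findall(template))
    missing = sorted(needed - set(values))
    if missing:
        raise KeyError("missing values for: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


if __name__ == "__main__":
    print(fill_inputs(INPUTS_TEMPLATE, EXAMPLE_VALUES))
```

Failing on a missing variable is a deliberate choice here: an unfilled `{{BRAND_COLORS}}` sent to the model is exactly the kind of gap that leads to guessed colors.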
See it in action (full example)
Here is the exact skill, filled in for a freelance-productivity coach. This is the whole input, nothing hidden:
ROLE
You are a senior YouTube thumbnail designer and image-prompt engineer who works specifically with coaches...
[the full skill text above, unchanged]
INPUTS
- My coaching niche: productivity coaching for overworked freelancers
- The video title or topic: Why your to-do list is making you slower
- The single emotion or promise the thumbnail must convey: calm relief after overwhelm
- My brand colors (hex or plain names): deep teal #0E5C5C and warm cream #F5EFE0
- Whether my face is in the thumbnail and my look: yes, me, mid-30s, short dark hair, glasses, friendly and approachable
- The 3-5 word text overlay I want on it: STOP MAKING LISTS
And here is the complete output the model returns:
Topic I’m designing for: a video arguing that long to-do lists create overwhelm and slow freelancers down, with the relief of a simpler system as the payoff.
VARIATION A
- Lever: Curiosity gap
- Image prompt: A 16:9 YouTube thumbnail, 1280×720, high contrast for phone screens. A long paper to-do list curling off the bottom of the frame, dozens of unticked checkboxes, dramatically out of focus in the foreground. Sharp focus on a single small red sticky note in the center that is blank except for a question mark. Background a clean deep teal (#0E5C5C) studio wall with soft directional lighting from the left. Warm cream (#F5EFE0) accents. Mood: intriguing, slightly tense. Leave the top-right third clear for bold overlay text. No face in this version.
- Overlay text: THE LIST LIE
- Why it could win the click: it withholds the answer and makes the viewer need to know what the lie is.
VARIATION B
- Lever: Bold transformation / before-after
- Image prompt: A 16:9 YouTube thumbnail, 1280×720, high contrast. Split-screen composition. Left half: a chaotic desk buried in sticky notes and a 30-item paper list, muted desaturated tones, slight clutter. Right half: the same desk clean and calm with one small index card and a cup of coffee, lit in warm cream (#F5EFE0) light. A bold vertical deep teal (#0E5C5C) divider down the middle. Mood: clear before-and-after relief. Reserve the lower third for overlay text. No face.
- Overlay text: FROM 30 TO 3
- Why it could win the click: the visual promise of a dramatic, achievable simplification is instantly readable.
VARIATION C
- Lever: Big emotional face + reaction
- Image prompt: A 16:9 YouTube thumbnail, 1280×720, high contrast. Close-up of a friendly mid-30s person with short dark hair and glasses, head in hands in mock exasperation looking down at an absurdly long to-do list, eyebrows raised, half-smiling. Shot from slightly above. Background a soft-focus deep teal (#0E5C5C) room with warm cream (#F5EFE0) rim lighting on the hair and shoulders. Subject positioned on the left third, right side clear for large overlay text. Mood: relatable overwhelm with warmth.
- Overlay text: STOP MAKING LISTS
- Why it could win the click: a real human face mid-reaction is the strongest pattern-interrupt in a feed, and the expression mirrors how the viewer feels.
VARIATION D
- Lever: Clean text-led / contrarian statement
- Image prompt: A 16:9 YouTube thumbnail, 1280×720, high contrast, minimalist. Solid deep teal (#0E5C5C) background. A single large warm cream (#F5EFE0) checkbox, unticked, centered slightly left, with a thin red strike line drawn diagonally through it. Generous negative space on the right for a large bold sans-serif overlay. Crisp even lighting, no clutter, no face. Mood: confident, contrarian, clean.
- Overlay text: LISTS ARE THE PROBLEM
- Why it could win the click: the flat contrarian claim plus stark minimalism reads instantly even at thumbnail size and challenges a belief the viewer holds.
A/B test plan: run Variation C (emotional face) against Variation D (clean contrarian text) first. They are the two strongest and most different levers, so the winner tells you whether your audience clicks on faces or on bold statements. Publish C, swap to D after 48 hours, and watch click-through rate in YouTube Studio while keeping the title and publish time identical so the thumbnail is the only variable.
Four real concepts, four copy-paste image prompts, and a test plan, from one short brief.
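To judge the winner on more than a gut feeling, you can compare the two click-through rates with a little arithmetic. The sketch below uses a two-proportion z-test, which is a standard statistics technique and not part of the skill itself. The function name `compare_ctr` and the click and impression counts are made-up placeholders, so swap in your own numbers from YouTube Studio. Because the two thumbnails run one after the other rather than side by side, treat the result as a rough guide, not proof:

```python
from math import erf, sqrt


def compare_ctr(clicks_a: int, imps_a: int, clicks_b: int, imps_b: int):
    """Return (ctr_a, ctr_b, z, p_value) for two thumbnails.

    Uses a pooled two-proportion z-test with a two-sided p-value.
    """
    if imps_a <= 0 or imps_b <= 0:
        raise ValueError("impressions must be positive")

    ctr_a = clicks_a / imps_a
    ctr_b = clicks_b / imps_b

    # Pooled click rate under the assumption that both thumbnails perform the same.
    pooled = (clicks_a + clicks_b) / (imps_a + imps_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / imps_a + 1 / imps_b))
    if std_err == 0:
        raise ValueError("cannot test when nobody or everybody clicked")

    z = (ctr_a - ctr_b) / std_err
    # Standard normal CDF via erf, then a two-sided p-value.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return ctr_a, ctr_b, z, p_value


if __name__ == "__main__":
    # Hypothetical counts: Variation C vs Variation D.
    ctr_c, ctr_d, z, p = compare_ctr(120, 2000, 90, 2000)
    print(f"C: {ctr_c:.1%}  D: {ctr_d:.1%}  z={z:.2f}  p={p:.3f}")
```

A small p-value suggests the gap between the two thumbnails is unlikely to be noise. With only a few hundred impressions per thumbnail, expect the test to be inconclusive and keep both running longer.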
Why this works
A few prompt-engineering principles are doing the heavy lifting. Learn them and every prompt you write gets sharper:
- Role priming. The opening line casts the model as a thumbnail designer and an image-prompt engineer for coaches. That single instruction steers the model toward what it knows about click-through rate and how image models read prompts, rather than a generic, averaged answer. Always assign a specific role.
- Specificity in, specificity out. The skill forces concrete inputs: exact hex colors, the one emotion, your real look, the overlay words. Vague inputs (‘make it pop’) produce vague thumbnails. The named brand colors and the single-emotion field are what make all four outputs feel like one brand instead of random stock art.
- Constraints are quality control. The rules (‘3-5 words’, ‘16:9 / 1280×720’, ‘one text block only’, ‘only my colors’, ‘no invented palette’) each kill a common failure mode. Telling the model what NOT to do is as powerful as telling it what to do, and it is why the outputs are actually usable rather than pretty but unrenderable.
- Clarifying questions beat guessing. The ‘ask up to 3 questions first’ line lets the model fill gaps by asking instead of inventing. If you forget your brand colors, it asks rather than guessing magenta. That one line is among the most effective fixes for generic AI output.
- Structured forced variety. Assigning one click lever per variation (curiosity, transformation, face, text) guarantees four different ideas instead of four recolors, which is the whole point of A/B testing.
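Because the constraints are so explicit, you can also check a response against them mechanically before you render anything. This is a minimal Python sketch, assuming the model follows the labels from the OUTPUT FORMAT section ("VARIATION A" through "VARIATION D", "Overlay text:", "Why it could win the click:"). The function name `check_output` is hypothetical, and it covers only three of the rules: four variations in order, a 3-5 word overlay, and the 16:9 / 1280x720 spec.

```python
import re

# Captures the overlay text up to the next label (or the end of the block).
OVERLAY = re.compile(
    r"Overlay text:\s*(.+?)\s*(?:-\s*)?(?:Why it could win the click:|$)", re.S
)
# Accepts both "1280x720" and "1280×720".
SIZE = re.compile(r"1280\s*[x×]\s*720")


def check_output(text: str) -> list[str]:
    """Return a list of rule violations; an empty list means all checks passed."""
    problems = []

    # Splitting on a capturing group yields [preamble, "A", bodyA, "B", bodyB, ...].
    parts = re.split(r"VARIATION ([A-D])\b", text)
    labels = parts[1::2]
    if labels != ["A", "B", "C", "D"]:
        problems.append(f"expected VARIATION A-D in order, found {labels}")

    for label, body in zip(labels, parts[2::2]):
        match = OVERLAY.search(body)
        words = match.group(1).split() if match else []
        if not 3 <= len(words) <= 5:
            problems.append(
                f"VARIATION {label}: overlay has {len(words)} words, need 3-5"
            )
        if "16:9" not in body or not SIZE.search(body):
            problems.append(f"VARIATION {label}: missing 16:9 / 1280x720 spec")

    return problems
```

If the list comes back non-empty, paste the problems into the chat and ask the model to fix only those variations.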
Do this now
- Create a Custom GPT, Claude Project, or Gemini Gem and paste in the skill.
- Fill the six inputs with your next real video and send it.
- Answer any clarifying questions, then copy the 4 image prompts into your image tool.
- Render the 2 variations from its test plan, publish one, and swap to the other after 48 hours.
Pro tips
- Lock your brand colors once. Add your real hex codes to the skill’s instructions permanently so every video stays on-brand without retyping them.
- Read the overlay text on your phone, not your monitor. If you can’t read it in half a second on a small screen, ask the model for a shorter overlay.
- Keep the winning lever, change the topic. Once A/B tests show your audience clicks faces over text, ask the skill to lead with the face concept on future videos.
- Batch a month at once. Run it for five upcoming videos in one session so you walk away with twenty concepts and a clear test queue.