Constructing Visual Prompts for AI Video Generation
An Artistic and Practical Guide Based on “Memories from My Grandma’s House”
In Q4 2025, AI filmmaking expanded into generating sound and video at the same time. In this article, we will look at a few examples from my experiments, and I will document the approach behind them.
My hope is that this creative technology research helps you in your next project.
A Poetic Approach to AI Filmmaking
At this stage of AI video generation, the process feels closer to writing poetry than directing a film. Each prompt becomes a verse: rhythm, tone and even what you choose not to say shape the outcome.
To explain this, here’s an old anecdote:
A man orders coffee without milk. The waiter replies, “We’re out of milk, but we do have cream. Would you like your coffee without cream instead?”
It’s absurd, but it makes one thing clear: what you omit changes the result just as much as what you include.
Prompting works the same way.
Starting with curiosity
I have been experimenting mostly for fun and, as always, I began with what I genuinely wanted to see.
Recently, I’ve enjoyed observing old couples and how they slowly become similar over the years in a gentle, familiar way. They accept each other fully. They know they are on the same team.
visual prompt: “old couple walking in the street and with all the years they spend together you can tell that they are together by their fashion choices and their eyes is familiar and look alike”
Writing for AI is not about control. You may not get what you intended. Let it surprise you.
To understand the impact of a prompt, I generate at least twice. In this case, three generations helped me see how the model interpreted the same visual prompt differently each time.
Moving toward structured experiments
Later on, I moved into more structured experiments to investigate new ways of prompting in late 2025.
Let me show you how I build the visual prompts that guide my projects.
By the end of this article, you will have the full documentation of a creative technology research project that began with sound-driven experiments and grew into a cinematic method for reconstructing memory as AI art.
🍽 Dinner Experiment in Different Cities
I tested how a single word, “dinner”, can change across cities: Amsterdam, İstanbul and Paris.
A one-word change, in this case the location, can transform the entire vision of a scene. It’s an abracadabra moment where the model shows you what it “thinks” dinner looks and sounds like.
Each video began with a simple phrase and was generated with Sora 2 Pro and ElevenLabs voice models with accents.
The structure was consistent:
Simple visual prompt → Dinner in [city]
Voice cue → short phrase in the local language (e.g., “Tot ziens”)
The goal was to see how AI understands short sentences and familiar concepts, how the “average inside of the model” shapes what we see and hear.
Sometimes less instruction shows more about the system itself.
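The two-part structure above can be sketched in a few lines of Python. This is only an illustration, not the setup used for the videos: the names DinnerTest and build_requests are hypothetical, and it builds plain text rather than calling any video or voice tool. “Tot ziens” comes from the example above; the phrases for İstanbul and Paris are placeholders assumed for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DinnerTest:
    """One city test: a minimal visual prompt plus a short voice cue."""

    city: str
    voice_cue: str  # short phrase in the local language

    @property
    def visual_prompt(self) -> str:
        # The visual prompt stays deliberately minimal: only the city changes.
        return f"Dinner in {self.city}"


# "Tot ziens" is the example given in the text; the other two phrases are
# placeholders assumed for this sketch.
TESTS = [
    DinnerTest("Amsterdam", "Tot ziens"),
    DinnerTest("İstanbul", "Görüşürüz"),
    DinnerTest("Paris", "Au revoir"),
]


def build_requests(tests):
    """Return one plain dict per city, ready to paste into a generation tool."""
    return [
        {"visual_prompt": t.visual_prompt, "voice_cue": t.voice_cue}
        for t in tests
    ]


if __name__ == "__main__":
    for request in build_requests(TESTS):
        print(request)
```

Keeping the city as the only variable makes it easier to attribute differences between the videos to the location word rather than to changes in wording.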
✂️ What does “good small talk” even mean?
After the dinner tests, I became curious about something more delicate: small talk.
I often find myself trying to make sense of daily conversations between people: the half sentences, the pauses, the background noise. So I tested how AI handles these ordinary human rhythms.
visual prompt: “A hairdresser trims someone’s hair on a quiet afternoon. The scissors click softly between words about the weather, a trip, a friend’s new job. The salon hums with dryers and faint radio noise. The air smells like spray and small talk.”
These early tests explored how AI interprets sound, rhythm and everyday language, setting the groundwork for later experiments.
They also introduced a deeper layer, the social soundscape: the way emotion hides in background noise, half sentences and ordinary moments.
These experiments showed me that realism can be more surprising than abstraction and that atmosphere often lives in the details we barely notice.
A good example of this is in Nuri Bilge Ceylan’s films, where small talk carries unspoken tension. People speak around the truth rather than directly at it.
This curiosity naturally led me to explore something more personal: “Memories from My Grandma’s House”, a series rooted in intimate recollections.
I wanted to explore how emotionally precise language (like a form of mixed-media poetry) could guide AI toward warmth and belonging.
From Abstract Tests to Personal Worlds
After exploring urban and generic scenes, the next stage was to move toward rooted memory. This project reconstructs sensory fragments from my grandmother’s home in Gönen, Balıkesir, Türkiye.
The focus shifted from the “average” generated by the model to the intimate: What happens when AI is asked to remember something personal, something specific?
One of my first experiments was a memory: (1) applying kına (henna) beside my grandmother, (2) next to the soba, the wood-burning stove my grandmother called maşinga.
The goal was to see if technology can reflect a sense of care, touch and slowness: qualities that usually belong only to memory.
Below, you can see how small changes in the visual prompt influenced the scene. In this phase, I was simply explaining to my AI assistant what I liked and what I wanted to explore further.
Each visual prompt followed a five-layer structure, combining poetic intuition with conceptual control:
The result is a script for feeling rather than for performance.
Building on the Conceptual Foundation
Together, these memories form a microcosmic portrait of domestic life: a world rebuilt through the language of memory rather than the syntax of cinema.
Remember, in cinema, directors don’t construct an entire universe; they give us just enough scenes for us to believe it exists.
In my case, I had two memories, and each corresponds to a specific sensory cue:
Kına by the maşinga (henna beside the wood-burning stove) → warmth, smell, intimacy
Hamur Yemek (eating dough) → rhythm, touch, repetition
The process was iterative and guided by results, not rules.
In an early draft, I mentioned a TV in the background, but it added unnecessary complexity. The narration from the TV somehow “knew” what was happening in the room, which made no sense. Removing it simplified the space.
visual prompt: “Inside a small house in Gönen, Balıkesir, warm light flickers from the soba. A child sits nearby, watching the slow swirl of henna in a small bowl. The air smells like wood smoke and color. Laughter moves softly through the room, along with the hum of an tv. Outside, the daylight rests quiet on the street”
The Final Memory Scenes
With more experiments, I realized that ending each prompt with a short “Visual vision” phrase created consistency across separate memories.
🌿 Kına by the Maşinga
visual prompt: “Inside a small house in Gönen, Balıkesir, a girl and her grandmother sit beside the soba. They prepare kına, spreading it gently on their hands. The stove glows softly; the air smells of warmth and color. There is no radio, only the quiet crackle of fire and the sound of breath. Light moves slowly across the walls, and time feels patient. Visual vision: warm domestic light, tactile realism, soft rhythm, handheld intimacy”
🌾 Hamur Yemek
visual prompt: “Morning light filters through the window of the same house in Gönen, Balıkesir. By the soba, the grandmother kneads dough on a wooden tray. The sound is steady — press, fold, rest. Flour drifts through the air like pale dust. A girl sits nearby, stealing a small piece to taste, smiling. Outside, the yard waits in quiet light. Visual vision: warm domestic light, tactile realism, soft rhythm, handheld intimacy“
Together, these scenes felt like two pages from the same memory: separate moments, held by the same light.
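The shared closing phrase in both prompts lends itself to a small helper. The sketch below assumes nothing about the video tool itself; with_visual_vision is a hypothetical name, and the function only appends the same “Visual vision” line to any scene description so that separate memories end on identical style cues.

```python
# The style phrase shared by both final prompts.
VISUAL_VISION = (
    "warm domestic light, tactile realism, soft rhythm, handheld intimacy"
)


def with_visual_vision(scene: str, vision: str = VISUAL_VISION) -> str:
    """Append the shared "Visual vision" phrase to a scene description."""
    scene = scene.strip()
    # Make sure the scene ends as a sentence before the suffix is added.
    if not scene.endswith((".", "!", "?")):
        scene += "."
    return f"{scene} Visual vision: {vision}"


if __name__ == "__main__":
    # Shortened scene descriptions, used here only for illustration.
    scenes = [
        "A girl and her grandmother sit beside the soba, preparing kına.",
        "By the soba, the grandmother kneads dough on a wooden tray",
    ]
    for scene in scenes:
        print(with_visual_vision(scene))
```

Holding the phrase in one constant means a change of mood, for example cooler light, only has to be written once and then applies to every scene in the series.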
Sharing the Results with Care
After all these experiments, I had two final versions of the scenes. The next morning, I woke up and showed them to my mom. We watched them together on my laptop.
In the end, AI filmmaking isn’t really about technique. Sometimes it’s about creating something small that connects you to someone you love.
These tiny experiments can feel more personal than you expect.
AI filmmaking becomes a dialogue between memory and computation.
The AI doesn’t dream: it reassembles averages.
Our job is to teach it to remember, softly.
If you try these methods, I hope you’ll also share your results with someone: a friend, a parent or even me.
I’m genuinely curious to see what you create.
Ayça Turan
ayca.tech
December 2025, Türkiye


