- Hollywood Tech Nerds
Is Generative AI Impacting TV Production???
PLUS: The Sound and Vision of "The Wild Robot"
Happy Election Day Hollywood tech nerds! Go vote!
In this week’s post: another entry in “Any Details???” on generative AI in TV production, plus the sound and vision of The Wild Robot.
Is Generative AI Impacting TV Production???
It’s time for another entry in my ongoing series “Any Details???” For newcomers, this is where I dig into industry trade stories on AI “usage” in the business to see if they contain any concrete examples of the specific ways it is being used, or just vague wishcasting.
Today we have yet another Variety article, this one titled “3 Ways Gen AI is Having an Early Impact in TV Production.” Let’s take a look and see if these are actual things!
1. AI Voice: AI voices have some early usefulness as they gain naturalism. Dubbing with synthetic voices is gaining some traction for “lower-stakes” content or platforms, such as localizing news or sports clips for YouTube or programming for FAST channels. In these cases, faster turnaround is the priority to extend audience reach for content that wouldn’t otherwise have been dubbed.
This is true: you will find plenty of AI-generated assets on all kinds of “lower-stakes” content throughout the world of streaming. The inconsistent quality of an AI dub for a low-budget reality or news show matters less than making the content available to a previously excluded audience.
This use goes beyond voice dubbing. A source of mine who worked for one of the AVOD streamers tells me that some of the content they aired would consistently be accompanied by AI-generated captions, typically SRT rips from sources like YouTube’s auto-captioning service. The quality was always questionable and rarely QC’d. “Lower-stakes” material is also likely to get AI-generated artwork, as seen below for the colorized version of Nosferatu on Amazon Prime.
Sad to be in a world where Nosferatu is considered low-stakes, but here we are!
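For the nerds in the audience: the SRT rips described above are plain text files of numbered cues, each with a `start --> end` timestamp line followed by the caption text, separated by blank lines. As an illustrative aside (the `qc_srt` helper and the 25-characters-per-second threshold below are my own sketch, not anything a streamer actually runs), even a minimal automated pass could catch the most glaring problems in an un-QC’d caption file:

```python
import re

# Minimal SRT sanity check: flags overlapping cues, non-positive durations,
# and implausible reading speeds -- the kind of basic QC that auto-generated
# caption rips often skip. Illustrative sketch only, not a production tool.

TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")

def to_ms(stamp):
    """Convert an SRT timestamp like '00:01:02,500' to milliseconds."""
    h, m, s, ms = (int(x) for x in TIME.match(stamp).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms

def qc_srt(text, max_cps=25):
    """Return a list of warning strings for an SRT document given as a string."""
    warnings = []
    prev_end = 0
    # Cues are separated by blank lines: index, "start --> end", text lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            warnings.append(f"malformed cue: {lines[0]!r}")
            continue
        start_s, end_s = (p.strip() for p in lines[1].split("-->"))
        start, end = to_ms(start_s), to_ms(end_s)
        if start < prev_end:
            warnings.append(f"cue {lines[0]} overlaps previous cue")
        if end <= start:
            warnings.append(f"cue {lines[0]} has non-positive duration")
        else:
            chars = sum(len(l) for l in lines[2:])
            cps = chars / ((end - start) / 1000)  # reading speed, chars/sec
            if cps > max_cps:
                warnings.append(f"cue {lines[0]} reads at {cps:.0f} cps")
        prev_end = max(prev_end, end)
    return warnings
```

Run over a real rip, checks like these would flag exactly the overlaps and firehose reading speeds that make unreviewed auto-captions painful to watch.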
2. Face-swapping: Deep learning models are very powerful for rendering complex or subtle face modifications. The early potential here is proving out in lip-sync dubbing and face-swapping to achieve de-aging and other effects.
AI lip-sync tools, such as those offered by Flawless and MARZ’s LipDub AI, can synchronize an actor’s mouth and facial movements with a dubbed speech track. This year, major Hollywood studios are testing lip-sync dubbed content, as VIP+ reported in May 2024. The hope is that it pays off by giving audiences a more immersive experience of foreign-language content, making it feel like the viewer is watching content in their native language.
Face-swapping can also be used for cosmetic touch-ups or to change entire face structures to age or de-age actors and more. The same tools have also been discussed to open up editing to eliminate reshoots by allowing actors to redo lines remotely.
I’m somewhat skeptical of the actual value of these tools; my admittedly cranky opinion is that they’ll never fully cross the Uncanny Valley, and that long-term they’ll look worse and worse as the human eye becomes better trained to spot these augmentations.
Just look at the trailer for Here: does this de-aging look good to you now? Will it age well (no pun intended) in 5-10 years? I know what young Tom Hanks looks like, and it’s not this!
3. Video generation: Video generation is rapidly advancing, and studios and some filmmakers are clearly interested in retaining and using these models as production tools. Yet for studio productions, it’s still unclear what it would mean professionally to use video generation in a workflow, including who is best situated to directly use it. Figuring out how to maximize these tools will be an internal focus in the coming months. Any studio productions that make use of video generation would likely work directly with AI company teams to get desired outputs from the tools.
Still, it’s important to recognize these systems have important differences versus traditional camerawork, VFX or animation. The differences appear in their photo and physics realism versus the real world, consistency or continuity and controllability. A critique of text-to-video is it’s effectively a “slot machine,” in that the output can look excellent but may not fit the exact need for a specific scene or production. Different techniques will have different capabilities, such as video-to-video, recently launched by Runway for Gen-3 Alpha.
Ahh yes, at last we have arrived at the requisite meaningless AI hype. This is the key phrase: “it’s still unclear what it would mean professionally to use video generation in a workflow.”
OK… so why is this in an article titled “3 Ways Gen AI is Having an Early Impact in TV Production”? Sounds like in this particular case it’s not having any impact at all. A “slot machine” that “may not fit the exact need for a specific scene or production” doesn’t seem particularly useful! If something were having an impact, wouldn’t there be an example to share?
Disruptive tech innovations in film and TV tend to be self-evidently useful. Think of the first time you saw the digital effects in Terminator 2 or Jurassic Park. It was immediately clear that these were impactful, industry-shifting developments. Nobody needed to breathlessly hype them. They created the hype themselves!
Conversely, consider the infamous Livia Soprano head used in the third season of The Sopranos.
Even at the time this aired, it was at the very least considered a distracting decision, if not totally detrimental to the storytelling of the episode. Anyone insisting to you that it looked good, against the evidence of your own eyes, would be disqualified from being taken seriously on the subject.
My requirement for generative AI claims in entertainment is very simple: show me something that looks good and is useful. Until then, shut up!
The Sound and Vision of The Wild Robot
Prompted by numerous TikToks of sobbing audience members watching The Wild Robot, I checked it out for myself on PVOD. Did I weep like a baby? That’s none of your business! I’m much too professional to have been moved to tears by a film and then filled with melancholy for days afterward!
My alleged emotional breakdowns aside, I particularly loved The Wild Robot’s visual style, which abandons the CGI animation style of past DreamWorks films and creates an impressionistic, painted look. As described in Animation Magazine:
The Wild Robot… render[s] Roz and the natural world around her in impressionistic tones with painterly light and soft edges. The environments, [director Chris Sanders] says, were created without geometries and were completely hand-painted. In fact, every character and surface in the movie is hand-painted, except for Roz when she first arrives on the island at the start of the film.
“[Roz] has a CG surface like we’re used to in the very beginning, and as she progresses through the film, that surface is now being replaced with brush strokes because she’s getting dented and scratched and she’s getting mildew and she’s getting a patina, and she’s beginning to belong to the island,” Sanders says. “I believe we have upward of 30 different versions of Roz.”
Something tells me prompting “robot in the wilderness in the style of Monet” in the Sora slot machine hundreds of times would not ever result in the film’s beautiful look. The marriage of computer animation and human artistry is irreplaceable.
So too with The Wild Robot’s amazing sound design, as described in IndieWire:
Developing Roz’s sonic identity involved a lot of collaboration, not the least from Nyong’o herself. “We started doing early tests playing with vocal processing and seeing if we could, as part of her arc, have [Roz] go from sounding more robot-y and mechanical to sounding more human,” Lefferts said.
As part of that exploratory process, the sound team listened to early recordings of Nyong’o’s performance, applied some processing, and sent it to the voice recording team to play with the next day — something that Lefferts, who has worked on animated films from “Coraline” to “The Super Mario Bros. Movie,” had never gotten to do before. But playing with vocal processing was unnecessary; Nyong’o could convey the journey with her voice alone.
Even a film about a robot needed something computers can never generate: human touch!
Kernels (3 links worth making popcorn for)
Here’s a round-up of cool and interesting links about Hollywood and technology:
Disney’s App Store change is bad news for Apple. (link)
The broken promise of USB-C. (link)
Inside yet another AI event disaster. (link)