Roll is a new company that aims to use artificial intelligence to revolutionize how people use iPhones in video production, specifically talking-head videos and remote interviews.
What’s especially interesting and novel about Roll is that it uses cloud-based AI technology to take the video data captured by an iPhone and make it look like a professional production, complete with simulated slider, dolly, and jib movements. Roll also creates simulated bokeh to help make footage appear more cinematic.
Before diving into technology and performance, it’s crucial to understand why Roll exists and what the team hopes to accomplish with its new app.
There are two common scenarios for remote video creation: interview and video call participants use Zoom or another video conferencing app with built-in webcams, or people have a full-blown production setup with expensive and complicated audio and video equipment.
The first setup is easy and accessible, but performance varies widely with webcam quality and network speed. The second is often convoluted, not user-friendly, and pricey.
Roll aims to replicate the performance of the second option while offering the accessibility and ease of use of a typical video conferencing app.
“Video production is ripe for disruption from the cloud. Large file sizes, complex processing, and multi-person review cycles make it the perfect candidate to have outsized benefits from cloud and AI technologies like large-scale storage, compute, and real-time sharing and collaboration,” says Roll founder and CEO, Faizan Buzdar.
How Does Roll Work?
Ahead of today’s announcement, PetaPixel chatted with Michelle Oh and Faizan Buzdar from Roll about the team’s new app. Part of that discussion included a live demonstration using a pre-release version of Roll within Apple TestFlight on iOS and a beta version of the web client on a laptop.
What was immediately striking about Roll is how easy it is to use. It’s good to go after installing the app and creating an account. Someone must be hosting a call on Roll, but participants don’t need a computer; they can use just their phones.
However, combining a phone with a computer makes it easier to see what’s happening, check settings, and download the resulting videos.
Regarding those videos, Roll uses AI to create an automatic edit, although users can customize camera movements and add AI-generated bokeh to create a different final video. Users can also download the source footage to edit the videos themselves. Source files are organized by participant and, in the case of an iPhone with wide and telephoto cameras, by camera type.
For users with multi-camera iPhone models, like the iPhone 14 Pro, both cameras record simultaneously and send the data to the cloud, where Roll processes each camera stream. Users can preview their framing on the web app, which is especially helpful since it’s impossible to see the iPhone’s display when using its rear cameras.
Aside from that, there’s not much else for participants to worry about. A meeting host or designated producer can see all the camera feeds and communicate with each participant accordingly.
However, recording a “slate” is integral to the app’s AI power. A slate is when a user walks out of frame, comes back into frame, says their name, counts to three, and claps. The app automatically prompts users to do this. The slate helps the app identify the user and analyze a room’s audio qualities. This data helps teach Roll to improve over time. The app isn’t trained using other user content.
The slate also helps Roll recreate a user’s room in 3D, which is what enables the app to move the camera around despite the iPhone being stationary.
Roll’s Results are Impressive
While PetaPixel tested a pre-release version of Roll, its results are already impressive. The app automatically generates a VFX sample video, showcasing the app’s available movements that people can use when editing a final video. There’s also a nearly-instant AI-powered edit ready for sharing, complete with different social media aspect ratios.
This is a big deal and a significant part of Roll’s appeal as a one-stop shop for high-quality video content. While it’s great to have manual control over pan, dolly, and crane shots, for many users, being able to record an interview and publish it across multiple online platforms within minutes of finishing the call is massive.
Ahead of today’s release, Roll met with many content creators on social media, film students, and even production teams. While professionals knew that Roll wasn’t using high-end video rigs to capture footage, people universally didn’t believe the results were from an iPhone.
That’s the power of AI and Roll’s mission — replacing expensive, hard-to-use gear with smartphones that many people have in their pockets and using AI to make the smartphone outperform its typical results. And the results definitely achieve that, with the potential to become even better over time.
Looking Ahead — What’s in the Works?
The Roll team is working hard on additional features for its mobile and web apps, including an easy-to-use text-based editor. This will allow users to cut lines, change camera angles, and add movements and visual effects in a single click. Teams will be able to collaborate in the editing process in real-time.
A Fascinating Use of AI
AI has undergone a significant transformation in public perception over the past year or two. People went from being excited about AI to feeling increasingly concerned about AI’s influence over daily life and its potential to be used nefariously.
To Roll’s credit, while AI is driving the app, AI isn’t being used to replace anything, but rather to draw the most out of an iPhone’s cameras and enable users to create high-quality content more easily.
While Roll technically does a bit of fancy generative work to simulate a user’s room, including for its panning and dolly shots, which rely on the app creating background information that the phone’s camera(s) never captured, that work pales in comparison to generative AI like Midjourney and what Adobe is doing in the latest version of Photoshop.
Sure, Roll does use AI to synthesize something, but it’s almost entirely using AI to take what’s already there and make it look more polished and cinematic.
While it makes sense that people are worried about how AI can be used, primarily how it can be used to fundamentally change something or even make someone look different or say things they never said, that’s not something Roll does or, per its creators, something it will ever do. The Roll team wants to enable people to create better content without spending a lot of money and buying new gear.
It’s also easy to see Roll’s potential for communicating with people in remote locations or places too dangerous to send production teams. For example, imagine having a video interview with a journalist on the ground in a warzone. If the journalist has an iPhone, it’s easy to quickly capture high-quality footage and use Roll’s AI technology to add a very robust coat of polish to the content.
People don’t have infinite time, money, or know-how to deal with creating on-the-fly video productions, especially not remotely, and Roll hopes not to let these constraints limit the look and feel of end-user content.
“Roll is fundamentally changing the way we create videos today by automating the entire process — we can get you from recording to publishing in 15 minutes. We’re excited to get it out in the world to see what incredible videos creators and businesses will make with it. We believe that Roll could become the future of video production,” adds Buzdar. Roll’s founding team comes from industry leaders including Box, GoPro, Technicolor, Epic Games, and Twitter.
Pricing and Availability
Notably, users don’t need an iPhone. People can participate in a Roll meeting from the web, although that user won’t get all the same fancy treatment as iPhone-captured footage. Users can also hook up fancy mirrorless cameras to their computers and use those.
There’s potential for AI to be used to help make footage and audio from many different types of sources look and sound more cohesive. There’s also some potential for the app to become more fully featured, including the ability to use text chat and perhaps even share files for call participants to view.
Concerning iPhone compatibility, any iPhone that can run the latest version of iOS is compatible, whether it has telephoto cameras or not. However, iPhone models with better cameras and multiple camera modules will have access to the full suite of VFX and result in better-looking content. Sufficient light is also essential for the app to work its magic, although users in dim environments can still participate in Roll video calls, albeit with a warning that their content won’t look as nice.
Although much of the app’s magic is happening not on a user’s device but in the cloud, network performance is surprisingly not much of a limiting factor. Thanks to fancy under-the-hood technology, the team has done a great job eliminating any signs of buffering or audio echoes. Despite a brief issue with a local network during the demo, the app didn’t miss a beat.
Additional “Creator” ($49/month on an annual plan) and “Pro” ($199/month on an annual plan) tiers offer users more editing freedom, recording hours, effects, and storage. If users in these tiers exceed their recording limits, extra hours are available at added cost.
Based on the pricing, which Roll says is subject to change based on data, it’s clear that Roll is primarily targeting users who have traditionally hired professional high-end video production teams. The company says it’s focused on the “prosumer” market, including businesses as well as independent creators, podcasters, and influencers.
Users are encouraged to use some sort of tripod or stand for their phone, but it’s not required. A stand that allows a camera to be mounted on or near the computer monitor is recommended.
Image credits: Roll