This is a guest blog by Aditya Singhal, a veteran technopreneur and Director and Executive Producer at Charuvi Design Labs (CDL).
Sometime in the middle of last year, we started talking about Midjourney, an AI that generates images from textual prompts. Initially, I was a tad dismissive; these days, only about 1% of technology leads to any true betterment in our always-"on" lives. But as an engineer from a very research-focused institution in the US, I have always been accustomed to adopting and learning about new technology, and the Luddite in me gave way to exploring what was to become the current pop-culture phenomenon of generative AI. As I write this, short films made with generative AI have started coming out, with even the voices being "generated."
To look at the microcosm of the animation and VFX world through this lens, we spent a good ten-odd months exploring the various generative AI technologies we could get our hands on and seeing what implications arise, be they good or bad. We approached it from a standpoint of positive gain and used many of the results in our in-house, non-commercial projects, purely to understand the tools and equip ourselves better.
In these past months, hundreds of such technologies have emerged, with many being adopted into mainstream products, the most prominent being from Adobe and Microsoft. Newer ones keep emerging, so we cannot claim to have scoured all that is on offer, and it would be safe to assume that adopting all of them, or even most, is a bit ambitious.
I should also add a disclaimer: our observations and learnings are from a user/industry perspective, not a technology developer's perspective. This is deliberate, because most articles written about AI come from the development community itself and are not really industry- or job-function-specific. And, as with all technologies, some industries are affected more than others, so we take the specific case of our firm, an animation studio.
We embarked on our journey of exploration with the most obvious and logical application: concept art. This is where generative AI has been most widely used and reported across media. Images of futuristic Indians or heroic renditions of villagers in rural settings have drawn a lot of curiosity and awe, and rightly so. Short animated films have appeared with all frames "generated." It would be prudent to think that for concept art this is astonishingly helpful. However, we found massive limitations: the core principle of generative AI is that prior "information" is fed in, and this places a definite ceiling on creativity. Of the more than 1,000 designs we generated through advanced prompting, about 10% were useful, and even those required a lot of further touching up. Similarly, using face capture and some simple animation hocus-pocus, one could create 5-10 second animated videos of superb quality. But for production-grade, consumer-friendly, and commercially viable animation, the current technology is either not adequate or not cost-effective enough to yield sufficient returns. We have now started using some AI-generated imagery in commercial work, but only as the result of a few unique creative use cases.
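To give a sense of how a thousand-plus candidate designs can come out of "advanced prompting," here is a minimal sketch of a prompt-permutation loop. The descriptor lists and template below are illustrative stand-ins, not our actual production prompts, and the output can be fed into whichever text-to-image service you use:

```python
# A minimal sketch of a prompt-permutation workflow for concept-art
# exploration. All descriptors here are hypothetical examples.
from itertools import product

subjects = ["a village elder", "a street vendor", "a young farmer"]
styles = ["heroic oil-painting style", "retro-futuristic concept art", "watercolour storyboard"]
lighting = ["golden-hour light", "overcast monsoon sky", "neon night-market glow"]

template = "{subject} in a rural Indian setting, {style}, {light}, highly detailed"

# Every combination of subject x style x lighting becomes one prompt.
prompts = [
    template.format(subject=s, style=st, light=l)
    for s, st, l in product(subjects, styles, lighting)
]

# 3 x 3 x 3 = 27 variations here; growing each list to roughly ten
# entries is what pushes the candidate count past 1,000.
for p in prompts[:3]:
    print(p)
```

A workflow like this is also what makes the 10% yield figure meaningful: the cost per candidate is near zero, so the real expense sits in reviewing and touching up the keepers.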
I assume that generative AI will soon be able to produce 3D models, an essential part of the animation process. That would give modelers a starting point and let them make better models in the same time or less, which implies better creative output for animation features and production. The same can be said for rigging, lighting, texturing, and more. There are already fairly disruptive gains to be seen in writing and story development. So far, though, we see this as a "cogitare" (a term the AI helped us arrive at, from the Latin for "to think"), meaning an assistant to cognition. What we see AI doing in our field in the short term is acting as a peon, assistant, secretary, EA/PA, but one that is also a librarian and guide, doing some of the heavy lifting for our brains and getting us to the end goal sooner. It also implies that those with a better command of language and text will use it better. We noticed exactly this: those with better language skills, more experience, and high demonstrated aptitude used the technology more effectively and arrived at better results faster.
At CDL, we have been prudent in using AI and have been able to adopt it across departments such as accounting, HR, and legal, as well as in some core functions. Recently, a technical problem found its solution in ChatGPT-4, despite the model lacking the last two years' worth of training data. It is not that the solution wouldn't have been found eventually; it's just that the AI added speed to the process and let us get from question mode to execution mode faster.
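For readers curious what "question mode to execution mode" looks like in practice, here is a hedged sketch of such a query using the OpenAI Python library's chat interface. The role framing and the question are hypothetical placeholders, not our actual problem or tooling:

```python
# A sketch of querying a chat model for a technical diagnosis.
# Assumes the OpenAI Python library and an OPENAI_API_KEY in the
# environment; the prompts below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are a senior pipeline TD at an animation studio."},
        {"role": "user",
         "content": "Our render-farm jobs fail intermittently with <error>. "
                    "What should we check first?"},
    ],
)

# Print the model's ranked suggestions for the team to act on.
print(response.choices[0].message.content)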
At a macro level, another massive benefit for low-income or emerging markets is that lesser-skilled talent will produce better outputs. Text-to-image generation is easier to teach than drawing, and this allows a plethora of people beyond those actively engaged in the industry to make content of varying quality, examples of which are already evident on social media. Imagine skill development at a level where, with only medium-level language skills, higher-paying skillsets can be imparted, especially those linked to services. We intend to use this at CDL for training and development, and we are seeing the same at other animation and gaming firms.
That being said, the strides made are indeed stupendous, and I am sure some form of disruption will arrive, and very soon. However, I believe generative AI will be helpful to organizations, and more jobs will be created within our industry than destroyed, because what it assists with is essentially the mundane, clerical part of our jobs, which should hopefully lead to better job satisfaction. As the pendulum swings, I asked ChatGPT for another word for someone who is a Luddite but perhaps a little fake, and it gave me some nice ones: pretendite, pseudo-Luddite, and Luddite-poser. Many people could have thought these up, but it saved a ton of time. This is indeed the second Turing moment.