Alan Turing, often considered the father of artificial intelligence, asked in 1950 “Can machines think?”, and a great deal has happened since then. Artificial intelligence (AI) has found applications in many areas, and the audiovisual sector is no exception. Although it is not yet correct to say that AI performs many functions completely autonomously, progress has been made in semi-automated image processing and in voice recognition and processing.
In the audiovisual field, AI is already used for a variety of functions and, given its current growth, is expected to cover even more in the future. For now, though, it is more accurate to speak of semi-automated processes than of fully autonomous tasks. AI can be a great support for editors and artists in repetitive tasks such as cataloging, archiving and documenting audiovisual content, making these processes faster and less tedious.
The key to an artificial intelligence is its training. Basically, a human must supply the machine with data, teaching it which output to return for a given input. Training an AI means giving it specific, sufficient and detailed information on a topic so that, once it has learned from that information, it knows what output to generate for future inputs.
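The idea of teaching a machine which output goes with which input can be illustrated with a minimal sketch. It assumes a very simple technique (a nearest-neighbour classifier) and invented data: the two numeric features per clip and the labels "landscape" and "interview" are hypothetical and stand in for whatever a real system would measure. Production systems use far larger datasets and more sophisticated models.

```python
from math import dist

def train(examples):
    # For a nearest-neighbour model, "training" is simply storing
    # the labelled examples supplied by a human.
    return list(examples)

def predict(model, features):
    # Return the label of the stored example closest to the new input.
    nearest = min(model, key=lambda example: dist(example[0], features))
    return nearest[1]

# Hypothetical training data: (feature vector, label chosen by a human).
# The features could be, say, (share of outdoor frames, share of speech).
training_data = [
    ((0.9, 0.1), "landscape"),
    ((0.8, 0.2), "landscape"),
    ((0.3, 0.9), "interview"),
    ((0.2, 0.8), "interview"),
]

model = train(training_data)
print(predict(model, (0.35, 0.95)))  # a clip the model has never seen
print(predict(model, (0.95, 0.05)))
```

The model can only answer in terms of the labels it was given, which is why the quality and coverage of the training data matter so much.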
The concern about a machine replacing a human in a field such as audiovisual production, where creativity plays such an important role, is understandable. But while AI can automate certain tasks, it cannot replace the role of the creative: creativity and intuition are human characteristics that AI cannot yet replicate. What AI can do is take on the more repetitive tasks within the audiovisual sector, which can be of great support to the editor or artist. For example, cataloging, archiving and documenting audiovisual content can be faster and less tedious thanks to AI.
In archival search for audiovisual creation, AI can generate a large amount of information thanks to algorithms that recognize faces, detect logos and brands, read tags, segment the material by the people who appear or speak in it, transcribe voice to text, extract keywords, categorize content, and more. These algorithms allow AI systems to process and organize large volumes of data efficiently, facilitating the work of industry professionals.
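To make the benefit for archival search concrete, here is a small sketch of how metadata of this kind might be organized into a searchable index. Everything in it is an assumption for illustration: the clip names, the people, the brand and the keywords are invented, and the metadata is written by hand rather than produced by any real recognition tool.

```python
from collections import defaultdict

# Hypothetical metadata, as if produced by face, logo and keyword extraction.
clips = {
    "clip_001.mp4": {"faces": ["Ana Perez"], "logos": ["Acme"],
                     "keywords": ["election", "interview"]},
    "clip_002.mp4": {"faces": ["Luis Soto"], "logos": [],
                     "keywords": ["football", "interview"]},
    "clip_003.mp4": {"faces": ["Ana Perez"], "logos": ["Acme"],
                     "keywords": ["press conference"]},
}

def build_index(clips):
    # Map every extracted term (lower-cased) to the clips it appears in.
    index = defaultdict(set)
    for clip_id, metadata in clips.items():
        for terms in metadata.values():
            for term in terms:
                index[term.lower()].add(clip_id)
    return index

def search(index, *terms):
    # Return the clips tagged with every one of the given terms.
    matches = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*matches)) if matches else []

index = build_index(clips)
print(search(index, "Ana Perez", "Acme"))
print(search(index, "interview"))
```

Once the metadata exists, finding every clip that shows a given person next to a given brand becomes a lookup rather than hours of manual viewing.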
On the other hand, the development of artificial intelligence for voice processing has advanced so much that it can now be used very effectively in the audiovisual industry. This does not mean that transcriptions or subtitles will be free of errors; for optimal results, the audio material delivered to the AI must be of high quality, and the system must have been previously trained for this type of content. This includes training it on each person’s voice, the grammar used and the usual vocabulary. For most AI voice-processing systems, recognizing the names of people, places or institutions remains a critical weak point: the system is currently unable to recognize a word it does not already know.
When a new technology arrives, questions arise such as “What now? What about my job? Is artificial intelligence going to put us out of a job?” The answer is no. The technology is not ready to be autonomous: all data extracted with AI has an error rate of between 3% and 10%. Continuous human supervision is therefore required to correct the data generated by the system and to add relevant information that cannot be automated.
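A quick calculation shows what that error rate means in practice. The sketch below takes the 3% to 10% range from the text; the archive size of 10,000 extracted items is a hypothetical figure chosen only for illustration.

```python
def expected_corrections(item_count, low_pct=3, high_pct=10):
    """Return (low, high) estimates of items a human would need to correct."""
    low = round(item_count * low_pct / 100)
    high = round(item_count * high_pct / 100)
    return low, high

# Hypothetical archive: 10,000 tags, names or transcribed phrases.
low, high = expected_corrections(10_000)
print(f"Of 10,000 extracted items, expect roughly {low} to {high} to need correction.")
```

Even at the low end of the range, hundreds of items would need a human eye, which is why supervision remains part of the workflow.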
The main difference between AI-assisted work and work done by a human alone is a large gain in productivity and a reduction in the time required. We should see artificial intelligence as something that came to help: a powerful process-optimization tool, not a system to replace humans in the creative process. Instead of fearing the technology, it is more constructive to see it as an ally that can free professionals from monotonous tasks, allowing them to concentrate on the more creative and valuable aspects of their work.