Gemini Now Supports Audio File Analysis and Transcription

Arkadiy Andrienko

Google's Gemini AI assistant is becoming more versatile. Users can now upload not just text, images, and videos to the chatbot, but also audio files. This latest update significantly broadens the assistant's practical applications for everyday tasks.

The new feature allows for direct uploads of files in MP3, WAV, or M4A formats, opening up several useful scenarios for work and school. Gemini can now quickly transcribe a lecture or interview, generating an accurate text copy. It can also analyze a long meeting recording or podcast and provide a concise summary, distilling only the key takeaways and decisions to save the user time.

There are, however, some limitations that depend on your account type. On the free version of Gemini, audio processing is limited to 10 minutes per file, with a cap of five such requests per day. Subscribers to the paid Google AI Pro and Ultra plans get significantly higher limits and can analyze audio clips up to three hours long.
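For readers who want to check a recording against these limits before uploading, here is a minimal Python sketch. It only encodes the figures reported in this article (MP3, WAV, or M4A; 10 minutes and five requests per day on the free tier; three hours on paid plans). The names `PlanLimits`, `FREE`, `PAID`, and `check_audio_upload` are hypothetical helpers invented for illustration and are not part of any Google API. No daily cap is given here for paid plans, so the sketch leaves it unset.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

# Formats named in the article.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a"}


@dataclass(frozen=True)
class PlanLimits:
    """Hypothetical container for the limits quoted in the article."""
    max_minutes: float
    max_requests_per_day: Optional[int]  # None: no figure given in the article


FREE = PlanLimits(max_minutes=10, max_requests_per_day=5)
PAID = PlanLimits(max_minutes=180, max_requests_per_day=None)  # three hours


def check_audio_upload(filename: str, duration_minutes: float,
                       requests_today: int, plan: PlanLimits) -> List[str]:
    """Return a list of problems; an empty list means the upload fits the limits."""
    problems = []
    if Path(filename).suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported format")
    if duration_minutes > plan.max_minutes:
        problems.append("audio too long for this plan")
    if (plan.max_requests_per_day is not None
            and requests_today >= plan.max_requests_per_day):
        problems.append("daily request cap reached")
    return problems


if __name__ == "__main__":
    # A 45-minute lecture is too long for the free tier but fits a paid plan.
    print(check_audio_upload("lecture.mp3", 45, 0, FREE))
    print(check_audio_upload("lecture.mp3", 45, 0, PAID))
```

The actual limits are enforced by Gemini itself and may change, so treat this as a convenience check rather than a guarantee.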

The update goes beyond audio. You can now upload batches of files to the chat, including entire code folders from GitHub (up to 5,000 files) and ZIP archives containing up to 10 items. The total size of the data uploaded in a single request is also capped.

This upgrade is another step in Google's strategy to build a unified ecosystem of smart assistants, gradually infusing its products with AI features. Deeper integration between Gemini and other Google products is expected in the future, promising an even smoother experience for tackling daily tasks.
