AI meeting companion: From voice to insight
Create an app that captures audio, such as lectures, and summarizes it. You build the app with OpenAI Whisper (speech to text), then summarize the transcript with the open source Llama 2 LLM hosted on IBM watsonx. You deploy the app in a serverless environment using IBM Cloud Code Engine.
4.6 (14 Reviews)
Language
- English
Topic
- Artificial Intelligence
Industries
- Information Technology
Enrollment Count
- 108
Skills You Will Learn
- Artificial Intelligence, Python
Offered By
- IBMSkillsNetwork
Estimated Effort
- 45 min
Platform
- SkillsNetwork
Last Update
- December 20, 2024
In this project, you use OpenAI's Whisper to transform speech into text. Then, you use IBM watsonx AI to summarize the transcript and find its key points. The summarization step relies on prompt engineering with LangChain's PromptTemplate. You build the user interface with Hugging Face Gradio.
The output from the LLM not only summarizes and highlights key points, it also corrects minor mistakes made by the speech-to-text model, ensuring a coherent and accurate result.
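The summarization prompt described above can be sketched as a template with a single placeholder for the transcript. This is a minimal illustration, not the project's actual prompt: the template wording and the names `SUMMARY_TEMPLATE` and `build_prompt` are assumptions, and plain `str.format` stands in for LangChain's `PromptTemplate` so the snippet runs without extra packages.

```python
# Hypothetical prompt wording; the project's real prompt may differ.
SUMMARY_TEMPLATE = (
    "The following text is a transcript produced by a speech-to-text model "
    "and may contain small recognition errors.\n"
    "Correct obvious mistakes, then list the key points.\n\n"
    "Transcript:\n{transcript}\n\n"
    "Key points:"
)


def build_prompt(transcript: str) -> str:
    """Fill the template with a whitespace-trimmed transcript."""
    cleaned = transcript.strip()
    if not cleaned:
        raise ValueError("transcript is empty")
    # Only the template is parsed for placeholders, so braces that appear
    # inside the transcript itself are inserted unchanged.
    return SUMMARY_TEMPLATE.format(transcript=cleaned)


if __name__ == "__main__":
    print(build_prompt("  Today we cover gradient descent in neural networks.  "))
```

With LangChain installed, the same template string could likely be passed to `PromptTemplate.from_template(...)` and filled with `.format(transcript=...)`; the stdlib version is used here only to keep the sketch dependency-free.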
A look at the project ahead
- Speech-to-text conversion: Use OpenAI's Whisper to accurately convert lecture recordings into text.
- Content summarization: Use IBM watsonx AI to summarize the transcribed lectures and extract key points.
- User interface development: Create an intuitive and user-friendly interface using HuggingFace Gradio, ensuring ease of use for students and educators.
- App deployment: Learn and apply the skills needed to deploy the application online with IBM Cloud Code Engine, making the tool accessible to a wider audience.
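The steps above can be sketched as a small pipeline. This is an outline under stated assumptions, not the project's code: `transcript_to_summary`, `fake_transcribe`, `fake_summarize`, and `build_interface` are hypothetical names, and the two stand-in functions take the place of Whisper and the watsonx-hosted LLM so the example runs offline. `build_interface` imports Gradio only when called and needs the `gradio` package.

```python
from typing import Callable


def transcript_to_summary(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> str:
    """Run speech-to-text, then summarization, on one audio file."""
    transcript = transcribe(audio_path).strip()
    if not transcript:
        return "No speech detected."
    return summarize(transcript)


def fake_transcribe(audio_path: str) -> str:
    # Stand-in for a Whisper call; returns canned text for illustration.
    return f"Transcript of {audio_path}"


def fake_summarize(transcript: str) -> str:
    # Stand-in for a request to the hosted LLM.
    return f"Summary: {transcript[:40]}"


def build_interface(transcribe, summarize):
    """Wrap the pipeline in a Gradio UI (requires `pip install gradio`)."""
    import gradio as gr  # imported lazily so the rest runs without Gradio

    return gr.Interface(
        fn=lambda path: transcript_to_summary(path, transcribe, summarize),
        inputs=gr.Audio(type="filepath"),  # passes the uploaded file's path
        outputs=gr.Textbox(label="Summary"),
        title="AI meeting companion",
    )


if __name__ == "__main__":
    print(transcript_to_summary("lecture.mp3", fake_transcribe, fake_summarize))
```

Passing the transcription and summarization functions as arguments keeps the pipeline testable without a model or network access. For a container deployment, the interface would typically be started with something like `build_interface(...).launch(server_name="0.0.0.0", server_port=7860)`; the host and port are example values, not settings taken from the project.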
What you'll need
Instructors
Sina Nazeri
Data Scientist at IBM
I am grateful to have had the opportunity to work as a Research Associate, Ph.D., and IBM Data Scientist. Through my work, I have gained experience in unraveling complex data structures to extract insights and provide valuable guidance.
Contributors
Fateme Akbari
Data Scientist @IBM
I'm a data-driven Ph.D. Candidate at McMaster University and a data scientist at IBM, specializing in machine learning (ML) and natural language processing (NLP). My research focuses on the application of ML in healthcare, and I have a strong record of publications that reflect my commitment to advancing this field. I thrive on tackling complex challenges and developing innovative, ML-based solutions that can make a meaningful impact—not only for humans but for all living beings. Outside of my research, I enjoy exploring nature through trekking and biking, and I love catching ball games.
Vicky Kuo
Data Scientist
I believe that success isn't just about individual milestones, but also about uplifting and encouraging others to reach their potential. This is why I'm passionate about combining my technical background with my eagerness to help people overcome technological hurdles and accelerate growth. When I’m not on the job, I love hiking with my two dogs or relaxing in a coffee shop. There's nothing better than having an insightful conversation over coffee, or even better, some volunteer work! Please feel free to reach out to me on LinkedIn.