Text Analytics 101

From social media to news articles to machine logs, text data is everywhere. This class will teach you about Information Extraction: how to extract structured data from text in order to derive valuable insights.

TA0105EN

Beginner course

Language

  • English

Topic

  • Text Analytics

Organization

  • CognitiveClass

Estimated Effort

  • 6 hours

About This Course

From social media to news articles to machine logs, text data is everywhere. This class will teach you about Information Extraction: how to extract structured data from text in order to derive valuable insights. You will learn about information extraction applications in various domains, such as social media, healthcare analytics, and financial risk analysis. You will explore common text analytics tasks, including entity, relation, and event extraction, as well as sentiment analysis. Finally, you will dive into "Declarative Information Extraction", a powerful method for doing high-performance and high-quality text analytics, and gain hands-on experience writing your own extractors.


What will I get after passing this course?


Course Syllabus

  • Module 1 - Getting to Know Information Extraction

  • Module 2 - Limitations in Information Extraction

  • Module 3 - Getting to Know SystemT

  • Module 4 - Information Extraction with AQL

  • Module 5 - AQL Basics

  • Module 6 - Advanced AQL

  • Module 7 - Declarative Information Extraction and the SystemT Optimizer

  • Module 8 - Best Practices


General Information

  • This course is free.
  • It is self-paced.
  • It can be taken at any time.
  • It can be taken as many times as you wish.

Recommended skills prior to taking this course

  • None
 

Course Staff

Yunyao Li

Yunyao Li is a Principal Research Staff Member and Senior Research Manager at IBM Almaden Research Center, where she manages the Scalable Knowledge Intelligence department. She is also a Master Inventor and a member of the IBM Academy of Technology. Her expertise is in the interdisciplinary areas of natural language processing, databases, human-computer interaction, and information retrieval. She is a founding member of SystemT, a state-of-the-art NLP system currently powering multiple IBM products and numerous projects. She received her PhD and master's degrees from the University of Michigan, Ann Arbor, and undergraduate degrees from Tsinghua University, Beijing, China. You can read about Yunyao's inspiring story from small-town China to Silicon Valley here. Follow her on Twitter @yunyao_li.

 

Laura Chiticariu

Laura Chiticariu is the Chief Architect of Watson Knowledge and Language Foundation, with technical leadership responsibilities over Watson Natural Language Understanding, Watson Knowledge Studio and Watson Knowledge Graph. Laura is a core member of the SystemT R&D team, and strongly believes in the notion of "Transparent NLP": leveraging machine learning techniques while ensuring that the NLP system remains transparent, that is, easy to comprehend, debug and enhance. She holds a Ph.D. in Computer Science, and has taught NLP at universities within and outside the U.S.

 

Marina Danilevsky

Marina Danilevsky is a Research Staff Member in the Scalable Knowledge Intelligence group at IBM Almaden Research Center and a core member of the SystemT R&D team. She works at the intersection of data analytics, text mining, natural language processing, information networks, and human-computer interaction. She holds a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign and a B.S. in Mathematics from the University of Chicago.

 

Huaiyu Zhu  

Huaiyu Zhu is a Research Staff Member in the Scalable Knowledge Intelligence group at IBM Almaden Research Center. His main research focus is on text analytics, natural language processing, machine learning and statistical information processing.

 

 

Atsushi Ono  

Atsushi Ono is a software engineer at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. After several years of experience on business partner technical enablement missions, he has been working as a front-end developer on various projects, including contributing to the open source Dojo Mobile project. He has worked on the development of IBM Watson Knowledge Studio since the project’s inception.

 

Yuka Nomura

Yuka Nomura is a software engineer working on front-end development of IBM Watson Knowledge Studio at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. She has contributed to user interface design and product development since the start-up of her very first project. She also specializes in programming applications that run on communication robots such as Pepper.

 

Chikako Oyanagi

Chikako Oyanagi is a front-end software developer for IBM Watson Knowledge Studio at Tokyo Software & Systems Development Lab (TSDL), IBM Japan.

 

 

Teruki Tauchi

Teruki Tauchi is a front-end software developer for IBM Watson Knowledge Studio at Tokyo Software & Systems Development Lab (TSDL), IBM Japan. He joined IBM after obtaining a Master of Engineering degree in Computer Science from University College London in 2015.