Practical Natural Language Processing with Python: With Case Studies from Industries Using Text Data at Scale

Ebook, 363 pages, 2 hours

Name: Practical Natural Language Processing with Python: With Case Studies from Industries Using Text Data at Scale
Author: Mathangi Sri
ISBN: 9781484262467

About this ebook

Work with natural language tools and techniques to solve real-world problems. This book focuses on how natural language processing (NLP) is used in various industries. Each chapter describes the problem and solution strategy, then provides an intuitive explanation of how different algorithms work and a deeper dive on code and output in Python.

Practical Natural Language Processing with Python follows a case study-based approach. Each chapter is devoted to an industry or a use case, where you address the real business problems in that industry and the various ways to solve them. You start with various types of text data before focusing on the customer service industry, the type of data available in that domain, and the common NLP problems encountered. Here you cover the bag-of-words model and supervised learning techniques as you work through the case studies. Similar depth is given to other use cases such as online reviews, bots, finance, and so on. As you cover the problems in these industries, you'll also cover sentiment analysis, named entity recognition, word2vec, word similarities, topic modeling, deep learning, and sequence-to-sequence modeling.

By the end of the book, you will be able to handle all types of NLP problems independently. You will also be able to think in different ways to solve language problems. Code and techniques for all the problems are provided in the book.

What You Will Learn

Build an understanding of NLP problems in industry
Gain the know-how to solve a typical NLP problem using language-based models and machine learning
Discover the best methods to solve a business problem using NLP - the tried and tested ones
Understand the business problems that are tough to solve

Who This Book Is For

Analytics and data science professionals who want to get started with NLP, and NLP professionals who want new ideas for solving the problems at hand.

Language: English

Publisher: Apress

Release date: Nov 30, 2020

Book preview

Practical Natural Language Processing with Python - Mathangi Sri

M. Sri, Practical Natural Language Processing with Python, https://doi.org/10.1007/978-1-4842-6246-7_1

1. Types of Data

Mathangi Sri¹

(1) Bangalore, Karnataka, India

Natural language processing (NLP) is a field that helps humans communicate with computers naturally. It marks a shift from the era when humans had to learn to use computers to one in which computers are trained to understand humans. It is a branch of artificial intelligence (AI) that deals with language. The field dates back to the 1950s, when a lot of research was undertaken in machine translation. Alan Turing predicted that by the early 2000s computers would understand and respond in natural language so flawlessly that you would not be able to distinguish between humans and computers. We are far from that benchmark in the field of NLP. However, some argue that this may not even be the right lens through which to measure achievements in the field. Be that as it may, NLP is central to the success of many businesses. It is very difficult to imagine life without Google search, Alexa, YouTube recommendations, and so on. NLP has become ubiquitous today.

To understand this branch of AI better, let’s start with the fundamentals. Fundamental to any data science field is data. Hence, understanding text data and its various forms is at the heart of performing natural language processing. Let’s start with some of the most familiar daily sources of text data, from the angle of commercial usage:

Reviews

Social media posts/blogs

Chat data (business-to-consumer and consumer-to-consumer)

SMS data

Content data (news/videos/books)

IVR utterance data

Search

Search is one of the most widely used data sources from a customer angle. Every search engine, whether a universal search engine or a search inside a website or an app, has indexing, retrieval, and relevance-ranking algorithms at its core. A search, also referred to as a query, is typically a short phrase of two or three words. Search engine results are approximate; they don’t need to be exactly right. For a query, multiple options are always presented as results. This user interface transfers the onus of finding the answer back to the user. Recall the number of times you have modified your query because you were not satisfied with the result. It’s unlikely that you blamed the performance of the engine. Instead, you focused your attention on modifying your query.
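
As a rough illustration of the indexing, retrieval, and relevance-ranking steps, the following minimal sketch builds an inverted index over three made-up product descriptions and ranks the matches with a simple TF-IDF style score. The documents, the function names, and the scoring formula are assumptions chosen for illustration; real search engines use far more elaborate pipelines.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, made up for illustration.
DOCS = {
    "d1": "wireless speaker with great bass",
    "d2": "smart speaker that plays music on voice command",
    "d3": "running shoes for men",
}


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def build_index(docs):
    """Indexing: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(query, index, n_docs):
    """Retrieval and ranking: score every document sharing a term with the query."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        # Rarer terms get a higher weight (inverse document frequency).
        idf = math.log(n_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


index = build_index(DOCS)
results = search("music speaker", index, len(DOCS))
print(results)
```

Note that the function returns a ranked list rather than a single answer, which mirrors the way a search interface presents several options and leaves the final choice to the user.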

Reviews

Reviews are possibly the most widely analyzed data. Since this data is openly available or easy to extract with web crawling, many organizations use it. Reviews are free flowing and highly unstructured. Review mining is core to e-commerce companies like Amazon, Flipkart, eBay, and so on. Review sites like IMDb and Tripadvisor also have review data at their core. Other organizations and vendors provide insights on the reviews collected by these companies. Figure 1-1 shows sample review data from www.amazon.in/dp/B0792KTHKK/ref=gw-hero-PC-dot-news-sketch?pf_rd_p=865a7afb-79a5-499b-82de-731a580ea265&pf_rd_r=TGGMS83TD4VZW7KQQBF3.

Figure 1-1. Sample Amazon review

Note that the above review highlights the features that are important to the user: the scope of the product (music), the search efficiency, the speaker, and its sentiment. But we also get to know something about the user, such as the apps they care about. We could also profile the user on how objective or subjective they are.

As a quick, fun exercise, look at the long review from Amazon in Figure 1-2 and list the information you can extract from the review in the following categories: product features, sentiment, about the user, user sentiment, and whether the user is a purchaser.

Figure 1-2. Extract some data from this review.

Social Media Posts/Blogs

Social media posts and blogs are widely researched, extracted, and analyzed, like reviews. Tweets and other microblogs are short and hence may seem easy to mine. However, depending on the use case, tweets can carry a lot of noise. From my experience, on average only 1 out of every 100 tweets contains useful information on a given concept of interest. This is especially true when analyzing sentiment for brands using Twitter data. In one research paper on sentiment analysis, only 20% of tweets in English and 10% of tweets in Turkish were found to be useful after collecting tweets for the topic: www.researchgate.net/profile/Serkan_Ayvaz/publication/320577176_Sentiment_Analysis_on_Twitter_A_Text_Mining_Approach_to_the_Syrian_Refugee_Crisis/links/5ec83c79299bf1c09ad59fb4/Sentiment-Analysis-on-Twitter-A-Text-Mining-Approach-to-the-Syrian-Refugee-Crisis.pdf. Hence, finding the right tweets in a corpus of tweets is key to successfully mining Twitter or Facebook posts. Let’s take an example from https://twitter.com/explore:

Night Santa Cruz boardwalk and ocean

Took me while to get settings right. .....

Camera: pixel 3

Setting: raw, 1...https://t.co/XJfDq4WCuu

@Google @madebygoogle could you guys hook me up with the upcoming Pixel 4XL for my pixel IG. Just trying to stay ah...https://t.co/LxBHIRkGG1

China's bustling cities and countryside were perfect for a smartphone camera test. I pitted the #HuaweiP30Pro again...https://t.co/Cm79GQJnBT

#sun #sunrise #morningsky #glow #rooftop #silohuette madebygoogle google googlepixel #pixel #pixel3 #pixel3photos...https://t.co/vbScNVPjfy

RT @kwekubour: With The Effortlessly Fine, @acynam x Pixel 3

Get A #Google #Pixel3 For $299, #Pixel3XL For $399 With Activation In These Smoking Hot #Dealshttps://t.co/ydbadB5lAn via @HotHardware

I purchased pixel 3 on January 26 2019 i started facing call drops issue and it is increasing day by day.i dont kn...https://t.co/1LTw9EdYzp

As you can see in this example, which displays sample tweets for Pixel 3, the content spans deals, reviews of the phone, amazing shots taken with the phone, someone awaiting the Pixel 4, and so on. In fact, if you want to understand the reviews or sentiment associated with Pixel 3, only 1 out of the 8 tweets is relevant.
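
To make the idea of looking for the right tweet concrete, here is a minimal sketch that cleans three of the sample tweets and keeps only those that look like a first-hand product experience. The cue-word list and the function names are hypothetical and hand-picked for this sketch; in practice a trained classifier would usually replace the keyword rule.

```python
import re

# Three of the sample tweets from the text, truncated exactly as in the source.
TWEETS = [
    "@Google @madebygoogle could you guys hook me up with the upcoming "
    "Pixel 4XL for my pixel IG. Just trying to stay ah...https://t.co/LxBHIRkGG1",
    "Get A #Google #Pixel3 For $299, #Pixel3XL For $399 With Activation In "
    "These Smoking Hot #Dealshttps://t.co/ydbadB5lAn via @HotHardware",
    "I purchased pixel 3 on January 26 2019 i started facing call drops issue "
    "and it is increasing day by day.i dont kn...https://t.co/1LTw9EdYzp",
]

# Hypothetical cue words hinting at a product experience (an assumption).
EXPERIENCE_CUES = {"purchased", "bought", "issue", "problem", "love", "hate"}


def clean(tweet):
    """Remove the retweet prefix, URLs, and @mentions, then lowercase."""
    tweet = re.sub(r"^RT\s+@\w+:\s*", "", tweet)
    tweet = re.sub(r"https?://\S+", " ", tweet)
    tweet = re.sub(r"@\w+", " ", tweet)
    return re.sub(r"\s+", " ", tweet).strip().lower()


def is_relevant(tweet):
    """Flag a tweet as review-like if it contains at least one cue word."""
    words = set(re.findall(r"[a-z0-9]+", clean(tweet)))
    return bool(words & EXPERIENCE_CUES)


relevant = [tweet for tweet in TWEETS if is_relevant(tweet)]
print(len(relevant), "of", len(TWEETS), "tweets look review-like")
```

A keyword rule like this is crude, but it shows why the filtering step matters: most of the corpus is discarded before any sentiment analysis begins.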

A microblog’s data can contain power-packed information about a topic. In the above example of Pixel 3, you can find the following: the most liked or disliked features, the influence of location on the topic, the change in perception over time, the impact and perception of advertisements for Pixel 3, and what kind of users like or dislike the product. Twitter can also be mined as a leading indicator for various events, for example to predict how a company’s stock price will move when there is significant news about the company. The research paper at www.sciencedirect.com/science/article/pii/S2405918817300247 describes how Twitter data was correlated with the movements of the FTSE Index (Financial Times Stock Exchange Index, an index of the 100 most capitalized companies on the London Stock Exchange) before, during, and after local elections in the United Kingdom.

Chat Data

Personal Chats

Personal chats are the everyday conversations on WhatsApp, Facebook, or any other messenger service. They are one of the richest sources of information for understanding user behavior, particularly within the friends-and-family circle. Like the Twitter data, they are filled with noise that needs to be weeded out: only a small portion of the corpus is relevant for extracting commercially useful information, which is to say the signal-to-noise ratio is low. The paper at www.aaai.org/ocs/index.php/ICWSM/ICWSM18/paper/view/17865/17043 studied openly submitted data from various WhatsApp chats. Figure 1-3 shows a word cloud of the WhatsApp groups analyzed by the paper.

Figure 1-3. WhatsApp word cloud

Also, the data privacy guidelines of messenger apps may not permit mining personal chats for commercial purposes. Some messenger apps have functionality that lets a business interact with the user, and I will cover that in the next section.

Business Chats and Voice Call Data

Business chats, also referred to as live chats, are conversations that consumers have with a business on their website or app. Typically users reach out to chat agents about the issue they face in using a product or service. They may also discuss the product before making a purchase decision. Business chats are fairly rich in information, more so on the commercial preferences of the user. Look at the following example of a chat:

A lot of information can be cleanly extracted from the above corpus: the user’s name, their problem, the fact that the user is responsive to emails and sensitive to price, the user’s sentiment, the courtesy of the agent, the outcome of the chat, the resolution provided by the agent, and the different departments of Best Telco.

Also, note how the data is laid out: the customer writes free-flowing text, but the chat agent plays a critical role in directing the chat. The initial lines describe the issue, then the agent presents a resolution, and towards the end of the chat a final answer is received and the customer expresses their sentiment (in this case, positive).

The same interaction can happen over a voice call, where a customer service representative and a user interact to solve the issue faced by the customer. Voice calls and chat data share almost all characteristics, except that in a voice call the original data is an audio file, which is first transcribed to text and then mined using text mining. At the end of the call, the customer service representative jots down a summary of the call. Referred to as agent call notes, these notes are also mined to analyze voice calls.

SMS Data

SMS is the best way to reach 35% of the world. As a channel, SMS has one of the highest open rates (the ratio of people who open a message to people who received it): five times the open rate of email (https://blog.rebrandly.com/12-sms-text-message-marketing-statistics). On average, a person in the US receives 33 messages a day (www.textrequest.com/blog/how-many-texts-people-send-per-day/). Many app companies access customers’ SMS messages and mine the data to improve user experiences. For instance, apps like ReadItToMe read out any SMS messages users receive while they are driving. Truecaller reads SMS messages and classifies them into spam and non-spam. Walnut provides a view of users’ spending based on the SMS messages they have received. From transactional SMS data alone, much user information can be extracted: the user’s income, their spending, their type of spending, their preference for online shopping, and so on. The data source is more structured if we analyze only business messages. See Figure 1-4.

Figure 1-4. A screenshot from the Walnut (https://capitalfloat.com/walnut/) app

Business messages follow a template and are more structured. Take the following SMS as an example. It carries much less noise, and the information is laid out clearly. Although different credit card companies may format their messages differently, it is still easier to extract information from them than from free-flowing customer text.

Mini Statement for Card ******1884.Total due Rs. 4813.70. Minimum due Rs.240.69. Payment due on 07-SEP-19. Refer to your statement for more details.
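
Because the message follows a template, a single regular expression is enough to turn it into structured fields. The following is a minimal sketch written for this one message layout only; the field names, the pattern, and the assumption that the two-digit year belongs to the 2000s are mine, and other issuers would need their own patterns.

```python
import re
from datetime import date

SMS = (
    "Mini Statement for Card ******1884.Total due Rs. 4813.70. "
    "Minimum due Rs.240.69. Payment due on 07-SEP-19. "
    "Refer to your statement for more details."
)

MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}

# Pattern for this specific template (an assumption, not a general parser).
PATTERN = re.compile(
    r"Card \*+(?P<card_last4>\d{4})\."
    r"\s*Total due Rs\.\s*(?P<total_due>\d+(?:\.\d+)?)\."
    r"\s*Minimum due Rs\.\s*(?P<minimum_due>\d+(?:\.\d+)?)\."
    r"\s*Payment due on (?P<day>\d{2})-(?P<month>[A-Z]{3})-(?P<year>\d{2})\."
)


def parse_statement(sms):
    """Return the fields of a mini-statement SMS, or None if it does not match."""
    match = PATTERN.search(sms)
    if match is None or match["month"] not in MONTHS:
        return None
    return {
        "card_last4": match["card_last4"],
        "total_due": float(match["total_due"]),
        "minimum_due": float(match["minimum_due"]),
        # Assumes the two-digit year refers to the 2000s.
        "due_date": date(2000 + int(match["year"]), MONTHS[match["month"]], int(match["day"])),
    }


parsed = parse_statement(SMS)
print(parsed)
```

A free-flowing customer message would simply return None here, which is the practical difference between templated business text and customer text.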

Content Data

There is a proliferation of digital content in our lives. Online news articles, blogs, videos, social media, and online books are key types of content that we consume every day. On average, a consumer spends 8.8 hours a day consuming content digitally, per https://cmo.adobe.com/articles/2019/2/5-consumer-trends-that-are-shaping-digital-content-consumption.html. The following are the key problems data scientists need to solve using text mining:

Content clustering (grouping similar)

Content classification

Entity recognition

Analyzing user reviews on content

Content recommendation
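
Several of the problems listed above, content clustering and content recommendation in particular, rest on measuring how similar two pieces of content are. The following is a minimal sketch, assuming a bag-of-words representation and cosine similarity; the headlines and function names are made up for illustration.

```python
import math
from collections import Counter

# Made-up headlines for illustration.
HEADLINES = {
    "h1": "stock markets rally as tech shares surge",
    "h2": "tech shares lead stock markets higher",
    "h3": "local team wins the football final",
}


def bag_of_words(text):
    """Count how often each lowercase token occurs."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


vectors = {key: bag_of_words(text) for key, text in HEADLINES.items()}


def most_similar(key):
    """Return the id of the other headline closest to `key`."""
    others = (k for k in vectors if k != key)
    return max(others, key=lambda k: cosine_similarity(vectors[key], vectors[k]))


print(most_similar("h1"))
```

Grouping items whose similarity exceeds a threshold gives a crude form of clustering, and suggesting the nearest item to one a user has read gives a crude form of recommendation.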

The other key data, like a user’s feedback on the content itself, is more structured: number of likes, shares, clicks, time spent, and so on. By combining the user preference data with the content data, we can understand a lot of information about the preference of the user, including lifestyle, life
