Mastering Classification Algorithms for Machine Learning: Learn how to apply Classification algorithms for effective Machine Learning solutions (English Edition)
About this ebook
The book starts with an introduction to problem-solving in machine learning and subsequently focuses on classification problems. It then explores the Naïve Bayes algorithm, a probabilistic method widely used in industrial applications. The application of Bayes Theorem and underlying assumptions in developing the Naïve Bayes algorithm for classification is also covered. Moving forward, the book centers its attention on the Logistic Regression algorithm, exploring the sigmoid function and its significance in binary classification. The book also covers Decision Trees and discusses the Gini Factor, Entropy, and their use in splitting trees and generating decision leaves. The Random Forest algorithm is also thoroughly explained as a cutting-edge method for classification (and regression). The book concludes by exploring practical applications such as Spam Detection, Customer Segmentation, Disease Classification, Malware Detection in JPEG and ELF Files, Emotion Analysis from Speech, and Image Classification.
By the end of the book, you will become proficient in utilizing classification algorithms for solving complex machine learning problems.
Mastering Classification Algorithms for Machine Learning - Partha Majumdar
CHAPTER 1
Introduction to Machine Learning
Welcome to this book.
In this book, we will explore models for classifying data. We need to classify data for various purposes. For example, from piles of data regarding credit card transactions, we need to find out whether any transaction is fraudulent. So, essentially, we are classifying the data into two classes: good transactions and fraudulent transactions. As another example, from data regarding pictures of food items, we may need to figure out whether a food item would suit a diabetic patient.
Human beings are experts in classification in most situations. However, in the modern world, the volume of data to classify is far too large for humans alone. So, we need machines that can classify as effectively as humans, making it practical to meet the demand.
This book will discuss various models with which machines can effectively classify data. Before we discuss these classification models, we start with a discussion of what machine learning is. We will also explore how machines can be made to learn.
Structure
In this chapter, we will discuss the following topics:
Machine learning
Traditional programming versus programming for machine learning
The learning process of a machine
Kinds of data machines can learn from
Types of machine learning
Supervised learning
Unsupervised learning
Objectives
After reading this chapter, you will be able to differentiate between traditional programming and programming for machine learning. You will also understand the different kinds of problems that can be solved by machine learning.
Machine learning
Neuroscientist Warren S. McCulloch and logician Walter H. Pitts published A Logical Calculus of the Ideas Immanent in Nervous Activity in 1943 in the Bulletin of Mathematical Biophysics, Vol. 5. In this paper, they discussed a mathematical model of neural networks. This was the first attempt to make machines think like the human brain. ¹
What it means to be able to think is a vast subject. We can make a simple abstraction, as shown in Figure 1.2: thinking is a process of collecting data, finding patterns in the data, and making inferences from the patterns.
1 https://www.cse.chalmers.se/~coquand/AUTOMATA/mcp.pdf.
Figure 1.2: Abstraction of how thinking is performed
Let us discuss the process of thinking through an example. Suppose the data provided to us is a massive pile of medicines. On receiving this data, we could find patterns, such as which medicines are similar to each other. We may study the composition of the medicines, their manufacturers, and many other attributes. Based on the patterns we find, we may classify the medicines into groups according to which disease each group cures.
Machine learning is like this. We present data to the machine and sometimes provide information about the data. Based on this data and information, the machine finds patterns and formulates them as its rules. Once the machine has formulated its rules, it can make inferences about new situations.
Machine learning is a branch of Artificial Intelligence (AI). In machine learning, using mathematical modeling on data, a machine is made to learn the patterns in the data without any human intervention.
Traditional programming versus programming for machine learning
Programming for machine learning is different from traditional programming.
In traditional programming, we have data and rules. We apply the rules to the data to get the Output. Refer to Figure 1.3:
Figure 1.3: Traditional Programming
Consider this example from the world of Physics. When we want the computer to calculate the value of momentum, we tell the computer that the formula for momentum is mass multiplied by velocity, and we give the computer the values of mass and velocity. Here, the values of mass and velocity are the Data. To this data, the computer applies the Rule, that is, the formula for momentum, to find the value of momentum for us. The value of momentum calculated by the computer is the Output.
Momentum = Mass * Velocity
Generally written as,
Momentum = mv
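As a minimal sketch in Python, the traditional-programming setup looks like this: the programmer writes the rule, and the computer only applies it to the data.

```python
# Traditional programming: the programmer encodes the Rule, and the
# computer simply applies it to the Data to produce the Output.
def momentum(mass, velocity):
    return mass * velocity  # Rule: Momentum = Mass * Velocity

# Data: mass = 7 kg, velocity = 8 km/h
print(momentum(7, 8))  # Output: 56 (kg * km/h)
```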
In contrast to traditional programming, in machine learning, we supply the computer with Data and Output, and we expect the computer to generate the Rules as shown in Figure 1.4:
Figure 1.4: Programming for Machine Learning
Suppose we had a mechanism to obtain values of momentum from some experiment, and we knew the values of mass and velocity in each of the experiments. Now, if we want the computer to determine the formula for momentum, that would be a machine learning situation. So, we would input the values of mass, velocity, and momentum and ask the machine to determine the formula for calculating momentum.
The learning process of a machine
Let us discuss a simplified view of how machines learn. As you would imagine, the actual process is much more complex.
Consider that we have the following data (refer to Figure 1.5) from an experiment. We ask the machine to provide a relationship between momentum, mass, and velocity.
Figure 1.5: Input to a computer to create a Machine Learning Model
For the machine to build a model, the data scientist must tell the machine what model to make. Generally, the data scientist first tries to understand the data; this step is called Exploratory Data Analysis. In the preceding situation, we have two independent variables, m and v, and one dependent variable, M. We can plot this data on a 2-dimensional chart as shown in Figure 1.6:
Figure 1.6: Scatter plot based on data in Figure 1.5
Let us say that the data scientist decides to create a linear model of the form M = β0 + β1 * m + β2 * v. The machine needs to estimate the values of β0, β1, and β2.
The data scientist provides starting values for β0, β1, and β2. Let us say these values are β0 = 5, β1 = 5, and β2 = 5. Using these values, the machine calculates values for M, as shown in Figure 1.7. We call the value calculated by the machine Mhat.
Figure 1.7: Initial estimates of Momentum (M) made by the machine
If we plot this data, we get the chart shown in Figure 1.8, where the dots are the actual values of M as provided to the machine. The stars are the values of M estimated by the machine:
Figure 1.8: Plot of the machine's initial Momentum (M) estimates
We can see that the machine did not do so well. However, the machine continues: it calculates its error in making the estimates, as shown in Figure 1.9. We see that the machine can overestimate or underestimate, so the error can be negative or positive. Instead of considering the value of the error, we consider the square of the error. Further, we calculate the mean squared error (MSE) across all the data points by averaging the squares of the errors.
Figure 1.9: Computation of error in estimates made by the machine
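The error calculation described above can be sketched in Python. The mass, velocity, and momentum values below are invented for illustration; the actual data in Figure 1.5 is not reproduced here.

```python
import numpy as np

# Hypothetical experimental data (not the values from Figure 1.5)
m = np.array([2.0, 3.0, 5.0, 7.0])     # mass
v = np.array([4.0, 6.0, 3.0, 8.0])     # velocity
M = np.array([8.1, 17.9, 15.2, 55.8])  # measured momentum

# Starting values of the coefficients for M = b0 + b1*m + b2*v
b0, b1, b2 = 5.0, 5.0, 5.0
Mhat = b0 + b1 * m + b2 * v            # the machine's initial estimates

errors = M - Mhat                      # can be negative or positive
mse = np.mean(errors ** 2)             # mean squared error
print(mse)                             # large: the initial guesses are poor
```

With the starting values β0 = β1 = β2 = 5, the squared errors are large, which is exactly why the machine must go on to search for better coefficient values.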
Now, the machine tries other values of β0, β1, and β2 so that the value of the MSE is minimized. After some rounds of calculation, the machine arrives at the values of β0, β1, and β2 shown in Figure 1.10:
Figure 1.10: Estimate of Momentum after minimizing MSE
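The text does not specify how the machine searches for better coefficient values; gradient descent is one common approach. The sketch below uses invented data (not the values from Figure 1.5) and shows the MSE shrinking as the coefficients are updated:

```python
import numpy as np

# Hypothetical data: momentum generated from the physics formula M = m * v
m = np.array([2.0, 3.0, 5.0, 7.0])
v = np.array([4.0, 6.0, 3.0, 8.0])
M = m * v

# Starting values, as in the text
b0, b1, b2 = 5.0, 5.0, 5.0
lr = 0.01  # learning rate
for _ in range(20000):
    Mhat = b0 + b1 * m + b2 * v
    err = Mhat - M
    # Update each coefficient along the negative gradient of the MSE
    b0 -= lr * 2 * err.mean()
    b1 -= lr * 2 * (err * m).mean()
    b2 -= lr * 2 * (err * v).mean()

final_mse = np.mean((b0 + b1 * m + b2 * v - M) ** 2)
print(final_mse)  # much smaller than the starting MSE, but not zero
```

Because momentum is not a linear function of mass and velocity, even the best coefficients for this linear model leave a residual error, which is why the data scientist tries a different model next.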
The estimates, though better, are still not reasonable. So, the data scientist considers another strategy. This time, the data scientist asks the computer to find a relationship between m*v and M; that is, the machine should create an equation of the form M = β0 + β1 * m * v. As in the earlier case, the data scientist gives initial values β0 = 5 and β1 = 5.
The setup is shown in Figure 1.11:
Figure 1.11: New setup
The machine tries to minimize the MSE for this setup and calculates the values of β0 and β1, as shown in Figure 1.12:
Figure 1.12: Estimate of Momentum after minimizing MSE for the new model devised in Figure 1.11
The machine has done much better. Let us plot this data and check (Refer to Figure 1.13):
Figure 1.13: Plot of new estimates made by the machine. The RED crosses are the estimates
So, the machine has given us a formula for calculating momentum based on the data provided to the machine. According to the machine:
Momentum = 2.46214616625777 + 0.991020053873143 * Mass * Velocity
Now, for any new value of Mass and Velocity, say Mass = 7 kg and Velocity = 8 km/h, the machine would say that:
Momentum = 2.46214616625777 + 0.991020053873143 * 7 kg * 8 km/h = 57.95926918 kg * km/h
This is pretty good as, according to the formula from physics, the value of momentum for Mass = 7 kg and Velocity = 8 km/h should be 56 kg * km/h.
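A sketch of this second strategy using scikit-learn follows. The data below is synthetic, generated here from the physics formula with a little noise, so the fitted coefficients will differ from the ones quoted in the book.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic "experimental" data: M = m * v plus small measurement noise
rng = np.random.default_rng(0)
mass = rng.uniform(1, 10, 50)
velocity = rng.uniform(1, 10, 50)
momentum = mass * velocity + rng.normal(0, 0.5, 50)

# Single engineered feature m*v, so the model is M = b0 + b1 * (m * v)
X = (mass * velocity).reshape(-1, 1)
model = LinearRegression().fit(X, momentum)

print(model.intercept_, model.coef_[0])  # intercept near 0, slope near 1
print(model.predict([[7 * 8]])[0])       # near the physics value 56
```

The fitted intercept lands near 0 and the slope near 1, mirroring the book's result that the learned formula approximates Momentum = Mass * Velocity.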
Kinds of data the machines can learn from
From nature, human beings can gather data through the five sense organs. We can see, hear, smell, taste, and feel. Out of these five types of data, human beings have been able to digitize what they see and hear. Likewise, machines can also understand data from images and sounds.
Human beings have created a lot of digital data from various activities we perform. This data is either structured or unstructured.
Structured data is organized in tabular form and follows definite semantics. It is by far the type of data most processed by machines. As of 2022, about 80% of the data machines learn from is structured data. Machines are extremely good with structured data. They are also very useful for working on structured data, as humans fail to cope with its volume. Examples of structured data can be found in any system where transactions are conducted. For example, the data regarding credit card transactions is structured. In a credit card system, millions of transactions are performed daily, and tasks like detecting fraudulent transactions are extremely difficult for human beings. So, here, machines are best suited for the job.
Unstructured data is a more recent phenomenon, which has exploded mainly due to social media. Unstructured data has no definite semantics, so such data must be given some semantic representation before machines can work on it. Over the years, many representations of unstructured data have emerged, allowing machines to work efficiently on such data. Examples of unstructured data include tweets and newspaper articles; images and audio/video clips are also unstructured data.
We can also categorize data as semi-structured, which contains portions of both structured and unstructured data. For example, an email has structure in that it contains structured information regarding the date it was sent, who sent it, whom it was sent to, what the subject is, whether it has attachments, and so on. However, the body of the email contains unstructured data. As machines work well with both structured and unstructured data, they work well with semi-structured data too.
No matter the type of data, it must be understood that machines can only work on numbers. So, any data a machine needs to understand must be presented to it as numbers. In this book, we will discuss various techniques for converting non-numeric data to numbers without losing context, allowing machines to learn from the result. These discussions are spread across the remaining chapters as we work through the different problems to be solved by machines.
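As a small taste of such techniques, here is one common approach, one-hot encoding with pandas. This is only an illustration on a made-up column; later chapters may use different encodings.

```python
import pandas as pd

# A toy categorical column: one-hot encoding replaces the text category
# with one 0/1 indicator column per category value
df = pd.DataFrame({'Cloudy': ['YES', 'NO', 'YES']})
encoded = pd.get_dummies(df, columns=['Cloudy'])
print(encoded)  # columns Cloudy_NO and Cloudy_YES, purely numeric
```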
Types of machine learning
Machine learning can be classified into two main types. They are Supervised learning and Unsupervised learning.
In Supervised learning, we can perform two tasks: regression and classification.
In Unsupervised learning, we can do two tasks: clustering and dimensionality reduction.
There is a special case of Clustering tasks called Anomaly Detection.
Figure 1.14 summarizes all types of machine learning and the tasks that can be performed:
Figure 1.14: Types of machine learning
Some people also consider Reinforcement learning to be a third type of machine learning, while others argue that Reinforcement learning is approximate dynamic programming.
Let us discuss each type of machine learning in more detail. However, this book focuses on the classification task, a type of supervised learning.
Supervised learning
In Supervised learning, the machine is provided data along with labels. The machine learns based on the data and the associated labels and then makes inferences. So, we are providing the machine with prior knowledge, and then after the machine learns from this knowledge, it can make decisions within the boundaries of this provided knowledge.
Labels are the analysis of the data as determined by humans. For example, if we want the machine to learn to differentiate between images of dogs and cats, we need to provide images of dogs and cats to the machine, along with labels stating which are images of dogs and which are images of cats. Suppose we want the machine to predict the marks in an exam. In that case, we need to provide historical data along with labels stating how many marks were obtained under the circumstances described in the data.
The bottom line in Supervised learning is that we provide existing knowledge to the machine and expect the machine to find patterns in the provided knowledge and make rules that the machine can use to answer future questions asked on the same subject.
Let us understand this with an example. Consider that we want the machine to be able to detect spam emails. So, we gather the data as shown in Table 1.1:
Table 1.1 : Example dataset of emails for spam detection
In this example, the dataset in Table 1.1 contains only 6 data points. In real situations, datasets have thousands or even millions of data points. Nevertheless, the dataset contains data in 4 variables: Contains spelling mistakes, Contains the word Urgent, Contains the word ASAP, and Contains a link to click. In machine learning parlance, these variables are called Independent Variables. Each data point has a value for these four variables. In normal circumstances, experts would have studied real emails and gathered these four characteristics for each email. Apart from collecting data regarding the characteristics of the emails, the experts would also assign a label stating whether each email is benign or spam. The variable we refer to as the label is also called the dependent variable in machine learning parlance.
In Supervised learning, the machine forms patterns from the independent variables while considering the associated dependent variable. From the patterns emerges a rule that the machine will use when given new values for the independent variables.
The preceding example is a Classification problem where the machine needs to decide whether an email is benign or spam. This type of Classification problem is called a Binary classification problem, as the machine must decide between two options or classes.
There are classification problems where the machine needs to choose between more than two classes. Such classification problems are called Multi-class classification problems.
Implementation of classification on the example data provided in Table 1.1 is as follows:
import pandas as pd
df = pd.DataFrame([['NO', 'NO', 'NO', 'YES', 'Benign'],
['NO', 'NO', 'NO', 'NO', 'Benign'],
['YES', 'NO', 'YES', 'NO', 'Spam'],
['NO', 'YES', 'YES', 'YES', 'Spam'],
['YES', 'NO', 'NO', 'YES', 'Benign'],
['YES', 'YES', 'YES', 'YES', 'Spam']
],
columns = ['ContainsSpellingMistakes', 'ContainsUrgent', 'ContainsASAP', 'ContainsLink', 'Label']
)
df
ContainsSpellingMistakes ContainsUrgent ContainsASAP ContainsLink Label
0 NO NO NO YES Benign
1 NO NO NO NO Benign
2 YES NO YES NO Spam
3 NO YES YES YES Spam
4 YES NO NO YES Benign
5 YES YES YES YES Spam
X = df.drop('Label', axis = 1, inplace = False)
y = df['Label']
print(X, '\n\n', y)
ContainsSpellingMistakes ContainsUrgent ContainsASAP ContainsLink
0 NO NO NO YES
1 NO NO NO NO
2 YES NO YES NO
3 NO YES YES YES
4 YES NO NO YES
5 YES YES YES YES
0 Benign
1 Benign
2 Spam
3 Spam
4 Benign
5 Spam
Name: Label, dtype: object
from sklearn.preprocessing import LabelEncoder
# Convert all data to numbers
# Note: the same encoder is refit on each column in turn; this works here
# because every column has the same two categories, NO and YES
leX = LabelEncoder()
XL = X.apply(leX.fit_transform)
leY = LabelEncoder()
yL = leY.fit_transform(y)
print(XL, '\n\n', yL)
ContainsSpellingMistakes ContainsUrgent ContainsASAP ContainsLink
0 0 0 0 1
1 0 0 0 0
2 1 0 1 0
3 0 1 1 1
4 1 0 0 1
5 1 1 1 1
[0 0 1 1 0 1]
from sklearn.linear_model import LogisticRegression
# Build Model
lr = LogisticRegression()
lr.fit(XL, yL)
# Prepare Test Data
testData = ['NO', 'YES', 'NO', 'YES']
Xtest = leX.transform(testData)
prediction = lr.predict(Xtest.reshape(1, -1))
print('Prediction =', leY.inverse_transform(prediction))
Prediction = ['Benign']
Take another example. Suppose we have the temperature of a city, say Bengaluru, for every day over many years. We have three attributes, that is, the date, whether it was cloudy on that date, and the temperature on that date, as shown in Table 1.2:
Table 1.2 : Example dataset of temperatures in a city
Suppose we have this data from 01-Jan-2001 till 31-Dec-2015, and we want to know what the temperature will be on 25-Oct-2022. We should be able to predict it using a machine learning system with a Regression model. In Regression problems, we need historical data for all the independent variables and the associated dependent variable. In the example in Table 1.2, the date and whether it was cloudy are the independent variables, or features. From some independent variables, we can derive many more. For example, from the feature date in Table 1.2, we can derive other independent variables like the month and the day of the year. So, instead of using the date itself as an independent variable, we could use the month and the day of the year as our features. Generating independent variables, or features, from existing independent variables is called Feature Engineering (refer to Table 1.3).
Table 1.3 : Feature Engineered dataset of temperatures in a city
The temperature is the dependent variable, or target variable. Given this data, we want the machine to learn the patterns and create a rule. Then, given any future date and whether it is cloudy, the machine should predict the temperature on that day. So, this is also Supervised learning.
A regression implementation on the example data provided in Table 1.2 is as follows:
import pandas as pd
df = pd.DataFrame([['01-01-2001', 'YES', 14.3],
['01-02-2001', 'NO', 13.7],
['01-03-2001', 'NO', 13.6],
['01-04-2001', 'YES', 14.3],
['01-05-2001', 'NO', 14.2],
['01-06-2001', 'YES', 12.8],
['01-07-2001', 'NO', 14.7],
['01-08-2001', 'NO', 11.3],
['01-09-2001', 'NO', 11.7],
['01-10-2001', 'NO', 12.1],
],
columns = ['Date', 'Cloudy', 'Temperature']
)
df
Date Cloudy Temperature
0 01-01-2001 YES 14.3
1 01-02-2001 NO 13.7
2 01-03-2001 NO 13.6
3 01-04-2001 YES 14.3
4 01-05-2001 NO 14.2
5 01-06-2001 YES 12.8
6 01-07-2001 NO 14.7
7 01-08-2001 NO 11.3
8 01-09-2001 NO 11.7
9 01-10-2001 NO 12.1
import datetime
import numpy as np
from sklearn.preprocessing import LabelEncoder
# Feature Engineering
# Get Month and Day of the Year
df['Month'] = pd.to_datetime(df['Date']).dt.month
referenceDate = np.array([datetime.datetime(2001, 1, 1)] * len(df))
# Days elapsed since 01-Jan-2001 (for 2001 dates, the day of the year minus 1)
df['DayOfYear'] = (pd.to_datetime(df['Date']) - referenceDate).dt.days
# Convert Cloudy to numbers
leC = LabelEncoder()
df['Cloudy'] = leC.fit_transform(df['Cloudy'])
df
Date Cloudy Temperature Month DayOfYear
0 01-01-2001