Key Features
- Get to know seven algorithms for your data science needs in this concise, insightful guide
- Ensure you’re confident in the basics by learning when and where to use various data science algorithms
- Learn to use machine learning algorithms in just 7 days
Book Description
Machine learning applications are highly automated and self-modifying, and they continue to improve over time with minimal human intervention as they learn from more data. To address the complex nature of various real-world data problems, specialized machine learning algorithms have been developed that solve these problems effectively. Data science helps you gain new knowledge from existing data through algorithmic and statistical analysis.
This book addresses problems of accurate and efficient data classification and prediction. Over the course of 7 days, you will be introduced to seven algorithms, along with exercises that will help you learn different aspects of machine learning. You will see how to pre-cluster your data to make classification more efficient on large datasets. You will then find out how to predict future values from the existing trends in your datasets.
This book covers the following algorithms: k-Nearest Neighbors, Naive Bayes, Decision Trees, Random Forest, k-Means, Regression, and Time Series analysis. On completion of the book, you will understand which machine learning algorithm to pick for clustering, classification, or regression, and which is best suited to your problem.
What you will learn
- Find out how to classify data accurately using Naive Bayes, Decision Trees, and Random Forest to solve complex problems
- Identify a data science problem correctly and devise an appropriate prediction solution using Regression and Time Series analysis
- See how to cluster data using the k-Means algorithm
- Get to know how to implement the algorithms efficiently in Python and R
About the Author
Dávid Natingga graduated from Imperial College London in 2014 with an MEng in Computing, specializing in Artificial Intelligence. In 2011, he worked at Infosys Labs in Bangalore, India, researching the optimization of machine learning algorithms. In 2012 and 2013, at Palantir Technologies in Palo Alto, USA, he developed algorithms for big data.
In 2014, as a data scientist at Pact Coffee, London, UK, he created an algorithm suggesting products based on the taste preferences of the customers and the structures of the coffees. As part of his journey to use pure mathematics to advance the field of AI, he is a PhD candidate in Computability Theory at the University of Leeds, UK. In 2015, he spent 8 months at the Japan Advanced Institute of Science and Technology as a research visitor.
Table of Contents
- Classifying from k-Nearest Neighbors
- Naive Bayes – choosing the most probable class
- Decision Trees
- Random Forest – forests of decision trees
- k-Means – dividing a dataset into k groups
- Regression – learning models as functions
- Time Series – learning time-dependent models
- Appendix A: Python & R reference
- Appendix B: Statistics
- Appendix C: Glossary of Algorithms and Methods in Data Science