Book Review: Cleaning Data for Effective Data Science

Author: David Mertz
Publisher: Packt
Publication Date: February 5, 2021
Publication Link
Prerequisites: Some Python or R

Disclaimer: The publisher sent me a copy of this book for review. I promise that everything said here is my own opinion regardless. All reviews at the Cross Trained Mind are open and honest.

About This Book

This is a crucial book thanks to the deluge of data we currently have and use in our software applications. It looks at both structure and content issues with data of various types and the pros and cons of methods to clean it enough to be useful.

Who Is This For?

This is a useful book for anyone who imports data into their application, which is a good number of us. Given the Python and R code in the book, it’s good to have some knowledge and experience of one of these languages, but that’s about all you need to know. I personally recommend that anyone earning a computer science degree work through this book around the same time they learn about data structures and algorithms.

Organization

The overall organization of this book follows a standard data pipeline that you might set up for your application, covering the cleansing issues you might need to resolve along the way. This is, in my opinion, a great way to set up the book. Each chapter presents its topics, then exercises, and then a summary. All in all, this is a well-organized book.

Did This Book Succeed?

I believe that the author did a tremendous job on a difficult and large topic. This is one of the most time-consuming and least talked about portions of any data pipeline for data science and machine learning tasks. It is also one of the most important. Anyone working in data or AI needs to read through this book and learn how to implement its processes in order to have cleaner and therefore more useful data.

Rating and Final Thoughts

I give this book a 5 out of 5.

It is useful, timely, and well organized. While it may not be set up like a cookbook or reference, it can be used as such. It is well suited as a textbook either for a course or self-study. It should be in anyone’s personal library, ready to be pulled when there is data to be cleansed.
