Statistics Is Easy: Case Studies on Real Scientific Datasets
Computational analysis of natural science experiments often confronts noisy data due to natural variability in environment or measurement. Drawing conclusions in the face of such noise entails a statistical analysis.
Parametric statistical methods assume that the data is a sample from a population that can be characterized by a specific distribution (e.g., a normal distribution). When that assumption is true, parametric approaches can lead to high-confidence predictions. In many cases, however, the assumed distribution does not hold, and assuming it anyway may yield false conclusions.
The companion book Statistics is Easy! gave a (nearly) equation-free introduction to nonparametric (i.e., no distribution assumption) statistical methods. The present book applies data preparation, machine learning, and nonparametric statistics to three quite different life science datasets. We provide the code as applied to each dataset in both R and Python 3. We also include exercises for self-study or classroom use.
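To give a flavor of what "no distribution assumption" can mean in practice, here is a minimal Python 3 sketch of a shuffle (permutation) test for a difference in group means. It is an illustration only and is not taken from the book: the function name `permutation_test`, its parameters, and the two small samples are hypothetical, made-up values chosen for the example.

```python
import random


def permutation_test(group_a, group_b, num_shuffles=10000, seed=0):
    """Estimate a one-sided p-value for mean(group_b) - mean(group_a).

    The labels are shuffled repeatedly; the p-value is the fraction of
    shuffles whose difference in means is at least as large as the
    observed one. No distribution is assumed for the data.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    observed = sum(group_b) / len(group_b) - sum(group_a) / n_a

    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        shuffled_a = pooled[:n_a]
        shuffled_b = pooled[n_a:]
        diff = sum(shuffled_b) / len(shuffled_b) - sum(shuffled_a) / n_a
        if diff >= observed:
            at_least_as_large += 1

    return observed, at_least_as_large / num_shuffles


# Hypothetical measurements, invented purely for illustration.
control = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
treated = [54, 73, 53, 70, 73, 68, 52, 65, 65, 60]

observed_diff, p_value = permutation_test(control, treated)
print("observed difference in means:", observed_diff)
print("estimated one-sided p-value:", p_value)
```

A small p-value here would suggest that a difference this large rarely arises from random relabeling alone. The test is one-sided by design; a two-sided variant would compare absolute differences instead.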
Authors: Manpreet Singh Katari, Sudarshini Tyagi, Dennis Shasha
Publisher: Morgan & Claypool
Published: 04/08/2021
Pages: 74
Binding Type: Paperback
Weight: 0.32 lbs
Size: 9.25h x 7.50w x 0.16d
ISBN: 9781636390895
About the Author
Tyagi, Sudarshini:
Sudarshini Tyagi is currently a software engineer at Goldman Sachs, where she uses machine learning, particularly natural language processing, and statistics to detect anomalies in financial regulations. She received her Master's degree in Computer Science from the Courant Institute of Mathematical Sciences at New York University, where she wrote a thesis on visually detecting breast cancers from mammograms. She also holds a Bachelor's degree in Computer Science from Rashtreeya Vidyalaya College of Engineering, Bengaluru.
Shasha, Dennis:
Dennis Shasha is a Julius Silver Professor of Computer Science at the Courant Institute of New York University and an Associate Director of NYU Wireless. In addition to his long fascination with nonparametric statistics, he works on meta-algorithms for machine learning to achieve guaranteed correctness rates; with biologists on pattern discovery for network inference; with physicists and financial people on algorithms for time series; on database tuning; and on tree and graph matching.
Because he likes to type, he has written six books of puzzles about a mathematical detective named Dr. Ecco, a biography about great computer scientists, and a book about the future of computing. He has also written technical books about database tuning, biological pattern recognition, time series, DNA computing, resampling statistics, and causal inference in molecular networks.
He has written the puzzle column for various publications, including Scientific American, Dr. Dobb's Journal, and, currently, the Communications of the ACM. He is a fellow of the ACM and an INRIA International Chair.