Machine Learning in Production

Developing and Optimizing Data Science Workflows and Applications

Andrew Kelleher, Adam Kelleher

Feb 2019, Paperback, 280 pages
ISBN13: 9780134116549
ISBN10: 0134116542

The typical data science task in industry starts with an “ask” from the business. But few data scientists have been taught what to do with that ask. This book shows them how to assess it in the context of the business’s goals, reframe it to work optimally for both the data scientist and the employer, and then execute on it. Written by two of the experts who’ve achieved breakthrough optimizations at BuzzFeed, it’s packed with real-world examples that take you from start to finish: from ask to actionable insight.

Andrew Kelleher and Adam Kelleher walk you through well-formed, concrete principles for approaching common data science problems, giving you an easy-to-use checklist for effective execution. Using their principles and techniques, you’ll gain a deeper understanding of your data, learn how to analyze noise and confounding variables so they don’t compromise your analysis, and save weeks of iterative improvement by planning your projects more effectively up front.

Once you’ve mastered their principles, you’ll put them to work in two realistic, beginning-to-end site optimization tasks. These extended examples come complete with reusable code examples and recommended open-source solutions designed for easy adaptation to your everyday challenges. They will be especially valuable for anyone seeking their first data science job -- and everyone who’s found that job and wants to succeed in it.

Part I: Principles of Framing
1. Introduction: How We See Data Science
2. Translate an Ask into a Well-Formed Problem
3. Framing/Re-framing

Part II: Principles of Choosing a Model
4. Finding Causal Relationships
5. Quantifying Quality and Confidence
6. Quantifying Error
7. Noise

Part III: Case Studies
8. The Initial Ask: Knowing When to Reframe
9. Building Domain Knowledge
10. Causal Modeling
11. Assessment of the Data Set
12. System Modeling
13. Refinement

Part IV: Appendices
A. Brief Overview of Common Algorithms
B. History/Progression of Search Algorithms
C. History/Progression of Metrics for User Engagement
D. Useful Papers and Further Reading

  • Practical principles and step-by-step techniques for transforming any business “ask” into actionable insight
  • Includes two complete end-to-end site optimization projects, with reusable code and open-source tool recommendations that readers can easily adapt for their own projects
  • Ideal for all junior and aspiring data scientists -- and for all software developers, engineers, and others with new responsibilities related to data science
  • By Andrew and Adam Kelleher, twin brothers playing pivotal roles in data science and engineering at BuzzFeed
  • No heavy math or statistics background required!

Andrew Kelleher is a staff software engineer and distributed systems architect at Venmo. He was previously a staff software engineer at BuzzFeed and has worked on data pipelines and algorithm implementations for modern optimization. He graduated with a BS in physics from Clemson University. He runs a meetup in New York City that studies the fundamentals behind distributed systems in the context of production applications, and was ranked one of Fast Company's most creative people two years in a row.

Adam Kelleher is chief data scientist for research at Barclays Investment Bank and an adjunct professor at Columbia University in the City of New York. He was formerly principal data scientist for BuzzFeed. He graduated from Clemson University with a BS in physics and has a PhD in cosmology from the University of North Carolina at Chapel Hill.