Thoughtful Machine Learning: A Test-Driven Approach
Format: PDF / Kindle (mobi) / ePub
Learn how to apply test-driven development (TDD) to machine-learning algorithms—and catch mistakes that could sink your analysis. In this practical guide, author Matthew Kirk takes you through the principles of TDD and machine learning, and shows you how to apply TDD to several machine-learning algorithms, including Naive Bayesian classifiers and Neural Networks.
Machine-learning algorithms often have tests baked in, but they can’t account for human errors in coding. Rather than blindly rely on machine-learning results as many researchers have, you can mitigate the risk of errors with TDD and write clean, stable machine-learning code. If you’re familiar with Ruby 2.1, you’re ready to start.
- Apply TDD to write and run tests before you start coding
- Learn the best uses and tradeoffs of eight machine learning algorithms
- Use real-world examples to test each algorithm through engaging, hands-on exercises
- Understand the similarities between TDD and the scientific method for validating solutions
- Be aware of the risks of machine learning, such as underfitting and overfitting data
- Explore techniques for improving your machine-learning models or data extraction
```ruby
  it "stores the html body's inner_text" do
    body = html.split("\n\n")[1..-1].join("\n\n")
    html_email.body.must_equal Nokogiri::HTML.parse(body).inner_text
  end

  it "stores subject like plaintext does as well" do
    # Take capture group 1 (the subject text itself), not the MatchData object
    subject = html.match(/^Subject: (.*)$/)[1]
    html_email.subject.must_equal subject
  end
  end
end
```

As mentioned, we're using Nokogiri to calculate the inner_text, and we'll have to use it inside of the Email class as well. Now the problem is that we also need to detect the content_type. So we'll add
other events. So spam was independently conditional on each word in the email. We can do the same with our current system: we can state that the probability of being in a particular state depends primarily on what happened in the previous state. So instead of P(Customer | S1, S2, …, Sn), our equation becomes P(Customer | Sn). But why can we get away with such a gross simplification? Given a state machine like the one we have just defined, the system infers probabilistically and recursively
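The simplification above can be sketched in a few lines of Ruby: we estimate P(next state | current state) from transition counts alone, ignoring the rest of the history. The state names and the `probability` helper here are illustrative stand-ins, not code from the book.

```ruby
# Tally observed state-to-state transitions from a sequence of states.
transitions = Hash.new { |h, k| h[k] = Hash.new(0) }

sequence = %i[browsing browsing cart browsing cart purchase]
sequence.each_cons(2) { |from, to| transitions[from][to] += 1 }

# Under the Markov assumption, P(to | from) is estimated from the counts
# for the current state alone -- no earlier history is consulted.
def probability(transitions, from, to)
  total = transitions[from].values.inject(0) { |sum, v| sum + v }
  return 0.0 if total.zero?
  transitions[from][to].to_f / total
end

puts probability(transitions, :browsing, :cart) # fraction of :browsing steps followed by :cart
```

Two of the three `:browsing` observations are followed by `:cart`, so the estimate is 2/3 regardless of what came before `:browsing`.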
Matt's ice cream consumption

Month  Temperature  Coffee (cups/day)  Ice cream
Jan    47F          4                  2
Feb    50F          4                  2
Mar    54F          4                  3
Apr    58F          4                  3
May    65F          4                  3
Jun    70F          4                  3
Jul    76F          4                  4
Aug    76F          4                  4
Sep    71F          4                  4
Oct    60F          4                  3
Nov    51F          4                  2
Dec    46F          4                  2

You can see that I generally drink about four cups of coffee a day. I tend to eat more ice cream in the summer, and it's generally hotter around that time (Figure 10-3).

Figure 10-3. A graph showing my ice cream consumption

Now we know inherently that the one thing that
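To make the relationship in the table concrete, a short Ruby sketch can compute the Pearson correlation between monthly temperature and ice cream consumption. The `pearson` helper is ours, written for illustration; it is not code from the book.

```ruby
# Monthly temperatures (F) and ice cream consumption, Jan through Dec,
# transcribed from the table above.
temps = [47, 50, 54, 58, 65, 70, 76, 76, 71, 60, 51, 46]
ice   = [2, 2, 3, 3, 3, 3, 4, 4, 4, 3, 2, 2]

# Pearson correlation: covariance of x and y divided by the product of
# their standard deviations, giving a value in [-1, 1].
def pearson(xs, ys)
  n  = xs.length.to_f
  mx = xs.inject(0.0) { |s, v| s + v } / n
  my = ys.inject(0.0) { |s, v| s + v } / n
  cov = xs.zip(ys).inject(0.0) { |s, (x, y)| s + (x - mx) * (y - my) }
  sx = Math.sqrt(xs.inject(0.0) { |s, x| s + (x - mx)**2 })
  sy = Math.sqrt(ys.inject(0.0) { |s, y| s + (y - my)**2 })
  cov / (sx * sy)
end

puts pearson(temps, ice) # strongly positive: hotter months, more ice cream
```

Coffee, by contrast, is constant at four cups a day, so it carries no information about the season at all.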
Distance
margin errors, Optimizing with slack
margin maximization, trading off with slack variable minimization, Trading off margin maximization with slack variable minimization using C
Markov chains, Simplification through the Markov Assumption
Markovian models, Hidden Markov Models (see also Hidden Markov Models)
mathematical notations in this book, Mathematical Notation Used Throughout the Book
matrix factorization, Data Set
Matrix library (Ruby), EM Clustering
MatrixDeterminance class (example),
other takes 30 minutes, then, generally speaking, the one that takes less time to train is probably better. The best approach would be to wrap a benchmark around the code to find out whether it's getting faster or slower over time. Many machine learning algorithms have a maximum number of iterations built in. In the case of neural networks, you might set a max epoch of 1,000, so that if the model isn't trained within 1,000 iterations, it isn't good enough. An epoch is just a measure of one iteration through all
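The two ideas above can be combined in a few lines of Ruby using the standard library's Benchmark module: time the training loop, and bail out after a maximum number of epochs. `train_one_epoch` and `converged?` are hypothetical placeholders standing in for a real model's methods; the decaying-error model is purely illustrative.

```ruby
require 'benchmark'

MAX_EPOCHS = 1_000

# Stand-in for one pass through all of the training data.
def train_one_epoch(model)
  model[:error] *= 0.99
end

# Stand-in for a real convergence check on the model's error.
def converged?(model)
  model[:error] < 0.01
end

model  = { error: 1.0 }
epochs = 0

# Benchmark.realtime reports wall-clock seconds for the block.
elapsed = Benchmark.realtime do
  until converged?(model) || epochs >= MAX_EPOCHS
    train_one_epoch(model)
    epochs += 1
  end
end

puts "trained in #{epochs} epochs, #{elapsed.round(3)}s"
```

Tracking these timings across commits tells you whether a code change made training faster or slower, and the epoch cap guarantees the loop terminates even when the model never converges.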