F# for Machine Learning Essentials
Format: PDF / Kindle (mobi) / ePub
- Design algorithms in F# to tackle complex computing problems
- Be a proficient F# data scientist using this simple-to-follow guide
- Solve real-world, data-related problems with robust statistical models, built for a range of datasets
The F# functional programming language enables developers to write simple code to solve complex problems. With F#, developers create consistent and predictable programs that are easier to test and reuse, simpler to parallelize, and less prone to bugs.
If you want to learn how to use F# to build machine learning systems, then this is the book you want.
Starting with an introduction to the several categories of machine learning, you will quickly learn to implement time-tested supervised learning algorithms. You will gradually move on to predicting house prices using regression analysis. You will then learn to use Accord.NET to implement SVM techniques and clustering. You will also learn to build a recommender system for your e-commerce site from scratch. Finally, you will dive into advanced topics such as implementing neural network algorithms while performing sentiment analysis on your data.
What you will learn
- Use F# to find patterns through raw data
- Build a set of classification systems using Accord.NET, Weka, and F#
- Run machine learning jobs on the Cloud with MBrace
- Perform mathematical operations on matrices and vectors using Math.NET
- Use a recommender system for your own problem domain
- Identify tourist spots across the globe using inputs from the user with decision tree algorithms
About the Author
Sudipta Mukherjee was born in Kolkata and migrated to Bangalore. He is an electronics engineer by education and a computer engineer/scientist by profession and passion. He graduated in 2004 with a degree in electronics and communication engineering.
He has a keen interest in data structures, algorithms, text processing, natural language processing tools development, programming languages, and machine learning at large. His first book, on data structures using C, has been received quite well. Parts of the book can be read on Google Books at http://goo.gl/pttSh. The book was also translated into simplified Chinese, available from Amazon.cn at http://goo.gl/lc536. This is Sudipta's second book with Packt Publishing. His first book, .NET 4.0 Generics (http://goo.gl/MN18ce), was also received very well. During the last few years, he has been hooked on the functional programming style. His book on functional programming, Thinking in LINQ (http://goo.gl/hm0lNF), was released last year. Last year, he also gave a talk at @FuConf based on his LINQ book (https://goo.gl/umdxIX). He lives in Bangalore with his wife and son.
Sudipta can be reached via e-mail at email@example.com and via Twitter at @samthecoder.
Table of Contents
- Introduction to Machine Learning
- Linear Regression
- Classification Techniques
- Information Retrieval
- Collaborative Filtering
- Sentiment Analysis
- Anomaly Detection
parameters that can be helpful while predicting a traffic jam. The intention is that after reading this chapter you will be able to use these classification techniques to address some of the problems you are facing yourself.
Binary classification using k-NN
In this example, you will solve the Kaggle cat and dog identification challenge (https://www.kaggle.com/c/dogs-vs-cats). The challenge is to identify dogs and cats from photographs. Following are a couple of example photographs:
[Image 1]
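The k-NN idea named above can be sketched in a few lines of F#. This is only an illustration under stated assumptions, not the book's code: the two-dimensional feature vectors, the `training` data, and the helper names `euclidean` and `knnPredict` are all hypothetical, and real photographs would first have to be turned into numeric feature vectors.

```fsharp
// Minimal k-NN sketch. The feature vectors and labels below are made-up
// illustration data, not features extracted from the Kaggle photographs.

// Euclidean distance between two feature vectors of equal length.
let euclidean (a: float[]) (b: float[]) =
    Array.map2 (fun x y -> (x - y) * (x - y)) a b
    |> Array.sum
    |> sqrt

// Predict the label of `query` by majority vote among its k nearest
// training points.
let knnPredict (k: int) (training: (float[] * string) list) (query: float[]) =
    training
    |> List.sortBy (fun (features, _) -> euclidean features query)
    |> List.truncate k
    |> List.countBy snd
    |> List.maxBy snd
    |> fst

// Hypothetical labeled examples: two numeric features per photograph.
let training =
    [ ([| 1.0; 1.0 |], "cat")
      ([| 1.2; 0.8 |], "cat")
      ([| 0.9; 1.1 |], "cat")
      ([| 5.0; 5.0 |], "dog")
      ([| 5.2; 4.9 |], "dog")
      ([| 4.8; 5.1 |], "dog") ]

let prediction = knnPredict 3 training [| 1.1; 0.9 |]
printfn "%s" prediction
```

With an odd k and two classes, the majority vote cannot tie, which is why k = 3 is used in this sketch.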
WekaSharp.Dataset.setClassIndexWithLastAttribute
let classes = iris.numClasses()
printfn "%A" classes
let j48Tt = TrainTest(iris, iris, ClassifierType.J48, WekaSharp.Parameter.J48.DefaultPara)
let j48Cv = CrossValidation(5, iris, ClassifierType.J48, WekaSharp.Parameter.J48.DefaultPara)
let j48Rs = RandomSplit(0.7, iris, ClassifierType.J48, WekaSharp.Parameter.J48.DefaultPara)
// perform the task and get result
let ttAccuracy = j48Tt |> WekaSharp.Eval.evalClassify |> WekaSharp.Eval.getAccuracy
rabbits, the resulting confusion matrix will look like the following table:

                    Predicted
Actual class    Cat    Dog    Rabbit
Cat               5      3         0
Dog               2      3         1
Rabbit            0      2        11

In this confusion matrix, of the eight actual cats, the system predicted that three were dogs and, of the six dogs, it predicted that one was a rabbit and two were cats. We can see from the matrix that the system in question has trouble distinguishing between cats and dogs, but can make the distinction between rabbits and other types
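The counts in this confusion matrix can be turned into accuracy, precision, and recall directly. The following F# sketch is an illustration rather than code from the book; the names `labels`, `confusion`, `rowSum`, and `colSum` are hypothetical, and it assumes rows hold the actual classes and columns the predicted classes, as in the table.

```fsharp
// Class labels in row/column order.
let labels = [| "Cat"; "Dog"; "Rabbit" |]

// Rows are actual classes, columns are predicted classes.
let confusion =
    array2D [ [ 5; 3; 0 ]
              [ 2; 3; 1 ]
              [ 0; 2; 11 ] ]

let n = Array2D.length1 confusion

// Total number of examples and number of correct predictions (the diagonal).
let total = Seq.sum (Seq.cast<int> confusion)
let correct = Seq.sum (seq { for i in 0 .. n - 1 -> confusion.[i, i] })
let accuracy = float correct / float total

// Number of actual members of class i, and number of predictions of class j.
let rowSum i = Seq.sum (seq { for j in 0 .. n - 1 -> confusion.[i, j] })
let colSum j = Seq.sum (seq { for i in 0 .. n - 1 -> confusion.[i, j] })

for i in 0 .. n - 1 do
    let recall = float confusion.[i, i] / float (rowSum i)
    let precision = float confusion.[i, i] / float (colSum i)
    printfn "%s: precision %.3f, recall %.3f" labels.[i] precision recall

printfn "accuracy %.3f" accuracy
```

The row sums reproduce the eight actual cats and six actual dogs mentioned in the text, and the low recall for dogs reflects the confusion between cats and dogs.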
(non-anomalous) data can be like anomalous data. Anomaly detection is mostly an unsupervised learning problem because it is very difficult, if not impossible, to get a labeled training dataset that is anomalous. Sometimes, anomalies are referred to as "outliers."
Objective
After reading this chapter, you will be able to apply some of the techniques to identify any anomaly in data, and you will have a general understanding of how and where anomaly detection algorithms can be useful. All code is
Given a new entry x, the following formula calculates the probability density estimation:

$$p(x) = \prod_{j=1}^{n} p(x_j; \mu_j, \sigma_j^2) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$

If p(x) is less than a predefined threshold, then the entry is tagged as anomalous; otherwise, it is tagged as normal. The following code finds the average value of the jth feature:
Here is a sample run of the px method:
> let X =
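Since the book's listing is not reproduced in this excerpt, here is a minimal F# sketch of the density estimate described above. It is an assumption-laden illustration, not the author's implementation: the helper names `mu` and `sigmaSq`, the sample `data`, and the `threshold` value are hypothetical, the body of `px` is reconstructed from the formula only, and the variance is taken in its population form (dividing by the number of rows).

```fsharp
// Mean of the j-th feature across all rows of the training data.
let mu (data: float[][]) (j: int) =
    data |> Array.averageBy (fun row -> row.[j])

// Variance of the j-th feature (population form: assumption of this sketch).
let sigmaSq (data: float[][]) (j: int) =
    let m = mu data j
    data |> Array.averageBy (fun row -> (row.[j] - m) ** 2.0)

// p(x): product over all features of the per-feature Gaussian density.
let px (data: float[][]) (x: float[]) =
    x
    |> Array.mapi (fun j xj ->
        let m = mu data j
        let s2 = sigmaSq data j
        exp (-((xj - m) ** 2.0) / (2.0 * s2)) / sqrt (2.0 * System.Math.PI * s2))
    |> Array.reduce (*)

// Made-up training data: three entries with two features each.
let data =
    [| [| 1.0; 10.0 |]
       [| 2.0; 12.0 |]
       [| 3.0; 14.0 |] |]

// Hypothetical threshold; in practice it would be tuned on held-out data.
let threshold = 0.01

let isAnomalous (x: float[]) = px data x < threshold

printfn "%b" (isAnomalous [| 2.0; 12.0 |])  // entry at the feature means
printfn "%b" (isAnomalous [| 10.0; 30.0 |]) // entry far from the training data
```

An entry at the feature means gets the highest possible density under this model, while an entry far from the training data gets a density close to zero and falls below the threshold.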