Effective Machine Learning for Quantitative Finance
Sometimes we believe that computer programs are magic: software works like magic inside the machine, you click a few buttons on your PC, and boom! Real insight and real things seem far away from all this. All you have to do is handle large databases, parse them as required, and add plenty of UI tricks and exception handling, and that leads you to a great product, or at least one a real human can handle. Sometimes it feels as if the same is true of machine learning and data science. But selecting a few predictors and applying the most popular model might not be enough. We are developing a model and learning with it, so it makes sense to grow with it and let the models grow as well.
Let’s talk it through and see how good models can lead to better trading strategies.
Why do we use ML models?:
ML models are used because they save us from writing hundreds or maybe thousands of lines of code. That is the power of mathematics: one equation can hold knowledge worth hundreds of pages. (Remember the binomial formula for calculating probabilities?)
Machine learning solutions are cost-effective, and implementing a model takes far less time than designing an expert system, which takes a long time to build and may still produce error-prone results. A simple ML model, on the other hand (a bit complicated and a little bit smart), can be built in a few hours, and the following years can be spent refining it and making it more reliable for decision making.
How can machine learning models learn more effectively?:
When we get into model building, one question causes the most confusion: which model should we use? There are not just hundreds but thousands of algorithms to choose from, and every year more are developed with different sets of features, so the question keeps coming back: which model should we really use?
The model you want depends on your requirements and on the type of features you believe can help predict the outcomes. When choosing any algorithm, take care of three things: representation, evaluation, and optimization. Representation describes how the model and the features you pick for prediction are expressed, and your algorithm must be able to handle that feature space. The set of models the algorithm can possibly learn is called its hypothesis space.
Evaluation plays an important role as well. When you are figuring out which algorithm to use, or which one gives better results, you need some kind of evaluation function (also called a scoring function) for your algorithm/model/classifier. The evaluation function the algorithm uses internally can differ from the external one you use to judge it; it all depends on your choice of algorithm/model.
Optimization is another thing that plays a very important role in every aspect of building a model. It is the method used to search the hypothesis space for the highest-scoring model. The factors we consider here include the model's predictive score, the learning rate, and the model's memory consumption, along with how we handle the search itself.
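To make the three parts concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy data, the one-threshold "model", and the function names are not from any library. The representation is "predict 1 when the feature is at or above a threshold", the evaluation function is accuracy, and the optimization is an exhaustive search over candidate thresholds.

```python
# Toy data: (feature value, label). Hypothetical numbers, for illustration only.
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 1), (2.5, 1), (3.0, 1), (1.8, 1), (2.2, 0)]

def predict(threshold, x):
    """Representation: the hypothesis space is 'predict 1 when x >= threshold'."""
    return 1 if x >= threshold else 0

def accuracy(threshold, rows):
    """Evaluation: fraction of rows the hypothesis labels correctly."""
    return sum(predict(threshold, x) == y for x, y in rows) / len(rows)

def fit(rows):
    """Optimization: exhaustive search over every observed feature value."""
    candidates = sorted({x for x, _ in rows})
    return max(candidates, key=lambda t: accuracy(t, rows))

best = fit(data)
print("best threshold:", best, "training accuracy:", accuracy(best, data))
```

Swapping any one of the three parts (a richer representation, a different scoring function, a cheaper search) gives a different learner, which is why the choice of algorithm is really three choices.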
How do machine learning algorithms magically make predictions?
One thing we have to understand carefully is approximation/generalization. My favorite line about mathematics is: ‘Mathematics for engineers is all about finding an approximate solution, and that approximate solution is just enough to put a rover on Mars.’ An approximate solution gets us close enough to make predictions. Splitting your data into a training set and a test set is best practice, and one should never underestimate it, because the results on the two sets are what let us start the process of optimization. We then try various optimizations on the model, such as choosing the type of search (‘beam search’ or ‘greedy search’), and watch how the test score improves.
Notice that generalization being the goal has an interesting consequence for machine learning. Unlike in most other optimization problems, we don’t have access to the function we want to optimize! So all a machine learning model does is find common relationships between the predictors by experimenting thousands, millions, or billions of times on the training data. The patterns it finds may or may not hold for future data; they rest entirely on that process of experimentation, hypothesis, and generalization.
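A small sketch of the train/test idea, in plain Python. The data-generating rule, the noise pattern, and both toy models are assumptions invented for this example. One model memorizes the training rows, the other learns a single threshold from them; comparing their train and test scores shows why the held-out score is the one that speaks to generalization.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)

# Hypothetical data: label is 1 when x > 50, with every 7th label flipped as noise.
rows = [(x, int(x > 50) ^ int(x % 7 == 0)) for x in range(100)]
train, test = train_test_split(rows)

# Model A memorizes the training set and guesses 0 for anything unseen.
memory = dict(train)
def memorizer(x):
    return memory.get(x, 0)

# Model B learns a single threshold, using the training set only.
best_t = max(range(100), key=lambda t: accuracy(lambda x: int(x > t), train))
def threshold_model(x):
    return int(x > best_t)

for name, model in [("memorizer", memorizer), ("threshold", threshold_model)]:
    print(name, "train:", accuracy(model, train), "test:", accuracy(model, test))
```

The memorizer is perfect on the training rows by construction, so its training score says nothing about future data. Only the score on rows it never saw does.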
Why are machine learning and predictions called big data?:
Let’s take an example that shows how much data is good data. Suppose you are applying for jobs: your chance of getting one goes up the more positions you apply for. In the same way, more data often beats a cleverer algorithm, because the machine sees more cases from which to learn the classification. On the other hand, big data demands heavy computing power, and frameworks like Apache Spark and Hadoop are industry-standard examples of tools used to process big data as quickly as possible and turn it into an understandable form.
How many models/algorithms must I learn?
There is a very simple answer: learn as many as you can and apply as many as you can. Ensemble learning is a very popular technique in which multiple models are combined to achieve better results. In a random forest, for example, a set of decision trees is used: each tree is trained on a random resample of the training data (and random subsets of the features), and the trees' predictions are combined by voting or averaging. To understand ensemble learning, look into its three most important techniques: 1. Bagging 2. Boosting 3. Stacking
Bagging: In bagging, we simply generate random variations of the training set by re-sampling, learn a classifier on each, and combine the results by voting. This works because it greatly reduces variance while only slightly increasing bias.
Boosting: In boosting, training examples have weights, and these are varied so that each new classifier focuses on the examples the previous ones tended to get wrong.
Stacking: In stacking, the outputs of individual classifiers become the inputs of a “higher-level” learner that figures out how best to combine them.
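The bagging and boosting descriptions above can be sketched in plain Python. This is an illustration under stated assumptions, not a production implementation: the 1-D data set is made up, the base learner is a one-threshold "stump", and the boosting part follows the well-known AdaBoost weighting scheme, which the text above does not name explicitly. All function names are hypothetical.

```python
import math
import random
from collections import Counter

# Hypothetical 1-D data with labels in {-1, +1}; no single threshold separates it.
ROWS = [(0, 1), (1, 1), (2, 1), (3, -1), (4, -1), (5, -1), (6, 1), (7, 1), (8, 1), (9, -1)]

def stump_predict(stump, x):
    """A stump is (threshold, polarity): predict polarity when x >= threshold."""
    threshold, polarity = stump
    return polarity if x >= threshold else -polarity

def fit_stump(rows, weights):
    """Return (stump, weighted_error) for the stump with the lowest weighted error."""
    best_stump, best_err = None, float("inf")
    for threshold in sorted({x for x, _ in rows}):
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(w for (x, y), w in zip(rows, weights) if stump_predict(stump, x) != y)
            if err < best_err:
                best_stump, best_err = stump, err
    return best_stump, best_err

def bagging_fit(rows, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap resample of the training set."""
    rng = random.Random(seed)
    uniform = [1.0 / len(rows)] * len(rows)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        models.append(fit_stump(sample, uniform)[0])
    return models

def bagging_predict(models, x):
    """Combine the resampled stumps by majority vote."""
    votes = Counter(stump_predict(m, x) for m in models)
    return votes.most_common(1)[0][0]

def adaboost_fit(rows, n_rounds=3):
    """Boosting (AdaBoost-style): reweight examples so later stumps target earlier mistakes."""
    weights = [1.0 / len(rows)] * len(rows)
    ensemble = []
    for _ in range(n_rounds):
        stump, err = fit_stump(rows, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this stump in the final vote
        ensemble.append((alpha, stump))
        # Misclassified examples (y * prediction == -1) gain weight; the rest lose weight.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for (x, y), w in zip(rows, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Weighted vote of all stumps; the sign of the score is the prediction."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

bagged = bagging_fit(ROWS)
boosted = adaboost_fit(ROWS)
print("bagging :", [bagging_predict(bagged, x) for x, _ in ROWS])
print("boosting:", [adaboost_predict(boosted, x) for x, _ in ROWS])
```

On this toy data a single stump always misclassifies some rows, while three boosted stumps fit every training row. That shows the mechanism at work on the training set only; it is not a claim about performance on real data.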
Combining it all: resample the training set into random variations, train a model on each variation, and let the models vote [Bagging]. Next, give more weight to the examples the models so far tend to get wrong, so that each new model concentrates on them (for how the models improve, see the optimization section) [Boosting]. Now each individual model has its own weights and outputs, because each has learned from a different variation of the data, so let’s stack them all: in stacking, you simply treat the outputs of the lower-level models as the training set for a higher-level model [Stacking]. There could be some bias toward the sub-models that scored best, but the higher-level model/algorithm also considers the decisions of the lower-scoring models, which reduces the effect of that bias.
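Here is a minimal stacking sketch in plain Python, under several assumptions: the two-feature data is invented, each lower-level model is a one-threshold stump on a single feature, and the higher-level learner is a simple lookup table that stores the majority label for each pattern of base-model outputs. The base models are fit on one set of rows and the higher-level model on a separate set, so that it learns from outputs the base models did not train on.

```python
from collections import Counter, defaultdict

def fit_stump(rows, feature):
    """Base learner: best threshold on one feature (predict 1 when value >= t)."""
    candidates = sorted({x[feature] for x, _ in rows})
    def n_correct(t):
        return sum(int(x[feature] >= t) == y for x, y in rows)
    t = max(candidates, key=n_correct)
    return lambda x: int(x[feature] >= t)

def fit_meta(base_models, rows):
    """Higher-level learner: majority label for each pattern of base outputs."""
    table = defaultdict(Counter)
    for x, y in rows:
        table[tuple(m(x) for m in base_models)][y] += 1
    fallback = Counter(y for _, y in rows).most_common(1)[0][0]
    def predict(x):
        key = tuple(m(x) for m in base_models)
        return table[key].most_common(1)[0][0] if key in table else fallback
    return predict

# Hypothetical data: ((x1, x2), label), with label 1 only when both features are high.
base_rows = [((1, 1), 0), ((1, 5), 0), ((5, 1), 0), ((5, 5), 1),
             ((2, 6), 0), ((6, 2), 0), ((6, 6), 1), ((2, 2), 0)]
meta_rows = [((1, 2), 0), ((2, 5), 0), ((5, 2), 0),
             ((6, 5), 1), ((1, 6), 0), ((5, 6), 1)]

base_a = fit_stump(base_rows, feature=0)  # looks only at x1
base_b = fit_stump(base_rows, feature=1)  # looks only at x2
stacked = fit_meta([base_a, base_b], meta_rows)

point = (7, 1)  # high x1, low x2
print("base_a:", base_a(point), "base_b:", base_b(point), "stacked:", stacked(point))
```

Neither base model can express "both features high" alone, but the higher-level model learns that rule from their combined outputs. The weaker-looking base model still contributes to the final decision.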
Which models are better, simple or complicated?:
When it comes to developing a model, it is often assumed that simple models perform better and generate fewer errors. That is true in many cases, but one must not let that bias decide how we profile our models. Simple models are often less error-prone and much faster than complicated ones, but that does not imply that a simple model should be developed for every problem and complicated models should never be approached. Complicated models exist because they also work and provide good results, and ensemble learning is a good example that shows this.
How do we know which model/algorithm to apply?:
A few weeks ago I read an article about how one could come up with solutions/models/algorithms that solve all the problems of the world, and the answer was of course AI (artificial intelligence). But are we really there yet? That could be a very long debate, but the real truth, or at least what I believe, is that one must plot or otherwise visualize the relationships between two or more variables in the data to understand whether a relationship is linear or non-linear, and moreover how such relationships affect the actual outcome of the predictor. For example, there could be a near-perfect linear relationship between the close prices of two stocks, and linear regression would be more applicable there than logistic regression or any other model. This simply shows that no single model solves every problem, so the difficult part is finding out which model solves which problem best. For that, a high-level overview of many models is required.
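One quick numeric check to go alongside a plot is the Pearson correlation coefficient together with a least-squares line. The sketch below is plain Python; the price series are made-up numbers, not market data, and the function names are hypothetical. It also shows a caveat: a relationship can be perfectly predictable yet non-linear, in which case the linear correlation can be zero.

```python
def pearson(xs, ys):
    """Pearson correlation: strength of the *linear* relationship, in [-1, 1]."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical close prices of two stocks that move in lockstep.
stock_a = [10.0, 11.0, 12.0, 13.0, 14.0]
stock_b = [20.5, 22.5, 24.5, 26.5, 28.5]
print("linear case    r =", pearson(stock_a, stock_b), "line =", fit_line(stock_a, stock_b))

# A perfectly predictable but non-linear relationship (y = x squared).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x * x for x in xs]
print("quadratic case r =", pearson(xs, ys))
```

In the first case a straight line describes the data and linear regression is a natural fit. In the second, the correlation alone would suggest "no relationship", which is exactly why looking at the data before picking a model matters.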
If it is correlated, that means it will always be true/predictable:
That would be a convenient assumption, but what would be the need for data scientists if it were true all the time? Correlation does not imply causation, so correlation must not be the only reason to construct a model. When building a model, a data scientist needs to propose various hypotheses and keep the ones that turn out to be real relationships, or that give better predictions, when applied in real life. In stocks, we may find a correlation suggesting that news directly affects stock prices, construct a strategy that uses NLP (natural language processing) to generate buy/sell calls, and still end up with negative returns and investors who have lost their money. There might be a correlation between news and stock prices, but that does not mean news is the only factor to consider when building a model that runs an NLP-based trading strategy. In the end we have various other things to consider, such as the stock's trading volume, the stock's ranking in the S&P 500, how much money is on the company's books, and how the stock has performed over the past decades.
Some Final thoughts:
When you think about ML models and how they can create happiness in your life (here happiness means coming up with models that consistently generate great predictions), there is only one thing to remember: the ‘rule of evolution’. You can sit on one technology/design for the rest of your life, or you can grow! Thinking about machine learning and building machine learning models is itself learning. The machine is desperate to learn, and learning is only possible through lots of experimentation: getting good results, keeping those models for further improvement, and setting aside the ones that are not good enough to use, while still giving them another try sooner or later.