my net house

WAHEGURU….!


Law of Wealthy Life

A wealthy life does not just mean having lots of money in the bank; it is much more about creating various things in your society, or running various engines that work in such a way that you are really able to make things happen in your life instantly. One thing you must remember and know carefully: if you really want to do it fast, do it well. 🙂

Speed of implementation

Respect your time (don’t waste it on social media and stuff)

Go to bed early and get up early. Although I am writing this post so late. :P 😦 😉

Important Julia Packages

  1. IJulia (Julia in IPython)

Julia runs very well in your IPython notebook environment. After all, all you have to do is Data Science and Machine Learning. 🙂


1.1 Open the Julia prompt (on Ubuntu it works by typing the ‘julia’ command in your terminal)

1.2 Run the command > Pkg.add("IJulia") # it will do almost all the work.

2. DataFrames: whenever you have to read a lot of files in Excel style, the Julia DataFrames package is the way to go.

Pkg.add("DataFrames")

3. Arduino:

A Julia Package for interacting with Arduino.

https://github.com/rennis250/Arduino.jl

4. Neural Network implementation in Julia

https://github.com/compressed/BackpropNeuralNet.jl

5. Visualizing and Plotting in Julia:

https://github.com/bokeh/Bokeh.jl

6. Reading and writing CSV files in Julia

https://github.com/JuliaData/CSV.jl

7. Data Clustering in Julia:

https://github.com/JuliaStats/Clustering.jl

For a much larger number of packages, please refer to the following link:

http://pkg.julialang.org/

Note: you can also run most shell commands in the Julia environment as well. 🙂

things and things

Things that need to be understood in many ways.

  1. Various important parts of Statistics and implementation
  2. Hypothesis Testing
  3. Probability Distributions and Importance
  4. AIC and BIC
  5. Bayesian models
  6. Some black magic of OOP

Hacker’s Guide to Quantitative Trading (Quantopian Python) Part 2

Quantopian provides the required API functions, data, and a helpful community, as well as a batteries-included web-based dashboard to play with algorithmic trading, create your own trading strategies, and launch your trading model in the live market.

Here I will only talk about code and how it should be written to create your own Trading Strategy.

There are basically two methods:

initialize() and handle_data()

initialize() acts as an initializer for various variables, the same as the __init__ method in Python.

Now, what kind of variables we have to declare in the initialize() function depends on your strategy. We can select a limited number of stocks, days, the type of trading, and the variables required for our algorithms.

A very simple example of initialize() code could look as follows:

def initialize(context): # consider context just as 'self' in Python

   context.stocks = [sid(24),sid(46632)] # sid stands for stock_id

initialize() also contains the stuff that can be used many times, or all the time, in our trading algorithm:

1. A counter that keeps track of how many minutes of the current day have passed.

2. A counter that keeps track of our current date.

3. A list that stores the securities that we want to use in our algorithm.

Whatever variables you define here will remain persistent (meaning that they’ll exist) but will be mutable. That means that if you initialize context.count as 0 in initialize(), you can always change it later in handle_data().
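For example, here is a minimal sketch of that persistence, assuming minute-mode backtesting (so handle_data() fires once per minute-bar); the counter logic is just an illustration, not part of any real strategy:

def initialize(context):
    context.count = 0  # created once, lives for the whole backtest

def handle_data(context, data):
    context.count += 1  # mutated on every bar; the new value persists between calls
    if context.count % 390 == 0:  # roughly 390 minute-bars in a US trading day
        log.info('bars seen so far: %d' % context.count)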

A Simple Example of handle_data():

def handle_data(context,data):

   for stock in context.stocks:

        if stock in data:

            order(stock,1)

Momentum Strategy (a common trading strategy):

In this strategy we consider the moving-average price of a stock as an important factor in deciding whether to go long or short on a security.

Here is a simple explanation of the momentum strategy:

● If the current price is greater than the moving average, long the security

● If the current price is less than the moving average, short the security

Now we will use the Quantopian API to implement this strategy for trading. Our algorithm here is going to be a little more sophisticated than plain momentum: we’re going to look at two moving averages, the 50-day moving average and the 200-day moving average.

David Edwards writes that “the idea is that stocks with similar 50 & 200 day moving averages are more likely to be fairly valued and the algorithm will avoid some of the wild swings that plague momentum strategies. The 50/200 day crossover is also a very common signal, so stocks might be more likely to continue in the direction of the 50day MA because a lot of investors enter and exit positions at that threshold.”

The decision-making behind the moving averages is as follows:

● If the 50-day moving average is greater than the 200-day moving average, long the security/stock.

● If the 50-day moving average is less than the 200-day moving average, short the security/stock.
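Before wiring this into Quantopian, here is a minimal standalone sketch of the same rule in plain pandas. This is just a sketch under assumptions: `close` is an assumed pandas Series of daily closing prices, and crossover_signal is a hypothetical helper, not part of the Quantopian API.

import pandas as pd

def crossover_signal(close):
    """+1 (long) where the 50-day MA is above the 200-day MA, -1 (short) where below."""
    ma50 = close.rolling(50).mean()
    ma200 = close.rolling(200).mean()
    signal = pd.Series(0, index=close.index)
    signal[ma50 > ma200] = 1   # long the security
    signal[ma50 < ma200] = -1  # short the security
    return signal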

So now Let’s make a Trading Bot!

1. First we have to create our initialize() function:

def initialize(context):

   set_universe(universe.DollarVolumeUniverse(floor_percentile=99.5,ceiling_percentile=100))

set_universe is an inbuilt function from Quantopian which provides us the stocks within the required universe. Here we have selected the DollarVolumeUniverse with 99.5 and 100 as our floor and ceiling percentiles. This means that we’ll be selecting the top 99.5-100% of the stocks in our universe by dollar*volume score.

Please read the comments in the code.

   context.stocks_to_long = 5

   context.stocks_to_short = 5
   context.rebalance_date = None # we will get today's date, then keep positions active for 10 days from here

   context.rebalance_days = 10 # just an assumption for now: 10 days, or a finer value


Now that we have defined the required __init__-style parameters in initialize(), let’s move on to

handle_data()

def handle_data(context, data):

   if context.rebalance_date != None: # if the rebalance date is not null, set next_date for changing the position of the algorithm

       next_date = context.rebalance_date + timedelta(days=context.rebalance_days) # next_date should be that many days away from rebalance_date

   if context.rebalance_date == None or next_date == get_datetime(): # if today is the day, 10 days after we marked long/short for our stocks

       context.rebalance_date = get_datetime() # set rebalance_date to today, so next_date will again be set 10 days ahead of rebalance_date

       historical_data = history(200, '1d', 'price')

This gets the historical data of all stocks initialized in the initialize() function: '1d' = 1-day bars, 200 = number of days, 'price' = we are only fetching price details because that is all our strategy requires; for some other strategy the volume of a stock could be more beneficial.

       past_50days_mean = historical_data.tail(50).mean()

       past_200days_mean = historical_data.mean()

       diff = past_50days_mean/past_200days_mean - 1

       # if diff > 0 we will long, if diff < 0 we will short

       buys = diff[diff > 0]

       sells = diff[diff < 0]

       # here we get the lists of securities/stocks whose moving-average
       # ratio is greater than, as well as less than, 0

       buys.sort() # sorting the buys list; why? - getting the top securities from the top, more is better
       sells.sort(ascending=False) # reverse-sorting the sells list - getting the top securities from the bottom; less is better because we are selling against the market
       buys = buys.iloc[:buy_length] if buy_weight != 0 else None # buy_length = number of securities we want to purchase
       sells = sells.iloc[:short_length] if short_weight != 0 else None # short_length = number of securities we want to short

Now we have buys and sells as two lists!! (remember this carefully); all the decisions are going to be made based on these two lists.

We can also implement risk factors in our trading strategy. Let’s implement a minimal form of risk factor: a stop at 0.02% of the last traded price, which means that if a security falls much below that level, we will exit the position.

We will go through each security in our data/universe, and those which satisfy the conditions of the ‘buys’ or ‘sells’ lists will be bought/sold.
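One caveat before the loop: it uses the names stops, buy_weight, short_weight, buy_length and short_length, which this excerpt never defines. A purely hypothetical reconstruction, based on the 0.02%-of-price stop mentioned above and equal weighting across the chosen stocks, could look like this:

# Hypothetical reconstruction -- these names are used in the loop below but
# were not defined in this excerpt; treat the exact formulas as assumptions.
stops = historical_data.iloc[-1] * 0.0002  # 0.02% of the last traded price, per security
buy_length = min(context.stocks_to_long, len(buys))
short_length = min(context.stocks_to_short, len(sells))
buy_weight = 1.0/buy_length if buy_length != 0 else 0        # equal long weight per chosen stock
short_weight = -1.0/short_length if short_length != 0 else 0 # equal (negative) short weight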

       # loop through every security in our data/universe
       for sym in data:

           # if the security exists in our sells data
           if sells is not None and sym in sells.index:

               log.info('SHORT:%s' % sym.symbol)

               order_target_percent(sym, short_weight, stop_price=data[sym].price + stops[sym])
               # here stop_price is the real-time price of the security plus the change allowed in stops;
               # order_target_percent is an inbuilt function

           # if the security exists in our buys data
           elif buys is not None and sym in buys.index:

               log.info('LONG:%s' % sym.symbol)

               order_target_percent(sym, buy_weight, stop_price=data[sym].price - stops[sym])

           else:

               order_target(sym, 0)


The `order_target_percent` method allows you to order a % target of your portfolio in that security. So this means that if 0% of your total portfolio belongs in AAPL and you order 30%, it will order 30%. But if you had 25% already and you tried ordering 30%, it will order 5%.

You can order using three different special order methods if you don’t want a normal market order:

# `stop_price`: Creates a stop order

# `limit_price`: Creates a limit order

# `StopLimitOrder`: Creates a stop-limit order
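As a quick, hedged illustration (assuming this runs inside handle_data(), reusing sid(24) from the earlier example; the percentages are arbitrary):

sym = sid(24)
price = data[sym].price
order(sym, 100, limit_price=price * 0.99)  # limit order: buy 100 shares only at 1% below market
order(sym, -100, stop_price=price * 0.95)  # stop order: sell 100 shares if price drops 5%
order(sym, 100, limit_price=price * 1.01, stop_price=price * 0.99)  # stop-limit order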





How Trading Differs from Gambling:

Most of the time, when you find that you are able to get good returns on your capital, you try to beat the market. Beating the market means trying to earn much more than the fair earnings the market returns for your stock. Beating the market can be attempted through various actions, like reversing the momentum or looking for bad happenings in the market (which is also called finding the shit!). Some people are really good at this kung-fu, but as a budding trader with only your own limited money, one important thing should be remembered: “Protect your capital”. That’s what most of the big banks do, and if they hire you as their quant or trading-execution person, they will expect the same from you. Big banks have billions of dollars that they don’t want to lose, but they definitely want to use that money to get good returns from the market.

So they follow one simple rule most of the time:

Guaranteed returns, even if those are low.

[Make sure returns are positive after subtracting various costs like brokerage, leverage, etc., because getting positive returns by neglecting market costs is far too easy, and such strategies should not be used with real money.]

So the real key is to think like a computer programmer in the first place: it should work, first of all. So the first thing to make sure of is getting returns, even low ones, but stable returns, by accounting for the various risk factors.

I am quoting some informative things from the SentDex tutorial:

Most individual traders are trading on account sizes of somewhere between maybe $25,000 and $100,000 USD, so their motives are to hopefully increase that account size as much as possible, so this person is more likely to take part in High Risk High Yield (HRHY).

Most people who use HRHY strategies tend to ignore the HR (High Risk) part, focusing on the HY (High Yield).

The same is common with gamblers, even against astronomical odds with things like the lottery.

In other words, always ask yourself: what is it about the market that makes my strategy work? Because, at the end of the day, algorithmic trading is more about trading than about the algorithm.

Power of brain relaxation

A kind of funny thing is happening to me these days: I am trying to be as relaxed as possible most of the time, and it is increasing my productivity. I find the people around me much cooler, calmer, happier, more positive, smiling and funny.

relaxation = less stress on the mind = sit free and do nothing except thinking 😀

 

If I am sitting freely most of the time, then how can I be more productive? Or am I more productive because my mind loves to have soft reboots after the completion of small programming tasks? Soft reboots could be anything:

  1. Going to the wash-room and saying hello to a stranger
  2. Looking at pretty girls in the office 😀
  3. Closing your eyes and remembering the time when you played with your dog
  4. Thinking about your dreams
  5. Making your dreams possible by reading great quotes on the internet
  6. Thinking about the journey of life
  7. Dreaming of having a great soul-mate, or talking/chatting to her/him if you already have one 😀
  8. Having fun with your colleagues
  9. Drinking coffee
  10. Playing volleyball or any game which is available. yeahhhh!!!
  11. Writing silly blog-posts exactly like this one 😀 😀 😀

A simple script to parse a large file and save it to a Numpy array

A normal approach:


file_location = 'huge_file_location'
import re
import numpy as np
my_regex = re.compile(r'tt\d\d\d\d\d\d\d') # using a compiled regex saves time
a = np.array([]) # just an array to save all the matches
with open(file_location, 'r') as f: # almost the default way to open a file
    m = re.findall(my_regex, f.read())
np_array = np.append(a, m)
print np_array
print np_array.size
print 'unique'
print np.unique(np_array) # removing duplicate entries from the array
print np.unique(np_array).size
np.save('BIG_ARRAY_LOCATION', np.unique(np_array))

In the above code, f.read() loads one big chunk of string into memory, which is about 8GB in the present situation. Let’s fire up generators.

A bit improved version:


def read_in_chunks(file_object, chunk_size=1024 * 1024):
    while True:
        data = file_object.read(chunk_size) # read a fixed-size chunk, not the whole file
        if not data:
            break
        yield data

import numpy as np
import re
file_location = 'huge_file_location'
a = np.array([])
my_regex = re.compile(r'tt\d\d\d\d\d\d\d')
f = open(file_location)
for piece in read_in_chunks(f):
    m = re.findall(my_regex, piece) # but this is still a bottleneck
    a = np.append(a, m) # accumulate into a, otherwise only the last chunk survives
np_array = a
print np_array
print np_array.size
print 'unique'
print np.unique(np_array)
print np.unique(np_array).size

A little bit faster code:


file_location = '/home/metal-machine/Desktop/nohup.out'
def read_in_chunks(file_object, chunk_size=1024 * 1024):
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

import numpy as np
import re
a = np.array([])
my_regex = re.compile(r'tt\d\d\d\d\d\d\d')
f = open(file_location)
def iterate_regex():
    '''trying to run an iterator on the matched lists of strings as well'''
    for piece in read_in_chunks(f):
        yield re.findall(my_regex, piece)
for i in iterate_regex():
    a = np.append(a, i)
np_array = a
print np_array
print np_array.size
print 'unique'
print np.unique(np_array)
print np.unique(np_array).size

But why is performance still not that good? Hmmm......
I have to look for more things. Please use the required indentation while testing. 😛
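One hedged guess at the remaining bottleneck: np.append copies the whole array on every call, so accumulating matches in a plain Python set (which also de-duplicates for free) should do better. A sketch, with the same placeholder file path:

import re
import numpy as np

file_location = 'huge_file_location'  # placeholder path, same as above
my_regex = re.compile(r'tt\d{7}')     # same pattern as before, shorter spelling

def read_in_chunks(file_object, chunk_size=1024 * 1024):
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

unique_ids = set()
with open(file_location) as f:
    for piece in read_in_chunks(f):
        # set.update de-duplicates as we go and never copies an array;
        # caveat: a match that straddles a chunk boundary will be missed
        unique_ids.update(my_regex.findall(piece))

np_array = np.array(sorted(unique_ids))
print np_array.size
np.save('BIG_ARRAY_LOCATION', np_array)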

Look at the CPU usage while running on a Google Cloud 8-core instance:

[Image: cpu-usage.png]

Julia: Why and how not to be Python! (GD Task)

This post is intended to be a rough draft in preparation for the Julia presentation at TCC GNDEC. I am excited.

Now one thing always comes to mind: why another language/technology? (I am a nerd and a geek as well; that’s what I do for a living, passionately!)

There is quite a great trend in the field of computer science right now: everyone wants to become a Data Scientist, or at least wants to get paid as highly as possible. DS seems to be the right choice. 😉 (this is really a troll 😀)

 

First thing: how was Julia born?

It started with a post from the Julia creators themselves, “Why We Created Julia”:

 

We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.

(Did we mention it should be as fast as C?)

 

WoW!!! That’s real: interactive, compiled, with the speed of C, and also easy to learn, as well as general-purpose like Python? 😀 I am laughing, yeah, really laughing.

 

Well, that’s how Julia came to life, or at least that’s the real motive behind the Julia project.

 

How do you know Julia is for you?

1. You are continually involved in computationally intensive work where runtime speed is a major bottleneck. (I just do that with each code block of mine if possible)

2. You use relatively sophisticated algorithms. (sometimes I do that)

3. You write a lot of your own routines from scratch. (I don’t do that)

4. Nothing pleases you more than the thought of diving into someone else’s code library and dissecting its internals. (I do that a lot!)

Most of the tasks listed above I try to do with ‘Python’ (I am actually in a serious relationship with this tool 😀 😉 because it never puts me down in the day job and just works!).

HOW IS JULIA SO MUCH FASTER????


Julia is a really well-thought-out language. While the syntax looks superficially Matlabby (is that really a word?), that is about as far as the similarity goes. Like Matlab, R, and Python, Julia is interactive and dynamically typed, making it easy to get started programming.

But Julia differs from those languages in a few major ways.

Under the hood, it has a rigorous but infinitely flexible type system, and calls functions based on “multiple dispatch”: different code is automatically chosen based on the types of all the arguments supplied to a function.

(is it some kind of skipping type-checks each time?)

 

When these features are combined with the built-in just-in-time (JIT) compiler,

No GIL.. yeahhh!!!!!

they let code, even scalar for-loops (famous performance killers in R), run as fast as C or Fortran.

Yes, I want it fast like SEE (C)!

But the real killer is that you can do this with code as concise and expressive as Python.

Still Pythonic!

 

I am excited! Are you ready to sail the SHIP into the sea?

Understanding Deep learning!

Deep Learning is a class of algorithms that attempts to model high-level abstractions in data by using multiple layers of networks. It is called deep learning because more than one hidden layer is used in the whole network; in that sense, “deep” is a technical term.

Input layer → Hidden layer 1 → Hidden layer 2 → Output layer

Deep learning is often considered the Brahmastra that can kill every problem in the world, but that is certainly not true. Of course the Brahmastra is an amazing tool, but deep learning is no Brahmastra; still, we care about it. 🙂

For a newbie, one can use the MNIST dataset to fit and predict. In general, the goal of deep learning is to take low-level input and generate higher-level abstractions through the composition of layers. But before doing that, we need to understand the various parts of a deep-learning algorithm.

Input Layer:

This is also called the visible layer. It contains an input node for each of the entries in our feature vector. For example, in the MNIST dataset each image is 28 x 28 pixels. If we use the raw pixel intensities for the images, our feature vector will be of length 28 x 28 = 784, so there will be 784 nodes in the input layer.

Hidden Layer:

From there, nodes connect to a series of hidden layers. In the simplest terms, each hidden layer is an unsupervised Restricted Boltzmann Machine, where the output of each RBM in the hidden-layer sequence is used as input to the next. The final hidden layer then connects to an output layer.

Output Layer:

This layer contains the probabilities of each class label. For example, in our MNIST dataset we have 10 possible class labels (one for each of the digits 0-9). The output node that produces the largest probability is chosen as the overall classification.
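To make the layer picture concrete, here is a tiny numpy sketch of one forward pass through the 784 → hidden → 10 shape described above. Note the assumptions: plain ReLU layers instead of the RBM stack discussed here, random untrained weights, and arbitrary hidden sizes of 256 and 128; it only shows the plumbing, not the learning.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.RandomState(0)
x = rng.rand(784)  # one fake flattened 28x28 "image"

W1, b1 = rng.randn(256, 784) * 0.01, np.zeros(256)  # input -> hidden layer 1
W2, b2 = rng.randn(128, 256) * 0.01, np.zeros(128)  # hidden layer 1 -> hidden layer 2
W3, b3 = rng.randn(10, 128) * 0.01, np.zeros(10)    # hidden layer 2 -> output

h1 = np.maximum(0, W1.dot(x) + b1)   # hidden layer 1 (ReLU)
h2 = np.maximum(0, W2.dot(h1) + b2)  # hidden layer 2 (ReLU)
probs = softmax(W3.dot(h2) + b3)     # output layer: one probability per digit 0-9
print(probs.argmax())                # the predicted class label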

This is quite introductory-level information about deep learning and understanding the working of neural networks. For the implementation/coding part, stay tuned, or you can also reach it at: this link

Natural Language processing

NLP stands for Natural Language Processing, and most of the time it is described as teaching things to a computer so it can understand, speak, read and write words. Just as we all speak, read and write, the computer does the same in terms of machine-learning algorithms 😉

NLP is a field of computer science and computational linguistics which is driven by advanced machine learning, and which uses artificial intelligence to make decisions for particular sets of problems.

Many problems are solved, or are being solved, using NLP. As artificial intelligence is developed to understand things the way a human neural network does, here we can also talk about the test developed by Alan Turing about predicting the words a machine responds with. Turing proposed that a human evaluator would judge a conversation between a human and a machine, where the conversation happens only in a text medium, that is, using a keyboard. If the evaluator cannot reliably tell the machine from the human (Turing originally suggested that the machine would convince a human 70% of the time after five minutes of conversation), the machine is said to have passed the test. The test does not check the ability to give correct answers to questions, only how closely answers resemble those a human would give.

Now, for example, we can give the machine this input:

As __man__ is related to __woman__,

King is related to _______?

The particular answer will be __queen__ (that is exactly what the machine predicted).

So what’s under the hood? (How does the machine learn by itself? (Should we call it himself or herself? :P))

To understand that, first we need to understand how we learn. In our minds all the words are represented as an image, or a set of images, in 3D space. Whatever we think or imagine comes in the form of a vector representation. For example, if I think about a king, my mind will show me an image of a king and all the things related to a king, like a queen, an army, or a kingdom.

So that is basically what a machine is doing, except that we feed it the data in vector form.
Now meaning(king) - meaning(man) + meaning(woman) = ?

 

So when we remove the meaning of man from the meaning of king, we get something like royal, and a royal woman could be a princess or a queen. But as we know, in the general sense a queen is closer to a king than an army, a prince or a princess. So what did the machine actually understand? The machine took a vector representation with royal on the y-axis and non-royal on the x-axis, and in that graph queen and king are closest, so the machine returned queen.

Vector(King)-Vector(man)+Vector(woman) = Vector(queen)
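Here is a toy numpy sketch of exactly that arithmetic, using invented 2-D vectors laid out on the royal/non-royal axes described above (a real system would use learned embeddings with hundreds of dimensions):

import numpy as np

# Invented 2-D embeddings: x-axis = "non-royal" direction, y-axis = "royalty".
# These numbers are made up for illustration, not taken from a trained model.
vectors = {
    'king':   np.array([ 1.0, 1.0]),
    'queen':  np.array([-1.0, 1.0]),
    'man':    np.array([ 1.0, 0.0]),
    'woman':  np.array([-1.0, 0.0]),
    'prince': np.array([ 1.0, 0.5]),
}

def cosine(a, b):
    return a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

target = vectors['king'] - vectors['man'] + vectors['woman']
candidates = [w for w in vectors if w not in ('king', 'man', 'woman')]
print(max(candidates, key=lambda w: cosine(vectors[w], target)))  # -> queen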

Now we can conclude that a machine can speak, read and write. From an engineering point of view, we can replicate our decision-making knowledge, for example to tell a patient about any particular disease his/her heart has, or could get in the near future; such a system can also be used to predict the health of our heart. In 2010, the cost of cardiovascular disease in the U.S. was about $444 billion.

Using NLP we can cut down the cost of diagnosis and cure/prevention treatment, which could be considered a strong step towards using AI in the medical field. Just for play’s sake, one can also test out the https://github.com/rkomartin/heart-disease-example project.
To understand the real meaning of NLP and AI combined, in an easy way, one can go to
http://www.cleverbot.com/ and ask it various questions. For example, I asked it:

 

Me: Do you know about programming languages?
Reply: Yes, I know about Python!

Python and my fav parts

In this post one can get head-start knowledge about basic but slightly advanced corners of Python programming, which can lead to a beautiful code-base and, in time, the title of Python Ninja.

1. List comprehensions
2. Generator Expressions
3. Generator function

List comprehensions: these are basically iteration over a list.
The earlier approach:


In [3]: p = [1,2,3,4,5]
In [4]: for i in range(len(p)):
   ...:     p[i] += 10
In [5]: p
Out[5]: [11, 12, 13, 14, 15]


Using list comprehensions we can write much better code:


In [6]: p = [1,2,3,4,5]
In [7]: p = [i+10 for i in p] # this is a list comprehension
In [8]: p
Out[8]: [11, 12, 13, 14, 15]


Generator Expressions:

One potential downside of using a list comprehension is that it might
produce a large result if the original input is large. If this is a concern,
you can use generator expressions to produce the filtered values iteratively.

For example, if the input list is quite big, a generator expression is amazing:


In [37]: p = (i for i in xrange(10000))

In [38]: p

Out[38]: <generator object <genexpr> at 0x7ff16e47b0a0>

Here we got the generator object; now we can use it as we want:

We can also use a generator expression as an argument:


In [48]: p = (i for i in xrange(10000)) # remember we used xrange()
In [49]: p  = list(p)
In [50]: s = sum(i*i for i in p)
In [51]: s
Out[51]: 333283335000
In [52]: s = sum(i+10 for i in p)
In [53]: s
Out[53]: 50095000

This stuff is super fast.
Now, using a generator expression, let's convert a tuple into a CSV line:


In [54]: s = ('ACME', 50, 123.45)
In [55]: print(','.join(str(x) for x in s))
ACME,50,123.45

So from all the above discussion, one thing is now clear: we should use generator expressions as much as we can.

Generator Functions:

Any function which contains the "yield" keyword is a generator function.
There are a few examples I would like to share here:


In [21]: def hello(item):
   ...:     h = item*3
   ...:     yield h

In [22]: for i in hello(item='iamamazing'):
   ...:     print i

iamamazingiamamazingiamamazing

So the code above is a generator function, which is exactly what we are learning about.

Then what is the practical use of this thing? Generator functions are mostly fast in terms of text processing. Before going further, I need to see how generators are helpful in data processing, data flow, big data, and other kinds of stuff.
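As one concrete example of that data-flow style (the file name below is just a placeholder), generators can be chained into a lazy pipeline that never loads the whole file into memory:

def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line

def grep(pattern, lines):
    for line in lines:
        if pattern in line:
            yield line

def fields(lines, sep=','):
    for line in lines:
        yield line.rstrip('\n').split(sep)

# Each stage pulls one line at a time from the previous one, so memory
# stays flat no matter how big 'huge_log.csv' is.
pipeline = fields(grep('ERROR', read_lines('huge_log.csv')))
for row in pipeline:
    print(row)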

%d bloggers like this: