my net house



Everything you should know about Quantitative Finance (Almost Everything ;)

Quant trading is intellectually very satisfying because it draws heavily from fields as diverse as computer science, statistics, math, psychology, economics, business, operations research, and history.
Quant trading combines programming and trading. Programming is the fastest way to turn an idea into reality (compare it with writing and publishing a book, or with designing a building, getting zoning permits, and doing the construction). Trading is the most direct approach to making money (compare it with starting the next Facebook, which means programming, hiring, and advertising, or with creating a retail chain, hiring, and waiting for money to flow in so you can reinvest). Combining programming and trading gives you the most direct path from an idea in your brain to cash, but never forget that your main job is to minimize risk.


Trading and Trading Strategies:

Many terms sound complicated when you first read or hear about trading. One of them is statistical arbitrage trading. It looks complicated, but in practice it is closely related to pairs trading, which is its simplest form. Pairs trading usually means buying one and selling the other of two inter-related stocks, commodities, currencies, or futures. A trading strategy is a set of calculations that a trader or quant believes will work out in the markets in the near or distant future. Strategies can be run fully automated or semi-automated. To write a trading strategy you do not need deep programming skill or advanced mathematics, but you do need to understand simple statistical concepts. On the other hand, an advanced degree in mathematics or a similar field does not guarantee that you can write “profitable” trading strategies. I believe past trading experience matters at least as much.

How Easy it is to earn from Trading:

When we talk about programming languages, one question comes up often: ‘How fast is it?’ The counter-question is always ‘Define fast.’ The same applies here: define easy. Earning from trading can be easy, but easy must not be confused with instant. Most of the time, people and brokerage firms are more interested in intra-day trading because that is where the volume is, and more volume means more buy and sell opportunities.

Borrow Money While Trading:

A good thing about trading is that you can start with a small amount of money and still get leverage. Leverage is money you borrow from a trading firm in order to trade. If you make a profit, it is shared with the firm; if you make a loss, it is all yours. A good trading strategy has good returns, a high Sharpe ratio, small draw-downs, and uses as little leverage as possible. Even if you get a very good tip about a trade that you are quite sure will work, use as little leverage as possible. It is dangerous to over-leverage in pursuit of overnight riches.

How much time is required for self running Quantitative Business:

The time required depends on the type of trading strategies you use. Some strategies or machine-learning models need to be trained a few minutes before the market opens, and some need to be trained after the market closes. Others train their models once a week and generate buy/sell calls once a month.

How Hard it is to Create Trading Strategies:

Creating a trading strategy can be as hard as doing calculus or as easy as finding the mean or median of a data set. You can consider many things while creating a strategy, but the simplest approach is to keep thinking in terms of quantitative techniques. Do not chase the most precise or most accurate result; look for the valuable information, and quantify the things you need to know about your trading model.

Importance of Programming in Quantitative Finance:

Developing a trading strategy is largely a matter of thinking. You could develop strategies just by observing the market and writing a daily schedule, such as “generate a buy call at this time and a sell call at that time”, and that might even work. The real goal, though, is a trading bot that does this automatically, so that you can observe it from the outside and look into the things that can be tuned. Converting a trading strategy into a program is therefore one of the most important tasks to pursue.

A Dollar-neutral Portfolio:

The market value of the long positions equals the market value of the short positions.

Market-Neutral Portfolio:

The beta of the portfolio with respect to a market index is close to zero, where beta measures the ratio between the expected returns of the portfolio and the expected returns of the market index.

Both dollar-neutral and market-neutral portfolios require twice the capital or leverage of a long-only or short-only portfolio.
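
Both neutrality definitions are easy to check in code. The sketch below is only an illustration: the function names, tickers, and sample numbers are made up, and beta is estimated in the usual regression sense (covariance with the market divided by the variance of the market), which is my assumption rather than something spelled out above.

```python
import statistics


def net_dollar_exposure(positions):
    """Sum of signed market values; long positions > 0, short positions < 0."""
    return sum(positions.values())


def portfolio_beta(portfolio_returns, market_returns):
    """Estimate beta as cov(portfolio, market) / var(market), sample estimates."""
    if len(portfolio_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    mean_p = statistics.mean(portfolio_returns)
    mean_m = statistics.mean(market_returns)
    covariance = sum(
        (p - mean_p) * (m - mean_m)
        for p, m in zip(portfolio_returns, market_returns)
    ) / (len(market_returns) - 1)
    return covariance / statistics.variance(market_returns)


# Hypothetical book: 10,000 long in one stock, 10,000 short in another.
positions = {"AAA": 10_000.0, "BBB": -10_000.0}
exposure = net_dollar_exposure(positions)  # zero means dollar-neutral

# Hypothetical daily returns for the market index and for the portfolio.
market = [0.010, -0.020, 0.015, 0.005, -0.010]
portfolio = [0.002, 0.001, -0.001, 0.003, 0.000]
beta = portfolio_beta(portfolio, market)  # close to zero means market-neutral
```

A portfolio can be dollar-neutral without being market-neutral (and the other way round), which is why the two checks are kept separate.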


Directional Strategy – A strategy that trades in one direction only, either buying or selling.
Bi-directional Strategy – A strategy that does both buying and selling.


Survivorship-Bias-Free Data and its importance:

For intra-day trading you can back-test on a database that is not free of survivorship bias, because in most cases a good company does not collapse in a single day, and the same holds for the rise of a company.

How the holding period affects your trading strategies:

I was building trading strategies based on various factors. Using machine learning, I predicted whether tomorrow's close price would be higher than today's; if so I bought the stock, otherwise I sold it. It is a great exercise, but in practice one thing needs to be understood carefully: I was holding a stock for the whole day, and anything can go wrong in a whole day. So what I had to do was shorten the holding period. The less time I hold a particular stock, the less chance there is for something to go wrong, and the lower the risk. 🙂

Sharpe Ratio (most important thing):

The Sharpe ratio tells you how consistent your returns are. Think of the Sharpe ratio as independent of the benchmark: if you want a strategy with good returns and a high Sharpe ratio, your understanding of the market has to produce rules that generalize. Your Sharpe ratio should be more than 1 (one); the strategy's returns may be lower or higher than the benchmark returns. The Sharpe ratio is a special case of the information ratio, in which the benchmark is the risk-free rate. The formula is as follows:
Information Ratio (Sharpe Ratio) = Average of Excess Returns / Standard Deviation of Excess Returns
Excess Returns = Portfolio Returns - Benchmark Returns
As a rule of thumb, any strategy that has a Sharpe ratio of less than 1 is not suitable as a stand-alone strategy. For a strategy that achieves profitability almost every month, its (annualized) Sharpe ratio is typically greater than 2. For a strategy that is profitable almost every day, its Sharpe ratio is usually greater than 3.

One important thing you must know is that the benchmark varies according to the market or exchange you are working with.
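
The formula above translates almost directly into code. This is a minimal sketch with made-up daily returns; the function name is mine, and scaling by the square root of the number of periods per year is the usual annualization convention, which assumes the per-period returns are roughly independent.

```python
import math
import statistics


def annualized_sharpe(returns, benchmark_returns, periods_per_year=252):
    """Annualized mean / standard deviation of per-period excess returns."""
    excess = [r - b for r, b in zip(returns, benchmark_returns)]
    mean_excess = statistics.mean(excess)
    std_excess = statistics.stdev(excess)  # sample standard deviation
    if std_excess == 0:
        raise ValueError("excess returns have zero variance")
    return math.sqrt(periods_per_year) * mean_excess / std_excess


# Hypothetical daily strategy returns and a flat daily risk-free rate.
daily_returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002]
risk_free_daily = [0.0001] * len(daily_returns)
sharpe = annualized_sharpe(daily_returns, risk_free_daily)
```

Passing a market index instead of the risk-free rate as `benchmark_returns` gives the information ratio against that index.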

Sharpe-Ratio Vs Returns:

If the Sharpe ratio is such a nice performance measure across different strategies, you may wonder why it is not quoted more often instead of returns. A higher Sharpe ratio will actually allow you to make more profits in the end, since it allows you to trade at a higher leverage. It is the leveraged return that matters in the end, not the nominal return of a trading strategy.

Why and How Draw-Down is bad:

A strategy suffers a draw-down whenever it has lost money recently.
A draw-down at a given time t is defined as the difference between the current equity value (assuming no redemption or cash infusion) of the portfolio and the global maximum of the equity curve occurring on or before time t.

You must know the draw-down of a strategy before using it!

Length and depth of your Draw down:

The length of a draw-down tells you how long the strategy takes to recover its previous peak, and the depth tells you how much you can lose. These numbers come from your back-test, though. In real trading you have to treat them with care, because actual draw-downs can turn out much smaller or much larger than the back-test results.
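
Here is a small sketch of how depth and length could be measured on an equity curve. The function name and the sample equity values are made up, and the depth is expressed as a fraction of the running peak rather than as an absolute difference, which is my choice for easier comparison across account sizes.

```python
def drawdown_stats(equity_curve):
    """Return (max_depth, max_length) for a list of equity values.

    Depth is the largest drop below the running peak, as a fraction of
    that peak. Length is the largest number of consecutive periods spent
    below the most recent peak.
    """
    peak = equity_curve[0]
    max_depth = 0.0
    length = 0
    max_length = 0
    for value in equity_curve:
        if value >= peak:
            peak = value  # new high-water mark, the draw-down is over
            length = 0
        else:
            length += 1
            max_depth = max(max_depth, (peak - value) / peak)
            max_length = max(max_length, length)
    return max_depth, max_length


# Hypothetical equity curve, one value per day.
equity = [100, 110, 105, 95, 100, 112, 108, 115]
depth, length = drawdown_stats(equity)
```

For this curve the worst draw-down runs from the peak of 110 down to 95 and lasts three days before a new high is reached.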
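
What is Slippage?

There is a delay between the moment your strategy triggers an order and the moment the order is executed. This delay can cause “slippage,” the difference between the price that triggers the order and the execution price.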

How Will Transaction Costs Affect the Strategy?
A transaction cost is the brokerage fee, or any other cost, that you pay when you buy or sell a stock. As we saw, a shorter holding period means less risk, but it also means more trades. Never forget that minimizing risk is what strategies and algorithmic trading are all about. Every time a strategy buys and sells a security, it incurs a transaction cost. The more frequently it trades, the larger the impact of transaction costs will be on the profitability of the strategy. These transaction costs are not just due to commission fees charged by the broker. There will also be the cost of liquidity: when you buy and sell securities at their market prices, you are paying the bid-ask spread. If you buy and sell securities using limit orders, however, you avoid the liquidity costs but incur opportunity costs. This is because your limit orders may not be executed, and therefore you may miss out on the potential profits of your trade. Also, when you buy or sell a large chunk of securities, you will not be able to complete the transaction without impacting the prices at which this transaction is done. This effect on the market prices due to your own order is called market impact, and it can contribute to a large part of the total transaction cost when the security is not very liquid.
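
To get a feel for how quickly these costs eat into a trade, here is a rough sketch that subtracts commission and the bid-ask spread from a gross return. The function name and all numbers are hypothetical, market impact is ignored, and the model assumes you enter and exit with market orders, each of which pays half the spread.

```python
def net_return(gross_return, commission_rate, bid, ask):
    """Round-trip net return after commissions and crossing the spread."""
    mid = (bid + ask) / 2
    half_spread = (ask - bid) / 2 / mid  # cost of one market order, as a fraction
    round_trip_cost = 2 * (commission_rate + half_spread)
    return gross_return - round_trip_cost


# Hypothetical trade: 0.5% gross gain, 0.05% commission per side,
# quoted at 99.95 bid / 100.05 ask.
net = net_return(0.005, 0.0005, bid=99.95, ask=100.05)
```

In this example the costs take away roughly 0.2 percentage points of the 0.5% gross gain, and a strategy that trades many times a day pays that on every round trip.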

Why survivorship bias should not be there:

A back-test run on data that leaves out de-listed stocks overstates performance. This is especially true if the strategy has a “value” bent; that is, it tends to buy stocks that are cheap. Some stocks were cheap because the companies were going bankrupt shortly. So if your strategy includes only those cases when the stocks were very cheap but eventually survived (and maybe prospered) and neglects those cases where the stocks finally did get de-listed, the backtest performance will, of course, be much better than what a trader would actually have suffered at that time.

Data-Snooping Bias (Model Over-fitting):
If you build a trading strategy that has 100 parameters, it is very likely that you can optimize those parameters in such a way that the historical performance will look fantastic. It is also very likely that the future performance of this strategy will look nothing like its historical performance and will turn out to be very poor. By having so many parameters, you are probably fitting the model to historical accidents that will not repeat themselves in the future. Actually, this so-called data-snooping bias is very hard to avoid even if you have just one or two parameters (such as entry and exit thresholds).

Important Questions to ask yourself:

1. How much time do you have for baby-sitting your trading programs?
2. How good a programmer are you?
3. How much capital do you have?
4. Is your goal to earn steady monthly income or to strive for a large, long-term capital gain?

Important questions you must ask yourself before using a Trading Strategy:
1. Does it outperform a benchmark?
2. Does it have a high enough Sharpe ratio?
3. Does it have a small enough drawdown and short enough drawdown duration?
4. Does the backtest suffer from survivorship bias?
5. Does the strategy lose steam in recent years compared to its earlier years?
6. Does the strategy have its own “niche” that protects it from intense competition from large institutional money managers?

Sharpe Ratio and draw-downs (depth and duration):
Quantitative traders use a good variety of performance measures. Which set of numbers to use is sometimes a matter of personal preference, but with ease of comparisons across different strategies and traders in mind, I would argue that the Sharpe ratio and draw-downs are the two most important. Notice that I did not include average annualized returns, the measure most commonly quoted by investors, because if you use this measure, you have to tell people a number of things about what denominator you use to calculate returns. For example, in a long-short strategy, did you use just one side of capital or both sides in the denominator? Is the return a leveraged one (the denominator is based on account equity), or is it unleveraged (the denominator is based on market value of the portfolio)? If the equity or market value changes daily, do you use a moving average as the denominator, or just the value at the end of each day or each month? Most (but not all) of these problems associated with comparing returns can be avoided by quoting Sharpe ratio and draw-down instead as the standard performance measures.

Interesting things about Back-Testing:

Back-testing is the process of creating the historical trades given the historical information available at that time, and then finding out what the subsequent performance of those trades is. This process seems easy given that the trades were made using a computer algorithm in our case, but there are numerous ways in which it can go wrong. Usually, an erroneous back-test would produce a historical performance that is better than what we would have obtained in actual trading. We have already seen how survivorship bias in the data used for back-testing can result in inflated performance.

Importance of Sample Size (How much historical data you need?):
The most basic safeguard against data-snooping bias is to ensure that you have a sufficient amount of back-test data relative to the number of free parameters you want to optimize. As a rule of thumb, let’s assume that the number of data points needed for optimizing your parameters is equal to 252 times the number of free parameters your model has. So, for example, let’s assume you have a daily trading model with three parameters. Then you should have at least three years’ worth of back-test data with daily prices.

However, if you have a three-parameter trading model that updates positions every minute, the rule asks for 3 × 252 = 756 one-minute data points, and at 390 trading minutes per day that is only 756/390, or about two, trading days of one-minute back-test data. That is the bare minimum the rule implies; a sample that short sees very few market conditions, so a longer history is preferable. (Note that if you have a daily trading model, then even if you have seven months of minute-by-minute data points, effectively you only have about 7 × 21 = 147 data points, far from sufficient for testing a three-parameter model.)
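
The rule of thumb is simple enough to turn into a sanity check that you run before optimizing anything. This is a sketch with names of my own choosing; the factor of 252 comes straight from the rule above.

```python
DATA_POINTS_PER_PARAMETER = 252  # rule-of-thumb factor from the text


def required_data_points(n_free_parameters):
    """Minimum number of back-test data points suggested by the rule."""
    return DATA_POINTS_PER_PARAMETER * n_free_parameters


def has_enough_data(n_data_points, n_free_parameters):
    """True if the back-test sample meets the rule of thumb."""
    return n_data_points >= required_data_points(n_free_parameters)


# A daily model with three parameters needs three years of daily prices.
needed = required_data_points(3)

# Seven months of data gives a daily model only about 7 * 21 points.
daily_points_in_seven_months = 7 * 21
enough = has_enough_data(daily_points_in_seven_months, 3)
```

Note that the count is in data points at the frequency the model actually trades, not in rows of raw data.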

Training-Set and Test-Set:

It is a very simple concept. You split the data into two portions: a training set for your model to learn from, and a test set to check the model on data it has not seen.
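
For price data the split is usually done by time, so that the model is always tested on data that comes after everything it was trained on. A minimal sketch, with a made-up function name and sample prices, and a 75/25 split chosen only for illustration:

```python
def chronological_split(rows, train_fraction=0.75):
    """Split time-ordered rows into (training_set, test_set) without shuffling."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical daily closing prices, oldest first.
closes = [101.0, 102.5, 101.8, 103.2, 104.0, 103.5, 105.1, 106.0]
train_set, test_set = chronological_split(closes)
```

Shuffling before splitting, as is common for other machine-learning tasks, would leak future prices into the training set, which is a form of look-ahead bias.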

What is Profit cap?:

A profit cap is the profit level at which you want your strategy to exit. In practice, reaching the profit cap is the ultimate goal of the strategy. A greedy strategy without a profit cap and a stop-loss could destroy all of your capital.
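
As a rough sketch, a profit cap and a stop-loss can be expressed as one exit rule. The function name, the thresholds of 5% and 2%, and the restriction to long positions are all assumptions made for this example.

```python
def exit_signal(entry_price, current_price, profit_cap=0.05, stop_loss=0.02):
    """Return 'take_profit', 'stop_loss', or 'hold' for a long position."""
    change = (current_price - entry_price) / entry_price
    if change >= profit_cap:
        return "take_profit"  # the profit cap has been reached, exit with the gain
    if change <= -stop_loss:
        return "stop_loss"    # the loss limit has been hit, exit to protect capital
    return "hold"


# Hypothetical position entered at 100.
decision_up = exit_signal(100.0, 106.0)
decision_down = exit_signal(100.0, 97.0)
decision_flat = exit_signal(100.0, 101.0)
```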

Parameter-less model:

This is a self-sustaining model that does everything by itself. That means you need to make sure the model is safe, and that all the parameters, such as the profit cap, are calculated by the model itself.

The advantage of a parameterless trading model is that it minimizes the danger of over-fitting the model to multiple input parameters (the so-called “data-snooping bias”). So the back-test performance should be much closer to the actual forward performance. (Note that parameter optimization does not necessarily mean picking one best set of parameters that give the best back-test performance. Often, it is better to make a trading decision based on some kind of averages over different sets of parameters.)

A Simple understanding of Back-testing:

Backtesting is about conducting a realistic historical simulation of the performance of a strategy. The hope is that the future performance of the strategy will resemble its past performance, though as your investment manager will never tire of telling you, this is by no means guaranteed!
There are many nuts and bolts involved in creating a realistic historical back-test and in reducing the divergence between the future performance of the strategy and its back-test performance.

Things to take care in Back-Testing:

Data: Split/dividend adjustments, noise in daily high/low, and survivorship bias.
Performance measurement: Annualized Sharpe ratio and maximum draw-down.
Look-ahead bias: Using unobtainable future information for past trading decisions.
Data-snooping bias: Using too many parameters to fit historical data, and avoiding it using large enough sample, out-of-sample testing, and sensitivity analysis.
Transaction cost: Impact of transaction costs on performance.
Strategy refinement: Common ways to make small variations on the strategy to optimize performance.


Importance of Speed in Algorithmic Trading:

Many things come into play when you talk about HFT and speed. It matters which part of your trading algorithm takes the most time to execute. Think of the 90/10 rule in software development: optimize the 10% of your code that takes 90% of the run time. If your trading strategy is written in Python, some portions may be slow, so it can be better to write those in C or C++. Alternatively, you can use Cython, which keeps development fast while making execution much faster. Your internet connection should be fast as well, so that you can receive data quickly and make decisions based on it.

Let’s again talk about importance of Programming:

You can use one of the various ready-made trading platforms, or you can build a custom one that combines different back-tests, different exchanges (through their APIs), different machine-learning models, and different platforms and programming languages.

How to Decrease Brokerage Cost using Programming:
In order to minimize market impact cost, you should limit the size (number of shares) of your orders based on the liquidity of the stock. One common measure of liquidity is the average daily volume (it is your choice what lookback period you want to average over).
As a rule of thumb, each order should not exceed 1 percent of the average daily volume. As an independent trader, you may think that it is not easy to reach this 1 percent threshold, and you would be right when the stock in question is a large-cap stock belonging to the S&P 500. However, you may be surprised by the low liquidity of some small-cap stocks out there.
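
The 1 percent rule is easy to enforce in code before an order goes out. A minimal sketch, assuming you already have a list of recent daily volumes; the function names, the 20-day lookback, and the sample volumes are mine.

```python
def max_order_size(daily_volumes, lookback=20, participation=0.01):
    """Largest order (in shares) allowed under the average-daily-volume rule."""
    recent = daily_volumes[-lookback:]
    if not recent:
        raise ValueError("need at least one day of volume data")
    average_daily_volume = sum(recent) / len(recent)
    return int(average_daily_volume * participation)


def cap_order(desired_shares, daily_volumes, lookback=20, participation=0.01):
    """Shrink an order so it does not exceed the liquidity limit."""
    return min(desired_shares, max_order_size(daily_volumes, lookback, participation))


# Hypothetical small-cap stock trading about 100,000 shares a day.
volumes = [120_000, 80_000, 100_000]
capped = cap_order(5_000, volumes, lookback=3)
untouched = cap_order(300, volumes, lookback=3)
```

An order that gets capped can be split into several smaller orders over time instead of being sent all at once.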


Paper Trading and Re-creating Strategies (Testing in the Real Market):
The moment you start paper trading, you may realize that there is a glaring look-ahead bias in your strategy: there may just be no way you could have obtained some crucial piece of data before you enter an order! If this happens, it is “back to the drawing board.” You should be able to run your ATS (automated trading system), execute paper trades, and then compare the paper trades and profit and loss (P&L) with the theoretical ones generated by your backtest program using the latest data. If the difference is not due to transaction costs (including an expected delay in execution for the paper trades), then your software likely has bugs.

Another benefit of paper trading is that it gives you better intuitive understanding of your strategy, including the volatility of its P&L, the typical amount of capital utilized, the number of trades per day, and the various operational difficulties including data issues. Even though you can theoretically check out most of these features of your strategy in a back-test, one will usually gain intuition only if one faces them on a daily, ongoing basis. Back-testing also won’t reveal the operational difficulties, such as how fast you can download all the needed data before the market opens each day and how you can optimize your operational procedures in actual execution.

Psychology and Trading:

This is one of the most important concepts you must know. Trading involves real money, and real money can drive you mad in lots of ways. I will point it out again: algorithmic trading is about minimizing your risk, not about getting rich instantly. Yes, you could create strategies with a high Sharpe ratio and sell them to firms like JP Morgan or other big banks, and then you would get rich very quickly. 🙂

More or less, it is not just the trading strategy you create; it is also your mind and your experience that help you, and your capital, grow.

How we can implement RISK-Management Using Programming?:

Position sizing is one place where risk management can be put into code, and the Kelly criterion is a common starting point. Calculating the Kelly criterion is relatively simple and relies on two basic components: your trading strategy’s win percentage probability and its win to loss ratio.

The win percentage probability is the probability that a trade will have a positive return. The win to loss ratio is equal to the average gain of your winning trades divided by the average loss of your losing trades.

These will help you arrive at a number called the Kelly percentage. This gives you a guide to what percentage of your trading account is the maximum amount you should risk on any given trade.

The formula for the Kelly percentage looks like this:

Kelly % = W – [(1 – W) / R]

  • Kelly % = Kelly percentage
  • W = Win percentage probability (total number of winning trades/total number of trades)
  • R = Win to loss ratio (average gain of winning trades/average loss of losing trades)
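
A short sketch of the formula applied to a list of past trade results. The function name and the sample profits and losses are hypothetical. R is taken here as the average winning trade divided by the average losing trade, which is the form the Kelly formula expects, and break-even trades are counted as non-wins.

```python
def kelly_fraction(trade_pnls):
    """Kelly % = W - (1 - W) / R, computed from per-trade profit and loss."""
    wins = [pnl for pnl in trade_pnls if pnl > 0]
    losses = [-pnl for pnl in trade_pnls if pnl < 0]
    if not wins or not losses:
        raise ValueError("need at least one winning and one losing trade")
    w = len(wins) / len(trade_pnls)                             # win probability
    r = (sum(wins) / len(wins)) / (sum(losses) / len(losses))   # avg win / avg loss
    return w - (1 - w) / r


# Hypothetical history: three winners of 200 and two losers of 100.
pnls = [200.0, -100.0, 200.0, -100.0, 200.0]
kelly = kelly_fraction(pnls)
```

With W = 0.6 and R = 2 this gives a Kelly percentage of about 40%. Many traders risk only a fraction of the Kelly number, because W and R estimated from a back-test are themselves uncertain.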


Max Dama:

Quantitative Trading by Ernie P Chan:

Google Search:

Important Julia Packages

  1. IJulia (Julia in IPython)

Julia runs very well in your IPython notebook environment. After all, all you have to do is data science and machine learning. 🙂


1.1 Open the Julia prompt (on Ubuntu, type the ‘julia’ command in your terminal).

1.2 Run the command > Pkg.add("IJulia") # it will do almost all the work.

2. DataFrames: Whenever you have to read a lot of Excel-style files, the Julia DataFrames package is good to go.


3. Arduino:

A Julia package for interacting with Arduino.

4. Neural network implementation in Julia

5. Visualizing and plotting in Julia:

6. Reading and writing CSV files in Julia

7. Data clustering in Julia:

For a larger list of packages, please refer to the following link:

Note*: You can also run most shell commands in the Julia environment. 🙂

Hacker’s Guide to Quantitative Trading(Quantopian Python) Part 2

Quantopian provides the required API functions, data, a helpful community, and a batteries-included web-based dashboard to play with algorithmic trading, create your own trading strategies, and launch your trading model in the live market.

Here I will only talk about code and how it should be written to create your own trading strategy.

There are basically two methods:

initialize() and handle_data()

initialize() acts as the initializer for various variables, much like the __init__ method in Python.

Which variables we declare in initialize() depends on the strategy. We can select a limited number of stocks, the number of days, the type of trading, and the variables required by the algorithm.

A very simple example of initialize() code could look as follows:

def initialize(context): # consider context just as 'self' in Python

   context.stocks = [sid(24),sid(46632)] # sid stands for stock_id

initialize() also contains the stuff that can be used many times or all the times in our Trading Algorithm:

1. A counter that keeps track of how many minutes in the current day we’ve got.

2. A counter that keeps track of our current date.

3. A list that stores the securities that we want to use in our algorithm.

Whatever variables that you define here will remain persistent (meaning that they’ll exist) but will be mutable. So that means that if you initialize context.count as 0 in initialize, you can always change it later in handle_data().

A Simple Example of handle_data():

def handle_data(context, data):

   for stock in context.stocks:

        if stock in data:

            pass # the per-stock trading logic goes here


Momentum Strategy:(Common Trading Strategy)

In this strategy we use the moving average price of a stock as the main factor in deciding whether to go long or short on a security.

Here is a simple explanation of the momentum strategy:

● If the current price is greater than the moving average, long the security

● If the current price is less than the moving average, short the security

Now we will use the Quantopian API to implement this strategy. Instead of a single moving average, though, our algorithm is going to be a little more sophisticated: we are going to look at two moving averages, the 50 day moving average and the 200 day moving average.

David Edwards writes that “the idea is that stocks with similar 50 & 200 day moving averages are more likely to be fairly valued and the algorithm will avoid some of the wild swings that plague momentum strategies. The 50/200 day crossover is also a very common signal, so stocks might be more likely to continue in the direction of the 50day MA because a lot of investors enter and exit positions at that threshold.”

The decision-making behind the moving averages is as follows:

● If the 50 day moving average is greater than the 200 day moving average, long the security/stock.

● If the 50 day moving average is less than the 200 day moving average, short the security/stock.
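
Before moving to the Quantopian version, the two rules above can be sketched in plain Python on a list of closing prices. This is only an illustration with function names and price series of my own; it does not use the Quantopian API.

```python
def moving_average(prices, window):
    """Simple moving average of the last `window` prices."""
    if len(prices) < window:
        raise ValueError("not enough prices for this window")
    return sum(prices[-window:]) / window


def crossover_signal(prices, short_window=50, long_window=200):
    """Return 'long', 'short', or 'flat' from the 50/200 day comparison."""
    short_ma = moving_average(prices, short_window)
    long_ma = moving_average(prices, long_window)
    if short_ma > long_ma:
        return "long"
    if short_ma < long_ma:
        return "short"
    return "flat"


# Hypothetical price histories, oldest first.
rising = [float(p) for p in range(1, 301)]
falling = list(reversed(rising))
signal_up = crossover_signal(rising)
signal_down = crossover_signal(falling)
```

In a steadily rising series the recent 50 day average sits above the 200 day average, so the signal is long; the falling series gives the opposite.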

So now Let’s make a Trading Bot!

1. First we have to create our initialize() function:

Please read the comments in the code.

def initialize(context):

   # set_universe is an inbuilt Quantopian function that gives us the stocks within the required universe.
   # Here we use DollarVolumeUniverse with 99.5% and 100% as our floor and ceiling, which means we select
   # the stocks in the top 0.5% of our universe by dollar*volume score.
   set_universe(universe.DollarVolumeUniverse(floor_percentile=99.5, ceiling_percentile=100))

   context.stocks_to_long = 5
   context.stocks_to_short = 5
   context.rebalance_date = None # set to today's date at each rebalance; positions stay active for 10 days
   context.rebalance_days = 10 # just an assumption for now; tune it to a finer value later

Now that we have defined the required parameters in initialize(), let’s move on to handle_data():


from datetime import timedelta # place this import at the top of the algorithm

def handle_data(context, data):

   if context.rebalance_date is not None: # if a rebalance date is set, compute the next date for changing positions
       next_date = context.rebalance_date + timedelta(days=context.rebalance_days) # next_date is rebalance_days after rebalance_date

   if context.rebalance_date is None or get_datetime() >= next_date: # first run, or 10 days have passed since we last went long/short
       context.rebalance_date = get_datetime() # rebalance today, so next_date moves 10 days ahead again
       historical_data = history(200, '1d', 'price')

This gets the historical data of all stocks in the universe set up in initialize(): '1d' = one-day bars, 200 = number of days, 'price' = we only fetch prices because that is all our strategy requires. For some other strategy the volume of the stock could be more useful.

       past_50days_mean = historical_data.tail(50).mean()
       past_200days_mean = historical_data.mean()
       diff = past_50days_mean / past_200days_mean - 1

       # if diff > 0 we will long, if diff < 0 we will short
       buys = diff[diff > 0]
       sells = diff[diff < 0]
       # buys and sells now hold the securities whose 50 day average is above or below the 200 day average

       buys = buys.sort_values() # ascending: smallest positive diff first, i.e. closest to the crossover
       sells = sells.sort_values(ascending=False) # descending: least negative diff first, again closest to the crossover

       # number of securities to trade on each side and the weight of each one (equal weights assumed)
       buy_length = min(context.stocks_to_long, len(buys))
       short_length = min(context.stocks_to_short, len(sells))
       buy_weight = 1.0 / buy_length if buy_length != 0 else 0
       short_weight = -1.0 / short_length if short_length != 0 else 0

       buys = buys.iloc[:buy_length] if buy_weight != 0 else None # buy_length = number of securities we want to purchase
       sells = sells.iloc[:short_length] if short_weight != 0 else None # short_length = number of securities we want to short

Now we have two lists, buys and sells (remember them carefully). All the decisions are going to be made based on these two lists.

We can also implement risk factors in our trading strategy. Let’s implement a minimal form of risk control: 0.02% of the last traded price, which means that if the price of a security moves against us by more than that, we exit.

We will go through each security in our data/universe, and the ones that appear in the ‘buys’ or ‘sells’ list will be bought or sold.

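       # NOTE: the order calls in this block were lost from the post; the stops line, the log
       # calls and the order_target_percent calls are a reconstruction from the surrounding
       # comments and should be treated as an assumption.

       # stops: per-security cushion for the stop orders, 0.02% of the last traded price
       stops = historical_data.iloc[-1] * 0.0002

       for sym in data:

           # if security exists in our sells data
           if sells is not None and sym in sells.index:
               log.info('Short:%s' % sym.symbol)
               # here stop_price is the real-time price of the security plus the change allowed in stops
               # order_target_percent is an inbuilt function
               order_target_percent(sym, short_weight, stop_price=data[sym].price + stops[sym])

           # if security exists in our buy data
           elif buys is not None and sym in buys.index:
               log.info('Long:%s' % sym.symbol)
               order_target_percent(sym, buy_weight, stop_price=data[sym].price - stops[sym])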




The `order_target_percent` method allows you to order a % target of your portfolio in that security. So this means that if 0% of your total portfolio belongs in AAPL and you order 30%, it will order 30%. But if you had 25% already and you tried ordering 30%, it will order 5%.
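
The arithmetic behind a percentage target can be sketched in a few lines of plain Python. This is not Quantopian's implementation, just an illustration of the idea; the function name and the numbers are made up, and fractional shares are truncated.

```python
def shares_to_reach_target(portfolio_value, position_value, price, target_percent):
    """Shares to buy (positive) or sell (negative) to reach the target weight."""
    target_value = portfolio_value * target_percent
    return int((target_value - position_value) / price)


# Hypothetical 100,000 portfolio, stock priced at 100.
from_zero = shares_to_reach_target(100_000.0, 0.0, 100.0, 0.30)
from_quarter = shares_to_reach_target(100_000.0, 25_000.0, 100.0, 0.30)
```

Starting from nothing, the full 30% is ordered; starting from a 25% holding, only the missing 5% is ordered.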

You can order in three special ways if you don’t want a normal market order:

● `stop_price`: creates a stop order

● `limit_price`: creates a limit order

● `StopLimitOrder`: creates a stop-limit order

How Trading Differentiates from Gambling:

When you find that you are getting good returns on your capital, you are tempted to try to beat the market. Beating the market means trying to earn much more than the market is returning on your stock. Traders attempt this in various ways, such as betting against the momentum or looking for bad news in the market. Some people are really good at this kung-fu, but you are a budding trader with limited money of your own, so remember one important thing: “Protect your capital.” That is what most big banks do, and if they hire you as a quant or trade-execution person they will expect the same from you. Big banks have billions of dollars that they do not want to lose, but they definitely want to use that money to get good returns from the market.

So most of the time they follow one simple rule.

Guaranteed returns even if those are low.

[Make sure returns are still positive after subtracting costs such as brokerage and leverage. Getting positive returns while ignoring market costs is far easier, but such strategies should not be used with real money.]

So the real key is to think like a computer programmer first: make it work before anything else. The first thing to make sure of is that you get returns, even low ones, that are stable, by calculating the various risk factors.

I am quoting some of informative things from SentDex Tutorial:

Most individual traders are trading on account sizes of somewhere between maybe $25,000 and $100,000 USD, so their motives are to hopefully increase that account size as much as possible, so this person is more likely to take part in High Risk High Yield (HRHY).

Most people who use HRHY strategies, tend to ignore the HR (High Risk) part, focusing on the HY (High Yield).

The same is common with gamblers, even against astronomical odds with things like the lottery.

In other words, always ask yourself: what is it about the market that makes my strategy work? Because, at the end of the day, algorithmic trading is more about trading than about algorithms.

Python for text processing

Python is more about ‘Programming like Hacker’ while writing your code if you keep things in mind like reference counting, type-checking, data manipulation, using stacks, managing variables,eliminating usage of lists, using less and less “for” loops could really warm up your code for great looking code as well as less usage of CPU-resources with great Speed.

Slower than C:

Yes, Python is slower than C, but you really need to ask yourself what “fast” means and what you actually want to do. There are several ways to write Fibonacci in Python. The most popular one uses a ‘for loop’, only because most programmers coming from a C background use lots and lots of for loops for iteration. Python has for loops as well, but if you can avoid a for loop by using the internal loops provided by Python data structures and by array libraries like NumPy, you will have a win-win situation most of the time. 🙂
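As a small sketch of that idea (written for Python 3; both function names are made up for this example), here is the same calculation done with an explicit loop and with the loop pushed into the built-in sum:

```python
def sum_squares_loop(n):
    """C-style approach: an explicit for loop with a running total."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    """Let the built-in sum() drive the iteration instead."""
    return sum(i * i for i in range(n))


# Both versions give the same answer; the second has no loop body to maintain.
print(sum_squares_loop(1000) == sum_squares_builtin(1000))
```

How much faster the second form is depends on the interpreter and the data, so measure it on your own workload before relying on it.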

Now let’s go through some Python tricks that are super cool if you spend most of your job manipulating, filtering, extracting and parsing data.

Python has many built-in text processing methods:

>>> m = ['i am amazing in all the ways I should have']

>>> m[0]

'i am amazing in all the ways I should have'

>>> m[0].split()

['i', 'am', 'amazing', 'in', 'all', 'the', 'ways', 'I', 'should', 'have']

>>> n = m[0].split()

>>> n[2:]

['amazing', 'in', 'all', 'the', 'ways', 'I', 'should', 'have']

>>> n[0:2]

['i', 'am']

>>> n[-2]

'should'

>>> n[:-2]

['i', 'am', 'amazing', 'in', 'all', 'the', 'ways', 'I']

>>> n[::-2]

['have', 'I', 'the', 'in', 'am']

All of that is string manipulation done with list slicing. Yeah, no for loops.

Interesting portions of Collections module:

Now let’s talk about collections.

Counter is just my personal favorite.

When you have to go through ‘BIG’ lists and count how often each item actually occurs:

from collections import Counter

>>> Counter(xrange(10))

Counter({0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1})

>>> just_list_again = Counter(xrange(10))

>>> just_list_again_is_dict = just_list_again

>>> just_list_again_is_dict[1]

1

>>> just_list_again_is_dict[2]

1

>>> just_list_again_is_dict[3]

1

>>> just_list_again_is_dict['3']  # a missing key counts as 0 instead of raising KeyError

0

Some other methods using Counter:

>>> c1=Counter('abraakadabraaaaa')

>>> c1

Counter({'a': 10, 'r': 2, 'b': 2, 'k': 1, 'd': 1})

>>> c1.most_common(4)

[('a', 10), ('r', 2), ('b', 2), ('k', 1)]

>>> c1['b']

2

>>> c1['b'] # works like a dictionary

2

>>> c1['k'] # works like a dictionary

1

>>> type(c1)

<class 'collections.Counter'>

>>> c1['b'] = 20

>>> c1.most_common(4)

[('b', 20), ('a', 10), ('r', 2), ('k', 1)]

>>> c1['b'] += 20

>>> c1.most_common(4)

[('b', 40), ('a', 10), ('r', 2), ('k', 1)]


Arithmetic and unary operations:

>>> from collections import Counter

>>> c1=Counter('hello hihi hoo')

>>> +c1

Counter({'h': 4, 'o': 3, ' ': 2, 'i': 2, 'l': 2, 'e': 1})

>>> -c1

Counter()

>>> c1['x']

0

Counter is like a dictionary, but it also keeps count of every item you feed it, so you can plot the counts on graphs.
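A typical text-processing use is counting words. This is a minimal sketch for Python 3, and the sample sentence is made up:

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the fox"

# split() breaks the sentence into words; Counter tallies each one.
word_counts = Counter(text.split())

top_two = word_counts.most_common(2)
print(top_two)               # [('the', 3), ('fox', 2)]
print(word_counts["cat"])    # 0, because missing words simply count as zero
```

The (word, count) pairs returned by most_common are already in the shape most plotting libraries expect for a bar chart.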


OrderedDict keeps your chunks of data in a meaningful order.

>>> from collections import OrderedDict
>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
>>> new_d = OrderedDict(sorted(d.items()))
>>> new_d
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> for key in new_d:
...     print (key, new_d[key])
apple 4
banana 3
orange 2
pear 1


Think of it this way: you need to save each line of your CSV into a list of lines, and besides watching memory you also want to store each line in a dictionary-like structure so that you can reach its fields by name. This comes up whenever you fetch lines from an Excel or CSV document in a data-processing environment.

from collections import namedtuple

# The primitive approach
lat_lng = (37.78, -122.40)
print('The latitude is %f' % lat_lng[0])
print('The longitude is %f' % lat_lng[1])

# The glorious namedtuple
LatLng = namedtuple('LatLng', ['latitude', 'longitude'])
lat_lng = LatLng(37.78, -122.40)
print('The latitude is %f' % lat_lng.latitude)
print('The longitude is %f' % lat_lng.longitude)
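For the CSV case mentioned earlier, a namedtuple can be built straight from the header row. This is a sketch for Python 3; the column names and values are invented, and io.StringIO stands in for a real file:

```python
import csv
import io
from collections import namedtuple

# Hypothetical CSV content; in practice this would be an open file.
raw = io.StringIO("symbol,price,volume\nACME,123.45,50\nINIT,9.99,1200\n")

reader = csv.reader(raw)
Row = namedtuple("Row", next(reader))   # the header line supplies the field names
rows = [Row(*line) for line in reader]  # one lightweight record per CSV line

for row in rows:
    # Fields are reached by name; csv hands back strings, so convert as needed.
    print(row.symbol, float(row.price) * int(row.volume))
```

A namedtuple stores no per-row dictionary, so it stays close to a plain tuple in memory while still giving you named fields. It does require the header names to be valid Python identifiers.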


ChainMap is a container of containers. Yes, that’s really true. 🙂

You need Python 3.3 or later to try this code.

>>> from collections import ChainMap

>>> a1 = {'m':2,'n':20,'r':490}

>>> a2 = {'m':34,'n':32,'z':90}

>>> chain = ChainMap(a1,a2)

>>> chain

ChainMap({'n': 20, 'm': 2, 'r': 490}, {'n': 32, 'm': 34, 'z': 90})

>>> chain['n']

20

# Note one thing: ChainMap does not combine the dictionaries; it chains them and searches them in order.

>>> new_chain = ChainMap({'a':22,'n':27},chain)

>>> new_chain['a']

22

>>> new_chain['n']

27
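One common use of this lookup order is layering overrides on top of defaults. The setting names below are made up for this sketch:

```python
from collections import ChainMap

defaults = {"host": "localhost", "port": 8080, "debug": False}
overrides = {"port": 9090}

# Lookups try `overrides` first and fall back to `defaults`.
settings = ChainMap(overrides, defaults)

print(settings["port"])   # 9090, found in overrides
print(settings["host"])   # localhost, found in defaults

# The dictionaries are referenced, not copied, so later changes show through.
defaults["host"] = "example.org"
print(settings["host"])   # example.org
```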



You can also write comprehensions for dictionaries and sets.

>>> m = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

>>> m

{'d': 4, 'a': 1, 'b': 2, 'c': 3}

>>> {v: k for k, v in m.items()}

{1: 'a', 2: 'b', 3: 'c', 4: 'd'}

StartsWith and EndsWith methods for String Processing:

Startswith, endswith. All things have a start and an end. Often we need to test the starts and ends of strings. We use the startswith and endswith methods.

phrase = "cat, dog and bird"

# See if the phrase starts with these strings.
if phrase.startswith("cat"):
    print(True)

if phrase.startswith("cat, dog"):
    print(True)

# It does not start with this string.
if not phrase.startswith("elephant"):
    print(False)
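The endswith method works the same way from the other end of the string, and both methods also accept a tuple of candidates. A small sketch with made-up file names:

```python
filenames = ["prices.csv", "notes.txt", "trades.CSV", "report.xlsx"]

# endswith accepts a tuple of suffixes; lower() makes the check case-insensitive.
data_files = [name for name in filenames
              if name.lower().endswith((".csv", ".xlsx"))]

print(data_files)  # ['prices.csv', 'trades.CSV', 'report.xlsx']
```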



Map and IMap as inbuilt functions for iteration:

In Python 3, map returns a lazy iterator, which saves a lot of memory because results are produced one at a time. In Python 2, map builds the whole list up front, so for lazy behaviour you can use the ‘itertools’ module, where the function is named imap (from itertools import imap).

>>> m = lambda x: x*x
>>> print m
<function <lambda> at 0x7f61acf9a9b0>
>>> print m(3)
9

# A lambda is just a small anonymous function that returns the value of its
# expression, so it can be passed to other functions such as map.

>>> my_sequence = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
>>> print map(m,my_sequence)
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]

# So the square is applied to each element without writing any loop or if.
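To see the lazy behaviour for yourself, here is a short sketch that assumes Python 3 (under Python 2 you would get the same effect from itertools.imap):

```python
def square(x):
    return x * x


lazy = map(square, range(1, 6))   # nothing has been computed yet

first = next(lazy)    # computes only the first square
rest = list(lazy)     # computes the remaining ones

print(first)          # 1
print(rest)           # [4, 9, 16, 25]
print(list(lazy))     # [] because the iterator is now exhausted
```

Because the iterator can be consumed only once, wrap it in list() if you need to go over the results more than once.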

For more on map,reduce and filter you can fetch following jupyter notebook from my Github:


How to learn Fast!!

Learning is divine, but the process of reaching that divine feeling can be stressful. Most of the time we do things at work or in life to make life more interesting and rich (in terms of knowledge and security), yet we keep hearing that learning is DAMN! difficult and that it takes too much time to learn any new skill. That is the biggest lie running around us, and it really makes us slow and less progressive in life.

All we really need is to take care of the few hurdles that come our way and learn a few tactics, or build a unique attitude towards those tactics, so that we can overcome such hurdles and be good at anything in just 20 hours.

Learning new skills improves your life:

As regular people we have stuff to do in day-to-day life to earn some bucks and pay rent. Even if you are a student, you have various tasks like playing, conversations over coffee, movies, bunking classes etc. So whether your life is professional or not, you have no time at all! But one thing we all know is that learning a new skill can improve our life, and not just our own life but the lives of the people around us!

So the first thing is to believe in this idea: ‘Learning a new skill improves life.’ It will fuel your feelings and motivate you at each hurdle you meet while learning the skill you are about to pick up.

Set your time for each day or just take 4-5 days off from everything: as I said, it takes just 20 hours to learn anything or to be good at anything, so you have to manage those 20 hours. If you are going to give 20 hours to each project or skill you want to learn, you have to set the time aside for sure.

It will work out something like this:

1 hour each day:

20 days to complete 20 hours of learning.

2 hours each day:

10 days to learn anything or to be good at anything.

4 hours each day and BOOM! –> Just ‘Five days’ to learn anything!

One thing you have to understand carefully: if you are giving 4 hours of your day to a specific task, you have to give yourself a challenge, or a set of challenges, to complete in that time. Don’t make the challenges too hard or too soft. Aim for just the right amount, and that right amount depends entirely on your capacity for memorizing, reading and doing practical work related to the skill. The split between research and practice varies with the skill. If you want to learn swimming, you will spend about 0.5 hour reading and the other 3.5 in the swimming pool. If you are learning programming, it is good to give 1 hour to reading about the basics and spend the rest solving a challenge. Let me make one thing clear: solving a programming challenge does not mean looking on Google for a solution. 😀

If you are going to learn about machine learning or modeling a system, you should split your time about 50-50 between the two.

By following the above approach you don’t have to wait forever to learn or do anything new.

Perfection is the enemy of Good:

There is another piece of research which says you have to spend 10,000 hours if you really want to learn anything, and that is also true in another sense. That research is based on the experience of people who are GODs in their fields.

You don’t have to be a GOD, or perfect, to learn a new skill and enjoy the returns that come from it. For example, if you want to learn how to play football you just have to read some rules, get a football, find a ground or park nearby and kick the ball around. Maybe after 7-8 days you will be able to find some friends or a team to join you. That part is easy, but it really will take 10,000 hours if you want to compete against Ronaldo or Messi.

And once you get good enough, you enjoy doing it, and that leads to more practice and perfection of the skill.

Make your decision and set target performance level:

You have to decide first what you actually want to do with the skill, or what you actually want to learn. There will be many things you want to do in your life, but you really have to write them down in some manner. The other important thing is setting a target performance level: how much you want to gain from the specific skill. If you want to learn programming, you have to tell yourself whether you are learning because you want to build your business website, because you want to get a job at a company, or because you want to get a job at Google.

It is always great to dream BIG, but having small stepping stones matters. If you want a job at Google as a programmer, it is good to work on a personal project first and then move to some professional paid work; after that you will be able to guide yourself on where to go next.

In other words, once you reach baseline proficiency in something it sucks less, and that baseline level gives you the inspiration to learn more about the skill.

Deconstruction of skill:

Most of the time a skill is really a bundle of smaller sub-skills. When you care about learning a new skill, a quick study can tell you how many sub-skills it contains. Some are easy to pick up and some are really difficult to understand, so this deconstruction helps you see which easy sub-skills to learn first and which of them you actually need or want to learn.

For example, if you want to learn about algorithmic trading you don’t need to begin with marketing strategies, macroeconomics, the policies and factors that affect Wall Street, the internal structure of Wall Street, quantitative analytics, or the various machine learning techniques used for research while constructing an algorithm or strategy. To start, you just have to learn the strategies that most traders currently use and that give good returns. There are no more than 10 or so of them, so at first you need to know those strategies and know under which conditions each one should be applied to the stock market, so that you will be able to get better results from your trading.

In this process you will find that the same 2-3 sub-skills repeat over and over again, which helps you learn and do things much faster and saves you a lot of time and energy, which is most important.

Research VS Practice:

When we have to learn any skill, we procrastinate! At some level procrastination is a really good thing, because at the back of your brain you unconsciously process and think about the skill. Too much of it will kill your focus, so the right amount of procrastination is great for you. When we learn anything new we read, research and discuss. But limiting ourselves to research mode will kill our productivity as well.

The human brain loves to do research, but we need to switch constantly between reading/researching and doing.

For example, if you are really going to learn programming, you can either read 5-6 books first or start right away by trying to write a function that fetches your friends’ birthday information from Facebook and lets you know if anyone’s birthday falls in the present month.

Practice makes you perfect, but how to practice?:

There are various things you have to know about practice. Whatever new thing you are learning, do it just before you go to sleep, and try it out again first thing in the morning. Studies show that during sleep your brain turns your small practice into good neuron structures for passing messages, and those structures make your mind stronger and faster at grasping new skill sets.

The above method works for both cognitive and motor skills.

Removing the barriers: (a general approach)

Sometimes those barriers are just environmental distractions. Make a list of the distractions that really come into your world when you try to learn a new skill, and turn them off: your phone or chat, sounds coming from outside, your TV and, last but not least, TURN OFF YOUR INTERNET.

If you want to learn how to play the harmonica, just put it somewhere in front of you! This is behavioural psychology: it makes sure that, rather than getting distracted by some other shiny object, you see the thing you want to learn first.

It is something like keeping the things you want to learn on your computer desktop. That does not mean your desktop should be overfilled with things, because that also kills your productivity and your system’s speed.

Studies have also observed that listening to vocal music while doing or reading something, or even while programming, affects your ability to be productive, but listening to non-vocal music or some jazz not only helps your productivity but also improves your state of mind.

Commit to practice for at-least 20 hours!

For more information please refer following video:

Power of brain relaxation

This is a kind of funny thing that is happening to me these days. I am trying to be as relaxed as possible most of the time, and it is increasing my productivity. The people around me seem much cooler, calm, happy, positive, smiling and funny.

relaxation = less stress on mind = sit free, do nothing except thinking 😀


If I am sitting freely most of the time, how can I be more productive? Or am I more productive because my mind loves to have soft reboots after completing small programming tasks? Soft reboots could be anything:

  1. Going wash-room and say hello to stranger
  2. Looking at pretty girls in office 😀
  3. closing your eyes and remember the time when you play with your dog
  4. thinking about your dreams
  5. making possible your dreams by reading great quotes on internet
  6. thinking about journey of life
  7. dreaming to have great soul-mate or talking/chatting to her/him if you already have one 😀
  8. having fun with your colleagues
  9. drinking coffee
  10. Playing volleyball or any game which is available. yeahhhh!!!
  11.  writing silly blog-posts exactly like this one 😀 😀 😀




Understanding Deep learning!

Deep learning is a set of algorithms that attempt to model high-level abstractions in data by using multiple layers of a network. It is called deep learning because more than one hidden layer is used in the whole network. In that sense “deep” is a technical term.


Deep learning is often considered a Brahmastra that can kill every problem in the world, but that is certainly not true. Of course the Brahmastra is an amazing tool, and deep learning is no Brahmastra, but we still care about it. 🙂

A newbie can use the “MNIST data set” to fit and predict. In general the goal of deep learning is to take low-level input and generate higher-level abstractions through the composition of layers. But before doing that we need to understand the various parts of a deep learning algorithm.

Input Layer:

This is also called the visible layer. It contains an input node for each of the entries in our feature vector.
For example, in the MNIST dataset each image is 28 x 28 pixels. If we use the raw pixel intensities for the images, our feature vector would be of length 28 x 28 = 784, thus there would be 784 nodes in the input layer.
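The flattening step can be sketched in plain Python. The image below is a made-up blank picture with one bright pixel, not real MNIST data, and scaling intensities to the 0-1 range is a common convention assumed here:

```python
ROWS, COLS = 28, 28

# A hypothetical 28 x 28 grayscale image: all black except one white pixel.
image = [[0] * COLS for _ in range(ROWS)]
image[14][14] = 255

# Flatten row by row into one feature vector, scaling intensities to 0..1.
feature_vector = [pixel / 255.0 for row in image for pixel in row]

print(len(feature_vector))        # 784, one input node per pixel
print(feature_vector.index(1.0))  # 406, i.e. row 14 * 28 + column 14
```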

Hidden Layer:

From there, nodes connect to a series of hidden layers. In the simplest terms, each hidden layer is an unsupervised Restricted Boltzmann Machine (RBM), where the output of each RBM in the hidden-layer sequence is used as input to the next. The final hidden layer then connects to an output layer.

Output Layer:

This layer contains the probabilities of each class label. For example, in our MNIST dataset we have 10 possible class labels (one for each of the digits 0-9). The output node that produces the largest probability is chosen as the overall classification.
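Here is a minimal sketch of that last step. It assumes the network produces one raw score per digit and that softmax turns the scores into probabilities, which is a common choice but not something this post specifies; the scores themselves are invented:

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for the digits 0..9 from the final layer.
scores = [0.1, 0.3, 2.5, 0.0, -1.2, 0.7, 0.2, 1.9, -0.4, 0.5]
probs = softmax(scores)

# The class whose output node has the largest probability wins.
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted)  # 2
```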

This is a quite introductory level of information about deep learning and how neural networks work. For the coding part stay tuned, or you can also reach: this link

Natural Language processing

NLP stands for Natural Language Processing, and most of the time it is understood as teaching a computer to understand, speak, read and write words. We all speak, read and write, and the computer does the same by means of machine learning algorithms 😉

NLP is a field of computer science and computational linguistics which is driven by advanced machine learning and uses artificial intelligence to make decisions for a particular set of problems.

Many problems have been or are being solved using NLP. Since artificial intelligence is modelled on the way the human neural network understands things, we can also talk here about the test developed by Alan Turing for judging the responses given by a machine. Turing proposed that a human evaluator would judge a conversation between a human and a machine, held in text only, that is, using a keyboard. If the evaluator cannot reliably tell the machine from the human, the machine is said to have passed the test (Turing originally predicted that an average interrogator would have no more than a 70% chance of making the right identification after five minutes of questioning). The test does not check the ability to give correct answers to questions, only how closely answers resemble those a human would give.

Now for example we can take input from machine:

As ____Man___ is related to ___woman____

King is related to _______ ?

the particular answer will be ___queen___ (That is exact machine predicted)

So What’s under the hood? (How machine learns by itself?(we should call it himself or herself :P))

To understand that, we first need to understand how we learn. In our mind all words are represented as an image or set of images in 3D space, and whatever we think or imagine comes in the form of a vector representation. For example, if I think about a king, my mind shows me an image of a king and all the things related to a king, like queen, army or kingdom.

That is basically what a machine does, except that we feed it the data in vector form.
Now meaning(king)-meaning(man)+meaning(woman) = ?


So when we remove the meaning of man from the meaning of king we get something like royal, and a royal woman could be a princess or a queen. But as we know, in the general sense queen is closer to king than army, prince or princess are. So what did the machine actually understand? The machine took vector representations along royal (y-axis) and non-royal (x-axis), and in that graph queen and king are closest, so the machine returned queen.

Vector(King)-Vector(man)+Vector(woman) = Vector(queen)
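The arithmetic above can be tried on toy data. The two-dimensional vectors below are invented for illustration (first number: how royal, second: how female); real word vectors are learned from text and have hundreds of dimensions:

```python
import math

# Hypothetical 2-D word vectors: (royalty, femaleness).
vectors = {
    "king":     (0.9, 0.1),
    "queen":    (0.9, 0.9),
    "man":      (0.1, 0.1),
    "woman":    (0.1, 0.9),
    "prince":   (0.7, 0.1),
    "princess": (0.7, 0.9),
    "army":     (0.4, 0.2),
}


def cosine(a, b):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# Vector(king) - Vector(man) + Vector(woman)
target = tuple(k - m + w for k, m, w in
               zip(vectors["king"], vectors["man"], vectors["woman"]))

# Pick the closest word, leaving out the three words used in the question.
candidates = [word for word in vectors if word not in ("king", "man", "woman")]
answer = max(candidates, key=lambda word: cosine(vectors[word], target))
print(answer)  # queen
```

In this toy space princess comes a close second, which matches the intuition that a royal woman could be either.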

Now we can conclude that a machine can speak, read and write. From an engineering point of view, we can replicate our decision-making knowledge to tell a patient about any particular disease his/her heart has or can get in the near future; such a system can also be used to predict the health of our heart. In 2010, the cost of cardiovascular disease in the U.S. was about $444 billion.

Using NLP we can cut down the cost of diagnosis and of cure/prevention treatment, which could be considered a strong step towards using AI in the medical field. For fun, one can also test out this project.
To understand the real meaning of NLP and AI combined in an easy way, one can ask it various questions. For example, I asked it:


Me: Do you know about Programming languages
and reply was; Yes, I know about Python!



Python and my fav parts

In this post one can get a head start on some basic but slightly advanced corners of Python programming, which can lead to a beautiful code base and, with experience, the title of Python Ninja.

1. List comprehensions
2. Generator Expressions
3. Generator function

List comprehensions: basically a compact way to iterate over a list and build a new one.
Earlier approach:

p = [1,2,3,4,5]
In [4]: for i in range(len(p)):
   ...:     p[i] = p[i] + 10
In [5]: p
Out[5]: [11, 12, 13, 14, 15]

Using list comprehensions we can write much better code:

p = [1,2,3,4,5]
In [7]: p = [i+10 for i in p] # This is a list comprehension
In [8]: p
Out[8]: [11, 12, 13, 14, 15]

Generator Expressions:

One potential downside of using a list comprehension is that it might
produce a large result if the original input is large. If this is a concern,
you can use generator expressions to produce the filtered values iteratively.

For example, if the input list is quite big, a generator expression is amazing:

In [37]: p = (i for i in xrange(10000))

In [38]: p

Out[38]: <generator object <genexpr> at 0x7ff16e47b0a0>
Here we got the generator object, now we can use it as we want:

We can also use a generator expression as an argument:

In [49]:  p = (i for i in xrange(10000)) # remember we used xrange()
In [49]: p  = list(p)
In [50]: s = sum(i*i for i in p)
In [51]: s
Out[51]: 333283335000
In [52]: s = sum(i+10 for i in p)
In [53]: s
Out[53]: 50095000

This stuff is super fast.
Now, using a generator expression, convert a tuple into a CSV line:

In [54]: s = ('ACME', 50, 123.45)
In [55]: print(','.join(str(x) for x in s))
ACME,50,123.45

So from all the above discussion one thing is clear: we should
use generator expressions as much as we can.
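The memory difference is easy to see with sys.getsizeof. This is a sketch for Python 3; the exact byte counts vary between Python versions, so none are shown:

```python
import sys

# The list comprehension builds all 100,000 results up front.
squares_list = [i * i for i in range(100000)]

# The generator expression only remembers how to produce the next value.
squares_gen = (i * i for i in range(100000))

list_size = sys.getsizeof(squares_list)  # grows with the number of elements
gen_size = sys.getsizeof(squares_gen)    # small and independent of the range
print(list_size > gen_size)

# Both give the same total; the generator can be consumed only once.
print(sum(squares_list) == sum(squares_gen))
```

Note that getsizeof reports only the list object itself, not the integers it points to, so the real gap is even larger.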

Generator Functions:

Any function which contains a “yield” statement is a generator function.
There are a few examples I would like to share here:

In [21]: def hello(item):
   ....:     h = item*3
   ....:     yield h

In [22]: for i in hello(item='iamamazing'):
   ....:     print i
iamamazingiamamazingiamamazing

So the above is a generator function, which is what we are learning about.

Then what is the practical use of this thing? Generator functions are mostly fast when it comes to text processing. Before going further I need to look at how generators help in
data processing, data flow, big data or any other kind of stuff.
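One way generators help with text processing is by chaining small generator functions into a pipeline, so that each line flows through every stage before the next line is read. This is a sketch for Python 3; the stage names and sample lines are made up:

```python
def read_lines(lines):
    """Stage 1: strip the trailing newline from each raw line."""
    for line in lines:
        yield line.rstrip("\n")


def non_empty(lines):
    """Stage 2: drop blank lines."""
    for line in lines:
        if line.strip():
            yield line


def to_fields(lines, sep=","):
    """Stage 3: split each line into cleaned-up fields."""
    for line in lines:
        yield [field.strip() for field in line.split(sep)]


# Hypothetical input; an open file object would work the same way.
raw = ["ACME, 50, 123.45\n", "\n", "INIT, 10, 9.99\n"]

# Nothing runs until list() starts pulling records through the pipeline.
records = list(to_fields(non_empty(read_lines(raw))))
print(records)  # [['ACME', '50', '123.45'], ['INIT', '10', '9.99']]
```

Because only one line is in flight at a time, the same pipeline can walk a file far larger than memory.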

Cython in Easy Way, Be cool

OK, be cool. There is a lot of buzzing going on in my mind, because there has to be.

Why am I going to learn about Cython?

1. Sometimes people say my favourite language is Damn! Slow! (which is not true if you know about generators, futures and the multiprocessing module; let’s save that for some other day)

2. I may get to use it at a job some day in the field of data analysis or number-crunching computations.

3. Cython is sexy, I love sexy things, 🙂 😉

4. Portability is good for you if you are a server administrator by day and an Android developer by night, or even a web developer on weekends. [So Cython can help you everywhere]

5. You have lots of C++/C code but you can read it


Slide1> C+Python = Cython

You will not see the need for it at first unless you are from NASA 😉


Slide2> Why we should/must love Python

{import antigravity}

Slide3> C+P….. = Cython

Do I need to learn C if I have to go with Cython?


Slide4> Cython is made for Python Programmers to achieve the Speed of C and Productivity of Python,

Aha! I feel Like I can create Whole universe now(If you understood above statement)


Slide5> Where to start if you want to write programs in Cython?

Remember (C+Python = Cython) or (Python+C = Cython) 😉

So let’s Start With Python…! yay!!


A very Simple Python Function(Don’t try at home 😉 )

def sum_text(number_range):
    "This function explains a lot about my laziness"
    return sum([i*i for i in xrange(number_range)])

When you run it in IPython using %timeit sum_text(100000000), your system can crash/hang if you don’t have plenty of RAM, like 10 GB.

It will take around 15.6 seconds.
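Outside IPython, the standard timeit module gives the same kind of measurement. This sketch assumes Python 3 and uses a much smaller range plus a generator expression so that it runs safely on any machine; the number it prints will differ from the 15.6 seconds quoted above:

```python
import timeit


def sum_text(number_range):
    """Pure-Python baseline; the generator avoids building a huge list."""
    return sum(i * i for i in range(number_range))


# Time 5 calls on a small input; scale up gradually on your own machine.
elapsed = timeit.timeit(lambda: sum_text(100000), number=5)
print("5 runs took %.4f seconds" % elapsed)
```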

So now we need Cython to optimise it.

Again Using Ipython

cpdef double sum_text_cython(int number_range):
    "This is cython(!100% but…), It works 😀 "
    cdef int i
    return sum([i*i for i in xrange(number_range)])



%timeit sum_text_cython(100000000)

1 loops, best of 3: 3.43 s per loop

So it is about "4.5 times" faster (15.6 s down to 3.43 s), Yeah!!! (That's what I mean!)

For your information, this example is just for illustration; it can produce lots of other errors.

Continued.... (Going for dinner) Come with monte-CarloSimulation
Now really going for dinner...

OK, now the real thing: we need to take this into real life, like how to use Cython when you want to write real code in Python as well.

So write a .pyx file with simple Python code.

def sum_text(number_range):
    p = [i*i for i in xrange(number_range)]
    return sum(p)

So this is simple Python; save it as hellocython.pyx

NEVER SAVE IT AS “cython.pyx”, because that name would shadow the cython module itself.

$ easycython hellocython.pyx

It will generate a hellocython.c file and then compile that into a .so file.

Nextslide> So we only have Python code but Cython does understand it well.

nxtSlide> Let’s make the code faster:

def sum_text(int number_range):
    cdef int i
    p = [i*i for i in xrange(number_range)]
    return sum(p)

The above is not even a 100% pure Cython function, but with the .so file it is 2.5 times faster. 😀

nxtslide> Now we can easily import this into our regular Python code.

Create a normal Python file:

import hellocython

print hellocython.sum_text(100000)

Now our result comes from both worlds: Cython = C+Python

[Next is Cython using Multi-threaded] AND [Convert it into presentation] Add more Cython examples and make GitHub repository…
