my net house


Category Archives: Python

TPOT Python Example to Build Pipeline for AAPL

This is  just first Quick and Fast Post.

TPOT Research  Paper:

import datetime
import numpy as np
import pandas as pd
import sklearn
from pandas_datareader import data as read_data
from tpot import TPOTClassifier
from sklearn.model_selection import train_test_split

apple_data = read_data.get_data_yahoo("AAPL")
df = pd.DataFrame(index=apple_data.index)
df['multiple_day_returns'] =  df['price'].pct_change(3)
df['rolling_mean'] = df['daily_returns'].rolling(window = 4,center=False).mean()

df['time_lagged'] = df['price']-df['price'].shift(-2)

df['direction'] = np.sign(df['daily_returns'])
Y = df['direction']

X_train, X_test, y_train, y_test = train_test_split(X,Y,train_size=0.75, test_size=0.25)

tpot = TPOTClassifier(generations=50, population_size=50, verbosity=2), y_train)
print(tpot.score(X_test, y_test))

The Python file It returned: Which is real Code one can use to Create Trading Strategy. TPOT helped to Selected Algorithms and Value of It’s features. right now we have only provided ‘price’,’daily_returns’,’multiple_day_returns’,’rolling_mean’ to predict Target. One can use multiple features and implement as per the requirement.

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# NOTE: Make sure that the class is labeled 'target' in the data file
tpot_data = pd.read_csv('PATH/TO/DATA/FILE', sep='COLUMN_SEPARATOR', dtype=np.float64)
features = tpot_data.drop('target', axis=1).values
training_features, testing_features, training_target, testing_target = \
            train_test_split(features, tpot_data['target'].values, random_state=42)

# Score on the training set was:1.0
exported_pipeline = GradientBoostingClassifier(learning_rate=0.5, max_depth=7, max_features=0.7500000000000001, min_samples_leaf=11, min_samples_split=12, n_estimators=100, subsample=0.7500000000000001), training_target)
results = exported_pipeline.predict(testing_features)


Socket Programming and have fun with Python

Client Socket and Server Socket:

Client Computer like your browser or any piece of code you want to talk to your server uses client socket and Server uses both client and server socket.

Sockets are great for Cross-Platform communication.

Following is minimal Example of Socket and stuff:

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("", 80))

What is INET?What is Sock_Stream?

Almost that is all happened on client side, When connect is completed socket that is ‘s’ we just created can be used to send and request the specific text page requested. This socket will be read and reply, after that it will be destroyed. Client sockets are normally only used for one exchange (or a small set of sequential exchanges).

Now let’s look what is happening at server side:

# create an INET, STREAMing socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# bind the socket to a public host, and a well-known port
serversocket.bind((socket.gethostname(), 80))
# become a server socket


Generators, Context-Managers and Coroutines:(Completing course)

Generator Pipelines:

  1. Several Pipelines can be linked together.
  2. Items flows one by one through the entire Pipeline.
  3. Pipeline Functionality can be packaged into callable functions.

One Example of Generator pipeline:

def seprate_names():

for full_name in names:

for name in full_name.split(‘ ‘):

yield name

full_names = name.strip() for name in open(‘names.txt’)

names = seprate_names(full_names)

lengths = ((name,len(name))for name in names)

longest  = max(lengths,key=lambda x:x[1])

Another approach is as follows is one wants to use using Function name: anotherapproac



Why we need context-Manager?

‘with’ statement in Python that we use to do file operations is Context-manager. It is something like to “Have the state and open that state and with-in that state to do things”. using with in Python we open file and till the file is open we do some good things and after doing all good things we close the file. So ‘with’ is a context Manager using that we make the state of file open and after that we do all the things we need to do with file.

Other useful cases of Context-manager:

Close/Open File/Socket(Even it crashes)

Commit/Fetch (Even if crashes)

release the lock (Even it crashes)

When you really need Context-Managers?

At last fun not least, If I will be creating a Chat-BOT in Python then I would be able to use Context-Manager in Python so I would be doing some stuff and after completing that stuff I can go out and have fun.


So we use @staticmethod in Python class,  that means no matter what happens we will be able to run this method at any cost.

That is just using decorator. Now if we want to create a context manager using decorator?

So what is Context-manager?

  1. I have to go to a particular directory, list all files with .txt extension, then come-back to current location (Simple use case)


  1. I have to load specific ML model, I have to predict against several parameters and get results, Return at specific state(un-load the model)
  2. I have to open Socket connection, Read various kind of data, close Socket connection.

On the other way we can also write it like:




Context-managers are powerful tool to make code more modular and Succinct.

Now What if we have to use Context-manager as Yielded value?


May be little-bit  more about COntext-managers-

What are Coroutines and how we have to handle those?

  1. Receive Values
  2. May not return anything
  3. Not for iteration

What is the design of Co-routine:

  1. Receive Input
  2. Process that Input
  3. Stop at yield statement

send() method is used to send value to coroutines.


More uses of Co-routines:

Coroutines are really powerful for Data-Processing Operations. and Multi-Processing)

One of most important course about Co-Routines/Concurrency and really interesting way to handle multiprocessing: —think win win—-


RUN parallel commands on your linux system Using Python

This is really simple Python script that would run on your system if You have to run parallel commands, All you just has to do is Open multiple tabs and run commandson each ab as per your requirement.

#!/usr/bin/env python
import subprocess

command = 'ping .'
terminal = ['gnome-terminal']

for host in ('server1', 'server2'):
    terminal.extend(['--tab', '-e', '''
        bash -c '
            echo "%(host)s$ %(command)s"
            ssh -t %(host)s "%(command)s"
    ''' % locals()])

Asynchronous recipes in Python

“Concurrency” is not “Parallelism” May be it’s better. If you will not work with DataScience, DataProcessing, Machine-Learning and other operations which are CPU-Intensive you probably will found that you don’t need parallelism but you need concurrency more!

  1. A Simple Example is Training a machine learning model is CPU intensive or You can use GPU.
  2. To Make various Predictions from one model based on many Different Input-Parameters to find out best result You need Concurrency!

There are so Many ways one can hack into Python stuff and do cool Stuff either it is CPU intensive or just a task to do stuff that is good/bad/Better/Best for one user to communicate. One thing you have to believe that Python Does support Multiprocessing as well as Multi-threading

but for various reasons when you are doing CPU intensive Tasks you have to Stay away from using Threading operations in Python. Use Numpy, Cython,Jython or anything you feel, Write C++ code and glue it with Python

The number of threads will usually be equivalent to the number of cores you have. If you have hyperthreading on your processor, than you will be able to double the number of threads used.

Above image is just one Example to understand what actually we are doing. We are processing Chunks and Chuncks of Data. Now the real common scenario is If you are using I/O bound tasks use Threads in Python if you are using CPU bound tasks use Processes in Python.  I have worked with various Python Projects where Performance was issue at some level so at that time I always went to other things like Numpy, Pandas, Cyhton or numba but not Plain-Python.

Let’s come to the point and Point is What are those Recipes I can use:

Using concurrent.futures(futures module is also back-ported into Python2.x):

Suppose you have to call multiple URLs at same time using same Method. That is what actually Concurrency is, Apply same method different operations, We can do it either using ThreadPool or ProcessPool.

# Using Process Pool
from concurrent.futures import ProcessPoolExecutor,as_completed
def health_check1(urls_list):
pool = ProcessPoolExecutor(len(urls_list))
futures = [pool.submit(requests.get,url,verify=False) for url in final_url]
results = [r.result() for r in as_completed(futures)] # when all operations done
return results # a Python list of all results, Here you can also use Numpy as well

Using ThreadPool it is also not different:

# Using Thread Pool
from concurrent.futures import ThreadPoolExecutor,as_completed</code>

def just_func(urls_list):
pool = ThreadPoolExecutor(len(urls_list))
futures = [pool.submit(requests.get,url,verify=False) for url in urls_list]
results = [r.result() for r in as_completed(futures)] # when all operations done
return results # a Python list of all results, Here you can also use Numpy as well

In the above code ‘url_list’ is just list of tasks which are similar and can be processed using same kind of functions.

On the other-side using it with with as context manager is also not different. In this Example I will Use ProcessPoolexecutor’s inbuilt map function.

def just_func(url_list):
   with concurrent.futures.ProcessPoolExecutor(max_workers=len(final_url)) as executor:
        result =,final_url)
    return [i for i in result]

Using multiprocessing: (Multiprocessing is also Python-library that can be used for Asynchronous behavior of your code.)

*in Multiprocessing the difference between map and apply_async is only that Map returns results as task list is passed to it on the other-hand apply_async returns results based on results those returned by function.

# Function that run multiple tasks
def get_response(url):
“””returns response for URL ”””
    response = requests.get((url),verify=False)
   return response.text

Now above function is simple enough that is getting one URL and returning response but if have to pass multiple URLs but I want that get request to each URL should be fired at same time then That would be Asynchronous process not multiprocessing because in Multiprocessing Threads/Processes needs to communicate with each other but on the other hand in case of Asynchrounous threads don’t communicate(in Python because Python uses Process based multiprocessing not Thread Based although you can do thread-based multiprocessing in Python but then you are on your OWN 😀 😛 Hail GIL (Mogambo/Hitler)).

So above function will be like this as usual:

from multiprocessing import Pool
pool = Pool(processes=20)
resp_pool =,tasks)
URL_list = []
resp_pool =,tasks)

Although This is an interesting link one can watch while going into Multiprocessing in Python using Multiprocessing: It is Process-Bases Parallelism.

Using Gevent: Gevent is a concurrency library based around libev. It provides a clean API for a variety of concurrency and network related tasks.

import gevent
import random

def task(pid):
Some non-deterministic task
    print('Task %s done' % pid)

def asynchronous():
    threads = [gevent.spawn(task, i) for i in xrange(10)]


If you have to Call Asynchronously but want to return results in Synchronous Fashion:

import gevent.monkey

import gevent
import urllib2
import simplejson as json

def fetch(pid):
    response = urllib2.urlopen('')
    result =
    json_result = json.loads(result)
    datetime = json_result['datetime']

    print('Process %s: %s' % (pid, datetime))
    return json_result['datetime']

def asynchronous():
    threads = []
    for i in range(1,10):
        threads.append(gevent.spawn(fetch, i))


Assigning Jobs in Queue:

import gevent
from gevent.queue import Queue</code>

tasks = Queue()

def worker(n):
    while not tasks.empty():
        task = tasks.get()
        print('Worker %s got task %s' % (n, task))

print('Quitting time!')

def boss():
   for i in xrange(1,25):


gevent.spawn(worker, 'steve'),
gevent.spawn(worker, 'john'),
gevent.spawn(worker, 'nancy'),

When you have to manage Different Groups of Asynchronous Tasks:

import gevent
from gevent.pool import Group</code>

def talk(msg):
    for i in xrange(3):

g1 = gevent.spawn(talk, 'bar')
g2 = gevent.spawn(talk, 'foo')
g3 = gevent.spawn(talk, 'fizz')

group = Group()


Same As multiprocessing Library you can also use Pool to map various operations:

import gevent
from gevent.pool import Pool</code>

pool = Pool(2)

def hello_from(n):
    print('Size of pool %s' % len(pool)), xrange(3))

Using Asyncio:

Now let’s talk about concurrency Again! There is already lot of automation is going inside asyncio or Gevent but as programmer we have to understand how we need to break a “One large task into small chuncks of Subtasks so when we will write code we will be able to understand which tasks can work independently.

import time
import asyncio

start = time.time()

def tic():
    return 'at %1.1f seconds' % (time.time() - start)

async def gr1():
# Busy waits for a second, but we don't want to stick around...
    print('gr1 started work: {}'.format(tic()))
    await asyncio.sleep(2)
    print('gr1 ended work: {}'.format(tic()))

async def gr2():
# Busy waits for a second, but we don't want to stick around...
    print('gr2 started work: {}'.format(tic()))
    await asyncio.sleep(2)
    print('gr2 Ended work: {}'.format(tic()))

async def gr3():
    print("Let's do some stuff while the coroutines are blocked, {}".format(tic()))
    await asyncio.sleep(1)

ioloop = asyncio.get_event_loop()
tasks = [

Now in the above code gr1 and gr2 are somehow taking some time to return anything it could any kind of i/o operation so what we can do here is go to the gr3 in using the event_loop and event_loop will run until all three tasks are not completed.

Please have a closer look at await keyword in the above code. It is one of the most important step where you can assume interpreter is shifting from one task to another or you can call it pause for function. If you have worked with yield or yield from in Python2 and Python3 you would be able to understand that this is stateless step for the code.

There is on more library which is aiohttp that is being used to handle blocking Http requests with asyncio.

import time
import asyncio
import aiohttp

URL = ''

async def fetch_async(pid):
    print('Fetch async process {} started'.format(pid))
    start = time.time()
    response = await aiohttp.request('GET', URL)
    return response

async def asynchronous():
    start = time.time()
   tasks = [asyncio.ensure_future(fetch_async(i)) for i in range(1, MAX_CLIENTS +1)]
   await asyncio.wait(tasks)
    print("Process took: {:.2f} seconds".format(time.time() - start))

ioloop = asyncio.get_event_loop()

In all the above Examples we have just Scratched the world of concurrency but in real there would be much more to look into because real world problems are more complex and intensive. There are various other options in asyncio like handling exceptions with-in futures, creating future wrappers for normal tasks,Applying timeouts if task is taking more than required time and doing something else instead.

There is lot of inspiration I got while learning about concurrent programming in Python from the following Sources:

Run Flask in Parallel using ThreadPoolExecutor

from flask import Flask
from time import sleep
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(2)

app = Flask(__name__)

def run_jobs():
executor.submit(some_long_task2, 'hello', 123)
return 'Two jobs was launched in background!'

def some_long_task1():
print("Task #1 started!")
print("Task #1 is done!")

def some_long_task2(arg1, arg2):
print("Task #2 started with args: %s %s!" % (arg1, arg2))
print("Task #2 is done!")

if __name__ == '__main__':

Running Multiprocessing in Flask App(Let’s Spawn) Hell Yeah

Ok It was going to be long time but Finally yeah Finally Able to do Process based multiprocessing in Python and even on Flask. 🙂 oh yeah! There are various recipes for Multiprocessing in this python but here you can only Enjoy with Flask.


from multiprocessing import Pool
from flask import Flask
from flask import jsonify
import ast
import pandas as pd
import requests

app = Flask(__name__)
_pool = None

# Function that run multiple tasks
def get_response(x):
"""returns response for URL list"""
m = requests.get((x),verify=False)
return m.text

def health_check():
"""returns pandas dataframe into HTML for health-check Services"""
resp_pool =,tasks)
table_frame= pd.DataFrame([ast.literal_eval(resp) for resp in resp_pool])
return table_frame.to_html()

if __name__=='__main__':
_pool = Pool(processes=12) # this is important part- We
# insert production server deployment code
except KeyboardInterrupt:


Python For Yankees

Before going further reading about Class Development I would like you to tell
one important thing:

Most of the uses of inheritance can be simplifi ed or replaced with composition, and multiple
inheritance should be avoided at all costs.

I am sure after this you will be able to read lots and lots of code written in
python. 🙂 If you want to do it fast, Do it well. 🙂

Python Class Development Toolkit.

def __init__():

#Don't put anything in instance that you don't need instance for.

def i_am_class_method(self):
    return 'takes self as argument'

#If you need to shared data for whole class then put it on class level,
#not on instance level.

class IAm(object):

    def __init__(self):
        print shared_data

    def fun_2(self):
        print shared_data

Iron_clad_rule:- in java or C++ is–>do not expose your attributes

Subclassing is just like inhertance. Data+methods:

Class NewOne(IAm):

    def fun3(self):
       return IAm.fun_2(self)

#Multiple/Alternative-Constructors:(When you need to change the behavior
#of class's data by just calling an intensive function):

class Circle(object):

    def init_(self,radius):

    def fun_2(self):
        print self.radius

   @classmethod   # Alternative constructive
   def from_bbd(cls,new_radius):
        """construct a circle from bounding box diagnol"""

       raduis = new_radius/2.0/math.sqrt(2.0)
       return Circle(radius)

Make Alternate constructor to work for Subclass as well:

class Circle(object):

    def __init__(self,radius):

    def fun_2(self):
        print self.radius

   @classmethod   # Alternative constructive
    def from_bbd(cls,new_radius):
       """construct a circle from bounding box diagnol"""
        raduis = new_radius/2.0/math.sqrt(2.0)
        return cls(radius)

Independent methods inside class: – Why we ever need those?
Think of situation where everything in your code breaks but you just need to tell
user something and something and all you need is 🙂 Static-Method.!!

class Circle(object):

    def __init__(self,radius):

    def fun_2(self):
        print self.radius

    def just_method(just_parameter):
        return 'I am independent'

Getters and Setters in Python:(Access Data,change data on the fly)
As in the other languages there are inbuilt methods to access data from
class as well as change values of method attributes in the class.

Python uses the property decorator to perform such operations:

class Person(object):
    def __init__(self, first_name, last_name):
       self.first_name = first_name
       self.last_name = last_name</code>

    def full_name(self):
        return self.first_name + ' ' + self.last_name

    def full_name(self, value):
        first_name, last_name = value.split(' ')
        self.first_name = first_name
        self.last_name = last_name

    def full_name(self):
       del self.first_name
       del self.last_name

Slots in Pyhton:
When you have lots and lots of things to perform or do in the way that
your class instance is consuming HUGE memory then you must use SLOTS.

Earlier approach:

class MyClass(object):

def __init__(self, name, identifier): = name
    self.identifier = identifier

#Approach after SLOTS:

class MyClass(object):
    __slots__ = ['name', 'identifier']
    def __init__(self, name, identifier): = name
       self.identifier = identifier

Alpha generation Quantitative

The average quantitative strategy may take from 10 weeks to seven months to develop, code, test and launch.[6] It is important to note that alpha generation platforms differ from low latency algorithmic trading systems. Alpha generation platforms focus solely on quantitative investment research rather than the rapid trading of investments. While some of these platforms do allow analysts to take their strategies to market, others focus solely on the research and development of these highly complex mathematical and statistical models.

After building Models(Paint those!)

Cross Validation: Each Sample is separated into random equal sized sub-samples, Helps to improve model performance.

Different Forms of cross Validation:

  1. Train-Test Split – low variance but more bias
  2. LOOCV(Leave one out Cross validation) – Leave one data point out and apply model on rest of the data. -low bias but high variance,

Now in the above two methods we have limitations related to Bias and variance, So what to do? Let’s fire-up ‘Cross-Validation’!

There are various other important Cross Validation Examples/Methods those are interesting like Time-series_Split, Leave_P_out(LPO), Random_permutation_Split(Shuffle and split), StarifiedKfold,:

Special Case:

Some classification problems can exhibit a large imbalance in the distribution of the target classes: for instance there could be several times more negative samples than positive samples. In such cases it is recommended to use stratified sampling as implemented in StratifiedKFold and StratifiedShuffleSplit to ensure that relative class frequencies is approximately preserved in each train and validation fold.


%d bloggers like this: