
Alpha generation Quantitative

The average quantitative strategy may take from 10 weeks to seven months to develop, code, test and launch.[6] It is important to note that alpha generation platforms differ from low latency algorithmic trading systems. Alpha generation platforms focus solely on quantitative investment research rather than the rapid trading of investments. While some of these platforms do allow analysts to take their strategies to market, others focus solely on the research and development of these highly complex mathematical and statistical models.

https://en.wikipedia.org/wiki/Alpha_generation_platform

After building Models(Paint those!)

Cross-validation: the data set is split into random, equal-sized sub-samples (folds); training and evaluating across those folds helps estimate and improve model performance.

Different forms of cross-validation:

  1. Train-test split – low variance but more bias.
  2. LOOCV (leave-one-out cross-validation) – leave one data point out and fit the model on the rest – low bias but high variance.

Both of the above methods have limitations related to bias and variance, so what to do? Let’s fire up k-fold cross-validation!

There are several other interesting cross-validation methods worth knowing, such as TimeSeriesSplit, LeavePOut (LPO), ShuffleSplit (random permutation), and StratifiedKFold:

Special Case:

Some classification problems can exhibit a large imbalance in the distribution of the target classes: for instance, there could be several times more negative samples than positive samples. In such cases it is recommended to use stratified sampling, as implemented in StratifiedKFold and StratifiedShuffleSplit, to ensure that relative class frequencies are approximately preserved in each train and validation fold.
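Here is a minimal sketch of what that looks like in scikit-learn (assuming scikit-learn 0.18+ is installed; the toy data below is made up just to show the split):

from sklearn.model_selection import StratifiedKFold
import numpy as np

# toy imbalanced data: 8 negative samples, 2 positive samples
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])

skf = StratifiedKFold(n_splits=2)
for train_idx, test_idx in skf.split(X, y):
    # each fold keeps roughly the same 8:2 class ratio
    print(train_idx, test_idx)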

 

Python for text processing

Python is more about ‘programming like a hacker’. While writing your code, keeping things in mind such as reference counting, type checking, data manipulation, using stacks, managing variables, and avoiding unnecessary lists and “for” loops can really warm up your code: it reads better, uses fewer CPU resources, and runs faster.

Slower than C:

Yes, Python is slower than C, but you really need to ask yourself what “fast” means for the task at hand. There are several ways to write Fibonacci in Python. The most popular uses a ‘for loop’, mostly because programmers coming from a C background reach for loops out of habit. Python has for loops as well, but if you can avoid them by using the internal iteration provided by Python’s data structures and by libraries like NumPy for array handling, you end up with a win-win situation most of the time. 🙂

Now let’s go through some Python tricks that are super useful if you spend most of your job manipulating, filtering, extracting, and parsing data.

Python has many built-in text-processing methods:

>>> m = ['i am amazing in all the ways I should have']

>>> m[0]

'i am amazing in all the ways I should have'

>>> m[0].split()

['i', 'am', 'amazing', 'in', 'all', 'the', 'ways', 'I', 'should', 'have']

>>> n = m[0].split()

>>> n[2:]

['amazing', 'in', 'all', 'the', 'ways', 'I', 'should', 'have']

>>> n[0:2]

['i', 'am']

>>> n[-2]

'should'

>>>

>>> n[:-2]

['i', 'am', 'amazing', 'in', 'all', 'the', 'ways', 'I']

>>> n[::-2]

['have', 'I', 'the', 'in', 'am']

Those are all uses of lists and slicing to do string manipulation. Yeah, no for loops.
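A few more built-in string methods cover most day-to-day clean-up, again without writing a single for loop (a small sketch, the text here is made up):

>>> raw = '  Name,  Age , City  '
>>> raw.strip()
'Name,  Age , City'
>>> [field.strip() for field in raw.split(',')]
['Name', 'Age', 'City']
>>> '-'.join(['i', 'am', 'amazing'])
'i-am-amazing'
>>> 'i am amazing'.replace('amazing', 'lazy')
'i am lazy'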

Interesting portions of Collections module:

Now let’s talk about collections.

Counter is just my personal favorite.

When you have to go through a BIG list and see how often each value actually occurs:

from collections import Counter

>>> Counter(xrange(10))

Counter({0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1})

>>> just_list_again = Counter(xrange(10))

>>> just_list_again_is_dict = just_list_again

>>> just_list_again_is_dict[1]

1

>>> just_list_again_is_dict[2]

1

>>> just_list_again_is_dict[3]

1

>>> just_list_again_is_dict['3']

0

Some other methods using counter:

Counter('abraakadabraaaaa')

Counter({'a': 10, 'r': 2, 'b': 2, 'k': 1, 'd': 1})

>>> c1=Counter('abraakadabraaaaa')

>>> c1.most_common(4)

[('a', 10), ('r', 2), ('b', 2), ('k', 1)]

>>> c1['b']

2

>>> c1['b'] # work as dictionary

2

>>> c1['k'] # work as dictionary

1

>>> type(c1)

<class 'collections.Counter'>

>>> c1['b'] = 20

>>> c1.most_common(4)

[('b', 20), ('a', 10), ('r', 2), ('k', 1)]

>>> c1['b'] += 20

>>> c1.most_common(4)

[('b', 40), ('a', 10), ('r', 2), ('k', 1)]


Arithmetic and unary operations:

>>> from collections import Counter

>>> c1=Counter('hello hihi hoo')

>>> +c1

Counter({'h': 4, 'o': 3, ' ': 2, 'i': 2, 'l': 2, 'e': 1})

>>> -c1

Counter()

>>> c1['x']

0

Counter is like a dictionary, but it also keeps the counts of everything you are looking for, so you can easily plot the results on graphs.
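Counters also support arithmetic between each other, which is handy when merging counts from several chunks of text (a quick sketch; the exact ordering inside the repr may vary):

>>> c1 = Counter('abra')
>>> c2 = Counter('cadabra')
>>> c1 + c2            # add counts together
Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
>>> c2 - c1            # subtract, keeping only positive counts
Counter({'a': 1, 'c': 1, 'd': 1})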

OrderedDict:

It remembers the order in which keys were inserted, so you can arrange your chunks of data in a meaningful order.

>>> from collections import OrderedDict
>>> d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
>>> new_d = OrderedDict(sorted(d.items()))
>>> new_d
OrderedDict([('apple', 4), ('banana', 3), ('orange', 2), ('pear', 1)])
>>> for key in new_d:
...     print (key, new_d[key])
... 
apple 4
banana 3
orange 2
pear 1

Namedtuple:

Think of it this way: you need to save each line of your CSV into a list of lines, but you also want to be careful with memory and still be able to access each line like a dictionary. That situation comes up all the time when you fetch lines from Excel or CSV documents in a data-processing environment.

# The primitive approach
lat_lng = (37.78, -122.40)
print 'The latitude is %f' % lat_lng[0]
print 'The longitude is %f' % lat_lng[1]

# The glorious namedtuple
from collections import namedtuple

LatLng = namedtuple('LatLng', ['latitude', 'longitude'])
lat_lng = LatLng(37.78, -122.40)
print 'The latitude is %f' % lat_lng.latitude
print 'The longitude is %f' % lat_lng.longitude
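To tie this back to the CSV use case above, here is a small sketch (the file name and its columns are hypothetical) that wraps each row of a CSV file in a namedtuple, so fields are accessed by name instead of by index:

import csv
from collections import namedtuple

Row = namedtuple('Row', ['name', 'latitude', 'longitude'])

with open('places.csv') as f:      # hypothetical file with exactly these 3 columns
    for fields in csv.reader(f):
        row = Row(*fields)
        print '%s: %s, %s' % (row.name, row.latitude, row.longitude)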

ChainMap:

It is a container of containers: yes, that’s really true. 🙂

You need Python 3.3 or above to try this code.

>>> from collections import ChainMap

>>> a1 = {'m':2,'n':20,'r':490}

>>> a2 = {'m':34,'n':32,'z':90}

>>> chain = ChainMap(a1,a2)

>>> chain

ChainMap({'n': 20, 'm': 2, 'r': 490}, {'n': 32, 'm': 34, 'z': 90})

>>> chain['n']

20

# One thing to be clear about: ChainMap does not combine (merge) the dictionaries, it chains them, so lookups search each map in order and the first hit wins.

>>> new_chain = ChainMap({'a':22,'n':27},chain)

>>> new_chain['a']

22

>>> new_chain['n']

27

Comprehensions:

You can do comprehensions with dictionaries and sets as well.

>>> m = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

>>> m

{'d': 4, 'a': 1, 'b': 2, 'c': 3}

>>> {v: k for k, v in m.items()}

{1: 'a', 2: 'b', 3: 'c', 4: 'd'}
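A set comprehension works the same way, for example to collect the unique lengths of some words (a small sketch, Python 3 repr shown):

>>> words = ['cat', 'dog', 'bird', 'fish']
>>> {len(w) for w in words}
{3, 4}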


StartsWith and EndsWith methods for String Processing:

All things have a start and an end, and we often need to test the start or end of a string. For that we use the startswith and endswith methods.

phrase = "cat, dog and bird"

# See if the phrase starts with these strings.
if phrase.startswith("cat"):
    print(True)

if phrase.startswith("cat, dog"):
    print(True)

# It does not start with this string.
if not phrase.startswith("elephant"):
    print(False)

Output

True
True
False
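endswith works the same way from the other end, and both methods also accept a tuple of candidates (a quick sketch):

phrase = "cat, dog and bird"

if phrase.endswith("bird"):
    print(True)

# a tuple checks several suffixes at once
if phrase.endswith(("bird", "fish")):
    print(True)

Output

True
True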

map and imap as built-in functions for iteration:

In Python 3, map is lazy: it returns an iterator (much like a generator expression) under the hood, which saves a lot of memory on large inputs. In Python 2, map eagerly builds and returns a whole list; for the lazy behaviour there you can use the itertools module, where the lazy version of map is called imap (from itertools import imap).

>>> m = lambda x: x*x
>>> print m
<function <lambda> at 0x7f61acf9a9b0>
>>> print m(3)
9

# As we can see, a lambda gives us a small anonymous function that returns
# the value of an expression; handy wherever a full def would be overkill.

>>> my_sequence = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
>>> print map(m, my_sequence)
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400]

# so the square is applied to each element without writing any explicit loop or if.
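filter and reduce follow the same pattern as map; a quick sketch in the same Python 2 style as above (in Python 3, reduce lives in functools and both map and filter return lazy iterators):

>>> filter(lambda x: x % 2 == 0, my_sequence)   # keep only the even numbers
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
>>> from functools import reduce
>>> reduce(lambda x, y: x + y, my_sequence)     # fold the whole list into one sum
210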

For more on map, reduce, and filter you can fetch the following Jupyter notebook from my GitHub:

http://github.com/arshpreetsingh

 

ORMise your DB operations!

ORM stands for Object Relational Mapping.

Now what is that?

Compared to traditional techniques of exchange between an object-oriented language and a relational database, ORM often reduces the amount of code that needs to be written.

There must be some kind of downside to this approach, right?

Disadvantages of ORM tools generally stem from the high level of abstraction obscuring what is actually happening in the implementation code. Also, heavy reliance on ORM software has been cited as a major factor in producing poorly designed databases.

 

 

That diagram is just pulled from the Internet; it is supposed to show the working structure of SQLAlchemy, which is the main purpose of this post.

Let’s start from the ground up:

  • We have a database. (Of course, what else would we have, mangoes? 😀 )
  • A DBAPI. (Definitely needed, otherwise how else would we make calls to our database for read and write operations?)
  • SQLAlchemy CORE

Before talking about SQLAlchemy CORE we should really talk about what SQLAlchemy people believe about ORMs.

SQLAlchemy CORE

SQLAlchemy’s overall approach to these problems is entirely different from that of most other SQL / ORM tools, rooted in a so-called complementarity-oriented approach; instead of hiding away SQL and object relational details behind a wall of automation, all processes are fully exposed within a series of composable, transparent tools. The library takes on the job of automating redundant tasks while the developer remains in control of how the database is organized and how SQL is constructed.

The main goal of SQLAlchemy is to change the way you think about databases and SQL!

That is true: when you start working with it, you feel like you are controlling your DB with your own crazy logic rather than with raw DB queries. (That amount of freedom could crash my DB if I have no idea what I am doing, but I guess that is the beauty of it. 😉 😀 )

Core contains the methods, integrated with the DBAPI, for creating connections to the DB, handling sessions, creating and deleting tables, rows and columns, and for insertion, execution, selection, and accessing values by ID. It really feels like you are just writing your favourite language while handling DB operations. Moreover, every operation just works on the fly unless you have really messed up your DB: breaking the connection in the middle of a read/write, declaring wrong types, mixing up fields, or dumping data without even parsing and cleaning it a bit. Exception handling really earns its keep when you interact with a DB this way. 🙂 ❤ 🙂 [I just love programming and its nature.]

There are many more things in SQLAlchemy Core we could talk about, but I feel we should stop here, otherwise I might have to shift my career from developer to writer. 😉 😀

Let's taste some code, so this post will really help me in the near future when I work with more complicated DB operations that need a real mental mash-up. 😉


from sqlalchemy import * # don't use * in production

# if you are using MySQL look for the commented line
# engine = create_engine('mysql+pymysql://:@localhost/mdb_final')

engine = create_engine('sqlite:////home/metal-machine/Desktop/sqlalchemy_example.db')

metadata = MetaData(engine)

# creating the table, be careful with data-types 🙂
positions_table = Table('positions', metadata,
    Column('omdb_id', Integer, primary_key=True),
    Column('status', String(200)),
    Column('timestamp', Float(10)),
    Column('symbol', String(200)),
    Column('amount', Float(10)),
    Column('base', Float(10)),
    Column('swap', Float(10)),
    Column('pl', Float(10)),)

positions_table.create()        # creating the table in the DB
mm = positions_table.insert()   # building an INSERT statement

# 'positions' is assumed to be a dict holding one row of values
mm.execute({'status': str(positions['status']), 'timestamp': float(positions['timestamp']),
            'symbol': str(positions['symbol']), 'amount': float(positions['amount']),
            'base': float(positions['base']), 'swap': float(positions['swap']),
            'pl': float(positions['pl'])})
The above can be considered the simplest form of a DB write operation using SQLAlchemy.

Making connection and updating DBs.

# make sure this DB is already created, this time we are only creating connection to read
# or insert data if we need.

bit_fine_data = create_engine('sqlite:////home/metal-machine/Desktop/sqlalchemy_example.db')
order_data_meta=MetaData(bit_fine_data)
# reflecting the tables we need from the DB: we just pass the table name to the
# Table class (with autoload=True) and get one object per table that we can
# select from, insert into and execute against.

positions_table = Table('positions', order_data_meta, autoload=True)
balance_status_table = Table('balance_status', order_data_meta, autoload=True)
account_info_table = Table('account_info', order_data_meta, autoload=True)

# inserting values into the table ('positions' is again a dict of row values)
m = positions_table.insert({'status': positions['status'], 'timestamp': positions['timestamp'],
                            'symbol': positions['symbol'], 'amount': positions['amount'],
                            'base': positions['base'], 'swap': positions['swap'],
                            'pl': positions['pl']})

# executing the insert statement
print bit_fine_data.execute(m)

How to read data from rows or columns from DB:


import numpy as np

db = create_engine('sqlite:////home/metal-machine/Desktop/order_id.db')
metadata = MetaData(db)
# creating an instance for the table 'orders'
tickers = Table('orders', metadata, autoload=True)

# selecting a particular column from the table 'orders'
time_stamp = select([tickers.c.timestamp])

# building an array from the values in the 'timestamp' column (the array is optional here)
timestamp_array = np.array([row[0] for row in time_stamp.execute()])
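On the same lines, a filtered read can be expressed with select() and a WHERE clause. A small sketch against the same 'orders' table (old-style SQLAlchemy as used above, relying on the bound metadata for implicit execution; the cutoff value is made up):

# rows whose timestamp is above some cutoff
query = select([tickers]).where(tickers.c.timestamp > 1500000000.0)
for row in query.execute():
    print row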

There is much more left in SQLAlchemy Core, but I believe we should stop here and look at other things as well.

Stay tuned for the SQLAlchemy ORM part.

Rocks cluster for virtual containers

First of all I would like to thank the Rocks community for saving us a lot of money. Initially we were thinking about buying very expensive hardware to use as a dedicated server on which we would run multiple Docker containers as well as multiple virtual machines. Such systems are quite expensive; the following examples are worth considering when you are really serious about computing power, either for research or for a server business.

  1. http://www.ebay.in/itm/191944430915 (a base-class example, price range around 50K)
  2. http://stores.ebay.com/Cypress-Technology-Inc/HP-9000-Servers-HP-UX-/ (other possible high-availability options, price range more than 100K)

 

But we had to build a setup that should not cost more than 20-30K, with at least 8 cores and 16 GB of RAM. Our present requirements were not too high, so rather than spending a large amount on SSDs (solid state drives) we settled on normal mechanical hard drives.

We used old core2duo and dual-core CPUs as slave nodes for the Rocks cluster. For the front-end node we are currently using a second-generation i3 home PC, which we might upgrade in the near future, but it is really efficient and working pretty well on CentOS. ❤ ❤

 

Now, when we talk about cluster computing, only one thing comes to mind: a set of connected CPUs performing heavy operations, using all cores together to run some kind of simulation and feel like a scientist at NASA. 😀

Thanks to Dr. H.S. Rai (http://hs.raiandrai.com/), who introduced me to Rocks Cluster many months ago; it really changed my perception of super-computers, parallel processing and lots of other stuff I can no longer even remember. 😛

So, back to clusters! There are many types of clusters; it really depends on what kind of problem you want to solve using such a system.

Problem: the user/client wanted a simple machine with multiple cores and GBs of memory, so that a new virtual container could be created for any new user as per the requirement.

This tutorial assumes that you have installed the Rocks front-end node on one of your systems and have lots of other hardware available to connect to your front-end node.

Something like this:

 

 

This:

OR EVEN THIS:

If you are still not getting what I am trying to say, you had better first go to the Rocks cluster website and look at what they really do! Adding compute nodes is one form of cluster.

 

For adding virtual containers, Rocks comes with the XEN roll (http://www.rocksclusters.org/roll-documentation/xen/5.0). There are many Rocks rolls available as per the requirement. For example, there is an HPC roll that comes with OpenMPI (an implementation of the Message Passing Interface), which can be used if you want to execute your code across two, three or more nodes, using the computing cores of several systems together. A generic invocation looks something like this:

# execute_program compute-node-0 compute-node-1 compute-node-3

Such systems are used when you have a lot of data to analyse or handle, but even for that the present industry relies on expensive hardware rather than using the Rocks implementation. 😦 ;D Let's save that for another day and concentrate only on the virtualization stuff.

 

So, to implement virtualized containers we have to install the XEN roll on the front-end node. It can be installed during the normal installation of the front-end node if you are using the Jumbo DVD (comes with all rolls, ~3.something GB), or Rocks also provides all rolls (http://www.rocksclusters.org/roll-documentation/) as separate ISOs.

After successful installation of the XEN roll, one needs to connect the slave node either directly to the Ethernet card or through a network switch. (Make sure that while doing all this you are logged in as the root user.)

Execute the following command. (That’s my favourite command in the whole world, I FEEL like GOD 😀 )

#  insert-ethers

You will get a screen like the following (it could be different if you are using another version of Rocks), but you only have to concentrate on VM Containers. Make sure that at this point your slave node has PXE boot enabled; to enable PXE boot you have to look in the slave node's BIOS.

 

 

Hit enter after choosing the required option and you will see that installation on the slave node has started. It could take some time, so have patience. 😛

While your VM container is being installed, please have a look at what we are doing, so you will be able to understand the architecture of our cluster.

 

 

 

 

 

Or you can also see our creativity as well. 😛


 

 

 

 

 

 

 

 

After successful installation of the VM container you will see that it is available in your system; save your node and quit. To analyse this whole process, or to get an idea of what I am really talking about, you can look at this link as well, but let me be clear that we are using a VM container here, not a compute node. (http://www.rocksclusters.org/rocks-documentation/4.3/install-compute-nodes.html)

 

To assign an IP to your VM container, run the following command, but make sure you have a static IP so the container will be reachable from the public internet.

# rocks add cluster ip="your_static_IP" num-computes=<1,2,or 3>

The above command looks simple: just mention your static IP and the number of compute
nodes you want to use. It could be 1 or 2 depending on how many nodes you have and
how many VM containers you want to create.

To be clear: so far we have only created a virtual cluster, not virtual machines.

# rocks list cluster

The above command will list the VM clusters available on your system.

Now, before creating virtual machines, we need to create an RSA key pair so we will be able to log in from the front-end node to the virtual cluster and do the required operations:

# rocks create keys key=private.key passphrase=no

Setting passphrase=no means you will not be asked for a password; skip that option if you want to protect your key with a password.

Add that key to your newly created virtual cluster:

# rocks add host key frontend-0-0-0 key=public.key

The following command will start installation of the VM on your virtual cluster:

# rocks open host console frontend-0-0-0 key=private.key

After installation of the VM on the front end, it is up to you how you want to add virtual nodes to your system. This time these nodes will be truly virtual and you can associate your static IP with them.

Again we are back to insert-ethers, but this time we are logged in to our virtual front-end node, the one we created by connecting one or more nodes together. (This text is written in bold because it is a mind-blowing concept; I have blown my mind many times while understanding this step, but I really don't want you to blow yours. :D)

AGAIN: I am shouting that we have to log in to the VM front-end node, not the real front-end node 😀

# insert-ethers

Select “Compute” as the appliance type. (This time select compute, and you don’t have to worry about booting and setting up a slave node because we are working virtually!!! Yeah man! I think I am high at this point and songs are playing in my mind :D)

##########################################################################################

In another terminal session on vi-1.rocksclusters.org, we’ll need to set up the environment to send commands to the Airboss on the physical frontend. We’ll do this by putting the RSA private key that we created in section Creating an RSA Key Pair (e.g., private.key) on vi-1.rocksclusters.org.

Prior to sending commands to the Airboss, we need to establish a ssh tunnel between the virtual frontend (e.g., vi-1) and the physical frontend (e.g., espresso, where the Airboss runs). This tunnel is used to securely pass Airboss messages. On the virtual frontend (e.g., vi-1), execute:

# ssh -f -N -L 8677:localhost:8677 espresso.rocksclusters.org

###########################################################################################

Now we can securely send messages to the Airboss.

Did I tell you what is Airboss?

 

 

 

 

Now make sure you know the MAC addresses of your systems, so you will be able to power them on/off using the following command:

# rocks set host power <mac-address> key=private.key action=install

How to get MAC address?

# rocks list host macs <your-cluster-name> key=private.key

<your-cluster-name> is the real front-end node that you named while installing Rocks on your system for the first time!

The above command will give output like this (your output will definitely differ according to your compute nodes):

MACS IN CLUSTER  
36:77:6e:c0:00:02
36:77:6e:c0:00:00
36:77:6e:c0:00:03

When you power on your real node in the VM container you will see that it is
detected by the VM containers as follows:

 

To turn your VM off, the following command should be executed:

# rocks set host power compute-0-0 key=private.key action=off

That was all about setting up a virtual cluster and turning your nodes on/off within the VM containers. Let me clarify one thing: a VM container is the one that contains the various physical nodes that we can use, in combined form, to create virtual machines as big or as small as we want. (:P that looks like an easy definition :P)

 

OK, that was the most difficult part, and if you have reached here just give yourself a BIG SABAASH!

 

If you are aware of virt-manager, provided by Red Hat, you are in for a smooth ride from here; otherwise definitely take a look at (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Administration_Guide/chap-Virtualization_Administration_Guide-Managing_guests_with_the_Virtual_Machine_Manager_virt_manager.html)

Now stay on the real front-end node as the root user and make sure the required virtual clusters are running, otherwise you will not be able to create virtual machines on the virtual containers. (I am using the word 'virtual' so many times that I am really not sure whether I am in the real world or a virtual one.)

As root user in Front-node run:

# virt-manager

You should be able to see your VM containers and virtual machines there:

 

Voilaa!!!

Now Don’t ask What to do with your Virtual Machines. 😛
Here is our setup:

 


A simple script to parse a large file and save the results into a NumPy array

A normal approach:


import re
import numpy as np

file_location = 'huge_file_location'
my_regex = re.compile(r'tt\d\d\d\d\d\d\d')  # using a compiled regex saves time
a = np.array([])                            # an empty array to collect the matches
with open(file_location, 'r') as f:         # the usual way to open a file
    m = re.findall(my_regex, f.read())
    np_array = np.append(a, m)
print np_array
print np_array.size
print 'unique'
print np.unique(np_array)                   # removing duplicate entries from the array
print np.unique(np_array).size
np.save('BIG_ARRAY_LOCATION', np.unique(np_array))

In the above code, f.read() loads one big chunk of the file into memory as a string, about 8 GB in the present situation. Let's fire up generators.

A bit improved version:


def read_in_chunks(file_object, chunk_size=1024*1024):
    # read the file piece by piece instead of all at once
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

import numpy as np
import re

a = np.array([])
my_regex = re.compile(r'tt\d\d\d\d\d\d\d')
f = open(file_location)
for piece in read_in_chunks(f):
    m = re.findall(my_regex, piece)  # but this findall is still a bottleneck
    a = np.append(a, m)
np_array = a
print np_array
print np_array.size
print 'unique'
print np.unique(np_array)
print np.unique(np_array).size

A little bit faster code:


file_location = '/home/metal-machine/Desktop/nohup.out'

def read_in_chunks(file_object, chunk_size=1024*1024):
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

import numpy as np
import re

a = np.array([])
my_regex = re.compile(r'tt\d\d\d\d\d\d\d')
f = open(file_location)

def iterate_regex():
    '''run an iterator over the matched lists of strings as well'''
    for piece in read_in_chunks(f):
        yield re.findall(my_regex, piece)

for i in iterate_regex():
    a = np.append(a, i)
np_array = a
print np_array
print np_array.size
print 'unique'
print np.unique(np_array)
print np.unique(np_array).size

But why is the performance still not that good? Hmmm...
I'll have to look into a few more things.
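One direction that should help (a sketch, not benchmarked here): stream the file line by line and collect the matches in a Python set, so duplicates are dropped as you go and the repeated NumPy appends disappear; the regex and file path are the same as above.

import re
import numpy as np

file_location = '/home/metal-machine/Desktop/nohup.out'
my_regex = re.compile(r'tt\d\d\d\d\d\d\d')

unique_ids = set()
with open(file_location) as f:
    for line in f:                        # file objects iterate lazily, line by line
        unique_ids.update(my_regex.findall(line))

np_array = np.array(sorted(unique_ids))   # convert to a NumPy array once, at the end
print np_array.size
np.save('BIG_ARRAY_LOCATION', np_array)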

Look at the CPU usage while running on an 8-core Google Cloud instance.

 


Python and my fav parts

In this post you can get head-start knowledge of some basic but slightly advanced corners of Python programming, which can lead to a beautiful code base and, with experience, the title of Python Ninja.

1. List comprehensions
2. Generator Expressions
3. Generator function

List comprehensions: these are basically iteration over a list.
The earlier approach:


p = [1,2,3,4,5]
In [4]: for i in range(len(p)):
   ...:     p[i] += 10
In [5]: p
Out[5]: [11, 12, 13, 14, 15]


Using List comprehensions we can provide much better code:


p = [1,2,3,4,5]
In [7]: p = [i+10 for i in p] # This is a list comprehension
In [8]: p
Out[8]: [11, 12, 13, 14, 15]


Generator Expressions:

One potential downside of using a list comprehension is that it might
produce a large result if the original input is large. If this is a concern,
you can use generator expressions to produce the filtered values iteratively.

For example, if the input list is quite big, a generator expression is amazing:


In [37]: p = (i for i in xrange(10000))

In [38]: p

Out[38]: <generator object <genexpr> at 0x7ff16e47b0a0>

Here we got the generator object; now we can use it as we want.

We can also use a generator expression as a function argument:


In [49]:  p = (i for i in xrange(10000)) # remember we used xrange()
In [49]: p  = list(p)
In [50]: s = sum(i*i for i in p)
In [51]: s
Out[51]: 333283335000
In [52]: s = sum(i+10 for i in p)
In [53]: s
Out[53]: 50095000

This stuff is super fast.
Now, using a generator expression, convert a tuple into a CSV line:


In [54]: s = ('ACME', 50, 123.45)
In [55]: print(','.join(str(x) for x in s))
ACME,50,123.45

So from all the above discussion one thing is clear: we should use generator expressions as much as we can.

Generator Functions:

Any function that contains the "yield" keyword is a generator function.
There are a few examples I would like to share here:


In [21]: def hello(item):
   ....:     h = item*3
   ....:     yield h

In [22]: for i in hello(item='iamamazing'):
   ....:     print i

iamamazingiamamazingiamamazing

So the above is a generator function, as we have just learned.

Then what is the practical use of this? Generator functions are mostly fast (and memory-friendly) for text processing. Before going further I want to look at how generators help in data processing, data flow, big data and other such areas.
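As a small taste of that, generators can be chained into a lazy text-processing pipeline, where nothing is read or transformed until you actually iterate (a sketch; the log file name and the 'ERROR' filter are made up):

def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.rstrip('\n')

def grep(lines, word):
    for line in lines:
        if word in line:
            yield line

# nothing is read until this loop starts pulling values through the chain
for hit in grep(read_lines('server.log'), 'ERROR'):
    print hit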

__init__.py in Python

Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a submodule named B in a package named A. Just like the use of modules saves the authors of different modules from having to worry about each other’s global variable names, the use of dotted module names saves the authors of multi-module packages like NumPy or the Python Imaging Library from having to worry about each other’s module names.

Suppose you want to design a collection of modules (a “package”) for the uniform handling of sound files and sound data. There are many different sound file formats (usually recognized by their extension, for example: .wav, .aiff, .au), so you may need to create and maintain a growing collection of modules for the conversion between the various file formats. There are also many different operations you might want to perform on sound data (such as mixing, adding echo, applying an equalizer function, creating an artificial stereo effect), so in addition you will be writing a never-ending stream of modules to perform these operations. Here’s a possible structure for your package (expressed in terms of a hierarchical filesystem):

sound/                          Top-level package
      __init__.py               Initialize the sound package
      formats/                  Subpackage for file format conversions
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py
              ...

When importing the package, Python searches through the directories on sys.path looking for the package subdirectory.

The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.

Users of the package can import individual modules from the package, for example:

import sound.effects.echo

This loads the submodule sound.effects.echo. It must be referenced with its full name.

sound.effects.echo.echofilter(input, output, delay=0.7, atten=4)

An alternative way of importing the submodule is:

from sound.effects import echo

This also loads the submodule echo, and makes it available without its package prefix, so it can be used as follows:

echo.echofilter(input, output, delay=0.7, atten=4)

Yet another variation is to import the desired function or variable directly:

from sound.effects.echo import echofilter

Again, this loads the submodule echo, but this makes its function echofilter() directly available:

echofilter(input, output, delay=0.7, atten=4)

Note that when using from package import item, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, like a function, class or variable. The import statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an ImportError exception is raised.

Contrarily, when using syntax like import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.

6.4.1. Importing * From a Package

Now what happens when the user writes from sound.effects import *? Ideally, one would hope that this somehow goes out to the filesystem, finds which submodules are present in the package, and imports them all. This could take a long time and importing sub-modules might have unwanted side-effects that should only happen when the sub-module is explicitly imported.

The only solution is for the package author to provide an explicit index of the package. The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it, if they don’t see a use for importing * from their package. For example, the file sound/effects/__init__.py could contain the following code:

__all__ = ["echo", "surround", "reverse"]

This would mean that from sound.effects import * would import the three named submodules of the sound package.

If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements. Consider this code:

import sound.effects.echo
import sound.effects.surround
from sound.effects import *

In this example, the echo and surround modules are imported in the current namespace because they are defined in the sound.effects package when the from...import statement is executed. (This also works when __all__ is defined.)

Although certain modules are designed to export only names that follow certain patterns when you use import *, it is still considered bad practice in production code.

Remember, there is nothing wrong with using from Package import specific_submodule! In fact, this is the recommended notation unless the importing module needs to use submodules with the same name from different packages.

6.4.2. Intra-package References

When packages are structured into subpackages (as with the sound package in the example), you can use absolute imports to refer to submodules of sibling packages. For example, if the module sound.filters.vocoder needs to use the echo module in the sound.effects package, it can use from sound.effects import echo.

You can also write relative imports, with the from module import name form of import statement. These imports use leading dots to indicate the current and parent packages involved in the relative import. From the surround module for example, you might use:

from . import echo
from .. import formats
from ..filters import equalizer

Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.

6.4.3. Packages in Multiple Directories

Packages support one more special attribute, __path__. This is initialized to be a list containing the name of the directory holding the package’s __init__.py before the code in that file is executed. This variable can be modified; doing so affects future searches for modules and subpackages contained in the package.

While this feature is not often needed, it can be used to extend the set of modules found in a package.

How is the Toptal Web Engineering Community revolutionizing the IT industry?

Globalization has led lots of people to move to wherever they can do the work they love and build an amazing future in a specific industry. The IT industry is one of those, but a large part of information technology work is now oriented around "work from your own place/city/country". One can set up an office at home, fetch work from any location in the world, complete it within the required time and get paid. That seems like a great plan for the client as well as for the computer guy (developer, freelancer, programmer). These conditions created the freelancing industry, which has had a great influence on society; thanks to it, many people now believe in "working for yourself", which has resulted in quality work and great innovations by individuals.

A job portal works like a bridge between client and freelancer, so it should have high standards for both jobs and freelancers. I believe Toptal is one of them because of its well-defined selection process, which not only makes freelancers serious about their work but also connects clients with the best developers for it.

From the top view, Toptal is creating a great community of developers and companies who love their work and believe in great careers. A shared goal leads to a well-prepared future and better values. In the information age everyone wants to work with smart people, to learn well and be challenged intellectually. Just like Thomas Edison, who brought great people together to invent great things, Toptal is doing the same at a world level. I would love to be part of such a community!

 

 

Cython in Easy Way, Be cool

OK, be cool: there is a lot of buzzing going on in my mind, because there has to be.

Why am I going to learn about Cython?

1. Sometimes people say my favourite language is damn slow! (Which is not true if you know about generators, futures and the multiprocessing module; let's just save that for some other day.)

2. I might get to use it some day at a job in the field of data analysis or number-crunching computations.

3. Cython is sexy, and I love sexy things. 🙂 😉

4. Portability is good for you if you are a server administrator by day and an Android developer by night, or even a web developer on weekends. [So Cython can help you everywhere.]

5. You have lots of C++/C code, and you can keep using it.

 

Slide1> C+Python = Cython

You will not see the need for it at first, unless you are from NASA 😉

 

Slide2> Why we should/must love Python

{import antigravity}

Slide3> C+P….. = Cython

Do I need to learn C if I have to go with Cython?

No

Slide4> Cython is made for Python programmers, to achieve the speed of C with the productivity of Python.

Aha! I feel like I can create a whole universe now (if you understood the above statement).

 

Slide5> Where to start if you want to write programs in Cython?

Remember: (C+Python = Cython) or (Python+C = Cython) 😉

So let’s Start With Python…! yay!!

 

A very simple Python function (don't try this at home 😉 ):

def sum_text(number_range):
    "This function explains a lot about my laziness"
    return sum([i*i for i in xrange(number_range)])

When you run it in IPython using %timeit sum_text(100000000), your system can crash/hang if you don't have plenty of RAM (like 10 GB).

It will take around 15.6 seconds.

So now we need Cython to optimise it.

Again using IPython:

%%cython
cpdef double sum_text_cython(int number_range):
    "This is Cython (not 100%, but it works 😀 )"
    cdef int i
    return sum([i*i for i in xrange(number_range)])

Image of above thing: 
https://dl.dropboxusercontent.com/u/32435266/cython.png

 

%timeit sum_text_cython(100000000)

1 loops, best of 3: 3.43 s per loop

So it is about 4.5 times faster (15.6 s down to 3.43 s). Yeah!!! (That's what I mean!)

For the record, this example is just for illustration; it can produce lots of other errors.






Continued... (going for dinner). Next up: Monte Carlo simulation.
Now really going for dinner...
http://hplgit.github.io/teamods/MC_cython/sphinx/main_MC_cython.html

OK, now the real thing: we need to take this into real life, like how to use Cython while still writing real code in Python.

So write a .pyx file with simple Python code.

def sum_text(number_range):
    p = [i*i for i in xrange(number_range)]
    return sum(p)

So this is a plain Python thing; save it as hellocython.pyx.

NEVER save it as "cython.pyx".

$ easycython hellocython.pyx

It will generate hellocython.c file and then also compile that into .so file.
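If you prefer not to depend on easycython, the same build can be done with a small setup.py using Cython's own build helper (a minimal sketch):

# setup.py
from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("hellocython.pyx"))

Then run python setup.py build_ext --inplace to get the .c and .so files next to your source.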

Nextslide> So we only have Python code, but Cython understands it well.

nxtSlide> Let's make the code even faster:

def sum_text(int number_range):
    cdef int i
    p = [i*i for i in xrange(number_range)]
    return sum(p)

The above is not even a 100% pure Cython function, but compiled into a .so file it is about 2.5 times faster. 😀

nxtslide> Now we can easily import this into our regular Python code.

Create a normal a.py file:

import hellocython

print hellocython.sum_text(100000)

Now we can call our compiled code from Python: Cython = C + Python.

[Next up: Cython with multi-threading] AND [convert this into a presentation]. Add more Cython examples and make a GitHub repository…
