Choosing The Right Python Data Type for Analysis



In [4]:
import datetime 

Dictionary

Use a dictionary when your data has a mapping relationship. In stock market data, for example, each set of prices is linked to a specific date.

In [94]:
stock = {"Header":["Open","High","Low","Close"],"12/12/2015":[32.03,50,40,32]}
In [95]:
stock["12/12/2015"]
Out[95]:
[32.03, 50, 40, 32]
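To make the mapping idea a bit more concrete, here is a minimal sketch that keys the prices by `datetime.date` objects instead of strings and pairs each value with its column name. The names `header`, `prices`, `day`, and `row` are my own for illustration, not part of the notebook above.

```python
import datetime

# Column names, in the same order as the "Header" entry above.
header = ["Open", "High", "Low", "Close"]

# Hypothetical variant of the stock dictionary, keyed by date objects.
prices = {datetime.date(2015, 12, 12): [32.03, 50, 40, 32]}

day = datetime.date(2015, 12, 12)

# zip pairs each column name with the value in the same position.
row = dict(zip(header, prices[day]))

print(row["Close"])
print(day.strftime("%m/%d/%Y"))
```

Date objects as keys are one option among several. They can be compared and sorted chronologically, which plain "12/12/2015" strings cannot do reliably.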

Dictionaries provide some useful methods

In [74]:
list(stock.keys())  # the keys, converted to a list
Out[74]:
['Header', '12/12/2015']
In [78]:
list(stock.items())  # (key, value) tuples, each representing one relation
Out[78]:
[('Header', ['Open', 'High', 'Low', 'Close']),
 ('12/12/2015', [32.03, 50, 40, 32])]

A dictionary can also be passed to the constructor of a pandas DataFrame. I will talk about this in later posts.

Tuple

Anything you want treated as a single unit should use a tuple, because tuples are immutable. Tuples are also often used to pass a set of arguments to a function call.

In [68]:
x=12
y=23
z=23.6

point = (x,y,z) # a tuple can represent a coordinate because the
                # position of each element matters
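Here is a small sketch of both points: passing a tuple as a set of arguments, and immutability. The function `distance_from_origin` is a hypothetical helper I made up for this example.

```python
import math


def distance_from_origin(x, y, z):
    """Euclidean distance of a 3-D point from the origin (hypothetical helper)."""
    return math.sqrt(x * x + y * y + z * z)


point = (12, 23, 23.6)

# The * operator unpacks the tuple into three positional arguments.
d = distance_from_origin(*point)

# Tuples are immutable, so item assignment raises TypeError.
try:
    point[0] = 0
    mutated = True
except TypeError:
    mutated = False

print(d, mutated)
```

Because the assignment fails, `point` is guaranteed to keep the same three values for as long as it exists.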

Some people also argue that a tuple is comparable to a struct in C/C++, since a tuple usually holds a heterogeneous collection.
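If you like the struct comparison, the standard library's `collections.namedtuple` gets you closer to it by giving each position a name. This is only a sketch; the `Quote` type and its field names are my own assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical record type; the field names are illustrative.
Quote = namedtuple("Quote", ["date", "open", "high", "low", "close"])

q = Quote("12/12/2015", 32.03, 50, 40, 32)

print(q.close)  # access by field name, like a struct member
print(q[0])     # positional access still works, because it is still a tuple
```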

As I mentioned, tuples are also used to represent relations in discrete structures. Here is an example using a dictionary and tuples.

In [66]:
# for example, we have y = x^2
f = {1:1,2:4,3:9}
list(f.items())  # items() gives the (key, value) pairs as tuples
Out[66]:
[(1, 1), (2, 4), (3, 9)]
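The relation can go in both directions: a list of pairs can be turned back into a dictionary, and each pair can be unpacked in a loop. A minimal sketch, with `pairs`, `rebuilt`, and `checks` as illustrative names:

```python
# y = x squared for x = 1, 2, 3
f = {x: x ** 2 for x in range(1, 4)}

pairs = list(f.items())  # list of (x, y) tuples
rebuilt = dict(pairs)    # dict() accepts an iterable of (key, value) pairs

# Tuple unpacking gives each element of the pair its own name.
checks = [y == x * x for x, y in pairs]

print(pairs)
print(rebuilt == f, all(checks))
```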

List

A list is like an array in other programming languages. Use a list wherever you would use an array. In Python, lists also provide many methods that are useful for analysis.

List as a stack

In [14]:
l = []
l.append(1)
l.append(2)
l.append(3)
In [15]:
l
Out[15]:
[1, 2, 3]
In [16]:
l.pop()  # last in, first out: removes and returns the last element
Out[16]:
3
In [17]:
l
Out[17]:
[1, 2]
In [18]:
l.append(4)
l
Out[18]:
[1, 2, 4]
In [19]:
l.pop()
Out[19]:
4
In [20]:
l.pop()
Out[20]:
2
In [21]:
l
Out[21]:
[1]
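Calling `pop()` with no argument always takes from the end of the list, which is stack (last in, first out) behavior. If you need queue (first in, first out) behavior instead, one common option is `collections.deque`. A short sketch comparing the two, with variable names chosen just for this example:

```python
from collections import deque

stack = [1, 2, 3]
last = stack.pop()       # removes and returns the last item

queue = deque([1, 2, 3])
first = queue.popleft()  # removes and returns the first item

print(last, stack)         # 3 [1, 2]
print(first, list(queue))  # 1 [2, 3]
```

A plain list can also do `l.pop(0)`, but that shifts every remaining element, so `deque` is usually the better fit for queues.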

List sorting

In [32]:
l = [2,3,9,4]
l.sort()
In [33]:
l
Out[33]:
[2, 3, 4, 9]
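Note that `sort()` changes the list in place. When you want to keep the original order around, the built-in `sorted()` returns a new list instead. A small sketch, where the `key` function (distance from 4) is just an arbitrary example:

```python
l = [2, 3, 9, 4]

ascending = sorted(l)                 # new list; l itself is unchanged
descending = sorted(l, reverse=True)  # largest first

# key= sorts by a computed value; here, the distance of each number from 4.
by_distance = sorted(l, key=lambda v: abs(v - 4))

print(l)            # [2, 3, 9, 4]
print(ascending)    # [2, 3, 4, 9]
print(descending)   # [9, 4, 3, 2]
print(by_distance)  # [4, 3, 2, 9]
```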

List Reversing

In [34]:
l.reverse()
In [35]:
l
Out[35]:
[9, 4, 3, 2]

List Extend Method

In [40]:
l.extend([1]) # takes an iterable and appends each of its elements to the end of the list
In [39]:
l
Out[39]:
[9, 4, 3, 2, 1]
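It is easy to mix up `extend` and `append`, and neither of them keeps a list sorted. The sketch below shows the difference, plus `bisect.insort` from the standard library as one way to insert a value at its sorted position. It assumes the list is already sorted in ascending order; the variable names are illustrative.

```python
import bisect

a = [9, 4]
a.append([3, 2])     # adds the list itself as a single element

b = [9, 4]
b.extend([3, 2])     # adds each element to the end

s = [2, 3, 9]        # must already be sorted in ascending order
bisect.insort(s, 4)  # inserts 4 at the position that keeps s sorted

print(a)  # [9, 4, [3, 2]]
print(b)  # [9, 4, 3, 2]
print(s)  # [2, 3, 4, 9]
```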

Other List Methods

In [42]:
l.remove(1) # remove the first occurrence of the value 1
l
Out[42]:
[9, 4, 3, 2]
In [52]:
l.index(4) # return the index of the first occurrence of 4
Out[52]:
1
In [62]:
l.insert(0,2) # insert 2 into index 0
l
Out[62]:
[2, 9, 4, 3, 2]

I will cover data structure usage in more depth in later posts, as we go further into analysis.