Database for Financial Analysis I - SQLite

After merging and reshaping the datasets, we can use SQLite to store, organize and manipulate data in smaller environments.
SQLite is a public-domain software package that provides a relational database management system.



[img: SQL flowchart]


Goals: Automate Financial Analysis Process in SQLite

- Install Scientific Python "sqlite3" to convert Excel file into SQLite database in 8 lines of code
- Excel file will be updated after rerun/reconnect to SQLite
- Set index and modifying primary key in SQLite or in Python
- Create and save query result table in SQLite and Excel
- Create and alter tables with pandas
- Insert columns to the tables        
- Use free graphical user interface (GUI) tools to run SQL queries on database 


Connecting SQLite

- The first row in each sheet should have legal column names
- Excel tab name cannot have spaces, but underscores are acceptable 
- Here is the conversion codes:
  - Import library sqlite3
  - Name of Excel .xlsx file. SQLite database will have the same name and extension .db (filename+".db")
  - Use if_exists so we can repeat the process

Connect Excel file into SQlite
- The example file is a lsit of stocks with 4 tabs - market value, share, cost and combined

In [2]:
filename="data/SQLite/Stock" 
con=sqlite3.connect(filename+".db")  

wb=pd.read_excel(filename+'.xlsx',sheet_name=None)
for sheet in wb:
    wb[sheet].to_sql(sheet,con, index=False, if_exists='replace')
   
con.commit()
#con.close()
C:\apps\python36\python-3.6.3.amd64\lib\site-packages\pandas\core\generic.py:1534: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
  chunksize=chunksize, dtype=dtype)


Query the database

- The benefit of SQlite is we can set up unique identifer as primay key
- We can repeat/automate the query process when new data get updated 
- Save the query result to SQLite or Excel    


Join the table

- Option A : Inner join between the reinsurance, premium and loss tables 
- Option B : Selecting row


Option A : Using SQL for Inner Join between three tables

In [3]:
sql_statement='''
select
m.Symbol as Market_Tab,
s.Symbol as Share_Tab,
c.Symbol as Cost_Tab
from Market as m
join Share as s on s.Symbol=m.Symbol
join Cost as c on c.Symbol=m.Symbol
'''
df = pd.read_sql(sql_statement,con)

df.to_sql('/SQlite/QueryResult', con,if_exists='replace') 
df.to_excel('data/SQlite/QueryResult.xlsx') 


Option B : Using Pandas issuing SQL command for Inner Join between three tables

In [4]:
sql_statement='select * from Market'
df_market = pd.read_sql(sql_statement,con)
df_market['Market_Tab']=df_market['Symbol']

sql_statement='select * from Share'
df_share = pd.read_sql(sql_statement,con)
df_share['Share_Tab']=df_share['Symbol']

sql_statement='select * from Cost'
df_cost = pd.read_sql(sql_statement,con)
df_cost['Cost_Tab']=df_cost['Symbol']

df_merge = pd.merge(df_market, df_share, on=['Symbol'], how='inner', suffixes=['_from_Market_Tab','_from_Share_Tab'])

df_merge = pd.merge(df_merge, df_cost, on=['Symbol'], how='inner', suffixes=['','_from_Cost_Tab'])

df_merge[['Market_Tab','Share_Tab','Cost_Tab']].head()

df_merge=df_merge[['Market_Tab','Share_Tab','Cost_Tab']]

print("rows=%s" % df_merge.shape[0])

df_merge.to_sql('/SQlite/QueryRresult2', con,if_exists='replace') 
df_merge.to_excel('data/SQlite/QueryResult2.xlsx') 
Out[4]:
Market_Tab Share_Tab Cost_Tab
0 AA AA AA
1 AAPL AAPL AAPL
2 ABBV ABBV ABBV
3 ABT ABT ABT
4 ADBE ADBE ADBE
rows=19


DB Browser - a tool to visualize and access SQLite databases

- DB Browser for SQLite is a high quality, visual, open source tool used to create, design, and edit database files compatible with SQLite.
- Save the query result to SQLite
- Below is the screenshot of the database structures after connecting from Excel to SQLite by using Python  

[img:DB Browser after save merge result as SQL file]


Insert a column to table

- Option A - use DML to add a column :-|
- Option B - use to_sql to add a column - much easier and more native to PANDAS way of doing things


View table before inserting a new column

In [5]:
sql_statement='''
select s.* from share as s
'''
df = pd.read_sql(sql_statement,con)
df.head()
Out[5]:
Symbol Description Quantity
0 AA ALCOA 11.00
1 AAPL APPLE 1.03
2 ABBV ABBVIE 1.06
3 ABT ABBOTT 2.05
4 ADBE ADOBE 5.00


Option A - use DML to add a column :-|

In [6]:
## Obtain a Cursor object to execute SQL statements

cur=con.cursor()

## Add a new column to share table

addColumn = "ALTER TABLE Share ADD COLUMN BROKER varchar(32)"

cur.execute(addColumn) 

sql_statement='''
select s.* from Share as s
'''
df = pd.read_sql(sql_statement,con)
df.head()
Out[6]:
<sqlite3.Cursor at 0x1b63ec76c00>
Out[6]:
Symbol Description Quantity BROKER
0 AA ALCOA 11.00 None
1 AAPL APPLE 1.03 None
2 ABBV ABBVIE 1.06 None
3 ABT ABBOTT 2.05 None
4 ADBE ADOBE 5.00 None


Option B - use to_sql to add a column - much easier and more native to PANDAS way of doing things

In [7]:
df['Broker2']='Fidelity LLC'
df.to_sql('Share',con,if_exists='replace') 

df.head()
C:\apps\python36\python-3.6.3.amd64\lib\site-packages\pandas\core\generic.py:1534: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
  chunksize=chunksize, dtype=dtype)
Out[7]:
Symbol Description Quantity BROKER Broker2
0 AA ALCOA 11.00 None Fidelity LLC
1 AAPL APPLE 1.03 None Fidelity LLC
2 ABBV ABBVIE 1.06 None Fidelity LLC
3 ABT ABBOTT 2.05 None Fidelity LLC
4 ADBE ADOBE 5.00 None Fidelity LLC
In [9]:
## Rename the SQLite Table
renameTable = "ALTER TABLE Combined RENAME TO CombinedNew"
cur.execute(renameTable)
Out[9]:
<sqlite3.Cursor at 0x1b63ec76c00>


Inserting a new column called "testing" to the SQlite

In [10]:
con = sqlite3.connect("data/SQLite/Stock.db")
cur = con.cursor()
addColumn = "ALTER TABLE Market ADD COLUMN Broker_Name varchar(32)"
cur.execute(addColumn)
Out[10]:
<sqlite3.Cursor at 0x1b63eed4c00>
In [ ]:
## Closing the connection to the database file
con.close()