After merging and reshaping the datasets, we can use SQLite to store, organize and manipulate data in smaller environments.
SQLite is a public-domain software package that provides a relational database management system.

[img: SQL flowchart]

Goals: Automate Financial Analysis Process in SQLite

- Install Scientific Python "sqlite3" to convert Excel file into SQLite database in 8 lines of code
- Excel file will be updated after rerun/reconnect to SQLite
- Set index and modifying primary key in SQLite or in Python
- Create and save query result table in SQLite and Excel
- Create and alter tables with pandas
- Insert columns to the tables        
- Use free graphical user interface (GUI) tools to run SQL queries on database

Connecting SQLite

- The first row in each sheet should have legal column names
- Excel tab name cannot have spaces, but underscores are acceptable 
- Here is the conversion codes:
  - Import library sqlite3
  - Name of Excel .xlsx file. SQLite database will have the same name and extension .db (filename+".db")
  - Use if_exists so we can repeat the process

Connect Excel file into SQlite
- The example file is a lsit of stocks with 4 tabs - market value, share, cost and combined

In [2]:

filename="data/SQLite/Stock" 
con=sqlite3.connect(filename+".db")  

wb=pd.read_excel(filename+'.xlsx',sheet_name=None)
for sheet in wb:
    wb[sheet].to_sql(sheet,con, index=False, if_exists='replace')
   
con.commit()
#con.close()

C:\apps\python36\python-3.6.3.amd64\lib\site-packages\pandas\core\generic.py:1534: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
  chunksize=chunksize, dtype=dtype)

Query the database

- The benefit of SQlite is we can set up unique identifer as primay key
- We can repeat/automate the query process when new data get updated 
- Save the query result to SQLite or Excel

Join the table

- Option A : Inner join between the reinsurance, premium and loss tables 
- Option B : Selecting row

Option A : Using SQL for Inner Join between three tables

In [3]:

sql_statement='''
select
m.Symbol as Market_Tab,
s.Symbol as Share_Tab,
c.Symbol as Cost_Tab
from Market as m
join Share as s on s.Symbol=m.Symbol
join Cost as c on c.Symbol=m.Symbol
'''
df = pd.read_sql(sql_statement,con)

df.to_sql('/SQlite/QueryResult', con,if_exists='replace') 
df.to_excel('data/SQlite/QueryResult.xlsx')

Option B : Using Pandas issuing SQL command for Inner Join between three tables

In [4]:

sql_statement='select * from Market'
df_market = pd.read_sql(sql_statement,con)
df_market['Market_Tab']=df_market['Symbol']

sql_statement='select * from Share'
df_share = pd.read_sql(sql_statement,con)
df_share['Share_Tab']=df_share['Symbol']

sql_statement='select * from Cost'
df_cost = pd.read_sql(sql_statement,con)
df_cost['Cost_Tab']=df_cost['Symbol']

df_merge = pd.merge(df_market, df_share, on=['Symbol'], how='inner', suffixes=['_from_Market_Tab','_from_Share_Tab'])

df_merge = pd.merge(df_merge, df_cost, on=['Symbol'], how='inner', suffixes=['','_from_Cost_Tab'])

df_merge[['Market_Tab','Share_Tab','Cost_Tab']].head()

df_merge=df_merge[['Market_Tab','Share_Tab','Cost_Tab']]

print("rows=%s" % df_merge.shape[0])

df_merge.to_sql('/SQlite/QueryRresult2', con,if_exists='replace') 
df_merge.to_excel('data/SQlite/QueryResult2.xlsx')

Out[4]:

	Market_Tab	Share_Tab	Cost_Tab
0	AA	AA	AA
1	AAPL	AAPL	AAPL
2	ABBV	ABBV	ABBV
3	ABT	ABT	ABT
4	ADBE	ADBE	ADBE

rows=19

DB Browser - a tool to visualize and access SQLite databases

- DB Browser for SQLite is a high quality, visual, open source tool used to create, design, and edit database files compatible with SQLite.
- Save the query result to SQLite
- Below is the screenshot of the database structures after connecting from Excel to SQLite by using Python

[img:DB Browser after save merge result as SQL file]

Insert a column to table

- Option A - use DML to add a column :-|
- Option B - use to_sql to add a column - much easier and more native to PANDAS way of doing things

View table before inserting a new column

In [5]:

sql_statement='''
select s.* from share as s
'''
df = pd.read_sql(sql_statement,con)
df.head()

Out[5]:

	Symbol	Description	Quantity
0	AA	ALCOA	11.00
1	AAPL	APPLE	1.03
2	ABBV	ABBVIE	1.06
3	ABT	ABBOTT	2.05
4	ADBE	ADOBE	5.00

Option A - use DML to add a column :-|

In [6]:

## Obtain a Cursor object to execute SQL statements

cur=con.cursor()

## Add a new column to share table

addColumn = "ALTER TABLE Share ADD COLUMN BROKER varchar(32)"

cur.execute(addColumn) 

sql_statement='''
select s.* from Share as s
'''
df = pd.read_sql(sql_statement,con)
df.head()

Out[6]:

<sqlite3.Cursor at 0x1b63ec76c00>

Out[6]:

	Symbol	Description	Quantity	BROKER
0	AA	ALCOA	11.00	None
1	AAPL	APPLE	1.03	None
2	ABBV	ABBVIE	1.06	None
3	ABT	ABBOTT	2.05	None
4	ADBE	ADOBE	5.00	None

Option B - use to_sql to add a column - much easier and more native to PANDAS way of doing things

In [7]:

df['Broker2']='Fidelity LLC'
df.to_sql('Share',con,if_exists='replace') 

df.head()

C:\apps\python36\python-3.6.3.amd64\lib\site-packages\pandas\core\generic.py:1534: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
  chunksize=chunksize, dtype=dtype)

Out[7]:

	Symbol	Description	Quantity	BROKER	Broker2
0	AA	ALCOA	11.00	None	Fidelity LLC
1	AAPL	APPLE	1.03	None	Fidelity LLC
2	ABBV	ABBVIE	1.06	None	Fidelity LLC
3	ABT	ABBOTT	2.05	None	Fidelity LLC
4	ADBE	ADOBE	5.00	None	Fidelity LLC

In [9]:

## Rename the SQLite Table
renameTable = "ALTER TABLE Combined RENAME TO CombinedNew"
cur.execute(renameTable)

Out[9]:

<sqlite3.Cursor at 0x1b63ec76c00>

Inserting a new column called "testing" to the SQlite

In [10]:

con = sqlite3.connect("data/SQLite/Stock.db")
cur = con.cursor()
addColumn = "ALTER TABLE Market ADD COLUMN Broker_Name varchar(32)"
cur.execute(addColumn)

Out[10]:

<sqlite3.Cursor at 0x1b63eed4c00>

In [ ]:

## Closing the connection to the database file
con.close()