How to Map a Data Model to a Database

This post will examine how to map a data model to a database.

For a basic data model, decisions are made on whether to classify objects as “things” or “attributes“. Consider an application built for managing audio tracks.

Data Model

To map the data model from a logic model to a physical model, a database, you have to augment your table. Foreign keys are used to augment a table. Each foreign key points to a primary key of another table.

Primary Keys, Foreign Keys, Logical Keys

For an application that manages audio tracks, the albums should be numbered. These numbers would be called the primary keys or the “id“. For an album numbered 4, then that primary key “id” would be listed as the foreign key “album_id” for a track that is on that album.

Primary keys become end points that a foreign key would point to. The logical key is something you might use in an SQL WHERE clause.

wk3b_representing_2

It is important to have a naming convention for any data model. It makes drawing out the data model much easier.

Data Model Sketch

 

Consider SQLite with Python to build a database. These can both run on the desktop, are fast, and easy to setup.

How to Parse Email Data with Python

This post will examine how to parse email data with Python.

You will go step by step through the program emaildb.py.

This program will read through mbox-short.txt file, count the occurrences of each email, and put that data in a database.

In the first line of code you will import a library you need to talk to the database.

The second line of code establishes a connection to the database file emaildb.sqlite.

The third line of code creates a cursor object that allows you to make commands to the database.

email db

Next, you will call the execute method on the cursor. This program will create a new table called “Counts” every time it runs. So, first it will drop the table if it exists, and then it will create a new table.

drop table

Notice how you are using SQL commands in Python to talk to the database.

The next several lines of code are similar to what you learned in the Python dictionary post. Here, you are parsing through the file and pulling out the email address.parse data

 

The next line of code is different. This line uses a technique called parameter substitution.

parameter sub

In this line, you are selecting a row from the table that match the email. The question mark after email serves as a placeholder for a value that will be substituted in. You are substituting in the current value of the email variable.

Next, you will use a “try” statement. You want to use “try” because your program will blow up if no rows were found that matched the email from the previous line. If the email was found, then you advance the count of that email.

In the “except” statement, you are saving the program from blowing up from the email not being found in the table. Rather, you insert this new email into the table, and start the count at 1.

Finally, the conn.commit() statement is very important, because it writes your new changes to the database.

commit

The next thing you will do is run select statement to list the top 10 emails in descending order.

top 10 emails

Before closing out the program, you will loop through the rows and print the columns for each row. It is a good idea to convert your column fields to str type before printing.

print entries

Make an SQL Database

In this post you will make an SQL database.

If you have not done so already, you need to install an SQL database tool. See picture below.

SQLite

SQLite Browser reads and writes a special type of file that is optimized for rapid retrieval, and storage of data.

Run SQLite Browser and create a new database. Once complete, your screen should display the following:

create database

Now, go ahead and create a table. Use the following command on the right side of the picture shown below (assume case sensitivity). This specifies a new table called “Users”, with two columns, and up to 128 length variable characters. This code also forms a contract with the table. A contract with what is allowed, and not allowed. Put this command in the execute SQL window.

create table

It is up to you regarding the terms of the contract, but you need to plan it ahead of time.

Next, add data (“records”) in your table as shown below.

add records

You can even view the SQL Log to view the code every time you add a new record.

Now, you can learn for SQL commands to insert, delete, and retrieve data.

 

You put the SQL commands in the Execute SQL window.

At any time you can hit the “Browse Data” button to see all your records.

Following is how you delete a row.

delete row

 

This will delete all rows that have the matching email address.

The update statement allows you to find certain records and change their values.

sql update

Finally, you can do a retrieval with the statement shown below.

retrieving records

The “SELECT *” command translate to “Select All”.

You can also sort your records.

sorting records

Below is a summary of all the SQL commands you have learned in this post.

SQL summary

This covers the basics of SQL databases. The real power will come later when you learn to exploit connections between tables.

Database Application SQLite and Python

This post will introduce the database application SQLite in relation to Python.

For large project, there are two main roles. One of those roles in the database administrator, who often consults with the developer (the other role). These are rather specialized jobs.

two roles

The administrator talks directly to the database, while the developer talks to indirectly, by way of the application. The following picture illustrates this split between roles, in large projects.

large project structure

However, for smaller projects, one person can handle both roles.

By handling both these roles you will:

  1. Use the database application SQLite to create tables.
  2. Write Python programs to retrieve, clean, and put cleaned data in the tables.
  3. Write another program to pull the cleaned data out and output a nice file.

For now, you will focus on the first step. You will learn how to create a database model, or contract.

database model

There are several common database systems. Oracle dominated this market, mainly because they were the first to embrace the relational model concept. However, there are good alternatives.

common database systems

 

In the context of Python, you will use SQLite. As it turns out, SQLite is quite popular.

SQLite is Popular

SQLite is fast, and is good for smaller amounts of data. Most importantly, the database application SQLite is embedded in Python.

 

Using Databases with Python – A Must Know

It is a good idea to understand the importance of using databases with Python while learning Python. If you are analyzing data, pulling data from over the network, then it makes sense to store that data in a database. You can then set up a process of pulling data from the database as you need it. Overall, it can speed up your workflow.

A good database system to learn using databases with Python is DB Browser for SQLite.

using databases with python

Relational databases comprise a whole sub-field of computer science. They are relevant because you can pull out an entry from huge amounts of data in a split-second. It would take you much longer if you had to read through the data.

relational databases

You can look no further than Oracle to understand the relevance of relational databases. The majority of their revenue comes from database products.

The underlying foundation of databases is rooted in mathematics. This is present in the terminology that experts use to describe databases.

database terminology

The idea behind databases is that you model data at a connection point.

database model

However, programmers tend to think of it in terms of rows and columns.

Typically, when you make a table, the first row becomes metadata for the table. You often use the first row to title what each column is for. Therefore, you can refer to this first row as the schema for the table. It sets the rules for each column with regard to what goes there, for example, a string, an integer, etc.

In the early 1960s, the database pioneers figured out ways to quickly retrieve data from random access memory, without having to go through the data sequentially. However, databases were very complex. As a result, a new component of internet architecture evolved called the database application. This allows you, the programmer, to talk with the database, by way of the database application. At this point, an industry standard was desired for the language for the API between a database and its application. The name of the language the industry agreed on was SQL (Structured Query Language).

metadata

SQL is a great language, but it depends on the data being clean . The nice thing about Python, is it can deal really well with unstructured data. So together, Python and SQL, you have a powerful combination.

SQL