Python Scatter Plot Example Using Matplotlib

A Python scatter plot example can be used as a reference to build another plot, or to remind us about the proper syntax.

Python scatter plots example often use the Matplotlib library because it is arguably the most powerful Python library for data visualization. It is usually used in combination with the Python Numpy library.

Suppose you have two Python lists. One is a list of  home prices, and the other list represents the size of the living area. You want to use these lists to see if there is a correlation between the two. This problem calls for a simple linear regression analysis. However, a scatter plot can help infer if there is a strong or weak correlation.

The Python Scatter Plot Example

The list for home prices is:

homeprice = [208500, 181500, 223500, 140000, 250000, 143000, 307000, 200000, 129900, 118000, 129500, 345000, 144000, 279500, 157000, 132000, 149000, 90000, 159000, 139000, 325300]

The list for living area size is:

livearea = [1710, 1262, 1786, 1717, 2198, 1362, 1694, 2090, 1774, 1077, 1040, 2324, 912, 1494, 1253, 854, 1004, 1296, 1114, 1339, 2376]

The next step is to import the libraries.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Inline comments can explain the remaining steps.

# Convert lists to numpy arrays
np_homeprice = np.array(homeprice)
np_livearea = np.array(livearea)

# Set arguments for the x and y axis
plt.scatter(np_livearea, np_homeprice)

# Label x and y axis
plt.xlabel(‘Living Area Square Footage’)
plt.ylabel(‘Sale Price of Home’)

# Give title to plot
plt.title(‘Home Sale Price vs. Size of Living Area’)

# Display the plot
plt.show()

Executing this code will result in a scatter plot.

Python Scatter Plot Example

Play with this example in the interactive Google Colab.

How to Parse Email Data with Python

This post will examine how to parse email data with Python.

You will go step by step through the program emaildb.py.

This program will read through mbox-short.txt file, count the occurrences of each email, and put that data in a database.

In the first line of code you will import a library you need to talk to the database.

The second line of code establishes a connection to the database file emaildb.sqlite.

The third line of code creates a cursor object that allows you to make commands to the database.

email db

Next, you will call the execute method on the cursor. This program will create a new table called “Counts” every time it runs. So, first it will drop the table if it exists, and then it will create a new table.

drop table

Notice how you are using SQL commands in Python to talk to the database.

The next several lines of code are similar to what you learned in the Python dictionary post. Here, you are parsing through the file and pulling out the email address.parse data

 

The next line of code is different. This line uses a technique called parameter substitution.

parameter sub

In this line, you are selecting a row from the table that match the email. The question mark after email serves as a placeholder for a value that will be substituted in. You are substituting in the current value of the email variable.

Next, you will use a “try” statement. You want to use “try” because your program will blow up if no rows were found that matched the email from the previous line. If the email was found, then you advance the count of that email.

In the “except” statement, you are saving the program from blowing up from the email not being found in the table. Rather, you insert this new email into the table, and start the count at 1.

Finally, the conn.commit() statement is very important, because it writes your new changes to the database.

commit

The next thing you will do is run select statement to list the top 10 emails in descending order.

top 10 emails

Before closing out the program, you will loop through the rows and print the columns for each row. It is a good idea to convert your column fields to str type before printing.

print entries

Database Application SQLite and Python

This post will introduce the database application SQLite in relation to Python.

For large project, there are two main roles. One of those roles in the database administrator, who often consults with the developer (the other role). These are rather specialized jobs.

two roles

The administrator talks directly to the database, while the developer talks to indirectly, by way of the application. The following picture illustrates this split between roles, in large projects.

large project structure

However, for smaller projects, one person can handle both roles.

By handling both these roles you will:

  1. Use the database application SQLite to create tables.
  2. Write Python programs to retrieve, clean, and put cleaned data in the tables.
  3. Write another program to pull the cleaned data out and output a nice file.

For now, you will focus on the first step. You will learn how to create a database model, or contract.

database model

There are several common database systems. Oracle dominated this market, mainly because they were the first to embrace the relational model concept. However, there are good alternatives.

common database systems

 

In the context of Python, you will use SQLite. As it turns out, SQLite is quite popular.

SQLite is Popular

SQLite is fast, and is good for smaller amounts of data. Most importantly, the database application SQLite is embedded in Python.

 

Using Databases with Python – A Must Know

It is a good idea to understand the importance of using databases with Python while learning Python. If you are analyzing data, pulling data from over the network, then it makes sense to store that data in a database. You can then set up a process of pulling data from the database as you need it. Overall, it can speed up your workflow.

A good database system to learn using databases with Python is DB Browser for SQLite.

using databases with python

Relational databases comprise a whole sub-field of computer science. They are relevant because you can pull out an entry from huge amounts of data in a split-second. It would take you much longer if you had to read through the data.

relational databases

You can look no further than Oracle to understand the relevance of relational databases. The majority of their revenue comes from database products.

The underlying foundation of databases is rooted in mathematics. This is present in the terminology that experts use to describe databases.

database terminology

The idea behind databases is that you model data at a connection point.

database model

However, programmers tend to think of it in terms of rows and columns.

Typically, when you make a table, the first row becomes metadata for the table. You often use the first row to title what each column is for. Therefore, you can refer to this first row as the schema for the table. It sets the rules for each column with regard to what goes there, for example, a string, an integer, etc.

In the early 1960s, the database pioneers figured out ways to quickly retrieve data from random access memory, without having to go through the data sequentially. However, databases were very complex. As a result, a new component of internet architecture evolved called the database application. This allows you, the programmer, to talk with the database, by way of the database application. At this point, an industry standard was desired for the language for the API between a database and its application. The name of the language the industry agreed on was SQL (Structured Query Language).

metadata

SQL is a great language, but it depends on the data being clean . The nice thing about Python, is it can deal really well with unstructured data. So together, Python and SQL, you have a powerful combination.

SQL

Python Object Oriented Programming

This post will examine Python object oriented programming.

When you think of Python object oriented programming, you should think of it as an orchestration of objects using the capabilities that each object has.

object oriented programming

A function is a bit of code, but an object is a bit of code and data.

python object

Part of the goal in object oriented programming is to take a complex problem and break it into smaller parts. Then you can hide complexity in the smaller parts, which allows you to work on other parts without having to worry about the complexity. Essentially, you want a simple interface, that hides complexity.

In the end, your program begins a network of objects that you orchestrate to get the desired output.

object orchestration