Internet Protocol Layer – Beauty in Simplicity

The Internet Protocol Layer is one layer within the four-layer architecture of the TCP/IP model. This layer is responsible for transmitting packets of information across the network; it has no other concern with the other layers in the model. This narrow focus allows network engineers to deal with a small piece of a very large and complex challenge. It is sometimes referred to as the Internetwork Protocol, because it deals with getting messages from network to network.

A nice feature of IP is that it does not have to be perfect. It is designed so that data can sometimes be dropped, or sent along different paths, but the layers above it compensate and the system ultimately works. This layer had to introduce, and relies heavily on, the address of the destination host. This is what we call the IP address.

The IPv4 address format is four numbers separated by dots, each between zero and 255. The address is broken into two parts: the prefix is the network number, and the second part identifies the computer within that network. For example, a college campus could have one network number, so the prefix in the IP address would be the same for every computer on that campus. When a packet of information comes zooming across the internet for that campus, the routers only worry about the prefix, i.e., the network number. This greatly simplifies the job of the router and allows routers to work very fast. Once a message reaches the destination network, it is up to that network to forward the message on to the correct computer.
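
For a concrete feel of the prefix idea, here is a minimal sketch using Python's ipaddress module. The network number, prefix length, and addresses are made up purely for illustration; a real campus could use any prefix.

```python
import ipaddress

# Assume a campus network 143.235.0.0/16 -- the prefix length and
# all addresses here are hypothetical examples.
network = ipaddress.ip_network("143.235.0.0/16")

packet_destination = ipaddress.ip_address("143.235.14.52")

# A router only asks: does the destination fall within my prefix?
if packet_destination in network:
    print("Forward into the campus network; it will route to the host.")
else:
    print("Forward toward some other network.")
```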

DHCP for Computers that Move Around

Dynamic Host Configuration Protocol (DHCP) is the technology that allows someone to take their laptop to a school, then a coffee shop, and then home, and everything still works. The user can still send messages back and forth regardless of location, because whenever someone opens their computer at a coffee shop, or wherever, the computer sends out a message saying, "Hey, I'm here, please give me a number to use on your network." However, you may have noticed that wherever you are, your IP address starts with 192.168. This is a non-routable private address handed out on the local network; a technology called Network Address Translation (NAT) translates between it and a real, globally routable address at the network's edge. You only ever see the non-routable address, not the unique public address the outside world sees.
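
You can check whether an address is one of these non-routable, private addresses with a couple of lines of Python. The specific addresses below are arbitrary examples:

```python
import ipaddress

addr = ipaddress.ip_address("192.168.1.23")
print(addr.is_private)    # True: 192.168.0.0/16 is a non-routable range

public = ipaddress.ip_address("52.94.236.248")
print(public.is_private)  # False: a routable, globally unique address
```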

Time to Live Saves the Internet Protocol Layer from Infinite Loops

Because routers work imperfectly with imperfect information, they can occasionally send packets of information round and round through the same subset of routers. If this process never stopped, an infinite loop would form. The router mistakenly thinks it is routing the packet correctly; it doesn't know it is looping the packet. This problem gets corrected with a Time to Live (TTL) field inside each packet's header. TTL starts at a number, say 30, and each router the packet passes through subtracts one from the TTL field. If TTL goes down to zero, meaning the packet has made 30 hops, then the packet gets thrown out. When a packet gets thrown out, a notification is sent back to the sending computer to inform it that there was a problem. The computer can then send the packet again until it successfully hops its way across the internet. If the sending computer wants to find out exactly when and where the packet got thrown out, it can run a program called Traceroute to diagnose the problem.
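
Here is a toy simulation of the TTL count-down in Python. It illustrates only the decrement-and-drop behavior, not real router internals:

```python
# A toy packet with a TTL field in its "header".
packet = {"payload": "hello", "ttl": 30}

def forward(packet):
    """One router hop: decrement TTL, drop the packet if it hits zero."""
    packet["ttl"] -= 1
    if packet["ttl"] <= 0:
        print("TTL expired: packet discarded, error sent back to sender")
        return False
    return True

# Imagine a routing loop that would bounce the packet around forever.
hops = 0
while forward(packet):
    hops += 1
print(f"Packet survived {hops} hops before being dropped.")
```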

The simplicity of how routers work is one reason why the TCP/IP model succeeded. Routers don't have to worry about the order of packets, and they don't have to store information; they just forward packets according to their best guess. They don't have to be perfect. This allowed the internet to be scalable and to grow quickly.

Network Infrastructure Evolution

Network infrastructure evolution begins with the store-and-forward networking model. This model was how early internet adopters (1960s–1980s) would send messages back and forth between host computers. While being able to send a message across a network infrastructure was a revolutionary computing breakthrough, big deficiencies did not go unnoticed. With this model, messages were sent one at a time, through a series of hops from one computer to the next. When a message was received by an intermediary computer, it would be stored there and then forwarded on to the next computer once the line was open. A big problem was that a long message would clog the system and drastically slow down the delivery of other messages waiting in the queue. Another problem was that there was no built-in method for dynamically addressing outages in the network.

The idea of packet switching led to a shared network infrastructure.

After more than 20 years of research into the problems of store-and-forward networking, the idea of packets emerged. With packet switching, a message is broken into small packets, which are sent out onto the internet to find their own way. These packets still traverse a series of hops, but because messages are broken into smaller pieces, transmission resources are shared far more fairly. Further, packets of the same message are not required to take the same series of hops to reach their final destination. The packets take whatever path they can, and the destination host knows when all the packets of a message have arrived and how to reassemble them into the complete message.
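
A toy sketch in Python of the reassembly idea: number each packet so the destination can rebuild the message even when packets arrive out of order. The packet size is an arbitrary choice:

```python
import random

# Break a message into numbered packets.
message = "Packets may take different paths across the network."
PACKET_SIZE = 8

packets = [
    (seq, message[i:i + PACKET_SIZE])
    for seq, i in enumerate(range(0, len(message), PACKET_SIZE))
]

random.shuffle(packets)  # simulate out-of-order arrival

# The destination sorts by sequence number and reassembles.
reassembled = "".join(chunk for _, chunk in sorted(packets))
assert reassembled == message
print(reassembled)
```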

This notion of packet switching led to the shared network infrastructure that we use in our TCP/IP networks today. With it, the network of big computers evolved into a shared network of small routers whose main purpose is to forward packets. Moreover, any single router became far less critical than a single computer was in store-and-forward networking. In that model, one computer played a critical role in the reliability of the whole network. With many more routers set up everywhere with the sole purpose of forwarding packets, it no longer mattered much if one router went offline; there would be other paths available for the packet to be routed through.

The TCP/IP layered network model

However, reliability was still a big problem. The way you solve a big problem is to break it down into a set of smaller problems, then focus on solving each smaller problem. Breaking down this problem led to the layered network model. There were several variations as to how many layers the problem got broken into, but the model that became most popular is the TCP/IP (Internet Protocol Suite) model.

The TCP/IP model consists of four layers. They are Application, Transport, Internet, and Link. So to solve the whole problem of internet reliability, you can focus on one layer at a time. Each layer presents a difficult problem in itself, but it is manageable.

When discussing the evolution of our shared network infrastructure, it must be noted that there was also a seven-layer OSI model. The Open Systems Interconnection model competed with the TCP/IP model as the preferred model for building out the internet, but TCP/IP became more popular.

The Digital Certificate – Verifying Identity

The public key cryptosystem provides an effective method for keeping information confidential. A digital certificate provides a way to verify the integrity of your communication. When you go to Amazon's site, how do you know it's really Amazon? You know because of the digital certificate. A digital certificate, also known as an identity certificate or public key certificate, provides a digital signature that binds a public key to a person or organization. The digital certificate is used to verify that a public key belongs to an individual.

A certificate authority is a trusted entity that issues digital certificates. The digital certificate certifies ownership of the public key by the named subject of the certificate.
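
You can see this in action with Python's standard ssl module, which retrieves the certificate a server presents and lets you read who it was issued to and which authority signed it. www.amazon.com is used here merely as an example host:

```python
import socket
import ssl

# Connect to a site over TLS and inspect the certificate it presents.
hostname = "www.amazon.com"
context = ssl.create_default_context()

with socket.create_connection((hostname, 443)) as sock:
    with context.wrap_socket(sock, server_hostname=hostname) as tls:
        cert = tls.getpeercert()

print("Subject:", cert["subject"])  # who the certificate was issued to
print("Issuer:", cert["issuer"])    # the certificate authority that signed it
```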

Who is the certificate authority? Is it a government agency? Can anyone become a certificate authority?

The certificate authority is a trusted third party. Some are better (more trusted) than others. It can range in price from hundreds to thousands of dollars per year to have your public key verified by a certificate authority. Verisign is one of the oldest and most trusted certificate authorities; it is also one of the most expensive.

The idea is that if you have an online business wherein you would handle people’s sensitive information, you want to be viewed as trustworthy and respectable. The stamp of a credible certificate authority provides this trust and respect for your customers. The certificate authority has lots of responsibility on their end of the bargain. They want to protect their track record of successfully verifying identity. If they make a mistake, then all credibility could be lost, and the public may no longer choose to use them as a trusted authority.

Knowing the Certificate Authorities to Trust

Chances are, the operating system of your computer comes with a pre-installed list of certificate authorities and their digital certificates. Naturally, Verisign is one of the authorities probably included in that pre-installed list. Thus, the operating system (Windows, for example) has placed lots of trust in the authorities on this list. In essence, all the respect you have for Verisign comes from their proven ability to verify the integrity of public keys.
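
You can peek at this pre-installed list yourself. Here is a minimal sketch with Python's ssl module, assuming your platform exposes its trust store this way; what it prints depends on your operating system:

```python
import ssl

# Ask Python for the certificate authorities in the system trust store.
context = ssl.create_default_context()

for ca in context.get_ca_certs()[:5]:  # peek at the first few entries
    print(ca.get("subject"))
```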

Public Key Encryption – An Issue of Confidentiality

Confidentiality is hiding information. Integrity is knowing identity. Public key encryption focuses on confidentiality of information transmitted over the internet.

The problem with the method discussed in the previous lesson, of having a single secret key to encrypt and decrypt data, is that the two (or more) intended parties need to share that secret key. The internet is not a safe place to transmit a secret key: you have to assume there are many hackers and eavesdroppers waiting to steal your secrets.

How do we overcome this problem?

Should companies like Amazon send a FedEx package to you (and all their customers) containing a unique secret key? What if you lose your secret key?

Fortunately, companies like Amazon do not have to, thanks to the idea of public key encryption. Public key encryption was proposed by Whit Diffie, Martin Hellman and Ralph Merkle in 1976. The idea gained popularity because of its elegance.

Public key encryption is an asymmetric-key cryptosystem, meaning it does not rely on the same key to encrypt and decrypt the message. It has a public key that does not need to be kept secure; the public key is what goes out onto the internet to encrypt data. The private key rests on the user's computer, and it decrypts the data. These two keys have a mathematical relationship that is well understood but, if the key lengths are long enough, very difficult to reverse.

Interestingly enough, public key encryption was initially met with rejection. However, as its elegance became obvious, it quickly became the standard.

It is not impossible to break the mathematical relationship between the two keys, but it is extremely difficult and requires vast amounts of computing power. An entity such as the CIA, for example, could conceivably break the relationship between the two keys if it really wanted to. This tells us that perfect security is really impossible; public key encryption is accepted because there is simply not enough computing power to make attacking the cryptosystem practical. The only real method of attack is brute force, and as computing power increases, you can just increase the length of the keys.

Generating Public & Private Key Pairs

  1. You start by randomly selecting two very large prime numbers.
  2. You multiply those two numbers.
  3. A series of computational steps follows that generates the public and private key pair (a toy walk-through is sketched below).
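
As a hedged illustration of these three steps, here is a toy walk-through using the RSA construction, the best-known cryptosystem built this way. The primes are tiny and offer no security whatsoever; real keys use primes hundreds of digits long:

```python
from math import gcd

p, q = 61, 53                 # step 1: two "large" primes (toy-sized here)
n = p * q                     # step 2: multiply them (n = 3233)

# step 3: the computational steps that yield the key pair
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, chosen coprime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)           # private exponent: modular inverse (Python 3.8+)

public_key, private_key = (e, n), (d, n)

message = 42
ciphertext = pow(message, e, n)    # encrypt with the public key
recovered = pow(ciphertext, d, n)  # decrypt with the private key
assert recovered == message
print(ciphertext, recovered)
```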

The essence lies in the fact that recovering the two very large prime numbers from their product is akin to searching for a needle in a haystack.

Some functions are easy to compute in both directions; other functions, not so much.

For example:

What are the factors of 55,124,159? That’s a hard question.

What do you multiply 7919 by to get 55,124,159? That’s an easy question.

The receiver of a public key cryptosystem knows half the equation. The rest of the world does not. It is nearly impossible to decrypt without knowing this half of the equation. Not impossible, but nearly impossible.
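
You can feel this asymmetry in a few lines of Python: finding a factor takes a search, while confirming one takes a single division. For real key-sized numbers, hundreds of digits long, the search side becomes utterly infeasible:

```python
n = 55_124_159

# Hard direction: hunt for a factor by trial division.
factor = next(k for k in range(2, n) if n % k == 0)
print(factor, n // factor)  # 6961 x 7919

# Easy direction: if you already hold one factor, one division finishes it.
print(n // 7919)            # 6961
```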

Take for example, you want to buy something on Amazon. You will have to give Amazon your credit card information. Amazon has a public key and a private key. The public key travels out across the insecure internet to your computer, where it encrypts your credit card information; the encrypted data then travels back across the insecure internet to Amazon.

You have to assume that eavesdroppers and hackers are out there stealing this information. However, that's okay, because they don't have the private key. They don't have the half of the equation needed to decrypt. For now, it's also safe to assume they don't possess the computing power for the brute force required to decrypt it.

Amazon is in sole possession of its private key, so it is very easy for Amazon to decrypt your credit card information. Amazon has the other half of the equation.

Keeping with this notion, you need a mini layer within the main network layers of the internet. This spawned a change to HTTP: a Secure Sockets Layer (SSL) was nested between the transport and application layers of our network architecture.

This underscores the beauty of layered network architecture. No change was required to the layers below this new SSL mini layer. Further, there is really no danger in the fact that hackers can see your encrypted data.

As long as your computer does not have some type of malware or virus, and as long as Amazon does not let its data fall into the wrong hands, the public key cryptosystem works very well.

SSL's modern successor is Transport Layer Security (TLS); HTTP carried over SSL/TLS is what you see as HTTPS.

Do you see the difference between these two URLs?

https://www.amazon.com OR http://www.amazon.com

Never type in sensitive information, credit card numbers, or passwords on a page without the ‘https’ in the URL.

All of the measures described here practically ensure the confidentiality of your information across the internet. The next part is to focus on how you can trust the integrity of who you are sharing the information with.

Cryptographic Hash Function – Explained

A cryptographic hash function is a technique that takes text of any length and reduces it down to a fixed length of data. This small fixed value is called the digest, or hash value. The hash function that reduces the message to a digest is wholly rooted in mathematics; in fact, there is an entire field of mathematics devoted to understanding how to make good hash functions.

A key thing to remember is that the message can be any length, but the cryptographic hash function will always return a fixed length of data as the digest. For example, one message could be 11 characters and another 121 characters in length, but the hash function would return, in both cases, a digest of the same fixed length, say 30 characters. The digest will be different for each message, but a good cryptographic hash function always returns the same length of characters.
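
A quick demonstration with Python's hashlib, using SHA-256, whose digest is 64 hex characters (rather than the hypothetical 30-character digest above):

```python
import hashlib

# Messages of very different lengths...
short_message = b"hello"
long_message = b"hello" * 1000

# ...always hash to digests of the same fixed length.
for msg in (short_message, long_message):
    digest = hashlib.sha256(msg).hexdigest()
    print(len(digest), digest)
```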

Hash function digest examples: the messages can be different lengths, but the hash function always returns a digest of the same length.

If two different messages return the same digest, this is called a collision, and the cryptographic hash function is deemed to be bad.

A respectable system, such as your online credit card account, should never store your password in its database. Your credit card company may instead apply a cryptographic hash function to your password and store only the digest. This way, if their database is hacked, the hacker will not see your password; rather, they will see only the cryptographic hash.

Further, a respectable cryptosystem will never send you your password, because it doesn't have it. It only has the hash, and you cannot reverse a cryptographic hash function; it only works one way. That is why good systems send you a reset link when you lose your password. (Hint: you may want to avoid websites that send you your actual password when you lose it, rather than a reset link.)
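
Here is a minimal sketch of storing only a salted digest, using Python's standard library. Real systems often use dedicated password hashes such as bcrypt or scrypt; the salt size and iteration count below are illustrative choices:

```python
import hashlib
import hmac
import os

def store_password(password: str):
    """Keep only a salted digest in the database, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-hash the attempt and compare it to the stored digest."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)

salt, digest = store_password("hunter2")
print(check_password("hunter2", salt, digest))  # True
print(check_password("guess", salt, digest))    # False
```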

SHA-1 is a well-known cryptographic hash function. It is no longer considered collision-resistant (practical collisions have been demonstrated), so modern systems prefer successors such as SHA-256.

Digital Signatures Provide Message Integrity

You can use a cryptographic hash function to build a digital signature that verifies the integrity of a message. That is, you can verify the message actually came from the person you think it did.

While hashing passwords provides integrity in verifying the identity of a user, think of a digital signature as providing integrity in verifying the identity of a message sender. For example, suppose your doctor writes a prescription by email, allowing you to print the email and take it to your pharmacy. The integrity of the digital signature should verify that the email actually came from your doctor, and that the prescription was not modified in transit over the internet.

A Simple Example of Digital Signature Integrity

  1. The sender and receiver are the only two people who know a secret.
  2. The sender writes a message and adds on the secret. A cryptographic hash function is used to form a digest of the message+secret.
  3. The message+digest is sent. (Remember the digest is a hash of the message+secret).
  4. The receiver takes the message+digest, removes the digest, and adds the secret. Now the receiver has the message+secret, and keeps the digest for comparison.
  5. The receiver runs the same cryptographic hash function on message+secret, which returns a digest.
  6. The receiver now compares this digest with the original digest it removed from the message+digest transmission.
  7. If it's a match, then the integrity of the digital signature is verified. Of course, this only works if the secret is not compromised.

The integrity only works if the sender and receiver can keep a secret!
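
Here is a minimal sketch of steps 1 through 7 in Python. Real systems use the standardized HMAC construction rather than a bare hash of message+secret, but the idea is the same; the secret and message below are made up:

```python
import hashlib

secret = "only-sender-and-receiver-know-this"  # step 1: the shared secret

def make_digest(message: str) -> str:
    """Hash the message with the secret appended (steps 2 and 5)."""
    return hashlib.sha256((message + secret).encode()).hexdigest()

# Sender side: transmit the message along with the digest (step 3).
message = "Take one tablet daily."
transmission = (message, make_digest(message))

# Receiver side: recompute the digest and compare (steps 4-7).
received_message, received_digest = transmission
if make_digest(received_message) == received_digest:
    print("Integrity verified: sent by someone who knows the secret.")
else:
    print("Digest mismatch: message altered or sender is an impostor.")
```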

While the secret key technique is valid, it’s not practical for the internet because there is not a secure way to distribute keys. The internet is an insecure medium. This fact underscores the challenge of internet security.

If this post was interesting to you, then you may be interested to read how the pioneers of internet security were able to solve the security issue by developing the concept of public-key encryption.