Cryptography as a Solution – Using Advanced Techniques for Data Protection

Introduction to Data Protection

In the world of information security it is highly advised to implement security solutions in layers. Solutions such as authentication, authorization, input validation and others help us maintain order and security when dealing with access to data.

It is important to note that these techniques do not help with the data itself. For example, if a malicious user gains access to the data store directly (such as the database) and the data is kept as clear text, the attacker can read confidential data, modify it, or render it unavailable.

In order to protect data, we use security models such as CIA (also known as the CIA triad). The CIA triad – Confidentiality, Integrity and Availability (or, in other versions, Authenticity) – specifies that data must be (according to the company’s policy):

Confidential – read by authorized users or systems only.
Integrity – edited/modified by authorized users or systems only.
Available – accessible to authorized users or systems.

Additional specifications include:

Message origin authentication – the ability to positively identify the origin of the message.
Non-repudiation – the inability of a user or system to deny having sent the message.

It is common to use techniques such as encryption (for confidentiality) and hashing (for integrity) to protect the data, as well as SSL/TLS to protect data in transit.
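As a minimal sketch of how hashing can support an integrity check, the snippet below stores a SHA-256 digest of a record and later compares it against a freshly computed one. The record contents and the function names are hypothetical, chosen only for illustration. A plain hash only helps if the stored digest itself is kept somewhere the attacker cannot modify.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, expected_hex: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_hex)


# Hypothetical record and the digest saved when it was written.
record = b"balance=100"
stored_digest = sha256_hex(record)
```

Calling is_unmodified(record, stored_digest) succeeds for the original record and fails as soon as a single byte of the record changes.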

However, sometimes just using standard data protection techniques will not be enough. They might not provide a proper solution to a specific problem.

Introduction to Cryptography, Encryption and Cryptanalysis

*You can skip this part if you are familiar with basic cryptography techniques.

In order to fully understand some advanced cryptography-based solutions, we must first have an idea about what cryptography is and how it relates to the CIA triad.

Cryptography is the science of turning information from a readable format into a non-readable format and vice versa using complicated mathematical algorithms (for more information regarding the mathematics of encryption: https://www.coursera.org/learn/crypto).

The data or information in its readable format is called “plain text” or “clear text”, and in its unreadable format it is called “encrypted text” or “cipher text”. The process of turning the former into the latter is called “encryption”, and the process of turning the encrypted text back into its plain text format is called “decryption”.

In order to do so, we use an encryption algorithm (AKA a cipher) with a key or set of keys.

One question that may come to mind is: “how can we be sure that the encrypted/cipher text cannot be turned back into plain text by an unauthorized entity?”

The reality is that with enough time and processing power, almost any encrypted text can be cracked (and quantum computing is expected to shorten the cracking process significantly for some widely used algorithms – https://www.tripwire.com/state-of-security/featured/will-quantum-computers-threaten-modern-cryptography/).

However, strong ciphers are resistant to cryptanalysis and support large keys, which makes such an attack impractical with the time and processing power available today.

Generally, encryption algorithms are divided into two main techniques: symmetric and asymmetric encryption.

Symmetric encryption algorithms (for example: AES) use a key (or set of keys, AKA “secret key”/“shared secret”) for both encryption and decryption. One of the main advantages of symmetric encryption is performance, which is better than that of asymmetric encryption. However, a common problem is key secrecy: if the key is stolen, the encryption can be reversed with very little effort.
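The defining property of symmetric encryption, one key for both directions, can be illustrated with a toy example. The sketch below is NOT AES: it is a simple XOR with a random key as long as the message (a one-time pad), used here only because it fits in a few lines of standard-library Python. The function name and the sample message are assumptions for illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with key. The same call both encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"attack at dawn"                 # hypothetical message
key = secrets.token_bytes(len(plaintext))     # the shared secret
ciphertext = xor_bytes(plaintext, key)        # encryption
recovered = xor_bytes(ciphertext, key)        # decryption with the SAME key
```

Anyone who obtains the key can run the last line, which is exactly the key secrecy problem described above.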

Asymmetric encryption algorithms (for example: RSA) use a pair of keys (usually called the public and private keys) for encryption and decryption. Each key is one-directional: data encrypted with one key can only be decrypted with the other. Normally, the public key (which is sent to an external entity such as a client) is used for encryption and the private key (which is kept safely on the server) is used for decryption.
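To make the public/private split concrete, here is a textbook RSA sketch with deliberately tiny primes. It is for illustration only: real RSA uses keys of 2048 bits or more together with a padding scheme, and the function names below are hypothetical. It requires Python 3.8 or later for the modular inverse form of pow().

```python
# Toy textbook RSA with tiny primes. Illustration only, never use as-is.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (modular inverse of e)

public_key = (e, n)            # may be handed to the client
private_key = (d, n)           # stays on the server


def encrypt(message: int, key: tuple) -> int:
    """Encrypt an integer smaller than n with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key: tuple) -> int:
    """Decrypt with the private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


cipher = encrypt(65, public_key)
plain = decrypt(cipher, private_key)
```

Knowing only (e, n) is enough to encrypt, while decryption needs d, which is why the public key can be distributed freely.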

Two of the main problems that occur in cryptography are key secrecy and secure key generation.

In this article, we will discuss some scenarios in which such problems can occur, a few possible solutions and code samples in C#, Java and Python and in some technologies such as Android mobile applications, WCF and more.

Password-Based Key Derivation

Password-Based Key Derivation is a method in which a secret key is created, usually by combining a password, a salt and a number of iterations. It is usually a technique that helps overcome the problem of secret key storage for symmetric encryption.
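A minimal sketch of this idea using PBKDF2 from Python's standard library is shown below. The iteration count, salt size, key length and sample password are assumptions chosen for illustration and should be tuned to the actual policy and hardware.

```python
import hashlib
import secrets


def derive_key(password: str, salt: bytes,
               iterations: int = 600_000, length: int = 32) -> bytes:
    """Derive a key of `length` bytes with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=length
    )


# The salt is random and not secret; it is stored next to the encrypted data.
salt = secrets.token_bytes(16)
key = derive_key("correct horse battery staple", salt)  # hypothetical password
```

The same password, salt and iteration count always yield the same key, so the key itself never has to be stored; only the salt and the iteration count do.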

One of the main drawbacks of symmetric cryptography is that the same key is used for both encryption and decryption. Storing the key in a location that an attacker can reach (e.g. hardcoding it in the client application or storing it in a file in the client application solution) could therefore jeopardize the confidentiality of the application.

Password-Based Key Derivation (PBKD) essentially gives us the ability to create keys that can be used for confidentiality and data integrity (e.g. encryption and keyed hashing), usually to overcome difficult key storage. It moves the key from the “something you have” category to the “something you know” category, or adds the latter to the former.
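As one possible illustration of keyed hashing with a derived key, the sketch below derives a key with PBKDF2 and uses it to compute and verify an HMAC-SHA256 tag. The function names, the fixed demo salt, the iteration count and the message are all assumptions made for the example; in practice the salt would be random and stored with the data.

```python
import hashlib
import hmac


def derive_key(password: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def make_tag(message: bytes, key: bytes) -> bytes:
    """Compute a keyed hash (HMAC-SHA256) over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(message: bytes, key: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(message, key), tag)


demo_salt = b"demo-salt-16byte"                 # fixed only for this example
demo_key = derive_key("s3cret passphrase", demo_salt)
demo_tag = make_tag(b"transfer 10 to bob", demo_key)
```

Without the password an attacker cannot recompute a valid tag for a modified message, so verify_tag fails on tampered data.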

Note that since this solution is based on passwords, secure password management is needed (For more information on the subject: https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet and https://www.owasp.org/index.php/Authentication_Cheat_Sheet).

If the password must be hardcoded, we can store it in a byte array/encrypted byte array to make it harder for an attacker to obtain the password.

The PBKD makes use of a password and a salt. In some scenarios, the encryption is done by combining the result of the password, salt and a master key/secret to create the entire key.

SSL/Certificate Pinning – Introduction

Before we talk about SSL/certificate pinning, scenarios and implementations, we first must understand:

What is SSL?
What protection does SSL offer?
What are the main threats that SSL protects against?
What are the limitations of SSL?

A Little About SSL

SSL/TLS (Secure Sockets Layer/Transport Layer Security) are cryptographic protocols that provide transport security (i.e. data that is transmitted between the server and the client is encrypted).

One of the main features that SSL/TLS also provides is a form of identification.

This identification is done by presenting a digital certificate.

A digital certificate is a piece of information that helps to identify an entity by presenting it to the recipient. In the SSL/TLS case, the server identifies itself to the client during the SSL handshake.

The digital certificate contains information that can be used for identification (such as the issuer, the subject and a public key).

The client then decides whether to complete the handshake or abort it.

(The server normally presents the certificate to the client, and the client has to either allow or deny it).
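The client side of this exchange can be sketched with Python's standard ssl module. The function names are hypothetical; the sketch assumes the default context, which requires a certificate from the server and checks it against the system's trusted store and the requested host name. If the check fails, the handshake is aborted with an exception.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client context that requires and verifies a server certificate."""
    return ssl.create_default_context()


def fetch_server_certificate(host: str, port: int = 443,
                             timeout: float = 5.0) -> dict:
    """Complete a TLS handshake and return the certificate the server presented."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # wrap_socket performs the handshake; it raises ssl.SSLError
        # (the client "denies" the certificate) when validation fails.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()
```

The returned dictionary contains fields such as subject, issuer, notBefore and notAfter, which correspond to the identification data described above.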

Accept Self-Signed Certificates/”Trust the certificate”

A self-signed certificate is a digital certificate that is signed by the entity that created it rather than by a Certificate Authority (will be explained shortly). Technically, any entity can create its own digital certificate by using tools such as Makecert.exe and OpenSSL binaries.

This method is normally used during development or in a small organization that does not have a Certificate Authority.

In this case, the user of the client application has to import the certificate directly to the certificate store prior to the handshake.

CA Signed Certificate/Chained Certificate

A Certificate Authority is essentially a service that can issue a signed certificate to an entity to help it identify itself.

If the recipient has the certificate of the Certificate Authority in its trusted store, any valid certificate that is signed by it will be accepted automatically.

There are 2 categories of a Certificate Authority (CA):

Root CA – A Certificate Authority that is globally known and trusted. Some companies around the world provide this service, such as Comodo, Verisign, Godaddy and more.
Intermediate/subordinate CA – A Certificate Authority which is not root and is known and trusted within a confined area. There could be multiple intermediate CAs for any company.

A certificate that is signed by an intermediate and/or by a root CA is a Chained/Signed Certificate.

What does SSL protect from?

SSL/TLS is used to protect the CIA of information against sniffing and Man in the Middle (MitM) attacks.

Typically, an attacker who wants to view information that is passed between a client and a server will use a network analyzer tool (AKA a “Sniffer”) such as Wireshark (https://www.wireshark.org/).

Another technique that can help an attacker view and modify the information is to use a proxy. A proxy is a tool that is typically used by IT to avoid data leakage.

The SSL client connects to the proxy which then relays the message to the intended recipient.

The SSL handshake takes place between the client and the proxy and therefore the proxy views the information in clear text (for example: by illegitimate use of Burp Suite (https://portswigger.net/burp/) or Fiddler (http://www.telerik.com/fiddler) for HTTP or by using AppSec Labs Advanced Packet Editor (https://appsec-labs.com/advanced-packet-editor/ – which is based on Packet Editor (http://www.packeteditor.com/)) or Java-based ProKSy (https://appsec-labs.com/proksy/) for normal TCP communication or by using AppSec Labs WCF Toolkit (https://appsec-labs.com/wcf-proxy/) for WCF based applications).

Limitations of SSL

There is, however, one very important thing that we need to understand.

Even though SSL identifies the server, it does not guarantee that the server is in fact the intended server (i.e. the server that was originally intended to be contacted).

Finally, SSL/Certificate Pinning

Since the default mechanism for certificate validation is to accept any valid and signed certificate, the goal of SSL/Certificate pinning is to only accept specific digital certificates (one or more) and therefore to overcome this limitation.

The process of pinning a certificate (i.e. only accepting a single or some certificates) can be done as an extension of a certificate verification process (e.g. SSL certificate verification, code signing verification and general client/server identity verification) or even replace it.

Note that the SSL/certificate pinning is only an additional security step that should be combined with other security mechanisms such as obfuscation and anti-debugging.

Furthermore, we must make sure that the standard validation checks are performed:

Path validation check – verifies that all the signatures on the certificates in the chain are valid according to the required chain and order.
Validity check – makes sure that the time at which the validation takes place is between the notBefore and notAfter values of each certificate.
Revocation status check – makes sure that none of the certificates is marked as revoked by a CA. (For more info: https://blogs.msdn.microsoft.com/ieinternals/2011/04/07/understanding-certificate-revocation-checks/)
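Of the three checks above, the validity check is the simplest to sketch. The example below assumes a certificate in the dictionary form returned by Python's ssl getpeercert(), with notBefore and notAfter strings; the function name and the sample dates are hypothetical. Standard TLS libraries already perform this check during the handshake, so the code only illustrates the logic.

```python
import ssl
import time
from typing import Optional


def is_within_validity(cert: dict, now: Optional[float] = None) -> bool:
    """Return True if `now` lies between the certificate's notBefore and notAfter."""
    if now is None:
        now = time.time()
    # cert_time_to_seconds parses strings such as "Jan  5 09:34:43 2018 GMT".
    not_before = ssl.cert_time_to_seconds(cert["notBefore"])
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    return not_before <= now <= not_after


# Hypothetical certificate fields for illustration.
sample_cert = {
    "notBefore": "Jan  1 00:00:00 2020 GMT",
    "notAfter": "Jan  1 00:00:00 2021 GMT",
}
```

In a chain, the same test would be applied to every certificate, not only to the server's own.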

In this procedure, the client (or in some cases, the server) will test certain components of the presented certificate against a white list to determine whether to allow or deny it.

The type of information that is tested dictates whether the pinning is strong or weak, according to the ability of a malicious user to forge the same certificate.
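One way to sketch such a white list test is to pin the SHA-256 fingerprint of the whole certificate, on top of the standard validation performed by the TLS library. The function names and the idea of passing the pins as a set of hex strings are assumptions for this example, not a prescribed implementation.

```python
import hashlib
import hmac
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint (hex) of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pins) -> bool:
    """Return True if the certificate's fingerprint is in the white list."""
    actual = fingerprint(der_cert)
    return any(hmac.compare_digest(actual, pin) for pin in pins)


def connect_pinned(host: str, pins, port: int = 443,
                   timeout: float = 5.0) -> ssl.SSLSocket:
    """Open a TLS connection, then reject it unless the certificate is pinned."""
    context = ssl.create_default_context()      # standard checks still apply
    raw_sock = socket.create_connection((host, port), timeout=timeout)
    try:
        tls_sock = context.wrap_socket(raw_sock, server_hostname=host)
    except Exception:
        raw_sock.close()
        raise
    # binary_form=True returns the presented certificate as DER bytes.
    if not is_pinned(tls_sock.getpeercert(binary_form=True), pins):
        tls_sock.close()
        raise ssl.SSLError("certificate does not match any pinned fingerprint")
    return tls_sock
```

Pinning the full certificate is strict: the pin list has to be updated whenever the server's certificate is renewed, which is one reason some deployments pin the public key or a CA certificate instead.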

The pinning itself can be on the actual certificate and/or the CA certificates.
