Learn the basic concepts of digital signatures and PKI (Public Key Infrastructure): public keys, private keys, digital certificates, certification authorities, certification chains, keystores, and so forth. This article includes useful information for non-Java developers also.
This series of articles familiarizes the reader with the problems of digitally signing documents in Java-based Web applications and suggests specific approaches to solving them. A fully functional, open-source framework is presented for digitally signing documents in the client's Web browser and for verifying signatures, certificates, and certification chains.
Part 1 introduces the basic concepts of digital signatures and PKI (Public Key Infrastructure): public keys, private keys, digital certificates, certification authorities, certification chains, keystores, and so forth.
Part 2 describes the procedures and algorithms for digitally signing documents and digital signature verification.
Part 3 introduces the class libraries for working with digital signatures and certificates on a Java 2 platform and gives a short description of the most important classes and interfaces from Java Cryptography Architecture (JCA) and Java Certification Path API that concern the use of digital signatures and certificates.
Part 4 analyzes the most essential problems connected with digitally signing documents in Web-based systems and suggests a particular solution for them. It motivates the need for a digitally signed Java applet that is integrated with the Web application and signs the files on the client's machine before uploading them to the server. The problems related to signing Java applets and to interoperability between applets and Web browsers are examined. The mechanisms for verification of digital signatures, certificates, and certification chains and the possibilities for their particular application are also discussed and analyzed.
Part 5 proposes the NakovDocumentSigner system to give the developers a fully functional framework for digitally signing documents in the client's Web browsers and verifying signatures, certificates, and certification chains on the server side. The system consists of a Java applet for digitally signing and a reference J2EE Web application for signatures and certificates verification. It demonstrates how the Java Cryptography Architecture and Java Certification Path API can be applied to provide the Web applications with digital signature functionality. The full source code of the framework is included and discussed.
Part 1. Basic Concepts Related to Digital Signatures
When transferring important documents electronically, it is often necessary to certify in a reliable way who is actually the sender (author) of a given document. One approach for certifying the origin of documents and files is by using the so-called digital signature (electronic signature).
The digital signing of documents uses public key cryptography as a mathematical base.
Public Key Cryptography
Public key cryptography is a mathematical science used to provide confidentiality and authenticity in information exchange by using cryptographic algorithms that work with public and private keys. These cryptographic algorithms are used for digitally signing documents, verifying digital signatures, and encrypting and decrypting documents.
The public and private keys are a mathematically bound cryptographic key pair (public/private key pair). To each public key corresponds exactly one private key and vice versa; to each private key corresponds exactly one public key. To use public key cryptography, one must have a public key and its corresponding private key.
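As a minimal sketch of what such a key pair looks like in Java, the following class generates an RSA pair with the standard KeyPairGenerator API. The choice of RSA, the 2048-bit key size, and the class name KeyPairDemo are assumptions made for illustration only.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.NoSuchAlgorithmException;
import java.security.interfaces.RSAPrivateKey;
import java.security.interfaces.RSAPublicKey;

// Hypothetical demo class; the name and the key size are illustrative assumptions.
public class KeyPairDemo {

    /** Generates a fresh RSA public/private key pair. */
    public static KeyPair generateRsaKeyPair() throws NoSuchAlgorithmException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        KeyPair pair = generateRsaKeyPair();
        RSAPublicKey publicKey = (RSAPublicKey) pair.getPublic();
        RSAPrivateKey privateKey = (RSAPrivateKey) pair.getPrivate();

        // For RSA, the mathematical bond between the two keys is visible
        // in the modulus, which both keys share.
        boolean sameModulus = publicKey.getModulus().equals(privateKey.getModulus());
        System.out.println("Keys share a modulus: " + sameModulus);
        System.out.println("Modulus size in bits: " + publicKey.getModulus().bitLength());
    }
}
```

The public half can be handed out freely, while the private half would normally go straight into a protected keystore.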
The public key is a number (sequence of bits), which is usually bound to a person. A public key can be used to check digital signatures, created with the corresponding private key, as well as for encrypting documents that can then be decrypted only by the owner of the corresponding private key. The public keys are not secret to anybody and are usually publicly available. The public key of a given person must be known to anyone communicating with the person using public key cryptography.
The private key is a number (sequence of bits), known only to its owner. With his or her private key, a person can sign documents and decrypt documents that were encrypted with the corresponding public key. To a certain extent, private keys resemble the well-known access passwords, which are a widespread authentication method on the Internet. The similarity is that with a private key, as with a password, a person can prove his or her identity, that is, authenticate himself or herself. In addition, as with passwords, private keys are meant to be secret to all but the owner. In contrast to passwords, private keys are too long to be memorized, and therefore storing them requires special care. If a private key falls into the hands of someone other than its owner (that is, if the key is stolen), all communication that relies on this private key can no longer be trusted. In such cases, the stolen key must be declared invalid and replaced before secure communication with the owner of the key becomes possible again.
Public key cryptography relies on cryptographic algorithms for which it is practically impossible, with contemporary mathematics and current computing machinery, to find a person's private key from his or her public key. Finding the private key that corresponds to a given public key is possible in theory, but the time and computing power required make such attempts pointless. For the same reason, it is practically impossible to sign a document without knowing the private key of the person who signs it, or to decrypt a document that was encrypted with a person's public key without knowing the corresponding private key. The science dealing with breaking cryptographic keys and codes is called cryptanalysis.
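The encryption side of this can be sketched in Java with the standard Cipher API: anyone holding the public key can encrypt, but only the holder of the matching private key can decrypt. The class name, the RSA algorithm, and the OAEP padding scheme below are illustrative assumptions. RSA can only encrypt short messages directly, so real systems usually encrypt a symmetric key this way rather than a whole document.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical demo class; algorithm and padding are illustrative assumptions.
public class PublicKeyEncryptionDemo {

    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    /** Encrypts a short message so that only the owner of the matching private key can read it. */
    public static byte[] encrypt(PublicKey recipientKey, byte[] plaintext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, recipientKey);
        return cipher.doFinal(plaintext);
    }

    /** Decrypts a message; fails with an exception if the private key does not match. */
    public static byte[] decrypt(PrivateKey ownerKey, byte[] ciphertext) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, ownerKey);
        return cipher.doFinal(ciphertext);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair owner = generator.generateKeyPair();

        byte[] ciphertext = encrypt(owner.getPublic(), "confidential".getBytes(StandardCharsets.UTF_8));
        byte[] recovered = decrypt(owner.getPrivate(), ciphertext);
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```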
Digital signing is a mechanism for certifying the origin and the integrity of electronically transmitted information. In the process of digitally signing, additional information, called a digital signature, is added to the given document. It is calculated from the contents of the document and a private key. At a later stage, this information can be used to check the origin of the signed document.
The digital signature is a number (sequence of bits), calculated mathematically when signing a given document (message). This number depends on the contents of the message, the algorithm used for signing, and the private key used to perform the signing. The digital signature allows the recipient to check the actual origin of the information and its integrity.
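A minimal sketch of signing and verifying with the standard java.security.Signature class is shown below. The SHA256withRSA algorithm and the class name SignatureDemo are assumptions chosen for illustration. The example also shows that changing the document invalidates the signature.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical demo class; the signature algorithm is an illustrative assumption.
public class SignatureDemo {

    private static final String ALGORITHM = "SHA256withRSA";

    /** Calculates a digital signature from the document contents and a private key. */
    public static byte[] sign(PrivateKey privateKey, byte[] document) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(ALGORITHM);
        signer.initSign(privateKey);
        signer.update(document);
        return signer.sign();
    }

    /** Checks a signature against the document contents and the signer's public key. */
    public static boolean verify(PublicKey publicKey, byte[] document, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance(ALGORITHM);
        verifier.initVerify(publicKey);
        verifier.update(document);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        byte[] document = "Pay 100 EUR to Alice".getBytes(StandardCharsets.UTF_8);
        byte[] tampered = "Pay 900 EUR to Alice".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(pair.getPrivate(), document);

        System.out.println("Original verifies: " + verify(pair.getPublic(), document, signature));
        System.out.println("Tampered verifies: " + verify(pair.getPublic(), tampered, signature));
    }
}
```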
Public Key Infrastructure and Digital Certificates
The Public Key Infrastructure (PKI) provides the architecture, organization, techniques, practices, and procedures that support, by means of digital certificates, the application of public key cryptography for secure information exchange over insecure networks and transmission media. For the issuance and control of such digital certificates, the Public Key Infrastructure relies on the so-called certification authorities, which make trust possible between parties that do not know each other but participate in secured communication based on public and private keys.
Digital certificates bind a particular public key to a particular person. They are issued by a special kind of authority (certification authorities) under strict security precautions, which guarantee their authenticity. We can think of digital certificates as electronic documents certifying that a given public key is the property of a given person. In practice, X.509 certificates are the most widely used for the purposes of digital signatures.
X.509 is a widely accepted standard for digital certificates. An X.509 digital certificate contains the public key of a given person, identifying data about this person (name, organization, and so on), information about the certification authority that has issued the certificate, validity period information, information about the cryptographic algorithms used, and various other details.
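The fields listed above map directly onto getters of the standard java.security.cert.X509Certificate class. The following sketch loads a certificate from a stream and prints its main fields. The class name CertificateInspector and the file path passed on the command line are hypothetical.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.cert.CertificateException;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

// Hypothetical helper class for illustration.
public class CertificateInspector {

    /** Parses one X.509 certificate from a stream. */
    public static X509Certificate load(InputStream in) throws CertificateException {
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        return (X509Certificate) factory.generateCertificate(in);
    }

    /** Builds a readable summary of the main fields of a certificate. */
    public static String describe(X509Certificate cert) {
        return "Subject: " + cert.getSubjectX500Principal().getName() + "\n"
                + "Issuer: " + cert.getIssuerX500Principal().getName() + "\n"
                + "Serial number: " + cert.getSerialNumber().toString(16) + "\n"
                + "Valid from: " + cert.getNotBefore() + "\n"
                + "Valid until: " + cert.getNotAfter() + "\n"
                + "Public key algorithm: " + cert.getPublicKey().getAlgorithm() + "\n"
                + "Signature algorithm: " + cert.getSigAlgName();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is assumed to be the path of a certificate file.
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(describe(load(in)));
        }
    }
}
```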
The certification authority (CA) is an institution entitled to issue digital certificates and to sign them with its own private key. The purpose of the certificates is to confirm that a given public key is property of a given person, and the purpose of the certification authorities is to confirm that the given certificate is valid and can be trusted. In this sense, the certification authorities are an unbiased trusted third party that provides for a high degree of security in the computer-based information exchange. If a certification authority has issued a digital certificate to a given person and has signed that this certificate really belongs to the person, we can believe that the public key in the certificate does in fact belong to the person, provided we trust the certification authority.
Depending on the necessary security level, certificates with different levels of trust are used. For the issuance of some kinds of certificates, only the owner's e-mail address is needed, while the issuance of others requires the personal presence of the owner, who inks his or her signature on paper-based documents in some office of the certification authority.
Not all certification authorities can be trusted, because malicious people can present themselves as a certification authority that does not really exist or is fake. To be trusted, a certification authority has to be globally recognized and approved. In the world of digital security, the approved certification authorities follow very strict policies and procedures for issuing certificates and, thanks to them, they keep the trust of their clients. For greater security, these authorities are obliged to use special hardware that protects important information, such as private keys, against leaks. Among the best-known approved certification authorities are the following companies: VeriSign Inc., Thawte Consulting, GlobalSign NV/SA, Baltimore Technologies, TC TrustCenter AG, and Entrust Inc.
Every certification authority possesses a certificate and a corresponding private key, with which it signs the certificates it issues to its clients. A certification authority can be at the top level (top-level certification authority; root CA) or at some subsequent level. Top-level certification authorities issue themselves a certificate at the beginning of their activity and sign it with the private key that corresponds to that same certificate. These certificates are called Root certificates. The Root certificates of trusted certification authorities are publicly available on their Web sites and can be used for verification of other certificates. The non-top-level certification authorities depend on some upper-level authority to issue them a certificate, which allows them to issue and sign certificates for their clients.
It is technically possible to use any certificate to sign any other certificate, but in practice the ability to sign certificates is strictly limited. Every certificate contains unchangeable information about whether it can be used to sign other certificates. The certificates that certification authorities issue to their clients cannot be used to sign other certificates. Certificates that can be used to sign other certificates are issued only to certification authorities, under very strong security precautions. If a client buys a certificate from some certification authority and signs another certificate with it, the newly signed certificate will be invalid because it will be signed by a certificate that specifies that it cannot be used to sign other certificates.
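In X.509 version 3 certificates, this information is typically carried by the basic constraints and key usage extensions. The sketch below reads both through the standard X509Certificate API. The class and method names are hypothetical, and treating a missing key usage extension as "no restriction" is an assumption of this sketch.

```java
import java.security.cert.X509Certificate;

// Hypothetical helper class for illustration.
public class CertificateSigningRights {

    // Position of the keyCertSign bit in the array returned by getKeyUsage().
    private static final int KEY_CERT_SIGN = 5;

    /** Reports whether the certificate is marked as allowed to sign other certificates. */
    public static boolean mayIssueCertificates(X509Certificate cert) {
        // getBasicConstraints() returns -1 when the certificate is not marked as a CA certificate.
        boolean markedAsCa = cert.getBasicConstraints() != -1;

        // getKeyUsage() returns null when the key usage extension is absent;
        // this sketch assumes that an absent extension imposes no restriction.
        boolean[] keyUsage = cert.getKeyUsage();
        boolean keyUsageAllows = keyUsage == null || keyUsage[KEY_CERT_SIGN];

        return markedAsCa && keyUsageAllows;
    }
}
```

An end-client certificate would normally yield false here, while the certificate of a certification authority would yield true.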
A given certificate can be signed by another certificate (most frequently, the property of some certification authority) or by itself. Certificates that are signed not by another certificate but by themselves are called self-signed certificates. In particular, the Root certificates of the top-level certification authorities are self-signed certificates. Generally, a self-signed certificate alone cannot certify the relationship between a public key and a given person because, by using the appropriate software, anyone can generate such a certificate in the name of any chosen person or company.
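Whether a certificate is self-signed can be checked programmatically: its issuer equals its subject, and its signature verifies with its own public key. The following is a minimal sketch using the standard X509Certificate API. The class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;

// Hypothetical helper class for illustration.
public class SelfSignedCheck {

    /** Returns true if the certificate names itself as issuer and is signed with its own key. */
    public static boolean isSelfSigned(X509Certificate cert) {
        if (!cert.getSubjectX500Principal().equals(cert.getIssuerX500Principal())) {
            return false;
        }
        try {
            // verify() throws if the signature was not made with the private key
            // that matches the given public key.
            cert.verify(cert.getPublicKey());
            return true;
        } catch (GeneralSecurityException e) {
            return false;
        }
    }
}
```

A positive result says nothing about trust: it only shows that no other certificate vouches for this one.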
Although self-signed certificates cannot be trusted on their own, they have their uses. For example, within an intra-company infrastructure, where certificates can be physically transferred in a secure way between the individual employees and the internal company systems, self-signed certificates can successfully replace certificates issued by certification authorities. In such an internal environment, it is not necessary for a certification authority to confirm that a given public key belongs to a particular person, because this can be guaranteed by the method of issuing and transferring certificates. For example, when a new person is employed by a given company, the system administrator can issue him or her a self-signed certificate and hand it over on a floppy disk or in another secure way. Then, the administrator can transfer this certificate in a secure way to all internal company systems, which guarantees that all internal systems have the real certificates of all employees.
The described security scheme based on self-signed certificates can be improved if the company establishes its own local certification authority for its employees. For that purpose, the company must initially issue a self-signed certificate and then issue its employees certificates that are signed with this certificate. In that way, the initial certificate of the company is a trusted Root certificate, and the company itself is a top-level certification authority.
In both described schemes, there is a possibility of misuse by the system administrator who has the rights to issue certificates. This problem can be mitigated by enforcing strict internal company procedures for the issuance and control of certificates, but complete security cannot be guaranteed.
In communication over the Internet, where there is no secure way to determine whether a given certificate sent over the network has been changed somewhere on the way, self-signed certificates are rarely used; certificates issued by an approved certification authority are used instead. In such networks, the SSL protocol is most often used to secure communications by providing secure channels, called SSL tunnels. The SSL (Secure Sockets Layer) protocol relies on public key cryptography and certificates to allow two communicating parties to set up an encrypted channel between each other. The channel is guaranteed to be secure only if the certificates used for establishing it are trusted. For example, if a Web server on the Internet must communicate with Web browsers over a secured communication channel (SSL tunnel), it must own a certificate issued by a well-known certification authority. Otherwise, the encrypted channel between the clients and this Web server could be tapped by malicious people who present a forged certificate.
The certificates issued by approved certification authorities allow a higher degree of security of the communication, regardless of whether they are used in a private corporate network or on the Internet. Nevertheless, self-signed certificates are often used because the certificates issued by the certification authorities cost money and require effort from their owner for the initial issuance, the periodical renewal, and the reliable storage of the corresponding private key.
Certificate Chains and Verification
When a top-level certification authority issues a certificate to a client, it signs it with its Root certificate. In this way, a certification chain consisting of two certificates is formed: the certificate of the client followed by the certification authority's certificate. A certification chain is a sequence of certificates in which each certificate is signed by the one after it. At the beginning of the chain usually stands a certificate issued to an end client, and at the end of the chain is the Root certificate of some certification authority. In the middle of the chain stand the certificates of intermediate certification authorities. The common practice is for the top-level certification authorities to issue certificates to the intermediate certification authorities and to specify in these certificates that they can be used to issue other certificates. The intermediate certification authorities issue certificates to their clients or to other certification authorities. The rights set in end-client certificates do not allow them to be used to sign other certificates, but this restriction does not apply to the certificates issued to intermediate certification authorities.
A certificate at the beginning of a certification chain can be trusted only if this certification chain can be successfully verified. In this case, it is said to be a verified certificate. The verification of a certification chain includes the verification that every certificate in it is signed by the next certificate in the chain. As for the last certificate, it is verified that it is on the list of unconditionally trusted Root certificates.
Every software system that performs certificate verification maintains a list of Root certificates that it unconditionally trusts. These are the Root certificates of globally recognized certification authorities. For example, the Web browser Internet Explorer, by default, comes with a list of about 150 trusted Root certificates, and the browser Mozilla in its initial installation contains about 70 trusted certificates.
Certification chain verification includes more than the verification that each certificate is signed by the next one and that the certificate at the end of the chain is on the list of trusted Root certificates. It is also necessary to verify that each certificate in the chain is still valid, and that each certificate except the first one has the right to be used to sign other certificates. If the last check were omitted, end clients could issue a certificate to whomever they want and the verification of the issued certificate would succeed. The verification of a given certification chain also checks whether any certificate in it has been revoked. The purpose of the combination of all described verifications is to determine whether a certificate can be trusted. If the verification of a certification chain is unsuccessful, it does not necessarily mean that there is a forgery attempt.
It is possible that the list of trusted Root certificates used in the verification does not contain the Root certificate at the end of the chain, although it is genuine. Generally, a certificate cannot be verified if its whole certification chain is not present or if the Root certificate at the end of the chain is not on the list of trusted certificates. If the certification chain of a given certificate is not present, it can be constructed programmatically, but for that purpose all certificates in it must be available.
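The checks described above can be sketched with the Java Certification Path API. The example below uses the Root certificates that the Java runtime trusts by default as the list of trusted Roots and validates a chain with the PKIX algorithm, which checks signatures, validity periods, and the right of each issuer to sign certificates. The class name ChainVerifier is hypothetical, and revocation checking is switched off here as a simplifying assumption. The chain is expected to start with the end-client certificate and to stop just before the trusted Root.

```java
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.CertPath;
import java.security.cert.CertPathValidator;
import java.security.cert.CertificateFactory;
import java.security.cert.PKIXCertPathValidatorResult;
import java.security.cert.PKIXParameters;
import java.security.cert.TrustAnchor;
import java.security.cert.X509Certificate;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

// Hypothetical helper class for illustration.
public class ChainVerifier {

    /** Collects the Root certificates that the Java runtime trusts by default. */
    public static Set<TrustAnchor> defaultTrustAnchors() throws GeneralSecurityException {
        TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        factory.init((KeyStore) null); // null selects the runtime's default trust store
        Set<TrustAnchor> anchors = new HashSet<>();
        for (TrustManager manager : factory.getTrustManagers()) {
            if (manager instanceof X509TrustManager) {
                for (X509Certificate root : ((X509TrustManager) manager).getAcceptedIssuers()) {
                    anchors.add(new TrustAnchor(root, null));
                }
            }
        }
        return anchors;
    }

    /**
     * Validates a chain ordered from the end-client certificate towards the Root.
     * Throws CertPathValidatorException if any check fails.
     */
    public static PKIXCertPathValidatorResult verify(List<X509Certificate> chain,
                                                     Set<TrustAnchor> anchors)
            throws GeneralSecurityException {
        CertPath path = CertificateFactory.getInstance("X.509").generateCertPath(chain);
        PKIXParameters parameters = new PKIXParameters(anchors);
        // Simplifying assumption: revocation lists are not consulted in this sketch.
        parameters.setRevocationEnabled(false);
        CertPathValidator validator = CertPathValidator.getInstance("PKIX");
        return (PKIXCertPathValidatorResult) validator.validate(path, parameters);
    }
}
```

A failed validation raises an exception that names the offending certificate and the reason, which helps to tell a missing trusted Root apart from an expired or improperly issued certificate.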
Protected Keystores and Certificate Files
In systems for electronic signing of documents, protected stores for keys and certificates (protected keystores) are used. Such stores can contain three kinds of elements: certificates, certification chains, and private keys. Because the information stored in protected stores is confidential, it is accessed using two levels of passwords: a password to the store and separate passwords for the private keys in it. Thanks to these passwords, if a protected keystore is stolen, the confidential information stored there cannot be easily read. In practice, the private keys, as a particularly important and confidential piece of information, are never stored outside the keystores and are always protected with access passwords.
Several standards have been developed for protected keystores. The most widespread is the PKCS#12 standard, in which the store is a file with the standard extension .PFX (or the more rarely used extension .P12). A PFX file usually contains a certificate, the private key corresponding to it, and a certification chain proving the certificate's authenticity. The presence of a certification chain is not required, and sometimes PFX files contain only a certificate and a private key. In most cases, for the user's convenience, the password to access a PFX file is the same as the password to access the private key stored in it. For this reason, using a PFX file most frequently requires only a single access password.
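In Java, a PFX file can be opened with the standard KeyStore class and the PKCS12 store type. The sketch below loads a store and lists what each entry holds. The class name PfxReader and the file name and password in the usage note are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration.
public class PfxReader {

    /** Opens a PKCS#12 keystore using the store password. */
    public static KeyStore load(InputStream in, char[] storePassword)
            throws GeneralSecurityException, IOException {
        KeyStore store = KeyStore.getInstance("PKCS12");
        store.load(in, storePassword);
        return store;
    }

    /** Describes every entry; private keys are unlocked with the key password. */
    public static List<String> describeEntries(KeyStore store, char[] keyPassword)
            throws GeneralSecurityException {
        List<String> lines = new ArrayList<>();
        for (String alias : Collections.list(store.aliases())) {
            if (store.isKeyEntry(alias)) {
                Key key = store.getKey(alias, keyPassword);
                Certificate[] chain = store.getCertificateChain(alias);
                int chainLength = (chain == null) ? 0 : chain.length;
                lines.add(alias + ": " + key.getAlgorithm() + " key, chain of " + chainLength + " certificate(s)");
            } else {
                lines.add(alias + ": certificate only");
            }
        }
        return lines;
    }
}
```

A typical call would pass the same password twice, for example describeEntries(load(stream, password), password), which matches the common case of a single access password.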
When a certification authority issues a digital certificate to a client, the client ultimately gets a protected keystore, which contains the issued certificate, its corresponding private key, and the whole certification chain, proving the certificate's authenticity. The protected store is given to the client either in the form of a PFX file, as a smart card, or is directly installed in his or her Web browser.
Usually, when a certificate is issued over the Internet, the user's Web browser plays an important part in the issuance procedure, regardless of how the user confirms his or her identity. When a request for certificate issuance is sent to a given certification authority from its Web site, the user's Web browser generates a public/private key pair and sends the public key to the authority's server. The browser keeps the private key secret and does not send it to the authority. The certification authority, after verifying the authenticity of its client's personal identity data, issues him or her a certificate in which it records the public key received from the client's Web browser and his or her confirmed identity data. For some types of certificates, the identity data can consist only of a verified e-mail address, while for others the data can contain full information about the person: name, address, identity card number, and so on. The personal data verification is done using a procedure determined by the respective certification authority. After the certification authority's server issues the certificate to its client, it redirects him or her to the Web page from which this certificate can be installed in the client's Web browser. In one way or another, the user receives his or her newly issued certificate from the certification authority along with the complete certification chain. Meanwhile, the Web browser has stored the private key corresponding to the certificate, so in the end the user obtains a certificate and its corresponding private key, along with the certification chain of the certificate, installed in his or her Web browser. The method of storing private keys varies between browsers, but in any case such confidential information is protected at least with a password.
In the described mechanism for issuing a certificate, the user's private key remains unknown to the certification authority and in that way the user can be sure that no one else has access to his or her private key.
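The key point of this mechanism, that only the public key leaves the client, can be sketched in Java as follows. The client exports the public key in its standard encoded form, and the receiving side rebuilds an equivalent key object from those bytes. The class and method names are hypothetical, and real enrollment protocols wrap the public key in a signed certificate request rather than sending it bare, so this is a simplification.

```java
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.spec.X509EncodedKeySpec;
import java.util.Arrays;

// Hypothetical demo class; a simplification of real certificate enrollment.
public class PublicKeyTransfer {

    /** Client side: only the encoded public key is handed out. */
    public static byte[] exportPublicKey(KeyPair pair) {
        return pair.getPublic().getEncoded();
    }

    /** Authority side: rebuilds the RSA public key from the received bytes. */
    public static PublicKey importRsaPublicKey(byte[] encoded) throws GeneralSecurityException {
        return KeyFactory.getInstance("RSA").generatePublic(new X509EncodedKeySpec(encoded));
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair clientPair = generator.generateKeyPair();

        byte[] sent = exportPublicKey(clientPair);
        PublicKey received = importRsaPublicKey(sent);

        // The private key never appears in the transmitted data.
        System.out.println("Same public key: " + Arrays.equals(sent, received.getEncoded()));
    }
}
```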
Most Web browsers can use the certificates and private keys stored in them for authentication to secure SSL servers. Many e-mail clients can also use the certificates stored in the Web browsers for signing, encrypting, and decrypting electronic mail. However, some applications cannot directly use the certificates from the users' Web browsers, but can work with PFX keystores. In such cases, users can export their certificates, along with the corresponding private keys, from their Web browsers to PFX files and use them in any other application. In Internet Explorer, the certificate and private key are exported from the main menu by using the commands Tools | Internet Options | Content | Certificates | Export; and in Netscape and Mozilla, by using the commands Edit | Preferences | Privacy & Security | Certificates | Manage Certificates | Backup. By default, when exporting a certificate and a private key to a PFX file, Internet Explorer does not include the complete certification chain in the output file, but the user can include it by using an additional option.
There are several standards for storing X.509 digital certificates. Most frequently, ASN.1 DER encoding is used, in which the certificates are stored in files with a .CER extension (or more rarely with .CRT or .CA extensions). A CER file contains a public key, information about its owner, and a digital signature of some certification authority, certifying that this public key really belongs to that person. The certification authorities distribute from their sites their Root certificates in CER files. A CER file can be stored in binary format or text format, encoded with Base64.
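The relation between the binary and the text form can be sketched with the standard java.util.Base64 class: the text form is the Base64 encoding of the binary DER bytes, wrapped between BEGIN CERTIFICATE and END CERTIFICATE marker lines. The class name CerEncoding and the 64-character line length are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class for illustration.
public class CerEncoding {

    private static final String HEADER = "-----BEGIN CERTIFICATE-----";
    private static final String FOOTER = "-----END CERTIFICATE-----";

    /** Converts binary DER bytes into the Base64 text form. */
    public static String toBase64Text(byte[] der) {
        Base64.Encoder encoder =
                Base64.getMimeEncoder(64, "\n".getBytes(StandardCharsets.US_ASCII));
        return HEADER + "\n" + encoder.encodeToString(der) + "\n" + FOOTER + "\n";
    }

    /** Converts the Base64 text form back into binary DER bytes. */
    public static byte[] toBinary(String text) {
        String body = text.replace(HEADER, "").replace(FOOTER, "");
        // The MIME decoder skips line breaks and other non-Base64 characters.
        return Base64.getMimeDecoder().decode(body);
    }
}
```

The standard CertificateFactory accepts both forms, so the conversion is mainly useful when another tool expects one specific form.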
Sometimes, a person or a company loses control over certificates and their corresponding private keys, and they fall into the hands of other people, who can take advantage of them. In such cases, it is necessary to revoke these certificates; they are then called revoked certificates.
The certification authorities periodically (or in an emergency) publish lists of certificates that are temporarily disabled or revoked before their expiration date. These lists are digitally signed by the certification authority that issues them, and are called certificate revocation lists (CRL). Such a list specifies the name of the certification authority that issued the certificates, the issue date, the date of the next publication of such a list, the serial numbers of the revoked certificates, and the specific times and reasons for revocation.
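A revocation check against such a list can be sketched with the standard X509CRL class. The example first confirms that the list was really signed by the certification authority and only then looks up the certificate. The class and method names are hypothetical, and how the list is downloaded is left out.

```java
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.security.cert.CertificateFactory;
import java.security.cert.X509CRL;
import java.security.cert.X509Certificate;

// Hypothetical helper class for illustration.
public class RevocationCheck {

    /** Parses a certificate revocation list from a stream. */
    public static X509CRL loadCrl(InputStream in) throws GeneralSecurityException {
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        return (X509CRL) factory.generateCRL(in);
    }

    /**
     * Returns true if the certificate is listed as revoked.
     * Throws an exception if the list was not signed with the authority's key.
     */
    public static boolean isRevoked(X509CRL crl, X509Certificate cert, PublicKey authorityKey)
            throws GeneralSecurityException {
        crl.verify(authorityKey);
        return crl.isRevoked(cert);
    }
}
```

A fuller implementation would also compare the current date with the value of getNextUpdate() to detect an outdated list.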
As mentioned earlier, my next article will describe the procedures and algorithms for digitally signing documents and digital signature verification.
About the Author
Svetlin Nakov is a part-time computer science lecturer at Sofia University,
Bulgaria. He has over 5 years of professional software engineering and
training experience and currently works as an IT consultant in a leading
Bulgarian software company. His areas of expertise include Java and related
technologies, .NET Framework, network security, data structures and
algorithms, and programming code quality. More information on his research
background, skills and work experience is available from his home site