Cryptography and Network Security

After reading Chapter 6, analyze the structure of the Advanced Encryption Standard (AES) and explain why that structure makes the cipher so strong. The initial post must be completed by Wednesday at 11:59 Eastern. You are also required to respond to at least two other students in the class by the end of the week. You must use at least one scholarly resource. Every discussion post must be properly formatted in APA style.

Symbol                  Expression        Meaning
D, K                    D(K, Y)           Symmetric decryption of ciphertext Y using secret key K
D, PR_a                 D(PR_a, Y)        Asymmetric decryption of ciphertext Y using A’s private key PR_a
D, PU_a                 D(PU_a, Y)        Asymmetric decryption of ciphertext Y using A’s public key PU_a
E, K                    E(K, X)           Symmetric encryption of plaintext X using secret key K
E, PR_a                 E(PR_a, X)        Asymmetric encryption of plaintext X using A’s private key PR_a
E, PU_a                 E(PU_a, X)        Asymmetric encryption of plaintext X using A’s public key PU_a
K                                         Secret key
PR_a                                      Private key of user A
PU_a                                      Public key of user A
MAC, K                  MAC(K, X)         Message authentication code of message X using secret key K
GF(p)                                     The finite field of order p, where p is prime. The field is defined as
                                          the set Z_p together with the arithmetic operations modulo p.
GF(2^n)                                   The finite field of order 2^n
Z_n                                       Set of nonnegative integers less than n
gcd                     gcd(i, j)         Greatest common divisor; the largest positive integer that
                                          divides both i and j with no remainder on division.
mod                     a mod m           Remainder after division of a by m
mod, ≡                  a ≡ b (mod m)     a mod m = b mod m
mod, ≢                  a ≢ b (mod m)     a mod m ≠ b mod m
dlog                    dlog_{a,p}(b)     Discrete logarithm of the number b for the base a (mod p)
φ                       φ(n)              The number of positive integers less than n and relatively
                                          prime to n. This is Euler’s totient function.
Σ                       Σ_{i=1}^{n} a_i   a_1 + a_2 + … + a_n
Π                       Π_{i=1}^{n} a_i   a_1 × a_2 × … × a_n
|                       i | j             i divides j, which means that there is no remainder when j is
                                          divided by i
|, |                    |a|               Absolute value of a
||                      x || y            x concatenated with y
≈                       x ≈ y             x is approximately equal to y
⊕                       x ⊕ y             Exclusive-OR of x and y for single-bit variables;
                                          bitwise exclusive-OR of x and y for multiple-bit variables
⌊, ⌋                    ⌊x⌋               The largest integer less than or equal to x
∈                       x ∈ S             The element x is contained in the set S.
A ↔ (a_1, a_2, …, a_k)                    The integer A corresponds to the sequence of integers
                                          (a_1, a_2, …, a_k)
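As a rough illustration of how several of the symbols in this table behave, the following Python sketch evaluates a few of them on small numbers. The function names (congruent, divides, phi, dlog) are hypothetical helpers chosen for this example; they are not taken from the book or from Sage, and the brute-force loops are only suitable for tiny inputs.

```python
import math


def congruent(a: int, b: int, m: int) -> bool:
    """a ≡ b (mod m): true when a mod m equals b mod m."""
    return a % m == b % m


def divides(i: int, j: int) -> bool:
    """i | j: true when j divided by i leaves no remainder."""
    return j % i == 0


def phi(n: int) -> int:
    """Euler's totient for n > 1, counted directly from the table's definition."""
    return sum(1 for k in range(1, n) if math.gcd(k, n) == 1)


def dlog(a: int, b: int, p: int):
    """Smallest exponent i with a^i ≡ b (mod p), by brute force; None if there is none."""
    value = 1
    for i in range(p - 1):
        if value == b % p:
            return i
        value = (value * a) % p
    return None


Z_7 = set(range(7))              # Z_n for n = 7: the integers 0 through 6

print(17 % 5)                    # a mod m
print(congruent(17, 2, 5))       # is 17 ≡ 2 (mod 5)?
print(math.gcd(12, 18))          # gcd(i, j)
print(divides(3, 12))            # does 3 | 12 hold?
print(phi(10))                   # counts 1, 3, 7, 9
print(dlog(3, 6, 7))             # exponent i with 3^i ≡ 6 (mod 7)
print(0b1100 ^ 0b1010)           # bitwise exclusive-OR of two 4-bit values
print(math.floor(3.7))           # largest integer less than or equal to 3.7
print(b"ab" + b"cd")             # x || y, shown as byte-string concatenation
```

Representing concatenation with byte strings is one possible choice; the table itself does not fix a data type for x and y.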

In the four years since the sixth edition of this book was published, the field has seen
continued innovations and improvements. In this new edition, I try to capture these changes while
maintaining a broad and comprehensive coverage of the entire field. To begin this process of
revision, the sixth edition of this book was extensively reviewed by a number of professors
who teach the subject and by professionals working in the field. The result is that, in many
places, the narrative has been clarified and tightened, and illustrations have been improved.
Beyond these refinements to improve pedagogy and user-friendliness, there have been
substantive changes throughout the book. Roughly the same chapter organization has been
retained, but much of the material has been revised and new material has been added. The
most noteworthy changes are as follows:
■ Fundamental security design principles: Chapter 1 includes a new section discussing the
security design principles listed as fundamental by the National Centers of Academic
Excellence in Information Assurance/Cyber Defense, which is jointly sponsored by the
U.S. National Security Agency and the U.S. Department of Homeland Security.
■ Attack surfaces and attack trees: Chapter 1 includes a new section describing these two
concepts, which are useful in evaluating and classifying security threats.
■ Number theory coverage: The material on number theory has been consolidated
into a single chapter, Chapter 2. This makes for a convenient reference. The relevant
portions of Chapter 2 can be assigned as needed.
■ Finite fields: The chapter on finite fields has been revised and expanded with
additional text and new figures to enhance understanding.
■ Format-preserving encryption: This relatively new mode of encryption is enjoying
increasing commercial success. A new section in Chapter 7 covers this method.
■ Conditioning and health testing for true random number generators: Chapter 8 now
provides coverage of these important topics.
■ User authentication model: Chapter 15 includes a new description of a general model
for user authentication, which helps to unify the discussion of the various approaches
to user authentication.
■ Cloud security: The material on cloud security in Chapter 16 has been updated and
expanded to reflect its importance and recent developments.
■ Transport Layer Security (TLS): The treatment of TLS in Chapter 17 has been updated,
reorganized to improve clarity, and now includes a discussion of the new TLS version 1.3.
■ Email Security: Chapter 19 has been completely rewritten to provide a comprehensive
and up-to-date discussion of email security. It includes:
— New: discussion of email threats and a comprehensive approach to email security.
— New: discussion of STARTTLS, which provides confidentiality and authentication
for SMTP.
— Revised: treatment of S/MIME, updated to reflect the latest version 3.2.
— New: discussion of DNSSEC and its role in supporting email security.
— New: discussion of DNS-based Authentication of Named Entities (DANE) and the
use of this approach to enhance security for certificate use in SMTP and S/MIME.
— New: discussion of Sender Policy Framework (SPF), which is the standardized way
for a sending domain to identify and assert the mail senders for a given domain.
— Revised: discussion of DomainKeys Identified Mail (DKIM).
— New: discussion of Domain-based Message Authentication, Reporting, and Conformance
(DMARC), which allows email senders to specify policy on how their mail should
be handled, the types of reports that receivers can send back, and how frequently
those reports should be sent.
It is the purpose of this book to provide a practical survey of both the principles and practice
of cryptography and network security. In the first part of the book, the basic issues to be
addressed by a network security capability are explored by providing a tutorial and survey
of cryptography and network security technology. The latter part of the book deals with the
practice of network security: practical applications that have been implemented and are in
use to provide network security.
The subject, and therefore this book, draws on a variety of disciplines. In particular,
it is impossible to appreciate the significance of some of the techniques discussed in this
book without a basic understanding of number theory and some results from probability
theory. Nevertheless, an attempt has been made to make the book self-contained. The book
not only presents the basic mathematical results that are needed but provides the reader
with an intuitive understanding of those results. Such background material is introduced
as needed. This approach helps to motivate the material that is introduced, and the author
considers this preferable to simply presenting all of the mathematical material in a lump at
the beginning of the book.
The book is intended for both academic and professional audiences. As a textbook, it is
intended for a one-semester undergraduate course in cryptography and network security for
computer science, computer engineering, and electrical engineering majors. The changes to
this edition are intended to support the ACM/IEEE Computer Science Curricula
2013 (CS2013). CS2013 adds Information Assurance and Security (IAS) to the curriculum
recommendation as one of the Knowledge Areas in the Computer Science Body of Knowledge.
The document states that IAS is now part of the curriculum recommendation because of the
critical role of IAS in computer science education. CS2013 divides all course work into three
categories: Core-Tier 1 (all topics should be included in the curriculum), Core-Tier 2 (all or
almost all topics should be included), and elective (desirable to provide breadth and depth).
In the IAS area, CS2013 recommends topics in Fundamental Concepts and Network Security
in Tier 1 and Tier 2, and Cryptography topics as elective. This text covers virtually all of the
topics listed by CS2013 in these three categories.
The book also serves as a basic reference volume and is suitable for self-study.
The book is divided into eight parts.
■ Background
■ Symmetric Ciphers
■ Asymmetric Ciphers
■ Cryptographic Data Integrity Algorithms
■ Mutual Trust
■ Network and Internet Security
■ System Security
■ Legal and Ethical Issues
The book includes a number of pedagogic features, including the use of the computer
algebra system Sage and numerous figures and tables to clarify the discussions. Each
chapter includes a list of key words, review questions, homework problems, and suggestions
for further reading. The book also includes an extensive glossary, a list of frequently used
acronyms, and a bibliography. In addition, a test bank is available to instructors.
The major goal of this text is to make it as effective a teaching tool for this exciting and
fast-moving subject as possible. This goal is reflected both in the structure of the book and in
the supporting material. The text is accompanied by the following supplementary material
that will aid the instructor:
■ Solutions manual: Solutions to all end-of-chapter Review Questions and Problems.
■ Projects manual: Suggested project assignments for all of the project categories listed below.
■ PowerPoint slides: A set of slides covering all chapters, suitable for use in lecturing.
■ PDF files: Reproductions of all figures and tables from the book.
■ Test bank: A chapter-by-chapter set of questions with a separate file of answers.
■ Sample syllabuses: The text contains more material than can be conveniently covered
in one semester. Accordingly, instructors are provided with several sample syllabuses
that guide the use of the text within limited time.
All of these support materials are available at the Instructor Resource Center
(IRC) for this textbook, which can be reached through the publisher’s Web site. To gain
access to the IRC, please contact your local Pearson sales representative.

For many instructors, an important component of a cryptography or network security course
is a project or set of projects by which the student gets hands-on experience to reinforce
concepts from the text. This book provides an unparalleled degree of support for including a
projects component in the course. The IRC not only includes guidance on how to assign and
structure the projects, but also includes a set of project assignments that covers a broad range
of topics from the text:
■ Sage projects: Described in the next section.
■ Hacking project: Exercise designed to illuminate the key issues in intrusion detection
and prevention.
■ Block cipher projects: A lab that explores the operation of the AES encryption
algorithm by tracing its execution, computing one round by hand, and then exploring the
various block cipher modes of use. The lab also covers DES. In both cases, an online
Java applet is used (or can be downloaded) to execute AES or DES.
■ Lab exercises: A series of projects that involve programming and experimenting with
concepts from the book.
■ Research projects: A series of research assignments that instruct the student to research
a particular topic on the Internet and write a report.
■ Programming projects: A series of programming projects that cover a broad range of
topics and that can be implemented in any suitable language on any platform.
■ Practical security assessments: A set of exercises to examine current infrastructure and
practices of an existing organization.
■ Firewall projects: A portable network firewall visualization simulator, together with
exercises for teaching the fundamentals of firewalls.
■ Case studies: A set of real-world case studies, including learning objectives, case
description, and a series of case discussion questions.
■ Writing assignments: A set of suggested writing assignments, organized by chapter.
■ Reading/report assignments: A list of papers in the literature—one for each chapter—
that can be assigned for the student to read and then write a short report.
This diverse set of projects and other student exercises enables the instructor to use
the book as one component in a rich and varied learning experience and to tailor a course
plan to meet the specific needs of the instructor and students. See Appendix A in this book
for details.
One of the most important features of this book is the use of Sage for cryptographic examples
and homework assignments. Sage is an open-source, multiplatform, freeware package that
implements a very powerful, flexible, and easily learned mathematics and computer algebra
system. Unlike competing systems (such as Mathematica, Maple, and MATLAB), there are
no licensing agreements or fees involved. Thus, Sage can be made available on computers
and networks at school, and students can individually download the software to their own
personal computers for use at home. Another advantage of using Sage is that students learn
a powerful, flexible tool that can be used for virtually any mathematical application, not
just cryptography.
The use of Sage can make a significant difference to the teaching of the mathematics
of cryptographic algorithms. This book provides a large number of examples of the use of
Sage covering many cryptographic concepts in Appendix B, which is included in this book.
Appendix C lists exercises in each of these topic areas to enable the student to gain
hands-on experience with cryptographic algorithms. This appendix is available to
instructors at the IRC for this book. Appendix C includes a section on how to download and get
started with Sage, a section on programming with Sage, and exercises that can be assigned to
students in the following categories:
■ Chapter 2—Number Theory and Finite Fields: Euclidean and extended Euclidean
algorithms, polynomial arithmetic, GF(2^4), Euler’s totient function, Miller–Rabin,
factoring, modular exponentiation, discrete logarithm, and Chinese remainder theorem.
■ Chapter 3—Classical Encryption: Affine ciphers and the Hill cipher.
■ Chapter 4—Block Ciphers and the Data Encryption Standard: Exercises based
on SDES.
■ Chapter 6—Advanced Encryption Standard: Exercises based on SAES.
■ Chapter 8—Pseudorandom Number Generation and Stream Ciphers: Blum Blum
Shub, linear congruential generator, and ANSI X9.17 PRNG.
■ Chapter 9—Public-Key Cryptography and RSA: RSA encrypt/decrypt and signing.
■ Chapter 10—Other Public-Key Cryptosystems: Diffie–Hellman, elliptic curve.
■ Chapter 11—Cryptographic Hash Functions: Number-theoretic hash function.
■ Chapter 13—Digital Signatures: DSA.
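To give a flavor of the exercise topics listed above, here is a minimal sketch in plain Python (not Sage, and not taken from Appendix B or C) that combines three of them: the extended Euclidean algorithm, modular exponentiation, and an RSA encrypt/decrypt round trip. The helper names extended_gcd and mod_inverse and the tiny primes 61 and 53 are assumptions made purely for illustration; practical RSA uses far larger primes together with a padding scheme.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a: int, m: int) -> int:
    """Return the inverse of a modulo m, or raise if gcd(a, m) != 1."""
    g, x, _ = extended_gcd(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m


# Toy key generation with illustrative primes (far too small for real use).
p, q = 61, 53
n = p * q                      # public modulus
phi_n = (p - 1) * (q - 1)      # Euler's totient of n for distinct primes p and q
e = 17                         # public exponent, relatively prime to phi_n
d = mod_inverse(e, phi_n)      # private exponent

message = 65
ciphertext = pow(message, e, n)     # modular exponentiation with the public key
recovered = pow(ciphertext, d, n)   # modular exponentiation with the private key

print(d, ciphertext, recovered)
```

Python's built-in three-argument pow performs the modular exponentiation; the extended Euclidean routine is written out by hand only because it is one of the listed exercise topics.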
For this new edition, a tremendous amount of original supporting material for students has
been made available online.
Purchasing this textbook new also grants the reader six months of access to the
Companion Website, which includes the following materials:
■ Online chapters: To limit the size and cost of the book, four chapters of the book are
provided in PDF format. This includes three chapters on computer security and one on
legal and ethical issues. The chapters are listed in this book’s table of contents.
■ Online appendices: There are numerous interesting topics that support material found
in the text but whose inclusion is not warranted in the printed text. A total of 20 online
appendices cover these topics for the interested student. The appendices are listed in
this book’s table of contents.

■ Homework problems and solutions: To aid the student in understanding the material,
a separate set of homework problems with solutions are available.
■ Key papers: A number of papers from the professional literature, many hard to find,
are provided for further reading.
■ Supporting documents: A variety of other useful documents are referenced in the text
and provided online.
■ Sage code: The Sage code from the examples in Appendix B is useful in case the student
wants to play around with the examples.
To access the Companion Website, follow the instructions for “digital resources for
students” found in the front of this book.
This new edition has benefited from review by a number of people who gave generously
of their time and expertise. The following professors reviewed all or a large part of the
manuscript: Hossein Beyzavi (Marymount University), Donald F. Costello (University of
Nebraska–Lincoln), James Haralambides (Barry University), Anand Seetharam (California
State University at Monterey Bay), Marius C. Silaghi (Florida Institute of Technology),
Shambhu Upadhyaya (University at Buffalo), Zhengping Wu (California State University
at San Bernardino), Liangliang Xiao (Frostburg State University), Seong-Moo (Sam) Yoo
(The University of Alabama in Huntsville), and Hong Zhang (Armstrong State University).
Thanks also to the people who provided detailed technical reviews of one or more
chapters: Dino M. Amaral, Chris Andrew, Prof. (Dr). C. Annamalai, Andrew Bain, Riccardo
Bernardini, Olivier Blazy, Zervopoulou Christina, Maria Christofi, Dhananjoy Dey, Mario
Emmanuel, Mike Fikuart, Alexander Fries, Pierpaolo Giacomin, Pedro R. M. Inácio,
Daniela Tamy Iwassa, Krzysztof Janowski, Sergey Katsev, Adnan Kilic, Rob Knox, Mina
Pourdashty, Yuri Poeluev, Pritesh Prajapati, Venkatesh Ramamoorthy, Andrea Razzini,
Rami Rosen, Javier Scodelaro, Jamshid Shokrollahi, Oscar So, and David Tillemans.
In addition, I was fortunate to have reviews of individual topics by “subject-area
gurus,” including Jesse Walker of Intel (Intel’s Digital Random Number Generator), Russ
Housley of Vigil Security (key wrapping), Joan Daemen (AES), Edward F. Schaefer of
Santa Clara University (Simplified AES), Tim Mathews, formerly of RSA Laboratories
(S/MIME), Alfred Menezes of the University of Waterloo (elliptic curve cryptography),
William Sutton, Editor/Publisher of The Cryptogram (classical encryption), Avi Rubin of
Johns Hopkins University (number theory), Michael Markowitz of Information Security
Corporation (SHA and DSS), Don Davis of IBM Internet Security Systems (Kerberos),
Steve Kent of BBN Technologies (X.509), and Phil Zimmerman (PGP).
Nikhil Bhargava (IIT Delhi) developed the set of online homework problems and
solutions. Dan Shumow of Microsoft and the University of Washington developed all of
the Sage examples and assignments in Appendices B and C. Professor Sreekanth Malladi of
Dakota State University developed the hacking exercises. Lawrie Brown of the Australian
Defence Force Academy provided the AES/DES block cipher projects and the security
assessment assignments.

Sanjay Rao and Ruben Torres of Purdue University developed the laboratory exercises
that appear in the IRC. The following people contributed project assignments that appear in
the instructor’s supplement: Henning Schulzrinne (Columbia University); Cetin Kaya Koc
(Oregon State University); and David Balenson (Trusted Information Systems and George
Washington University). Kim McLaughlin developed the test bank.
Finally, I thank the many people responsible for the publication of this book, all of
whom did their usual excellent job. This includes the staff at Pearson, particularly my editor
Tracy Johnson, program manager Carole Snyder, and production manager Bob Engelhardt.
Thanks also to the marketing and sales staffs at Pearson, without whose efforts this book
would not be in front of you.
Pearson would like to thank and acknowledge Somitra Kumar Sanadhya (Indraprastha
Institute of Information Technology Delhi), and Somanath Tripathy (Indian Institute of
Technology Patna) for contributing to the Global Edition, and Anwitaman Datta (Nanyang
Technological University Singapore), Atul Kahate (Pune University), Goutam Paul (Indian
Statistical Institute Kolkata), and Khyat Sharma for reviewing the Global Edition.
Dr. William Stallings has authored 18 titles, and counting revised editions, over 40 books
on computer security, computer networking, and computer architecture. His writings have
appeared in numerous publications, including the Proceedings of the IEEE, ACM Computing
Reviews, and Cryptologia.
He has 13 times received the award for the best Computer Science textbook of the
year from the Text and Academic Authors Association.
In over 30 years in the field, he has been a technical contributor, technical manager,
and an executive with several high-technology firms. He has designed and implemented
both TCP/IP-based and OSI-based protocol suites on a variety of computers and operating
systems, ranging from microcomputers to mainframes. As a consultant, he has advised
government agencies, computer and software vendors, and major users on the design, selection,
and use of networking software and products.
He created and maintains the Computer Science Student Resource Site. This site provides documents and links on a variety of
subjects of general interest to computer science students (and professionals). He is a member
of the editorial board of Cryptologia, a scholarly journal devoted to all aspects of cryptology.
Dr. Stallings holds a PhD from MIT in computer science and a BS from Notre Dame
in electrical engineering.

Computer and Network Security Concepts
1.1 Computer Security Concepts
A Definition of Computer Security
The Challenges of Computer Security
1.2 The OSI Security Architecture
1.3 Security Attacks
Passive Attacks
Active Attacks
1.4 Security Services
Access Control
Data Confidentiality
Data Integrity
Availability Service
1.5 Security Mechanisms
1.6 Fundamental Security Design Principles
1.7 Attack Surfaces and Attack Trees
Attack Surfaces
Attack Trees
1.8 A Model for Network Security
1.9 Standards
1.10 Key Terms, Review Questions, and Problems

After studying this chapter, you should be able to:
◆ Describe the key security requirements of confidentiality, integrity, and
availability.
◆ Describe the X.800 security architecture for OSI.
◆ Discuss the types of security threats and attacks that must be dealt with
and give examples of the types of threats and attacks that apply to different
categories of computer and network assets.
◆ Explain the fundamental security design principles.
◆ Discuss the use of attack surfaces and attack trees.
◆ List and briefly describe key organizations involved in cryptography
standards.
This book focuses on two broad areas: cryptographic algorithms and protocols, which
have a broad range of applications; and network and Internet security, which rely
heavily on cryptographic techniques.
Cryptographic algorithms and protocols can be grouped into four main areas:
■ Symmetric encryption: Used to conceal the contents of blocks or streams of
data of any size, including messages, files, encryption keys, and passwords.
■ Asymmetric encryption: Used to conceal small blocks of data, such as encryption
keys and hash function values, which are used in digital signatures.
■ Data integrity algorithms: Used to protect blocks of data, such as messages,
from alteration.
■ Authentication protocols: These are schemes based on the use of cryptographic
algorithms designed to authenticate the identity of entities.
The field of network and Internet security consists of measures to deter, prevent,
detect, and correct security violations that involve the transmission of information.
That is a broad statement that covers a host of possibilities. To give you a feel for the
areas covered in this book, consider the following examples of security violations:
1. User A transmits a file to user B. The file contains sensitive information
(e.g., payroll records) that is to be protected from disclosure. User C, who is
not authorized to read the file, is able to monitor the transmission and capture
a copy of the file during its transmission.
2. A network manager, D, transmits a message to a computer, E, under its management.
The message instructs computer E to update an authorization file to
include the identities of a number of new users who are to be given access to
that computer. User F intercepts the message, alters its contents to add or delete
entries, and then forwards the message to computer E, which accepts the message
as coming from manager D and updates its authorization file accordingly.
3. Rather than intercept a message, user F constructs its own message with the
desired entries and transmits that message to computer E as if it had come
from manager D. Computer E accepts the message as coming from manager D
and updates its authorization file accordingly.
4. An employee is fired without warning. The personnel manager sends a message
to a server system to invalidate the employee’s account. When the invalidation
is accomplished, the server is to post a notice to the employee’s file as
confirmation of the action. The employee is able to intercept the message and
delay it long enough to make a final access to the server to retrieve sensitive
information. The message is then forwarded, the action taken, and the confirmation
posted. The employee’s action may go unnoticed for some considerable
time.
5. A message is sent from a customer to a stockbroker with instructions for various
transactions. Subsequently, the investments lose value and the customer
denies sending the message.
Although this list by no means exhausts the possible types of network security violations,
it illustrates the range of concerns of network security.
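Violations 2 and 3 above both end with computer E accepting a message that manager D never sent in that form. One way such tampering could be detected is a message authentication code, written MAC(K, X) in the notation table. The Python sketch below uses the standard library hmac module with HMAC-SHA-256; the choice of HMAC-SHA-256, the key value, the message text, and the helper names make_tag and verify are all assumptions made for illustration, not a scheme prescribed by the text.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute MAC(K, X) as an HMAC-SHA-256 tag."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical secret shared in advance by manager D and computer E.
shared_key = b"example key known only to D and E"

# D sends the message together with its tag.
original = b"add users: alice, bob"
tag = make_tag(shared_key, original)

# F alters the message in transit but cannot produce a matching tag (violation 2),
# or fabricates a message and tags it with a key of its own (violation 3).
altered = b"add users: alice, bob, mallory"
forged_tag = make_tag(b"key guessed by F", altered)

print(verify(shared_key, original, tag))         # True: message accepted
print(verify(shared_key, altered, tag))          # False: alteration detected
print(verify(shared_key, altered, forged_tag))   # False: fabrication detected
```

A MAC by itself does not address the other violations in the list: it does not hide the file in violation 1, and because D and E hold the same key it cannot settle a dispute like the one in violation 5.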
A Definition of Computer Security
The NIST Computer Security Handbook [NIST95] defines the term computer security
as follows:
Computer Security: The protection afforded to an automated information system
in order to attain the applicable objectives of preserving the integrity, availability,
and confidentiality of information system resources (includes hardware, software,
firmware, information/data, and telecommunications).
This definition introduces three key objectives that are at the heart of computer
security:
■ Confidentiality: This term covers two related concepts:
— Data¹ confidentiality: Assures that private or confidential information is
not made available or disclosed to unauthorized individuals.
— Privacy: Assures that individuals control or influence what information related
to them may be collected and stored and by whom and to whom that
information may be disclosed.
■ Integrity: This term covers two related concepts:
— Data integrity: Assures that information (both stored and in transmitted
packets) and programs are changed only in a specified and authorized
manner.
— System integrity: Assures that a system performs its intended function in
an unimpaired manner, free from deliberate or inadvertent unauthorized
manipulation of the system.
■ Availability: Assures that systems work promptly and service is not denied to
authorized users.
¹ RFC 4949 defines information as “facts and ideas, which can be represented (encoded) as various forms
of data,” and data as “information in a specific physical representation, usually a sequence of symbols
that have meaning; especially a representation of information that can be processed or produced by a
computer.” Security literature typically does not make much of a distinction, nor does this book.
These three concepts form what is often referred to as the CIA triad. The
three concepts embody the fundamental security objectives for both data and for
information and computing services. For example, the NIST standard FIPS 199
(Standards for Security Categorization of Federal Information and Information
Systems) lists confidentiality, integrity, and availability as the three security objectives
for information and for information systems. FIPS 199 provides a useful characterization
of these three objectives in terms of requirements and the definition of
a loss of security in each category:
■ Confidentiality: Preserving authorized restrictions on information access
and disclosure, including means for protecting personal privacy and proprietary
information. A loss of confidentiality is the unauthorized disclosure of
information.
■ Integrity: Guarding against improper information modification or destruction,
including ensuring information nonrepudiation and authenticity. A loss
of integrity is the unauthorized modification or destruction of information.
■ Availability: Ensuring timely and reliable access to and use of information.
A loss of availability is the disruption of access to or use of information or an
information system.
Figure 1.1 Essential Network and Computer Security Requirements
Although the use of the CIA triad to define security objectives is well established,
some in the security field feel that additional concepts are needed to present a
complete picture (Figure 1.1). Two of the most commonly mentioned are as follows:
■ Authenticity: The property of being genuine and being able to be verified and
trusted; confidence in the validity of a transmission, a message, or message
originator. This means verifying that users are who they say they are and that
each input arriving at the system came from a trusted source.
■ Accountability: The security goal that generates the requirement for actions
of an entity to be traced uniquely to that entity. This supports nonrepudiation,
deterrence, fault isolation, intrusion detection and prevention, and after-action
recovery and legal action. Because truly secure systems are not yet an
achievable goal, we must be able to trace a security breach to a responsible
party. Systems must keep records of their activities to permit later forensic
analysis to trace security breaches or to aid in transaction disputes.
We now provide some examples of applications that illustrate the requirements just
enumerated.² For these examples, we use three levels of impact on organizations or
individuals should there be a breach of security (i.e., a loss of confidentiality, integrity,
or availability). These levels are defined in FIPS PUB 199:
■ Low: The loss could be expected to have a limited adverse effect on organizational
operations, organizational assets, or individuals. A limited adverse
effect means that, for example, the loss of confidentiality, integrity, or availability
might (i) cause a degradation in mission capability to an extent and
duration that the organization is able to perform its primary functions, but the
effectiveness of the functions is noticeably reduced; (ii) result in minor damage
to organizational assets; (iii) result in minor financial loss; or (iv) result in
minor harm to individuals.
■ Moderate: The loss could be expected to have a serious adverse effect on
organizational operations, organizational assets, or individuals. A serious
adverse effect means that, for example, the loss might (i) cause a significant
degradation in mission capability to an extent and duration that the
organization is able to perform its primary functions, but the effectiveness
of the functions is significantly reduced; (ii) result in significant damage to
organizational assets; (iii) result in significant financial loss; or (iv) result in
significant harm to individuals that does not involve loss of life or serious,
life-threatening injuries.
■ High: The loss could be expected to have a severe or catastrophic adverse
effect on organizational operations, organizational assets, or individuals.
A severe or catastrophic adverse effect means that, for example, the loss
might (i) cause a severe degradation in or loss of mission capability to an
extent and duration that the organization is not able to perform one or more
of its primary functions; (ii) result in major damage to organizational assets;
(iii) result in major financial loss; or (iv) result in severe or catastrophic harm
to individuals involving loss of life or serious, life-threatening injuries.
² These examples are taken from a security policy document published by the Information Technology
Security and Privacy Office at Purdue University.

CONFIDENTIALITY Student grade information is an asset whose confidentiality is
considered to be highly important by students. In the United States, the release of
such information is regulated by the Family Educational Rights and Privacy Act
(FERPA). Grade information should only be available to students, their parents,
and employees that require the information to do their job. Student enrollment
information may have a moderate confidentiality rating. While still covered by
FERPA, this information is seen by more people on a daily basis, is less likely to be
targeted than grade information, and results in less damage if disclosed. Directory
information, such as lists of students or faculty or departmental lists, may be as-
signed a low confidentiality rating or indeed no rating. This information is typically
freely available to the public and published on a school’s Web site.
INTEGRITY Several aspects of integrity are illustrated by the example of a hospital
patient’s allergy information stored in a database. The doctor should be able to
trust that the information is correct and current. Now suppose that an employee
(e.g., a nurse) who is authorized to view and update this information deliberately
falsifies the data to cause harm to the hospital. The database needs to be restored
to a trusted basis quickly, and it should be possible to trace the error back to the
person responsible. Patient allergy information is an example of an asset with a high
requirement for integrity. Inaccurate information could result in serious harm or
death to a patient and expose the hospital to massive liability.
An example of an asset that may be assigned a moderate level of integrity
requirement is a Web site that offers a forum to registered users to discuss some
specific topic. Either a registered user or a hacker could falsify some entries or
deface the Web site. If the forum exists only for the enjoyment of the users, brings
in little or no advertising revenue, and is not used for something important such
as research, then potential damage is not severe. The Web master may experience
some data, financial, and time loss.
An example of a low integrity requirement is an anonymous online poll. Many
Web sites, such as news organizations, offer these polls to their users with very few
safeguards. However, the inaccuracy and unscientific nature of such polls is well understood.
AVAILABILITY The more critical a component or service, the higher is the level of
availability required. Consider a system that provides authentication services for
critical systems, applications, and devices. An interruption of service results in the
inability for customers to access computing resources and staff to access the re-
sources they need to perform critical tasks. The loss of the service translates into a
large financial loss in lost employee productivity and potential customer loss.
An example of an asset that would typically be rated as having a moderate
availability requirement is a public Web site for a university; the Web site provides
information for current and prospective students and donors. Such a site is not a
critical component of the university’s information system, but its unavailability will
cause some embarrassment.
An online telephone directory lookup application would be classified as a low
availability requirement. Although the temporary loss of the application may be
an annoyance, there are other ways to access the information, such as a hardcopy
directory or the operator.
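The examples above can be collected into a small lookup structure. The following is a minimal sketch rather than anything prescribed by the text: the names `Impact`, `RATINGS`, and `assets_at_or_above` are illustrative assumptions, and the ratings simply restate the examples given for each of the three objectives.

```python
from enum import IntEnum


class Impact(IntEnum):
    """The three impact levels, ordered so they can be compared."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


# (asset, security objective) -> impact level, restating the examples in the text.
RATINGS = {
    ("student grades", "confidentiality"): Impact.HIGH,
    ("student enrollment", "confidentiality"): Impact.MODERATE,
    ("directory information", "confidentiality"): Impact.LOW,
    ("patient allergy information", "integrity"): Impact.HIGH,
    ("discussion forum Web site", "integrity"): Impact.MODERATE,
    ("anonymous online poll", "integrity"): Impact.LOW,
    ("authentication service", "availability"): Impact.HIGH,
    ("public university Web site", "availability"): Impact.MODERATE,
    ("online telephone directory lookup", "availability"): Impact.LOW,
}


def assets_at_or_above(objective, level):
    """Return the assets whose rating for one objective meets a threshold."""
    return sorted(
        asset
        for (asset, obj), rating in RATINGS.items()
        if obj == objective and rating >= level
    )


sensitive = assets_at_or_above("confidentiality", Impact.MODERATE)
```

Keeping one rating per objective, rather than a single score per asset, mirrors the way the examples treat confidentiality, integrity, and availability separately.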

The Challenges of Computer Security
Computer and network security is both fascinating and complex. Some of the
reasons follow:
1. Security is not as simple as it might first appear to the novice. The require-
ments seem to be straightforward; indeed, most of the major requirements for
security services can be given self-explanatory, one-word labels: confidential-
ity, authentication, nonrepudiation, or integrity. But the mechanisms used to
meet those requirements can be quite complex, and understanding them may
involve rather subtle reasoning.
2. In developing a particular security mechanism or algorithm, one must always
consider potential attacks on those security features. In many cases, successful
attacks are designed by looking at the problem in a completely different way,
therefore exploiting an unexpected weakness in the mechanism.
3. Because of point 2, the procedures used to provide particular services are
often counterintuitive. Typically, a security mechanism is complex, and it is not
obvious from the statement of a particular requirement that such elaborate
measures are needed. It is only when the various aspects of the threat are con-
sidered that elaborate security mechanisms make sense.
4. Having designed various security mechanisms, it is necessary to decide where
to use them. This is true both in terms of physical placement (e.g., at what points
in a network are certain security mechanisms needed) and in a logical sense
(e.g., at what layer or layers of an architecture such as TCP/IP [Transmission
Control Protocol/Internet Protocol] should mechanisms be placed).
5. Security mechanisms typically involve more than a particular algorithm or
protocol. They also require that participants be in possession of some secret in-
formation (e.g., an encryption key), which raises questions about the creation,
distribution, and protection of that secret information. There also may be a re-
liance on communications protocols whose behavior may complicate the task
of developing the security mechanism. For example, if the proper functioning
of the security mechanism requires setting time limits on the transit time of a
message from sender to receiver, then any protocol or network that introduces
variable, unpredictable delays may render such time limits meaningless.
6. Computer and network security is essentially a battle of wits between a per-
petrator who tries to find holes and the designer or administrator who tries to
close them. The great advantage that the attacker has is that he or she need
only find a single weakness, while the designer must find and eliminate all
weaknesses to achieve perfect security.
7. There is a natural tendency on the part of users and system managers to per-
ceive little benefit from security investment until a security failure occurs.
8. Security requires regular, even constant, monitoring, and this is difficult in
today’s short-term, overloaded environment.
9. Security is still too often an afterthought to be incorporated into a system
after the design is complete rather than being an integral part of the design process.
10. Many users and even security administrators view strong security as an
impediment to efficient and user-friendly operation of an information system
or use of information.
The difficulties just enumerated will be encountered in numerous ways as we
examine the various security threats and mechanisms throughout this book.
To assess effectively the security needs of an organization and to evaluate and
choose various security products and policies, the manager responsible for security
needs some systematic way of defining the requirements for security and character-
izing the approaches to satisfying those requirements. This is difficult enough in a
centralized data processing environment; with the use of local and wide area net-
works, the problems are compounded.
ITU-T3 Recommendation X.800, Security Architecture for OSI, defines such a
systematic approach.4 The OSI security architecture is useful to managers as a way
of organizing the task of providing security. Furthermore, because this architecture
was developed as an international standard, computer and communications vendors
have developed security features for their products and services that relate to this
structured definition of services and mechanisms.
For our purposes, the OSI security architecture provides a useful, if abstract,
overview of many of the concepts that this book deals with. The OSI security archi-
tecture focuses on security attacks, mechanisms, and services. These can be defined
briefly as
■ Security attack: Any action that compromises the security of information
owned by an organization.
■ Security mechanism: A process (or a device incorporating such a process)
that is designed to detect, prevent, or recover from a security attack.
■ Security service: A processing or communication service that enhances the
security of the data processing systems and the information transfers of an
organization. The services are intended to counter security attacks, and they
make use of one or more security mechanisms to provide the service.
In the literature, the terms threat and attack are commonly used to mean more
or less the same thing. Table 1.1 provides definitions taken from RFC 4949, Internet
Security Glossary.
3The International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T)
is a United Nations-sponsored agency that develops standards, called Recommendations, relating to tele-
communications and to open systems interconnection (OSI).
4The OSI security architecture was developed in the context of the OSI protocol architecture, which is
described in Appendix L. However, for our purposes in this chapter, an understanding of the OSI proto-
col architecture is not required.

A useful means of classifying security attacks, used both in X.800 and RFC 4949, is
in terms of passive attacks and active attacks (Figure 1.2). A passive attack attempts
to learn or make use of information from the system but does not affect system re-
sources. An active attack attempts to alter system resources or affect their operation.
Passive Attacks
Passive attacks (Figure 1.2a) are in the nature of eavesdropping on, or monitoring
of, transmissions. The goal of the opponent is to obtain information that is being
transmitted. Two types of passive attacks are the release of message contents and
traffic analysis.
The release of message contents is easily understood. A telephone conver-
sation, an electronic mail message, and a transferred file may contain sensitive or
confidential information. We would like to prevent an opponent from learning the
contents of these transmissions.
A second type of passive attack, traffic analysis, is subtler. Suppose that we
had a way of masking the contents of messages or other information traffic so that
opponents, even if they captured the message, could not extract the information
from the message. The common technique for masking contents is encryption. If we
had encryption protection in place, an opponent might still be able to observe the
pattern of these messages. The opponent could determine the location and identity
of communicating hosts and could observe the frequency and length of messages
being exchanged. This information might be useful in guessing the nature of the
communication that was taking place.
Passive attacks are very difficult to detect, because they do not involve any
alteration of the data. Typically, the message traffic is sent and received in an appar-
ently normal fashion, and neither the sender nor receiver is aware that a third party
has read the messages or observed the traffic pattern. However, it is feasible to pre-
vent the success of these attacks, usually by means of encryption. Thus, the empha-
sis in dealing with passive attacks is on prevention rather than detection.
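The point about traffic analysis can be made concrete with a short simulation. This is a hedged sketch under stated assumptions: `encrypt_stub` is a hypothetical stand-in that only produces random bytes of the same length as the plaintext (it is not a real cipher), and the host names and messages are invented for illustration. It shows that an observer who cannot read the contents still learns who is talking to whom, how often, and how much.

```python
import os
from collections import Counter


def encrypt_stub(plaintext: bytes) -> bytes:
    """Stand-in for encryption: unreadable bytes of the same length."""
    return os.urandom(len(plaintext))


def observe(packets):
    """What a passive observer learns from (source, destination, ciphertext)."""
    pair_counts = Counter((src, dst) for src, dst, _ in packets)
    lengths = [len(ciphertext) for _, _, ciphertext in packets]
    return pair_counts, lengths


# Hypothetical traffic; the contents are hidden but the pattern is not.
packets = [
    ("hostA", "hostB", encrypt_stub(b"buy")),
    ("hostA", "hostB", encrypt_stub(b"sell 500 shares now")),
    ("hostC", "hostB", encrypt_stub(b"ok")),
]

pair_counts, lengths = observe(packets)
```

Here `pair_counts` reveals the frequency of exchange between each pair of hosts and `lengths` reveals the message sizes, which is exactly the information the text says encryption alone does not conceal.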
Active Attacks
Active attacks (Figure 1.2b) involve some modification of the data stream or the
creation of a false stream and can be subdivided into four categories: masquerade,
replay, modification of messages, and denial of service.
Table 1.1 Threats and Attacks (RFC 4949)
Threat
A potential for violation of security, which exists when there is a circumstance, capability, action,
or event that could breach security and cause harm. That is, a threat is a possible danger that might
exploit a vulnerability.
Attack
An assault on system security that derives from an intelligent threat; that is, an intelligent act that
is a deliberate attempt (especially in the sense of a method or technique) to evade security services
and violate the security policy of a system.

A masquerade takes place when one entity pretends to be a different entity
(path 2 of Figure 1.2b is active). A masquerade attack usually includes one of the
other forms of active attack. For example, authentication sequences can be captured
and replayed after a valid authentication sequence has taken place, thus enabling an
authorized entity with few privileges to obtain extra privileges by impersonating an
entity that has those privileges.
Replay involves the passive capture of a data unit and its subsequent retrans-
mission to produce an unauthorized effect (paths 1, 2, and 3 active).
Modification of messages simply means that some portion of a legitimate mes-
sage is altered, or that messages are delayed or reordered, to produce an unauthor-
ized effect (paths 1 and 2 active). For example, a message meaning “Allow John
Smith to read confidential file accounts” is modified to mean “Allow Fred Brown to
read confidential file accounts.”
Figure 1.2 Security Attacks: (a) Passive attacks; (b) Active attacks

The denial of service prevents or inhibits the normal use or management of
communications facilities (path 3 active). This attack may have a specific target; for
example, an entity may suppress all messages directed to a particular destination
(e.g., the security audit service). Another form of service denial is the disruption of
an entire network, either by disabling the network or by overloading it with mes-
sages so as to degrade performance.
Active attacks present the opposite characteristics of passive attacks. Whereas
passive attacks are difficult to detect, measures are available to prevent their success.
On the other hand, it is quite difficult to prevent active attacks absolutely because
of the wide variety of potential physical, software, and network vulnerabilities.
Instead, the goal is to detect active attacks and to recover from any disruption or
delays caused by them. If the detection has a deterrent effect, it may also contribute
to prevention.
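Detection of replay and message modification, two of the active attacks described above, can be sketched with a message authentication code and a sequence number. This is one common technique, chosen here for illustration and not prescribed by the text; the names `make_tag` and `Receiver` and the demo key are assumptions. The messages reuse the John Smith and Fred Brown example.

```python
import hashlib
import hmac


def make_tag(key: bytes, seq: int, message: bytes) -> bytes:
    """Compute a MAC over the sequence number and the message together."""
    data = seq.to_bytes(8, "big") + message
    return hmac.new(key, data, hashlib.sha256).digest()


class Receiver:
    """Accepts each message once, in order, and only if its tag verifies."""

    def __init__(self, key: bytes):
        self.key = key
        self.next_seq = 0

    def accept(self, seq: int, message: bytes, tag: bytes) -> bool:
        expected = make_tag(self.key, seq, message)
        if not hmac.compare_digest(expected, tag):
            return False  # message or sequence number was altered or forged
        if seq != self.next_seq:
            return False  # replayed, duplicated, or reordered message
        self.next_seq += 1
        return True


key = b"demo-shared-secret"  # illustrative only; real keys must be random
original = b"Allow John Smith to read confidential file accounts"
altered = b"Allow Fred Brown to read confidential file accounts"

receiver = Receiver(key)
tag = make_tag(key, 0, original)

first_delivery = receiver.accept(0, original, tag)   # legitimate message
replayed = receiver.accept(0, original, tag)         # captured and resent
modified = receiver.accept(1, altered, make_tag(key, 1, original))
```

The sketch only detects these attacks; it does nothing to prevent an opponent from attempting them, which matches the emphasis on detection and recovery for active attacks.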
X.800 defines a security service as a service that is provided by a protocol layer of
communicating open systems and that ensures adequate security of the systems or
of data transfers. Perhaps a clearer definition is found in RFC 4949, which provides
the following definition: a processing or communication service that is provided by
a system to give a specific kind of protection to system resources; security services
implement security policies and are implemented by security mechanisms.
X.800 divides these services into five categories and fourteen specific services
(Table 1.2). We look at each category in turn.5
The authentication service is concerned with assuring that a communication is au-
thentic. In the case of a single message, such as a warning or alarm signal, the function
of the authentication service is to assure the recipient that the message is from the
source that it claims to be from. In the case of an ongoing interaction, such as the con-
nection of a terminal to a host, two aspects are involved. First, at the time of connec-
tion initiation, the service assures that the two entities are authentic, that is, that each
is the entity that it claims to be. Second, the service must assure that the connection is
not interfered with in such a way that a third party can masquerade as one of the two
legitimate parties for the purposes of unauthorized transmission or reception.
Two specific authentication services are defined in X.800:
■ Peer entity authentication: Provides for the corroboration of the identity of a
peer entity in an association. Two entities are considered peers if they implement
the same protocol in different systems; for example, two TCP modules
in two communicating systems. Peer entity authentication is provided for
use at the establishment of, or at times during the data transfer phase of, a
connection. It attempts to provide confidence that an entity is not performing
either a masquerade or an unauthorized replay of a previous connection.
■ Data origin authentication: Provides for the corroboration of the source of a
data unit. It does not provide protection against the duplication or modification
of data units. This type of service supports applications like electronic mail,
where there are no prior interactions between the communicating entities.
5There is no universal agreement about many of the terms used in the security literature. For example, the
term integrity is sometimes used to refer to all aspects of information security. The term authentication is
sometimes used to refer both to verification of identity and to the various functions listed under integrity
in this chapter. Our usage here agrees with both X.800 and RFC 4949.

Table 1.2 Security Services (X.800)
AUTHENTICATION
The assurance that the communicating entity is the
one that it claims to be.
Peer Entity Authentication
Used in association with a logical connection to
provide confidence in the identity of the entities
connected.
Data-Origin Authentication
In a connectionless transfer, provides assurance that
the source of received data is as claimed.
ACCESS CONTROL
The prevention of unauthorized use of a resource
(i.e., this service controls who can have access to a
resource, under what conditions access can occur,
and what those accessing the resource are allowed
to do).
DATA CONFIDENTIALITY
The protection of data from unauthorized
disclosure.
Connection Confidentiality
The protection of all user data on a connection.
Connectionless Confidentiality
The protection of all user data in a single data block.
Selective-Field Confidentiality
The confidentiality of selected fields within the user
data on a connection or in a single data block.
Traffic-Flow Confidentiality
The protection of the information that might be
derived from observation of traffic flows.
DATA INTEGRITY
The assurance that data received are exactly as
sent by an authorized entity (i.e., contain no modification,
insertion, deletion, or replay).
Connection Integrity with Recovery
Provides for the integrity of all user data on a connection
and detects any modification, insertion, deletion,
or replay of any data within an entire data sequence,
with recovery attempted.
Connection Integrity without Recovery
As above, but provides only detection without
recovery.
Selective-Field Connection Integrity
Provides for the integrity of selected fields within the
user data of a data block transferred over a connection
and takes the form of determination of whether
the selected fields have been modified, inserted,
deleted, or replayed.
Connectionless Integrity
Provides for the integrity of a single connectionless
data block and may take the form of detection of
data modification. Additionally, a limited form of
replay detection may be provided.
Selective-Field Connectionless Integrity
Provides for the integrity of selected fields within a
single connectionless data block; takes the form of
determination of whether the selected fields have
been modified.
NONREPUDIATION
Provides protection against denial by one of the
entities involved in a communication of having participated
in all or part of the communication.
Nonrepudiation, Origin
Proof that the message was sent by the specified
party.
Nonrepudiation, Destination
Proof that the message was received by the specified
party.

Access Control
In the context of network security, access control is the ability to limit and control
the access to host systems and applications via communications links. To achieve
this, each entity trying to gain access must first be identified, or authenticated,
so that access rights can be tailored to the individual.
Data Confidentiality
Confidentiality is the protection of transmitted data from passive attacks. With re-
spect to the content of a data transmission, several levels of protection can be iden-
tified. The broadest service protects all user data transmitted between two users
over a period of time. For example, when a TCP connection is set up between two
systems, this broad protection prevents the release of any user data transmitted over
the TCP connection. Narrower forms of this service can also be defined, including
the protection of a single message or even specific fields within a message. These
refinements are less useful than the broad approach and may even be more complex
and expensive to implement.
The other aspect of confidentiality is the protection of traffic flow from
analysis. This requires that an attacker not be able to observe the source and destination,
frequency, length, or other characteristics of the traffic on a communications facility.
Data Integrity
As with confidentiality, integrity can apply to a stream of messages, a single mes-
sage, or selected fields within a message. Again, the most useful and straightforward
approach is total stream protection.
A connection-oriented integrity service, one that deals with a stream of mes-
sages, assures that messages are received as sent with no duplication, insertion,
modification, reordering, or replays. The destruction of data is also covered under
this service. Thus, the connection-oriented integrity service addresses both mes-
sage stream modification and denial of service. On the other hand, a connection-
less integrity service, one that deals with individual messages without regard to any
larger context, generally provides protection against message modification only.
We can make a distinction between service with and without recovery. Because
the integrity service relates to active attacks, we are concerned with detection rather
than prevention. If a violation of integrity is detected, then the service may simply
report this violation, and some other portion of software or human intervention is
required to recover from the violation. Alternatively, there are mechanisms avail-
able to recover from the loss of integrity of data, as we will review subsequently. The
incorporation of automated recovery mechanisms is, in general, the more attractive alternative.
Nonrepudiation
Nonrepudiation prevents either sender or receiver from denying a transmitted mes-
sage. Thus, when a message is sent, the receiver can prove that the alleged sender in
fact sent the message. Similarly, when a message is received, the sender can prove
that the alleged receiver in fact received the message.

Availability Service
Both X.800 and RFC 4949 define availability to be the property of a system or a
system resource being accessible and usable upon demand by an authorized system
entity, according to performance specifications for the system (i.e., a system is avail-
able if it provides services according to the system design whenever users request
them). A variety of attacks can result in the loss of or reduction in availability. Some
of these attacks are amenable to automated countermeasures, such as authentica-
tion and encryption, whereas others require some sort of physical action to prevent
or recover from loss of availability of elements of a distributed system.
X.800 treats availability as a property to be associated with various security
services. However, it makes sense to call out specifically an availability service. An
availability service is one that protects a system to ensure its availability. This ser-
vice addresses the security concerns raised by denial-of-service attacks. It depends
on proper management and control of system resources and thus depends on access
control service and other security services.
Table 1.3 lists the security mechanisms defined in X.800. The mechanisms are
divided into those that are implemented in a specific protocol layer, such as TCP or
an application-layer protocol, and those that are not specific to any particular protocol
layer or security service. These mechanisms will be covered in the appropriate
places in the book. So we do not elaborate now, except to comment on the definition
of encipherment. X.800 distinguishes between reversible encipherment mechanisms
and irreversible encipherment mechanisms. A reversible encipherment
mechanism is simply an encryption algorithm that allows data to be encrypted and
subsequently decrypted. Irreversible encipherment mechanisms include hash algorithms
and message authentication codes, which are used in digital signature and
message authentication applications.
Table 1.3 Security Mechanisms (X.800)
SPECIFIC SECURITY MECHANISMS
May be incorporated into the appropriate protocol
layer in order to provide some of the OSI security
services.
Encipherment
The use of mathematical algorithms to transform
data into a form that is not readily intelligible. The
transformation and subsequent recovery of the data
depend on an algorithm and zero or more encryption
keys.
Digital Signature
Data appended to, or a cryptographic transformation
of, a data unit that allows a recipient of the data unit
to prove the source and integrity of the data unit and
protect against forgery (e.g., by the recipient).
Access Control
A variety of mechanisms that enforce access rights to
resources.
Data Integrity
A variety of mechanisms used to assure the integrity
of a data unit or stream of data units.
Authentication Exchange
A mechanism intended to ensure the identity of an
entity by means of information exchange.
Traffic Padding
The insertion of bits into gaps in a data stream to
frustrate traffic analysis attempts.
Routing Control
Enables selection of particular physically secure
routes for certain data and allows routing changes,
especially when a breach of security is suspected.
Notarization
The use of a trusted third party to assure certain
properties of a data exchange.
PERVASIVE SECURITY MECHANISMS
Mechanisms that are not specific to any particular
OSI security service or protocol layer.
Trusted Functionality
That which is perceived to be correct with respect
to some criteria (e.g., as established by a security
policy).
Security Label
The marking bound to a resource (which may be a
data unit) that names or designates the security attributes
of that resource.
Event Detection
Detection of security-relevant events.
Security Audit Trail
Data collected and potentially used to facilitate a
security audit, which is an independent review and
examination of system records and activities.
Security Recovery
Deals with requests from mechanisms, such as event
handling and management functions, and takes
recovery actions.
Table 1.4, based on one in X.800, indicates the relationship between security
services and security mechanisms.
Table 1.4 Relationship Between Security Services and Mechanisms (services listed: peer entity authentication, data origin authentication, access control, traffic flow confidentiality, data integrity)
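The distinction X.800 draws between reversible and irreversible encipherment can be illustrated with the Python standard library. This is a toy sketch under stated assumptions: the reversible step uses a random XOR pad purely because it needs no third-party package, and it is not a production cipher; the message and the key `demo-key` are invented for illustration. The irreversible steps use a hash function and a message authentication code, as named in the text.

```python
import hashlib
import hmac
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR each data byte with the matching pad byte (toy reversible step)."""
    return bytes(d ^ p for d, p in zip(data, pad))


message = b"patient allergy: penicillin"

# Reversible encipherment: the same key material recovers the plaintext.
pad = secrets.token_bytes(len(message))
ciphertext = xor_bytes(message, pad)
recovered = xor_bytes(ciphertext, pad)

# Irreversible encipherment: a fixed-size value from which the message
# cannot be recovered, used to check integrity or authenticity.
digest = hashlib.sha256(message).hexdigest()
mac = hmac.new(b"demo-key", message, hashlib.sha256).hexdigest()
```

The reversible operation round-trips the data, whereas `digest` and `mac` have the same length no matter how long the message is, which is why they serve verification rather than recovery.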

Despite years of research and development, it has not been possible to develop
security design and implementation techniques that systematically exclude security
flaws and prevent all unauthorized actions. In the absence of such foolproof tech-
niques, it is useful to have a set of widely agreed design principles that can guide
the development of protection mechanisms. The National Centers of Academic
Excellence in Information Assurance/Cyber Defense, which is jointly sponsored by
the U.S. National Security Agency and the U.S. Department of Homeland Security,
list the following as fundamental security design principles [NCAE13]:
■ Economy of mechanism
■ Fail-safe defaults
■ Complete mediation
■ Open design
■ Separation of privilege
■ Least privilege
■ Least common mechanism
■ Psychological acceptability
■ Isolation
■ Encapsulation
■ Modularity
■ Layering
■ Least astonishment
The first eight listed principles were first proposed in [SALT75] and have withstood
the test of time. In this section, we briefly discuss each principle.
Economy of mechanism means that the design of security measures embod-
ied in both hardware and software should be as simple and small as possible.
The motivation for this principle is that a relatively simple, small design is easier
to test and verify thoroughly. With a complex design, there are many more
opportunities for an adversary to discover subtle weaknesses to exploit that may
be difficult to spot ahead of time. The more complex the mechanism, the more
likely it is to possess exploitable flaws. Simple mechanisms tend to have fewer
exploitable flaws and require less maintenance. Further, because configuration
management issues are simplified, updating or replacing a simple mechanism
becomes a less intensive process. In practice, this is perhaps the most difficult
principle to honor. There is a constant demand for new features in both hard-
ware and software, complicating the security design task. The best that can be
done is to keep this principle in mind during system design to try to eliminate
unnecessary complexity.
Fail-safe defaults means that access decisions should be based on permission
rather than exclusion. That is, the default situation is lack of access, and the protec-
tion scheme identifies conditions under which access is permitted. This approach

exhibits a better failure mode than the alternative approach, where the default is
to permit access. A design or implementation mistake in a mechanism that gives
explicit permission tends to fail by refusing permission, a safe situation that can
be quickly detected. On the other hand, a design or implementation mistake in a
mechanism that explicitly excludes access tends to fail by allowing access, a failure
that may long go unnoticed in normal use. Most file access systems and virtually all
protected services on client/server systems use fail-safe defaults.
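A permission-based check with a default of no access might look like the following. This is a minimal sketch, assuming a hypothetical in-memory table `PERMISSIONS` and invented user and resource names; it is not drawn from any particular system.

```python
# Explicit grants only: (user, resource) -> set of permitted actions.
PERMISSIONS = {
    ("alice", "grades.db"): {"read"},
    ("registrar", "grades.db"): {"read", "write"},
}


def is_allowed(user: str, resource: str, action: str) -> bool:
    """Permit an action only if it was explicitly granted.

    A missing entry yields an empty set, so anything not listed is refused.
    """
    return action in PERMISSIONS.get((user, resource), set())
```

If an entry is accidentally omitted from the table, the affected user is refused and will report the problem quickly, which is the safe failure mode the principle describes. A table of exclusions would fail the other way, silently allowing access.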
Complete mediation means that every access must be checked against the
access control mechanism. Systems should not rely on access decisions retrieved
from a cache. In a system designed to operate continuously, this principle requires
that, if access decisions are remembered for future use, careful consideration be
given to how changes in authority are propagated into such local memories. File
access systems appear to provide an example of a system that complies with this
principle. However, typically, once a user has opened a file, no check is made to see
if permissions change. To fully implement complete mediation, every time a user
reads a field or record in a file, or a data item in a database, the system must exercise
access control. This resource-intensive approach is rarely used.
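The difference between checking permission once at open time and checking on every read can be sketched as follows. The classes `AccessPolicy` and `MediatedFile` are hypothetical and hold everything in memory; they illustrate the principle and do not model how any real file system behaves.

```python
class AccessPolicy:
    """Hypothetical policy: the set of users currently allowed to read."""

    def __init__(self):
        self._readers = set()

    def grant(self, user):
        self._readers.add(user)

    def revoke(self, user):
        self._readers.discard(user)

    def may_read(self, user):
        return user in self._readers


class MediatedFile:
    """Consults the policy on open and again on every read."""

    def __init__(self, policy, user, records):
        self._policy = policy
        self._user = user
        self._records = list(records)
        self._check()  # check at open time

    def _check(self):
        if not self._policy.may_read(self._user):
            raise PermissionError(f"{self._user} may not read this file")

    def read(self, index):
        self._check()  # check again on each access, never a cached decision
        return self._records[index]


policy = AccessPolicy()
policy.grant("nurse")
allergy_file = MediatedFile(policy, "nurse", ["allergy: penicillin"])
first_read = allergy_file.read(0)
policy.revoke("nurse")  # a later read by "nurse" now raises PermissionError
```

The extra policy lookup on every read is the cost the text refers to when it calls this approach resource-intensive.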
Open design means that the design of a security mechanism should be open
rather than secret. For example, although encryption keys must be secret, encryption
algorithms should be open to public scrutiny. The algorithms can then be reviewed
by many experts, and users can therefore have high confidence in them. This is the
philosophy behind the National Institute of Standards and Technology (NIST)
program of standardizing encryption and hash algorithms, and has led to the wide-
spread adoption of NIST-approved algorithms.
Separation of privilege is defined in [SALT75] as a practice in which mul-
tiple privilege attributes are required to achieve access to a restricted resource.
A good example of this is multifactor user authentication, which requires the use of
multiple techniques, such as a password and a smart card, to authorize a user. The
term is also now applied to any technique in which a program is divided into parts
that are limited to the specific privileges they require in order to perform a specific
task. This is used to mitigate the potential damage of a computer security attack.
One example of this latter interpretation of the principle is removing high privilege
operations to another process and running that process with the higher privileges
required to perform its tasks. Day-to-day interfaces are executed in a lower privi-
leged process.
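The multifactor reading of separation of privilege can be sketched as a check that requires two independent factors to succeed. This is an illustrative assumption-laden example: the function names, the salt, and the fixed `expected_card_response` value are hypothetical, and a real smart card would answer a fresh challenge rather than return a constant.

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2 (iteration count is illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def authenticate(password, card_response, *, salt, stored_hash,
                 expected_card_response):
    """Grant access only when both independent factors verify."""
    password_ok = hmac.compare_digest(hash_password(password, salt), stored_hash)
    card_ok = hmac.compare_digest(card_response, expected_card_response)
    return password_ok and card_ok


# Hypothetical enrollment data for one user.
salt = b"demo-salt-123456"
stored_hash = hash_password("correct horse", salt)
expected_card_response = b"card-token-42"

both_factors = authenticate(
    "correct horse", b"card-token-42",
    salt=salt, stored_hash=stored_hash,
    expected_card_response=expected_card_response,
)
```

Neither factor alone is sufficient, so compromising a password without the card, or the card without the password, does not yield access.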
Least privilege means that every process and every user of the system should
operate using the least set of privileges necessary to perform the task. A good
example of the use of this principle is role-based access control. The system security
policy can identify and define the various roles of users or processes. Each role is
assigned only those permissions needed to perform its functions. Each permission
specifies a permitted access to a particular resource (such as read and write access
to a specified file or directory, connect access to a given host and port). Unless a
permission is granted explicitly, the user or process should not be able to access the
protected resource. More generally, any access control system should allow each
user only the privileges that are authorized for that user. There is also a temporal
aspect to the least privilege principle. For example, system programs or administra-
tors who have special privileges should have those privileges only when necessary;

when they are doing ordinary activities the privileges should be withdrawn. Leaving
them in place just opens the door to accidents.
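The rule that access is denied unless a permission is granted explicitly can be sketched as a small role-based check. This is only an illustration: the role names, resources, and the `is_allowed` helper are hypothetical and do not come from any particular access control product.

```python
# Hypothetical roles and permissions, for illustration only.
# Each permission is an (action, resource) pair granted explicitly to a role.
ROLE_PERMISSIONS = {
    "auditor": {("read", "/var/log/app.log")},
    "operator": {("read", "/var/log/app.log"), ("write", "/etc/app.conf")},
}


def is_allowed(role: str, action: str, resource: str) -> bool:
    """Return True only if the permission was granted explicitly to the role."""
    # Unknown roles map to an empty permission set, so the default is to deny.
    return (action, resource) in ROLE_PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(is_allowed("auditor", "read", "/var/log/app.log"))
    print(is_allowed("auditor", "write", "/etc/app.conf"))
```

The temporal aspect of the principle is not modeled here; a fuller sketch would also attach a lifetime to each grant so that special privileges lapse when they are no longer needed.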
Least common mechanism means that the design should minimize the func-
tions shared by different users, providing mutual security. This principle helps
reduce the number of unintended communication paths and reduces the amount of
hardware and software on which all users depend, thus making it easier to verify if
there are any undesirable security implications.
Psychological acceptability implies that the security mechanisms should not
interfere unduly with the work of users, while at the same time meeting the needs of
those who authorize access. If security mechanisms hinder the usability or accessibil-
ity of resources, then users may opt to turn off those mechanisms. Where possible,
security mechanisms should be transparent to the users of the system or at most
introduce minimal obstruction. In addition to not being intrusive or burdensome,
security procedures must reflect the user’s mental model of protection. If the protec-
tion procedures do not make sense to the user or if the user must translate his image
of protection into a substantially different protocol, the user is likely to make errors.
Isolation is a principle that applies in three contexts. First, public access sys-
tems should be isolated from critical resources (data, processes, etc.) to prevent dis-
closure or tampering. In cases where the sensitivity or criticality of the information
is high, organizations may want to limit the number of systems on which that data is
stored and isolate them, either physically or logically. Physical isolation may include
ensuring that no physical connection exists between an organization’s public access
information resources and an organization’s critical information. When implement-
ing logical isolation solutions, layers of security services and mechanisms should be
established between public systems and secure systems responsible for protecting
critical resources. Second, the processes and files of individual users should be iso-
lated from one another except where it is explicitly desired. All modern operating
systems provide facilities for such isolation, so that individual users have separate,
isolated process space, memory space, and file space, with protections for prevent-
ing unauthorized access. And finally, security mechanisms should be isolated in the
sense of preventing access to those mechanisms. For example, logical access control
may provide a means of isolating cryptographic software from other parts of the
host system and for protecting cryptographic software from tampering and the keys
from replacement or disclosure.
Encapsulation can be viewed as a specific form of isolation based on object-
oriented functionality. Protection is provided by encapsulating a collection of pro-
cedures and data objects in a domain of its own so that the internal structure of a
data object is accessible only to the procedures of the protected subsystem, and the
procedures may be called only at designated domain entry points.
Modularity in the context of security refers both to the development of security
functions as separate, protected modules and to the use of a modular architecture for
mechanism design and implementation. With respect to the use of separate security
modules, the design goal here is to provide common security functions and services,
such as cryptographic functions, as common modules. For example, numerous proto-
cols and applications make use of cryptographic functions. Rather than implement-
ing such functions in each protocol or application, a more secure design is provided
by developing a common cryptographic module that can be invoked by numerous

protocols and applications. The design and implementation effort can then focus on
the secure design and implementation of a single cryptographic module and includ-
ing mechanisms to protect the module from tampering. With respect to the use of a
modular architecture, each security mechanism should be able to support migration
to new technology or upgrade of new features without requiring an entire system
redesign. The security design should be modular so that individual parts of the secu-
rity design can be upgraded without the requirement to modify the entire system.
Layering refers to the use of multiple, overlapping protection approaches
addressing the people, technology, and operational aspects of information systems.
By using multiple, overlapping protection approaches, the failure or circumven-
tion of any individual protection approach will not leave the system unprotected.
We will see throughout this book that a layering approach is often used to provide
multiple barriers between an adversary and protected information or services. This
technique is often referred to as defense in depth.
Least astonishment means that a program or user interface should always
respond in the way that is least likely to astonish the user. For example, the mechanism
for authorization should be transparent enough to a user that the user has a good intui-
tive understanding of how the security goals map to the provided security mechanism.
In Section 1.3, we provided an overview of the spectrum of security threats and
attacks facing computer and network systems. Section 22.1 goes into more detail
about the nature of attacks and the types of adversaries that present security threats.
In this section, we elaborate on two concepts that are useful in evaluating and clas-
sifying threats: attack surfaces and attack trees.
Attack Surfaces
An attack surface consists of the reachable and exploitable vulnerabilities in a sys-
tem [MANA11, HOWA03]. Examples of attack surfaces are the following:
■ Open ports on outward facing Web and other servers, and code listening on
those ports
■ Services available on the inside of a firewall
■ Code that processes incoming data, email, XML, office documents, and indus-
try-specific custom data exchange formats
■ Interfaces, SQL, and Web forms
■ An employee with access to sensitive information vulnerable to a social
engineering attack
Attack surfaces can be categorized as follows:
■ Network attack surface: This category refers to vulnerabilities over an enterprise
network, wide-area network, or the Internet. Included in this category are net-
work protocol vulnerabilities, such as those used for a denial-of-service attack,
disruption of communications links, and various forms of intruder attacks.

■ Software attack surface: This refers to vulnerabilities in application, utility,
or operating system code. A particular focus in this category is Web server
software.
■ Human attack surface: This category refers to vulnerabilities created by
personnel or outsiders, such as social engineering, human error, and trusted
insiders.
An attack surface analysis is a useful technique for assessing the scale and
severity of threats to a system. A systematic analysis of points of vulnerability
makes developers and security analysts aware of where security mechanisms are
required. Once an attack surface is defined, designers may be able to find ways to
make the surface smaller, thus making the task of the adversary more difficult. The
attack surface also provides guidance on setting priorities for testing, strengthening
security measures, and modifying the service or application.
As illustrated in Figure 1.3, the use of layering, or defense in depth, and attack
surface reduction complement each other in mitigating security risk.
Attack Trees
An attack tree is a branching, hierarchical data structure that represents a set of poten-
tial techniques for exploiting security vulnerabilities [MAUW05, MOOR01, SCHN99].
The security incident that is the goal of the attack is represented as the root node of
the tree, and the ways that an attacker could reach that goal are iteratively and incre-
mentally represented as branches and subnodes of the tree. Each subnode defines a
subgoal, and each subgoal may have its own set of further subgoals, and so on. The
final nodes on the paths outward from the root, that is, the leaf nodes, represent differ-
ent ways to initiate an attack. Each node other than a leaf is either an AND-node or an
OR-node. To achieve the goal represented by an AND-node, the subgoals represented
by all of that node’s subnodes must be achieved; and for an OR-node, at least one of
the subgoals must be achieved. Branches can be labeled with values representing dif-
ficulty, cost, or other attack attributes, so that alternative attacks can be compared.
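The AND/OR semantics described above can be sketched in a few lines of Python. The `Node` class, the `min_cost` function, and the example tree with its cost values are illustrative assumptions; the sketch assumes each leaf carries a single numeric attacker cost and asks for the cheapest way to reach the root goal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of an attack tree; a node of kind "LEAF" has no children."""
    label: str
    kind: str = "LEAF"      # "AND", "OR", or "LEAF"
    cost: float = 0.0       # attacker cost, used for leaves only
    children: List["Node"] = field(default_factory=list)


def min_cost(node: Node) -> float:
    """Cheapest way for an attacker to achieve the goal at this node."""
    if node.kind == "LEAF":
        return node.cost
    child_costs = [min_cost(child) for child in node.children]
    if node.kind == "AND":
        return sum(child_costs)     # every subgoal must be achieved
    if node.kind == "OR":
        return min(child_costs)     # at least one subgoal must be achieved
    raise ValueError(f"unknown node kind: {node.kind}")


# Hypothetical tree with made-up costs, for illustration only.
example_tree = Node("open safe", "OR", children=[
    Node("pick lock", cost=30),
    Node("obtain combination and reach safe", "AND", children=[
        Node("bribe employee", cost=20),
        Node("enter building", cost=5),
    ]),
])

if __name__ == "__main__":
    print(min_cost(example_tree))
```

Other attributes, such as difficulty or probability of detection, can be compared in the same way by changing how AND-nodes and OR-nodes combine the values of their subnodes.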
Figure 1.3 Defense in Depth and Attack Surface (security risk as a function of attack surface, small to large)

The motivation for the use of attack trees is to effectively exploit the infor-
mation available on attack patterns. Organizations such as CERT publish security
advisories that have enabled the development of a body of knowledge about both
general attack strategies and specific attack patterns. Security analysts can use the
attack tree to document security attacks in a structured form that reveals key vul-
nerabilities. The attack tree can guide both the design of systems and applications,
and the choice and strength of countermeasures.
Figure 1.4 An Attack Tree for Internet Banking Authentication (root node: bank account compromise; leaf nodes are labeled UT/U, CC, or IBS)
Figure 1.4, based on a figure in [DIMI07], is an example of an attack tree
analysis for an Internet banking authentication application. The root of the tree is
the objective of the attacker, which is to compromise a user’s account. The shaded
boxes on the tree are the leaf nodes, which represent events that comprise the
attacks. Note that in this tree, all the nodes other than leaf nodes are OR-nodes.
The analysis to generate this tree considered the three components involved in
authentication:

■ User terminal and user (UT/U): These attacks target the user equipment,
including the tokens that may be involved, such as smartcards or other pass-
word generators, as well as the actions of the user.
■ Communications channel (CC): This type of attack focuses on communica-
tion links.
■ Internet banking server (IBS): These types of attacks are offline attacks against
the servers that host the Internet banking application.
Five overall attack strategies can be identified, each of which exploits one or
more of the three components. The five strategies are as follows:
■ User credential compromise: This strategy can be used against many ele-
ments of the attack surface. There are procedural attacks, such as monitoring
a user’s action to observe a PIN or other credential, or theft of the user’s
token or handwritten notes. An adversary may also compromise token
information using a variety of token attack tools, such as hacking the smart-
card or using a brute force approach to guess the PIN. Another possible
strategy is to embed malicious software to compromise the user’s login and
password. An adversary may also attempt to obtain credential information
via the communication channel (sniffing). Finally, an adversary may use
various means to engage in communication with the target user, as shown
in Figure 1.4.
■ Injection of commands: In this type of attack, the attacker is able to intercept
communication between the UT and the IBS. Various schemes can be used
to impersonate the valid user and so gain access to the banking system.
■ User credential guessing: It is reported in [HILT06] that brute force attacks
against some banking authentication schemes are feasible by sending ran-
dom usernames and passwords. The attack mechanism is based on distributed
zombie personal computers, hosting automated programs for username- or
password-based calculation.
■ Security policy violation: For example, by violating the bank’s security policy
in combination with weak access control and logging mechanisms, an
employee may cause an internal security incident and expose a customer’s
account.
■ Use of known authenticated session: This type of attack persuades or forces
the user to connect to the IBS with a preset session ID. Once the user authen-
ticates to the server, the attacker may utilize the known session ID to send
packets to the IBS, spoofing the user’s identity.
Figure 1.4 provides a thorough view of the different types of attacks on an
Internet banking authentication application. Using this tree as a starting point,
security analysts can assess the risk of each attack and, using the design principles
outlined in the preceding section, design a comprehensive security facility. [DIMI07]
provides a good account of the results of this design effort.

A model for much of what we will be discussing is captured, in very general terms, in
Figure 1.5. A message is to be transferred from one party to another across some sort
of Internet service. The two parties, who are the principals in this transaction, must
cooperate for the exchange to take place. A logical information channel is established
by defining a route through the Internet from source to destination and by the coop-
erative use of communication protocols (e.g., TCP/IP) by the two principals.
Security aspects come into play when it is necessary or desirable to protect the
information transmission from an opponent who may present a threat to confidentiality,
authenticity, and so on. All the techniques for providing security have two components:
■ A security-related transformation on the information to be sent. Examples
include the encryption of the message, which scrambles the message so that it
is unreadable by the opponent, and the addition of a code based on the con-
tents of the message, which can be used to verify the identity of the sender.
■ Some secret information shared by the two principals and, it is hoped,
unknown to the opponent. An example is an encryption key used in conjunc-
tion with the transformation to scramble the message before transmission
and unscramble it on reception.6
6 Part Three discusses a form of encryption, known as asymmetric encryption, in which only one of the two
principals needs to have the secret information.
Figure 1.5 Model for Network Security (includes a trusted third party, e.g., an arbiter or distributor of secret information)
A trusted third party may be needed to achieve secure transmission. For
example, a third party may be responsible for distributing the secret information

to the two principals while keeping it from any opponent. Or a third party may be
needed to arbitrate disputes between the two principals concerning the authenticity
of a message transmission.
This general model shows that there are four basic tasks in designing a par-
ticular security service:
1. Design an algorithm for performing the security-related transformation. The
algorithm should be such that an opponent cannot defeat its purpose.
2. Generate the secret information to be used with the algorithm.
3. Develop methods for the distribution and sharing of the secret information.
4. Specify a protocol to be used by the two principals that makes use of the
security algorithm and the secret information to achieve a particular security
service.
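As a rough illustration of how the four tasks fit together, the following sketch uses a message authentication code as the security-related transformation. The choice of HMAC with SHA-256 from the Python standard library is an assumption made for this example, not something the model prescribes, and the function names are hypothetical. Task 3 (distributing the key) is assumed to happen out of band and is not shown.

```python
import hashlib
import hmac
import secrets

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def generate_key() -> bytes:
    """Task 2: generate the secret information shared by the two principals."""
    return secrets.token_bytes(32)


def protect(key: bytes, message: bytes) -> bytes:
    """Task 1: the transformation; append a code computed from the message."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag


def verify(key: bytes, received: bytes) -> bytes:
    """Task 4: receiver's side of the protocol; raises if the code does not match."""
    if len(received) < TAG_LEN:
        raise ValueError("message too short")
    message, tag = received[:-TAG_LEN], received[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return message


if __name__ == "__main__":
    key = generate_key()            # both principals are assumed to hold this key
    wire = protect(key, b"transfer 100")
    print(verify(key, wire))
```

This transformation provides authentication only; the message itself still travels in the clear, so confidentiality would require encryption as well.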
Parts One through Five of this book concentrate on the types of security
mechanisms and services that fit into the model shown in Figure 1.5. However,
there are other security-related situations of interest that do not neatly fit this
model but are considered in this book. A general model of these other situations
is illustrated in Figure 1.6, which reflects a concern for protecting an information
system from unwanted access. Most readers are familiar with the concerns caused
by the existence of hackers, who attempt to penetrate systems that can be accessed
over a network. The hacker can be someone who, with no malign intent, simply gets
satisfaction from breaking and entering a computer system. The intruder can be a
disgruntled employee who wishes to do damage or a criminal who seeks to exploit
computer assets for financial gain (e.g., obtaining credit card numbers or perform-
ing illegal money transfers).
Another type of unwanted access is the placement in a computer system of
logic that exploits vulnerabilities in the system and that can affect application pro-
grams as well as utility programs, such as editors and compilers. Programs can pres-
ent two kinds of threats:
■ Information access threats: Intercept or modify data on behalf of users who
should not have access to that data.
■ Service threats: Exploit service flaws in computers to inhibit use by legitimate
users.
Figure 1.6 Network Access Security Model (labels: access channel; information system; computing resources (processor, memory, I/O); internal security controls; threats: human (e.g., hacker), software (e.g., virus, worm))

Viruses and worms are two examples of software attacks. Such attacks can be
introduced into a system by means of a disk that contains the unwanted logic con-
cealed in otherwise useful software. They can also be inserted into a system across a
network; this latter mechanism is of more concern in network security.
The security mechanisms needed to cope with unwanted access fall into two
broad categories (see Figure 1.6). The first category might be termed a gatekeeper
function. It includes password-based login procedures that are designed to deny
access to all but authorized users and screening logic that is designed to detect and
reject worms, viruses, and other similar attacks. Once either an unwanted user
or unwanted software gains access, the second line of defense consists of a vari-
ety of internal controls that monitor activity and analyze stored information in an
attempt to detect the presence of unwanted intruders. These issues are explored
in Part Six.
Many of the security techniques and applications described in this book have been
specified as standards. Additionally, standards have been developed to cover man-
agement practices and the overall architecture of security mechanisms and services.
Throughout this book, we describe the most important standards in use or that are
being developed for various aspects of cryptography and network security. Various
organizations have been involved in the development or promotion of these
standards. The most important (in the current context) of these organizations are as
follows:
■ National Institute of Standards and Technology: NIST is a U.S. federal agency
that deals with measurement science, standards, and technology related to
U.S. government use and to the promotion of U.S. private-sector innovation.
Despite its national scope, NIST Federal Information Processing Standards
(FIPS) and Special Publications (SP) have a worldwide impact.
■ Internet Society: ISOC is a professional membership society with world-
wide organizational and individual membership. It provides leadership in
addressing issues that confront the future of the Internet and is the organiza-
tion home for the groups responsible for Internet infrastructure standards,
including the Internet Engineering Task Force (IETF) and the Internet
Architecture Board (IAB). These organizations develop Internet stan-
dards and related specifications, all of which are published as Requests for
Comments (RFCs).
■ ITU-T: The International Telecommunication Union (ITU) is an interna-
tional organization within the United Nations System in which governments
and the private sector coordinate global telecom networks and services. The
ITU Telecommunication Standardization Sector (ITU-T) is one of the three
sectors of the ITU. ITU-T’s mission is the development of technical standards
covering all fields of telecommunications. ITU-T standards are referred to as
Recommendations.

■ ISO: The International Organization for Standardization (ISO)7 is a world-
wide federation of national standards bodies from more than 140 countries,
one from each country. ISO is a nongovernmental organization that promotes
the development of standardization and related activities with a view to fa-
cilitating the international exchange of goods and services and to developing
cooperation in the spheres of intellectual, scientific, technological, and eco-
nomic activity. ISO’s work results in international agreements that are pub-
lished as International Standards.
A more detailed discussion of these organizations is contained in Appendix D.
7 ISO is not an acronym (in which case it would be IOS), but it is a word, derived from the Greek,
meaning equal.
Key Terms
access control
active attack
data confidentiality
data integrity
denial of service
OSI security architecture
passive attack
security attacks
security mechanisms
security services
traffic analysis
Review Questions
1.1 What is the OSI security architecture?
1.2 List and briefly define the three key objectives of computer security.
1.3 List and briefly define categories of passive and active security attacks.
1.4 List and briefly define categories of security services.
1.5 List and briefly define categories of security mechanisms.
1.6 List and briefly define the fundamental security design principles.
1.7 Explain the difference between an attack surface and an attack tree.
1.1 Consider an automated cash deposit machine in which users provide a card or an ac-
count number to deposit cash. Give examples of confidentiality, integrity, and avail-
ability requirements associated with the system, and, in each case, indicate the degree
of importance of the requirement.
1.2 Repeat Problem 1.1 for a payment gateway system where a user pays for an item
using their account via the payment gateway.

1.3 Consider a financial report publishing system used to produce reports for various
a. Give an example of a type of publication in which confidentiality of the stored
data is the most important requirement.
b. Give an example of a type of publication in which data integrity is the most im-
portant requirement.
c. Give an example in which system availability is the most important requirement.
1.4 For each of the following assets, assign a low, moderate, or high impact level for the
loss of confidentiality, availability, and integrity, respectively. Justify your answers.
a. A student maintaining a blog to post public information.
b. An examination section of a university that is managing sensitive information
about exam papers.
c. An information system in a pathological laboratory maintaining the patient’s data.
d. A student information system used for maintaining student data in a university
that contains both personal, academic information and routine administrative in-
formation (not privacy related). Assess the impact for the two data sets separately
and the information system as a whole.
e. A University library contains a library management system which controls the
distribution of books amongst the students of various departments. The library
management system contains both the student data and the book data. Assess the
impact for the two data sets separately and the information system as a whole.
1.5 Draw a matrix similar to Table 1.4 that shows the relationship between security ser-
vices and attacks.
1.6 Draw a matrix similar to Table 1.4 that shows the relationship between security
mechanisms and attacks.
1.7 Develop an attack tree for gaining access to the contents of a physical safe.
1.8 Consider a company whose operations are housed in two buildings on the same prop-
erty; one building is headquarters, the other building contains network and computer
services. The property is physically protected by a fence around the perimeter, and
the only entrance to the property is through this fenced perimeter. In addition to
the perimeter fence, physical security consists of a guarded front gate. The local net-
works are split between the Headquarters’ LAN and the Network Services’ LAN.
Internet users connect to the Web server through a firewall. Dial-up users get access
to a particular server on the Network Services’ LAN. Develop an attack tree in which
the root node represents disclosure of proprietary secrets. Include physical, social
engineering, and technical attacks. The tree may contain both AND and OR nodes.
Develop a tree that has at least 15 leaf nodes.
1.9 Read all of the classic papers cited in the Recommended Reading section for this
chapter, available at the Author Web site. Compose a 500–1000 word paper (or 8–12
slide PowerPoint presentation) that summarizes the key concepts that emerge from
these papers, emphasizing concepts that are common to most or all of the papers.

2.1 Divisibility and The Division Algorithm
The Division Algorithm
2.2 The Euclidean Algorithm
Greatest Common Divisor
Finding the Greatest Common Divisor
2.3 Modular Arithmetic
The Modulus
Properties of Congruences
Modular Arithmetic Operations
Properties of Modular Arithmetic
Euclidean Algorithm Revisited
The Extended Euclidean Algorithm
2.4 Prime Numbers
2.5 Fermat’s and Euler’s Theorems
Fermat’s Theorem
Euler’s Totient Function
Euler’s Theorem
2.6 Testing for Primality
Miller–Rabin Algorithm
A Deterministic Primality Algorithm
Distribution of Primes
2.7 The Chinese Remainder Theorem
2.8 Discrete Logarithms
The Powers of an Integer, Modulo n
Logarithms for Modular Arithmetic
Calculation of Discrete Logarithms
2.9 Key Terms, Review Questions, and Problems
Appendix 2A The Meaning of Mod
Introduction to Number Theory

Number theory is pervasive in cryptographic algorithms. This chapter provides
sufficient breadth and depth of coverage of relevant number theory topics for under-
standing the wide range of applications in cryptography. The reader familiar with these
topics can safely skip this chapter.
The first three sections introduce basic concepts from number theory that are
needed for understanding finite fields; these include divisibility, the Euclidean
algorithm, and modular arithmetic. The reader may study these sections now or wait until
ready to tackle Chapter 5 on finite fields.
Sections 2.4 through 2.8 discuss aspects of number theory related to prime num-
bers and discrete logarithms. These topics are fundamental to the design of asymmetric
(public-key) cryptographic algorithms. The reader may study these sections now or
wait until ready to read Part Three.
The concepts and techniques of number theory are quite abstract, and it is often
difficult to grasp them intuitively without examples. Accordingly, this chapter includes
a number of examples, each of which is highlighted in a shaded box.
After studying this chapter, you should be able to:
◆ Understand the concept of divisibility and the division algorithm.
◆ Understand how to use the Euclidean algorithm to find the greatest common
divisor.
◆ Present an overview of the concepts of modular arithmetic.
◆ Explain the operation of the extended Euclidean algorithm.
◆ Discuss key concepts relating to prime numbers.
◆ Understand Fermat’s theorem.
◆ Understand Euler’s theorem.
◆ Define Euler’s totient function.
◆ Make a presentation on the topic of testing for primality.
◆ Explain the Chinese remainder theorem.
◆ Define discrete logarithms.
We say that a nonzero b divides a if a = mb for some m, where a, b, and m are
integers. That is, b divides a if there is no remainder on division. The notation b | a
is commonly used to mean b divides a. Also, if b | a, we say that b is a divisor of a.
The positive divisors of 24 are 1, 2, 3, 4, 6, 8, 12, and 24.
13 | 182; -5 | 30; 17 | 289; -3 | 33; 17 | 0
Subsequently, we will need some simple properties of divisibility for integers,
which are as follows:
■ If a | 1, then a = ±1.
■ If a | b and b | a, then a = ±b.
■ Any b ≠ 0 divides 0.
■ If a | b and b | c, then a | c:
11 | 66 and 66 | 198 ⇒ 11 | 198
■ If b | g and b | h, then b | (mg + nh) for arbitrary integers m and n.
To see this last point, note that
■ If b | g, then g is of the form g = b × g1 for some integer g1.
■ If b | h, then h is of the form h = b × h1 for some integer h1.
So
mg + nh = mbg1 + nbh1 = b × (mg1 + nh1)
and therefore b divides mg + nh.
b = 7; g = 14; h = 63; m = 3; n = 2
7 | 14 and 7 | 63.
To show 7 | (3 × 14 + 2 × 63),
we have (3 × 14 + 2 × 63) = 7(3 × 2 + 2 × 9),
and it is obvious that 7 | (7(3 × 2 + 2 × 9)).
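The divisibility relation and the linear-combination property can be checked numerically. The sketch below is only a spot check on the numbers used in this section, not a proof; the helper name `divides` is an assumption introduced for the example.

```python
def divides(b: int, a: int) -> bool:
    """Return True if nonzero b divides a, that is, a = m*b for some integer m."""
    if b == 0:
        raise ValueError("b must be nonzero")
    return a % b == 0


# Positive divisors of 24, found by testing every candidate from 1 to 24.
divisors_of_24 = [d for d in range(1, 25) if divides(d, 24)]

# If b | g and b | h, then b | (m*g + n*h) for arbitrary integers m and n.
b, g, h = 7, 14, 63
linear_combinations_ok = all(
    divides(b, m * g + n * h)
    for m in range(-5, 6)
    for n in range(-5, 6)
)

if __name__ == "__main__":
    print(divisors_of_24)
    print(linear_combinations_ok)
```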
The Division Algorithm
Given any positive integer n and any integer a, if we divide a by n,
we get an integer quotient q and an integer remainder r that obey the following
relationship:
a = qn + r    0 ≤ r < n; q = ⌊a/n⌋    (2.1)
where ⌊x⌋ is the largest integer less than or equal to x. Equation (2.1) is referred to
as the division algorithm.1
1 Equation (2.1) expresses a theorem rather than an algorithm, but by tradition, this is referred to as the
division algorithm.

Figure 2.1a demonstrates that, given a and positive n, it is always possible to
find q and r that satisfy the preceding relationship. Represent the integers on the
number line; a will fall somewhere on that line (positive a is shown, a similar
demonstration can be made for negative a). Starting at 0, proceed to n, 2n, up to qn, such
that qn ≤ a and (q + 1)n > a. The distance from qn to a is r, and we have found
the unique values of q and r. The remainder r is often referred to as a residue.
a = 11; n = 7; 11 = 1 × 7 + 4; r = 4; q = 1
a = -11; n = 7; -11 = (-2) × 7 + 3; r = 3; q = -2
Figure 2.1b provides another example.
Figure 2.1 The Relationship a = qn + r; 0 ≤ r < n: (a) General relationship; (b) Example: 70 = (4 × 15) + 10
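The quotient and remainder of Equation (2.1) can be computed directly. The sketch below relies on the fact that Python's `//` operator rounds toward negative infinity, which matches the floor in q = ⌊a/n⌋; the function name `division_algorithm` is an assumption introduced for the example.

```python
from typing import Tuple


def division_algorithm(a: int, n: int) -> Tuple[int, int]:
    """Return (q, r) with a = q*n + r and 0 <= r < n, for any integer a and positive n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    q = a // n          # floor division, so q is the largest integer <= a/n
    r = a - q * n       # the remainder, or residue
    return q, r


if __name__ == "__main__":
    for a, n in [(11, 7), (-11, 7), (70, 15)]:
        q, r = division_algorithm(a, n)
        print(f"{a} = ({q}) * {n} + {r}")
```

In languages whose integer division truncates toward zero, the negative case needs an adjustment to keep the remainder in the range 0 ≤ r < n.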
One of the basic techniques of number theory is the Euclidean algorithm, which
is a simple procedure for determining the greatest common divisor of two positive
integers. First, we need a simple definition: Two integers are relatively prime if and
only if their only common positive integer factor is 1.
Greatest Common Divisor
Recall that nonzero b is defined to be a divisor of a if a = mb for some m, where
a, b, and m are integers. We will use the notation gcd(a, b) to mean the greatest
common divisor of a and b. The greatest common divisor of a and b is the largest
integer that divides both a and b. We also define gcd(0, 0) = 0.

More formally, the positive integer c is said to be the greatest common divisor
of a and b if
1. c is a divisor of a and of b.
2. any divisor of a and b is a divisor of c.
An equivalent definition is the following:
gcd(a, b) = max[k, such that k � a and k �b]
Because we require that the greatest common divisor be positive, gcd(a, b) =
gcd(a, -b) = gcd(-a, b) = gcd(-a, -b). In general, gcd(a, b) = gcd( � a � , �b � ).
gcd(60, 24) = gcd(60, -24) = 12
8 and 15 are relatively prime because the positive divisors of 8 are 1, 2, 4, and 8, and
the positive divisors of 15 are 1, 3, 5, and 15. So 1 is the only integer on both lists.
Also, because all nonzero integers divide 0, we have gcd(a, 0) = |a|.
We stated that two integers a and b are relatively prime if and only if their
only common positive integer factor is 1. This is equivalent to saying that a and b are
relatively prime if gcd(a, b) = 1.
Finding the Greatest Common Divisor
We now describe an algorithm credited to Euclid for easily finding the greatest
common divisor of two integers (Figure 2.2). This algorithm has broad significance
in cryptography. The explanation of the algorithm can be broken down into the fol-
lowing points:
1. Suppose we wish to determine the greatest common divisor d of the integers
a and b; that is, determine d = gcd(a, b). Because gcd(|a|, |b|) = gcd(a, b),
there is no harm in assuming a ≥ b > 0.
2. Dividing a by b and applying the division algorithm, we can state:
a = q1b + r1    0 ≤ r1 < b    (2.2)
3. First consider the case in which r1 = 0. Therefore b divides a and clearly no
larger number divides both b and a, because that number would be larger
than b. So we have d = gcd(a, b) = b.
4. The other possibility from Equation (2.2) is r1 ≠ 0. For this case, we can state
that d | r1. This is due to the basic properties of divisibility: the relations d | a
and d | b together imply that d | (a – q1b), which is the same as d | r1.
5. Before proceeding with the Euclidean algorithm, we need to answer the question:
What is gcd(b, r1)? We know that d | b and d | r1. Now take any arbitrary
integer c that divides both b and r1. Therefore, c | (q1b + r1) = a. Because
c divides both a and b, we must have c ≤ d, which is the greatest common
divisor of a and b. Therefore d = gcd(b, r1).

Let us now return to Equation (2.2) and assume that r1 ≠ 0. Because b > r1,
we can divide b by r1 and apply the division algorithm to obtain:
b = q2r1 + r2    0 ≤ r2 < r1
As before, if r2 = 0, then d = r1 and if r2 ≠ 0, then d = gcd(r1, r2). Note that the
remainders form a descending series of nonnegative values and so must terminate
when the remainder is zero. This happens, say, at the (n + 1)th stage where rn – 1 is
divided by rn. The result is the following system of equations:

a = q1b + r1                 0 < r1 < b
b = q2r1 + r2                0 < r2 < r1
r1 = q3r2 + r3               0 < r3 < r2
⋮
rn – 2 = qnrn – 1 + rn       0 < rn < rn – 1
rn – 1 = qn + 1rn + 0
d = gcd(a, b) = rn           (2.3)
At each iteration, we have d = gcd(ri, ri+ 1) until finally d = gcd(rn, 0) = rn.
Thus, we can find the greatest common divisor of two integers by repetitive appli-
cation of the division algorithm. This scheme is known as the Euclidean algorithm.
Figure 2.3 illustrates a simple example.
We have essentially argued from the top down that the final result is the
gcd(a, b). We can also argue from the bottom up. The first step is to show that rn
divides a and b. It follows from the last division in Equation (2.3) that rn divides
rn – 1. The next to last division shows that rn divides rn – 2 because it divides both
terms on the right. Successively, one sees that rn divides all the ri's and finally a and b.
It remains to show that rn is the largest divisor that divides a and b. If we take any
arbitrary integer c that divides a and b, it must also divide r1, as explained previously.
We can follow the sequence of equations in Equation (2.3) down and show that c
must divide all the ri's. Therefore c must divide rn, so that rn = gcd(a, b).
Figure 2.2 Euclidean Algorithm (flowchart: test a > b and, if not, swap a and b;
divide a by b, calling the remainder r; while r > 0, replace a with b, replace b
with r, and divide again; the GCD is the final value of b)
Figure 2.3 Euclidean Algorithm Example: gcd(710, 310)
710 = 2 × 310 + 90
310 = 3 × 90 + 40
90 = 2 × 40 + 10
40 = 4 × 10
(each successive pair has the same GCD)
Let us now look at an example with relatively large numbers to see the power
of this algorithm:
To find d = gcd(a, b) = gcd(1160718174, 316258250)
a = q1b + r1 1160718174 = 3 * 316258250 + 211943424 d = gcd(316258250, 211943424)
b = q2r1 + r2 316258250 = 1 * 211943424 + 104314826 d = gcd(211943424, 104314826)
r1 = q3r2 + r3 211943424 = 2 * 104314826 + 3313772 d = gcd(104314826, 3313772)
r2 = q4r3 + r4 104314826 = 31 * 3313772 + 1587894 d = gcd(3313772, 1587894)
r3 = q5r4 + r5 3313772 = 2 * 1587894 + 137984 d = gcd(1587894, 137984)
r4 = q6r5 + r6 1587894 = 11 * 137984 + 70070 d = gcd(137984, 70070)
r5 = q7r6 + r7 137984 = 1 * 70070 + 67914 d = gcd(70070, 67914)
r6 = q8r7 + r8 70070 = 1 * 67914 + 2156 d = gcd(67914, 2156)
r7 = q9r8 + r9 67914 = 31 * 2156 + 1078 d = gcd(2156, 1078)
r8 = q10r9 + r10 2156 = 2 * 1078 + 0 d = gcd(1078, 0) = 1078
Therefore, d = gcd(1160718174, 316258250) = 1078
In this example, we begin by dividing 1160718174 by 316258250, which gives 3
with a remainder of 211943424. Next we take 316258250 and divide it by 211943424.
The process continues until we get a remainder of 0, yielding a result of 1078.
It will be helpful in what follows to recast the above computation in tabular
form. For every step of the iteration, we have ri- 2 = qiri- 1 + ri, where ri- 2 is the
dividend, ri- 1 is the divisor, qi is the quotient, and ri is the remainder. Table 2.1 sum-
marizes the results.
Dividend Divisor Quotient Remainder
a = 1160718174 b = 316258250 q1 = 3 r1 = 211943424
b = 316258250 r1 = 211943424 q2 = 1 r2 = 104314826
r1 = 211943424 r2 = 104314826 q3 = 2 r3 = 3313772
r2 = 104314826 r3 = 3313772 q4 = 31 r4 = 1587894
r3 = 3313772 r4 = 1587894 q5 = 2 r5 = 137984
r4 = 1587894 r5 = 137984 q6 = 11 r6 = 70070
r5 = 137984 r6 = 70070 q7 = 1 r7 = 67914
r6 = 70070 r7 = 67914 q8 = 1 r8 = 2156
r7 = 67914 r8 = 2156 q9 = 31 r9 = 1078
r8 = 2156 r9 = 1078 q10 = 2 r10 = 0
Table 2.1 Euclidean Algorithm Example
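The rows of Table 2.1 can be regenerated mechanically. The sketch below is one possible Python rendering of the repeated-division scheme of Equation (2.3); the name euclid_trace and the row layout (dividend, divisor, quotient, remainder) are illustrative assumptions, and the inputs are assumed to satisfy a ≥ b > 0 as in the text.

```python
def euclid_trace(a: int, b: int) -> tuple[int, list[tuple[int, int, int, int]]]:
    """Return gcd(a, b) and one (dividend, divisor, quotient, remainder) row per step.

    Assumes a >= b > 0, as in the derivation of Equation (2.3).
    """
    if not (a >= b > 0):
        raise ValueError("expected a >= b > 0")
    rows = []
    while b != 0:
        q, r = divmod(a, b)        # a = q*b + r with 0 <= r < b
        rows.append((a, b, q, r))
        a, b = b, r                # gcd(a, b) = gcd(b, r)
    return a, rows                 # the last nonzero remainder is the gcd


d, rows = euclid_trace(1160718174, 316258250)
for dividend, divisor, quotient, remainder in rows:
    print(f"{dividend} = {quotient} * {divisor} + {remainder}")
print("gcd =", d)  # gcd = 1078
```

The same function applied to (710, 310) walks through the four divisions of Figure 2.3.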

The Modulus
If a is an integer and n is a positive integer, we define a mod n to be the remainder
when a is divided by n. The integer n is called the modulus. Thus, for any integer a,
we can rewrite Equation (2.1) as follows:
a = qn + r    0 ≤ r < n; q = ⌊a/n⌋
a = ⌊a/n⌋ * n + (a mod n)
11 mod 7 = 4; -11 mod 7 = 3
Two integers a and b are said to be congruent modulo n, if (a mod n) =
(b mod n). This is written as a ≡ b (mod n).2
73 ≡ 4 (mod 23); 21 ≡ -9 (mod 10)
2 We have just used the operator mod in two different ways: first as a binary operator that produces a
remainder, as in the expression a mod b; second as a congruence relation that shows the equivalence of two
integers, as in the expression a ≡ b (mod n). See Appendix 2A for a discussion.
Note that if a ≡ 0 (mod n), then n | a.
Properties of Congruences
Congruences have the following properties:
1. a ≡ b (mod n) if n | (a – b).
2. a ≡ b (mod n) implies b ≡ a (mod n).
3. a ≡ b (mod n) and b ≡ c (mod n) imply a ≡ c (mod n).
To demonstrate the first point, if n | (a – b), then (a – b) = kn for some k.
So we can write a = b + kn. Therefore, (a mod n) = (remainder when b +
kn is divided by n) = (remainder when b is divided by n) = (b mod n).
23 ≡ 8 (mod 5) because 23 – 8 = 15 = 5 * 3
-11 ≡ 5 (mod 8) because -11 – 5 = -16 = 8 * (-2)
81 ≡ 0 (mod 27) because 81 – 0 = 81 = 27 * 3
The remaining points are as easily proved.

Modular Arithmetic Operations
Note that, by definition (Figure 2.1), the (mod n) operator maps all integers into
the set of integers {0, 1, …, (n – 1)}. This suggests the question: Can we perform
arithmetic operations within the confines of this set? It turns out that we can; this
technique is known as modular arithmetic.
Modular arithmetic exhibits the following properties:
1. [(a mod n) + (b mod n)] mod n = (a + b) mod n
2. [(a mod n) – (b mod n)] mod n = (a – b) mod n
3. [(a mod n) * (b mod n)] mod n = (a * b) mod n
We demonstrate the first property. Define (a mod n) = ra and (b mod n) = rb.
Then we can write a = ra + jn for some integer j and b = rb + kn for some integer k.
(a + b) mod n = (ra + jn + rb + kn) mod n
= (ra + rb + (k + j)n) mod n
= (ra + rb) mod n
= [(a mod n) + (b mod n)] mod n
The remaining properties are proven as easily. Here are examples of the three
properties:
11 mod 8 = 3; 15 mod 8 = 7
[(11 mod 8) + (15 mod 8)] mod 8 = 10 mod 8 = 2
(11 + 15) mod 8 = 26 mod 8 = 2
[(11 mod 8) – (15 mod 8)] mod 8 = -4 mod 8 = 4
(11 – 15) mod 8 = -4 mod 8 = 4
[(11 mod 8) * (15 mod 8)] mod 8 = 21 mod 8 = 5
(11 * 15) mod 8 = 165 mod 8 = 5
Exponentiation is performed by repeated multiplication, as in ordinary
arithmetic. To find 11^7 mod 13, we can proceed as follows:
11^2 = 121 ≡ 4 (mod 13)
11^4 = (11^2)^2 ≡ 4^2 ≡ 3 (mod 13)
11^7 = 11 * 11^2 * 11^4
11^7 ≡ 11 * 4 * 3 ≡ 132 ≡ 2 (mod 13)
Thus, the rules for ordinary arithmetic involving addition, subtraction, and
multiplication carry over into modular arithmetic.
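The worked exponentiation example reduces after every multiplication, which keeps intermediate values small. A minimal Python sketch of that idea, using repeated squaring, follows; the function name mod_pow is an illustrative choice, and Python's built-in three-argument pow performs the same computation.

```python
def mod_pow(base: int, exponent: int, modulus: int) -> int:
    """Compute (base ** exponent) mod modulus, reducing after each multiplication."""
    if exponent < 0 or modulus <= 0:
        raise ValueError("expected exponent >= 0 and modulus > 0")
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # this power of two appears in the exponent
            result = (result * base) % modulus
        base = (base * base) % modulus        # base, base^2, base^4, ...
        exponent >>= 1
    return result


print(mod_pow(11, 7, 13))  # 2
```

For the exponent 7 = 1 + 2 + 4, the loop multiplies together 11, 11^2, and 11^4 modulo 13, which is the decomposition used in the example.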

Table 2.2 provides an illustration of modular addition and multiplication
modulo 8. Looking at addition, the results are straightforward, and there is a reg-
ular pattern to the matrix. Both matrices are symmetric about the main diagonal
in conformance to the commutative property of addition and multiplication. As in
ordinary addition, there is an additive inverse, or negative, to each integer in modu-
lar arithmetic. In this case, the negative of an integer x is the integer y such that
(x + y) mod 8 = 0. To find the additive inverse of an integer in the left-hand col-
umn, scan across the corresponding row of the matrix to find the value 0; the integer
at the top of that column is the additive inverse; thus, (2 + 6) mod 8 = 0. Similarly,
the entries in the multiplication table are straightforward. In modular arithmetic mod
8, the multiplicative inverse of x is the integer y such that (x * y) mod 8 = 1 mod 8.
Now, to find the multiplicative inverse of an integer from the multiplication table,
scan across the matrix in the row for that integer to find the value 1; the integer at
the top of that column is the multiplicative inverse; thus, (3 * 3) mod 8 = 1. Note
that not all integers mod 8 have a multiplicative inverse; more about that later.
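The table-scanning procedure just described can be imitated with a brute-force search. The following Python sketch is only an illustration (the function names are assumptions); it searches Z8 for additive and multiplicative inverses and reports None where no multiplicative inverse exists.

```python
from typing import Optional


def additive_inverse(w: int, n: int) -> int:
    """Return the y in Zn with (w + y) mod n == 0."""
    return (-w) % n


def multiplicative_inverse(w: int, n: int) -> Optional[int]:
    """Return the y in Zn with (w * y) mod n == 1, or None if there is none."""
    for y in range(n):                 # scan the row for w, as in the table
        if (w * y) % n == 1 % n:
            return y
    return None


n = 8
for w in range(n):
    print(w, additive_inverse(w, n), multiplicative_inverse(w, n))
```

For n = 8 the search finds inverses only for 1, 3, 5, and 7, each of which is its own inverse.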
Properties of Modular Arithmetic
Define the set Zn as the set of nonnegative integers less than n:
Zn = {0, 1, …, (n – 1)}
Table 2.2 Arithmetic Modulo 8
+ 0 1 2 3 4 5 6 7
0 0 1 2 3 4 5 6 7
1 1 2 3 4 5 6 7 0
2 2 3 4 5 6 7 0 1
3 3 4 5 6 7 0 1 2
4 4 5 6 7 0 1 2 3
5 5 6 7 0 1 2 3 4
6 6 7 0 1 2 3 4 5
7 7 0 1 2 3 4 5 6
(a) Addition modulo 8
* 0 1 2 3 4 5 6 7
0 0 0 0 0 0 0 0 0
1 0 1 2 3 4 5 6 7
2 0 2 4 6 0 2 4 6
3 0 3 6 1 4 7 2 5
4 0 4 0 4 0 4 0 4
5 0 5 2 7 4 1 6 3
6 0 6 4 2 0 6 4 2
7 0 7 6 5 4 3 2 1
(b) Multiplication modulo 8
w  -w  w^-1
0 0 —
1 7 1
2 6 —
3 5 3
4 4 —
5 3 5
6 2 —
7 1 7
(c) Additive and multiplicative
inverse modulo 8

This is referred to as the set of residues, or residue classes (mod n). To be more precise,
each integer in Zn represents a residue class. We can label the residue classes
(mod n) as [0], [1], [2], …, [n – 1], where
[r] = {a: a is an integer, a ≡ r (mod n)}
The residue classes (mod 4) are
[0] = {…, -16, -12, -8, -4, 0, 4, 8, 12, 16, …}
[1] = {…, -15, -11, -7, -3, 1, 5, 9, 13, 17, …}
[2] = {…, -14, -10, -6, -2, 2, 6, 10, 14, 18, …}
[3] = {…, -13, -9, -5, -1, 3, 7, 11, 15, 19, …}
Property                 Expression
Commutative Laws         (w + x) mod n = (x + w) mod n
                         (w * x) mod n = (x * w) mod n
Associative Laws         [(w + x) + y] mod n = [w + (x + y)] mod n
                         [(w * x) * y] mod n = [w * (x * y)] mod n
Distributive Law         [w * (x + y)] mod n = [(w * x) + (w * y)] mod n
Identities               (0 + w) mod n = w mod n
                         (1 * w) mod n = w mod n
Additive Inverse (-w)    For each w ∈ Zn, there exists a z such that w + z ≡ 0 mod n
Table 2.3 Properties of Modular Arithmetic for Integers in Zn
Of all the integers in a residue class, the smallest nonnegative integer is the
one used to represent the residue class. Finding the smallest nonnegative integer to
which k is congruent modulo n is called reducing k modulo n.
If we perform modular arithmetic within Zn, the properties shown in Table 2.3
hold for integers in Zn. We show in the next section that this implies that Zn is a
commutative ring with a multiplicative identity element.
There is one peculiarity of modular arithmetic that sets it apart from ordinary
arithmetic. First, observe that (as in ordinary arithmetic) we can write the following:
if (a + b) ≡ (a + c) (mod n) then b ≡ c (mod n) (2.4)
(5 + 23) ≡ (5 + 7) (mod 8); 23 ≡ 7 (mod 8)
Equation (2.4) is consistent with the existence of an additive inverse. Adding
the additive inverse of a to both sides of Equation (2.4), we have
((-a) + a + b) ≡ ((-a) + a + c) (mod n)
b ≡ c (mod n)

However, the following statement is true only with the attached condition:
if (a * b) ≡ (a * c) (mod n) then b ≡ c (mod n) if a is relatively prime to n (2.5)
Recall that two integers are relatively prime if their only common positive integer
factor is 1. Similar to the case of Equation (2.4), we can say that Equation (2.5) is
consistent with the existence of a multiplicative inverse. Applying the multiplicative
inverse of a to both sides of Equation (2.5), we have
((a^-1)ab) ≡ ((a^-1)ac) (mod n)
b ≡ c (mod n)
To see this, consider an example in which the condition of Equation (2.5) does not
hold. The integers 6 and 8 are not relatively prime, since they have the common
factor 2. We have the following:
6 * 3 = 18 ≡ 2 (mod 8)
6 * 7 = 42 ≡ 2 (mod 8)
Yet 3 ≢ 7 (mod 8).
The reason for this strange result is that for any general modulus n, a multi-
plier a that is applied in turn to the integers 0 through (n – 1) will fail to produce a
complete set of residues if a and n have any factors in common.
With a = 6 and n = 8,
Z8 0 1 2 3 4 5 6 7
Multiply by 6 0 6 12 18 24 30 36 42
Residues 0 6 4 2 0 6 4 2
Because we do not have a complete set of residues when multiplying by
6, more than one integer in Z8 maps into the same residue. Specifically,
6 * 0 mod 8 = 6 * 4 mod 8; 6 * 1 mod 8 = 6 * 5 mod 8; and so on. Because
this is a many-to-one mapping, there is not a unique inverse to the multiply
operation.
However, if we take a = 5 and n = 8, whose only common factor is 1,
Z8 0 1 2 3 4 5 6 7
Multiply by 5 0 5 10 15 20 25 30 35
Residues 0 5 2 7 4 1 6 3
The line of residues contains all the integers in Z8, in a different order.

In general, an integer has a multiplicative inverse in Zn if and only if that inte-
ger is relatively prime to n. Table 2.2c shows that the integers 1, 3, 5, and 7 have a
multiplicative inverse in Z8; but 2, 4, and 6 do not.
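The link between a complete set of residues and relative primality can be checked directly. The Python sketch below is illustrative only (the function name is an assumption); it multiplies every element of Zn by a and compares the outcome with the gcd condition stated above.

```python
from math import gcd


def residues_under_multiplication(a: int, n: int) -> list[int]:
    """Return [(a * x) mod n for x = 0, 1, ..., n - 1]."""
    return [(a * x) % n for x in range(n)]


print(residues_under_multiplication(6, 8))  # [0, 6, 4, 2, 0, 6, 4, 2]
print(residues_under_multiplication(5, 8))  # [0, 5, 2, 7, 4, 1, 6, 3]

# The multiplier a yields every residue exactly once precisely when gcd(a, n) = 1.
n = 8
for a in range(1, n):
    complete = sorted(residues_under_multiplication(a, n)) == list(range(n))
    assert complete == (gcd(a, n) == 1)
```

The final loop confirms, for n = 8, that the multipliers giving a complete residue set are exactly 1, 3, 5, and 7.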
Euclidean Algorithm Revisited
The Euclidean algorithm can be based on the following theorem: For any integers
a, b, with a ≥ b ≥ 0,
gcd(a, b) = gcd(b, a mod b) (2.6)
gcd(55, 22) = gcd(22, 55 mod 22) = gcd(22, 11) = 11
gcd(18, 12) = gcd(12, 6) = gcd(6, 0) = 6
gcd(11, 10) = gcd(10, 1) = gcd(1, 0) = 1
To see that Equation (2.6) works, let d = gcd(a, b). Then, by the definition of
gcd, d | a and d | b. For any positive integer b, we can express a as
a = kb + r ≡ r (mod b)
a mod b = r
with k, r integers. Therefore, (a mod b) = a – kb for some integer k. But because
d | b, it also divides kb. We also have d | a. Therefore, d | (a mod b). This shows that
d is a common divisor of b and (a mod b). Conversely, if d is a common divisor of b
and (a mod b), then d | kb and thus d | [kb + (a mod b)], which is equivalent to d | a.
Thus, the set of common divisors of a and b is equal to the set of common divisors
of b and (a mod b). Therefore, the gcd of one pair is the same as the gcd of the other
pair, proving the theorem.
Equation (2.6) can be used repetitively to determine the greatest common divisor.
This is the same scheme shown in Equation (2.3), which can be rewritten in
the following way.
Euclidean Algorithm
Calculate Which satisfies
r1 = a mod b a = q1b + r1
r2 = b mod r1 b = q2r1 + r2
r3 = r1 mod r2 r1 = q3r2 + r3
rn = rn – 2 mod rn – 1 rn – 2 = qnrn – 1 + rn
rn + 1 = rn – 1 mod rn = 0 rn – 1 = qn + 1rn + 0
d = gcd(a, b) = rn
We can define the Euclidean algorithm concisely as the following recursive
function.
Euclid(a, b)
   if (b = 0) then return a;
   else return Euclid(b, a mod b);
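The recursive definition translates almost line for line into Python. This is a minimal sketch under the assumption a ≥ b ≥ 0 from Equation (2.6); the lowercase name euclid is an illustrative choice.

```python
def euclid(a: int, b: int) -> int:
    """Recursive Euclidean algorithm for integers a >= b >= 0."""
    if b == 0:
        return a
    return euclid(b, a % b)   # gcd(a, b) = gcd(b, a mod b), Equation (2.6)


print(euclid(55, 22))  # 11
print(euclid(18, 12))  # 6
print(euclid(11, 10))  # 1
```

Each call replaces the pair (a, b) with (b, a mod b), so the second argument strictly decreases until it reaches 0.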
The Extended Euclidean Algorithm
We now proceed to look at an extension to the Euclidean algorithm that will be
important for later computations in the area of finite fields and in encryption algo-
rithms, such as RSA. For given integers a and b, the extended Euclidean algorithm
not only calculates the greatest common divisor d but also two additional integers x
and y that satisfy the following equation.
ax + by = d = gcd(a, b) (2.7)
It should be clear that x and y will have opposite signs. Before examining the
algorithm, let us look at some of the values of x and y when a = 42 and b = 30.
Note that gcd(42, 30) = 6. Here is a partial table of values3 for 42x + 30y.
y \ x  -3 -2 -1 0 1 2 3
-3 -216 -174 -132 -90 -48 -6 36
-2 -186 -144 -102 -60 -18 24 66
-1 -156 -114 -72 -30 12 54 96
0 -126 -84 -42 0 42 84 126
1 -96 -54 -12 30 72 114 156
2 -66 -24 18 60 102 144 186
3 -36 6 48 90 132 174 216
Observe that all of the entries are divisible by 6. This is not surpris-
ing, because both 42 and 30 are divisible by 6, so every number of the form
42x + 30y = 6(7x + 5y) is a multiple of 6. Note also that gcd(42, 30) = 6 appears
in the table. In general, it can be shown that for given integers a and b, the smallest
positive value of ax + by is equal to gcd(a, b).
Now let us show how to extend the Euclidean algorithm to determine (x, y, d)
given a and b. We again go through the sequence of divisions indicated in Equation
(2.3), and we assume that at each step i we can find integers xi and yi that satisfy
ri = axi + byi. We end up with the following sequence.
a = q1b + r1 r1 = ax1 + by1
b = q2r1 + r2 r2 = ax2 + by2
r1 = q3r2 + r3 r3 = ax3 + by3
⋮ ⋮
rn – 2 = qnrn – 1 + rn rn = axn + byn
rn – 1 = qn + 1rn + 0
3This example is taken from [SILV06].

Now, observe that we can rearrange terms to write
ri = ri- 2 – ri- 1qi (2.8)
Also, in rows i – 1 and i – 2, we find the values
ri- 2 = axi- 2 + byi- 2 and ri- 1 = axi- 1 + byi- 1
Substituting into Equation (2.8), we have
ri = (axi- 2 + byi- 2) – (axi- 1 + byi- 1)qi
= a(xi- 2 – qixi- 1) + b(yi- 2 – qiyi- 1)
But we have already assumed that ri = axi + byi. Therefore,
xi = xi- 2 – qixi- 1 and yi = yi- 2 – qiyi- 1
We now summarize the calculations:
Extended Euclidean Algorithm
Calculate                     Which satisfies               Calculate                      Which satisfies
r-1 = a                                                     x-1 = 1; y-1 = 0               a = ax-1 + by-1
r0 = b                                                      x0 = 0; y0 = 1                 b = ax0 + by0
r1 = a mod b                  a = q1b + r1                  x1 = x-1 – q1x0 = 1            r1 = ax1 + by1
q1 = ⌊a/b⌋                                                  y1 = y-1 – q1y0 = -q1
r2 = b mod r1                 b = q2r1 + r2                 x2 = x0 – q2x1                 r2 = ax2 + by2
q2 = ⌊b/r1⌋                                                 y2 = y0 – q2y1
r3 = r1 mod r2                r1 = q3r2 + r3                x3 = x1 – q3x2                 r3 = ax3 + by3
q3 = ⌊r1/r2⌋                                                y3 = y1 – q3y2
⋮
rn = rn – 2 mod rn – 1        rn – 2 = qnrn – 1 + rn        xn = xn – 2 – qnxn – 1         rn = axn + byn
qn = ⌊rn – 2/rn – 1⌋                                        yn = yn – 2 – qnyn – 1
rn + 1 = rn – 1 mod rn = 0    rn – 1 = qn + 1rn + 0         d = gcd(a, b) = rn
qn + 1 = ⌊rn – 1/rn⌋                                        x = xn; y = yn
We need to make several additional comments here. In each row, we calculate
a new remainder ri based on the remainders of the previous two rows, namely ri- 1
and ri- 2. To start the algorithm, we need values for r-1 and r0, which are just a and b.
It is then straightforward to determine the required values for x-1, y-1, x0, and y0.
We know from the original Euclidean algorithm that the process ends
with a remainder of zero and that the greatest common divisor of a and b is
d = gcd(a, b) = rn. But we also have determined that d = rn = axn + byn.
Therefore, in Equation (2.7), x = xn and y = yn.
As an example, let us use a = 1759 and b = 550 and solve for
1759x + 550y = gcd(1759, 550). The results are shown in Table 2.4. Thus, we have
1759 * (-111) + 550 * 355 = -195249 + 195250 = 1.
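The recurrences xi = xi-2 – qixi-1 and yi = yi-2 – qiyi-1 can be carried along with the remainders in a single loop. The following Python sketch is one way to do that, assuming a ≥ b > 0; the function and variable names are illustrative and not taken from the text.

```python
def extended_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with a*x + b*y == d == gcd(a, b), assuming a >= b > 0."""
    r_prev, r_curr = a, b      # r_{-1} = a, r_0 = b
    x_prev, x_curr = 1, 0      # x_{-1} = 1, x_0 = 0
    y_prev, y_curr = 0, 1      # y_{-1} = 0, y_0 = 1
    while r_curr != 0:
        q = r_prev // r_curr
        # Apply the same update r_i = r_{i-2} - q_i * r_{i-1} to r, x, and y.
        r_prev, r_curr = r_curr, r_prev - q * r_curr
        x_prev, x_curr = x_curr, x_prev - q * x_curr
        y_prev, y_curr = y_curr, y_prev - q * y_curr
    return r_prev, x_prev, y_prev


d, x, y = extended_euclid(1759, 550)
print(d, x, y)                # 1 -111 355
print(1759 * x + 550 * y)     # 1
```

Because d = 1 in this example, reducing 550 * 355 modulo 1759 gives 1, so y = 355 also serves as the multiplicative inverse of 550 modulo 1759.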

A central concern of number theory is the study of prime numbers. Indeed, whole
books have been written on the subject (e.g., [CRAN01], [RIBE96]). In this section,
we provide an overview relevant to the concerns of this book.
An integer p > 1 is a prime number if and only if its only divisors5 are ±1 and
±p. Prime numbers play a critical role in number theory and in the techniques discussed
in this chapter. Table 2.5 shows the primes less than 2000. Note the way the
primes are distributed. In particular, note the number of primes in each range of
100 numbers.
Any integer a > 1 can be factored in a unique way as
a = p1^a1 * p2^a2 * ⋯ * pt^at (2.9)
where p1 < p2 < … < pt are prime numbers and where each ai is a positive integer.
This is known as the fundamental theorem of arithmetic; a proof can be found
in any text on number theory.
4In this section, unless otherwise noted, we deal only with the nonnegative integers. The use of negative
integers would introduce no essential differences.
5Recall from Section 2.1 that integer a is said to be a divisor of integer b if there is no remainder on
division. Equivalently, we say that a divides b.
i     ri      qi     xi      yi
-1    1759           1       0
0     550            0       1
1     109     3      1       -3
2     5       5      -5      16
3     4       21     106     -339
4     1       1      -111    355
5     0       4
Result: d = 1; x = -111; y = 355
Table 2.4 Extended Euclidean Algorithm Example
91 = 7 * 13
3600 = 2^4 * 3^2 * 5^2
11011 = 7 * 11^2 * 13
It is useful for what follows to express this another way. If P is the set of
all prime numbers, then any positive integer a can be written uniquely in the
following form:
a = ∏ p^(ap), with the product taken over all p ∈ P, where each ap ≥ 0



The right-hand side is the product over all possible prime numbers p; for any par-
ticular value of a, most of the exponents ap will be 0.
The value of any given positive integer can be specified by simply listing all the
nonzero exponents in the foregoing formulation.
The integer 12 is represented by {a2 = 2, a3 = 1}.
The integer 18 is represented by {a2 = 1, a3 = 2}.
The integer 91 is represented by {a7 = 1, a13 = 1}.
Multiplication of two numbers is equivalent to adding the corresponding
exponents. Given a = ∏ p^(ap) and b = ∏ p^(bp), with both products taken over p ∈ P,
define k = ab. We know that the integer k can be expressed as the product of
powers of primes: k = ∏ p^(kp). It follows that kp = ap + bp for all p ∈ P.
k = 12 * 18 = (2^2 * 3) * (2 * 3^2) = 216
k2 = 2 + 1 = 3; k3 = 1 + 2 = 3
216 = 2^3 * 3^3 = 8 * 27
What does it mean, in terms of the prime factors of a and b, to say that a divides b?
Any integer of the form p^n can be divided only by an integer that is of a lesser
or equal power of the same prime number, p^j with j ≤ n. Thus, we can say the
following. Given a = ∏ p^(ap) and b = ∏ p^(bp):
If a | b, then ap ≤ bp for all p.
a = 12; b = 36; 12 | 36
12 = 2^2 * 3; 36 = 2^2 * 3^2
a2 = 2 = b2
a3 = 1 ≤ 2 = b3
Thus, the inequality ap ≤ bp is satisfied for all prime numbers.
It is easy to determine the greatest common divisor of two positive integers if
we express each integer as the product of primes.

The following relationship always holds:
If k = gcd(a, b), then kp = min(ap, bp) for all p.
Determining the prime factors of a large number is no easy task, so the pre-
ceding relationship does not directly lead to a practical method of calculating the
greatest common divisor.
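For small integers the exponent view of multiplication and of the gcd can still be tried out. The sketch below uses naive trial division, which is an assumption made purely for illustration and is impractical for large numbers, as noted above; the function names are likewise illustrative.

```python
from collections import Counter


def prime_exponents(a: int) -> Counter:
    """Map each prime p dividing a to its exponent a_p, using trial division."""
    if a < 1:
        raise ValueError("expected a positive integer")
    exponents = Counter()
    p = 2
    while p * p <= a:
        while a % p == 0:
            exponents[p] += 1
            a //= p
        p += 1
    if a > 1:                  # whatever remains is a single prime factor
        exponents[a] += 1
    return exponents


def gcd_from_factors(a: int, b: int) -> int:
    """Compute gcd(a, b) by taking k_p = min(a_p, b_p) for every shared prime p."""
    fa, fb = prime_exponents(a), prime_exponents(b)
    k = 1
    for p in fa.keys() & fb.keys():
        k *= p ** min(fa[p], fb[p])
    return k


print(dict(prime_exponents(3600)))   # {2: 4, 3: 2, 5: 2}
print(gcd_from_factors(18, 300))     # 6
```

Adding the exponent maps of 12 and 18 gives the exponent map of 216, matching the rule kp = ap + bp.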
Two theorems that play important roles in public-key cryptography are Fermat’s
theorem and Euler’s theorem.
Fermat’s Theorem6
Fermat’s theorem states the following: If p is prime and a is a positive integer not
divisible by p, then
a^(p – 1) ≡ 1 (mod p) (2.10)
Proof: Consider the set of positive integers less than p: {1, 2, …, p – 1} and multiply
each element by a, modulo p, to get the set X = {a mod p, 2a mod p, …,
(p – 1)a mod p}. None of the elements of X is equal to zero because p does not
divide a. Furthermore, no two of the integers in X are equal. To see this, assume that
ja ≡ ka (mod p), where 1 ≤ j < k ≤ p – 1. Because a is relatively prime7 to p, we
can eliminate a from both sides of the equation [see Equation (2.5)] resulting in
j ≡ k (mod p). This last equality is impossible, because j and k are both positive integers
less than p. Therefore, we know that the (p – 1) elements of X are all positive
integers with no two elements equal. We can conclude that X consists of the set of
integers {1, 2, …, p – 1} in some order. Multiplying the numbers in both sets
({1, 2, …, p – 1} and X) and taking the result mod p yields
a * 2a * ⋯ * (p – 1)a ≡ [1 * 2 * ⋯ * (p – 1)] (mod p)
a^(p – 1) (p – 1)! ≡ (p – 1)! (mod p)
We can cancel the (p – 1)! term because it is relatively prime to p [see Equation
(2.5)]. This yields Equation (2.10), which completes the proof.
6This is sometimes referred to as Fermat’s little theorem.
7Recall from Section 2.2 that two numbers are relatively prime if they have no prime factors in common;
that is, their only common divisor is 1. This is equivalent to saying that two numbers are relatively prime
if their greatest common divisor is 1.
300 = 2^2 * 3^1 * 5^2
18 = 2^1 * 3^2
gcd(18, 300) = 2^1 * 3^1 * 5^0 = 6

An alternative form of Fermat’s theorem is also useful: If p is prime and a is a
positive integer, then
a^p ≡ a (mod p) (2.11)
Note that the first form of the theorem [Equation (2.10)] requires that a be rela-
tively prime to p, but this form does not.
a = 7, p = 19
7^2 = 49 ≡ 11 (mod 19)
7^4 ≡ 121 ≡ 7 (mod 19)
7^8 ≡ 49 ≡ 11 (mod 19)
7^16 ≡ 121 ≡ 7 (mod 19)
a^(p – 1) = 7^18 = 7^16 * 7^2 ≡ 7 * 11 ≡ 1 (mod 19)
p = 5, a = 3: a^p = 3^5 = 243 ≡ 3 (mod 5) = a (mod p)
p = 5, a = 10: a^p = 10^5 = 100000 ≡ 10 (mod 5) ≡ 0 (mod 5) = a (mod p)
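Both forms of Fermat's theorem are easy to spot-check with modular exponentiation. The Python sketch below is illustrative (the function names are assumptions); it uses the built-in three-argument pow and assumes the caller supplies a prime p.

```python
def fermat_first_form(a: int, p: int) -> bool:
    """Check a^(p-1) ≡ 1 (mod p); expected to hold when p is prime and p does not divide a."""
    return pow(a, p - 1, p) == 1


def fermat_second_form(a: int, p: int) -> bool:
    """Check a^p ≡ a (mod p); expected to hold for prime p and any positive integer a."""
    return pow(a, p, p) == a % p


print(fermat_first_form(7, 19))    # True
print(fermat_second_form(3, 5))    # True
print(fermat_second_form(10, 5))   # True
print(fermat_first_form(10, 5))    # False, because 5 divides 10
```

The last call shows why the first form carries the condition that a not be divisible by p.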
Euler’s Totient Function
Before presenting Euler’s theorem, we need to introduce an important quantity in
number theory, referred to as Euler’s totient function. This function, written φ(n),
is defined as the number of positive integers less than n and relatively prime to n.
By convention, φ(1) = 1.
Determine φ(37) and φ(35).
Because 37 is prime, all of the positive integers from 1 through 36 are relatively
prime to 37. Thus φ(37) = 36.
To determine φ(35), we list all of the positive integers less than 35 that are
relatively prime to it:
1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 16, 17, 18
19, 22, 23, 24, 26, 27, 29, 31, 32, 33, 34
There are 24 numbers on the list, so φ(35) = 24.
Table 2.6 lists the first 30 values of φ(n). The value φ(1) is without meaning
but is defined to have the value 1.
It should be clear that, for a prime number p,
φ(p) = p – 1
Now suppose that we have two prime numbers p and q with p ≠ q. Then we can
show that, for n = pq,

φ(n) = φ(pq) = φ(p) * φ(q) = (p – 1) * (q – 1)
To see that φ(n) = φ(p) * φ(q), consider that the set of positive integers less than
n is the set {1, …, (pq – 1)}. The integers in this set that are not relatively prime
to n are the set {p, 2p, …, (q – 1)p} and the set {q, 2q, …, (p – 1)q}. To see
this, consider that any integer that shares a factor greater than 1 with n must be
divisible by p or by q. Therefore, any integer that does not contain either p or q as a factor is
relatively prime to n. Further note that the two sets just listed are non-overlapping:
Because p and q are prime, we can state that none of the integers in the first set can
be written as a multiple of q, and none of the integers in the second set can be written
as a multiple of p. Thus the total number of unique integers in the two sets is
(q – 1) + (p – 1). Accordingly,
φ(n) = (pq – 1) – [(q – 1) + (p – 1)]
= pq – (p + q) + 1
= (p – 1) * (q – 1)
= φ(p) * φ(q)
φ(21) = φ(3) * φ(7) = (3 – 1) * (7 – 1) = 2 * 6 = 12
where the 12 integers are {1, 2, 4, 5, 8, 10, 11, 13, 16, 17, 19, 20}.
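Euler's totient function can be computed straight from its definition by counting. The Python sketch below does exactly that; it is meant only for small n, and the name totient is an illustrative choice.

```python
from math import gcd


def totient(n: int) -> int:
    """Count the positive integers less than n that are relatively prime to n."""
    if n < 1:
        raise ValueError("expected a positive integer")
    if n == 1:
        return 1               # by convention
    return sum(1 for x in range(1, n) if gcd(x, n) == 1)


print(totient(37))   # 36
print(totient(35))   # 24
print(totient(21), totient(3) * totient(7))   # 12 12
print([x for x in range(1, 21) if gcd(x, 21) == 1])
```

The last line lists the 12 integers relatively prime to 21 given in the example above.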
Table 2.6 Some Values of Euler’s Totient Function φ(n)
n    φ(n)     n    φ(n)     n    φ(n)
1    1        11   10       21   12
2    1        12   4        22   10
3    2        13   12       23   22
4    2        14   6        24   8
5    4        15   8        25   20
6    2        16   8        26   12
7    6        17   16       27   18
8    4        18   6        28   12
9    6        19   18       29   28
10   4        20   8        30   8
Euler’s Theorem
Euler’s theorem states that for every a and n that are relatively prime:
a^φ(n) ≡ 1 (mod n) (2.12)
Proof: Equation (2.12) is true if n is prime, because in that case, φ(n) = (n – 1)
and Fermat’s theorem holds. However, it also holds for any integer n. Recall that
φ(n) is the number of positive integers less than n that are relatively prime to n.
Consider the set of such integers, labeled as
R = {x1, x2, …, xφ(n)}
That is, each element xi of R is a unique positive integer less than n with gcd(xi, n) = 1.
Now multiply each element by a, modulo n:
S = {(ax1 mod n), (ax2 mod n), …, (axφ(n) mod n)}
The set S is a permutation8 of R, by the following line of reasoning:
1. Because a is relatively prime to n and xi is relatively prime to n, axi must also
be relatively prime to n. Thus, all the members of S are integers that are less
than n and that are relatively prime to n.
2. There are no duplicates in S. Refer to Equation (2.5). If axi mod n = axj
mod n, then xi = xj.
Therefore, taking each product over i = 1, …, φ(n),
∏ (axi mod n) = ∏ xi
∏ axi ≡ ∏ xi (mod n)
a^φ(n) * [∏ xi] ≡ ∏ xi (mod n)
a^φ(n) ≡ 1 (mod n)
which completes the proof. This is the same line of reasoning applied to the proof
of Fermat’s theorem.
8A permutation of a finite set of elements S is an ordered sequence of all the elements of S, with each
element appearing exactly once.
a = 3; n = 10; φ(10) = 4; a^φ(n) = 3^4 = 81 ≡ 1 (mod 10) = 1 (mod n)
a = 2; n = 11; φ(11) = 10; a^φ(n) = 2^10 = 1024 ≡ 1 (mod 11) = 1 (mod n)
As is the case for Fermat’s theorem, an alternative form of the theorem is also
useful:
a^(φ(n) + 1) ≡ a (mod n) (2.13)
Again, similar to the case with Fermat’s theorem, the first form of Euler’s theorem
[Equation (2.12)] requires that a be relatively prime to n. This form also holds when a
is not relatively prime to n, provided that n is a product of distinct primes (as with a
modulus of the form n = pq).
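The first form of Euler's theorem can be verified exhaustively for small moduli. The following Python sketch is illustrative only; it recomputes the totient by direct counting so that the block is self-contained, and the function names are assumptions.

```python
from math import gcd


def totient(n: int) -> int:
    """Count the positive integers less than n that are relatively prime to n."""
    return 1 if n == 1 else sum(1 for x in range(1, n) if gcd(x, n) == 1)


def euler_holds(a: int, n: int) -> bool:
    """Check a^totient(n) ≡ 1 (mod n); meaningful only when gcd(a, n) == 1."""
    return pow(a, totient(n), n) == 1 % n


print(euler_holds(3, 10))   # True: 3^4 = 81
print(euler_holds(2, 11))   # True: 2^10 = 1024

# Exhaustive check of Equation (2.12) for every modulus below 60.
assert all(
    euler_holds(a, n)
    for n in range(2, 60)
    for a in range(1, n)
    if gcd(a, n) == 1
)
```

The exhaustive loop covers composite moduli as well as primes, which is where Euler's theorem goes beyond Fermat's.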

For many cryptographic algorithms, it is necessary to select one or more very large
prime numbers at random. Thus, we are faced with the task of determining whether
a given large number is prime. There is no simple yet efficient means of accomplish-
ing this task.
In this section, we present one attractive and popular algorithm. You may be
surprised to learn that this algorithm yields a number that is not necessarily a prime.
However, the algorithm can yield a number that is almost certainly a prime. This will
be explained presently. We also make reference to a deterministic algorithm for find-
ing primes. The section closes with a discussion concerning the distribution of primes.
Miller–Rabin Algorithm9
The algorithm due to Miller and Rabin [MILL75, RABI80] is typically used to test
a large number for primality. Before explaining the algorithm, we need some background.
First, any positive odd integer n ≥ 3 can be expressed as
n – 1 = 2^k q with k > 0, q odd
To see this, note that n – 1 is an even integer. Then, divide (n – 1) by 2 until the
result is an odd number q, for a total of k divisions. If n – 1 is expressed as a binary
number, then the result is achieved by shifting the number to the right until the
rightmost digit is a 1, for a total of k shifts. We now develop two properties of prime
numbers that we will need.
TWO PROPERTIES OF PRIME NUMBERS The first property is stated as follows: If p is
prime and a is a positive integer less than p, then a^2 mod p = 1 if and only if either
a mod p = 1 or a mod p = -1 mod p = p – 1. By the rules of modular arithmetic
(a mod p) (a mod p) = a^2 mod p. Thus, if either a mod p = 1 or a mod p = -1,
then a^2 mod p = 1. Conversely, if a^2 mod p = 1, then (a mod p)^2 = 1, which is true
only for a mod p = 1 or a mod p = -1.
The second property is stated as follows: Let p be a prime number greater
than 2. We can then write p – 1 = 2^k q with k > 0, q odd. Let a be any integer in
the range 1 < a < p – 1. Then one of the two following conditions is true.
1. a^q is congruent to 1 modulo p. That is, a^q mod p = 1, or equivalently,
a^q ≡ 1 (mod p).
2. One of the numbers a^q, a^(2q), a^(4q), …, a^(2^(k – 1) q) is congruent to -1 modulo p.
That is, there is some number j in the range (1 ≤ j ≤ k) such that
a^(2^(j – 1) q) mod p = -1 mod p = p – 1 or equivalently,
a^(2^(j – 1) q) ≡ -1 (mod p).
Proof: Fermat’s theorem [Equation (2.10)] states that a^(n – 1) ≡ 1 (mod n) if n is
prime. We have p – 1 = 2^k q. Thus, we know that a^(p – 1) mod p = a^(2^k q) mod p = 1.
Thus, if we look at the sequence of numbers
a^q mod p, a^(2q) mod p, a^(4q) mod p, …, a^(2^(k – 1) q) mod p, a^(2^k q) mod p (2.14)
9Also referred to in the literature as the Rabin-Miller algorithm, or the Rabin-Miller test, or the Miller–
Rabin test.

we know that the last number in the list has value 1. Further, each number in the list
is the square of the previous number. Therefore, one of the following possibilities
must be true.
1. The first number on the list, and therefore all subsequent numbers on the list,
equals 1.
2. Some number on the list does not equal 1, but its square mod p does equal 1.
By virtue of the first property of prime numbers defined above, we know that
the only number that satisfies this condition is p – 1. So, in this case, the list
contains an element equal to p – 1.
This completes the proof.
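The sequence in Equation (2.14) is easy to generate. The sketch below is illustrative only; the function names `decompose` and `residue_sequence` are assumptions, not names from the text. It writes $p - 1 = 2^k q$ and lists the residues $a^q, a^{2q}, \ldots, a^{2^k q}$ modulo $p$ so that the two cases of the proof can be observed directly.

```python
def decompose(n):
    """Write n - 1 as 2**k * q with q odd and return (k, q)."""
    k, q = 0, n - 1
    while q % 2 == 0:
        k += 1
        q //= 2
    return k, q


def residue_sequence(a, p):
    """Return [a^q, a^(2q), ..., a^(2^k q)] mod p, as in Equation (2.14)."""
    k, q = decompose(p)
    return [pow(a, (2 ** j) * q, p) for j in range(k + 1)]


# p = 29: p - 1 = 28 = 2^2 * 7, so each list has k + 1 = 3 entries.
print(residue_sequence(10, 29))  # [17, 28, 1]  -> contains p - 1 (case 2)
print(residue_sequence(16, 29))  # [1, 1, 1]    -> starts with 1 (case 1)
```

Each entry is the square of the previous one, and the final entry is 1 by Fermat's theorem, which is exactly the structure the proof relies on.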
DETAILS OF THE ALGORITHM These considerations lead to the conclusion that,
if $n$ is prime, then either the first element in the list of residues, or remainders,
$(a^q, a^{2q}, \ldots, a^{2^{k-1}q}, a^{2^k q})$ modulo $n$ equals 1; or some element in the list equals
$(n - 1)$; otherwise $n$ is composite (i.e., not a prime). On the other hand, if the
condition is met, that does not necessarily mean that $n$ is prime. For example, if
$n = 2047 = 23 \times 89$, then $n - 1 = 2 \times 1023$. We compute $2^{1023} \bmod 2047 = 1$, so
that 2047 meets the condition but is not prime.
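The condition just described can be checked for a fixed base $a$. The following is a minimal sketch under stated assumptions: the function name `meets_condition` is hypothetical, and the base is supplied by the caller rather than chosen at random. It returns True when $a^q \bmod n = 1$ or when some $a^{2^j q} \bmod n = n - 1$ for $0 \le j < k$.

```python
def meets_condition(a, n):
    """Check the prime-compatible condition for odd n > 2 and base a.

    Returns True if a^q mod n == 1, or a^(2^j q) mod n == n - 1 for some
    j in 0..k-1, where n - 1 = 2^k * q with q odd. True does NOT prove
    that n is prime; False proves that n is composite.
    """
    k, q = 0, n - 1
    while q % 2 == 0:
        k, q = k + 1, q // 2
    x = pow(a, q, n)              # j = 0 term
    if x == 1 or x == n - 1:
        return True
    for _ in range(k - 1):        # j = 1 .. k-1, by repeated squaring
        x = (x * x) % n
        if x == n - 1:
            return True
    return False


# 2047 = 23 * 89 is composite, yet base 2 satisfies the condition.
print(meets_condition(2, 2047))  # True
```

Repeated squaring is used so that each later term of the list costs one modular multiplication instead of a fresh exponentiation.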
We can use the preceding property to devise a test for primality. The procedure
TEST takes a candidate integer n as input and returns the result composite if n is
definitely not a prime, and the result inconclusive if n may or may not be a prime.
TEST (n)
1. Find integers $k$, $q$, with $k > 0$, $q$ odd, so that $n - 1 = 2^k q$;
2. Select a random integer $a$, $1 < a < n - 1$;
3. if $a^q \bmod n = 1$ then return("inconclusive");
4. for $j = 0$ to $k - 1$ do
5.    if $a^{2^j q} \bmod n = n - 1$ then return("inconclusive");
6. return("composite");

Let us apply the test to the prime number $n = 29$. We have $(n - 1) = 28 = 2^2(7) = 2^k q$. First, let us try $a = 10$. We compute $10^7 \bmod 29 = 17$, which is neither 1 nor 28, so we continue the test. The next calculation finds that $(10^7)^2 \bmod 29 = 28$, and the test returns inconclusive (i.e., 29 may be prime). Let’s try again with $a = 2$. We have the following calculations: $2^7 \bmod 29 = 12$; $2^{14} \bmod 29 = 28$; and the test again returns inconclusive. If we perform the test for all integers $a$ in the range 1 through 28, we get the same inconclusive result, which is compatible with $n$ being a prime number.

Now let us apply the test to the composite number $n = 13 \times 17 = 221$. Then $(n - 1) = 220 = 2^2(55) = 2^k q$. Let us try $a = 5$. Then we have $5^{55} \bmod 221 = 112$, which is neither 1 nor 220; $(5^{55})^2 \bmod 221 = 168$. Because we have used all values of $j$ (i.e., $j = 0$ and $j = 1$) in line 4 of the TEST algorithm, the test returns composite, indicating that 221 is definitely a composite number. But suppose we had selected $a = 21$. Then we have $21^{55} \bmod 221 = 200$; $(21^{55})^2 \bmod 221 = 220$; and the test returns inconclusive, indicating that 221 may be prime. In fact, of the 218 integers from 2 through 219, four of these will return an inconclusive result, namely 21, 47, 174, and 200.

REPEATED USE OF THE MILLER–RABIN ALGORITHM How can we use the Miller–Rabin algorithm to determine with a high degree of confidence whether or not an integer is prime? It can be shown [KNUT98] that given an odd number $n$ that is not prime and a randomly chosen integer $a$ with $1 < a < n - 1$, the probability that TEST will return inconclusive (i.e., fail to detect that $n$ is not prime) is less than 1/4.
Thus, if $t$ different values of $a$ are chosen, the probability that all of them will pass TEST (return inconclusive) for $n$ is less than $(1/4)^t$. For example, for $t = 10$, the probability that a nonprime number will pass all ten tests is less than $10^{-6}$. Thus, for a sufficiently large value of $t$, we can be confident that $n$ is prime if Miller’s test always returns inconclusive.

This gives us a basis for determining whether an odd integer $n$ is prime with a reasonable degree of confidence. The procedure is as follows: Repeatedly invoke TEST (n) using randomly chosen values for $a$. If, at any point, TEST returns composite, then $n$ is determined to be nonprime. If TEST continues to return inconclusive for $t$ tests, then for a sufficiently large value of $t$, assume that $n$ is prime.

A Deterministic Primality Algorithm
Prior to 2002, there was no known method of efficiently proving the primality of very large numbers. All of the algorithms in use, including the most popular (Miller–Rabin), produced a probabilistic result. In 2002 (announced in 2002, published in 2004), Agrawal, Kayal, and Saxena [AGRA04] developed a relatively simple deterministic algorithm that efficiently determines whether a given large number is a prime. The algorithm, known as the AKS algorithm, does not appear to be as efficient as the Miller–Rabin algorithm. Thus far, it has not supplanted this older, probabilistic technique.

Distribution of Primes
It is worth noting how many numbers are likely to be rejected before a prime number is found using the Miller–Rabin test, or any other test for primality. A result from number theory, known as the prime number theorem, states that the primes near $n$ are spaced on the average one every $\ln(n)$ integers. Thus, on average, one would have to test on the order of $\ln(n)$ integers before a prime is found. Because all even integers can be immediately rejected, the correct figure is $0.5 \ln(n)$.
For example, if a prime on the order of magnitude of $2^{200}$ were sought, then about $0.5 \ln(2^{200}) \approx 69$ trials would be needed to find a prime. However, this figure is just an average. In some places along the number line, primes are closely packed, and in other places there are large gaps. The two consecutive odd integers 1,000,000,000,061 and 1,000,000,000,063 are both prime. On the other hand, $1001! + 2, 1001! + 3, \ldots, 1001! + 1000, 1001! + 1001$ is a sequence of 1000 consecutive composite integers.

2.7 THE CHINESE REMAINDER THEOREM
One of the most useful results of number theory is the Chinese remainder theorem (CRT).¹⁰ In essence, the CRT says it is possible to reconstruct integers in a certain range from their residues modulo a set of pairwise relatively prime moduli.

¹⁰ The CRT is so called because it is believed to have been discovered by the Chinese mathematician Sun-Tsu in around 100 A.D.

The 10 integers in $Z_{10}$, that is the integers 0 through 9, can be reconstructed from their two residues modulo 2 and 5 (the relatively prime factors of 10). Say the known residues of a decimal digit $x$ are $r_2 = 0$ and $r_5 = 3$; that is, $x \bmod 2 = 0$ and $x \bmod 5 = 3$. Therefore, $x$ is an even integer in $Z_{10}$ whose remainder, on division by 5, is 3. The unique solution is $x = 8$.

The CRT can be stated in several ways. We present here a formulation that is most useful from the point of view of this text. An alternative formulation is explored in Problem 2.33. Let
$$M = \prod_{i=1}^{k} m_i$$
where the $m_i$ are pairwise relatively prime; that is, $\gcd(m_i, m_j) = 1$ for $1 \le i, j \le k$, and $i \ne j$. We can represent any integer $A$ in $Z_M$ by a $k$-tuple whose elements are in $Z_{m_i}$ using the following correspondence:
$$A \leftrightarrow (a_1, a_2, \ldots, a_k) \tag{2.15}$$
where $A \in Z_M$, $a_i \in Z_{m_i}$, and $a_i = A \bmod m_i$ for $1 \le i \le k$. The CRT makes two assertions.
1. The mapping of Equation (2.15) is a one-to-one correspondence (called a bijection) between $Z_M$ and the Cartesian product $Z_{m_1} \times Z_{m_2} \times \cdots \times Z_{m_k}$.
That is, for every integer $A$ such that $0 \le A < M$, there is a unique $k$-tuple $(a_1, a_2, \ldots, a_k)$ with $0 \le a_i < m_i$ that represents it, and for every such $k$-tuple $(a_1, a_2, \ldots, a_k)$, there is a unique integer $A$ in $Z_M$.
2. Operations performed on the elements of $Z_M$ can be equivalently performed on the corresponding $k$-tuples by performing the operation independently in each coordinate position in the appropriate system.

Let us demonstrate the first assertion. The transformation from $A$ to $(a_1, a_2, \ldots, a_k)$ is obviously unique; that is, each $a_i$ is uniquely calculated as $a_i = A \bmod m_i$. Computing $A$ from $(a_1, a_2, \ldots, a_k)$ can be done as follows. Let $M_i = M/m_i$ for $1 \le i \le k$. Note that $M_i = m_1 \times m_2 \times \cdots \times m_{i-1} \times m_{i+1} \times \cdots \times m_k$, so that $M_i \equiv 0 \pmod{m_j}$ for all $j \ne i$. Then let
$$c_i = M_i \times (M_i^{-1} \bmod m_i) \quad \text{for } 1 \le i \le k \tag{2.16}$$
By the definition of $M_i$, it is relatively prime to $m_i$ and therefore has a unique multiplicative inverse mod $m_i$. So Equation (2.16) is well defined and produces a unique value $c_i$. We can now compute
$$A \equiv \left( \sum_{i=1}^{k} a_i c_i \right) \pmod{M} \tag{2.17}$$
To show that the value of $A$ produced by Equation (2.17) is correct, we must show that $a_i = A \bmod m_i$ for $1 \le i \le k$. Note that $c_j \equiv M_j \equiv 0 \pmod{m_i}$ if $j \ne i$, and that $c_i \equiv 1 \pmod{m_i}$. It follows that $a_i = A \bmod m_i$.

The second assertion of the CRT, concerning arithmetic operations, follows from the rules for modular arithmetic. That is, the second assertion can be stated as follows: If
$A \leftrightarrow (a_1, a_2, \ldots, a_k)$
$B \leftrightarrow (b_1, b_2, \ldots, b_k)$
then
$(A + B) \bmod M \leftrightarrow ((a_1 + b_1) \bmod m_1, \ldots, (a_k + b_k) \bmod m_k)$
$(A - B) \bmod M \leftrightarrow ((a_1 - b_1) \bmod m_1, \ldots, (a_k - b_k) \bmod m_k)$
$(A \times B) \bmod M \leftrightarrow ((a_1 \times b_1) \bmod m_1, \ldots, (a_k \times b_k) \bmod m_k)$

One of the useful features of the Chinese remainder theorem is that it provides a way to manipulate (potentially very large) numbers mod $M$ in terms of tuples of smaller numbers. This can be useful when $M$ is 150 digits or more.
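Equations (2.15) through (2.17) translate almost line for line into code. The following is a minimal sketch, assuming Python 3.8 or later (for `math.prod` and for `pow` with a negative exponent); the function names `to_residues` and `from_residues` are hypothetical and chosen only for illustration.

```python
from math import prod


def to_residues(A, moduli):
    """Map A to its k-tuple (A mod m_1, ..., A mod m_k), Equation (2.15)."""
    return tuple(A % m for m in moduli)


def from_residues(residues, moduli):
    """Reconstruct A in Z_M from its residues; moduli must be pairwise coprime."""
    M = prod(moduli)
    total = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        c_i = M_i * pow(M_i, -1, m_i)   # Equation (2.16): M_i times its inverse mod m_i
        total += a_i * c_i
    return total % M                    # Equation (2.17)


# The decimal-digit example: x mod 2 = 0 and x mod 5 = 3 gives x = 8.
moduli = (2, 5)
print(to_residues(8, moduli))         # (0, 3)
print(from_residues((0, 3), moduli))  # 8
```

`pow(M_i, -1, m_i)` raises `ValueError` if the inverse does not exist, which is what happens when the moduli are not pairwise relatively prime.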
Note, however, that it is necessary to know beforehand the factorization of $M$.

To represent 973 mod 1813 as a pair of numbers mod 37 and 49, define
$m_1 = 37$
$m_2 = 49$
$M = 1813$
$A = 973$
We also have $M_1 = 49$ and $M_2 = 37$. Using the extended Euclidean algorithm, we compute $M_1^{-1} = 34 \bmod m_1$ and $M_2^{-1} = 4 \bmod m_2$. (Note that we only need to compute each $M_i$ and each $M_i^{-1}$ once.) Taking residues modulo 37 and 49, our representation of 973 is (11, 42), because $973 \bmod 37 = 11$ and $973 \bmod 49 = 42$.
Now suppose we want to add 678 to 973. What do we do to (11, 42)? First we compute $(678) \leftrightarrow (678 \bmod 37, 678 \bmod 49) = (12, 41)$. Then we add the tuples element-wise and reduce: $((11 + 12) \bmod 37, (42 + 41) \bmod 49) = (23, 34)$. To verify that this has the correct effect, we compute
$(23, 34) \leftrightarrow (a_1 M_1 M_1^{-1} + a_2 M_2 M_2^{-1}) \bmod M$
$= [(23)(49)(34) + (34)(37)(4)] \bmod 1813$
$= 43350 \bmod 1813$
$= 1651$
and check that it is equal to $(973 + 678) \bmod 1813 = 1651$. Remember that in the above derivation, $M_1^{-1}$ is the multiplicative inverse of $M_1$ modulo $m_1$ and $M_2^{-1}$ is the multiplicative inverse of $M_2$ modulo $m_2$.
Suppose we want to multiply 1651 (mod 1813) by 73. We multiply (23, 34) by 73 and reduce to get $(23 \times 73 \bmod 37, 34 \times 73 \bmod 49) = (14, 32)$. It is easily verified that
$(14, 32) \leftrightarrow [(14)(49)(34) + (32)(37)(4)] \bmod 1813$
$= 865$
$= 1651 \times 73 \bmod 1813$

2.8 DISCRETE LOGARITHMS
Discrete logarithms are fundamental to a number of public-key algorithms, including Diffie–Hellman key exchange and the digital signature algorithm (DSA). This section provides a brief overview of discrete logarithms. For the interested reader, more detailed developments of this topic can be found in [ORE67] and [LEVE90].

The Powers of an Integer, Modulo n
Recall from Euler’s theorem [Equation (2.12)] that, for every $a$ and $n$ that are relatively prime,
$$a^{\phi(n)} \equiv 1 \pmod{n}$$
where $\phi(n)$, Euler’s totient function, is the number of positive integers less than $n$ and relatively prime to $n$. Now consider the more general expression:
$$a^m \equiv 1 \pmod{n} \tag{2.18}$$
If $a$ and $n$ are relatively prime, then there is at least one integer $m$ that satisfies Equation (2.18), namely, $m = \phi(n)$. The least positive exponent $m$ for which Equation (2.18) holds is referred to in several ways:
■ The order of $a$ (mod $n$)
■ The exponent to which $a$ belongs (mod $n$)
■ The length of the period generated by $a$

To see this last point, consider the powers of 7, modulo 19:
$7^1 \equiv 7 \pmod{19}$
$7^2 = 49 = 2 \times 19 + 11 \equiv 11 \pmod{19}$
$7^3 = 343 = 18 \times 19 + 1 \equiv 1 \pmod{19}$
$7^4 = 2401 = 126 \times 19 + 7 \equiv 7 \pmod{19}$
$7^5 = 16807 = 884 \times 19 + 11 \equiv 11 \pmod{19}$
There is no point in continuing because the sequence is repeating. This can be proven by noting that $7^3 \equiv 1 \pmod{19}$, and therefore, $7^{3+j} \equiv 7^3 7^j \equiv 7^j \pmod{19}$, and hence, any two powers of 7 whose exponents differ by 3 (or a multiple of 3) are congruent to each other (mod 19). In other words, the sequence is periodic, and the length of the period is the smallest positive exponent $m$ such that $7^m \equiv 1 \pmod{19}$.

Table 2.7 shows all the powers of $a$, modulo 19 for all positive $a < 19$. The length of the sequence for each base value is indicated by shading. Note the following:
1. All sequences end in 1. This is consistent with the reasoning of the preceding few paragraphs.
2. The length of a sequence divides $\phi(19) = 18$. That is, an integral number of sequences occur in each row of the table.
3. Some of the sequences are of length 18. In this case, it is said that the base integer $a$ generates (via powers) the set of nonzero integers modulo 19. Each such integer is called a primitive root of the modulus 19.

More generally, we can say that the highest possible exponent to which a number can belong (mod $n$) is $\phi(n)$. If a number is of this order, it is referred to as a primitive root of $n$. The importance of this notion is that if $a$ is a primitive root of $n$, then its powers
$$a, a^2, \ldots, a^{\phi(n)}$$
are distinct (mod $n$) and are all relatively prime to $n$. In particular, for a prime number $p$, if $a$ is a primitive root of $p$, then
$$a, a^2, \ldots, a^{p-1}$$
are distinct (mod $p$). For the prime number 19, its primitive roots are 2, 3, 10, 13, 14, and 15.
Not all integers have primitive roots. In fact, the only integers with primitive roots are those of the form 2, 4, $p^{\alpha}$, and $2p^{\alpha}$, where $p$ is any odd prime and $\alpha$ is a positive integer. The proof is not simple but can be found in many number theory books, including [ORE76].

Logarithms for Modular Arithmetic
With ordinary positive real numbers, the logarithm function is the inverse of exponentiation. An analogous function exists for modular arithmetic.
Let us briefly review the properties of ordinary logarithms. The logarithm of a number is defined to be the power to which some positive base (except 1) must be raised in order to equal the number. That is, for base $x$ and for a value $y$,
$$y = x^{\log_x(y)}$$
The properties of logarithms include
$\log_x(1) = 0$
$\log_x(x) = 1$
$$\log_x(yz) = \log_x(y) + \log_x(z) \tag{2.19}$$
$$\log_x(y^r) = r \times \log_x(y) \tag{2.20}$$
Consider a primitive root $a$ for some prime number $p$ (the argument can be developed for nonprimes as well).
Then we know that the powers of $a$ from 1 through $(p - 1)$ produce each integer from 1 through $(p - 1)$ exactly once. We also know that any integer $b$ satisfies
$$b \equiv r \pmod{p} \quad \text{for some } r, \text{ where } 0 \le r \le (p - 1)$$
by the definition of modular arithmetic. It follows that for any integer $b$ and a primitive root $a$ of prime number $p$, we can find a unique exponent $i$ such that
$$b \equiv a^i \pmod{p} \quad \text{where } 0 \le i \le (p - 1)$$
This exponent $i$ is referred to as the discrete logarithm of the number $b$ for the base $a$ (mod $p$). We denote this value as $\operatorname{dlog}_{a,p}(b)$.¹¹
Note the following:
$$\operatorname{dlog}_{a,p}(1) = 0 \text{ because } a^0 \bmod p = 1 \bmod p = 1 \tag{2.21}$$
$$\operatorname{dlog}_{a,p}(a) = 1 \text{ because } a^1 \bmod p = a \tag{2.22}$$

¹¹ Many texts refer to the discrete logarithm as the index. There is no generally agreed notation for this concept, much less an agreed name.

Table 2.7 Powers of Integers, Modulo 19
    a  a^2  a^3  a^4  a^5  a^6  a^7  a^8  a^9 a^10 a^11 a^12 a^13 a^14 a^15 a^16 a^17 a^18
    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
    2    4    8   16   13    7   14    9   18   17   15   11    3    6   12    5   10    1
    3    9    8    5   15    7    2    6   18   16   10   11   14    4   12   17   13    1
    4   16    7    9   17   11    6    5    1    4   16    7    9   17   11    6    5    1
    5    6   11   17    9    7   16    4    1    5    6   11   17    9    7   16    4    1
    6   17    7    4    5   11    9   16    1    6   17    7    4    5   11    9   16    1
    7   11    1    7   11    1    7   11    1    7   11    1    7   11    1    7   11    1
    8    7   18   11   12    1    8    7   18   11   12    1    8    7   18   11   12    1
    9    5    7    6   16   11    4   17    1    9    5    7    6   16   11    4   17    1
   10    5   12    6    3   11   15   17   18    9   14    7   13   16    8    4    2    1
   11    7    1   11    7    1   11    7    1   11    7    1   11    7    1   11    7    1
   12   11   18    7    8    1   12   11   18    7    8    1   12   11   18    7    8    1
   13   17   12    4   14   11   10   16   18    6    2    7   15    5    8    9    3    1
   14    6    8   17   10    7    3    4   18    5   13   11    2    9   12   16   15    1
   15   16   12    9    2   11   13    5   18    4    3    7   10   17    8    6   14    1
   16    9   11    5    4    7   17    6    1   16    9   11    5    4    7   17    6    1
   17    4   11   16    6    7    5    9    1   17    4   11   16    6    7    5    9    1
   18    1   18    1   18    1   18    1   18    1   18    1   18    1   18    1   18    1

Here is an example using a nonprime modulus, $n = 9$. Here $\phi(n) = 6$ and $a = 2$ is a primitive root.
We compute the various powers of $a$ and find
$2^0 = 1$
$2^1 = 2$
$2^2 = 4$
$2^3 = 8$
$2^4 \equiv 7 \pmod{9}$
$2^5 \equiv 5 \pmod{9}$
$2^6 \equiv 1 \pmod{9}$
This gives us the following table of the numbers with given discrete logarithms (mod 9) for the root $a = 2$:
Logarithm  0  1  2  3  4  5
Number     1  2  4  8  7  5
To make it easy to obtain the discrete logarithms of a given number, we rearrange the table:
Number     1  2  4  5  7  8
Logarithm  0  1  2  5  4  3

Now consider
$x = a^{\operatorname{dlog}_{a,p}(x)} \bmod p$
$y = a^{\operatorname{dlog}_{a,p}(y)} \bmod p$
$xy = a^{\operatorname{dlog}_{a,p}(xy)} \bmod p$
Using the rules of modular multiplication,
$xy \bmod p = [(x \bmod p)(y \bmod p)] \bmod p$
$a^{\operatorname{dlog}_{a,p}(xy)} \bmod p = [(a^{\operatorname{dlog}_{a,p}(x)} \bmod p)(a^{\operatorname{dlog}_{a,p}(y)} \bmod p)] \bmod p$
$= (a^{\operatorname{dlog}_{a,p}(x) + \operatorname{dlog}_{a,p}(y)}) \bmod p$
But now consider Euler’s theorem, which states that, for every $a$ and $n$ that are relatively prime,
$$a^{\phi(n)} \equiv 1 \pmod{n}$$
Any positive integer $z$ can be expressed in the form $z = q + k\phi(n)$, with $0 \le q < \phi(n)$. Therefore, by Euler’s theorem,
$$a^z \equiv a^q \pmod{n} \quad \text{if } z \equiv q \bmod \phi(n)$$
Applying this to the foregoing equality, we have
$$\operatorname{dlog}_{a,p}(xy) \equiv [\operatorname{dlog}_{a,p}(x) + \operatorname{dlog}_{a,p}(y)] \pmod{\phi(p)}$$
and generalizing,
$$\operatorname{dlog}_{a,p}(y^r) \equiv [r \times \operatorname{dlog}_{a,p}(y)] \pmod{\phi(p)}$$
This demonstrates the analogy between true logarithms and discrete logarithms.
Keep in mind that unique discrete logarithms mod $m$ to some base $a$ exist only if $a$ is a primitive root of $m$.
Table 2.8, which is directly derived from Table 2.7, shows the sets of discrete logarithms that can be defined for modulus 19.

Calculation of Discrete Logarithms
Consider the equation
$$y = g^x \bmod p$$
Given $g$, $x$, and $p$, it is a straightforward matter to calculate $y$. At the worst, we must perform $x$ repeated multiplications, and algorithms exist for achieving greater efficiency (see Chapter 9).
However, given $y$, $g$, and $p$, it is, in general, very difficult to calculate $x$ (take the discrete logarithm). The difficulty seems to be on the same order of magnitude as that of factoring the large integers used in RSA.
At the time of this writing, the asymptotically fastest known algorithm for taking discrete logarithms modulo a prime number is on the order of [BETH91]:
$$e^{((\ln p)^{1/3} (\ln(\ln p))^{2/3})}$$
which is not feasible for large primes.

Table 2.8 Tables of Discrete Logarithms, Modulo 19
(a) Discrete logarithms to the base 2, modulo 19
a                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
log_{2,19}(a)    18  1 13  2 16 14  6  3  8 17 12 15  5  7 11  4 10  9
(b) Discrete logarithms to the base 3, modulo 19
a                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
log_{3,19}(a)    18  7  1 14  4  8  6  3  2 11 12 15 17 13  5 10 16  9
(c) Discrete logarithms to the base 10, modulo 19
a                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
log_{10,19}(a)   18 17  5 16  2  4 12 15 10  1  6  3 13 11  7 14  8  9
(d) Discrete logarithms to the base 13, modulo 19
a                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
log_{13,19}(a)   18 11 17  4 14 10 12 15 16  7  6  3  1  5 13  8  2  9
(e) Discrete logarithms to the base 14, modulo 19
a                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
log_{14,19}(a)   18 13  7  8 10  2  6  3 14  5 12 15 11  1 17 16  4  9
(f) Discrete logarithms to the base 15, modulo 19
a                 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
log_{15,19}(a)   18  5 11 10  8 16 12 15  4 13  6  3  7 17  1  2 14  9

2.9 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS
Key Terms
bijection
composite number
commutative
Chinese remainder theorem
discrete logarithm
divisor
Euclidean algorithm
Euler’s theorem
Euler’s totient function
Fermat’s theorem
greatest common divisor
identity element
index
modular arithmetic
modulus
order
prime number
primitive root
relatively prime
residue

Review Questions
2.1 What does it mean to say that b is a divisor of a?
2.2 What is the meaning of the expression a divides b?
2.3 What is the difference between modular arithmetic and ordinary arithmetic?
2.4 What is a prime number?
2.5 What is Euler’s totient function?
2.6 The Miller–Rabin test can determine if a number is not prime but cannot determine if a number is prime. How can such an algorithm be used to test for primality?
2.7 What is a primitive root of a number?
2.8 What is the difference between an index and a discrete logarithm?

Problems
2.1 Reformulate Equation (2.1), removing the restriction that $a$ is a nonnegative integer. That is, let $a$ be any integer.
2.2 Draw a figure similar to Figure 2.1 for $a < 0$.
2.3 For each of the following equations, find an integer $x$ that satisfies the equation.
a. $4x \equiv 2 \pmod{3}$
b. $7x \equiv 4 \pmod{9}$
c. $5x \equiv 3 \pmod{11}$
2.4 In this text, we assume that the modulus is a positive integer. But the definition of the expression $a \bmod n$ also makes perfect sense if $n$ is negative. Determine the following:
a. $7 \bmod 4$
b. $7 \bmod -4$
c. $-7 \bmod 4$
d. $-7 \bmod -4$
2.5 A modulus of 0 does not fit the definition but is defined by convention as follows: $a \bmod 0 = a$. With this definition in mind, what does the following expression mean: $a \equiv b \pmod{0}$?
2.6 In Section 2.3, we define the congruence relationship as follows: Two integers $a$ and $b$ are said to be congruent modulo $n$ if $(a \bmod n) = (b \bmod n)$. We then proved that $a \equiv b \pmod{n}$ if $n \mid (a - b)$. Some texts on number theory use this latter relationship as the definition of congruence: Two integers $a$ and $b$ are said to be congruent modulo $n$ if $n \mid (a - b)$. Using this latter definition as the starting point, prove that, if $(a \bmod n) = (b \bmod n)$, then $n$ divides $(a - b)$.
2.7 What is the smallest positive integer that has exactly $k$ divisors? Provide answers for values for $1 \le k \le 8$.
2.8 Prove the following:
a. $a \equiv b \pmod{n}$ implies $b \equiv a \pmod{n}$
b. $a \equiv b \pmod{n}$ and $b \equiv c \pmod{n}$ imply $a \equiv c \pmod{n}$
2.9 Prove the following:
a. $[(a \bmod n) - (b \bmod n)] \bmod n = (a - b) \bmod n$
b. $[(a \bmod n) \times (b \bmod n)] \bmod n = (a \times b) \bmod n$
2.10 Find the multiplicative inverse of each nonzero element in $Z_5$.
2.11 Show that an integer $N$ is congruent modulo 9 to the sum of its decimal digits. For example, $723 \equiv 7 + 2 + 3 \equiv 12 \equiv 1 + 2 \equiv 3 \pmod{9}$. This is the basis for the familiar procedure of “casting out 9’s” when checking computations in arithmetic.
2.12 a. Determine gcd(72345, 43215)
b. Determine gcd(3486, 10292)
2.13 The purpose of this problem is to set an upper bound on the number of iterations of the Euclidean algorithm.
a. Suppose that $m = qn + r$ with $q > 0$ and $0 \le r < n$. Show that $m/2 > r$.
b. Let $A_i$ be the value of $A$ in the Euclidean algorithm after the $i$th iteration. Show that $A_{i+2} < A_i/2$.
c. Show that if $m$, $n$, and $N$ are integers with $1 \le m, n \le 2^N$, then the Euclidean algorithm takes at most $2N$ steps to find gcd(m, n).
2.14 The Euclidean algorithm has been known for over 2000 years and has always been a favorite among number theorists. After these many years, there is now a potential competitor, invented by J. Stein in 1961. Stein’s algorithm is as follows: Determine gcd(A, B) with $A, B \ge 1$.
STEP 1 Set $A_1 = A$, $B_1 = B$, $C_1 = 1$
STEP 2 For n > 1,
(1) If $A_n = B_n$, stop. $\gcd(A, B) = A_n C_n$
(2) If $A_n$ and $B_n$ are both even, set $A_{n+1} = A_n/2$, $B_{n+1} = B_n/2$, $C_{n+1} = 2C_n$
(3) If $A_n$ is even and $B_n$ is odd, set $A_{n+1} = A_n/2$, $B_{n+1} = B_n$, $C_{n+1} = C_n$
(4) If $A_n$ is odd and $B_n$ is even, set $A_{n+1} = A_n$, $B_{n+1} = B_n/2$, $C_{n+1} = C_n$
(5) If $A_n$ and $B_n$ are both odd, set $A_{n+1} = |A_n - B_n|$, $B_{n+1} = \min(B_n, A_n)$, $C_{n+1} = C_n$
Continue to step n + 1.
a. To get a feel for the two algorithms, compute gcd(6150, 704) using both the Euclid-
ean and Stein’s algorithm.
b. What is the apparent advantage of Stein’s algorithm over the Euclidean algorithm?
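The five cases of Stein's algorithm, as stated in Problem 2.14, map directly onto a loop. The following is a minimal sketch rather than a reference implementation; the function name `stein_gcd` is an assumption, and the variables `a`, `b`, `c` play the roles of $A_n$, $B_n$, $C_n$.

```python
def stein_gcd(a, b):
    """Binary gcd following cases (1)-(5) of Problem 2.14; requires a, b >= 1."""
    if a < 1 or b < 1:
        raise ValueError("both arguments must be at least 1")
    c = 1
    while a != b:                              # case (1): stop when A_n == B_n
        if a % 2 == 0 and b % 2 == 0:          # case (2): both even
            a, b, c = a // 2, b // 2, 2 * c
        elif a % 2 == 0:                       # case (3): only A_n even
            a //= 2
        elif b % 2 == 0:                       # case (4): only B_n even
            b //= 2
        else:                                  # case (5): both odd
            a, b = abs(a - b), min(a, b)
    return a * c


print(stein_gcd(48, 18))  # 6
```

Every step uses only parity tests, halving, and subtraction, which is the contrast with the division step of the Euclidean algorithm that the problem asks about.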
2.15 a. Show that if Stein’s algorithm does not stop before the $n$th step, then
$C_{n+1} \times \gcd(A_{n+1}, B_{n+1}) = C_n \times \gcd(A_n, B_n)$
b. Show that if the algorithm does not stop before step $(n - 1)$, then
$A_{n+2} B_{n+2} \le A_n B_n / 2$
c. Show that if $1 \le A, B \le 2^N$, then Stein’s algorithm takes at most $4N$ steps to find
gcd(A, B). Thus, Stein’s algorithm works in roughly the same number of steps as
the Euclidean algorithm.
d. Demonstrate that Stein’s algorithm does indeed return gcd(A, B).
2.16 Using the extended Euclidean algorithm, find the multiplicative inverse of
a. 135 mod 61
b. 7465 mod 2464
c. 42828 mod 6407
2.17 The purpose of this problem is to determine how many prime numbers there
are. Suppose there are a total of $n$ prime numbers, and we list these in order:
$p_1 = 2 < p_2 = 3 < p_3 = 5 < \cdots < p_n$.
a. Define $X = 1 + p_1 p_2 \cdots p_n$. That is, $X$ is equal to one plus the product of all the
primes. Can we find a prime number $p_m$ that divides $X$?
b. What can you say about $m$?
c. Deduce that the total number of primes cannot be finite.
d. Show that $p_{n+1} \le 1 + p_1 p_2 \cdots p_n$.

2.18 The purpose of this problem is to demonstrate that the probability that two random
numbers are relatively prime is about 0.6.
a. Let $P = \Pr[\gcd(a, b) = 1]$. Show that $\Pr[\gcd(a, b) = d] = P/d^2$. Hint:
Consider the quantity $\gcd(a/d, b/d)$.
b. The sum of the result of part (a) over all possible values of $d$ is 1. That is,
$\sum_{d \ge 1} \Pr[\gcd(a, b) = d] = 1$. Use this equality to determine the value of $P$. Hint:
Use the identity $\sum_{i=1}^{\infty} 1/i^2 = \pi^2/6$.

2.19 Why is gcd(n, n + 1) = 1 for two consecutive integers n and n + 1?
2.20 Using Fermat’s theorem, find $4^{225} \bmod 13$.
2.21 Use Fermat’s theorem to find a number $a$ between 0 and 92 with $a$ congruent to $7^{1013}$
modulo 93.
2.22 Use Fermat’s theorem to find a number $x$ between 0 and 37 with $x^{73}$ congruent to 4
modulo 37. (You should not need to use any brute-force searching.)
2.23 Use Euler’s theorem to find a number $a$ between 0 and 9 such that $a$ is congruent to
$9^{101}$ modulo 10. (Note: This is the same as the last digit of the decimal expansion of
$9^{100}$.)
2.24 Use Euler’s theorem to find a number $x$ between 0 and 14 with $x^{61}$ congruent to 7
modulo 15. (You should not need to use any brute-force searching.)
2.25 Notice in Table 2.6 that $\phi(n)$ is even for $n > 2$. This is true for all $n > 2$. Give a concise
argument why this is so.
2.26 Prove the following: If $p$ is prime, then $\phi(p^i) = p^i - p^{i-1}$. Hint: What numbers have
a factor in common with $p^i$?
2.27 It can be shown (see any book on number theory) that if $\gcd(m, n) = 1$ then
$\phi(mn) = \phi(m)\phi(n)$. Using this property, the property developed in the preceding
problem, and the property that $\phi(p) = p - 1$ for $p$ prime, it is straightforward to
determine the value of $\phi(n)$ for any $n$. Determine the following:
a. $\phi(29)$ b. $\phi(51)$ c. $\phi(455)$ d. $\phi(616)$
2.28 It can also be shown that for arbitrary positive integer $a$, $\phi(a)$ is given by
$$\phi(a) = \prod_{i=1}^{t} [p_i^{a_i - 1}(p_i - 1)]$$
where $a$ is given by Equation (2.9), namely: $a = p_1^{a_1} p_2^{a_2} \cdots p_t^{a_t}$. Demonstrate this result.
2.29 Consider the function: $f(n)$ = number of elements in the set $\{a : 0 \le a < n$ and
$\gcd(a, n) = 1\}$. What is this function?
2.30 Although ancient Chinese mathematicians did good work coming up with their
remainder theorem, they did not always get it right. They had a test for primality. The
test said that $n$ is prime if and only if $n$ divides $(2^n - 2)$.
a. Give an example that satisfies the condition using an odd prime.
b. The condition is obviously true for $n = 2$. Prove that the condition is true if $n$ is an
odd prime (proving the if condition).
c. Give an example of an odd $n$ that is not prime and that does not satisfy the condition.
You can do this with nonprime numbers up to a very large value. This misled
the Chinese mathematicians into thinking that if the condition is true then $n$ is prime.
d. Unfortunately, the ancient Chinese never tried $n = 341$, which is nonprime
($341 = 11 \times 31$), yet 341 divides $2^{341} - 2$ without remainder. Demonstrate that
$2^{341} \equiv 2 \pmod{341}$ (disproving the only if condition). Hint: It is not necessary to
calculate $2^{341}$; play around with the congruences instead.

2.31 Show that, if n is an odd composite integer, then the Miller–Rabin test will return
inconclusive for a = 1 and a = (n – 1).
2.32 If n is composite and passes the Miller–Rabin test for the base a, then n is called
a strong pseudoprime to the base a. Show that 2047 is a strong pseudoprime to the
base 2.
2.33 A common formulation of the Chinese remainder theorem (CRT) is as follows: Let
$m_1, \ldots, m_k$ be integers that are pairwise relatively prime for $1 \le i, j \le k$, and $i \ne j$.
Define $M$ to be the product of all the $m_i$’s. Let $a_1, \ldots, a_k$ be integers. Then the set of
congruences:
$x \equiv a_1 \pmod{m_1}$
$x \equiv a_2 \pmod{m_2}$
$\vdots$
$x \equiv a_k \pmod{m_k}$
has a unique solution modulo $M$. Show that the theorem stated in this form is true.
2.34 The example used by Sun-Tsu to illustrate the CRT was
$x \equiv 2 \pmod{3}$; $x \equiv 3 \pmod{5}$; $x \equiv 2 \pmod{7}$
Solve for x.
2.35 Six professors begin courses on Monday, Tuesday, Wednesday, Thursday, Friday,
and Saturday, respectively, and announce their intentions of lecturing at intervals of
3, 2, 5, 6, 1, and 4 days, respectively. The regulations of the university forbid Sunday
lectures (so that a Sunday lecture must be omitted). When first will all six professors
find themselves compelled to omit a lecture? Hint: Use the CRT.
2.36 Find all primitive roots of 37.
2.37 Given 5 as a primitive root of 23, construct a table of discrete logarithms, and use it to
solve the following congruences.
a. $3x^5 \equiv 2 \pmod{23}$
b. $7x^{10} + 1 \equiv 0 \pmod{23}$
c. $5^x \equiv 6 \pmod{23}$
Programming Problems
2.1 Write a computer program that implements fast exponentiation (successive squaring)
modulo n.
2.2 Write a computer program that implements the Miller–Rabin algorithm for a user-
specified n. The program should allow the user two choices: (1) specify a possible
witness a to test using the Witness procedure or (2) specify a number s of random
witnesses for the Miller–Rabin test to check.
The operator mod is used in this book and in the literature in two different ways: as
a binary operator and as a congruence relation. This appendix explains the distinc-
tion and precisely defines the notation used in this book regarding parentheses. This
notation is common but, unfortunately, not universal.

The Binary Operator mod
If $a$ is an integer and $n$ is a positive integer, we define $a \bmod n$ to be the remainder
when $a$ is divided by $n$. The integer $n$ is called the modulus, and the remainder is
called the residue. Thus, for any integer $a$, we can always write
$$a = \lfloor a/n \rfloor \times n + (a \bmod n)$$
Formally, we define the operator mod as
$$a \bmod n = a - \lfloor a/n \rfloor \times n \quad \text{for } n \ne 0$$
As a binary operation, mod takes two integer arguments and returns the re-
mainder. For example, 7 mod 3 = 1. The arguments may be integers, integer vari-
ables, or integer variable expressions. For example, all of the following are valid,
with the obvious meanings:
7 mod 3
7 mod m
x mod 3
x mod m
$(x^2 + y + 1) \bmod (2m + n)$
where all of the variables are integers. In each case, the left-hand term is divided by
the right-hand term, and the resulting value is the remainder. Note that if either the
left- or right-hand argument is an expression, the expression is parenthesized. The
operator mod is not inside parentheses.
In fact, the mod operation also works if the two arguments are arbitrary real num-
bers, not just integers. In this book, we are concerned only with the integer operation.
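The formal definition can be written out directly. The sketch below is an illustration only; the function name `mod` is an assumption. It relies on the fact that Python's `//` operator on integers is floor division, so the expression matches $a - \lfloor a/n \rfloor \times n$ term by term.

```python
def mod(a, n):
    """Binary operator mod as defined above: a - floor(a/n) * n, for n != 0."""
    if n == 0:
        raise ValueError("modulus must be nonzero")
    return a - (a // n) * n   # '//' rounds toward negative infinity


print(mod(7, 3))     # 1
print(mod(-11, 7))   # 3, because floor(-11/7) = -2 and -11 - (-2)(7) = 3
```

For integer arguments this agrees with Python's built-in `%` operator. Languages whose remainder operator truncates toward zero (C, for example) give a different result when `a` is negative.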
The Congruence Relation mod
As a congruence relation, mod expresses that two arguments have the same remainder
with respect to a given modulus. For example, $7 \equiv 4 \pmod{3}$ expresses the
fact that both 7 and 4 have a remainder of 1 when divided by 3. The following two
expressions are equivalent:
$$a \equiv b \pmod{m} \iff a \bmod m = b \bmod m$$
Another way of expressing it is to say that the expression $a \equiv b \pmod{m}$ is the
same as saying that $a - b$ is an integral multiple of $m$. Again, all the arguments may
be integers, integer variables, or integer variable expressions. For example, all of
the following are valid, with the obvious meanings:
$7 \equiv 4 \pmod{3}$
$x \equiv y \pmod{m}$
$(x^2 + y + 1) \equiv (a + 1) \pmod{[m + n]}$
where all of the variables are integers. Two conventions are used. The congruence
sign is $\equiv$. The modulus for the relation is defined by placing the mod operator followed
by the modulus in parentheses.

The congruence relation is used to define residue classes. Those numbers that
have the same remainder $r$ when divided by $m$ form a residue class (mod $m$). There
are $m$ residue classes (mod $m$). For a given remainder $r$, the residue class to which it
belongs consists of the numbers
$$r,\; r \pm m,\; r \pm 2m,\; \ldots$$
According to our definition, the congruence
$$a \equiv b \pmod{m}$$
signifies that the numbers $a$ and $b$ differ by a multiple of $m$. Consequently, the congruence
can also be expressed in the terms that $a$ and $b$ belong to the same residue
class (mod $m$).
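The two views of congruence, equal remainders and membership in the same residue class, can be illustrated with a short sketch. The function names `congruent` and `residue_classes` are hypothetical and introduced only for this example; the modulus is assumed to be a positive integer.

```python
def congruent(a, b, m):
    """True if a and b differ by a multiple of m (m a positive integer)."""
    return (a - b) % m == 0


def residue_classes(values, m):
    """Group the given integers by their remainder r = value mod m."""
    classes = {r: [] for r in range(m)}
    for v in values:
        classes[v % m].append(v)
    return classes


print(congruent(7, 4, 3))               # True: both leave remainder 1
print(residue_classes(range(-4, 8), 3))
# {0: [-3, 0, 3, 6], 1: [-2, 1, 4, 7], 2: [-4, -1, 2, 5]}
```

Each list is a finite window onto one residue class: its members have the form $r \pm km$ and are pairwise congruent modulo $m$.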

Classical Encryption Techniques
3.1 Symmetric Cipher Model
Cryptanalysis and Brute-Force Attack
3.2 Substitution Techniques
Caesar Cipher
Monoalphabetic Ciphers
Playfair Cipher
Hill Cipher
Polyalphabetic Ciphers
One-Time Pad
3.3 Transposition Techniques
3.4 Rotor Machines
3.5 Steganography
3.6 Key Terms, Review Questions, and Problems

Symmetric encryption, also referred to as conventional encryption or single-key
encryption, was the only type of encryption in use prior to the development of public-
key encryption in the 1970s. It remains by far the most widely used of the two types
of encryption. Part One examines a number of symmetric ciphers. In this chapter, we
begin with a look at a general model for the symmetric encryption process; this will
enable us to understand the context within which the algorithms are used. Next, we
examine a variety of algorithms in use before the computer era. Finally, we look briefly
at a different approach known as steganography. Chapters 4 and 6 introduce the two
most widely used symmetric ciphers: DES and AES.
Before beginning, we define some terms. An original message is known as the
plaintext, while the coded message is called the ciphertext. The process of convert-
ing from plaintext to ciphertext is known as enciphering or encryption; restoring the
plaintext from the ciphertext is deciphering or decryption. The many schemes used
for encryption constitute the area of study known as cryptography. Such a scheme
is known as a cryptographic system or a cipher. Techniques used for deciphering a
message without any knowledge of the enciphering details fall into the area of crypt-
analysis. Cryptanalysis is what the layperson calls “breaking the code.” The areas of
cryptography and cryptanalysis together are called cryptology.
A symmetric encryption scheme has five ingredients (Figure 3.1):
■ Plaintext: This is the original intelligible message or data that is fed into the
algorithm as input.
■ Encryption algorithm: The encryption algorithm performs various substitu-
tions and transformations on the plaintext.
■ Secret key: The secret key is also input to the encryption algorithm. The key is
a value independent of the plaintext and of the algorithm. The algorithm will
produce a different output depending on the specific key being used at the
time. The exact substitutions and transformations performed by the algorithm
depend on the key.
After studying this chapter, you should be able to:
◆ Present an overview of the main concepts of symmetric cryptography.
◆ Explain the difference between cryptanalysis and brute-force attack.
◆ Understand the operation of a monoalphabetic substitution cipher.
◆ Understand the operation of a polyalphabetic cipher.
◆ Present an overview of the Hill cipher.
◆ Describe the operation of a rotor machine.

■ Ciphertext: This is the scrambled message produced as output. It depends on
the plaintext and the secret key. For a given message, two different keys will
produce two different ciphertexts. The ciphertext is an apparently random
stream of data and, as it stands, is unintelligible.
■ Decryption algorithm: This is essentially the encryption algorithm run in
reverse. It takes the ciphertext and the secret key and produces the original
plaintext.
There are two requirements for secure use of conventional encryption:
1. We need a strong encryption algorithm. At a minimum, we would like the algo-
rithm to be such that an opponent who knows the algorithm and has access to
one or more ciphertexts would be unable to decipher the ciphertext or figure
out the key. This requirement is usually stated in a stronger form: The oppo-
nent should be unable to decrypt ciphertext or discover the key even if he or
she is in possession of a number of ciphertexts together with the plaintext that
produced each ciphertext.
2. Sender and receiver must have obtained copies of the secret key in a secure
fashion and must keep the key secure. If someone can discover the key and
knows the algorithm, all communication using this key is readable.
We assume that it is impractical to decrypt a message on the basis of the
ciphertext plus knowledge of the encryption/decryption algorithm. In other words,
we do not need to keep the algorithm secret; we need to keep only the key secret.
This feature of symmetric encryption is what makes it feasible for widespread use.
The fact that the algorithm need not be kept secret means that manufacturers can
develop, and have developed, low-cost chip implementations of data encryption algorithms.
These chips are widely available and incorporated into a number of products. With
the use of symmetric encryption, the principal security problem is maintaining the
secrecy of the key.
Let us take a closer look at the essential elements of a symmetric encryp-
tion scheme, using Figure 3.2. A source produces a message in plaintext,
X = [X1, X2, …, XM]. The M elements of X are letters in some finite alphabet.
Traditionally, the alphabet usually consisted of the 26 capital letters. Nowadays,
the binary alphabet {0, 1} is typically used. For encryption, a key of the form
K = [K1, K2, …, KJ] is generated. If the key is generated at the message source,
then it must also be provided to the destination by means of some secure channel.
Alternatively, a third party could generate the key and securely deliver it to both
source and destination.
[Figure 3.1 Simplified Model of Symmetric Encryption: a secret key shared by sender
and recipient is input to both the encryption algorithm (e.g., AES), which produces
Y = E(K, X), and the decryption algorithm (reverse of encryption algorithm), which
recovers X = D(K, Y)]
With the message X and the encryption key K as input, the encryption algorithm
forms the ciphertext Y = [Y1, Y2, …, YN]. We can write this as
Y = E(K, X)
This notation indicates that Y is produced by using encryption algorithm E as a
function of the plaintext X, with the specific function determined by the value of
the key K.
The intended receiver, in possession of the key, is able to invert the
transformation:
X = D(K, Y)
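The relationship between Y = E(K, X) and X = D(K, Y) can be illustrated with a toy sketch. The repeating-key XOR used here is only a stand-in chosen for brevity; it is an assumption of this example, it is not an algorithm presented in this chapter, and it offers no real security.

```python
from itertools import cycle


def E(K: bytes, X: bytes) -> bytes:
    """Toy encryption: XOR each plaintext byte with the repeating key (illustration only)."""
    return bytes(x ^ k for x, k in zip(X, cycle(K)))


def D(K: bytes, Y: bytes) -> bytes:
    """Toy decryption: XOR is its own inverse, so D applies the same operation as E."""
    return E(K, Y)


K = b"secret"                              # hypothetical shared secret key
X = b"meet me after the toga party"        # plaintext
Y = E(K, X)                                # ciphertext sent to the receiver

print(Y.hex())
print(D(K, Y))                             # the holder of K recovers X
print(D(b"wrong!", Y) == X)                # a different key does not invert the transformation
```

The point of the sketch is structural: both algorithms are public, and only the value of K separates the intended receiver from an opponent.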
An opponent, observing Y but not having access to K or X, may attempt to
recover X or K or both X and K. It is assumed that the opponent knows the encryp-
tion (E) and decryption (D) algorithms. If the opponent is interested in only this
particular message, then the focus of the effort is to recover X by generating a
plaintext estimate X̂. Often, however, the opponent is interested in being able to read
future messages as well, in which case an attempt is made to recover K by generating
an estimate K̂.
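[Figure 3.2 Model of Symmetric Cryptosystem: the ciphertext Y = E(K, X) passes from
the encryption algorithm to the decryption algorithm, and the key K reaches the
destination over a secure channel]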

Cryptographic systems are characterized along three independent dimensions:
1. The type of operations used for transforming plaintext to ciphertext. All
encryption algorithms are based on two general principles: substitution,
in which each element in the plaintext (bit, letter, group of bits or letters)
is mapped into another element, and transposition, in which elements
in the plaintext are rearranged. The fundamental requirement is that no
information be lost (i.e., that all operations are reversible). Most systems,
referred to as product systems, involve multiple stages of substitutions and
transpositions.
2. The number of keys used. If both sender and receiver use the same key, the
system is referred to as symmetric, single-key, secret-key, or conventional
encryption. If the sender and receiver use different keys, the system is referred
to as asymmetric, two-key, or public-key encryption.
3. The way in which the plaintext is processed. A block cipher processes the input
one block of elements at a time, producing an output block for each input
block. A stream cipher processes the input elements continuously, producing
output one element at a time, as it goes along.
Cryptanalysis and Brute-Force Attack
Typically, the objective of attacking an encryption system is to recover the key in
use rather than simply to recover the plaintext of a single ciphertext. There are two
general approaches to attacking a conventional encryption scheme:
■ Cryptanalysis: Cryptanalytic attacks rely on the nature of the algorithm plus
perhaps some knowledge of the general characteristics of the plaintext or even
some sample plaintext–ciphertext pairs. This type of attack exploits the charac-
teristics of the algorithm to attempt to deduce a specific plaintext or to deduce
the key being used.
■ Brute-force attack: The attacker tries every possible key on a piece of cipher-
text until an intelligible translation into plaintext is obtained. On average, half
of all possible keys must be tried to achieve success.
If either type of attack succeeds in deducing the key, the effect is catastrophic:
All future and past messages encrypted with that key are compromised.
We first consider cryptanalysis and then discuss brute-force attacks.
Table 3.1 summarizes the various types of cryptanalytic attacks based on the
amount of information known to the cryptanalyst. The most difficult problem is
presented when all that is available is the ciphertext only. In some cases, not even
the encryption algorithm is known, but in general, we can assume that the opponent
does know the algorithm used for encryption. One possible attack under these cir-
cumstances is the brute-force approach of trying all possible keys. If the key space
is very large, this becomes impractical. Thus, the opponent must rely on an analysis
of the ciphertext itself, generally applying various statistical tests to it. To use this

approach, the opponent must have some general idea of the type of plaintext that
is concealed, such as English or French text, an EXE file, a Java source listing, an
accounting file, and so on.
The ciphertext-only attack is the easiest to defend against because the oppo-
nent has the least amount of information to work with. In many cases, however,
the analyst has more information. The analyst may be able to capture one or more
plaintext messages as well as their encryptions. Or the analyst may know that certain
plaintext patterns will appear in a message. For example, a file that is encoded in the
PostScript format always begins with the same pattern, or there may be a standardized
header or banner to an electronic funds transfer message, and so on. All these
are examples of known plaintext. With this knowledge, the analyst may be able to
deduce the key on the basis of the way in which the known plaintext is transformed.
Closely related to the known-plaintext attack is what might be referred to as a
probable-word attack. If the opponent is working with the encryption of some gen-
eral prose message, he or she may have little knowledge of what is in the message.
However, if the opponent is after some very specific information, then parts of the
message may be known. For example, if an entire accounting file is being transmit-
ted, the opponent may know the placement of certain key words in the header of the
file. As another example, the source code for a program developed by Corporation
X might include a copyright statement in some standardized position.
If the analyst is able somehow to get the source system to insert into the sys-
tem a message chosen by the analyst, then a chosen-plaintext attack is possible.
An example of this strategy is differential cryptanalysis, explored in Appendix S.
Table 3.1 Types of Attacks on Encrypted Messages

Type of Attack       Known to Cryptanalyst

Ciphertext Only      ■ Encryption algorithm
                     ■ Ciphertext

Known Plaintext      ■ Encryption algorithm
                     ■ Ciphertext
                     ■ One or more plaintext–ciphertext pairs formed with the secret key

Chosen Plaintext     ■ Encryption algorithm
                     ■ Ciphertext
                     ■ Plaintext message chosen by cryptanalyst, together with its
                       corresponding ciphertext generated with the secret key

Chosen Ciphertext    ■ Encryption algorithm
                     ■ Ciphertext
                     ■ Ciphertext chosen by cryptanalyst, together with its corresponding
                       decrypted plaintext generated with the secret key

Chosen Text          ■ Encryption algorithm
                     ■ Ciphertext
                     ■ Plaintext message chosen by cryptanalyst, together with its
                       corresponding ciphertext generated with the secret key
                     ■ Ciphertext chosen by cryptanalyst, together with its corresponding
                       decrypted plaintext generated with the secret key

In general, if the analyst is able to choose the messages to encrypt, the analyst may
deliberately pick patterns that can be expected to reveal the structure of the key.
Table 3.1 lists two other types of attack: chosen ciphertext and chosen text.
These are less commonly employed as cryptanalytic techniques but are nevertheless
possible avenues of attack.
Only relatively weak algorithms fail to withstand a ciphertext-only attack.
Generally, an encryption algorithm is designed to withstand a known-plaintext
attack.
Two more definitions are worthy of note. An encryption scheme is
unconditionally secure if the ciphertext generated by the scheme does not contain
enough information to determine uniquely the corresponding plaintext, no matter
how much ciphertext is available. That is, no matter how much time an opponent
has, it is impossible for him or her to decrypt the ciphertext simply because the
required information is not there. With the exception of a scheme known as the
one-time pad (described later in this chapter), there is no encryption algorithm that
is unconditionally secure. Therefore, all that the users of an encryption algorithm
can strive for is an algorithm that meets one or both of the following criteria:
■ The cost of breaking the cipher exceeds the value of the encrypted information.
■ The time required to break the cipher exceeds the useful lifetime of the
information.
An encryption scheme is said to be computationally secure if either of the
foregoing two criteria is met. Unfortunately, it is very difficult to estimate the
amount of effort required to cryptanalyze ciphertext successfully.
All forms of cryptanalysis for symmetric encryption schemes are designed
to exploit the fact that traces of structure or pattern in the plaintext may survive
encryption and be discernible in the ciphertext. This will become clear as we exam-
ine various symmetric encryption schemes in this chapter. We will see in Part Two
that cryptanalysis for public-key schemes proceeds from a fundamentally different
premise, namely, that the mathematical properties of the pair of keys may make it
possible for one of the two keys to be deduced from the other.
A brute-force attack involves trying every possible key until an intelligible
translation of the ciphertext into plaintext is obtained. On average, half of all pos-
sible keys must be tried to achieve success. That is, if there are X different keys, on
average an attacker would discover the actual key after X/2 tries. It is important to
note that there is more to a brute-force attack than simply running through all pos-
sible keys. Unless known plaintext is provided, the analyst must be able to recognize
plaintext as plaintext. If the message is just plain text in English, then the result pops
out easily, although the task of recognizing English would have to be automated. If
the text message has been compressed before encryption, then recognition is more
difficult. And if the message is some more general type of data, such as a numeri-
cal file, and this has been compressed, the problem becomes even more difficult to
automate. Thus, to supplement the brute-force approach, some degree of knowl-
edge about the expected plaintext is needed, and some means of automatically dis-
tinguishing plaintext from garble is also needed.

In this section and the next, we examine a sampling of what might be called classical
encryption techniques. A study of these techniques enables us to illustrate the basic
approaches to symmetric encryption used today and the types of cryptanalytic at-
tacks that must be anticipated.
The two basic building blocks of all encryption techniques are substitution
and transposition. We examine these in the next two sections. Finally, we discuss a
system that combines both substitution and transposition.
A substitution technique is one in which the letters of plaintext are replaced
by other letters or by numbers or symbols.¹ If the plaintext is viewed as a sequence
of bits, then substitution involves replacing plaintext bit patterns with ciphertext bit
patterns.
Caesar Cipher
The earliest known, and the simplest, use of a substitution cipher was by Julius
Caesar. The Caesar cipher involves replacing each letter of the alphabet with the
letter standing three places further down the alphabet. For example,
plain: meet me after the toga party
cipher: PHHW PH DIWHU WKH WRJD SDUWB
Note that the alphabet is wrapped around, so that the letter following Z is A.
We can define the transformation by listing all possibilities, as follows:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
Let us assign a numerical equivalent to each letter:
a b c d e f g h i j k l m
0 1 2 3 4 5 6 7 8 9 10 11 12
n o p q r s t u v w x y z
13 14 15 16 17 18 19 20 21 22 23 24 25
Then the algorithm can be expressed as follows. For each plaintext letter p, substitute
the ciphertext letter C:²
C = E(3, p) = (p + 3) mod 26
A shift may be of any amount, so that the general Caesar algorithm is
C = E(k, p) = (p + k) mod 26 (3.1)
¹ When letters are involved, the following conventions are used in this book. Plaintext is always in
lowercase; ciphertext is in uppercase; key values are in italicized lowercase.
² We define a mod n to be the remainder when a is divided by n. For example, 11 mod 7 = 4. See Chapter 2
for a further discussion of modular arithmetic.

where k takes on a value in the range 1 to 25. The decryption algorithm is simply
p = D(k, C) = (C – k) mod 26 (3.2)
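Equations (3.1) and (3.2) translate almost directly into code. The following is a minimal sketch; the function names caesar_encrypt and caesar_decrypt are illustrative, and the choice to pass spaces and other non-letters through unchanged is an assumption of this example.

```python
import string

LOWER = string.ascii_lowercase   # plaintext alphabet, a = 0 ... z = 25
UPPER = string.ascii_uppercase   # ciphertext alphabet, following the book's case convention


def caesar_encrypt(k: int, plaintext: str) -> str:
    """C = E(k, p) = (p + k) mod 26, applied letter by letter."""
    out = []
    for ch in plaintext:
        if ch in LOWER:
            out.append(UPPER[(LOWER.index(ch) + k) % 26])
        else:
            out.append(ch)       # assumption: non-letters are copied as-is
    return "".join(out)


def caesar_decrypt(k: int, ciphertext: str) -> str:
    """p = D(k, C) = (C - k) mod 26, applied letter by letter."""
    out = []
    for ch in ciphertext:
        if ch in UPPER:
            out.append(LOWER[(UPPER.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


cipher = caesar_encrypt(3, "meet me after the toga party")
print(cipher)
print(caesar_decrypt(3, cipher))
```

Python's % operator returns a nonnegative result for a positive modulus, so (C - k) % 26 wraps around correctly without a separate case for negative differences.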
If it is known that a given ciphertext is a Caesar cipher, then a brute-force
cryptanalysis is easily performed: simply try all the 25 possible keys. Figure 3.3
shows the results of applying this strategy to the example ciphertext. In this case, the
plaintext leaps out as occupying the third line.
Three important characteristics of this problem enabled us to use a brute-
force cryptanalysis:
1. The encryption and decryption algorithms are known.
2. There are only 25 keys to try.
3. The language of the plaintext is known and easily recognizable.
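The brute-force search described above is small enough to sketch in a few lines. The helper names shift_back and brute_force are illustrative; the ciphertext is the encryption of the toga-party example under k = 3.

```python
def shift_back(ciphertext: str, k: int) -> str:
    """Undo a Caesar shift of k, mapping uppercase ciphertext to lowercase plaintext."""
    return "".join(
        chr((ord(ch) - ord("A") - k) % 26 + ord("a")) if "A" <= ch <= "Z" else ch
        for ch in ciphertext
    )


def brute_force(ciphertext: str) -> dict[int, str]:
    """Try every key from 1 to 25 and return the candidate plaintext for each."""
    return {k: shift_back(ciphertext, k) for k in range(1, 26)}


candidates = brute_force("PHHW PH DIWHU WKH WRJD SDUWB")
for k, text in candidates.items():
    print(f"{k:2d} {text}")
```

The code only enumerates candidates. Picking out line 3 as the plaintext still relies on the third characteristic in the list: a human or an automated test must recognize the language.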
In most networking situations, we can assume that the algorithms are known.
What generally makes brute-force cryptanalysis impractical is the use of an algo-
rithm that employs a large number of keys. For example, the triple DES algorithm,
Figure 3.3 Brute-Force Cryptanalysis of Caesar Cipher
1 oggv og chvgt vjg vqic rctva
2 nffu nf bgufs uif uphb qbsuz
3 meet me after the toga party
4 ldds ld zesdq sgd snfz ozqsx
5 kccr kc ydrcp rfc rmey nyprw
6 jbbq jb xcqbo qeb qldx mxoqv
7 iaap ia wbpan pda pkcw lwnpu
8 hzzo hz vaozm ocz ojbv kvmot
9 gyyn gy uznyl nby niau julns
10 fxxm fx tymxk max mhzt itkmr
11 ewwl ew sxlwj lzw lgys hsjlq
12 dvvk dv rwkvi kyv kfxr grikp
13 cuuj cu qvjuh jxu jewq fqhjo
14 btti bt puitg iwt idvp epgin
15 assh as othsf hvs hcuo dofhm
16 zrrg zr nsgre gur gbtn cnegl
17 yqqf yq mrfqd ftq fasm bmdfk
18 xppe xp lqepc esp ezrl alcej
19 wood wo kpdob dro dyqk zkbdi
20 vnnc vn jocna cqn cxpj yjach
21 ummb um inbmz bpm bwoi xizbg
22 tlla tl hmaly aol avnh whyaf
23 skkz sk glzkx znk zumg vgxze
24 rjjy rj fkyjw ymj ytlf ufwyd
25 qiix qi ejxiv xli xske tevxc

examined in Chapter 7, makes use of a 168-bit key, giving a key space of 2^168 or
greater than 3.7 × 10^50 possible keys.
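The size of this key space can be checked with exact integer arithmetic. In the sketch below, the search rate of 10^9 keys per second is a purely hypothetical figure chosen for illustration; it does not come from the text.

```python
KEY_BITS = 168
key_space = 2 ** KEY_BITS            # number of possible 168-bit keys
average_tries = key_space // 2       # on average, half the keys must be tried

ASSUMED_KEYS_PER_SECOND = 10 ** 9    # hypothetical attacker speed (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600

years = average_tries / (ASSUMED_KEYS_PER_SECOND * SECONDS_PER_YEAR)

print(f"key space     : {key_space:.2e}")
print(f"average tries : {average_tries:.2e}")
print(f"years at the assumed rate: {years:.2e}")
```

Compare this with the 25 keys of the Caesar cipher, where the whole search fits on a single page.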
The third characteristic is also significant. If the language of the plaintext is
unknown, then plaintext output may not be recognizable. Furthermore, the input
may be abbreviated or compressed in some fashion, again making recognition dif-
ficult. For example, Figure 3.4 shows a portion of a text file compressed using an
algorithm called ZIP. If this file is then encrypted with a simple substitution cipher
(expanded to include more than just 26 alphabetic characters), then the plaintext
may not be recognized when it is uncovered in the brute-force cryptanalysis.
Monoalphabetic Ciphers
With only 25 possible keys, the Caesar cipher is far from secure. A dramatic increase
in the key space can be achieved by allowing an arbitrary substitution. Before pro-
ceeding, we define the term permutation. A permutation of a finite set of elements S
is an ordered sequence of all the elements of S, with each element appearing exactly
once. For example, if S = {a, b, c}, there are six permutations of S:
abc, acb, bac, bca, cab, cba
In general, there are n! permutations of a set of n elements, because the first
element can be chosen in one of n ways, the second in n – 1 ways, the third in n – 2
ways, and so on.
Recall the assignment for the Caesar cipher:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
If, instead, the “cipher” line can be any permutation of the 26 alphabetic characters,
then there are 26! or greater than 4 × 10^26 possible keys. This is 10 orders of magnitude
greater than the key space for DES and would seem to eliminate brute-force
techniques for cryptanalysis. Such an approach is referred to as a monoalphabetic
substitution cipher, because a single cipher alphabet (mapping from plain alphabet
to cipher alphabet) is used per message.
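The counts quoted above are easy to confirm. The sketch below lists the permutations of {a, b, c} and compares 26! with the DES key space, taking DES to have a 56-bit key (a fact not restated in this passage).

```python
from itertools import permutations
from math import factorial, log10

# The six permutations of S = {a, b, c}.
perms = ["".join(p) for p in permutations("abc")]
print(perms)

# Key space of a monoalphabetic substitution cipher over 26 letters.
mono_keys = factorial(26)
print(f"26! = {mono_keys:.2e}")

# Key space of DES, assuming its 56-bit key.
des_keys = 2 ** 56
orders = log10(mono_keys / des_keys)
print(f"26! exceeds 2^56 by about {orders:.1f} orders of magnitude")
```

The ratio comes out slightly under 10 orders of magnitude, which the text rounds to 10.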
There is, however, another line of attack. If the cryptanalyst knows the nature
of the plaintext (e.g., noncompressed English text), then the analyst can exploit the
regularities of the language. To see how such a cryptanalysis might proceed, we give
a partial example here that is adapted from one in [SINK09]. The ciphertext to be
solved is
Figure 3.4 Sample of Compressed Text

As a first step, the relative frequency of the letters can be determined and
compared to a standard frequency distribution for English, such as is shown in
Figure 3.5 (based on [LEWA00]). If the message were long enough, this technique
alone might be sufficient, but because this is a relatively short message, we cannot
expect an exact match. In any case, the relative frequencies of the letters in the
ciphertext (in percentages) are as follows:
P 13.33 H 5.83 F 3.33 B 1.67 C 0.00
Z 11.67 D 5.00 W 3.33 G 1.67 K 0.00
S 8.33 E 5.00 Q 2.50 Y 1.67 L 0.00
U 8.33 V 4.17 T 2.50 I 0.83 N 0.00
O 7.50 X 4.17 A 1.67 J 0.83 R 0.00
M 6.67
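A table of this kind can be produced mechanically. The following is a minimal sketch; the function name letter_frequencies is illustrative, and the short Caesar ciphertext from earlier in the chapter stands in for the ciphertext under analysis.

```python
from collections import Counter


def letter_frequencies(text: str) -> dict[str, float]:
    """Return each letter's share of the text in percent, most frequent first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {ch: round(100 * n / total, 2) for ch, n in counts.most_common()}


freqs = letter_frequencies("PHHW PH DIWHU WKH WRJD SDUWB")
for letter, percent in freqs.items():
    print(f"{letter} {percent:5.2f}")
```

In this sample the two most frequent cipher letters, H and W, correspond to plain e and t, which is the kind of correspondence the analysis looks for. With only 23 letters the match is partly luck; longer texts give more reliable statistics.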
Comparing this breakdown with Figure 3.5, it seems likely that cipher letters
P and Z are the equivalents of plain letters e and t, but it is not certain which is which.
The letters S, U, O, M, and H are all of relatively high frequency and probably
correspond to plain letters from the set {a, h, i, n, o, r, s}. The letters with the lowest
frequencies (namely, A, B, G, Y, I, J) are likely included in the set {b, j, k, q, v, x, z}.
[Figure 3.5 Relative Frequency of Letters in English Text]
There are a number of ways to proceed at this point. We could make some
tentative assignments and start to fill in the plaintext to see if it looks like a rea-
sonable “skeleton” of a message. A more systematic approach is to look for other
regularities. For example, certain words may be known to be in the text. Or we
could look for repeating sequences of cipher letters and try to deduce their plaintext
equivalents.
A powerful tool is to look at the frequency of two-letter combinations, known
as digrams. A table similar to Figure 3.5 could be drawn up showing the relative fre-
quency of digrams. The most common such digram is th. In our ciphertext, the most
common digram is ZW, which appears three times. So we make the correspondence
of Z with t and W with h. Then, by our earlier hypothesis, we can equate P with e.
Now notice that the sequence ZWP appears in the ciphertext, and we can translate
that sequence as “the.” This is the most frequent trigram (three-letter combination)
in English, which seems to indicate that we are on the right track.
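Digram and trigram counts can be gathered with the same sliding-window idea. In this sketch the function name ngram_counts is illustrative, and the counts are taken over the recovered plaintext of this example with spaces removed, so that they mirror what an analyst sees in the unspaced ciphertext.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every run of n consecutive letters, ignoring spaces and punctuation."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))


plaintext = (
    "it was disclosed yesterday that several informal but "
    "direct contacts have been made with political "
    "representatives of the viet cong in moscow"
)

digrams = ngram_counts(plaintext, 2)
trigrams = ngram_counts(plaintext, 3)

print(digrams["TH"])     # occurrences of th, the plaintext counterpart of ZW
print(trigrams["THE"])   # occurrences of the, the plaintext counterpart of ZWP
print(digrams.most_common(5))
```

A monoalphabetic substitution renames letters without moving them, so the count of th in the plaintext equals the count of ZW in the ciphertext.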
Next, notice the sequence ZWSZ in the first line. We do not know that these
four letters form a complete word, but if they do, it is of the form th_t. If so, S
equates with a.
So far, then, we have
t a e e te a that e e a a
e t ta t ha e ee a e th t a
e e e tat e the t
Only four letters have been identified, but already we have quite a bit of the
message. Continued analysis of frequencies plus trial and error should easily yield a
solution from this point. The complete plaintext, with spaces added between words,
follows:
it was disclosed yesterday that several informal but
direct contacts have been made with political
representatives of the viet cong in moscow
Monoalphabetic ciphers are easy to break because they reflect the frequency
data of the original alphabet. A countermeasure is to provide multiple substi-
tutes, known as homophones, for a single letter. For example, the letter e could
be assigned a number of different cipher symbols, such as 16, 74, 35, and 21, with
each homophone assigned to a letter in rotation or randomly. If the number of
symbols assigned to each letter is proportional to the relative frequency of that let-
ter, then single-letter frequency information is completely obliterated. The great
mathematician Carl Friedrich Gauss believed that he had devised an unbreak-
able cipher using homophones. However, even with homophones, each element
of plaintext affects only one element of ciphertext, and multiple-letter patterns

(e.g., digram frequencies) still survive in the ciphertext, making cryptanalysis rela-
tively straightforward.
Two principal methods are used in substitution ciphers to lessen the extent to
which the structure of the plaintext survives in the ciphertext: One approach is to
encrypt multiple letters of plaintext, and the other is to use multiple cipher alpha-
bets. We briefly examine each.
Playfair Cipher
The best-known multiple-letter encryption cipher is the Playfair, which treats digrams
in the plaintext as single units and translates these units into ciphertext
digrams.³
The Playfair algorithm is based on the use of a 5 × 5 matrix of letters constructed
using a keyword. Here is an example, solved by Lord Peter Wimsey in
Dorothy Sayers’s Have His Carcase:⁴
M O N A R
C H Y B D
E F G I/J K
L P Q S T
U V W X Z
In this case, the keyword is monarchy. The matrix is constructed by filling
in the letters of the keyword (minus duplicates) from left to right and from top to
bottom, and then filling in the remainder of the matrix with the remaining letters in
alphabetic order. The letters I and J count as one letter. Plaintext is encrypted two
letters at a time, according to the following rules:
1. Repeating plaintext letters that are in the same pair are separated with a filler
letter, such as x, so that balloon would be treated as ba lx lo on.
2. Two plaintext letters that fall in the same row of the matrix are each replaced
by the letter to the right, with the first element of the row circularly following
the last. For example, ar is encrypted as RM.
3. Two plaintext letters that fall in the same column are each replaced by the let-
ter beneath, with the top element of the column circularly following the last.
For example, mu is encrypted as CM.
4. Otherwise, each plaintext letter in a pair is replaced by the letter that lies in
its own row and the column occupied by the other plaintext letter. Thus, hs
becomes BP and ea becomes IM (or JM, as the encipherer wishes).
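The matrix construction and the four rules can be sketched as follows. The function names are illustrative. Two choices are assumptions of this sketch and are not specified in the text: a trailing single letter is padded with the filler x, and the combined I/J cell is always written as I in the output.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def build_matrix(keyword: str) -> list[list[str]]:
    """Fill a 5 x 5 matrix with the keyword (minus duplicates), then the remaining letters."""
    seen: list[str] = []
    for ch in keyword.lower() + ALPHABET:
        ch = "i" if ch == "j" else ch            # I and J count as one letter
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]


def split_digrams(plaintext: str, filler: str = "x") -> list[tuple[str, str]]:
    """Rule 1: pair up letters, separating a repeated letter within a pair by the filler."""
    letters = ["i" if ch == "j" else ch for ch in plaintext.lower() if ch in ALPHABET]
    pairs, i = [], 0
    while i < len(letters):
        if i + 1 < len(letters) and letters[i + 1] != letters[i]:
            pairs.append((letters[i], letters[i + 1]))
            i += 2
        else:
            pairs.append((letters[i], filler))   # repeated letter, or odd letter at the end
            i += 1
    return pairs


def playfair_encrypt(keyword: str, plaintext: str) -> str:
    m = build_matrix(keyword)
    pos = {m[r][c]: (r, c) for r in range(5) for c in range(5)}
    out = []
    for a, b in split_digrams(plaintext):
        (ra, ca), (rb, cb) = pos[a], pos[b]
        if ra == rb:                              # Rule 2: same row, take letter to the right
            out.append(m[ra][(ca + 1) % 5] + m[rb][(cb + 1) % 5])
        elif ca == cb:                            # Rule 3: same column, take letter beneath
            out.append(m[(ra + 1) % 5][ca] + m[(rb + 1) % 5][cb])
        else:                                     # Rule 4: own row, other letter's column
            out.append(m[ra][cb] + m[rb][ca])
    return "".join(out).upper()


for pair in ("ar", "mu", "hs", "ea"):
    print(pair, playfair_encrypt("monarchy", pair))
print(split_digrams("balloon"))
```

With the keyword monarchy, the four printed pairs reproduce the examples given in rules 2 through 4, and balloon splits as ba lx lo on.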
The Playfair cipher is a great advance over simple monoalphabetic ciphers.
For one thing, whereas there are only 26 letters, there are 26 × 26 = 676 digrams,
³ This cipher was actually invented by British scientist Sir Charles Wheatstone in 1854, but it bears the
name of his friend Baron Playfair of St. Andrews, who championed the cipher at the British foreign office.
⁴ The book provides an absorbing account of a probable-word attack.

so that identification of individual digrams is more difficult. Furthermore, the rela-
tive frequencies of individual letters exhibit a much greater range than that of
digrams, making frequency analysis much more difficult. For these reasons, the
Playfair cipher was for a long time considered unbreakable. It was used as the stan-
dard field system by the British Army in World War I and still enjoyed considerable
use by the U.S. Army and other Allied forces during World War II.
Despite this level of confidence in its security, the Playfair cipher is relatively
easy to break, because it still leaves much of the structure of the plaintext language
intact. A few hundred letters of ciphertext are generally sufficient.
One way of revealing the effectiveness of the Playfair and other ciphers is
shown in Figure 3.6. The line labeled plaintext plots a typical frequency distribution
of the 26 alphabetic characters (no distinction between upper and lower case) in
ordinary text. This is also the frequency distribution of any monoalphabetic substi-
tution cipher, because the frequency values for individual letters are the same, just
with different letters substituted for the original letters. The plot is developed in the
following way: The number of occurrences of each letter in the text is counted and
divided by the number of occurrences of the most frequently used letter. Using the
results of Figure 3.5, we see that e is the most frequently used letter. As a result, e
has a relative frequency of 1, t of 9.056/12.702 ≈ 0.71, and so on. The points on the
horizontal axis correspond to the letters in order of decreasing frequency.
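The normalization used for the plot can be sketched in a few lines. Only the two percentages quoted in the text (e and t) are used here; the function name normalize is illustrative, and a full plot would need the remaining 24 values from Figure 3.5.

```python
def normalize(percent: dict[str, float]) -> list[tuple[str, float]]:
    """Divide each frequency by the largest one and rank letters by decreasing frequency."""
    top = max(percent.values())
    ranked = sorted(percent.items(), key=lambda item: item[1], reverse=True)
    return [(letter, value / top) for letter, value in ranked]


# The two values quoted in the text, in percent.
percent = {"t": 9.056, "e": 12.702}

points = normalize(percent)
for rank, (letter, rel) in enumerate(points, start=1):
    print(f"{rank:2d} {letter} {rel:.3f}")
```

The rank becomes the position on the horizontal axis and the normalized value becomes the height of the curve at that position.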
Figure 3.6 also shows the frequency distribution that results when the text is
encrypted using the Playfair cipher. To normalize the plot, the number of occurrences
of each letter in the ciphertext was again divided by the number of occurrences
of e in the plaintext. The resulting plot therefore shows the extent to which
the frequency distribution of letters, which makes it trivial to solve substitution
ciphers, is masked by encryption. If the frequency distribution information were
totally concealed in the encryption process, the ciphertext plot of frequencies would
be flat, and cryptanalysis using ciphertext only would be effectively impossible. As
the figure shows, the Playfair cipher has a flatter distribution than does plaintext,
but nevertheless, it reveals plenty of structure for a cryptanalyst to work with. The
plot also shows the Vigenère cipher, discussed subsequently. The Hill and Vigenère
curves on the plot are based on results reported in [SIMM93].
[Figure 3.6 Relative Frequency of Occurrence of Letters. Horizontal axis: frequency
ranked letters (decreasing frequency), 1 to 26. Curve labels that survive include
"Random polyalphabetic".]
Hill Cipher⁵
Another interesting multiletter cipher is the Hill cipher, developed by the math-
ematician Lester Hill in 1929.
CONCEPTS FROM LINEAR ALGEBRA Before describing the Hill cipher, let us briefly
review some terminology from linear algebra. In this discussion, we are concerned
with matrix arithmetic modulo 26. For the reader who needs a refresher on matrix
multiplication and inversion, see Appendix E.
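We define the inverse $M^{-1}$ of a square matrix M by the equation
$M M^{-1} = M^{-1} M = I$, where I is the identity matrix. I is a square matrix that is all zeros except
for ones along the main diagonal from upper left to lower right. The inverse of a
matrix does not always exist, but when it does, it satisfies the preceding equation.
For example,
\[
A = \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix}, \qquad
A^{-1} \bmod 26 = \begin{pmatrix} 9 & 2 \\ 1 & 15 \end{pmatrix}
\]
\[
A A^{-1} = \begin{pmatrix} (5 \times 9) + (8 \times 1) & (5 \times 2) + (8 \times 15) \\ (17 \times 9) + (3 \times 1) & (17 \times 2) + (3 \times 15) \end{pmatrix}
= \begin{pmatrix} 53 & 130 \\ 156 & 79 \end{pmatrix} \bmod 26
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]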
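The product above can be confirmed with a few lines of plain Python. The helper name matmul_mod is illustrative; no matrix library is assumed.

```python
def matmul_mod(X: list[list[int]], Y: list[list[int]], m: int = 26) -> list[list[int]]:
    """Multiply two matrices and reduce every entry modulo m."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) % m for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


A = [[5, 8], [17, 3]]
A_inv = [[9, 2], [1, 15]]

print(matmul_mod(A, A_inv))   # product in the order A times its inverse
print(matmul_mod(A_inv, A))   # the inverse works on either side
```

Both products reduce to the 2 × 2 identity matrix modulo 26, as the defining equation requires.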

To explain how the inverse of a matrix is computed, we begin with the concept
of determinant. For any square matrix (m × m), the determinant equals the sum of
all the products that can be formed by taking exactly one element from each row
and exactly one element from each column, with certain of the product terms preceded
by a minus sign. For a 2 × 2 matrix,
\[
\begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix}
\]
the determinant is $k_{11}k_{22} - k_{12}k_{21}$. For a 3 × 3 matrix, the value of the determinant
is $k_{11}k_{22}k_{33} + k_{21}k_{32}k_{13} + k_{31}k_{12}k_{23} - k_{31}k_{22}k_{13} - k_{21}k_{12}k_{33} - k_{11}k_{32}k_{23}$. If a square
matrix A has a nonzero determinant, then the inverse of the matrix is computed
as $[A^{-1}]_{ij} = (\det A)^{-1} (-1)^{i+j} (D_{ji})$, where $(D_{ji})$ is the subdeterminant formed by
deleting the jth row and the ith column of A, det(A) is the determinant of A, and
$(\det A)^{-1}$ is the multiplicative inverse of (det A) mod 26.
⁵ This cipher is somewhat more difficult to understand than the others in this chapter, but it illustrates an
important point about cryptanalysis that will be useful later on. This subsection can be skipped on a first
reading.
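Continuing our example,
$$\det \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix} = (5 \times 3) - (8 \times 17) = -121 \bmod 26 = 9$$
We can show that $9^{-1} \bmod 26 = 3$, because $9 \times 3 = 27 \bmod 26 = 1$ (see
Chapter 2 or Appendix E). Therefore, we compute the inverse of A as
$$A = \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix}$$
$$A^{-1} \bmod 26 = 3 \begin{pmatrix} 3 & -8 \\ -17 & 5 \end{pmatrix} = 3 \begin{pmatrix} 3 & 18 \\ 9 & 5 \end{pmatrix} = \begin{pmatrix} 9 & 54 \\ 27 & 15 \end{pmatrix} = \begin{pmatrix} 9 & 2 \\ 1 & 15 \end{pmatrix}$$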
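The 2 × 2 inversion worked through above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original text: the function name `inverse_2x2_mod` is an illustrative choice, and it assumes Python 3.8 or later, where `pow(x, -1, n)` returns a modular inverse.

```python
def inverse_2x2_mod(matrix, modulus=26):
    """Invert a 2x2 matrix modulo `modulus` via the adjugate formula.

    Hypothetical helper for illustration only.
    """
    (a, b), (c, d) = matrix
    det = (a * d - b * c) % modulus
    # pow(det, -1, modulus) raises ValueError when det has no inverse,
    # i.e. when gcd(det, modulus) != 1.
    det_inv = pow(det, -1, modulus)
    return [
        [(det_inv * d) % modulus, (det_inv * -b) % modulus],
        [(det_inv * -c) % modulus, (det_inv * a) % modulus],
    ]


A = [[5, 8], [17, 3]]
A_inv = inverse_2x2_mod(A)
print(A_inv)
```

With the matrix A from the example, the determinant reduces to 9 and its inverse mod 26 is 3, so the result should agree with the matrix derived by hand.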
THE HILL ALGORITHM This encryption algorithm takes m successive plaintext letters
and substitutes for them m ciphertext letters. The substitution is determined
by m linear equations in which each character is assigned a numerical value
(a = 0, b = 1, ..., z = 25). For m = 3, the system can be described as
$$c_1 = (k_{11}p_1 + k_{21}p_2 + k_{31}p_3) \bmod 26$$
$$c_2 = (k_{12}p_1 + k_{22}p_2 + k_{32}p_3) \bmod 26$$
$$c_3 = (k_{13}p_1 + k_{23}p_2 + k_{33}p_3) \bmod 26$$
This can be expressed in terms of row vectors and matrices (see footnote 6):
$$(c_1 \; c_2 \; c_3) = (p_1 \; p_2 \; p_3) \begin{pmatrix} k_{11} & k_{12} & k_{13} \\ k_{21} & k_{22} & k_{23} \\ k_{31} & k_{32} & k_{33} \end{pmatrix} \bmod 26$$
$$C = PK \bmod 26$$
where C and P are row vectors of length 3 representing the ciphertext and plaintext,
respectively, and K is a 3 × 3 matrix representing the encryption key. Operations are
performed mod 26.
6Some cryptography books express the plaintext and ciphertext as column vectors, so that the column
vector is placed after the matrix rather than the row vector placed before the matrix. Sage uses row vec-
tors, so we adopt that convention.

For example, consider the plaintext “paymoremoney” and use the encryption key
$$K = \begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix}$$
The first three letters of the plaintext are represented by the vector (15 0 24).
Then (15 0 24)K = (303 303 531) mod 26 = (17 17 11) = RRL. Continuing in this
fashion, the ciphertext for the entire plaintext is RRLMWBKASPDH.
Decryption requires using the inverse of the matrix K. We can compute
det K = 23, and therefore, $(\det K)^{-1} \bmod 26 = 17$. We can then compute the inverse as
(see footnote 7)
$$K^{-1} = \begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix}$$
This is demonstrated as
$$\begin{pmatrix} 17 & 17 & 5 \\ 21 & 18 & 21 \\ 2 & 2 & 19 \end{pmatrix} \begin{pmatrix} 4 & 9 & 15 \\ 15 & 17 & 6 \\ 24 & 0 & 17 \end{pmatrix} = \begin{pmatrix} 443 & 442 & 442 \\ 858 & 495 & 780 \\ 494 & 52 & 365 \end{pmatrix} \bmod 26 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
It is easily seen that if the matrix K-1 is applied to the ciphertext, then the
plaintext is recovered.
In general terms, the Hill system can be expressed as
$$C = E(K, P) = PK \bmod 26$$
$$P = D(K, C) = CK^{-1} \bmod 26 = PKK^{-1} = P$$
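The general Hill equations above can be exercised on the “paymoremoney” example. The following is a sketch under stated assumptions: the function name `hill_apply` is hypothetical, letters are mapped a = 0 through z = 25, blocks are treated as row vectors placed before the key matrix (the convention used in this section), and the text length is assumed to be a multiple of the block size.

```python
K = [[17, 17, 5], [21, 18, 21], [2, 2, 19]]
K_INV = [[4, 9, 15], [15, 17, 6], [24, 0, 17]]


def hill_apply(text, key, modulus=26):
    """Multiply successive row-vector blocks of `text` by `key`, mod 26.

    The same routine encrypts (with K) and decrypts (with K inverse).
    """
    m = len(key)
    nums = [ord(ch) - ord("a") for ch in text.lower()]
    if len(nums) % m:
        raise ValueError("text length must be a multiple of the block size")
    out = []
    for start in range(0, len(nums), m):
        block = nums[start:start + m]
        for col in range(m):
            # c_col = sum over rows of p_row * k[row][col]
            total = sum(block[row] * key[row][col] for row in range(m))
            out.append(total % modulus)
    return "".join(chr(n + ord("A")) for n in out)


ciphertext = hill_apply("paymoremoney", K)
recovered = hill_apply(ciphertext, K_INV).lower()
print(ciphertext, recovered)
```

Using one routine for both directions mirrors the symmetry of C = PK and P = CK^{-1}; only the matrix changes.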
As with Playfair, the strength of the Hill cipher is that it completely hides
single-letter frequencies. Indeed, with Hill, the use of a larger matrix hides more
frequency information. Thus, a 3 × 3 Hill cipher hides not only single-letter but
also two-letter frequency information.
Although the Hill cipher is strong against a ciphertext-only attack, it is easily
broken with a known plaintext attack. For an $m \times m$ Hill cipher, suppose we have m
plaintext–ciphertext pairs, each of length m. We label the pairs $P_j = (p_{1j} \; p_{2j} \; \ldots \; p_{mj})$
and $C_j = (c_{1j} \; c_{2j} \; \ldots \; c_{mj})$ such that $C_j = P_j K$ for $1 \le j \le m$ and for some unknown
key matrix K. Now define two $m \times m$ matrices $X = (p_{ij})$ and $Y = (c_{ij})$. Then we
can form the matrix equation Y = XK. If X has an inverse, then we can determine
$K = X^{-1}Y$. If X is not invertible, then a new version of X can be formed with additional
plaintext–ciphertext pairs until an invertible X is obtained.
Consider this example. Suppose that the plaintext “hillcipher” is encrypted
using a 2 × 2 Hill cipher to yield the ciphertext HCRZSSXNSP. Thus, we know
that (7 8)K mod 26 = (7 2); (11 11)K mod 26 = (17 25); and so on. Using
the first two plaintext-ciphertext pairs, we have
7The calculations for this example are provided in detail in Appendix E.

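$$\begin{pmatrix} 7 & 2 \\ 17 & 25 \end{pmatrix} = \begin{pmatrix} 7 & 8 \\ 11 & 11 \end{pmatrix} K \bmod 26$$
The inverse of X can be computed:
$$\begin{pmatrix} 7 & 8 \\ 11 & 11 \end{pmatrix}^{-1} = \begin{pmatrix} 25 & 22 \\ 1 & 23 \end{pmatrix}$$
$$K = \begin{pmatrix} 25 & 22 \\ 1 & 23 \end{pmatrix} \begin{pmatrix} 7 & 2 \\ 17 & 25 \end{pmatrix} = \begin{pmatrix} 549 & 600 \\ 398 & 577 \end{pmatrix} \bmod 26 = \begin{pmatrix} 3 & 2 \\ 8 & 5 \end{pmatrix}$$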
This result is verified by testing the remaining plaintext–ciphertext pairs.
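The known plaintext attack on the “hillcipher” example can be sketched in Python as follows. The helper names (`to_nums`, `matmul_mod`, `inverse_2x2_mod`) are illustrative and not from the text; the sketch handles only the 2 × 2 case and assumes the first two plaintext pairs give an invertible X, as they do here.

```python
def to_nums(text):
    """Map letters to 0..25 (a = 0), ignoring case."""
    return [ord(ch) - ord("a") for ch in text.lower()]


def matmul_mod(x, y, modulus=26):
    """Matrix product of x and y with entries reduced mod `modulus`."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) % modulus
         for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def inverse_2x2_mod(matrix, modulus=26):
    """Adjugate-based 2x2 inverse; raises ValueError if not invertible."""
    (a, b), (c, d) = matrix
    det_inv = pow((a * d - b * c) % modulus, -1, modulus)
    return [
        [(det_inv * d) % modulus, (det_inv * -b) % modulus],
        [(det_inv * -c) % modulus, (det_inv * a) % modulus],
    ]


plain = to_nums("hillcipher")
cipher = to_nums("HCRZSSXNSP")

# Rows of X and Y are the first two plaintext and ciphertext pairs.
X = [plain[0:2], plain[2:4]]
Y = [cipher[0:2], cipher[2:4]]
K = matmul_mod(inverse_2x2_mod(X), Y)  # K = X^{-1} Y mod 26

# Verify the recovered key against every pair, including the unused ones.
pairs_ok = all(
    matmul_mod([plain[i:i + 2]], K)[0] == cipher[i:i + 2]
    for i in range(0, len(plain), 2)
)
print(K, pairs_ok)
```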
Polyalphabetic Ciphers
Another way to improve on the simple monoalphabetic technique is to use differ-
ent monoalphabetic substitutions as one proceeds through the plaintext message.
The general name for this approach is polyalphabetic substitution cipher. All these
techniques have the following features in common:
1. A set of related monoalphabetic substitution rules is used.
2. A key determines which particular rule is chosen for a given transformation.
VIGENÈRE CIPHER The best known, and one of the simplest, polyalphabetic ciphers
is the Vigenère cipher. In this scheme, the set of related monoalphabetic substitu-
tion rules consists of the 26 Caesar ciphers with shifts of 0 through 25. Each cipher is
denoted by a key letter, which is the ciphertext letter that substitutes for the plaintext
letter a. Thus, a Caesar cipher with a shift of 3 is denoted by the key value 3, that is,
the key letter d (see footnote 8).
We can express the Vigenère cipher in the following manner. Assume a
sequence of plaintext letters $P = p_0, p_1, p_2, \ldots, p_{n-1}$ and a key consisting of the
sequence of letters $K = k_0, k_1, k_2, \ldots, k_{m-1}$, where typically $m < n$. The sequence
of ciphertext letters $C = C_0, C_1, C_2, \ldots, C_{n-1}$ is calculated as follows:
$$\begin{aligned} C &= C_0, C_1, C_2, \ldots, C_{n-1} = E(K, P) = E[(k_0, k_1, k_2, \ldots, k_{m-1}), (p_0, p_1, p_2, \ldots, p_{n-1})] \\ &= (p_0 + k_0) \bmod 26, (p_1 + k_1) \bmod 26, \ldots, (p_{m-1} + k_{m-1}) \bmod 26, \\ &\quad (p_m + k_0) \bmod 26, (p_{m+1} + k_1) \bmod 26, \ldots, (p_{2m-1} + k_{m-1}) \bmod 26, \ldots \end{aligned}$$
Thus, the first letter of the key is added to the first letter of the plaintext, mod 26,
the second letters are added, and so on through the first m letters of the plaintext.
For the next m letters of the plaintext, the key letters are repeated. This process
8To aid in understanding this scheme and also to aid in its use, a matrix known as the Vigenère tableau is
often used. This tableau is discussed in a document at

continues until all of the plaintext sequence is encrypted. A general equation of the
encryption process is
$$C_i = (p_i + k_{i \bmod m}) \bmod 26 \qquad (3.3)$$
Compare this with Equation (3.1) for the Caesar cipher. In essence, each plain-
text character is encrypted with a different Caesar cipher, depending on the corre-
sponding key character. Similarly, decryption is a generalization of Equation (3.2):
$$p_i = (C_i - k_{i \bmod m}) \bmod 26 \qquad (3.4)$$
To encrypt a message, a key is needed that is as long as the message. Usually,
the key is a repeating keyword. For example, if the keyword is deceptive, the mes-
sage “we are discovered save yourself” is encrypted as
key:        deceptivedeceptivedeceptive
plaintext:  wearediscoveredsaveyourself
ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
Expressed numerically, we have the following result.
key 3 4 2 4 15 19 8 21 4 3 4 2 4 15
plaintext 22 4 0 17 4 3 8 18 2 14 21 4 17 4
ciphertext 25 8 2 21 19 22 16 13 6 17 25 6 21 19
key 19 8 21 4 3 4 2 4 15 19 8 21 4
plaintext 3 18 0 21 4 24 14 20 17 18 4 11 5
ciphertext 22 0 21 25 7 2 16 24 6 11 12 6 9
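Equations (3.3) and (3.4) translate almost directly into code. The sketch below is illustrative only: the function name `vigenere` and its `decrypt` flag are assumptions, and the input is assumed to contain letters only.

```python
def vigenere(text, key, decrypt=False):
    """Apply C_i = (p_i + k_{i mod m}) mod 26, or its inverse when decrypt=True."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text.lower()):
        p = ord(ch) - ord("a")
        k = ord(key[i % len(key)].lower()) - ord("a")  # repeating keyword
        out.append(chr((p + sign * k) % 26 + ord("a")))
    return "".join(out)


ciphertext = vigenere("wearediscoveredsaveyourself", "deceptive").upper()
plaintext = vigenere(ciphertext, "deceptive", decrypt=True)
print(ciphertext)
print(plaintext)
```

The printed ciphertext corresponds letter for letter to the numeric ciphertext rows in the table above (25 = Z, 8 = I, 2 = C, and so on).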
The strength of this cipher is that there are multiple ciphertext letters for
each plaintext letter, one for each unique letter of the keyword. Thus, the letter fre-
quency information is obscured. However, not all knowledge of the plaintext struc-
ture is lost. For example, Figure 3.6 shows the frequency distribution for a Vigenère
cipher with a keyword of length 9. An improvement is achieved over the Playfair
cipher, but considerable frequency information remains.
It is instructive to sketch a method of breaking this cipher, because the method
reveals some of the mathematical principles that apply in cryptanalysis.
First, suppose that the opponent believes that the ciphertext was encrypted
using either monoalphabetic substitution or a Vigenère cipher. A simple test can
be made to make a determination. If a monoalphabetic substitution is used, then
the statistical properties of the ciphertext should be the same as that of the lan-
guage of the plaintext. Thus, referring to Figure 3.5, there should be one cipher let-
ter with a relative frequency of occurrence of about 12.7%, one with about 9.06%,
and so on. If only a single message is available for analysis, we would not expect
an exact match of this small sample with the statistical profile of the plaintext lan-
guage. Nevertheless, if the correspondence is close, we can assume a monoalpha-
betic substitution.

If, on the other hand, a Vigenère cipher is suspected, then progress depends on
determining the length of the keyword, as will be seen in a moment. For now, let us
concentrate on how the keyword length can be determined. The important insight
that leads to a solution is the following: If two identical sequences of plaintext let-
ters occur at a distance that is an integer multiple of the keyword length, they will
generate identical ciphertext sequences. In the foregoing example, two instances
of the sequence “red” are separated by nine character positions. Consequently, in
both cases, r is encrypted using key letter e, e is encrypted using key letter p, and d
is encrypted using key letter t. Thus, in both cases, the ciphertext sequence is VTW.
In the example above, these are the ciphertext letters at positions 4–6 and 13–15,
with numerical values 21 19 22.
An analyst looking at only the ciphertext would detect the repeated sequences
VTW at a displacement of 9 and make the assumption that the keyword is either
three or nine letters in length. The appearance of VTW twice could be by chance
and may not reflect identical plaintext letters encrypted with identical key letters.
However, if the message is long enough, there will be a number of such repeated
ciphertext sequences. By looking for common factors in the displacements of the vari-
ous sequences, the analyst should be able to make a good guess of the keyword length.
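The search for repeated ciphertext sequences and common factors in their displacements can be sketched as follows. The function names are hypothetical, and the sequence length of 3 is an assumption chosen to match the VTW example; a real analysis would use a much longer ciphertext and several sequence lengths.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeated_sequences(ciphertext, length=3):
    """Return {sequence: [start positions]} for sequences seen more than once."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}


def displacement_gcd(repeats):
    """Greatest common divisor of the distances between successive repeats."""
    distances = [b - a for pos in repeats.values() for a, b in zip(pos, pos[1:])]
    return reduce(gcd, distances) if distances else None


repeats = repeated_sequences("ZICVTWQNGRZGVTWAVZHCQYGLMGJ")
print(repeats, displacement_gcd(repeats))
```

For this short ciphertext only VTW repeats, at a displacement of 9, so the candidate keyword lengths are the divisors of 9 greater than 1, namely 3 and 9.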
Solution of the cipher now depends on an important insight. If the keyword
length is m, then the cipher, in effect, consists of m monoalphabetic substitution
ciphers. For example, with the keyword DECEPTIVE, the letters in positions 1, 10,
19, and so on are all encrypted with the same monoalphabetic cipher. Thus, we can
use the known frequency characteristics of the plaintext language to attack each of
the monoalphabetic ciphers separately.
The periodic nature of the keyword can be eliminated by using a nonrepeating
keyword that is as long as the message itself. Vigenère proposed what is referred to
as an autokey system, in which a keyword is concatenated with the plaintext itself to
provide a running key. For our example,
key: deceptivewearediscoveredsav
plaintext: wearediscoveredsaveyourself
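The autokey construction shown above can be sketched in a few lines. The function names are illustrative; the sketch assumes lowercase letters only and the same mod 26 addition used for the Vigenère cipher.

```python
def autokey_encrypt(plaintext, keyword):
    """Encrypt with the running key keyword + plaintext, truncated to length."""
    key_stream = (keyword + plaintext)[:len(plaintext)]
    return "".join(
        chr((ord(p) - 97 + ord(k) - 97) % 26 + 65)
        for p, k in zip(plaintext, key_stream)
    )


def autokey_decrypt(ciphertext, keyword):
    """Recover the plaintext, extending the key with each recovered letter."""
    key_stream = list(keyword)
    plain = []
    for i, c in enumerate(ciphertext.lower()):
        p = chr((ord(c) - ord(key_stream[i])) % 26 + 97)
        plain.append(p)
        key_stream.append(p)  # the recovered plaintext feeds the key
    return "".join(plain)


ciphertext = autokey_encrypt("wearediscoveredsaveyourself", "deceptive")
recovered = autokey_decrypt(ciphertext, "deceptive")
print(ciphertext, recovered)
```

Note that decryption must proceed left to right, because each key letter beyond the keyword is only known once the corresponding earlier plaintext letter has been recovered.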
Even this scheme is vulnerable to cryptanalysis. Because the key and the
plaintext share the same frequency distribution of letters, a statistical technique can
be applied. For example, e enciphered by e, by Figure 3.5, can be expected to occur
with a frequency of $(0.127)^2 \approx 0.016$, whereas t enciphered by t would occur only
about half as often. These regularities can be exploited to achieve successful
cryptanalysis (see footnote 9).
VERNAM CIPHER The ultimate defense against such a cryptanalysis is to choose a
keyword that is as long as the plaintext and has no statistical relationship to it. Such
a system was introduced by an AT&T engineer named Gilbert Vernam in 1918.
9Although the techniques for breaking a Vigenère cipher are by no means complex, a 1917 issue of
Scientific American characterized this system as “impossible of translation.” This is a point worth remem-
bering when similar claims are made for modern algorithms.

His system works on binary data (bits) rather than letters. The system can be
expressed succinctly as follows (Figure 3.7):
$$c_i = p_i \oplus k_i$$
where
$p_i$ = ith binary digit of plaintext
$k_i$ = ith binary digit of key
$c_i$ = ith binary digit of ciphertext
$\oplus$ = exclusive-or (XOR) operation
Compare this with Equation (3.3) for the Vigenère cipher.
Thus, the ciphertext is generated by performing the bitwise XOR of the plain-
text and the key. Because of the properties of the XOR, decryption simply involves
the same bitwise operation:
$$p_i = c_i \oplus k_i$$
which compares with Equation (3.4).
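Because XOR is its own inverse, a single routine serves for both encryption and decryption. The sketch below is an illustration, not a description of Vernam's hardware: the function name `vernam` is hypothetical, it operates on bytes rather than individual bits, and the sample message and the use of `secrets.token_bytes` to obtain key material are assumptions.

```python
import secrets


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the corresponding key byte."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"attack at dawn"
key = secrets.token_bytes(len(message))   # illustrative key source
ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)       # same operation decrypts
print(recovered)
```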
The essence of this technique is the means of construction of the key. Vernam
proposed the use of a running loop of tape that eventually repeated the key, so that
in fact the system worked with a very long but repeating keyword. Although such
a scheme, with a long key, presents formidable cryptanalytic difficulties, it can be
broken with sufficient ciphertext, the use of known or probable plaintext sequences,
or both.
One-Time Pad
An Army Signal Corps officer, Joseph Mauborgne, proposed an improvement to the
Vernam cipher that yields the ultimate in security. Mauborgne suggested using a
random key that is as long as the message, so that the key need not be repeated. In
addition, the key is to be used to encrypt and decrypt a single message, and then is
discarded. Each new message requires a new key of the same length as the new mes-
sage. Such a scheme, known as a one-time pad, is unbreakable. It produces random
output that bears no statistical relationship to the plaintext. Because the ciphertext
contains no information whatsoever about the plaintext, there is simply no way to
break the code.
Figure 3.7 Vernam Cipher
An example should illustrate our point. Suppose that we are using a Vigenère
scheme with 27 characters in which the twenty-seventh character is the space
character, but with a one-time key that is as long as the message. Consider the
ciphertext
ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
We now show two different decryptions using two different keys:
key: pxlmvmsydofuyrvzwc tnlebnecvgdupahfzzlmnyih
plaintext: mr mustard with the candlestick in the hall
key: pftgpmiydgaxgoufhklllmhsqdqogtewbqfgyovuhwt
plaintext: miss scarlet with the knife in the library
Suppose that a cryptanalyst had managed to find these two keys. Two plau-
sible plaintexts are produced. How is the cryptanalyst to decide which is the correct
decryption (i.e., which is the correct key)? If the actual key were produced in a truly
random fashion, then the cryptanalyst cannot say that one of these two keys is more
likely than the other. Thus, there is no way to decide which key is correct and there-
fore which plaintext is correct.
In fact, given any plaintext of equal length to the ciphertext, there is a key that
produces that plaintext. Therefore, if you did an exhaustive search of all possible
keys, you would end up with many legible plaintexts, with no way of knowing which
was the intended plaintext. Therefore, the code is unbreakable.
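The argument above can be checked mechanically with the 27-character scheme from the example. In this sketch the function names and the `ALPHABET` constant are illustrative; the ciphertext is not hard-coded but derived by encrypting the first plaintext under the first key, and the third plaintext ("x" repeated) is an arbitrary stand-in for "any plaintext of equal length."

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # 27th character is the space


def otp_encrypt(plaintext, key):
    """Add key to plaintext, character by character, mod 27."""
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 27]
        for p, k in zip(plaintext, key)
    ).upper()


def otp_decrypt(ciphertext, key):
    """Subtract key from ciphertext, character by character, mod 27."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 27]
        for c, k in zip(ciphertext.lower(), key)
    )


def key_for(ciphertext, plaintext):
    """Find the key under which `ciphertext` decrypts to `plaintext`."""
    return otp_decrypt(ciphertext, plaintext)


KEY_1 = "pxlmvmsydofuyrvzwc tnlebnecvgdupahfzzlmnyih"
KEY_2 = "pftgpmiydgaxgoufhklllmhsqdqogtewbqfgyovuhwt"

ciphertext = otp_encrypt("mr mustard with the candlestick in the hall", KEY_1)
first = otp_decrypt(ciphertext, KEY_1)
second = otp_decrypt(ciphertext, KEY_2)

# Any other plaintext of the same length has a matching key as well.
other = "x" * len(ciphertext)
other_key = key_for(ciphertext, other)
print(first, "|", second.rstrip())
```

The second decryption comes out one character shorter than the ciphertext, so it ends in a trailing space, which the final line strips for display.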
The security of the one-time pad is entirely due to the randomness of the key.
If the stream of characters that constitute the key is truly random, then the stream
of characters that constitute the ciphertext will be truly random. Thus, there are no
patterns or regularities that a cryptanalyst can use to attack the ciphertext.
In theory, we need look no further for a cipher. The one-time pad offers com-
plete security but, in practice, has two fundamental difficulties:
1. There is the practical problem of making large quantities of random keys. Any
heavily used system might require millions of random characters on a regular
basis. Supplying truly random characters in this volume is a significant task.
2. Even more daunting is the problem of key distribution and protection. For
every message to be sent, a key of equal length is needed by both sender and
receiver. Thus, a mammoth key distribution problem exists.
Because of these difficulties, the one-time pad is of limited utility and is useful
primarily for low-bandwidth channels requiring very high security.
The one-time pad is the only cryptosystem that exhibits what is referred to as
perfect secrecy. This concept is explored in Appendix F.

All the techniques examined so far involve the substitution of a ciphertext symbol
for a plaintext symbol. A very different kind of mapping is achieved by performing
some sort of permutation on the plaintext letters. This technique is referred to as a
transposition cipher.
The simplest such cipher is the rail fence technique, in which the plaintext is
written down as a sequence of diagonals and then read off as a sequence of rows.
For example, to encipher the message “meet me after the toga party” with a rail
fence of depth 2, we write the following:
m e m a t r h t g p r y
 e t e f e t e o a a t
The encrypted message is
MEMATRHTGPRYETEFETEOAAT
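For a rail fence of depth 2, writing along diagonals and reading by rows amounts to taking the letters in even positions followed by those in odd positions. A minimal sketch, with a hypothetical function name and the assumption that spaces and punctuation are dropped before encryption:

```python
def rail_fence_depth2(plaintext):
    """Rail fence of depth 2: top row is every other letter, then the rest."""
    letters = [ch for ch in plaintext.lower() if ch.isalpha()]
    top_row = letters[0::2]
    bottom_row = letters[1::2]
    return "".join(top_row + bottom_row).upper()


encrypted = rail_fence_depth2("meet me after the toga party")
print(encrypted)
```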
This sort of thing would be trivial to cryptanalyze. A more complex scheme is
to write the message in a rectangle, row by row, and read the message off, column
by column, but permute the order of the columns. The order of the columns then
becomes the key to the algorithm. For example,
Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
Thus, in this example, the key is 4312567. To encrypt, start with the column
that is labeled 1, in this case column 3. Write down all the letters in that column.
Proceed to column 4, which is labeled 2, then column 2, then column 1, then
columns 5, 6, and 7.
A pure transposition cipher is easily recognized because it has the same letter
frequencies as the original plaintext. For the type of columnar transposition just
shown, cryptanalysis is fairly straightforward and involves laying out the cipher-
text in a matrix and playing around with column positions. Digram and trigram fre-
quency tables can be useful.
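The columnar transposition just described can be sketched as follows. The function name `columnar_encrypt` and the list representation of the key are assumptions made for illustration: `key[i]` holds the label of column i, and columns are read out in increasing label order.

```python
def columnar_encrypt(text, key):
    """Write `text` row by row under `key`, then read columns in label order."""
    ncols = len(key)
    rows = [text[i:i + ncols] for i in range(0, len(text), ncols)]
    # Column indices sorted by their labels: label 1 first, then 2, and so on.
    order = sorted(range(ncols), key=lambda col: key[col])
    return "".join(row[col] for col in order for row in rows if col < len(row))


KEY = [4, 3, 1, 2, 5, 6, 7]
ciphertext = columnar_encrypt("attackpostponeduntiltwoamxyz", KEY).upper()
print(ciphertext)
```

With the key 4312567, the column labeled 1 is the third column, so the output begins with the letters t, t, n, a from that column.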
The transposition cipher can be made significantly more secure by perform-
ing more than one stage of transposition. The result is a more complex permutation
that is not easily reconstructed. Thus, if the foregoing message is reencrypted using
the same algorithm,

Key:    4 3 1 2 5 6 7
Input:  t t n a a p t
        m t s u o a o
        d w c o i x k
        n l y p e t z
Output: NSCYAUOPTTWLTMDNAOIEPAXTTOKZ
To visualize the result of this double transposition, designate the letters in the
original plaintext message by the numbers designating their position. Thus, with 28
letters in the message, the original sequence of letters is
01 02 03 04 05 06 07 08 09 10 11 12 13 14
15 16 17 18 19 20 21 22 23 24 25 26 27 28
After the first transposition, we have
03 10 17 24 04 11 18 25 02 09 16 23 01 08
15 22 05 12 19 26 06 13 20 27 07 14 21 28
which has a somewhat regular structure. But after the second transposition, we have
17 09 05 27 24 16 12 07 10 02 22 20 03 25
15 13 04 23 19 14 11 01 26 21 18 08 06 28
This is a much less structured permutation and is much more difficult to cryptanalyze.
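The position tables above can be regenerated by applying the same columnar transposition to the list of positions 1 through 28 instead of to letters. This is a sketch with a hypothetical `transpose` helper that works on any list; the key representation (`key[i]` is the label of column i) is an assumption.

```python
def transpose(seq, key):
    """Columnar transposition of a list: fill rows, read columns by label."""
    ncols = len(key)
    rows = [seq[i:i + ncols] for i in range(0, len(seq), ncols)]
    order = sorted(range(ncols), key=lambda col: key[col])
    return [row[col] for col in order for row in rows if col < len(row)]


KEY = [4, 3, 1, 2, 5, 6, 7]
positions = list(range(1, 29))
once = transpose(positions, KEY)
twice = transpose(once, KEY)
print(once)
print(twice)
```

After one pass the positions still advance in steps of 7 within each column group; after the second pass that regularity is no longer visible, which is the point the text makes.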
The example just given suggests that multiple stages of encryption can produce an
algorithm that is significantly more difficult to cryptanalyze. This is as true of substi-
tution ciphers as it is of transposition ciphers. Before the introduction of DES, the
most important application of the principle of multiple stages of encryption was a
class of systems known as rotor machines (see footnote 10).
The basic principle of the rotor machine is illustrated in Figure 3.8. The
machine consists of a set of independently rotating cylinders through which electri-
cal pulses can flow. Each cylinder has 26 input pins and 26 output pins, with internal
wiring that connects each input pin to a unique output pin. For simplicity, only three
of the internal connections in each cylinder are shown.
If we associate each input and output pin with a letter of the alphabet, then a
single cylinder defines a monoalphabetic substitution. For example, in Figure 3.8,
if an operator depresses the key for the letter A, an electric signal is applied to
the first pin of the first cylinder and flows through the internal connection to the
twenty-fifth output pin.
10Machines based on the rotor principle were used by both Germany (Enigma) and Japan (Purple) in
World War II. The breaking of both codes by the Allies was a significant factor in the war’s outcome.
Consider a machine with a single cylinder. After each input key is depressed,
the cylinder rotates one position, so that the internal connections are shifted accord-
ingly. Thus, a different monoalphabetic substitution cipher is defined. After 26 let-
ters of plaintext, the cylinder would be back to the initial position. Thus, we have a
polyalphabetic substitution algorithm with a period of 26.
A single-cylinder system is trivial and does not present a formidable crypt-
analytic task. The power of the rotor machine is in the use of multiple cylinders, in
which the output pins of one cylinder are connected to the input pins of the next.
Figure 3.8 shows a three-cylinder system. The left half of the figure shows a position
in which the input from the operator to the first pin (plaintext letter a) is routed
through the three cylinders to appear at the output of the second pin (ciphertext
letter B).
With multiple cylinders, the one closest to the operator input rotates one
pin position with each keystroke. The right half of Figure 3.8 shows the system’s
configuration after a single keystroke. For every complete rotation of the inner
cylinder, the middle cylinder rotates one pin position. Finally, for every complete
rotation of the middle cylinder, the outer cylinder rotates one pin position. This
is the same type of operation seen with an odometer. The result is that there are
26 × 26 × 26 = 17,576 different substitution alphabets used before the system
repeats. The addition of fourth and fifth rotors results in periods of 456,976 and
11,881,376 letters, respectively. Thus, a given setting of a 5-rotor machine is
equivalent to a Vigenère cipher with a key length of 11,881,376.
Figure 3.8 Three-Rotor Machine with Wiring Represented by Numbered Contacts: (a) Initial setting;
(b) Setting after one keystroke. Each panel shows the fast, medium, and slow rotors and the direction of motion.
Such a scheme presents a formidable cryptanalytic challenge. If, for example,
the cryptanalyst attempts to use a letter frequency analysis approach, the analyst
is faced with the equivalent of over 11 million monoalphabetic ciphers. We might
need on the order of 50 letters in each monoalphabetic cipher for a solution, which
means that the analyst would need to be in possession of a ciphertext with a length
of over half a billion letters.
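The period figures and the ciphertext estimate above follow from simple counting, which the sketch below reproduces. It models only the idealized odometer stepping described in this section (historical machines stepped somewhat differently), and the function names are illustrative.

```python
def step(state, positions=26):
    """Advance rotor positions odometer-style; state[0] is the fast rotor."""
    state = list(state)
    for i in range(len(state)):
        state[i] = (state[i] + 1) % positions
        if state[i] != 0:      # no wrap-around, so slower rotors stay put
            break
    return tuple(state)


def count_until_repeat(num_rotors, positions=26):
    """Count keystrokes until the rotors return to their initial setting."""
    start = (0,) * num_rotors
    state, count = step(start, positions), 1
    while state != start:
        state, count = step(state, positions), count + 1
    return count


period_3 = count_until_repeat(3)
period_5 = 26 ** 5
letters_needed = 50 * period_5  # about 50 letters per monoalphabetic cipher
print(period_3, period_5, letters_needed)
```

The last value, 50 × 11,881,376 = 594,068,800, is the "over half a billion letters" cited in the text.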
The significance of the rotor machine today is that it points the way to a large
class of symmetric ciphers, of which the Data Encryption Standard (DES) is the
most prominent. DES is introduced in Chapter 4.
We conclude with a discussion of a technique that, strictly speaking, is not
encryption, namely, steganography.
A plaintext message may be hidden in one of two ways. The methods of
steganography conceal the existence of the message, whereas the methods of cryp-
tography render the message unintelligible to outsiders by various transformations
of the text (see footnote 11).
A simple form of steganography, but one that is time-consuming to construct,
is one in which an arrangement of words or letters within an apparently innocuous
text spells out the real message. For example, the sequence of first letters of each
word of the overall message spells out the hidden message. Figure 3.9 shows an
example in which a subset of the words of the overall message is used to convey the
hidden message. See if you can decipher this; it’s not too hard.
Various other techniques have been used historically; some examples are the
following [MYER91]:
■ Character marking: Selected letters of printed or typewritten text are over-
written in pencil. The marks are ordinarily not visible unless the paper is held
at an angle to bright light.
■ Invisible ink: A number of substances can be used for writing but leave no vis-
ible trace until heat or some chemical is applied to the paper.
■ Pin punctures: Small pin punctures on selected letters are ordinarily not vis-
ible unless the paper is held up in front of a light.
■ Typewriter correction ribbon: Used between lines typed with a black ribbon,
the results of typing with the correction tape are visible only under a strong
light.
11Steganography was an obsolete word that was revived by David Kahn and given the meaning it has
today [KAHN96].

Although these techniques may seem archaic, they have contemporary equiv-
alents. [WAYN09] proposes hiding a message by using the least significant bits of
frames on a CD. For example, the Kodak Photo CD format’s maximum resolution
is 3096 × 6144 pixels, with each pixel containing 24 bits of RGB color information.
The least significant bit of each 24-bit pixel can be changed without greatly affecting
the quality of the image. The result is that you can hide a 130-kB message in a single
digital snapshot. There are now a number of software packages available that take
this type of approach to steganography.
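The least-significant-bit idea can be sketched on a plain byte buffer. This is an illustration only and does not model the Photo CD format or any particular software package: the function names are hypothetical, and the sketch hides one message bit in the low bit of each cover byte rather than one bit per 24-bit pixel.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bits of `cover`."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the low bit
    return bytes(stego)


def extract_lsb(stego: bytes, num_bytes: int) -> bytes:
    """Read `num_bytes` of hidden data back out of the low bits."""
    out = bytearray()
    for i in range(num_bytes):
        value = 0
        for bit_index in range(8):
            value = (value << 1) | (stego[i * 8 + bit_index] & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(64))            # stand-in for image sample data
stego = embed_lsb(cover, b"hi")
hidden = extract_lsb(stego, 2)
print(hidden)
```

Each cover byte changes by at most 1, which is why the alteration is hard to see in image data; the receiver does need to know the message length, here passed explicitly.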
Steganography has a number of drawbacks when compared to encryption.
It requires a lot of overhead to hide relatively few bits of information, although
using a scheme like that proposed in the preceding paragraph may make it more
effective. Also, once the system is discovered, it becomes virtually worthless. This
problem, too, can be overcome if the insertion method depends on some sort of key
(e.g., see Problem 3.22). Alternatively, a message can be first encrypted and then
hidden using steganography.
The advantage of steganography is that it can be employed by parties who
have something to lose should the fact of their secret communication (not necessar-
ily the content) be discovered. Encryption flags traffic as important or secret or may
identify the sender or receiver as someone with something to hide.
Figure 3.9 A Puzzle for Inspector Morse
(From The Silent World of Nicholas Quinn, by Colin Dexter)

Key Terms
block cipher
brute-force attack
Caesar cipher
computationally secure
conventional encryption
cryptographic system
Hill cipher
monoalphabetic cipher
one-time pad
Playfair cipher
polyalphabetic cipher
rail fence cipher
single-key encryption
stream cipher
symmetric encryption
transposition cipher
unconditionally secure
Vigenère cipher
Review Questions
3.1 Describe the main requirements for the secure use of symmetric encryption.
3.2 What are the two basic functions used in encryption algorithms?
3.3 Differentiate between secret-key encryption and public-key encryption.
3.4 What is the difference between a block cipher and a stream cipher?
3.5 What are the two general approaches to attacking a cipher?
3.6 List and briefly define types of cryptanalytic attacks based on what is known to the
attacker.
3.7 What is the difference between an unconditionally secure cipher and a computation-
ally secure cipher?
3.8 Why is the Caesar cipher substitution technique vulnerable to a brute-force cryptanalysis?
3.9 How much key space is available when a monoalphabetic substitution cipher is used
to replace plaintext with ciphertext?
3.10 What is the drawback of a Playfair cipher?
3.11 What is the difference between a monoalphabetic cipher and a polyalphabetic cipher?
3.12 What are two problems with the one-time pad?
3.13 What is a transposition cipher?
3.14 What are the drawbacks of steganography?
Problems
3.1 A generalization of the Caesar cipher, known as the affine Caesar cipher, has the
following form: For each plaintext letter p, substitute the ciphertext letter C:
C = E([a, b], p) = (ap + b) mod 26
A basic requirement of any encryption algorithm is that it be one-to-one. That is, if
p ≠ q, then E(k, p) ≠ E(k, q). Otherwise, decryption is impossible, because more
than one plaintext character maps into the same ciphertext character. The affine
Caesar cipher is not one-to-one for all values of a. For example, for a = 2 and b = 3,
then E([a, b], 0) = E([a, b], 13) = 3.
a. Are there any limitations on the value of b? Explain why or why not.
b. Determine which values of a are not allowed.

c. Provide a general statement of which values of a are and are not allowed. Justify
your statement.
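The affine formula and the collision cited in the problem statement can be checked with a short sketch. The function names are illustrative; the code only reproduces the a = 2, b = 3 case given in the text and offers a brute-force test that may help when exploring other parameter values.

```python
def affine_encrypt(p, a, b):
    """C = E([a, b], p) = (a*p + b) mod 26 for a letter value p in 0..25."""
    return (a * p + b) % 26


def is_one_to_one(a, b):
    """True if the 26 plaintext values map to 26 distinct ciphertext values."""
    return len({affine_encrypt(p, a, b) for p in range(26)}) == 26


collision = (affine_encrypt(0, 2, 3), affine_encrypt(13, 2, 3))
print(collision, is_one_to_one(2, 3))
```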
3.2 How many one-to-one affine Caesar ciphers are there?
3.3 A ciphertext has been generated with an affine cipher. The most frequent letter of
the ciphertext is “C,” and the second most frequent letter of the ciphertext is “Z.”
Break this code.
3.4 The following ciphertext was generated using a simple substitution algorithm.
hzsrnqc klyy wqc flo mflwf ol zqdn nsoznj wskn lj xzsrbjnf,
wzsxz gqv zqhhnf ol ozn glco zlfnco hnlhrn; nsoznj jnrqosdnc
lj fnqj kjsnfbc, wzsxz sc xnjoqsfrv gljn efeceqr. zn rsdnb
qrlfn sf zsc zlecn sf cqdsrrn jlw, wzsoznj flfn hnfnojqonb.
q csfyrn blgncosx cekksxnb ol cnjdn zsg. zn pjnqmkqconb qfb
bsfnb qo ozn xrep, qo zlejc gqozngqosxqrrv ksanb, sf ozn cqgn
jllg, qo ozn cqgn oqprn, fndnj oqmsfy zsc gnqrc wsoz loznj
gngpnjc, gexz rncc pjsfysfy q yenco wsoz zsg; qfb wnfo zlgn
qo naqxorv gsbfsyzo, lfrv ol jnosjn qo lfxn ol pnb. zn fndnj
ecnb ozn xlcv xzqgpnjc wzsxz ozn jnkljg hjldsbnc klj soc
kqdlejnb gngpnjc. zn hqccnb onf zlejc leo lk ozn ownfov-klej
sf cqdsrrn jlw, nsoznj sf crnnhsfy lj gqmsfy zsc olsrno.
Decrypt this message.
1. As you know, the most frequently occurring letter in English is e. Therefore, the
first or second (or perhaps third?) most common character in the message is likely
to stand for e. Also, e is often seen in pairs (e.g., meet, fleet, speed, seen, been,
agree, etc.). Try to find a character in the ciphertext that decodes to e.
2. The most common word in English is “the.” Use this fact to guess the characters
that stand for t and h.
3. Decipher the rest of the message by deducing additional words.
Warning: The resulting message is in English but may not make much sense on a first
reading.
3.5 One way to solve the key distribution problem is to use a line from a book that both
the sender and the receiver possess. Typically, at least in spy novels, the first sentence
of a book serves as the key. The particular scheme discussed in this problem is from
one of the best suspense novels involving secret codes, Talking to Strange Men, by
Ruth Rendell. Work this problem without consulting that book!
Consider the following message:
This ciphertext was produced using the first sentence of The Other Side of Silence
(a book about the spy Kim Philby):
The snow lay thick on the steps and the snowflakes driven by the wind
looked black in the headlights of the cars.
A simple substitution cipher was used.
a. What is the encryption algorithm?
b. How secure is it?
c. To make the key distribution problem simple, both parties can agree to use the first or
last sentence of a book as the key. To change the key, they simply need to agree on a
new book. The use of the first sentence would be preferable to the use of the last. Why?
3.6 In one of his cases, Sherlock Holmes was confronted with the following message.
534 C2 13 127 36 31 4 17 21 41
26 BIRLSTONE 9 127 171

Although Watson was puzzled, Holmes was able immediately to deduce the type of
cipher. Can you?
3.7 This problem uses a real-world example, from an old U.S. Special Forces manual
(public domain). The document, filename SpecialForces , is available at
a. Using the two keys (memory words) cryptographic and network security, encrypt
the following message:
Be at the third pillar from the left outside the lyceum theatre tonight at seven.
If you are distrustful bring two friends.
Make reasonable assumptions about how to treat redundant letters and excess
letters in the memory words and how to treat spaces and punctuation. Indicate
what your assumptions are. Note: The message is from the Sherlock Holmes novel,
The Sign of Four.
b. Decrypt the ciphertext. Show your work.
c. Comment on when it would be appropriate to use this technique and what its
advantages are.
3.8 A disadvantage of the general monoalphabetic cipher is that both sender and receiver
must commit the permuted cipher sequence to memory. A common technique for
avoiding this is to use a keyword from which the cipher sequence can be gener-
ated. For example, using the keyword CRYPTO, write out the keyword followed by
unused letters in normal order and match this against the plaintext letters:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: C R Y P T O A B D E F G H I J K L M N Q S U V W X Z
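If it is felt that this process does not produce sufficient mixing, write the remaining
letters on successive lines and then generate the sequence by reading down the
columns:
C R Y P T O
A B D E F G
H I J K L M
N Q S U V W
X Z
This yields the sequence:
C A H N X R B I Q Z Y D J S P E K U T F L V O G M W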
Such a system is used in the example in Section 3.2 (the one that begins “it was
disclosed yesterday”). Determine the keyword.
3.9 When the PT-109 American patrol boat, under the command of Lieutenant John F.
Kennedy, was sunk by a Japanese destroyer, a message was received at an Australian
wireless station in Playfair code:
The key used was royal new zealand navy. Decrypt the message. Translate TT into tt.

3.10 a. Construct a Playfair matrix with the key algorithm.
b. Construct a Playfair matrix with the key cryptography. Make a reasonable assump-
tion about how to treat redundant letters in the key.
3.11 a. Using this Playfair matrix:
Encrypt this message:
I only regret that I have but one life to give for my country.
Note: This message is by Nathan Hale, a soldier in the American Revolutionary War.
b. Repeat part (a) using the Playfair matrix from Problem 3.10a.
c. How do you account for the results of this problem? Can you generalize your
3.12 a. How many possible keys does the Playfair cipher have? Ignore the fact that
some keys might produce identical encryption results. Express your answer as an
approximate power of 2.
b. Now take into account the fact that some Playfair keys produce the same encryp-
tion results. How many effectively unique keys does the Playfair cipher have?
3.13 What substitution system results when we use a 1 × 25 Playfair matrix?
3.14 a. Encrypt the message “meet me at the usual place at ten rather than eight o clock”
using the Hill cipher with the key $\begin{pmatrix} 7 & 3 \\ 2 & 5 \end{pmatrix}$. Show your calculations and the result.
b. Show the calculations for the corresponding decryption of the ciphertext to
recover the original plaintext.
3.15 We have shown that the Hill cipher succumbs to a known plaintext attack if sufficient
plaintext–ciphertext pairs are provided. It is even easier to solve the Hill cipher if a
chosen plaintext attack can be mounted. Describe such an attack.
3.16 It can be shown that the Hill cipher with the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ requires that (ad – bc)
is relatively prime to 26; that is, the only common positive integer factor of (ad – bc)
and 26 is 1. Thus, if (ad – bc) = 13 or is even, the matrix is not allowed. Determine
the number of different (good) keys there are for a 2 × 2 Hill cipher without counting
them one by one, using the following steps:
a. Find the number of matrices whose determinant is even because one or both rows
are even. (A row is “even” if both entries in the row are even.)
b. Find the number of matrices whose determinant is even because one or both col-
umns are even. (A column is “even” if both entries in the column are even.)
c. Find the number of matrices whose determinant is even because all of the entries
are odd.
d. Taking into account overlaps, find the total number of matrices whose determi-
nant is even.
e. Find the number of matrices whose determinant is a multiple of 13 because the
first column is a multiple of 13.

f. Find the number of matrices whose determinant is a multiple of 13 where
the first column is not a multiple of 13 but the second column is a mul-
tiple of the first modulo 13.
g. Find the total number of matrices whose determinant is a multiple of 13.
h. Find the number of matrices whose determinant is a multiple of 26
because they fit parts (a) and (e), (b) and (e), (c) and (e), (a) and
(f), and so on.
i. Find the total number of matrices whose determinant is neither a mul-
tiple of 2 nor a multiple of 13.
3.17 Calculate the determinant mod 26 of
a. ¢2 3 5
1 3 7
≤ b. £2 1 1 3 2 55 7 1 8
3 1 4 1 2

3.18 Determine the inverse mod 26 of
a. $\begin{pmatrix} 2 & 3 \\ 1 & 22 \end{pmatrix}$ b. $\begin{pmatrix} 6 & 24 & 1 \\ 13 & 16 & 10 \\ 20 & 17 & 15 \end{pmatrix}$

3.19 Using the Vigenère cipher, encrypt the word “cryptographic” using the word
3.20 This problem explores the use of a one-time pad version of the Vigenère
cipher. In this scheme, the key is a stream of random numbers between 0
and 26. For example, if the key is 3 19 5 . . . , then the first letter of plaintext
is encrypted with a shift of 3 letters, the second with a shift of 19 letters, the
third with a shift of 5 letters, and so on.
a. Encrypt the plaintext sendmoremoney with the key stream
3 11 5 7 17 21 0 11 14 8 7 13 9
b. Using the ciphertext produced in part (a), find a key so that the cipher-
text decrypts to the plaintext cashnotneeded.
3.21 What is the message embedded in Figure 3.9?
3.22 In one of Dorothy Sayers’s mysteries, Lord Peter is confronted with the
message shown in Figure 3.10. He also discovers the key to the message,
which is a sequence of integers:
a. Decrypt the message. Hint: What is the largest integer value?
b. If the algorithm is known but not the key, how secure is the scheme?
c. If the key is known but not the algorithm, how secure is the scheme?
Figure 3.10 A Puzzle for Lord Peter
I thought to see the fairies in the fields, but I saw only the evil elephants with their black
backs. Woe! how that sight awed me! The elves danced all around and about while I heard
voices calling clearly. Ah! how I tried to see—throw off the ugly cloud—but no blind eye
of a mortal was permitted to spy them. So then came minstrels, having gold trumpets, harps
and drums. These played very loudly beside me, breaking that spell. So the dream vanished,
whereat I thanked Heaven. I shed many tears before the thin moon rose up, frail and faint as
a sickle of straw. Now though the Enchanter gnash his teeth vainly, yet shall he return as the
Spring returns. Oh, wretched man! Hell gapes, Erebus now lies open. The mouths of Death
wait on thy end.

Programming Problems
3.23 Write a program that can encrypt and decrypt using the general Caesar
cipher, also known as an additive cipher.
3.24 Write a program that can encrypt and decrypt using the affine cipher
described in Problem 3.1.
3.25 Write a program that can perform a letter frequency attack on an additive
cipher without human intervention. Your software should produce possible
plaintexts in rough order of likelihood. It would be good if your user inter-
face allowed the user to specify “give me the top 10 possible plaintexts.”
3.26 Write a program that can perform a letter frequency attack on any mono-
alphabetic substitution cipher without human intervention. Your software
should produce possible plaintexts in rough order of likelihood. It would
be good if your user interface allowed the user to specify “give me the top
10 possible plaintexts.”
3.27 Create software that can encrypt and decrypt using a 2 × 2 Hill cipher.
3.28 Create software that can perform a fast known plaintext attack on a Hill cipher,
given the dimension m. How fast are your algorithms, as a function of m?

Block Ciphers and the Data Encryption Standard
4.1 Traditional Block Cipher Structure
Stream Ciphers and Block Ciphers
Motivation for the Feistel Cipher Structure
The Feistel Cipher
4.2 The Data Encryption Standard
DES Encryption
DES Decryption
4.3 A DES Example
The Avalanche Effect
4.4 The Strength of DES
The Use of 56-Bit Keys
The Nature of the DES Algorithm
Timing Attacks
4.5 Block Cipher Design Principles
Number of Rounds
Design of Function F
Key Schedule Algorithm
4.6 Key Terms, Review Questions, and Problems

The objective of this chapter is to illustrate the principles of modern symmetric
ciphers. For this purpose, we focus on the most widely used symmetric cipher: the Data
Encryption Standard (DES). Although numerous symmetric ciphers have been devel-
oped since the introduction of DES, and although it is destined to be replaced by the
Advanced Encryption Standard (AES), DES remains the most important such algo-
rithm. Furthermore, a detailed study of DES provides an understanding of the prin-
ciples used in other symmetric ciphers.
This chapter begins with a discussion of the general principles of symmetric block
ciphers, which are the principal type of symmetric ciphers studied in this book. The
other form of symmetric cipher, the stream cipher, is discussed in Chapter 8. Next, we
cover full DES. Following this look at a specific algorithm, we return to a more general
discussion of block cipher design.
Compared to public-key ciphers, such as RSA, the structure of DES and most
symmetric ciphers is very complex and cannot be explained as easily as RSA and simi-
lar algorithms. Accordingly, the reader may wish to begin with a simplified version of
DES, which is described in Appendix G. This version allows the reader to perform
encryption and decryption by hand and gain a good understanding of the working of
the algorithm details. Classroom experience indicates that a study of this simplified
version enhances understanding of DES.1
1However, you may safely skip Appendix G, at least on a first reading. If you get lost or bogged down in
the details of DES, then you can go back and start with simplified DES.
After studying this chapter, you should be able to
◆ Understand the distinction between stream ciphers and block ciphers.
◆ Present an overview of the Feistel cipher and explain how decryption is
the inverse of encryption.
◆ Present an overview of Data Encryption Standard (DES).
◆ Explain the concept of the avalanche effect.
◆ Discuss the cryptographic strength of DES.
◆ Summarize the principal block cipher design principles.
4.1 Traditional Block Cipher Structure
Several important symmetric block encryption algorithms in current use are based
on a structure referred to as a Feistel block cipher [FEIS73]. For that reason, it is
important to examine the design principles of the Feistel cipher. We begin with a
comparison of stream ciphers and block ciphers. Then we discuss the motivation for
the Feistel block cipher structure. Finally, we discuss some of its implications.

Stream Ciphers and Block Ciphers
A stream cipher is one that encrypts a digital data stream one bit or one byte at a
time. Examples of classical stream ciphers are the autokeyed Vigenère cipher and
the Vernam cipher. In the ideal case, a one-time pad version of the Vernam cipher
would be used (Figure 3.7), in which the keystream (ki) is as long as the plaintext bit
stream (pi). If the cryptographic keystream is random, then this cipher is unbreakable
by any means other than acquiring the keystream. However, the keystream must be
provided to both users in advance via some independent and secure channel. This
introduces insurmountable logistical problems if the intended data traffic is very large.
Accordingly, for practical reasons, the bit-stream generator must be imple-
mented as an algorithmic procedure, so that the cryptographic bit stream can be
produced by both users. In this approach (Figure 4.1a), the bit-stream generator is
a key-controlled algorithm and must produce a bit stream that is cryptographically
strong. That is, it must be computationally impractical to predict future portions of
the bit stream based on previous portions of the bit stream. The two users need only
share the generating key, and each can produce the keystream.
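The idea of a key-controlled bit-stream generator can be sketched in a few lines of Python. This is only an illustration, not a vetted stream cipher: the generator below (hashing the key together with a counter using SHA-256) and the names keystream and stream_xor are assumptions made for this example and do not come from the text.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy key-controlled generator (illustrative only, not a vetted design)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Hash the shared key together with a running counter for the next chunk.
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])

def stream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

plaintext = b"attack at dawn"
ciphertext = stream_xor(b"shared key", plaintext)
recovered = stream_xor(b"shared key", ciphertext)
```

Both users need only the short key; each regenerates the same keystream locally, which is what removes the logistical burden of distributing a pad as long as the traffic.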
A block cipher is one in which a block of plaintext is treated as a whole and
used to produce a ciphertext block of equal length. Typically, a block size of 64 or
128 bits is used. As with a stream cipher, the two users share a symmetric encryption
key (Figure 4.1b). Using some of the modes of operation explained in Chapter 7, a
block cipher can be used to achieve the same effect as a stream cipher.
Figure 4.1 Stream Cipher and Block Cipher: (a) Stream cipher using algorithmic bit-stream generator; (b) Block cipher
Far more effort has gone into analyzing block ciphers. In general, they seem
applicable to a broader range of applications than stream ciphers. The vast majority
of network-based symmetric cryptographic applications make use of block ciphers.
Accordingly, the concern in this chapter, and in our discussions throughout the
book of symmetric encryption, will primarily focus on block ciphers.
Motivation for the Feistel Cipher Structure
A block cipher operates on a plaintext block of n bits to produce a ciphertext block
of n bits. There are $2^n$ possible different plaintext blocks and, for the encryption
to be reversible (i.e., for decryption to be possible), each must produce a unique
ciphertext block. Such a transformation is called reversible, or nonsingular. The
following examples illustrate nonsingular and singular transformations for n = 2.
Reversible Mapping          Irreversible Mapping
Plaintext  Ciphertext       Plaintext  Ciphertext
00         11               00         11
01         10               01         10
10         00               10         01
11         01               11         01
In the latter case, a ciphertext of 01 could have been produced by one of two
plaintext blocks. So if we limit ourselves to reversible mappings, the number of different
transformations is $2^n!$.²
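As a small check of the two n = 2 mappings above, the following Python sketch tests whether a mapping is reversible and counts the reversible mappings for a given block size. The function names are chosen for this illustration only.

```python
from math import factorial

# The two example mappings for n = 2, written as plaintext -> ciphertext.
reversible = {"00": "11", "01": "10", "10": "00", "11": "01"}
irreversible = {"00": "11", "01": "10", "10": "01", "11": "01"}

def is_reversible(mapping: dict) -> bool:
    """Decryption is possible only if no two plaintexts share a ciphertext."""
    return len(set(mapping.values())) == len(mapping)

def count_reversible_mappings(n: int) -> int:
    """Number of reversible mappings on n-bit blocks: the factorial of 2^n."""
    return factorial(2 ** n)
```

For n = 2 the count is 4! = 24, and it grows extremely quickly: for n = 4 it is already 16!.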
Figure 4.2 illustrates the logic of a general substitution cipher for n = 4.
A 4-bit input produces one of 16 possible input states, which is mapped by the sub-
stitution cipher into a unique one of 16 possible output states, each of which is repre-
sented by 4 ciphertext bits. The encryption and decryption mappings can be defined
by a tabulation, as shown in Table 4.1. This is the most general form of block cipher
and can be used to define any reversible mapping between plaintext and ciphertext.
Feistel refers to this as the ideal block cipher, because it allows for the maximum
number of possible encryption mappings from the plaintext block [FEIS75].
But there is a practical problem with the ideal block cipher. If a small block
size, such as n = 4, is used, then the system is equivalent to a classical substitution
cipher. Such systems, as we have seen, are vulnerable to a statistical analysis of the
plaintext. This weakness is not inherent in the use of a substitution cipher but rather
results from the use of a small block size. If n is sufficiently large and an arbitrary
reversible substitution between plaintext and ciphertext is allowed, then the statisti-
cal characteristics of the source plaintext are masked to such an extent that this type
of cryptanalysis is infeasible.
²The reasoning is as follows: For the first plaintext, we can choose any of $2^n$ ciphertext blocks. For the
second plaintext, we choose from among $2^n - 1$ remaining ciphertext blocks, and so on.

Figure 4.2 General n-bit-n-bit Block Substitution (shown with n = 4): a 4-bit input feeds a 4 to 16 decoder, whose 16 lines are permuted and passed to a 16 to 4 encoder that produces the 4-bit output
Table 4.1 Encryption and Decryption Tables for Substitution Cipher of Figure 4.2
Plaintext  Ciphertext       Ciphertext  Plaintext
0000       1110             0000        1110
0001       0100             0001        0011
0010       1101             0010        0100
0011       0001             0011        1000
0100       0010             0100        0001
0101       1111             0101        1100
0110       1011             0110        1010
0111       1000             0111        1111
1000       0011             1000        0111
1001       1010             1001        1101
1010       0110             1010        1001
1011       1100             1011        0110
1100       0101             1100        1011
1101       1001             1101        0010
1110       0000             1110        0000
1111       0111             1111        0101
An arbitrary reversible substitution cipher (the ideal block cipher) for a large
block size is not practical, however, from an implementation and performance
point of view. For such a transformation, the mapping itself constitutes the key.
Consider again Table 4.1, which defines one particular reversible mapping from
plaintext to ciphertext for n = 4. The mapping can be defined by the entries in the
second column, which show the value of the ciphertext for each plaintext block.
This, in essence, is the key that determines the specific mapping from among all
possible mappings. In this case, using this straightforward method of defining the
key, the required key length is (4 bits) × (16 rows) = 64 bits. In general, for an
n-bit ideal block cipher, the length of the key defined in this fashion is $n \times 2^n$ bits.
For a 64-bit block, which is a desirable length to thwart statistical attacks, the
required key length is $64 \times 2^{64} = 2^{70} \approx 10^{21}$ bits.
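The table-as-key idea and the key-length arithmetic can be checked with a short Python sketch. The encryption column of Table 4.1 is used as the key; the names ENCRYPT, invert, and ideal_key_bits are chosen for this illustration.

```python
# Encryption column of Table 4.1: index = plaintext block, value = ciphertext block.
ENCRYPT = [0b1110, 0b0100, 0b1101, 0b0001, 0b0010, 0b1111, 0b1011, 0b1000,
           0b0011, 0b1010, 0b0110, 0b1100, 0b0101, 0b1001, 0b0000, 0b0111]

def invert(table):
    """Build the decryption table by reversing each plaintext -> ciphertext pair."""
    inverse = [0] * len(table)
    for plaintext, ciphertext in enumerate(table):
        inverse[ciphertext] = plaintext
    return inverse

DECRYPT = invert(ENCRYPT)

def ideal_key_bits(n: int) -> int:
    """Key length of an n-bit ideal block cipher when the table itself is the key."""
    return n * 2 ** n
```

Here ideal_key_bits(4) gives the 64 bits computed in the text, and ideal_key_bits(64) equals 2**70, which is why the ideal cipher is out of reach for realistic block sizes.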
In considering these difficulties, Feistel points out that what is needed is an
approximation to the ideal block cipher system for large n, built up out of
components that are easily realizable [FEIS75]. But before turning to Feistel’s approach,
let us make one other observation. We could use the general block substitution
cipher but, to make its implementation tractable, confine ourselves to a subset of
the $2^n!$ possible reversible mappings. For example, suppose we define the mapping
in terms of a set of linear equations. In the case of n = 4, we have
$$\begin{aligned}
y_1 &= k_{11}x_1 + k_{12}x_2 + k_{13}x_3 + k_{14}x_4 \\
y_2 &= k_{21}x_1 + k_{22}x_2 + k_{23}x_3 + k_{24}x_4 \\
y_3 &= k_{31}x_1 + k_{32}x_2 + k_{33}x_3 + k_{34}x_4 \\
y_4 &= k_{41}x_1 + k_{42}x_2 + k_{43}x_3 + k_{44}x_4
\end{aligned}$$
where the $x_i$ are the four binary digits of the plaintext block, the $y_i$ are the four
binary digits of the ciphertext block, the $k_{ij}$ are the binary coefficients, and arithmetic
is mod 2. The key size is just $n^2$, in this case 16 bits. The danger with this kind of
formulation is that it may be vulnerable to cryptanalysis by an attacker that is aware of
the structure of the algorithm. In this example, what we have is essentially the Hill
cipher discussed in Chapter 3, applied to binary data rather than characters. As we
saw in Chapter 3, a simple linear system such as this is quite vulnerable.
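To make the vulnerability concrete, here is a minimal Python sketch of one possible attack on such a linear system, assuming the attacker can obtain ciphertexts for chosen plaintexts. The key matrix SECRET is a hypothetical example, and the function names are chosen for this illustration.

```python
def linear_encrypt(key, x):
    """Compute y_i = sum_j k_ij * x_j (mod 2) for an n x n bit matrix and n-bit block."""
    return [sum(k * b for k, b in zip(row, x)) % 2 for row in key]

def recover_key(encrypt_oracle, n):
    """Chosen-plaintext attack: encrypting the j-th unit vector reveals column j."""
    columns = []
    for j in range(n):
        unit = [1 if i == j else 0 for i in range(n)]
        columns.append(encrypt_oracle(unit))
    # Transpose the recovered columns back into rows.
    return [[columns[j][i] for j in range(n)] for i in range(n)]

# Hypothetical 16-bit key, chosen to be invertible mod 2.
SECRET = [[1, 0, 1, 1],
          [0, 1, 1, 0],
          [1, 1, 1, 1],
          [0, 0, 1, 1]]

recovered = recover_key(lambda x: linear_encrypt(SECRET, x), 4)
```

Only n chosen plaintexts are needed to read off all $n^2$ key bits, which is the kind of weakness the text attributes to purely linear mappings.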
The Feistel Cipher
Feistel proposed [FEIS73] that we can approximate the ideal block cipher by
utilizing the concept of a product cipher, which is the execution of two or more simple
ciphers in sequence in such a way that the final result or product is
cryptographically stronger than any of the component ciphers. The essence of the approach is
to develop a block cipher with a key length of k bits and a block length of n bits,
allowing a total of $2^k$ possible transformations, rather than the $2^n!$ transformations
available with the ideal block cipher.
In particular, Feistel proposed the use of a cipher that alternates substitutions
and permutations, where these terms are defined as follows:
■ Substitution: Each plaintext element or group of elements is uniquely replaced
by a corresponding ciphertext element or group of elements.
■ Permutation: A sequence of plaintext elements is replaced by a permutation
of that sequence. That is, no elements are added, deleted, or replaced in the
sequence; rather, the order in which the elements appear in the sequence is
changed.

In fact, Feistel’s is a practical application of a proposal by Claude Shannon
to develop a product cipher that alternates confusion and diffusion functions
[SHAN49].3 We look next at these concepts of diffusion and confusion and then
present the Feistel cipher. But first, it is worth commenting on this remarkable fact:
The Feistel cipher structure, which dates back over a quarter century and which, in
turn, is based on Shannon’s proposal of 1945, is the structure used by a number of
significant symmetric block ciphers currently in use. In particular, the Feistel
structure is used for the Triple Data Encryption Algorithm (TDEA), which is one of the two
encryption algorithms (along with AES) approved for general use by the National
Institute of Standards and Technology (NIST). The Feistel structure is also used for
several schemes for format-preserving encryption, which have recently come into
prominence. In addition, the Camellia block cipher is a Feistel structure; it is one
of the possible symmetric ciphers in TLS and a number of other Internet security
protocols. Both TDEA and format-preserving encryption are covered in Chapter 7.
DIFFUSION AND CONFUSION The terms diffusion and confusion were introduced by
Claude Shannon to capture the two basic building blocks for any cryptographic sys-
tem [SHAN49]. Shannon’s concern was to thwart cryptanalysis based on statisti-
cal analysis. The reasoning is as follows. Assume the attacker has some knowledge
of the statistical characteristics of the plaintext. For example, in a human-readable
message in some language, the frequency distribution of the various letters may be
known. Or there may be words or phrases likely to appear in the message (probable
words). If these statistics are in any way reflected in the ciphertext, the cryptanalyst
may be able to deduce the encryption key, part of the key, or at least a set of keys
likely to contain the exact key. In what Shannon refers to as a strongly ideal cipher,
all statistics of the ciphertext are independent of the particular key used. The arbi-
trary substitution cipher that we discussed previously (Figure 4.2) is such a cipher,
but as we have seen, it is impractical.4
Other than recourse to ideal systems, Shannon suggests two methods for
frustrating statistical cryptanalysis: diffusion and confusion. In diffusion, the sta-
tistical structure of the plaintext is dissipated into long-range statistics of the
ciphertext. This is achieved by having each plaintext digit affect the value of many
ciphertext digits; generally, this is equivalent to having each ciphertext digit be
affected by many plaintext digits. An example of diffusion is to encrypt a message
$M = m_1, m_2, m_3, \ldots$ of characters with an averaging operation:
$$y_n = \left( \sum_{i=1}^{k} m_{n+i} \right) \bmod 26$$
3The paper is available at Shannon’s 1949 paper appeared originally as a classified
report in 1945. Shannon enjoys an amazing and unique position in the history of computer and informa-
tion science. He not only developed the seminal ideas of modern cryptography but is also responsible for
inventing the discipline of information theory. Based on his work in information theory, he developed
a formula for the capacity of a data communications channel, which is still used today. In addition, he
founded another discipline, the application of Boolean algebra to the study of digital circuits; this last he
managed to toss off as a master’s thesis.
4Appendix F expands on Shannon’s concepts concerning measures of secrecy and the security of crypto-
graphic algorithms.

adding k successive letters to get a ciphertext letter yn. One can show that the sta-
tistical structure of the plaintext has been dissipated. Thus, the letter frequencies in
the ciphertext will be more nearly equal than in the plaintext; the digram frequen-
cies will also be more nearly equal, and so on. In a binary block cipher, diffusion can
be achieved by repeatedly performing some permutation on the data followed by
applying a function to that permutation; the effect is that bits from different posi-
tions in the original plaintext contribute to a single bit of ciphertext.5
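The averaging operation can be tried out with a short Python sketch. Two assumptions are made here that the text does not specify: the message contains only the lowercase letters a to z (mapped to 0 through 25), and indices past the end of the message wrap around to the beginning.

```python
def diffuse(message: str, k: int) -> str:
    """y_n = (m_{n+1} + ... + m_{n+k}) mod 26, with indices wrapping around."""
    m = [ord(ch) - ord("a") for ch in message]
    size = len(m)
    y = [sum(m[(n + i) % size] for i in range(1, k + 1)) % 26 for n in range(size)]
    return "".join(chr(v + ord("a")) for v in y)
```

For example, diffuse("abcd", 2) adds the two letters following each position: the values 0, 1, 2, 3 become 3, 5, 3, 1, that is, "dfdb". Each output letter depends on k input letters, which is what spreads the plaintext statistics. The sketch illustrates the statistical effect only; it is not meant as a decryptable cipher.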
Every block cipher involves a transformation of a block of plaintext into a
block of ciphertext, where the transformation depends on the key. The mechanism
of diffusion seeks to make the statistical relationship between the plaintext and
ciphertext as complex as possible in order to thwart attempts to deduce the key. On
the other hand, confusion seeks to make the relationship between the statistics of
the ciphertext and the value of the encryption key as complex as possible, again to
thwart attempts to discover the key. Thus, even if the attacker can get some handle
on the statistics of the ciphertext, the way in which the key was used to produce that
ciphertext is so complex as to make it difficult to deduce the key. This is achieved by
the use of a complex substitution algorithm. In contrast, a simple linear substitution
function would add little confusion.
As [ROBS95b] points out, so successful are diffusion and confusion in captur-
ing the essence of the desired attributes of a block cipher that they have become the
cornerstone of modern block cipher design.
FEISTEL CIPHER STRUCTURE The left-hand side of Figure 4.3 depicts the encryption
structure proposed by Feistel. The inputs to the encryption algorithm are a plaintext
block of length 2w bits and a key K. The plaintext block is divided into two halves,
$LE_0$ and $RE_0$. The two halves of the data pass through n rounds of processing and
then combine to produce the ciphertext block. Each round i has as inputs $LE_{i-1}$ and
$RE_{i-1}$ derived from the previous round, as well as a subkey $K_i$ derived from the
overall K. In general, the subkeys $K_i$ are different from K and from each other. In Figure
4.3, 16 rounds are used, although any number of rounds could be implemented.
All rounds have the same structure. A substitution is performed on the left
half of the data. This is done by applying a round function F to the right half of the
data and then taking the exclusive-OR of the output of that function and the left
half of the data. The round function has the same general structure for each round
but is parameterized by the round subkey $K_i$. Another way to express this is to say
that F is a function of a right-half block of w bits and a subkey of y bits, which produces
an output value of length w bits: $F(RE_i, K_{i+1})$. Following this substitution, a
permutation is performed that consists of the interchange of the two halves of the
data.6 This structure is a particular form of the substitution-permutation network
(SPN) proposed by Shannon.
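A single round as described above can be sketched in Python as follows. The half width of 16 bits and the round function toy_f are assumptions made for this illustration; any function of the right half and the subkey could stand in for F.

```python
def feistel_round(left: int, right: int, subkey: int, f, w: int = 16):
    """One Feistel round on two w-bit halves.

    Substitution: the left half is XORed with F(right half, subkey).
    Permutation:  the two halves are then interchanged.
    """
    mask = (1 << w) - 1
    new_left = right
    new_right = (left ^ f(right, subkey)) & mask
    return new_left, new_right

def toy_f(half: int, subkey: int) -> int:
    """Hypothetical round function, for illustration only."""
    return (half * 31 + subkey) & 0xFFFF
```

Note that the right half passes through unchanged and only supplies the input to F; this is what later allows the round to be undone without inverting F.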
5Some books on cryptography equate permutation with diffusion. This is incorrect. Permutation, by itself,
does not change the statistics of the plaintext at the level of individual letters or permuted blocks. For exam-
ple, in DES, the permutation swaps two 32-bit blocks, so statistics of strings of 32 bits or less are preserved.
6The final round is followed by an interchange that undoes the interchange that is part of the final round.
One could simply leave both interchanges out of the diagram, at the sacrifice of some consistency of pre-
sentation. In any case, the effective lack of a swap in the final round is done to simplify the implementa-
tion of the decryption process, as we shall see.

Figure 4.3 Feistel Encryption and Decryption (16 rounds)
The exact realization of a Feistel network depends on the choice of the
following parameters and design features:
■ Block size: Larger block sizes mean greater security (all other things being
equal) but reduced encryption/decryption speed for a given algorithm. The
greater security is achieved by greater diffusion. Traditionally, a block size of
64 bits has been considered a reasonable tradeoff and was nearly universal in
block cipher design. However, the new AES uses a 128-bit block size.
■ Key size: Larger key size means greater security but may decrease encryption/
decryption speed. The greater security is achieved by greater resistance to
brute-force attacks and greater confusion. Key sizes of 64 bits or less are now
widely considered to be inadequate, and 128 bits has become a common size.
■ Number of rounds: The essence of the Feistel cipher is that a single round
offers inadequate security but that multiple rounds offer increasing security.
A typical size is 16 rounds.
■ Subkey generation algorithm: Greater complexity in this algorithm should
lead to greater difficulty of cryptanalysis.
■ Round function F: Again, greater complexity generally means greater
resistance to cryptanalysis.
There are two other considerations in the design of a Feistel cipher:
■ Fast software encryption/decryption: In many cases, encryption is embedded
in applications or utility functions in such a way as to preclude a hardware
implementation. Accordingly, the speed of execution of the algorithm becomes a
concern.
■ Ease of analysis: Although we would like to make our algorithm as difficult as
possible to cryptanalyze, there is great benefit in making the algorithm easy
to analyze. That is, if the algorithm can be concisely and clearly explained, it is
easier to analyze that algorithm for cryptanalytic vulnerabilities and therefore
develop a higher level of assurance as to its strength. DES, for example, does
not have an easily analyzed functionality.
FEISTEL DECRYPTION ALGORITHM The process of decryption with a Feistel cipher
is essentially the same as the encryption process. The rule is as follows: Use the
ciphertext as input to the algorithm, but use the subkeys $K_i$ in reverse order. That
is, use $K_n$ in the first round, $K_{n-1}$ in the second round, and so on, until $K_1$ is used in
the last round. This is a nice feature, because it means we need not implement two
different algorithms: one for encryption and one for decryption.
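The rule can be illustrated with a minimal Python sketch in which one routine serves for both directions and only the subkey order changes. The 32-bit half width, the SHA-256-based round function round_f, and the toy subkey list are assumptions made for this example; they are not the DES round function or key schedule.

```python
import hashlib

W = 32                 # half-block width in bits (assumed for this sketch)
MASK = (1 << W) - 1

def round_f(half: int, subkey: int) -> int:
    """Hypothetical, non-invertible round function built from SHA-256."""
    data = half.to_bytes(4, "big") + subkey.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def feistel(block: int, subkeys) -> int:
    """Run a Feistel network over a 2W-bit block using the subkeys in the given order."""
    left, right = block >> W, block & MASK
    for k in subkeys:
        left, right = right, left ^ round_f(right, k)
    # Final interchange of the halves, so the same routine also decrypts.
    return (right << W) | left

# Toy subkeys standing in for a real key schedule.
SUBKEYS = [(0x9E3779B9 * (i + 1)) & MASK for i in range(16)]

def encrypt(block: int) -> int:
    return feistel(block, SUBKEYS)

def decrypt(block: int) -> int:
    return feistel(block, SUBKEYS[::-1])
```

With these definitions, decrypt(encrypt(x)) returns x for any 64-bit x, even though round_f itself cannot be inverted.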
To see that the same algorithm with a reversed key order produces the correct
result, Figure 4.3 shows the encryption process going down the left-hand side
and the decryption process going up the right-hand side for a 16-round algorithm.
For clarity, we use the notation $LE_i$ and $RE_i$ for data traveling through the
encryption algorithm and $LD_i$ and $RD_i$ for data traveling through the decryption
algorithm. The diagram indicates that, at every round, the intermediate value of the
decryption process is equal to the corresponding value of the encryption process
with the two halves of the value swapped. To put this another way, let the output
of the ith encryption round be $LE_i \| RE_i$ ($LE_i$ concatenated with $RE_i$). Then the
corresponding output of the (16 – i)th decryption round is $RE_i \| LE_i$ or, equivalently,
$LD_{16-i} \| RD_{16-i}$.
Let us walk through Figure 4.3 to demonstrate the validity of the preceding
assertions. After the last iteration of the encryption process, the two halves of the
output are swapped, so that the ciphertext is $RE_{16} \| LE_{16}$. The output of that round
is the ciphertext. Now take that ciphertext and use it as input to the same algorithm.
The input to the first round is $RE_{16} \| LE_{16}$, which is equal to the 32-bit swap of the
output of the sixteenth round of the encryption process.

Now we would like to show that the output of the first round of the decryption
process is equal to a 32-bit swap of the input to the sixteenth round of the
encryption process. First, consider the encryption process. We see that
$$\begin{aligned}
LE_{16} &= RE_{15} \\
RE_{16} &= LE_{15} \oplus F(RE_{15}, K_{16})
\end{aligned}$$
On the decryption side,
$$\begin{aligned}
LD_1 &= RD_0 = LE_{16} = RE_{15} \\
RD_1 &= LD_0 \oplus F(RD_0, K_{16}) \\
&= RE_{16} \oplus F(RE_{15}, K_{16}) \\
&= [LE_{15} \oplus F(RE_{15}, K_{16})] \oplus F(RE_{15}, K_{16})
\end{aligned}$$
The XOR has the following properties:
$$\begin{aligned}
[A \oplus B] \oplus C &= A \oplus [B \oplus C] \\
D \oplus D &= 0 \\
E \oplus 0 &= E
\end{aligned}$$
Thus, we have $LD_1 = RE_{15}$ and $RD_1 = LE_{15}$. Therefore, the output of the first
round of the decryption process is $RE_{15} \| LE_{15}$, which is the 32-bit swap of the input
to the sixteenth round of the encryption. This correspondence holds all the way
through the 16 iterations, as is easily shown. We can cast this process in general
terms. For the ith iteration of the encryption algorithm,
$$\begin{aligned}
LE_i &= RE_{i-1} \\
RE_i &= LE_{i-1} \oplus F(RE_{i-1}, K_i)
\end{aligned}$$
Rearranging terms:
$$\begin{aligned}
RE_{i-1} &= LE_i \\
LE_{i-1} &= RE_i \oplus F(RE_{i-1}, K_i) = RE_i \oplus F(LE_i, K_i)
\end{aligned}$$
Thus, we have described the inputs to the ith iteration as a function of the outputs, and
these equations confirm the assignments shown in the right-hand side of Figure 4.3.
Finally, we see that the output of the last round of the decryption process is
$RE_0 \| LE_0$. A 32-bit swap recovers the original plaintext, demonstrating the validity
of the Feistel decryption process.
Note that the derivation does not require that F be a reversible function. To
see this, take a limiting case in which F produces a constant output (e.g., all ones)
regardless of the values of its two arguments. The equations still hold.
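The limiting case can be checked directly. The sketch below uses 16-bit halves, 16 rounds, and a placeholder subkey list; all of these, along with the function names, are assumptions made for illustration.

```python
def feistel(block: int, subkeys, f, w: int = 16) -> int:
    """Generic Feistel network on a 2w-bit block, ending with a swap of the halves."""
    mask = (1 << w) - 1
    left, right = block >> w, block & mask
    for k in subkeys:
        left, right = right, (left ^ f(right, k)) & mask
    return (right << w) | left

def constant_f(half: int, subkey: int) -> int:
    """Limiting case: F ignores both arguments and returns all ones."""
    return 0xFFFF

keys = list(range(1, 17))  # placeholder subkeys; constant_f ignores them anyway
ct = feistel(0xDE7F03A6, keys, constant_f)
pt = feistel(ct, keys[::-1], constant_f)
```

Decryption still recovers the plaintext, so the equations hold. Such an F offers no security, of course: with this particular F and 16 rounds, the ciphertext is simply the plaintext with its halves swapped.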
To help clarify the preceding concepts, let us look at a specific example
(Figure 4.4) and focus on the fifteenth round of encryption, corresponding to the
second round of decryption. Suppose that the blocks at each stage are 32 bits (two 16-bit
halves) and that the key size is 24 bits. Suppose that at the end of encryption round
fourteen, the value of the intermediate block (in hexadecimal) is DE7F03A6. Then
$LE_{14}$ = DE7F and $RE_{14}$ = 03A6. Also assume that the value of $K_{15}$ is 12DE52.
After round 15, we have $LE_{15}$ = 03A6 and $RE_{15}$ = F(03A6, 12DE52) ⊕ DE7F.
Now let’s look at the decryption. We assume that $LD_1 = RE_{15}$ and
$RD_1 = LE_{15}$, as shown in Figure 4.3, and we want to demonstrate that $LD_2 = RE_{14}$
and $RD_2 = LE_{14}$. So, we start with $LD_1$ = F(03A6, 12DE52) ⊕ DE7F and
$RD_1$ = 03A6. Then, from Figure 4.3, $LD_2$ = 03A6 = $RE_{14}$ and $RD_2$ =
F(03A6, 12DE52) ⊕ [F(03A6, 12DE52) ⊕ DE7F] = DE7F = $LE_{14}$.
Until the introduction of the Advanced Encryption Standard (AES) in 2001, the
Data Encryption Standard (DES) was the most widely used encryption scheme.
DES was issued in 1977 by the National Bureau of Standards, now the National
Institute of Standards and Technology (NIST), as Federal Information Processing
Standard 46 (FIPS PUB 46). The algorithm itself is referred to as the Data
Encryption Algorithm (DEA).7 For DEA, data are encrypted in 64-bit blocks using
a 56-bit key. The algorithm transforms 64-bit input in a series of steps into a 64-bit
output. The same steps, with the same key, are used to reverse the encryption.
Over the years, DES became the dominant symmetric encryption algorithm,
especially in financial applications. In 1994, NIST reaffirmed DES for federal use
for another five years; NIST recommended the use of DES for applications other
than the protection of classified information. In 1999, NIST issued a new version
of its standard (FIPS PUB 46-3) that indicated that DES should be used only
for legacy systems and that triple DES (which in essence involves repeating the
DES algorithm three times on the plaintext using two or three different keys to
produce the ciphertext) be used. We study triple DES in Chapter 7. Because the
underlying encryption and decryption algorithms are the same for DES and triple
DES, it remains important to understand the DES cipher. This section provides an
overview. For the interested reader, Appendix S provides further detail.
7The terminology is a bit confusing. Until recently, the terms DES and DEA could be used interchange-
ably. However, the most recent edition of the DES document includes a specification of the DEA
described here plus the triple DEA (TDEA) described in Chapter 7. Both DEA and TDEA are part of
the Data Encryption Standard. Further, until the recent adoption of the official term TDEA, the triple
DEA algorithm was typically referred to as triple DES and written as 3DES. For the sake of convenience,
we will use the term 3DES.
Figure 4.4 Feistel Example (encryption round 15 and the corresponding decryption round 2 for the block DE7F03A6)

DES Encryption
The overall scheme for DES encryption is illustrated in Figure 4.5. As with any
encryption scheme, there are two inputs to the encryption function: the plaintext to
be encrypted and the key. In this case, the plaintext must be 64 bits in length and the
key is 56 bits in length.8
Looking at the left-hand side of the figure, we can see that the processing
of the plaintext proceeds in three phases. First, the 64-bit plaintext passes through
an initial permutation (IP) that rearranges the bits to produce the permuted input.
8Actually, the function expects a 64-bit key as input. However, only 56 of these bits are ever used; the
other 8 bits can be used as parity bits or simply set arbitrarily.
Figure 4.5 General Depiction of DES Encryption Algorithm

This is followed by a phase consisting of sixteen rounds of the same function, which
involves both permutation and substitution functions. The output of the last (six-
teenth) round consists of 64 bits that are a function of the input plaintext and the
key. The left and right halves of the output are swapped to produce the preoutput.
Finally, the preoutput is passed through a permutation [IP-1] that is the inverse of
the initial permutation function, to produce the 64-bit ciphertext. With the excep-
tion of the initial and final permutations, DES has the exact structure of a Feistel
cipher, as shown in Figure 4.3.
The right-hand portion of Figure 4.5 shows the way in which the 56-bit key is
used. Initially, the key is passed through a permutation function. Then, for each of
the sixteen rounds, a subkey (Ki) is produced by the combination of a left circular
shift and a permutation. The permutation function is the same for each round, but a
different subkey is produced because of the repeated shifts of the key bits.
DES Decryption
As with any Feistel cipher, decryption uses the same algorithm as encryption, except
that the application of the subkeys is reversed. Additionally, the initial and final
permutations are reversed.
We now work through an example and consider some of its implications. Although
you are not expected to duplicate the example by hand, you will find it informative
to study the hex patterns that occur from one step to the next.
For this example, the plaintext is a hexadecimal palindrome. The plaintext,
key, and resulting ciphertext are as follows:
Plaintext: 02468aceeca86420
Key: 0f1571c947d9e859
Ciphertext: da02ce3a89ecac3b
Table 4.2 shows the progression of the algorithm. The first row shows the 32-bit
values of the left and right halves of data after the initial permutation. The next 16
rows show the results after each round. Also shown is the value of the 48-bit subkey
generated for each round. Note that L_i = R_{i-1}. The final row shows the left- and
right-hand values after the inverse initial permutation. These two values combined
form the ciphertext.
The Avalanche Effect
A desirable property of any encryption algorithm is that a small change in either
the plaintext or the key should produce a significant change in the ciphertext. In
particular, a change in one bit of the plaintext or one bit of the key should produce

a change in many bits of the ciphertext. This is referred to as the avalanche effect.
If the change were small, this might provide a way to reduce the size of the plaintext
or key space to be searched.
Using the example from Table 4.2, Table 4.3 shows the result when the fourth
bit of the plaintext is changed, so that the plaintext is 12468aceeca86420. The
second column of the table shows the intermediate 64-bit values at the end of each
round for the two plaintexts. The third column shows the number of bits that differ
between the two intermediate values. The table shows that, after just three rounds,
18 bits differ between the two blocks. On completion, the two ciphertexts differ in
32 bit positions.
Table 4.4 shows a similar test using the original plaintext with two keys that
differ in only the fourth bit position: the original key, 0f1571c947d9e859, and
the altered key, 1f1571c947d9e859. Again, the results show that about half of
the bits in the ciphertext differ and that the avalanche effect is pronounced after just
a few rounds.
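The number of differing bits between two values is simply the count of one bits in their XOR. The following is a minimal sketch of that measurement only; it does not implement DES, and the function name bit_difference is an assumption made for illustration. It is applied to the two plaintexts and the two keys quoted above, each pair of which differs in the fourth bit.

```python
def bit_difference(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex strings differ."""
    if len(hex_a) != len(hex_b):
        raise ValueError("values must have the same length")
    # XOR leaves a 1 exactly where the two inputs disagree.
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

plaintext_delta = bit_difference("02468aceeca86420", "12468aceeca86420")
key_delta = bit_difference("0f1571c947d9e859", "1f1571c947d9e859")
print(plaintext_delta, key_delta)
```

Applying the same function to the intermediate 64-bit values of two runs, round by round, gives the kind of per-round difference count described for Tables 4.3 and 4.4.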
Round Ki Li Ri
IP 5a005a00 3cf03c0f
1 1e030f03080d2930 3cf03c0f bad22845
2 0a31293432242318 bad22845 99e9b723
3 23072318201d0c1d 99e9b723 0bae3b9e
4 05261d3824311a20 0bae3b9e 42415649
5 3325340136002c25 42415649 18b3fa41
6 123a2d0d04262a1c 18b3fa41 9616fe23
7 021f120b1c130611 9616fe23 67117cf2
8 1c10372a2832002b 67117cf2 c11bfc09
9 04292a380c341f03 c11bfc09 887fbc6c
10 2703212607280403 887fbc6c 600f7e8b
11 2826390c31261504 600f7e8b f596506e
12 12071c241a0a0f08 f596506e 738538b8
13 300935393c0d100b 738538b8 c6a62c4e
14 311e09231321182a c6a62c4e 56b0bd75
15 283d3e0227072528 56b0bd75 75e8fd8f
16 2921080b13143025 75e8fd8f 25896490
IP−1 da02ce3a 89ecac3b
Note: DES subkeys are shown as eight 6-bit values in hex format
Table 4.2 DES Example

Table 4.3 Avalanche Effect in DES: Change in Plaintext
Round  Intermediate value (hex)
1      3cf03c0fbad22845
2      bad2284599e9b723
3      99e9b7230bae3b9e
4      0bae3b9e42415649
5      4241564918b3fa41
6      18b3fa419616fe23
7      9616fe2367117cf2
8      67117cf2c11bfc09
9      c11bfc09887fbc6c
10     887fbc6c600f7e8b
11     600f7e8bf596506e
12     f596506e738538b8
13     738538b8c6a62c4e
14     c6a62c4e56b0bd75
15     56b0bd7575e8fd8f
16     75e8fd8f25896490
IP−1   da02ce3a89ecac3b
Table 4.4 Avalanche Effect in DES: Change in Key
Round  Intermediate value (hex)
1      3cf03c0fbad22845
2      bad2284599e9b723
3      99e9b7230bae3b9e
4      0bae3b9e42415649
5      4241564918b3fa41
6      18b3fa419616fe23
7      9616fe2367117cf2
8      67117cf2c11bfc09
9      c11bfc09887fbc6c
10     887fbc6c600f7e8b
11     600f7e8bf596506e
12     f596506e738538b8
13     738538b8c6a62c4e
14     c6a62c4e56b0bd75
15     56b0bd7575e8fd8f
16     75e8fd8f25896490
IP−1   da02ce3a89ecac3b

Since its adoption as a federal standard, there have been lingering concerns about
the level of security provided by DES. These concerns, by and large, fall into two
areas: key size and the nature of the algorithm.
The Use of 56-Bit Keys
With a key length of 56 bits, there are 2^56 possible keys, which is approximately
7.2 × 10^16 keys. Thus, on the face of it, a brute-force attack appears impractical.
Assuming that, on average, half the key space has to be searched, a single machine
performing one DES encryption per microsecond would take more than a thousand
years to break the cipher.
However, the assumption of one encryption per microsecond is overly con-
servative. As far back as 1977, Diffie and Hellman postulated that the technology
existed to build a parallel machine with 1 million encryption devices, each of which
could perform one encryption per microsecond [DIFF77]. This would bring the
average search time down to about 10 hours. The authors estimated that the cost
would be about $20 million in 1977 dollars.
With current technology, it is not even necessary to use special, purpose-built
hardware. Rather, the speed of commercial, off-the-shelf processors threatens the
security of DES. A recent paper from Seagate Technology [SEAG08] suggests that
a rate of 1 billion (10^9) key combinations per second is reasonable for today’s
multicore computers. Recent offerings confirm this. Both Intel and AMD now offer
hardware-based instructions to accelerate the use of AES. Tests run on a
contemporary multicore Intel machine resulted in an encryption rate of about half a
billion encryptions per second [BASU12]. Another recent analysis suggests that with
contemporary supercomputer technology, a rate of 10^13 encryptions per second is
reasonable [AROR12].
With these results in mind, Table 4.5 shows how much time is required for a
brute-force attack for various key sizes. As can be seen, a single PC can break DES in
about a year; if multiple PCs work in parallel, the time is drastically shortened. And
today’s supercomputers should be able to find a key in about an hour. Key sizes of
128 bits or greater are effectively unbreakable using simply a brute-force approach.
Even if we managed to speed up the attacking system by a factor of 1 trillion (10^12),
it would still take over 100,000 years to break a code using a 128-bit key.
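The search times quoted in this section follow from one line of arithmetic: on average half of the 2^k keys must be tried, so the expected time is 2^(k-1) divided by the trial rate. The sketch below reproduces that calculation. The function names are illustrative, and a 365-day year is assumed, so the printed figures may differ slightly in the last digits from tabulated values.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: a 365-day year

def average_search_seconds(key_bits: int, keys_per_second: float) -> float:
    """Average time to find a key when half of the 2**key_bits keys are tried."""
    return 2 ** (key_bits - 1) / keys_per_second

def describe(seconds: float) -> str:
    """Format a duration in hours when short, otherwise in years."""
    if seconds < 2 * 86400:
        return f"{seconds / 3600:.2f} hours"
    return f"{seconds / SECONDS_PER_YEAR:.3g} years"

for bits in (56, 128, 168, 192, 256):
    at_pc_rate = describe(average_search_seconds(bits, 1e9))
    at_super_rate = describe(average_search_seconds(bits, 1e13))
    print(f"{bits:>3}-bit key: {at_pc_rate} at 10^9/s, {at_super_rate} at 10^13/s")
```

The same function covers the other scenarios in the text: a rate of 10^6 per second for a single device performing one encryption per microsecond, and 10^12 per second for a million such devices working in parallel.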
Fortunately, there are a number of alternatives to DES, the most important of
which are AES and triple DES, discussed in Chapters 6 and 7, respectively.
The Nature of the DES Algorithm
Another concern is the possibility that cryptanalysis is possible by exploiting
the characteristics of the DES algorithm. The focus of concern has been on the
eight substitution tables, or S-boxes, that are used in each iteration (described in
Appendix S). Because the design criteria for these boxes, and indeed for the entire
algorithm, were not made public, there is a suspicion that the boxes were
constructed in such a way that cryptanalysis is possible for an opponent who knows
the weaknesses in the S-boxes. This assertion is tantalizing, and over the years a
number of regularities and unexpected behaviors of the S-boxes have been
discovered. Despite this, no one has so far succeeded in discovering the supposed fatal
weaknesses in the S-boxes.9

Key Size (bits)   Cipher           Number of Alternative Keys   Time Required at 10^9 Decryptions/s   Time Required at 10^13 Decryptions/s
56                DES              2^56 ≈ 7.2 × 10^16           2^55 ns = 1.125 years                 1 hour
128               AES              2^128 ≈ 3.4 × 10^38          2^127 ns = 5.3 × 10^21 years          5.3 × 10^17 years
168               Triple DES       2^168 ≈ 3.7 × 10^50          2^167 ns = 5.8 × 10^33 years          5.8 × 10^29 years
192               AES              2^192 ≈ 6.3 × 10^57          2^191 ns = 9.8 × 10^40 years          9.8 × 10^36 years
256               AES              2^256 ≈ 1.2 × 10^77          2^255 ns = 1.8 × 10^60 years          1.8 × 10^56 years
26 characters     Monoalphabetic   26! = 4 × 10^26              2 × 10^26 ns = 6.3 × 10^9 years       6.3 × 10^5 years
Table 4.5 Average Time Required for Exhaustive Key Search
Timing Attacks
We discuss timing attacks in more detail in Part Two, as they relate to public-key
algorithms. However, the issue may also be relevant for symmetric ciphers. In
essence, a timing attack is one in which information about the key or the plaintext is
obtained by observing how long it takes a given implementation to perform decryp-
tions on various ciphertexts. A timing attack exploits the fact that an encryption
or decryption algorithm often takes slightly different amounts of time on different
inputs. [HEVI99] reports on an approach that yields the Hamming weight (number
of bits equal to one) of the secret key. This is a long way from knowing the actual
key, but it is an intriguing first step. The authors conclude that DES appears to be
fairly resistant to a successful timing attack but suggest some avenues to explore.
Although this is an interesting line of attack, it so far appears unlikely that this tech-
nique will ever be successful against DES or more powerful symmetric ciphers such
as triple DES and AES.
Although much progress has been made in designing block ciphers that are cryp-
tographically strong, the basic principles have not changed all that much since the
work of Feistel and the DES design team in the early 1970s. In this section we look
at three critical aspects of block cipher design: the number of rounds, design of the
function F, and key scheduling.
9At least, no one has publicly acknowledged such a discovery.

Number of Rounds
The cryptographic strength of a Feistel cipher derives from three aspects of the
design: the number of rounds, the function F, and the key schedule algorithm. Let
us look first at the choice of the number of rounds.
The greater the number of rounds, the more difficult it is to perform crypt-
analysis, even for a relatively weak F. In general, the criterion should be that the
number of rounds is chosen so that known cryptanalytic efforts require greater
effort than a simple brute-force key search attack. This criterion was certainly used
in the design of DES. Schneier [SCHN96] observes that for 16-round DES, a
differential cryptanalysis attack is slightly less efficient than brute force: The
differential cryptanalysis attack requires 2^55.1 operations,10 whereas brute force requires 2^55.
If DES had 15 or fewer rounds, differential cryptanalysis would require less effort
than a brute-force key search.
This criterion is attractive, because it makes it easy to judge the strength of
an algorithm and to compare different algorithms. In the absence of a cryptana-
lytic breakthrough, the strength of any algorithm that satisfies the criterion can be
judged solely on key length.
Design of Function F
The heart of a Feistel block cipher is the function F, which provides the element of
confusion in a Feistel cipher. Thus, it must be difficult to “unscramble” the substitu-
tion performed by F. One obvious criterion is that F be nonlinear, as we discussed
previously. The more nonlinear F, the more difficult any type of cryptanalysis will be.
There are several measures of nonlinearity, which are beyond the scope of this
book. In rough terms, the more difficult it is to approximate F by a set of linear
equations, the more nonlinear F is.
Several other criteria should be considered in designing F. We would like the
algorithm to have good avalanche properties. Recall that, in general, this means that
a change in one bit of the input should produce a change in many bits of the output.
A more stringent version of this is the strict avalanche criterion (SAC) [WEBS86],
which states that any output bit j of an S-box (see Appendix S for a discussion of
S-boxes) should change with probability 1/2 when any single input bit i is inverted
for all i, j. Although SAC is expressed in terms of S-boxes, a similar criterion could
be applied to F as a whole. This is important when considering designs that do not
include S-boxes.
Another criterion proposed in [WEBS86] is the bit independence criterion
(BIC), which states that output bits j and k should change independently when any
single input bit i is inverted for all i, j, and k. The SAC and BIC criteria appear to
strengthen the effectiveness of the confusion function.
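The strict avalanche criterion can be measured directly for a small S-box by flipping each input bit in turn and recording how often each output bit changes. The sketch below computes that table of frequencies; under SAC every entry would ideally be 1/2. The function name sac_matrix is illustrative, and EXAMPLE_SBOX is an arbitrary 4-bit permutation chosen for the demonstration, not one of the DES S-boxes. Bits are numbered from the least significant bit, starting at 0.

```python
from typing import List, Sequence

def sac_matrix(sbox: Sequence[int], n_in: int, n_out: int) -> List[List[float]]:
    """Return m where m[i][j] is the fraction of inputs x for which
    inverting input bit i changes output bit j."""
    size = 1 << n_in
    matrix = []
    for i in range(n_in):
        row = []
        for j in range(n_out):
            flips = sum(
                ((sbox[x] ^ sbox[x ^ (1 << i)]) >> j) & 1 for x in range(size)
            )
            row.append(flips / size)
        matrix.append(row)
    return matrix

# Arbitrary 4-bit permutation, used here only as sample data.
EXAMPLE_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

for i, row in enumerate(sac_matrix(EXAMPLE_SBOX, 4, 4)):
    print(f"input bit {i}:", row)
```

As a sanity check, the identity mapping yields 1 on the diagonal and 0 elsewhere, which is as far from SAC as a mapping can be: each input bit influences exactly one output bit.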
10 Differential cryptanalysis of DES requires 2^47 chosen plaintexts. If all you have to work with is known
plaintext, then you must sort through a large quantity of known plaintext–ciphertext pairs looking for the
useful ones. This brings the level of effort up to 2^55.1.

Key Schedule Algorithm
With any Feistel block cipher, the key is used to generate one subkey for each round.
In general, we would like to select subkeys to maximize the difficulty of deducing
individual subkeys and the difficulty of working back to the main key. No general
principles for this have yet been promulgated.
Adams suggests [ADAM94] that, at minimum, the key schedule should guar-
antee key/ciphertext Strict Avalanche Criterion and Bit Independence Criterion.
Key Terms
avalanche effect
block cipher
Data Encryption Standard
Feistel cipher
irreversible mapping
product cipher
reversible mapping
round function
Review Questions
4.1 Briefly define a nonsingular transformation.
4.2 What is the difference between a block cipher and a stream cipher?
4.3 Why is it not practical to use an arbitrary reversible substitution cipher of the kind
shown in Table 4.1?
4.4 Briefly define the terms substitution and permutation.
4.5 What is the difference between diffusion and confusion?
4.6 Which parameters and design choices determine the actual algorithm of a Feistel
cipher?
4.7 What are the critical aspects of Feistel cipher design?
4.1 a. In Section 4.1, under the subsection on the motivation for the Feistel cipher
structure, it was stated that, for a block of n bits, the number of different reversible
mappings for the ideal block cipher is (2^n)!. Justify.
b. In that same discussion, it was stated that for the ideal block cipher, which allows all
possible reversible mappings, the size of the key is n × 2^n bits. But, if there are (2^n)!
possible mappings, it should take log_2((2^n)!) bits to discriminate among the different
mappings, and so the key length should be log_2((2^n)!). However, log_2((2^n)!) < n × 2^n.
Explain the discrepancy.

4.2 Consider a Feistel cipher composed of sixteen rounds with a block length of 128 bits
and a key length of 128 bits. Suppose that, for a given k, the key scheduling algorithm
determines values for the first eight round keys, k_1, k_2, ..., k_8, and then sets
k_9 = k_8, k_10 = k_7, k_11 = k_6, ..., k_16 = k_1
Suppose you have a ciphertext c. Explain how, with access to an encryption oracle,
you can decrypt c and determine m using just a single oracle query. This shows that
such a cipher is vulnerable to a chosen plaintext attack. (An encryption oracle can be
thought of as a device that, when given a plaintext, returns the corresponding cipher-
text. The internal details of the device are not known to you and you cannot break
open the device. You can only gain information from the oracle by making queries to
it and observing its responses.)
4.3 Let p be a permutation of the integers 0, 1, 2, ..., (2^n − 1), such that p(m) gives the
permuted value of m, 0 ≤ m < 2^n. Put another way, p maps the set of n-bit integers
into itself and no two integers map into the same integer. DES is such a permutation
for 64-bit integers. We say that p has a fixed point at m if p(m) = m. That is, if p is
an encryption mapping, then a fixed point corresponds to a message that encrypts to
itself. We are interested in the number of fixed points in a randomly chosen permuta-
tion p. Show the somewhat unexpected result that the number of fixed points for p is
1 on an average, and this number is independent of the size of the permutation.
4.4 Consider a block encryption algorithm that encrypts blocks of length n, and let
N = 2^n. Say we have t plaintext–ciphertext pairs P_i, C_i = E(K, P_i), where we assume
that the key K selects one of the N! possible mappings. Imagine that we wish to find K
by exhaustive search. We could generate key K′ and test whether C_i = E(K′, P_i) for
1 ≤ i ≤ t. If K′ encrypts each P_i to its proper C_i, then we have evidence that K = K′.
However, it may be the case that the mappings E(K, ·) and E(K′, ·) exactly agree
on the t plaintext–ciphertext pairs P_i, C_i and agree on no other pairs.
a. What is the probability that E(K, ·) and E(K′, ·) are in fact distinct mappings?
b. What is the probability that E(K, ·) and E(K′, ·) agree on another t′
plaintext–ciphertext pairs where 0 ≤ t′ ≤ N − t?
4.5 For any block cipher, the fact that it is a nonlinear function is crucial to its security. To
see this, suppose that we have a linear block cipher EL that encrypts 256-bit blocks
of plaintext into 256-bit blocks of ciphertext. Let EL(k, m) denote the encryption of a
256-bit message m under a key k (the actual bit length of k is irrelevant). Thus,
EL(k, [m1 ⊕ m2]) = EL(k, m1) ⊕ EL(k, m2) for all 256-bit patterns m1, m2.
Describe how, with 256 chosen ciphertexts, an adversary can decrypt any ciphertext
without knowledge of the secret key k. (A “chosen ciphertext” means that an adver-
sary has the ability to choose a ciphertext and then obtain its decryption. Here, you
have 256 plaintext/ciphertext pairs to work with and you have the ability to choose
the value of the ciphertexts.)
4.6 Suppose the DES F function mapped every 32-bit input R, regardless of the value of
the input K, to:
a. A 32-bit string of zeros
b. R
1. What function would DES then compute?
2. What would the decryption look like?
Hint: Use the following properties of the XOR operation:
(A⊕ B)⊕ C = A⊕ (B⊕ C)
(A⊕ A) = 0
(A⊕ 0 ) = A
A⊕ 1 = bitwise complement of A

where A, B, C are n-bit strings of bits,
0 is an n-bit string of zeros, and
1 is an n-bit string of ones.
4.7 Show that DES decryption is, in fact, the inverse of DES encryption.
4.8 The 32-bit swap after the sixteenth iteration of the DES algorithm is needed to make
the encryption process invertible by simply running the ciphertext back through the
algorithm with the key order reversed. This was demonstrated in the preceding prob-
lem. However, it still may not be entirely clear why the 32-bit swap is needed. To
demonstrate why, solve the following exercises. First, some notation:
A ‖ B = the concatenation of the bit strings A and B
T_i(R ‖ L) = the transformation defined by the ith iteration of the encryption
algorithm for 1 ≤ i ≤ 16
TD_i(R ‖ L) = the transformation defined by the ith iteration of the decryption
algorithm for 1 ≤ i ≤ 16
T_17(R ‖ L) = L ‖ R, where this transformation occurs after the sixteenth iteration
of the encryption algorithm
a. Show that the composition TD_1(IP(IP^{-1}(T_17(T_16(L_15 ‖ R_15))))) is equivalent to the
transformation that interchanges the 32-bit halves, L_15 and R_15. That is, show that
TD_1(IP(IP^{-1}(T_17(T_16(L_15 ‖ R_15))))) = R_15 ‖ L_15
b. Now suppose that we did away with the final 32-bit swap in the encryption
algorithm. Then we would want the following equality to hold:
TD_1(IP(IP^{-1}(T_16(L_15 ‖ R_15)))) = L_15 ‖ R_15
Does it?
Note: The following problems refer to details of DES that are described in Appendix S.
4.9 Consider the substitution defined by row 1 of S-box S1 in Table S.2. Show a block
diagram similar to Figure 4.2 that corresponds to this substitution.
4.10 Compute the bits number 4, 17, 41, and 45 at the output of the first round of the DES
decryption, assuming that the ciphertext block is composed of all ones and the exter-
nal key is composed of all ones.
4.11 This problem provides a numerical example of encryption using a one-round version
of DES. We start with the same bit pattern for the key K and the plaintext, namely:
Hexadecimal notation: 0 1 2 3 4 5 6 7 8 9 A B C D E F
Binary notation: 0000 0001 0010 0011 0100 0101 0110 0111
1000 1001 1010 1011 1100 1101 1110 1111
a. Derive K1, the first-round subkey.
b. Derive L0, R0.
c. Expand R0 to get E[R0], where E[·] is the expansion function of Table S.1.
d. Calculate A = E[R0]⊕ K1.
e. Group the 48-bit result of (d) into sets of 6 bits and evaluate the corresponding
S-box substitutions.
f. Concatenate the results of (e) to get a 32-bit result, B.

g. Apply the permutation to get P(B).
h. Calculate R1 = P(B)⊕ L0.
i. Write down the ciphertext.
4.12 Analyze the amount of left shifts in the DES key schedule by studying Table S.3 (d).
Is there a pattern? What could be the reason for the choice of these constants?
4.13 When using the DES algorithm for decryption, the 16 keys (K1, K2, ..., K16) are
used in reverse order. Therefore, the right-hand side of Figure S.1 is not valid for
decryption. Design a key-generation scheme with the appropriate shift schedule
(analogous to Table S.3d) for the decryption process.
4.14 a. Let X′ be the bitwise complement of X. Prove that if the complement of the
plaintext block is taken and the complement of an encryption key is taken, then
the result of DES encryption with these values is the complement of the original
ciphertext. That is,

If Y = E(K, X)
Then Y′ = E(K′, X′)

Hint: Begin by showing that for any two bit strings of equal length, A and B,
(A⊕ B)′ = A′ ⊕ B.
b. It has been said that a brute-force attack on DES requires searching a key space of
2^56 keys. Does the result of part (a) change that?
4.15 a. We say that a DES key K is weak if DES_K is an involution. Exhibit four weak
keys for DES.
b. We say that a DES key K is semi-weak if it is not weak and if there exists a key K′
such that DES_K^{-1} = DES_{K′}. Exhibit four semi-weak keys for DES.
Note: The following problems refer to simplified DES, described in Appendix G.
4.16 Refer to Figure G.3, which explains encryption function for S-DES.
a. How important is the initial permutation IP?
b. How important is the SW function in the middle?
4.17 The equations for the variables q and r for S-DES are defined in the section on
S-DES analysis. Provide the equations for s and t.
4.18 Using S-DES, decrypt the string 01000110 using the key 1010000010 by hand.
Show intermediate results after each function (IP, FK, SW, FK, IP^{-1}). Then decode
the first 4 bits of the plaintext string to a letter and the second 4 bits to another letter
where we encode A through P in base 2 (i.e., A = 0000, B = 0001, ..., P = 1111).
Hint: As a midway check, after the xoring with K2, the string should be 11000001.
Programming Problems
4.19 Create software that can encrypt and decrypt using a general substitution block
cipher.
4.20 Create software that can encrypt and decrypt using S-DES. Test data: use plaintext,
ciphertext, and key of Problem 4.18.

5.1 Groups
Abelian Group
Cyclic Group
5.2 Rings
5.3 Fields
5.4 Finite Fields of the Form GF(p)
Finite Fields of Order p
Finding the Multiplicative Inverse in GF(p)
5.5 Polynomial Arithmetic
Ordinary Polynomial Arithmetic
Polynomial Arithmetic with Coefficients in Zp
Finding the Greatest Common Divisor
5.6 Finite Fields of the Form GF(2^n)
Modular Polynomial Arithmetic
Finding the Multiplicative Inverse
Computational Considerations
Using a Generator
5.7 Key Terms, Review Questions, and Problems
Finite Fields

Finite fields have become increasingly important in cryptography. A number of
cryptographic algorithms rely heavily on properties of finite fields, notably the
Advanced Encryption Standard (AES) and elliptic curve cryptography. Other exam-
ples include the message authentication code CMAC and the authenticated encryption
scheme GCM.
This chapter provides the reader with sufficient background on the concepts of
finite fields to be able to understand the design of AES and other cryptographic algo-
rithms that use finite fields. Because students unfamiliar with abstract algebra may find
the concepts behind finite fields somewhat difficult to grasp, we approach the topic in a
way designed to enhance understanding. Our plan of attack is as follows:
1. Fields are a subset of a larger class of algebraic structures called rings, which
are in turn a subset of the larger class of groups. In fact, as shown in Figure 5.1,
both groups and rings can be further differentiated. Groups are defined by
a simple set of properties and are easily understood. Each successive subset
(abelian group, ring, commutative ring, and so on) adds additional properties
and is thus more complex. Sections 5.1 through 5.3 will examine groups, rings,
and fields, successively.
2. Finite fields are a subset of fields, consisting of those fields with a finite num-
ber of elements. These are the class of fields that are found in cryptographic
algorithms. With the concepts of fields in hand, we turn in Section 5.4 to a
specific class of finite fields, namely those with p elements, where p is prime.
Certain asymmetric cryptographic algorithms make use of such fields.
3. A more important class of finite fields, for cryptography, comprises those with
2^n elements, depicted as fields of the form GF(2^n). These are used in a wide
variety of cryptographic algorithms. However, before discussing these fields, we
need to analyze the topic of polynomial arithmetic, which is done in Section 5.5.
4. With all of this preliminary work done, we are able at last, in Section 5.6, to
discuss finite fields of the form GF(2^n).
Before proceeding, the reader may wish to review Sections 2.1 through 2.3, which
cover relevant topics in number theory.
After studying this chapter, you should be able to:
◆ Distinguish among groups, rings, and fields.
◆ Define finite fields of the form GF(p).
◆ Explain the differences among ordinary polynomial arithmetic, polynomial
arithmetic with coefficients in Z_p, and modular polynomial arithmetic in
GF(2^n).
◆ Define finite fields of the form GF(2^n).
◆ Explain the two different uses of the mod operator.

Groups, rings, and fields are the fundamental elements of a branch of mathematics
known as abstract algebra, or modern algebra. In abstract algebra, we are concerned
with sets on whose elements we can operate algebraically; that is, we can combine
two elements of the set, perhaps in several ways, to obtain a third element of the set.
These operations are subject to specific rules, which define the nature of the set. By
convention, the notation for the two principal classes of operations on set elements is
usually the same as the notation for addition and multiplication on ordinary numbers.
However, it is important to note that, in abstract algebra, we are not limited to ordi-
nary arithmetical operations. All this should become clear as we proceed.
A group G, sometimes denoted by {G, ·}, is a set of elements with a binary operation
denoted by · that associates to each ordered pair (a, b) of elements in G an
element (a · b) in G, such that the following axioms are obeyed:1
(A1) Closure: If a and b belong to G, then a · b is also in G.
(A2) Associative: a · (b · c) = (a · b) · c for all a, b, c in G.
(A3) Identity element: There is an element e in G such that
a · e = e · a = a for all a in G.
(A4) Inverse element: For each a in G, there is an element a′ in G
such that a · a′ = a′ · a = e.
1 The operator · is generic and can refer to addition, multiplication, or some other mathematical operation.
Figure 5.1 Groups, Rings, and Fields
Let Nn denote a set of n distinct symbols that, for convenience, we represent as
{1, 2, c , n}. A permutation of n distinct symbols is a one-to-one mapping from
Nn to Nn.
2 Define Sn to be the set of all permutations of n distinct symbols. Each
element of Sn is represented by a permutation p of the integers in 1, 2, . . . , n.
It is easy to demonstrate that Sn is a group:
A1: If (p, r∈ Sn), then the composite mapping p # r is formed by per-
muting the elements of r according to the permutation p. For
example, {3, 2, 1} # {1, 3, 2} = {2, 3, 1}. The notation for this map-
ping is explained as follows: The value of the first element of p
indicates which element of r is to be in the first position in p # r; the
value of the second element of p indicates which element of r is
to be in the second position in p # r; and so on. Clearly, p # r∈ Sn.
A2: The composition of mappings is also easily seen to be associative.
A3: The identity mapping is the permutation that does not alter the
order of the n elements. For Sn, the identity element is {1, 2, ..., n}.
A4: For any p∈ Sn, the mapping that undoes the permutation defined
by p is the inverse element for p. There will always be such an
inverse. For example {2, 3, 1} # {3, 1, 2} = {1, 2, 3}.
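The composition rule described under A1 and the inverse described under A4 can be sketched in a few lines of Python. This is only an illustration: the function names `compose` and `inverse` and the use of tuples for permutations are assumptions made here, not notation from the text.

```python
def compose(p, r):
    """Return p # r: position i of the result holds the element of r
    selected by the i-th value of p (values are 1-based)."""
    return tuple(r[pi - 1] for pi in p)

def inverse(p):
    """Return the permutation q for which p # q is the identity."""
    q = [0] * len(p)
    for i, pi in enumerate(p, start=1):
        q[pi - 1] = i
    return tuple(q)

# The two worked examples from the text.
print(compose((3, 2, 1), (1, 3, 2)))   # composite mapping from A1
print(compose((2, 3, 1), (3, 1, 2)))   # a permutation combined with its inverse
# Swapping the operands gives a different result, so S3 is not abelian.
print(compose((1, 3, 2), (3, 2, 1)))
```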
2This is equivalent to the definition of permutation in Chapter 2, which stated that a permutation of a
finite set of elements S is an ordered sequence of all the elements of S, with each element appearing
exactly once.
The set of integers (positive, negative, and 0) under addition is an abelian group.
The set of nonzero real numbers under multiplication is an abelian group. The
set Sn from the preceding example is a group but not an abelian group for n > 2.
If a group has a finite number of elements, it is referred to as a finite group, and
the order of the group is equal to the number of elements in the group. Otherwise,
the group is an infinite group.
Abelian Group
A group is said to be abelian if it satisfies the following additional condition:
(A5) Commutative: a # b = b # a for all a, b in G.

When the group operation is addition, the identity element is 0; the in-
verse element of a is -a; and subtraction is defined with the following rule:
a – b = a + (-b).
Cyclic Group
We define exponentiation within a group as a repeated application of the group
operator, so that a^3 = a # a # a. Furthermore, we define a^0 = e as the identity
element, and a^{-n} = (a′)^n, where a′ is the inverse element of a within the group.
A group G is cyclic if every element of G is a power a^k (k is an integer) of a fixed
element a∈G. The element a is said to generate the group G or to be a generator
of G. A cyclic group is always abelian and may be finite or infinite.
The additive group of integers is an infinite cyclic group generated by the element
1. In this case, powers are interpreted additively, so that n is the nth power of 1.
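As a small illustration of group exponentiation and generators, the sketch below searches a finite group for generators. The example group (Z6 under addition modulo 6) and the helper names `group_power` and `generators` are assumptions chosen for illustration; the text's own example is the infinite additive group of integers.

```python
def group_power(op, identity, a, k):
    """Apply the group operator k times: a^k = a # a # ... # a, with a^0 = e."""
    result = identity
    for _ in range(k):
        result = op(result, a)
    return result

def generators(elements, op, identity):
    """Return the elements whose powers reach every element of a finite group."""
    n = len(elements)
    return [a for a in elements
            if {group_power(op, identity, a, k) for k in range(n)} == set(elements)]

def add_mod6(a, b):
    """Group operator for the assumed example: addition modulo 6."""
    return (a + b) % 6

z6 = list(range(6))
print(generators(z6, add_mod6, 0))
```

Because powers are interpreted additively here, the "k-th power" of 1 is simply k mod 6, mirroring the remark about the additive group of integers.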
A ring R, sometimes denoted by {R, + , * }, is a set of elements with two binary
operations, called addition and multiplication,3 such that for all a, b, c in R the fol-
lowing axioms are obeyed.
(A1–A5) R is an abelian group with respect to addition; that is, R satisfies axioms
A1 through A5. For the case of an additive group, we denote the identity element
as 0 and the inverse of a as -a.
(M1) Closure under multiplication: If a and b belong to R, then ab is also in R.
(M2) Associativity of multiplication: a(bc) = (ab)c for all a, b, c in R.
(M3) Distributive laws: a(b + c) = ab + ac for all a, b, c in R.
(a + b)c = ac + bc for all a, b, c in R.
In essence, a ring is a set of elements in which we can do addition, subtraction
[a – b = a + (-b)], and multiplication without leaving the set.
3Generally, we do not use the multiplication symbol, * , but denote multiplication by the concatenation
of two elements.
With respect to addition and multiplication, the set of all n-square matrices over
the real numbers is a ring.
A ring is said to be commutative if it satisfies the following additional condition:
(M4) Commutativity of multiplication: ab = ba for all a, b in R.

Next, we define an integral domain, which is a commutative ring that obeys
the following axioms.
(M5) Multiplicative identity: There is an element 1 in R such that
a1 = 1a = a for all a in R.
(M6) No zero divisors: If a, b in R and ab = 0, then either a = 0
or b = 0.
Let S be the set of even integers (positive, negative, and 0) under the usual
operations of addition and multiplication. S is a commutative ring. The set of all
n-square matrices defined in the preceding example is not a commutative ring.
The set Zn of integers {0, 1, ..., n – 1}, together with the arithmetic operations
modulo n, is a commutative ring (Table 2.5).
Let S be the set of integers (positive, negative, and 0) under the usual operations
of addition and multiplication. S is an integral domain.
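Axiom M6 can be checked by brute force for the commutative ring Zn. The sketch below is an illustration only; the function name `zero_divisors` is an assumption. It lists the nonzero elements that multiply with some nonzero element to give 0.

```python
def zero_divisors(n):
    """Return the nonzero a in Zn for which some nonzero b gives a*b = 0 (mod n)."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(zero_divisors(8))   # Z8 has zero divisors, so it is not an integral domain
print(zero_divisors(7))   # Z7 has none
```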
Familiar examples of fields are the rational numbers, the real numbers, and the
complex numbers. Note that the set of all integers is not a field, because not every
element of the set has a multiplicative inverse; in fact, only the elements 1 and -1
have multiplicative inverses in the integers.
A field F, sometimes denoted by {F, + , * }, is a set of elements with two binary
operations, called addition and multiplication, such that for all a, b, c in F the follow-
ing axioms are obeyed.
(A1–M6) F is an integral domain; that is, F satisfies axioms A1 through A5 and
M1 through M6.
(M7) Multiplicative inverse: For each a in F, except 0, there is an element
a^{-1} in F such that aa^{-1} = (a^{-1})a = 1.
In essence, a field is a set of elements in which we can do addition, subtraction,
multiplication, and division without leaving the set. Division is defined with the
following rule: a/b = a(b^{-1}).
In gaining insight into fields, the following alternate characterization may be
useful. A field F, denoted by {F, + , * }, is a set of elements with two binary operations,
called addition and multiplication, such that the following conditions hold:
1. F forms an abelian group with respect to addition.
2. The nonzero elements of F form an abelian group with respect to multiplication.
3. The distributive law holds. That is, for all a, b, c in F,
a(b + c) = ab + ac
(a + b)c = ac + bc
Figure 5.2 summarizes the axioms that define groups, rings, and fields.
In Section 5.3, we defined a field as a set that obeys all of the axioms of Figure 5.2
and gave some examples of infinite fields. Infinite fields are not of particular inter-
est in the context of cryptography. However, in addition to infinite fields, there are
two types of finite fields, as illustrated in Figure 5.3. Finite fields play a crucial role
in many cryptographic algorithms.
It can be shown that the order of a finite field (number of elements in the
field) must be a power of a prime p^n, where n is a positive integer. The finite field
of order p^n is generally written GF(p^n); GF stands for Galois field, in honor of the
mathematician who first studied finite fields. Two special cases are of interest for
our purposes. For n = 1, we have the finite field GF(p); this finite field has a different
structure than that for finite fields with n > 1 and is studied in this section. For
finite fields of the form GF(p^n), GF(2^n) fields are of particular cryptographic interest,
and these are covered in Section 5.6.
Finite Fields of Order p
For a given prime, p, we define the finite field of order p, GF(p), as the set Zp of integers
{0, 1, ..., p – 1} together with the arithmetic operations modulo p. Note therefore
that we are using ordinary modular arithmetic to define the operations over these fields.
Figure 5.2 Properties of Groups, Rings, and Fields
(A1) Closure under addition: If a and b belong to S, then a + b is also in S
(A2) Associativity of addition: a + (b + c) = (a + b) + c for all a, b, c in S
(A3) Additive identity: There is an element 0 in S such that
a + 0 = 0 + a = a for all a in S
(A4) Additive inverse: For each a in S there is an element –a in S
such that a + (–a) = (–a) + a = 0
(A5) Commutativity of addition: a + b = b + a for all a, b in S
(M1) Closure under multiplication: If a and b belong to S, then ab is also in S
(M2) Associativity of multiplication: a(bc) = (ab)c for all a, b, c in S
(M3) Distributive laws: a(b + c) = ab + ac for all a, b, c in S
(a + b)c = ac + bc for all a, b, c in S
(M4) Commutativity of multiplication: ab = ba for all a, b in S
(M5) Multiplicative identity: There is an element 1 in S such that
a1 = 1a = a for all a in S
(M6) No zero divisors: If a, b in S and ab = 0, then either
a = 0 or b = 0
(M7) Multiplicative inverse: If a belongs to S and a ≠ 0, there is an
element a^{-1} in S such that aa^{-1} = a^{-1}a = 1

Recall that we showed in Section 5.2 that the set Zn of integers {0, 1, ..., n – 1},
together with the arithmetic operations modulo n, is a commutative ring (Table 2.5).
We further observed that any integer in Zn has a multiplicative inverse if and only if
that integer is relatively prime to n [see discussion of Equation (2.5)] (footnote 4). If n is prime,
then all of the nonzero integers in Zn are relatively prime to n, and therefore there
exists a multiplicative inverse for all of the nonzero integers in Zn. Thus, for Zp we
can add the following property to those listed in Table 2.5:
Multiplicative inverse (w^{-1}): For each w ∈ Zp, w ≠ 0, there exists a z ∈ Zp
such that w * z ≡ 1 (mod p)
Because w is relatively prime to p, if we multiply all the elements of Zp by
w, the resulting residues are all of the elements of Zp permuted. Thus, exactly one
of the residues has the value 1. Therefore, there is some integer in Zp that, when
multiplied by w, yields the residue 1. That integer is the multiplicative inverse of w,
designated w^{-1}. Therefore, Zp is in fact a finite field. Furthermore, Equation (2.5) is
consistent with the existence of a multiplicative inverse and can be rewritten without
the relative-primality condition:
if (a * b) ≡ (a * c) (mod p) then b ≡ c (mod p) (5.1)
Multiplying both sides of the congruence in Equation (5.1) by the multiplicative
inverse of a (which exists for any a not congruent to 0 modulo p), we have
((a^{-1}) * a * b) ≡ ((a^{-1}) * a * c) (mod p)
b ≡ c (mod p)
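The permutation argument above can be checked numerically. The following sketch is illustrative only; the helper names `scaled_residues` and `inverse_by_search` are assumptions. It multiplies every element of Zn by w and searches for the residue 1.

```python
def scaled_residues(w, n):
    """Multiply every element of Zn by w and reduce modulo n."""
    return [(w * z) % n for z in range(n)]

def inverse_by_search(w, n):
    """Return the z in Zn with w * z = 1 (mod n), or None if no such z exists."""
    for z in range(1, n):
        if (w * z) % n == 1:
            return z
    return None

print(scaled_residues(3, 7))    # a permutation of 0..6, so the residue 1 occurs once
print(inverse_by_search(3, 7))
print(scaled_residues(2, 8))    # not a permutation: 2 is not relatively prime to 8
print(inverse_by_search(2, 8))
```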
4As stated in the discussion of Equation (2.5), two integers are relatively prime if their only common
positive integer factor is 1.
[Figure 5.3 Types of Fields: fields with an infinite number of elements; finite fields, divided into finite fields with p elements and finite fields with p^n elements]
The simplest finite field is GF(2). Its arithmetic operations are easily summarized:
+ 0 1
0 0 1
1 1 0
(a) Addition
* 0 1
0 0 0
1 0 1
(b) Multiplication
w -w w^{-1}
0 0 –
1 1 1
(c) Inverses
In this case, addition is equivalent to the exclusive-OR (XOR) operation, and
multiplication is equivalent to the logical AND operation.
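The equivalence with XOR and AND can be confirmed exhaustively, since GF(2) has only two elements. In this minimal sketch the function names `gf2_add` and `gf2_mul` are illustrative assumptions.

```python
def gf2_add(a, b):
    """Addition in GF(2): ordinary addition reduced modulo 2."""
    return (a + b) % 2

def gf2_mul(a, b):
    """Multiplication in GF(2): ordinary multiplication reduced modulo 2."""
    return (a * b) % 2

# Compare against the bitwise operators for every pair of elements.
for a in (0, 1):
    for b in (0, 1):
        assert gf2_add(a, b) == a ^ b   # XOR
        assert gf2_mul(a, b) == a & b   # AND
```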

The right-hand side of Table 5.1 shows arithmetic operations in GF(7). This is a
field of order 7 using modular arithmetic modulo 7. As can be seen, it satisfies all
of the properties required of a field (Figure 5.2). Compare with the left-hand side
of Table 5.1, which reproduces Table 2.2. In the latter case, we see that the set Z8,
using modular arithmetic modulo 8, is not a field. Later in this chapter, we show
how to define addition and multiplication operations on Z8 in such a way as to
form a finite field.
Finding the Multiplicative Inverse in GF(p)
It is easy to find the multiplicative inverse of an element in GF(p) for small values
of p. You simply construct a multiplication table, such as shown in Table 5.1e, and
the desired result can be read directly. However, for large values of p, this approach
is not practical.
If a and b are relatively prime, that is, if gcd(a, b) = 1, then b has a multiplicative
inverse modulo a. In other words, for a positive integer b < a, there exists a
b^{-1} < a such that bb^{-1} = 1 mod a. If a is a prime number and b < a, then clearly
a and b are relatively prime and have a greatest common divisor of 1. We now show
that we can easily compute b^{-1} using the extended Euclidean algorithm.
We repeat here Equation (2.7), which we showed can be solved with the ex-
tended Euclidean algorithm:
ax + by = d = gcd(a, b)
Now, if gcd(a, b) = 1, then we have ax + by = 1. Using the basic equalities of
modular arithmetic, defined in Section 2.3, we can say
[(ax mod a) + (by mod a)] mod a = 1 mod a
0 + (by mod a) = 1
But if by mod a = 1, then y = b^{-1}. Thus, applying the extended Euclidean
algorithm to Equation (2.7) yields the value of the multiplicative inverse of b if
gcd(a, b) = 1.
Consider the example that was shown in Table 2.4. Here we have a = 1759,
which is a prime number, and b = 550. The solution of the equation
1759x + 550y = d yields a value of y = 355. Thus, b^{-1} = 355. To verify, we calculate
550 * 355 mod 1759 = 195250 mod 1759 = 1.
More generally, the extended Euclidean algorithm can be used to find a
multiplicative inverse in Zn for any n. If we apply the extended Euclidean algorithm
to the equation nx + by = d, and the algorithm yields d = 1, then y = b^{-1} in Zn.
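The procedure just described can be sketched as follows. This is one common iterative formulation of the extended Euclidean algorithm, not necessarily the layout of Table 2.4; the function names `extended_gcd` and `mod_inverse` are assumptions.

```python
def extended_gcd(a, b):
    """Return (d, x, y) such that a*x + b*y = d = gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b != 0:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def mod_inverse(b, n):
    """Return b^{-1} in Zn by solving n*x + b*y = 1, or raise if gcd(n, b) != 1."""
    d, _, y = extended_gcd(n, b)
    if d != 1:
        raise ValueError("no multiplicative inverse: gcd is not 1")
    return y % n   # reduce y into the range 0 .. n-1

print(extended_gcd(1759, 550))
print(mod_inverse(550, 1759))
```

The final reduction `y % n` matters because the algorithm may return a negative y; the inverse is its residue modulo n.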

+ 0 1 2 3 4 5 6 7
0 0 1 2 3 4 5 6 7
1 1 2 3 4 5 6 7 0
2 2 3 4 5 6 7 0 1
3 3 4 5 6 7 0 1 2
4 4 5 6 7 0 1 2 3
5 5 6 7 0 1 2 3 4
6 6 7 0 1 2 3 4 5
7 7 0 1 2 3 4 5 6
(a) Addition modulo 8
* 0 1 2 3 4 5 6 7
0 0 0 0 0 0 0 0 0
1 0 1 2 3 4 5 6 7
2 0 2 4 6 0 2 4 6
3 0 3 6 1 4 7 2 5
4 0 4 0 4 0 4 0 4
5 0 5 2 7 4 1 6 3
6 0 6 4 2 0 6 4 2
7 0 7 6 5 4 3 2 1
(b) Multiplication modulo 8
w 0 1 2 3 4 5 6 7
-w 0 7 6 5 4 3 2 1
w^{-1} – 1 – 3 – 5 – 7
(c) Additive and multiplicative
inverses modulo 8
+ 0 1 2 3 4 5 6
0 0 1 2 3 4 5 6
1 1 2 3 4 5 6 0
2 2 3 4 5 6 0 1
3 3 4 5 6 0 1 2
4 4 5 6 0 1 2 3
5 5 6 0 1 2 3 4
6 6 0 1 2 3 4 5
(d) Addition modulo 7
* 0 1 2 3 4 5 6
0 0 0 0 0 0 0 0
1 0 1 2 3 4 5 6
2 0 2 4 6 1 3 5
3 0 3 6 2 5 1 4
4 0 4 1 5 2 6 3
5 0 5 3 1 6 4 2
6 0 6 5 4 3 2 1
(e) Multiplication modulo 7
w 0 1 2 3 4 5 6
-w 0 6 5 4 3 2 1
w^{-1} – 1 4 5 2 3 6
(f) Additive and multiplicative
inverses modulo 7
Table 5.1 Arithmetic Modulo 8 and Modulo 7
In this section, we have shown how to construct a finite field of order p, where p is
prime. Specifically, we defined GF(p) with the following properties.
1. GF(p) consists of p elements.
2. The binary operations + and * are defined over the set. The operations of
addition, subtraction, multiplication, and division can be performed without
leaving the set. Each element of the set other than 0 has a multiplicative in-
verse, and division is performed by multiplication by the multiplicative inverse.
We have shown that the elements of GF(p) are the integers {0, 1, ..., p – 1}
and that the arithmetic operations are addition and multiplication mod p.

Before continuing our discussion of finite fields, we need to introduce the interest-
ing subject of polynomial arithmetic. We are concerned with polynomials in a single
variable x, and we can distinguish three classes of polynomial arithmetic (Figure 5.4).
■ Ordinary polynomial arithmetic, using the basic rules of algebra.
■ Polynomial arithmetic in which the arithmetic on the coefficients is performed
modulo p; that is, the coefficients are in GF(p).
■ Polynomial arithmetic in which the coefficients are in GF(p), and the poly-
nomials are defined modulo a polynomial m(x) whose highest power is some
integer n.
This section examines the first two classes, and the next section covers the
last class.
Ordinary Polynomial Arithmetic
A polynomial of degree n (integer n ≥ 0) is an expression of the form
f(x) = a_n x^n + a_{n-1} x^{n-1} + ... + a_1 x + a_0 = \sum_{i=0}^{n} a_i x^i
where the a_i are elements of some designated set of numbers S, called the coefficient
set, and a_n ≠ 0. We say that such polynomials are defined over the coefficient set S.
A zero-degree polynomial is called a constant polynomial and is simply an
element of the set of coefficients. An nth-degree polynomial is said to be a monic
polynomial if a_n = 1.
In the context of abstract algebra, we are usually not interested in evaluating a
polynomial for a particular value of x [e.g., f(7)]. To emphasize this point, the vari-
able x is sometimes referred to as the indeterminate.
Polynomial arithmetic includes the operations of addition, subtraction, and
multiplication. These operations are defined in a natural way as though the variable
x was an element of S. Division is similarly defined, but requires that S be a field.
Examples of fields include the real numbers, rational numbers, and Zp for p prime.
Note that the set of all integers is not a field and does not support polynomial
division.
[Figure 5.4 Treatment of Polynomials: x in f(x) is treated either as a variable evaluated for a particular value of x, or as an indeterminate; in the latter case, arithmetic on coefficients may be performed modulo p, and polynomials may additionally be defined modulo a polynomial m(x)]
Addition and subtraction are performed by adding or subtracting corresponding
coefficients. Thus, if
f(x) = \sum_{i=0}^{n} a_i x^i;  g(x) = \sum_{i=0}^{m} b_i x^i;  n ≥ m
then addition is defined as
f(x) + g(x) = \sum_{i=0}^{m} (a_i + b_i) x^i + \sum_{i=m+1}^{n} a_i x^i
and multiplication is defined as
f(x) * g(x) = \sum_{i=0}^{n+m} c_i x^i
where
c_k = a_0 b_k + a_1 b_{k-1} + ... + a_{k-1} b_1 + a_k b_0
In the last formula, we treat a_i as zero for i > n and b_i as zero for i > m. Note that
the degree of the product is equal to the sum of the degrees of the two polynomials.
As an example, let f(x) = x^3 + x^2 + 2 and g(x) = x^2 – x + 1, where S is the set
of integers. Then
f(x) + g(x) = x^3 + 2x^2 – x + 3
f(x) – g(x) = x^3 + x + 1
f(x) * g(x) = x^5 + 3x^2 – 2x + 2
Figures 5.5a through 5.5c show the manual calculations. We comment on division
subsequently.
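The three results above can be reproduced with a short sketch that stores each polynomial as a list of coefficients, lowest degree first. The representation and the function names `poly_add`, `poly_sub`, and `poly_mul` are assumptions made for illustration.

```python
from itertools import zip_longest

def poly_add(f, g):
    """Add corresponding coefficients, treating missing ones as zero."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]

def poly_sub(f, g):
    """Subtract corresponding coefficients, treating missing ones as zero."""
    return [a - b for a, b in zip_longest(f, g, fillvalue=0)]

def poly_mul(f, g):
    """Each product a_i * b_j contributes to the coefficient of x^(i+j)."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c

f = [2, 0, 1, 1]    # x^3 + x^2 + 2
g = [1, -1, 1]      # x^2 - x + 1
print(poly_add(f, g))
print(poly_sub(f, g))
print(poly_mul(f, g))
```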
Polynomial Arithmetic with Coefficients in Zp
Let us now consider polynomials in which the coefficients are elements of some
field F; we refer to this as a polynomial over the field F. In this case, it is easy to
show that the set of such polynomials is a ring, referred to as a polynomial ring. That
is, if we consider each distinct polynomial to be an element of the set, then that set
is a ring.5
When polynomial arithmetic is performed on polynomials over a field, then
division is possible. Note that this does not mean that exact division is possible. Let
us clarify this distinction. Within a field, given two elements a and b, the quotient
a/b is also an element of the field. However, given a ring R that is not a field, in general,
division will result in both a quotient and a remainder; this is not exact division.
5In fact, the set of polynomials whose coefficients are elements of a commutative ring forms a polynomial
ring, but that is of no interest in the present context.
[Figure 5.5 Examples of Polynomial Arithmetic: (a) Addition, (b) Subtraction, (c) Multiplication, (d) Division]
Consider the division 5/3 within a set S. If S is the set of rational numbers, which
is a field, then the result is simply expressed as 5/3 and is an element of S. Now
suppose that S is the field Z7. In this case, we calculate (using Table 5.1f)
5/3 = (5 * 3^{-1}) mod 7 = (5 * 5) mod 7 = 4
which is an exact solution. Finally, suppose that S is the set of integers, which is a
ring but not a field. Then 5/3 produces a quotient of 1 and a remainder of 2:
5/3 = 1 + 2/3
5 = 1 * 3 + 2
Thus, division is not exact over the set of integers.
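The three cases of 5/3 can be compared directly. This sketch assumes Python 3.8 or later, where the built-in `pow` accepts a negative exponent together with a modulus; the variable names are illustrative.

```python
from fractions import Fraction

# Rational numbers (a field): the quotient is an element of the set.
rational_quotient = Fraction(5, 3)

# Z7 (a field): multiply by the multiplicative inverse of 3 modulo 7.
z7_quotient = (5 * pow(3, -1, 7)) % 7

# Integers (a ring, not a field): a quotient and a remainder.
integer_quotient, integer_remainder = divmod(5, 3)

print(rational_quotient, z7_quotient, integer_quotient, integer_remainder)
```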
Now, if we attempt to perform polynomial division over a coefficient set that
is not a field, we find that division is not always defined.
If the coefficient set is the integers, then (5x^2)/(3x) does not have a solution,
because it would require a coefficient with a value of 5/3, which is not in the coefficient
set. Suppose that we perform the same polynomial division over Z7. Then
we have (5x^2)/(3x) = 4x, which is a valid polynomial over Z7.
However, as we demonstrate presently, even if the coefficient set is a field,
polynomial division is not necessarily exact. In general, division will produce a quo-
tient and a remainder. We can restate the division algorithm of Equation (2.1) for
polynomials over a field as follows. Given polynomials f(x) of degree n and g(x)

of degree m, with n ≥ m, if we divide f(x) by g(x), we get a quotient q(x) and a
remainder r(x) that obey the relationship
f(x) = q(x)g(x) + r(x) (5.2)
with polynomial degrees:
Degree f(x) = n
Degree g(x) = m
Degree q(x) = n – m
Degree r(x) ≤ m – 1
With the understanding that remainders are allowed, we can say that poly-
nomial division is possible if the coefficient set is a field. One common technique
used for polynomial division is polynomial long division, similar to long division for
integers. Examples of this are shown subsequently.
In an analogy to integer arithmetic, we can write f(x) mod g(x) for the remain-
der r(x) in Equation (5.2). That is, r(x) = f(x) mod g(x). If there is no remainder
[i.e., r(x) = 0], then we can say g(x) divides f(x), written as g(x) | f(x). Equivalently,
we can say that g(x) is a factor of f(x) or g(x) is a divisor of f(x).
For the preceding example [f(x) = x^3 + x^2 + 2 and g(x) = x^2 – x + 1], f(x)/g(x)
produces a quotient of q(x) = x + 2 and a remainder r(x) = x, as shown in
Figure 5.5d. This is easily verified by noting that
q(x)g(x) + r(x) = (x + 2)(x^2 – x + 1) + x = (x^3 + x^2 – x + 2) + x
= x^3 + x^2 + 2 = f(x)
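Polynomial long division over a field can be sketched as below. The coefficient field is assumed to be the rationals (via `fractions.Fraction`), polynomials are coefficient lists with the lowest degree first, and the function name `poly_divmod` is hypothetical. The divisor is assumed to have a nonzero leading coefficient.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and degree(r) < degree(g)."""
    r = [Fraction(c) for c in f]
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while len(r) >= len(g):
        shift = len(r) - len(g)          # degree of the next quotient term
        factor = r[-1] / g[-1]           # division requires a field
        q[shift] = factor
        for i, c in enumerate(g):
            r[shift + i] -= factor * c   # subtract factor * x^shift * g(x)
        while r and r[-1] == 0:          # drop cancelled leading terms
            r.pop()
    return q, r

q, r = poly_divmod([2, 0, 1, 1], [1, -1, 1])   # (x^3 + x^2 + 2) / (x^2 - x + 1)
print(q, r)
```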
For our purposes, polynomials over GF(2) are of most interest. Recall from
Section 5.4 that in GF(2), addition is equivalent to the XOR operation, and multi-
plication is equivalent to the logical AND operation. Further, addition and subtrac-
tion are equivalent mod 2:
1 + 1 = 1 – 1 = 0
1 + 0 = 1 – 0 = 1
0 + 1 = 0 – 1 = 1
Figure 5.6 shows an example of polynomial arithmetic over GF(2). For
f(x) = (x^7 + x^5 + x^4 + x^3 + x + 1) and g(x) = (x^3 + x + 1), the figure shows
f(x) + g(x); f(x) – g(x); f(x) * g(x); and f(x)/g(x). Note that g(x) | f(x).
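Polynomials over GF(2) map naturally onto integers whose bit i holds the coefficient of x^i. Under that assumed encoding, the operations in this example can be sketched as follows; the function names are illustrative, and the divisor is assumed to be nonzero.

```python
def gf2_poly_add(f, g):
    """Addition and subtraction coincide over GF(2): XOR of coefficients."""
    return f ^ g

def gf2_poly_mul(f, g):
    """Shift-and-XOR (carry-less) multiplication."""
    result = 0
    while g:
        if g & 1:
            result ^= f
        f <<= 1
        g >>= 1
    return result

def gf2_poly_divmod(f, g):
    """Long division: cancel the leading term of f until deg(f) < deg(g)."""
    q = 0
    while f.bit_length() >= g.bit_length():
        shift = f.bit_length() - g.bit_length()
        q ^= 1 << shift
        f ^= g << shift
    return q, f

f = 0b10111011   # x^7 + x^5 + x^4 + x^3 + x + 1
g = 0b1011       # x^3 + x + 1
print(bin(gf2_poly_add(f, g)), bin(gf2_poly_mul(f, g)), gf2_poly_divmod(f, g))
```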
A polynomial f(x) over a field F is called irreducible if and only if f(x) can-
not be expressed as a product of two polynomials, both over F, and both of degree
lower than that of f(x). By analogy to integers, an irreducible polynomial is also
called a prime polynomial.
The polynomial f(x) = x^4 + 1 over GF(2) (see footnote 6) is reducible, because
x^4 + 1 = (x + 1)(x^3 + x^2 + x + 1).
6 In the remainder of this chapter, unless otherwise noted, all examples are of polynomials over GF(2).

Consider the polynomial f(x) = x^3 + x + 1. It is clear by inspection that x is not
a factor of f(x). We easily show that x + 1 is not a factor of f(x):
              x^2 + x
x + 1 ) x^3       + x + 1
        x^3 + x^2
              x^2 + x
              x^2 + x
                        1
The remainder is 1, not 0. Thus, f(x) has no factors of degree 1. But it is clear by
inspection that if f(x) is reducible, it must have one factor of degree 2 and one factor
of degree 1. Therefore, f(x) is irreducible.
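The same reasoning generalizes to a brute-force test: a polynomial over GF(2) is irreducible if no polynomial of degree 1 through half its degree divides it. The sketch below assumes the bit-mask encoding (bit i is the coefficient of x^i); the function names are hypothetical, and trial division is only practical for small degrees.

```python
def gf2_poly_mod(f, g):
    """Remainder of f divided by g over GF(2)."""
    while f.bit_length() >= g.bit_length():
        f ^= g << (f.bit_length() - g.bit_length())
    return f

def is_irreducible(f):
    """Trial division by every polynomial of degree 1 .. deg(f) // 2."""
    degree = f.bit_length() - 1
    if degree < 1:
        return False                      # constants are not irreducible polynomials
    for g in range(2, 1 << (degree // 2 + 1)):
        if gf2_poly_mod(f, g) == 0:
            return False
    return True

print(is_irreducible(0b1011))    # x^3 + x + 1
print(is_irreducible(0b10001))   # x^4 + 1 = (x + 1)(x^3 + x^2 + x + 1)
```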
[Figure 5.6 Examples of Polynomial Arithmetic over GF(2): (a) Addition, (b) Subtraction, (c) Multiplication, (d) Division]

Finding the Greatest Common Divisor
We can extend the analogy between polynomial arithmetic over a field and integer
arithmetic by defining the greatest common divisor as follows. The polynomial c(x)
is said to be the greatest common divisor of a(x) and b(x) if the following are true.
1. c(x) divides both a(x) and b(x).
2. Any divisor of a(x) and b(x) is a divisor of c(x).
An equivalent definition is the following: gcd[a(x), b(x)] is the polynomial of
maximum degree that divides both a(x) and b(x).
We can adapt the Euclidean algorithm to compute the greatest common divisor
of two polynomials. Recall Equation (2.6), from Chapter 2, which is the basis of the
Euclidean algorithm: gcd(a, b) = gcd(b, a mod b). This equality can be rewritten as the
following equation:
gcd[a(x), b(x)] = gcd[b(x), a(x) mod b(x)] (5.3)
Equation (5.3) can be used repetitively to determine the greatest common divisor.
Compare the following scheme to the definition of the Euclidean algorithm for integers.
Euclidean Algorithm for Polynomials
Calculate                                   Which satisfies
r_1(x) = a(x) mod b(x)                      a(x) = q_1(x)b(x) + r_1(x)
r_2(x) = b(x) mod r_1(x)                    b(x) = q_2(x)r_1(x) + r_2(x)
r_3(x) = r_1(x) mod r_2(x)                  r_1(x) = q_3(x)r_2(x) + r_3(x)
...                                         ...
r_n(x) = r_{n-2}(x) mod r_{n-1}(x)          r_{n-2}(x) = q_n(x)r_{n-1}(x) + r_n(x)
r_{n+1}(x) = r_{n-1}(x) mod r_n(x) = 0      r_{n-1}(x) = q_{n+1}(x)r_n(x) + 0
                                            d(x) = gcd(a(x), b(x)) = r_n(x)
At each iteration, we have d(x) = gcd(r_{i+1}(x), r_i(x)) until finally
d(x) = gcd(r_n(x), 0) = r_n(x). Thus, we can find the greatest common divisor of two
polynomials by repetitive application of the division algorithm. This is the Euclidean
algorithm for polynomials. The algorithm assumes that the degree of a(x) is greater
than the degree of b(x).
Find gcd[a(x), b(x)] for a(x) = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 and b(x) =
x^4 + x^2 + x + 1. First, we divide a(x) by b(x):
                                            x^2 + x
x^4 + x^2 + x + 1 ) x^6 + x^5 + x^4 + x^3 + x^2 + x + 1
                    x^6       + x^4 + x^3 + x^2
                          x^5                   + x + 1
                          x^5       + x^3 + x^2 + x
                                      x^3 + x^2     + 1

We began this section with a discussion of arithmetic with ordinary polynomials. In
ordinary polynomial arithmetic, the variable is not evaluated; that is, we do not plug
a value in for the variable of the polynomials. Instead, arithmetic operations are
performed on polynomials (addition, subtraction, multiplication, division) using the
ordinary rules of algebra. Polynomial division is not allowed unless the coefficients
are elements of a field.
Next, we discussed polynomial arithmetic in which the coefficients are ele-
ments of GF(p). In this case, polynomial addition, subtraction, multiplication, and
division are allowed. However, division is not exact; that is, in general division re-
sults in a quotient and a remainder.
Finally, we showed that the Euclidean algorithm can be extended to find the
greatest common divisor of two polynomials whose coefficients are elements of a
field.
All of the material in this section provides a foundation for the following section,
in which polynomials are used to define finite fields of order p^n.
Earlier in this chapter, we mentioned that the order of a finite field must be of the
form p^n, where p is a prime and n is a positive integer. In Section 5.4, we looked at
the special case of finite fields with order p. We found that, using modular arithmetic
in Zp, all of the axioms for a field (Figure 5.2) are satisfied. However, for a set of
p^n elements with n > 1, arithmetic modulo p^n does not produce a field. In this section,
we show what structure satisfies the axioms for a field in a set with p^n elements and
concentrate on GF(2^n).
Virtually all encryption algorithms, both symmetric and asymmetric, involve arith-
metic operations on integers. If one of the operations that is used in the algorithm is
division, then we need to work in arithmetic defined over a field. For convenience
This yields r_1(x) = x^3 + x^2 + 1 and q_1(x) = x^2 + x.
Then, we divide b(x) by r_1(x).
                                  x + 1
x^3 + x^2 + 1 ) x^4       + x^2 + x + 1
                x^4 + x^3       + x
                      x^3 + x^2     + 1
                      x^3 + x^2     + 1
This yields r_2(x) = 0 and q_2(x) = x + 1.
Therefore, gcd[a(x), b(x)] = r_1(x) = x^3 + x^2 + 1.
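The Euclidean algorithm for polynomials over GF(2) reduces to a short loop once polynomials are encoded as integers (bit i is the coefficient of x^i). That encoding and the function names below are assumptions for illustration.

```python
def gf2_poly_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2)."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_poly_gcd(a, b):
    """Repeatedly apply gcd[a(x), b(x)] = gcd[b(x), a(x) mod b(x)]."""
    while b:
        a, b = b, gf2_poly_mod(a, b)
    return a

a = 0b1111111   # x^6 + x^5 + x^4 + x^3 + x^2 + x + 1
b = 0b10111     # x^4 + x^2 + x + 1
print(bin(gf2_poly_mod(a, b)))   # first remainder r_1(x)
print(bin(gf2_poly_gcd(a, b)))
```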

and for implementation efficiency, we would also like to work with integers that fit
exactly into a given number of bits with no wasted bit patterns. That is, we wish to
work with integers in the range 0 through 2^n – 1, which fit into an n-bit word.
Suppose we wish to define a conventional encryption algorithm that operates on
data 8 bits at a time, and we wish to perform division. With 8 bits, we can repre-
sent integers in the range 0 through 255. However, 256 is not a prime number, so
that if arithmetic is performed in Z256 (arithmetic modulo 256), this set of inte-
gers will not be a field. The closest prime number less than 256 is 251. Thus, the
set Z251, using arithmetic modulo 251, is a field. However, in this case the 8-bit
patterns representing the integers 251 through 255 would not be used, resulting
in inefficient use of storage.
As the preceding example points out, if all arithmetic operations are to be
used and we wish to represent a full range of integers in n bits, then arithmetic
modulo 2^n will not work. Equivalently, the set of integers modulo 2^n, for n > 1, is
not a field. Furthermore, even if the encryption algorithm uses only addition and
multiplication, but not division, the use of the set Z_{2^n} is questionable, as the following
example illustrates.
Suppose we wish to use 3-bit blocks in our encryption algorithm and use only the
operations of addition and multiplication. Then arithmetic modulo 8 is well defined,
as shown in Table 5.1. However, note that in the multiplication table, the nonzero
integers do not appear an equal number of times. For example, there are only four
occurrences of 3, but twelve occurrences of 4. On the other hand, as was mentioned,
there are finite fields of the form GF(2^n), so there is in particular a finite field of
order 2^3 = 8. Arithmetic for this field is shown in Table 5.2. In this case, the number
of occurrences of the nonzero integers is uniform for multiplication. To summarize,
Integer                   1  2  3  4   5  6  7
Occurrences in Z8         4  8  4  12  4  8  4
Occurrences in GF(2^3)    7  7  7  7   7  7  7
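The Z8 row of this summary can be reproduced by tallying every entry of the multiplication table modulo 8. This is a minimal sketch; the function name `product_counts` is an assumption.

```python
from collections import Counter

def product_counts(n):
    """Count how often each value appears in the multiplication table modulo n."""
    return Counter((a * b) % n for a in range(n) for b in range(n))

counts = product_counts(8)
print([counts[v] for v in range(1, 8)])
```

Running the same tally for a prime modulus such as 7 gives a uniform count for the nonzero values, which is the behavior the text attributes to a field.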
For the moment, let us set aside the question of how the matrices of Table 5.2
were constructed and instead make some observations.
1. The addition and multiplication tables are symmetric about the main diago-
nal, in conformance to the commutative property of addition and multiplica-
tion. This property is also exhibited in Table 5.1, which uses mod 8 arithmetic.
2. All the nonzero elements defined by Table 5.2 have a multiplicative inverse,
unlike the case with Table 5.1.
3. The scheme defined by Table 5.2 satisfies all the requirements for a finite
field. Thus, we can refer to this scheme as GF(2^3).
4. For convenience, we show the 3-bit assignment used for each of the elements
of GF(2^3).

Intuitively, it would seem that an algorithm that maps the integers unevenly
onto themselves might be cryptographically weaker than one that provides a uni-
form mapping. That is, a cryptanalytic technique might be able to exploit the fact
that some integers occur more frequently and some less frequently in the ciphertext.
Thus, the finite fields of the form GF(2^n) are attractive for cryptographic algorithms.
To summarize, we are looking for a set consisting of 2^n elements, together
with a definition of addition and multiplication over the set that define a field. We
can assign a unique integer in the range 0 through 2^n – 1 to each element of the
set. Keep in mind that we will not use modular arithmetic, as we have seen that this
does not result in a field. Instead, we will show how polynomial arithmetic provides
a means for constructing the desired field.
Modular Polynomial Arithmetic
Consider the set S of all polynomials of degree n – 1 or less over the field Zp. Thus,
each polynomial has the form
f(x) = a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + ... + a_1 x + a_0 = \sum_{i=0}^{n-1} a_i x^i
000 001 010 011 100 101 110 111
+ 0 1 2 3 4 5 6 7
000 0 0 1 2 3 4 5 6 7
001 1 1 0 3 2 5 4 7 6
010 2 2 3 0 1 6 7 4 5
011 3 3 2 1 0 7 6 5 4
100 4 4 5 6 7 0 1 2 3
101 5 5 4 7 6 1 0 3 2
110 6 6 7 4 5 2 3 0 1
111 7 7 6 5 4 3 2 1 0
(a) Addition
000 001 010 011 100 101 110 111
* 0 1 2 3 4 5 6 7
000 0 0 0 0 0 0 0 0 0
001 1 0 1 2 3 4 5 6 7
010 2 0 2 4 6 3 1 7 5
011 3 0 3 6 5 7 4 1 2
100 4 0 4 3 7 6 2 5 1
101 5 0 5 1 4 2 7 3 6
110 6 0 6 7 1 5 3 2 4
111 7 0 7 5 2 1 6 4 3
(b) Multiplication
w -w w^{-1}
0 0 –
1 1 1
2 2 5
3 3 6
4 4 7
5 5 2
6 6 3
7 7 4
(c) Additive and multiplicative inverses
Table 5.2 Arithmetic in GF(2^3)

where each a_i takes on a value in the set {0, 1, ..., p – 1}. There are a total of p^n
different polynomials in S.
For p = 3 and n = 2, the 3^2 = 9 polynomials in the set are
0, 1, 2, x, x + 1, x + 2, 2x, 2x + 1, 2x + 2
For p = 2 and n = 3, the 2^3 = 8 polynomials in the set are
0, 1, x, x + 1, x^2, x^2 + 1, x^2 + x, x^2 + x + 1
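The count p^n follows because each of the n coefficients independently takes one of p values. A short sketch enumerates the sets above; the tuple layout (highest-degree coefficient first) and the function name `polynomials` are assumptions.

```python
from itertools import product

def polynomials(p, n):
    """All coefficient tuples (a_{n-1}, ..., a_1, a_0) with entries in Zp."""
    return list(product(range(p), repeat=n))

print(len(polynomials(3, 2)), len(polynomials(2, 3)))
print(polynomials(3, 2))   # (0, 0) is 0, (0, 1) is 1, ..., (2, 2) is 2x + 2
```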
With the appropriate definition of arithmetic operations, each such set S is a
finite field. The definition consists of the following elements.
1. Arithmetic follows the ordinary rules of polynomial arithmetic using the basic
rules of algebra, with the following two refinements.
2. Arithmetic on the coefficients is performed modulo p. That is, we use the rules
of arithmetic for the finite field Zp.
3. If multiplication results in a polynomial of degree greater than n – 1, then the
polynomial is reduced modulo some irreducible polynomial m(x) of degree n.
That is, we divide by m(x) and keep the remainder. For a polynomial f(x), the
remainder is expressed as r(x) = f(x) mod m(x).
The Advanced Encryption Standard (AES) uses arithmetic in the finite field
GF(2^8), with the irreducible polynomial m(x) = x^8 + x^4 + x^3 + x + 1. Consider
the two polynomials f(x) = x^6 + x^4 + x^2 + x + 1 and g(x) = x^7 + x + 1. Then
f(x) + g(x) = x^6 + x^4 + x^2 + x + 1 + x^7 + x + 1
            = x^7 + x^6 + x^4 + x^2
f(x) * g(x) = x^13 + x^11 + x^9 + x^8 + x^7
            + x^7 + x^5 + x^3 + x^2 + x
            + x^6 + x^4 + x^2 + x + 1
            = x^13 + x^11 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1
Dividing the product by m(x):
                            x^5 + x^3
x^8 + x^4 + x^3 + x + 1 ) x^13 + x^11 + x^9 + x^8 + x^6 + x^5 + x^4 + x^3 + 1
                          x^13 + x^9 + x^8 + x^6 + x^5
                          x^11 + x^4 + x^3
                          x^11 + x^7 + x^6 + x^4 + x^3
                          x^7 + x^6 + 1
Therefore, f(x) * g(x) mod m(x) = x^7 + x^6 + 1.
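
The worked example above can be checked mechanically. The following is a minimal sketch, assuming each polynomial over GF(2) is stored as a Python integer whose bit i holds the coefficient of x^i. The helper names clmul and polymod are illustrative and do not come from the text.

```python
def clmul(a: int, b: int) -> int:
    """Multiply two GF(2) polynomials (bit i = coefficient of x^i), no reduction."""
    product = 0
    while b:
        if b & 1:          # this coefficient of b is 1: add (XOR) the shifted a
            product ^= a
        a <<= 1            # multiply a by x
        b >>= 1
    return product


def polymod(a: int, m: int) -> int:
    """Remainder of a(x) divided by m(x), coefficients modulo 2 (m must be nonzero)."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        # cancel the leading term of a with a suitably shifted copy of m
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


F = 0b01010111      # f(x) = x^6 + x^4 + x^2 + x + 1
G = 0b10000011      # g(x) = x^7 + x + 1
M = 0b100011011     # m(x) = x^8 + x^4 + x^3 + x + 1

total = F ^ G                      # addition is coefficient-wise XOR
product = clmul(F, G)              # unreduced product, degree 13
remainder = polymod(product, M)    # f(x) * g(x) mod m(x)
```

Storing coefficients as bits makes subtraction identical to addition, which is why the division loop only ever uses XOR.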

As with ordinary modular arithmetic, we have the notion of a set of residues
in modular polynomial arithmetic. The set of residues modulo m(x), an nth-degree
polynomial, consists of p^n elements. Each of these elements is represented by one of
the p^n polynomials of degree m < n.
The residue class [x + 1], (mod m(x)), consists of all polynomials a(x) such that
a(x) ≡ (x + 1) (mod m(x)). Equivalently, the residue class [x + 1] consists of all
polynomials a(x) that satisfy the equality a(x) mod m(x) = x + 1.
It can be shown that the set of all polynomials modulo an irreducible nth-
degree polynomial m(x) satisfies the axioms in Figure 5.2, and thus forms a finite
field. Furthermore, all finite fields of a given order are isomorphic; that is, any two
finite-field structures of a given order have the same structure, but the representa-
tion or labels of the elements may be different.
To construct the finite field GF(2^3), we need to choose an irreducible polynomial
of degree 3. There are only two such polynomials: (x^3 + x^2 + 1) and
(x^3 + x + 1). Using the latter, Table 5.3 shows the addition and multiplication
tables for GF(2^3). Note that this set of tables has the identical structure to those
of Table 5.2. Thus, we have succeeded in finding a way to define a field of order 2^3.
We can now read additions and multiplications from the table easily. For example,
consider binary 100 + 010 = 110. This is equivalent to x^2 + x. Also consider
100 * 010 = 011, which is equivalent to x^2 * x = x^3 and reduces to x + 1. That
is, x^3 mod (x^3 + x + 1) = x + 1, which is equivalent to 011.
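
The tables for GF(2^3) can also be generated rather than read. The sketch below assumes the 3-bit encoding used in Table 5.2 (bit i is the coefficient of x^i) and the modulus x^3 + x + 1; the names gf_mul, ADD, MUL, and INV are hypothetical.

```python
N = 3          # field is GF(2^3)
M = 0b1011     # m(x) = x^3 + x + 1


def gf_mul(a: int, b: int, m: int = M, n: int = N) -> int:
    """Multiply two field elements, reducing modulo m(x) after each shift."""
    result = 0
    for _ in range(n):
        if b & 1:
            result ^= a        # add the current multiple of a
        b >>= 1
        a <<= 1                # multiply a by x
        if a & (1 << n):       # degree reached n: subtract (XOR) m(x)
            a ^= m
    return result


# Addition is XOR; multiplication uses gf_mul. Rows and columns are 0..7.
ADD = [[a ^ b for b in range(8)] for a in range(8)]
MUL = [[gf_mul(a, b) for b in range(8)] for a in range(8)]

# Multiplicative inverse of each nonzero element, found by table search.
INV = {a: b for a in range(1, 8) for b in range(1, 8) if MUL[a][b] == 1}
```

Looking up MUL[0b100][0b010] reproduces the product 011 discussed in the text, and INV matches the inverse column of Table 5.2.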
Finding the Multiplicative Inverse
Just as the Euclidean algorithm can be adapted to find the greatest common divisor
of two polynomials, the extended Euclidean algorithm can be adapted to find the
multiplicative inverse of a polynomial. Specifically, the algorithm will find the mul-
tiplicative inverse of b(x) modulo a(x) if the degree of b(x) is less than the degree of
a(x) and gcd[a(x), b(x)] = 1. If a(x) is an irreducible polynomial, then it has no fac-
tor other than itself or 1, so that gcd[a(x), b(x)] = 1. The algorithm can be charac-
terized in the same way as we did for the extended Euclidean algorithm for integers.
Given polynomials a(x) and b(x) with the degree of a(x) greater than the degree
of b(x), we wish to solve the following equation for the values v(x), w(x), and d(x),
where d(x) = gcd[a(x), b(x)]:
a(x)v(x) + b(x)w(x) = d(x)
If d(x) = 1, then w(x) is the multiplicative inverse of b(x) modulo a(x). The calcula-
tions are as follows.



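Extended Euclidean Algorithm for Polynomials
(each step lists what to calculate and the relation the result satisfies)

Step –1:
  Calculate: r_{-1}(x) = a(x)
  Calculate: v_{-1}(x) = 1; w_{-1}(x) = 0
  Which satisfies: a(x) = a(x)v_{-1}(x) + b(x)w_{-1}(x)
Step 0:
  Calculate: r_0(x) = b(x)
  Calculate: v_0(x) = 0; w_0(x) = 1
  Which satisfies: b(x) = a(x)v_0(x) + b(x)w_0(x)
Step 1:
  Calculate: r_1(x) = a(x) mod b(x); q_1(x) = quotient of a(x)/b(x)
  Which satisfies: a(x) = q_1(x)b(x) + r_1(x)
  Calculate: v_1(x) = v_{-1}(x) – q_1(x)v_0(x) = 1; w_1(x) = w_{-1}(x) – q_1(x)w_0(x) = –q_1(x)
  Which satisfies: r_1(x) = a(x)v_1(x) + b(x)w_1(x)
Step 2:
  Calculate: r_2(x) = b(x) mod r_1(x); q_2(x) = quotient of b(x)/r_1(x)
  Which satisfies: b(x) = q_2(x)r_1(x) + r_2(x)
  Calculate: v_2(x) = v_0(x) – q_2(x)v_1(x); w_2(x) = w_0(x) – q_2(x)w_1(x)
  Which satisfies: r_2(x) = a(x)v_2(x) + b(x)w_2(x)
Step 3:
  Calculate: r_3(x) = r_1(x) mod r_2(x); q_3(x) = quotient of r_1(x)/r_2(x)
  Which satisfies: r_1(x) = q_3(x)r_2(x) + r_3(x)
  Calculate: v_3(x) = v_1(x) – q_3(x)v_2(x); w_3(x) = w_1(x) – q_3(x)w_2(x)
  Which satisfies: r_3(x) = a(x)v_3(x) + b(x)w_3(x)
...
Step n:
  Calculate: r_n(x) = r_{n-2}(x) mod r_{n-1}(x); q_n(x) = quotient of r_{n-2}(x)/r_{n-1}(x)
  Which satisfies: r_{n-2}(x) = q_n(x)r_{n-1}(x) + r_n(x)
  Calculate: v_n(x) = v_{n-2}(x) – q_n(x)v_{n-1}(x); w_n(x) = w_{n-2}(x) – q_n(x)w_{n-1}(x)
  Which satisfies: r_n(x) = a(x)v_n(x) + b(x)w_n(x)
Step n + 1:
  Calculate: r_{n+1}(x) = r_{n-1}(x) mod r_n(x) = 0; q_{n+1}(x) = quotient of r_{n-1}(x)/r_n(x)
  Which satisfies: r_{n-1}(x) = q_{n+1}(x)r_n(x) + 0
Result:
  d(x) = gcd(a(x), b(x)) = r_n(x)
  v(x) = v_n(x); w(x) = w_n(x)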
Table 5.4 shows the calculation of the multiplicative inverse of (x^7 + x + 1)
mod (x^8 + x^4 + x^3 + x + 1). The result is that (x^7 + x + 1)^{-1} = (x^7). That is,
(x^7 + x + 1)(x^7) ≡ 1 (mod (x^8 + x^4 + x^3 + x + 1)).
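
The extended Euclidean iteration for polynomials over GF(2) can be sketched as follows. This is an illustrative sketch, not the book's code: polynomials are assumed to be stored as integers (bit i is the coefficient of x^i), only the w sequence is tracked because only the inverse is wanted, and the function names are hypothetical.

```python
def deg(p: int) -> int:
    """Degree of a GF(2) polynomial stored as an integer (deg(0) is -1)."""
    return p.bit_length() - 1


def clmul(a: int, b: int) -> int:
    """Product of two GF(2) polynomials, without reduction."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def poly_divmod(a: int, b: int) -> tuple[int, int]:
    """Quotient and remainder of a(x) / b(x) over GF(2); b must be nonzero."""
    q = 0
    while a and deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        q ^= 1 << shift
        a ^= b << shift
    return q, a


def poly_inverse(b: int, a: int) -> int:
    """Multiplicative inverse of b(x) modulo a(x), assuming gcd(a, b) = 1."""
    r_prev, r = a, b          # r_{-1}, r_0
    w_prev, w = 0, 1          # w_{-1}, w_0
    while r != 0:
        q, remainder = poly_divmod(r_prev, r)
        r_prev, r = r, remainder
        # subtraction over GF(2) is XOR, so w_n = w_{n-2} XOR q_n * w_{n-1}
        w_prev, w = w, w_prev ^ clmul(q, w)
    if r_prev != 1:
        raise ValueError("b(x) has no inverse modulo a(x)")
    return w_prev


inverse = poly_inverse(0b10000011, 0b100011011)   # (x^7 + x + 1) mod m(x)
```

The successive values of w inside the loop are x, x^4 + x^3 + x + 1, and x^7, matching the w_1, w_2, w_3 entries of Table 5.4.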
Computational Considerations
A polynomial f(x) in GF(2^n)
f(x) = a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + ... + a_1 x + a_0 = \sum_{i=0}^{n-1} a_i x^i
can be uniquely represented by the sequence of its n binary coefficients
(a_{n-1}, a_{n-2}, ..., a_0). Thus, every polynomial in GF(2^n) can be represented by an
n-bit number.

ADDITION We have seen that addition of polynomials is performed by adding
corresponding coefficients, and, in the case of polynomials over Z_2, addition is just the
XOR operation. So, addition of two polynomials in GF(2^n) corresponds to a bitwise
XOR operation.
Initialization   a(x) = x^8 + x^4 + x^3 + x + 1; v_{-1}(x) = 1; w_{-1}(x) = 0
                 b(x) = x^7 + x + 1; v_0(x) = 0; w_0(x) = 1
Iteration 1      q_1(x) = x; r_1(x) = x^4 + x^3 + x^2 + 1
                 v_1(x) = 1; w_1(x) = x
Iteration 2      q_2(x) = x^3 + x^2 + 1; r_2(x) = x
                 v_2(x) = x^3 + x^2 + 1; w_2(x) = x^4 + x^3 + x + 1
Iteration 3      q_3(x) = x^3 + x^2 + x; r_3(x) = 1
                 v_3(x) = x^6 + x^2 + x + 1; w_3(x) = x^7
Iteration 4      q_4(x) = x; r_4(x) = 0
                 v_4(x) = x^7 + x + 1; w_4(x) = x^8 + x^4 + x^3 + x + 1
Result           d(x) = r_3(x) = gcd(a(x), b(x)) = 1
                 w(x) = w_3(x) = (x^7 + x + 1)^{-1} mod (x^8 + x^4 + x^3 + x + 1) = x^7
Table 5.4 Extended Euclid [(x^8 + x^4 + x^3 + x + 1), (x^7 + x + 1)]
Tables 5.2 and 5.3 show the addition and multiplication tables for GF(2^3) modulo
m(x) = (x^3 + x + 1). Table 5.2 uses the binary representation, and Table 5.3
uses the polynomial representation.
Consider the two polynomials in GF(2^8) from our earlier example:
f(x) = x^6 + x^4 + x^2 + x + 1 and g(x) = x^7 + x + 1.

(x^6 + x^4 + x^2 + x + 1) + (x^7 + x + 1) = x^7 + x^6 + x^4 + x^2   (polynomial notation)
(01010111) ⊕ (10000011) = (11010100)   (binary notation)
{57} ⊕ {83} = {D4}   (hexadecimal notation)⁷

⁷A basic refresher on number systems (decimal, binary, hexadecimal) can be found at the Computer
Science Student Resource Site. Here each of two groups
of 4 bits in a byte is denoted by a single hexadecimal character, and the two characters are enclosed in
brackets.
MULTIPLICATION There is no simple XOR operation that will accomplish
multiplication in GF(2^n). However, a reasonably straightforward, easily implemented
technique is available. We will discuss the technique with reference to GF(2^8) using
m(x) = x^8 + x^4 + x^3 + x + 1, which is the finite field used in AES. The technique
readily generalizes to GF(2^n).
The technique is based on the observation that
x^8 mod m(x) = [m(x) – x^8] = (x^4 + x^3 + x + 1)      (5.4)

A moment’s thought should convince you that Equation (5.4) is true; if you
are not sure, divide it out. In general, in GF(2^n) with an nth-degree polynomial p(x),
we have x^n mod p(x) = [p(x) – x^n].
Now, consider a polynomial in GF(2^8), which has the form
f(x) = b_7x^7 + b_6x^6 + b_5x^5 + b_4x^4 + b_3x^3 + b_2x^2 + b_1x + b_0. If we multiply by x,
we have
x * f(x) = (b_7x^8 + b_6x^7 + b_5x^6 + b_4x^5 + b_3x^4
           + b_2x^3 + b_1x^2 + b_0x) mod m(x)      (5.5)
If b_7 = 0, then the result is a polynomial of degree less than 8, which is already
in reduced form, and no further computation is necessary. If b_7 = 1, then reduction
modulo m(x) is achieved using Equation (5.4):
x * f(x) = (b_6x^7 + b_5x^6 + b_4x^5 + b_3x^4 + b_2x^3 + b_1x^2 + b_0x)
           + (x^4 + x^3 + x + 1)
It follows that multiplication by x (i.e., 00000010) can be implemented as a 1-bit
left shift followed by a conditional bitwise XOR with (00011011), which represents
(x^4 + x^3 + x + 1). To summarize,
x * f(x) = { (b_6b_5b_4b_3b_2b_1b_0 0)                 if b_7 = 0
           { (b_6b_5b_4b_3b_2b_1b_0 0) ⊕ (00011011)    if b_7 = 1      (5.6)
Multiplication by a higher power of x can be achieved by repeated application
of Equation (5.6). By adding intermediate results, multiplication by any constant in
GF(2^8) can be achieved.
In an earlier example, we showed that for f(x) = x^6 + x^4 + x^2 + x + 1, g(x) = x^7 +
x + 1, and m(x) = x^8 + x^4 + x^3 + x + 1, we have f(x) * g(x) mod m(x) = x^7 + x^6 + 1.
Redoing this in binary arithmetic, we need to compute (01010111) * (10000011). First,
we determine the results of multiplication by powers of x:
(01010111) * (00000010) = (10101110)
(01010111) * (00000100) = (01011100)⊕ (00011011) = (01000111)
(01010111) * (00001000) = (10001110)
(01010111) * (00010000) = (00011100)⊕ (00011011) = (00000111)
(01010111) * (00100000) = (00001110)
(01010111) * (01000000) = (00011100)
(01010111) * (10000000) = (00111000)

(01010111) * (10000011) = (01010111) * [(00000001)⊕ (00000010)⊕ (10000000)]
= (01010111)⊕ (10101110)⊕ (00111000) = (11000001)
which is equivalent to x^7 + x^6 + 1.
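
The shift-and-conditional-XOR rule translates directly into code. The following is a minimal sketch for GF(2^8) with the AES modulus; the function names xtime and gf256_mul are assumptions chosen for illustration.

```python
def xtime(b: int) -> int:
    """Multiply a byte by x (that is, by {02}) in GF(2^8) with m(x) = x^8+x^4+x^3+x+1."""
    shifted = (b << 1) & 0xFF            # 1-bit left shift, dropping the x^8 term
    if b & 0x80:                         # b7 was 1, so reduce using x^8 = x^4+x^3+x+1
        shifted ^= 0x1B                  # 0x1B is (00011011)
    return shifted


def gf256_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) by adding the multiples of a selected by the bits of b."""
    result = 0
    while b:
        if b & 1:
            result ^= a                  # add a * x^k for each 1 bit of b
        a = xtime(a)                     # next power of x
        b >>= 1
    return result


# Multiples of (01010111) by successive powers of x, as in the worked example.
multiples = [0x57]
for _ in range(7):
    multiples.append(xtime(multiples[-1]))

product = gf256_mul(0x57, 0x83)
```

Because gf256_mul only shifts and XORs bytes, it needs no polynomial division at all, which is the practical appeal of the technique.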

Using a Generator
An equivalent technique for defining a finite field of the form GF(2^n), using the
same irreducible polynomial, is sometimes more convenient. To begin, we need two
definitions: A generator g of a finite field F of order q (contains q elements) is an
element whose first q – 1 powers generate all the nonzero elements of F. That is,
the elements of F consist of 0, g^0, g^1, ..., g^{q-2}. Consider a field F defined by a
polynomial f(x). An element b contained in F is called a root of the polynomial if
f(b) = 0. Finally, it can be shown that a root g of an irreducible polynomial is a
generator of the finite field defined on that polynomial, provided the polynomial is
primitive; x^3 + x + 1, used in the example that follows, is one such polynomial.
Power            Polynomial       Binary           Decimal (Hex)
Representation   Representation   Representation   Representation
0                0                000              0
g^0 (= g^7)      1                001              1
g^1              g                010              2
g^2              g^2              100              4
g^3              g + 1            011              3
g^4              g^2 + g          110              6
g^5              g^2 + g + 1      111              7
g^6              g^2 + 1          101              5
Table 5.5 Generator for GF(2^3) using x^3 + x + 1
Let us consider the finite field GF(2^3), defined over the irreducible polynomial
x^3 + x + 1, discussed previously. Thus, the generator g must satisfy
f(g) = g^3 + g + 1 = 0. Keep in mind, as discussed previously, that we need not
find a numerical solution to this equality. Rather, we deal with polynomial
arithmetic in which arithmetic on the coefficients is performed modulo 2. Therefore,
the solution to the preceding equality is g^3 = –g – 1 = g + 1. We now show
that g in fact generates all of the polynomials of degree less than 3. We have the
following.
g^4 = g(g^3) = g(g + 1) = g^2 + g
g^5 = g(g^4) = g(g^2 + g) = g^3 + g^2 = g^2 + g + 1
g^6 = g(g^5) = g(g^2 + g + 1) = g^3 + g^2 + g = g^2 + g + g + 1 = g^2 + 1
g^7 = g(g^6) = g(g^2 + 1) = g^3 + g = g + g + 1 = 1 = g^0
We see that the powers of g generate all the nonzero polynomials in GF(2^3).
Also, it should be clear that g^k = g^(k mod 7) for any integer k. Table 5.5 shows the
power representation, as well as the polynomial and binary representations.
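
The successive powers of g can be produced by repeated multiplication by x with reduction. This is a small sketch under the assumption that g is the element 010 and that elements are stored as 3-bit integers; times_g and powers are illustrative names.

```python
M = 0b1011   # x^3 + x + 1, so g^3 = g + 1


def times_g(p: int) -> int:
    """Multiply a GF(2^3) element by g, replacing g^3 with g + 1 when it appears."""
    p <<= 1
    if p & 0b1000:
        p ^= M
    return p


# powers[k] holds the binary representation of g^k for k = 0..6
powers = [0b001]
for _ in range(6):
    powers.append(times_g(powers[-1]))

wraps_to_one = times_g(powers[-1]) == 0b001   # g^7 = g^0
```

The list visits every nonzero 3-bit value exactly once, which is what it means for g to generate the field.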

In general, for GF(2^n) with irreducible polynomial f(x), determine
g^n = f(g) – g^n. Then calculate all of the powers of g from g^{n+1} through g^{2^n – 2}.
The elements of the field correspond to the powers of g from g^0 through g^{2^n – 2}
plus the value 0. For multiplication of two elements in the field, use the equality
g^k = g^{k mod (2^n – 1)} for any integer k.
In this section, we have shown how to construct a finite field of order 2^n. Specifically,
we defined GF(2^n) with the following properties.
1. GF(2^n) consists of 2^n elements.
2. The binary operations + and * are defined over the set. The operations
of addition, subtraction, multiplication, and division can be performed
without leaving the set. Each element of the set other than 0 has a multiplicative
inverse.
We have shown that the elements of GF(2^n) can be defined as the set of all
polynomials of degree n – 1 or less with binary coefficients. Each such polynomial
can be represented by a unique n-bit value. Arithmetic is defined as polynomial
arithmetic modulo some irreducible polynomial of degree n. We have also seen that
an equivalent definition of a finite field GF(2^n) makes use of a generator and that
arithmetic is defined using powers of the generator.
This power representation makes multiplication easy. To multiply in the
power notation, add exponents modulo 7. For example, g^4 * g^6 = g^(10 mod 7) =
g^3 = g + 1. The same result is achieved using polynomial arithmetic: We have
g^4 = g^2 + g and g^6 = g^2 + 1. Then, (g^2 + g) * (g^2 + 1) = g^4 + g^3 + g^2 + g.
Next, we need to determine (g^4 + g^3 + g^2 + g) mod (g^3 + g + 1) by division:
                g + 1
g^3 + g + 1 ) g^4 + g^3 + g^2 + g
              g^4 + g^2 + g
              g^3 + g + 1
              g + 1
We get a result of g + 1, which agrees with the result obtained using the power
representation.
Table 5.6 shows the addition and multiplication tables for GF(2^3) using
the power representation. Note that this yields the identical results to the
polynomial representation (Table 5.3) with some of the rows and columns
interchanged.
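
Multiplication in the power representation amounts to a pair of table lookups. The sketch below hard-codes the binary column of Table 5.5 as an antilog table; the names EXP, LOG, mul, and inv are hypothetical and chosen only for this illustration.

```python
# EXP[k] is the binary representation of g^k, copied from Table 5.5.
EXP = [0b001, 0b010, 0b100, 0b011, 0b110, 0b111, 0b101]
# LOG maps each nonzero element back to its exponent k.
LOG = {value: k for k, value in enumerate(EXP)}


def mul(a: int, b: int) -> int:
    """Multiply in GF(2^3) by adding exponents modulo 7; zero has no logarithm."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 7]


def inv(a: int) -> int:
    """Multiplicative inverse: g^k has inverse g^(-k mod 7)."""
    if a == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return EXP[(-LOG[a]) % 7]


example = mul(0b110, 0b101)   # g^4 * g^6
```

The zero element has to be handled separately because it is not a power of g.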



Key Terms
abelian group
coefficient set
commutative ring
cyclic group
Euclidean algorithm
finite field
finite group
greatest common divisor
identity element
infinite field
infinite group
integral domain
inverse element
irreducible polynomial
modular arithmetic
modular polynomial
monic polynomial
polynomial arithmetic
polynomial ring
prime number
prime polynomial
relatively prime
Review Questions
5.1 Briefly define a group.
5.2 Briefly define a ring.
5.3 Briefly define a field.
5.4 List three classes of polynomial arithmetic.
Problems
5.1 For the group S_n of all permutations of n distinct symbols,
a. what is the number of elements in S_n?
b. show that S_n is not abelian for n > 2.
5.2 Does the set of residue classes (mod 3) form a group
a. with respect to modular addition?
b. with respect to modular multiplication?
5.3 Let S = {0, a, b, c}. The addition and multiplication on the set S is defined in the
following tables:
+   0   a   b   c
0   0   a   b   c
a   a   0   c   b
b   b   c   0   a
c   c   b   a   0

*   0   a   b   c
0   0   0   0   0
a   0   a   b   c
b   0   a   b   c
c   0   0   0   0
Is S a noncommutative ring? Justify your answer.
5.4 Develop a set of tables similar to Table 5.1 for GF(5).
5.5 Demonstrate that the set of polynomials whose coefficients form a field is a ring.
5.6 Demonstrate whether each of these statements is true or false for polynomials over a
field.
a. The product of monic polynomials is monic.
b. The product of polynomials of degrees m and n has degree m + n.
c. The sum of polynomials of degrees m and n has degree max [m, n].
5.7 For polynomial arithmetic with coefficients in Z_11, perform the following calculations.
a. (x^2 + 2x + 9)(x^3 + 11x^2 + x + 7)
b. (8x^2 + 3x + 2)(5x^2 + 6)
5.8 Determine which of the following polynomials are reducible over GF(2).
a. x^2 + 1
b. x^2 + x + 1
c. x^4 + x + 1
5.9 Determine the gcd of the following pairs of polynomials.
a. (x^3 + 1) and (x^2 + x + 1) over GF(2)
b. (x^3 + x + 1) and (x^2 + 1) over GF(3)
c. (x^3 – 2x + 1) and (x^2 – x – 2) over GF(5)
d. (x^4 + 8x^3 + 7x + 8) and (2x^3 + 9x^2 + 10x + 1) over GF(11)
5.10 Develop a set of tables similar to Table 5.3 for GF(3) with m(x) = x^2 + x + 1.
5.11 Determine the multiplicative inverse of x^2 + 1 in GF(2^3) with m(x) = x^3 + x – 1.
5.12 Develop a table similar to Table 5.5 for GF(2^5) with m(x) = x^5 + x^4 + x^3 + x + 1.
Programming Problems
5.13 Write a simple four-function calculator in GF(2^4). You may use table lookups for the
multiplicative inverses.
5.14 Write a simple four-function calculator in GF(2^8). You should compute the
multiplicative inverses on the fly.

6.1 Finite Field Arithmetic
6.2 AES Structure
General Structure
Detailed Structure
6.3 AES Transformation Functions
Substitute Bytes Transformation
ShiftRows Transformation
MixColumns Transformation
AddRoundKey Transformation
6.4 AES Key Expansion
Key Expansion Algorithm
6.5 An AES Example
Avalanche Effect
6.6 AES Implementation
Equivalent Inverse Cipher
Implementation Aspects
6.7 Key Terms, Review Questions, and Problems
Appendix 6A Polynomials with Coefficients in GF(2^8)
Advanced Encryption Standard

The Advanced Encryption Standard (AES) was published by the National Institute of
Standards and Technology (NIST) in 2001. AES is a symmetric block cipher that is
intended to replace DES as the approved standard for a wide range of applications.
Compared to public-key ciphers such as RSA, the structure of AES and most symmet-
ric ciphers is quite complex and cannot be explained as easily as many other
cryptographic algorithms. Accordingly, the reader may wish to begin with a simplified
version of AES, which is described in Appendix I. This version allows the reader to
perform encryption and decryption by hand and gain a good understanding of the
working of the algorithm details. Classroom experience indicates that a study of this
simplified version enhances understanding of AES.1 One possible approach is to read
the chapter first, then carefully read Appendix I, and then re-read the main body
of the chapter.
Appendix H looks at the evaluation criteria used by NIST to select from among
the candidates for AES, plus the rationale for picking Rijndael, which was the winning
candidate. This material is useful in understanding not just the AES design but also the
criteria by which to judge any symmetric encryption algorithm.
In AES, all operations are performed on 8-bit bytes. In particular, the arithmetic
operations of addition, multiplication, and division are performed over the finite
field GF(2^8). Section 5.6 discusses such operations in some detail. For the reader
who has not studied Chapter 5, and as a quick review for those who have, this
section summarizes the important concepts.
In essence, a field is a set in which we can do addition, subtraction, multiplication,
and division without leaving the set. Division is defined with the following rule:
a/b = a(b^{-1}). An example of a finite field (one with a finite number of elements) is
the set Z_p consisting of all the integers {0, 1, ..., p – 1}, where p is a prime
number and in which arithmetic is carried out modulo p.
1However, you may safely skip Appendix I, at least on a first reading. If you get lost or bogged down in
the details of AES, then you can go back and start with simplified AES.
After studying this chapter, you should be able to:
◆ Present an overview of the general structure of Advanced Encryption
Standard (AES).
◆ Understand the four transformations used in AES.
◆ Explain the AES key expansion algorithm.
◆ Understand the use of polynomials with coefficients in GF(2^8).

Virtually all encryption algorithms, both conventional and public-key, involve
arithmetic operations on integers. If one of the operations used in the algorithm
is division, then we need to work in arithmetic defined over a field; this is because
division requires that each nonzero element have a multiplicative inverse. For
convenience and for implementation efficiency, we would also like to work with
integers that fit exactly into a given number of bits, with no wasted bit patterns. That is,
we wish to work with integers in the range 0 through 2^n – 1, which fit into an n-bit
word. Unfortunately, the set of such integers, Z_{2^n}, using modular arithmetic, is not a
field. For example, the integer 2 has no multiplicative inverse in Z_{2^n}; that is, there is
no integer b such that 2b mod 2^n = 1.
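
A brute-force search makes the failure concrete. This is a small sketch, with the helper name invertible_elements chosen for illustration; it lists the elements that have a multiplicative inverse under ordinary modular arithmetic.

```python
def invertible_elements(modulus: int) -> list[int]:
    """Return the nonzero residues a for which some b gives a*b mod modulus == 1."""
    return [
        a
        for a in range(1, modulus)
        if any((a * b) % modulus == 1 for b in range(1, modulus))
    ]


units_mod_8 = invertible_elements(8)   # integers modulo 2^3
units_mod_7 = invertible_elements(7)   # integers modulo a prime
```

Modulo 8 only the odd residues are invertible, whereas modulo the prime 7 every nonzero residue is, which is why Z_p is a field and Z_{2^n} is not.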
There is a way of defining a finite field containing 2^n elements; such a field is
referred to as GF(2^n). Consider the set, S, of all polynomials of degree n – 1 or less
with binary coefficients. Thus, each polynomial has the form
f(x) = a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + ... + a_1 x + a_0 = \sum_{i=0}^{n-1} a_i x^i
where each a_i takes on the value 0 or 1. There are a total of 2^n different polynomials
in S. For n = 3, the 2^3 = 8 polynomials in the set are

0    x        x^2        x^2 + x
1    x + 1    x^2 + 1    x^2 + x + 1

With the appropriate definition of arithmetic operations, each such set S is a
finite field. The definition consists of the following elements.
1. Arithmetic follows the ordinary rules of polynomial arithmetic using the basic
rules of algebra with the following two refinements.
2. Arithmetic on the coefficients is performed modulo 2. This is the same as the
XOR operation.
3. If multiplication results in a polynomial of degree greater than n – 1, then the
polynomial is reduced modulo some irreducible polynomial m(x) of degree n.
That is, we divide by m(x) and keep the remainder. For a polynomial f(x),
the remainder is expressed as r(x) = f(x) mod m(x). A polynomial m(x) is
called irreducible if and only if m(x) cannot be expressed as a product of two
polynomials, both of degree lower than that of m(x).
For example, to construct the finite field GF(2^3), we need to choose an
irreducible polynomial of degree 3. There are only two such polynomials: (x^3 + x^2 + 1)
and (x^3 + x + 1). Addition is equivalent to taking the XOR of like terms. Thus,
(x + 1) + x = 1.
A polynomial in GF(2^n) can be uniquely represented by its n binary coefficients
(a_{n-1}a_{n-2} ... a_0). Therefore, every polynomial in GF(2^n) can be represented by
an n-bit number. Addition is performed by taking the bitwise XOR of the two n-bit
elements. There is no simple XOR operation that will accomplish multiplication in
GF(2^n). However, a reasonably straightforward, easily implemented technique is
available. In essence, it can be shown that multiplication of a number in GF(2^n) by
2 consists of a left shift followed by a conditional XOR with a constant. Multiplication
by larger numbers can be achieved by repeated application of this rule.
For example, AES uses arithmetic in the finite field GF(2^8) with the irreducible
polynomial m(x) = x^8 + x^4 + x^3 + x + 1. Consider two elements A =
(a_7a_6 ... a_1a_0) and B = (b_7b_6 ... b_1b_0). The sum A + B = (c_7c_6 ... c_1c_0), where
c_i = a_i ⊕ b_i. The multiplication {02} · A equals (a_6 ... a_1a_0 0) if a_7 = 0 and equals
(a_6 ... a_1a_0 0) ⊕ (00011011) if a_7 = 1.²
To summarize, AES operates on 8-bit bytes. Addition of two bytes is defined
as the bitwise XOR operation. Multiplication of two bytes is defined as
multiplication in the finite field GF(2^8), with the irreducible polynomial³ m(x) = x^8 + x^4 + x^3 +
x + 1. The developers of Rijndael give as their motivation for selecting this one of
the 30 possible irreducible polynomials of degree 8 that it is the first one on the list
given in [LIDL94].
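
The counts of irreducible polynomials quoted in this section can be confirmed by exhaustive trial division. The following is an illustrative sketch, assuming polynomials over GF(2) are stored as integers (bit i is the coefficient of x^i); the function names are hypothetical and the search is meant for small degrees only.

```python
def polymod(a: int, m: int) -> int:
    """Remainder of a(x) divided by m(x) over GF(2); m must be nonzero."""
    deg_m = m.bit_length() - 1
    while a.bit_length() - 1 >= deg_m:
        a ^= m << (a.bit_length() - 1 - deg_m)
    return a


def is_irreducible(p: int) -> bool:
    """True if p(x), of degree at least 1, has no factor of degree 1 .. deg(p)//2."""
    degree = p.bit_length() - 1
    # candidate divisors: every polynomial from x (value 2) up to degree deg(p)//2
    for q in range(2, 1 << (degree // 2 + 1)):
        if polymod(p, q) == 0:
            return False
    return True


def irreducibles(degree: int) -> list[int]:
    """All irreducible polynomials over GF(2) of exactly the given degree."""
    return [p for p in range(1 << degree, 1 << (degree + 1)) if is_irreducible(p)]


degree3 = irreducibles(3)
degree8 = irreducibles(8)
```

Checking divisors only up to half the degree is enough, because any factorization of p(x) must contain a factor of at most that degree.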
General Structure
Figure 6.1 shows the overall structure of the AES encryption process. The cipher
takes a plaintext block size of 128 bits, or 16 bytes. The key length can be 16, 24, or
32 bytes (128, 192, or 256 bits). The algorithm is referred to as AES-128, AES-192,
or AES-256, depending on the key length.
The input to the encryption and decryption algorithms is a single 128-bit block.
In FIPS PUB 197, this block is depicted as a 4 × 4 square matrix of bytes. This
block is copied into the State array, which is modified at each stage of encryption or
decryption. After the final stage, State is copied to an output matrix. These
operations are depicted in Figure 6.2a. Similarly, the key is depicted as a square matrix of
bytes. This key is then expanded into an array of key schedule words. Figure 6.2b
shows the expansion for the 128-bit key. Each word is four bytes, and the total key
schedule is 44 words for the 128-bit key. Note that the ordering of bytes within a
matrix is by column. So, for example, the first four bytes of a 128-bit plaintext input to
the encryption cipher occupy the first column of the in matrix, the second four bytes
occupy the second column, and so on. Similarly, the first four bytes of the expanded
key, which form a word, occupy the first column of the w matrix.
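
The column-by-column byte ordering is easy to get wrong, so a short sketch may help. It assumes the State is modelled as a list of four rows of four integers; bytes_to_state and state_to_bytes are illustrative names, not functions defined by FIPS PUB 197.

```python
def bytes_to_state(block: bytes) -> list[list[int]]:
    """Copy a 16-byte block into a 4 x 4 matrix, filling it column by column."""
    if len(block) != 16:
        raise ValueError("AES operates on 16-byte blocks")
    # state[r][c] is the byte at row r, column c; input byte r + 4c lands there
    return [[block[r + 4 * c] for c in range(4)] for r in range(4)]


def state_to_bytes(state: list[list[int]]) -> bytes:
    """Read the matrix back out column by column."""
    return bytes(state[r][c] for c in range(4) for r in range(4))


block = bytes(range(16))            # placeholder input 00 01 02 ... 0F
state = bytes_to_state(block)
first_column = [row[0] for row in state]
```

With this layout the first four input bytes form the first column, so a row of the matrix holds every fourth byte of the input.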
The cipher consists of N rounds, where the number of rounds depends on the
key length: 10 rounds for a 16-byte key, 12 rounds for a 24-byte key, and 14 rounds
for a 32-byte key (Table 6.1). The first N – 1 rounds consist of four distinct
transformation functions: SubBytes, ShiftRows, MixColumns, and AddRoundKey,
which are described subsequently. The final round contains only three
transformations, and there is an initial single transformation (AddRoundKey) before the first
round, which can be considered Round 0. Each transformation takes one or more
4 × 4 matrices as input and produces a 4 × 4 matrix as output. Figure 6.1 shows
that the output of each round is a 4 × 4 matrix, with the output of the final round
being the ciphertext. Also, the key expansion function generates N + 1 round keys,
each of which is a distinct 4 × 4 matrix. Each round key serves as one of the inputs
to the AddRoundKey transformation in each round.

²In FIPS PUB 197, a hexadecimal number is indicated by enclosing it in curly brackets. We use that convention
in this chapter.
³In the remainder of this discussion, references to GF(2^8) refer to the finite field defined with this
polynomial.

Figure 6.1 AES Encryption Process


Key Size (words/bytes/bits)               4/16/128   6/24/192   8/32/256
Plaintext Block Size (words/bytes/bits)   4/16/128   4/16/128   4/16/128
Number of Rounds                          10         12         14
Round Key Size (words/bytes/bits)         4/16/128   4/16/128   4/16/128
Expanded Key Size (words/bytes)           44/176     52/208     60/240
Table 6.1 AES Parameters
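
The entries of Table 6.1 follow a simple pattern: the number of rounds is the key length in words plus 6, and the expanded key holds one 4-word round key per round plus one for the initial AddRoundKey. The sketch below encodes that observation; the function name and dictionary keys are hypothetical.

```python
def aes_parameters(key_bytes: int) -> dict[str, int]:
    """Derive the Table 6.1 parameters from the key length in bytes."""
    if key_bytes not in (16, 24, 32):
        raise ValueError("AES keys are 16, 24, or 32 bytes long")
    key_words = key_bytes // 4                 # 32-bit words in the key
    rounds = key_words + 6                     # 10, 12, or 14
    expanded_words = 4 * (rounds + 1)          # N + 1 round keys of 4 words each
    return {
        "key_words": key_words,
        "rounds": rounds,
        "expanded_key_words": expanded_words,
        "expanded_key_bytes": 4 * expanded_words,
    }


table = {size: aes_parameters(size) for size in (16, 24, 32)}
```

The block size and round key size stay fixed at 16 bytes regardless of the key length, so they are not parameters of the function.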
Detailed Structure
Figure 6.3 shows the AES cipher in more detail, indicating the sequence of transfor-
mations in each round and showing the corresponding decryption function. As was
done in Chapter 4, we show encryption proceeding down the page and decryption
proceeding up the page.
Before delving into details, we can make several comments about the overall
AES structure.
1. One noteworthy feature of this structure is that it is not a Feistel structure.
Recall that, in the classic Feistel structure, half of the data block is used to
modify the other half of the data block and then the halves are swapped. AES
instead processes the entire data block as a single matrix during each round
using substitutions and permutation.
2. The key that is provided as input is expanded into an array of forty-four 32-bit
words, w[i]. Four distinct words (128 bits) serve as a round key for each round;
these are indicated in Figure 6.3.
3. Four different stages are used, one of permutation and three of substitution:
■ Substitute bytes: Uses an S-box to perform a byte-by-byte substitution of
the block.
■ ShiftRows: A simple permutation.
■ MixColumns: A substitution that makes use of arithmetic over GF(2^8).
■ AddRoundKey: A simple bitwise XOR of the current block with a portion
of the expanded key.
4. The structure is quite simple. For both encryption and decryption, the cipher
begins with an AddRoundKey stage, followed by nine rounds that each
include all four stages, followed by a tenth round of three stages. Figure 6.4
depicts the structure of a full encryption round.
5. Only the AddRoundKey stage makes use of the key. For this reason, the cipher
begins and ends with an AddRoundKey stage. Any other stage, applied at the
beginning or end, is reversible without knowledge of the key and so would add
no security.
6. The AddRoundKey stage is, in effect, a form of Vernam cipher and by itself
would not be formidable. The other three stages together provide confusion,
diffusion, and nonlinearity, but by themselves would provide no security
because they do not use the key. We can view the cipher as alternating operations
of XOR encryption (AddRoundKey) of a block, followed by scrambling of the
block (the other three stages), followed by XOR encryption, and so on. This
scheme is both efficient and highly secure.

Figure 6.3 AES Encryption and Decryption: (a) Encryption; (b) Decryption
7. Each stage is easily reversible. For the Substitute Byte, ShiftRows, and
MixColumns stages, an inverse function is used in the decryption algorithm.
For the AddRoundKey stage, the inverse is achieved by XORing the same
round key to the block, using the result that A⊕ B⊕ B = A.
8. As with most block ciphers, the decryption algorithm makes use of the
expanded key in reverse order. However, the decryption algorithm is not
identical to the encryption algorithm. This is a consequence of the particular
structure of AES.
9. Once it is established that all four stages are reversible, it is easy to verify
that decryption does recover the plaintext. Figure 6.3 lays out encryption
and decryption going in opposite vertical directions. At each horizontal point
(e.g., the dashed line in the figure), State is the same for both encryption and
decryption.
10. The final round of both encryption and decryption consists of only three stages.
Again, this is a consequence of the particular structure of AES and is required
to make the cipher reversible.

Figure 6.4 AES Encryption Round
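
The observation that AddRoundKey is its own inverse, because A ⊕ B ⊕ B = A, can be illustrated in a few lines. This is a sketch only: the state and round key below are made-up byte strings, and add_round_key is an illustrative name.

```python
def add_round_key(state: bytes, round_key: bytes) -> bytes:
    """XOR a 16-byte state with a 16-byte round key, byte by byte."""
    if len(state) != 16 or len(round_key) != 16:
        raise ValueError("state and round key must both be 16 bytes")
    return bytes(s ^ k for s, k in zip(state, round_key))


# Arbitrary example values, not taken from the AES specification.
state = bytes(range(16))
round_key = bytes([0xA5] * 16)

masked = add_round_key(state, round_key)
recovered = add_round_key(masked, round_key)   # applying the same key again undoes it
```

The same function therefore serves in both the encryption and decryption directions, unlike the other three stages, which need separate inverse functions.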
We now turn to a discussion of each of the four transformations used in AES. For
each stage, we describe the forward (encryption) algorithm, the inverse ( decryption)
algorithm, and the rationale for the stage.

Substitute Bytes Transformation
The forward substitute byte transformation, called SubBytes, is a simple table
lookup (Figure 6.5a). AES defines a 16 × 16 matrix of byte values, called an S-box
(Table 6.2a), that contains a permutation of all possible 256 8-bit values. Each
individual byte of State is mapped into a new byte in the following way: The leftmost
4 bits of the byte are used as a row value and the rightmost 4 bits are used as a
column value. These row and column values serve as indexes into the S-box to select
a unique 8-bit output value. For example, the hexadecimal value {95} references
row 9, column 5 of the S-box, which contains the value {2A}. Accordingly, the
value {95} is mapped into the value {2A}.
Figure 6.5 AES Byte-Level Operations: (a) Substitute byte transformation; (b) Add round key transformation

0 1 2 3 4 5 6 7 8 9 A B C D E F
0 63 7C 77 7B F2 6B 6F C5 30 01 67 2B FE D7 AB 76
1 CA 82 C9 7D FA 59 47 F0 AD D4 A2 AF 9C A4 72 C0
2 B7 FD 93 26 36 3F F7 CC 34 A5 E5 F1 71 D8 31 15
3 04 C7 23 C3 18 96 05 9A 07 12 80 E2 EB 27 B2 75
4 09 83 2C 1A 1B 6E 5A A0 52 3B D6 B3 29 E3 2F 84
5 53 D1 00 ED 20 FC B1 5B 6A CB BE 39 4A 4C 58 CF
6 D0 EF AA FB 43 4D 33 85 45 F9 02 7F 50 3C 9F A8
7 51 A3 40 8F 92 9D 38 F5 BC B6 DA 21 10 FF F3 D2
8 CD 0C 13 EC 5F 97 44 17 C4 A7 7E 3D 64 5D 19 73
9 60 81 4F DC 22 2A 90 88 46 EE B8 14 DE 5E 0B DB
A E0 32 3A 0A 49 06 24 5C C2 D3 AC 62 91 95 E4 79
B E7 C8 37 6D 8D D5 4E A9 6C 56 F4 EA 65 7A AE 08
C BA 78 25 2E 1C A6 B4 C6 E8 DD 74 1F 4B BD 8B 8A
D 70 3E B5 66 48 03 F6 0E 61 35 57 B9 86 C1 1D 9E
E E1 F8 98 11 69 D9 8E 94 9B 1E 87 E9 CE 55 28 DF
F 8C A1 89 0D BF E6 42 68 41 99 2D 0F B0 54 BB 16
(a) S-box
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 52 09 6A D5 30 36 A5 38 BF 40 A3 9E 81 F3 D7 FB
1 7C E3 39 82 9B 2F FF 87 34 8E 43 44 C4 DE E9 CB
2 54 7B 94 32 A6 C2 23 3D EE 4C 95 0B 42 FA C3 4E
3 08 2E A1 66 28 D9 24 B2 76 5B A2 49 6D 8B D1 25
4 72 F8 F6 64 86 68 98 16 D4 A4 5C CC 5D 65 B6 92
5 6C 70 48 50 FD ED B9 DA 5E 15 46 57 A7 8D 9D 84
6 90 D8 AB 00 8C BC D3 0A F7 E4 58 05 B8 B3 45 06
7 D0 2C 1E 8F CA 3F 0F 02 C1 AF BD 03 01 13 8A 6B
8 3A 91 11 41 4F 67 DC EA 97 F2 CF CE F0 B4 E6 73
9 96 AC 74 22 E7 AD 35 85 E2 F9 37 E8 1C 75 DF 6E
A 47 F1 1A 71 1D 29 C5 89 6F B7 62 0E AA 18 BE 1B
B FC 56 3E 4B C6 D2 79 20 9A DB C0 FE 78 CD 5A F4
C 1F DD A8 33 88 07 C7 31 B1 12 10 59 27 80 EC 5F
D 60 51 7F A9 19 B5 4A 0D 2D E5 7A 9F 93 C9 9C EF
E A0 E0 3B 4D AE 2A F5 B0 C8 EB BB 3C 83 53 99 61
F 17 2B 04 7E BA 77 D6 26 E1 69 14 63 55 21 0C 7D
(b) Inverse S-box
Table 6.2 AES S-Boxes

Here is an example of the SubBytes transformation:
EA 04 65 85         87 F2 4D 97
83 45 5D 96    →    EC 6E 4C 90
5C 33 98 B0         4A C3 46 E7
F0 2D AD C5         8C D8 95 A6
The S-box is constructed in the following fashion (Figure 6.6a).
[Figure 6.6 Construction of S-Box and IS-Box: (a) Calculation of byte at row y,
column x of S-box; (b) Calculation of byte at row y, column x of IS-box. Box
labels in both panels: byte at row y, column x initialized to {yx}; inverse in
GF(2^8); byte to bit column vector; bit column vector to byte.]

1. Initialize the S-box with the byte values in ascending sequence row by row.
The first row contains {00}, {01}, {02}, ..., {0F}; the second row contains
{10}, {11}, etc.; and so on. Thus, the value of the byte at row y, column x is {yx}.
2. Map each byte in the S-box to its multiplicative inverse in the finite field
GF(2^8); the value {00} is mapped to itself.
3. Consider that each byte in the S-box consists of 8 bits labeled
(b7, b6, b5, b4, b3, b2, b1, b0). Apply the following transformation to each bit of
each byte in the S-box:
$$b_i' = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i \qquad (6.1)$$
where c_i is the ith bit of byte c with the value {63}; that is,
(c7 c6 c5 c4 c3 c2 c1 c0) = (01100011). The prime (′) indicates that the variable is to
be updated by the value on the right. The AES standard depicts this transformation
in matrix form as follows.
$$
\begin{bmatrix} b_0' \\ b_1' \\ b_2' \\ b_3' \\ b_4' \\ b_5' \\ b_6' \\ b_7' \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\qquad (6.2)
$$
Equation (6.2) has to be interpreted carefully. In ordinary matrix multiplication,⁴
each element in the product matrix is the sum of products of the elements of
one row and one column. In this case, each element in the product matrix is the
bitwise XOR of products of elements of one row and one column. Furthermore, the
final addition shown in Equation (6.2) is a bitwise XOR. Recall from Section 5.6
that the bitwise XOR is addition in GF(2^8).
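⁴ For a brief review of the rules of matrix and vector multiplication, refer to Appendix E.

As an example, consider the input value {95}. The multiplicative inverse in
GF(2^8) is {95}^-1 = {8A}, which is 10001010 in binary. Using Equation (6.2),
with the bit vectors written from b0 (top) to b7 (bottom),
$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
$$
The result is 00101010 = {2A}, which should appear in row {09} column {05} of the
S-box. This is verified by checking Table 6.2a.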
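The three construction steps can be sketched in a few lines of Python. This is an illustrative sketch rather than a reference implementation: the helper names (`gf_mul`, `gf_inverse`, `affine`, `build_sbox`) are hypothetical, the inverse is found by brute-force exponentiation for clarity, and the field is assumed to use the AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (hex 11B).

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:            # degree-8 term appeared: reduce
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a: int) -> int:
    """Step 2: multiplicative inverse in GF(2^8); {00} is mapped to itself."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):         # a^254 = a^-1, since a^255 = 1 for a != 0
        result = gf_mul(result, a)
    return result


def affine(b: int, c: int = 0x63) -> int:
    """Step 3, Equation (6.1), applied bit by bit."""
    out = 0
    for i in range(8):
        bit = ((b >> i) ^ (b >> ((i + 4) % 8)) ^ (b >> ((i + 5) % 8))
               ^ (b >> ((i + 6) % 8)) ^ (b >> ((i + 7) % 8)) ^ (c >> i)) & 1
        out |= bit << i
    return out


def build_sbox() -> list:
    """Step 1 is implicit: index x = {yx} runs through 00..FF in ascending order."""
    return [affine(gf_inverse(x)) for x in range(256)]


SBOX = build_sbox()
print(f"inverse of 95 is {gf_inverse(0x95):02X}, S-box entry is {SBOX[0x95]:02X}")
```

For the worked example above, the sketch reports the inverse {8A} and the S-box entry {2A}.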
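The inverse substitute byte transformation, called InvSubBytes, makes use
of the inverse S-box shown in Table 6.2b. Note, for example, that the input {2A}
produces the output {95}, and the input {95} to the S-box produces {2A}. The inverse
S-box is constructed (Figure 6.6b) by applying the inverse of the transformation in
Equation (6.1) followed by taking the multiplicative inverse in GF(2^8). The inverse
transformation is
$$b_i' = b_{(i+2) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus d_i$$
where byte d = {05}, or 00000101. We can depict this transformation as follows.
$$
\begin{bmatrix} b_0' \\ b_1' \\ b_2' \\ b_3' \\ b_4' \\ b_5' \\ b_6' \\ b_7' \end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
$$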
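To see that InvSubBytes is the inverse of SubBytes, label the matrices in
SubBytes and InvSubBytes as X and Y, respectively, and the vector versions of
constants c and d as C and D, respectively. For some 8-bit vector B, Equation (6.2)
becomes B′ = XB ⊕ C. We need to show that Y(XB ⊕ C) ⊕ D = B. To multiply
out, we must show YXB ⊕ YC ⊕ D = B. This becomes
$$
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
$$
$$
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
$$
We have demonstrated that YX equals the identity matrix, and that YC = D,
so that YC ⊕ D equals the null vector.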
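The same argument can be checked mechanically. The sketch below assumes a bytewise formulation in which each row of the matrices X and Y becomes a left rotation of the input byte; the names `rotl8`, `affine`, and `inv_affine` are hypothetical helpers, not part of the standard.

```python
def rotl8(b: int, n: int) -> int:
    """Rotate an 8-bit value left by n positions."""
    return ((b << n) | (b >> (8 - n))) & 0xFF


def affine(b: int) -> int:
    """B' = XB xor C with C = {63}; bit i collects b_i, b_(i+4), ..., b_(i+7)."""
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


def inv_affine(b: int) -> int:
    """B' = YB xor D with D = {05}; bit i collects b_(i+2), b_(i+5), b_(i+7)."""
    return rotl8(b, 1) ^ rotl8(b, 3) ^ rotl8(b, 6) ^ 0x05


# Y(XB xor C) xor D = B for every possible byte B.
round_trips = all(inv_affine(affine(b)) == b for b in range(256))

# YC = D: applying only the linear part of the inverse map to C gives D.
yc = rotl8(0x63, 1) ^ rotl8(0x63, 3) ^ rotl8(0x63, 6)

print(round_trips, f"{yc:02X}")
```

Checking all 256 bytes is equivalent to checking YX = I together with YC ⊕ D = 0, because the map B → Y(XB ⊕ C) ⊕ D is affine.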
RATIONALE The S-box is designed to be resistant to known cryptanalytic attacks.
Specifically, the Rijndael developers sought a design that has a low correlation
between input bits and output bits and the property that the output is not a linear
mathematical function of the input [DAEM01]. The nonlinearity is due to the use
of the multiplicative inverse. In addition, the constant in Equation (6.1) was chosen
so that the S-box has no fixed points [S-box(a) = a] and no “opposite fixed points”
[S-box(a) = ā], where ā is the bitwise complement of a.
Of course, the S-box must be invertible, that is, IS-box[S-box(a)] = a.
However, the S-box is not self-inverse, in the sense that it is not true that
S-box(a) = IS-box(a). For example, S-box({95}) = {2A}, but IS-box({95}) = {AD}.
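These properties are easy to test exhaustively. The following sketch rebuilds the S-box from its definition (inverse in GF(2^8) followed by the affine map with constant {63}) and then searches for fixed points and opposite fixed points. The helper names are hypothetical, and the S-box is recomputed here only so that the example stands on its own.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def rotl8(b: int, n: int) -> int:
    """Rotate an 8-bit value left by n positions."""
    return ((b << n) | (b >> (8 - n))) & 0xFF


def sbox_entry(x: int) -> int:
    """Inverse in GF(2^8) ({00} maps to itself), then the affine transformation."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):     # x^254 is the multiplicative inverse of x
            inv = gf_mul(inv, x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x              # invertibility: IS-box[S-box(a)] = a

fixed_points = [a for a in range(256) if SBOX[a] == a]
opposite_fixed_points = [a for a in range(256) if SBOX[a] == a ^ 0xFF]

print(fixed_points, opposite_fixed_points)
print(f"S-box(95) = {SBOX[0x95]:02X}, IS-box(95) = {INV_SBOX[0x95]:02X}")
```

Both lists come out empty, and the last line shows the {95} example in which the forward and inverse tables disagree.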
ShiftRows Transformation
FORWARD AND INVERSE TRANSFORMATIONS The forward shift row transformation,
called ShiftRows, is depicted in Figure 6.7a. The first row of State is not altered. For
the second row, a 1-byte circular left shift is performed. For the third row, a 2-byte
circular left shift is performed. For the fourth row, a 3-byte circular left shift is
performed. The following is an example of ShiftRows.
87 F2 4D 97       87 F2 4D 97
EC 6E 4C 90   →   6E 4C 90 EC
4A C3 46 E7       46 E7 4A C3
8C D8 95 A6       A6 8C D8 95
The inverse shift row transformation, called InvShiftRows, performs the
circular shifts in the opposite direction for each of the last three rows, with a 1-byte
circular right shift for the second row, and so on.
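As a small illustration, ShiftRows and InvShiftRows reduce to list rotations when State is held as a list of four rows. The representation and the function names `shift_rows` and `inv_shift_rows` are assumptions made for this sketch.

```python
def shift_rows(state):
    """Rotate row r of a 4x4 State (given as a list of rows) left by r bytes."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """Rotate row r of a 4x4 State right by r bytes, undoing shift_rows."""
    return [row[4 - r:] + row[:4 - r] for r, row in enumerate(state)]


state = [
    [0x87, 0xF2, 0x4D, 0x97],
    [0xEC, 0x6E, 0x4C, 0x90],
    [0x4A, 0xC3, 0x46, 0xE7],
    [0x8C, 0xD8, 0x95, 0xA6],
]

shifted = shift_rows(state)
for row in shifted:
    print(" ".join(f"{b:02X}" for b in row))
```

Applied to the State above, the sketch reproduces the right-hand matrix of the example, and `inv_shift_rows(shifted)` returns the original State.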
RATIONALE The shift row transformation is more substantial than it may first
appear. This is because the State, as well as the cipher input and output, is
treated as an array of four 4-byte columns. Thus, on encryption, the first 4 bytes
of the plaintext are copied to the first column of State, and so on. Furthermore,
as will be seen, the round key is applied to State column by column. Thus, a row
shift moves an individual byte from one column to another, which is a linear
distance of a multiple of 4 bytes. Also note that the transformation ensures that
the 4 bytes of one column are spread out to four different columns. Figure 6.4
illustrates the effect.

⁵ We follow the convention of FIPS PUB 197 and use the symbol • to indicate multiplication over the
finite field GF(2^8) and ⊕ to indicate bitwise XOR, which corresponds to addition in GF(2^8).
MixColumns Transformation
FORWARD AND INVERSE TRANSFORMATIONS The forward mix column transformation,
called MixColumns, operates on each column individually. Each byte of a column
is mapped into a new value that is a function of all four bytes in that column. The
transformation can be defined by the following matrix multiplication on State
(Figure 6.7b):
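$$
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\
s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\
s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\
s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3}
\end{bmatrix}
\qquad (6.3)
$$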
Each element in the product matrix is the sum of products of elements of one row
and one column. In this case, the individual additions and multiplications⁵ are
performed in GF(2^8).

[Figure 6.7 AES Row and Column Operations: (a) Shift row transformation;
(b) Mix column transformation]

The MixColumns transformation on a single column of State can be expressed as
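$$
\begin{aligned}
s'_{0,j} &= (2 \bullet s_{0,j}) \oplus (3 \bullet s_{1,j}) \oplus s_{2,j} \oplus s_{3,j} \\
s'_{1,j} &= s_{0,j} \oplus (2 \bullet s_{1,j}) \oplus (3 \bullet s_{2,j}) \oplus s_{3,j} \\
s'_{2,j} &= s_{0,j} \oplus s_{1,j} \oplus (2 \bullet s_{2,j}) \oplus (3 \bullet s_{3,j}) \\
s'_{3,j} &= (3 \bullet s_{0,j}) \oplus s_{1,j} \oplus s_{2,j} \oplus (2 \bullet s_{3,j})
\end{aligned}
\qquad (6.4)
$$
The following is an example of MixColumns:
87 F2 4D 97       47 40 A3 4C
6E 4C 90 EC   →   37 D4 70 9F
46 E7 4A C3       94 E4 3A 42
A6 8C D8 95       ED A5 A6 BC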
Let us verify the first column of this example. Recall from Section 5.6 that, in
GF(2^8), addition is the bitwise XOR operation and that multiplication can be
performed according to the rule established in Equation (4.14). In particular,
multiplication of a value by x (i.e., by {02}) can be implemented as a 1-bit left shift
followed by a conditional bitwise XOR with (0001 1011) if the leftmost bit of the
original value (prior to the shift) is 1. Thus, to verify the MixColumns transformation
on the first column, we need to show that
({02} • {87}) ⊕ ({03} • {6E}) ⊕ {46} ⊕ {A6} = {47}
{87} ⊕ ({02} • {6E}) ⊕ ({03} • {46}) ⊕ {A6} = {37}
{87} ⊕ {6E} ⊕ ({02} • {46}) ⊕ ({03} • {A6}) = {94}
({03} • {87}) ⊕ {6E} ⊕ {46} ⊕ ({02} • {A6}) = {ED}
For the first equation, we have {02} • {87} = (0000 1110) ⊕ (0001 1011) =
(0001 0101) and {03} • {6E} = {6E} ⊕ ({02} • {6E}) = (0110 1110) ⊕ (1101 1100) =
(1011 0010). Then,

{02} • {87} = 0001 0101
{03} • {6E} = 1011 0010
{46}        = 0100 0110
{A6}        = 1010 0110
              ---------
              0100 0111 = {47}

The other equations can be similarly verified.
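The remaining three equations can be checked with a short sketch of the single-column transformation. The function names `xtime` and `mix_single_column` are hypothetical; `xtime` implements the shift-and-conditional-XOR rule for multiplication by {02} described above, and multiplication by {03} is obtained as {02} • s ⊕ s.

```python
def xtime(b: int) -> int:
    """Multiply a byte by {02}: shift left, then XOR with 1B if the top bit was set."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B               # drops the overflow bit and XORs in 0001 1011
    return b


def mix_single_column(col):
    """Equation set (6.4) for one column [s0, s1, s2, s3]."""
    s0, s1, s2, s3 = col
    return [
        xtime(s0) ^ (xtime(s1) ^ s1) ^ s2 ^ s3,
        s0 ^ xtime(s1) ^ (xtime(s2) ^ s2) ^ s3,
        s0 ^ s1 ^ xtime(s2) ^ (xtime(s3) ^ s3),
        (xtime(s0) ^ s0) ^ s1 ^ s2 ^ xtime(s3),
    ]


first_column = [0x87, 0x6E, 0x46, 0xA6]
mixed = mix_single_column(first_column)
print(" ".join(f"{b:02X}" for b in mixed))
```

For the first column of the example, the sketch yields 47 37 94 ED, matching the first column of the result matrix.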
The inverse mix column transformation, called InvMixColumns, is defined by
the following matrix multiplication:
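$$
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\
s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\
s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\
s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3}
\end{bmatrix}
\qquad (6.5)
$$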

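It is not immediately clear that Equation (6.5) is the inverse of Equation (6.3).
We need to show
$$
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
=
\begin{bmatrix}
s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\
s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\
s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\
s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3}
\end{bmatrix}
$$
which is equivalent to showing
$$
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\qquad (6.6)
$$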
That is, the inverse transformation matrix times the forward transformation matrix
equals the identity matrix. To verify the first column of Equation (6.6), we need
to show
({0E} • {02}) ⊕ {0B} ⊕ {0D} ⊕ ({09} • {03}) = {01}
({09} • {02}) ⊕ {0E} ⊕ {0B} ⊕ ({0D} • {03}) = {00}
({0D} • {02}) ⊕ {09} ⊕ {0E} ⊕ ({0B} • {03}) = {00}
({0B} • {02}) ⊕ {0D} ⊕ {09} ⊕ ({0E} • {03}) = {00}
For the first equation, we have {0E} • {02} = 00011100 and {09} • {03} =
{09} ⊕ ({09} • {02}) = 00001001 ⊕ 00010010 = 00011011. Then

{0E} • {02} = 00011100
{0B}        = 00001011
{0D}        = 00001101
{09} • {03} = 00011011
              --------
              00000001 = {01}

The other equations can be similarly verified.
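All sixteen entries of Equation (6.6) can be verified at once with a small matrix product over GF(2^8). The sketch below is illustrative only; `gf_mul` and `gf_matmul` are hypothetical helper names, and the field is assumed to use the AES reduction polynomial (hex 11B).

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


MIX = [
    [0x02, 0x03, 0x01, 0x01],
    [0x01, 0x02, 0x03, 0x01],
    [0x01, 0x01, 0x02, 0x03],
    [0x03, 0x01, 0x01, 0x02],
]
INV_MIX = [
    [0x0E, 0x0B, 0x0D, 0x09],
    [0x09, 0x0E, 0x0B, 0x0D],
    [0x0D, 0x09, 0x0E, 0x0B],
    [0x0B, 0x0D, 0x09, 0x0E],
]


def gf_matmul(a, b):
    """Matrix product in which + is XOR and * is multiplication in GF(2^8)."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc ^= gf_mul(a[i][k], b[k][j])
            out[i][j] = acc
    return out


product = gf_matmul(INV_MIX, MIX)
for row in product:
    print(row)
```

The printed product is the 4 × 4 identity matrix, as Equation (6.6) requires.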
The AES document describes another way of characterizing the MixColumns
transformation, which is in terms of polynomial arithmetic. In the standard,
MixColumns is defined by considering each column of State to be a four-term
polynomial with coefficients in GF(2^8). Each column is multiplied modulo (x^4 + 1) by
the fixed polynomial a(x), given by
a(x) = {03}x^3 + {01}x^2 + {01}x + {02} (6.7)
Appendix 5A demonstrates that multiplication of each column of State by
a(x) can be written as the matrix multiplication of Equation (6.3). Similarly, it
can be seen that the transformation in Equation (6.5) corresponds to treating
each column as a four-term polynomial and multiplying each column by b(x),
given by
b(x) = {0B}x^3 + {0D}x^2 + {09}x + {0E} (6.8)
It readily can be shown that b(x) = a^-1(x) mod (x^4 + 1).
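The polynomial view can be sketched directly. Because x^4 ≡ 1 modulo (x^4 + 1), the exponent of each product term simply wraps modulo 4. The helper names and the lowest-degree-first coefficient order below are assumptions of this sketch.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def poly_mul_mod(a, b):
    """Multiply two four-term polynomials modulo x^4 + 1.

    Coefficient lists are ordered lowest degree first: [c0, c1, c2, c3].
    """
    out = [0, 0, 0, 0]
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % 4] ^= gf_mul(ai, bj)   # x^4 = 1, so exponents wrap
    return out


A = [0x02, 0x01, 0x01, 0x03]     # a(x) of Equation (6.7)
B = [0x0E, 0x09, 0x0D, 0x0B]     # b(x) of Equation (6.8)

print(poly_mul_mod(A, B))                           # a(x) b(x) mod (x^4 + 1)
print(poly_mul_mod(A, [0x87, 0x6E, 0x46, 0xA6]))    # a(x) times a State column
```

The first product is the constant polynomial {01}, which confirms that b(x) is the inverse of a(x); the second reproduces the MixColumns result 47 37 94 ED (printed in decimal) for the first column of the earlier example.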
RATIONALE The coefficients of the matrix in Equation (6.3) are based on a linear
code with maximal distance between code words, which ensures a good mixing
among the bytes of each column. The mix column transformation combined with
the shift row transformation ensures that after a few rounds all output bits depend
on all input bits. See [DAEM99] for a discussion.
In addition, the choice of coefficients in MixColumns, which are all {01}, {02},
or {03}, was influenced by implementation considerations. As was discussed, multi-
plication by these coefficients involves at most a shift and an XOR. The coefficients
in InvMixColumns are more formidable to implement. However, encryption was
deemed more important than decryption for two reasons:
1. For the CFB and OFB cipher modes (Figures 7.5 and 7.6; described in
Chapter 7), only encryption is used.
2. As with any block cipher, AES can be used to construct a message authentica-
tion code (Chapter 13), and for this, only encryption is used.
AddRoundKey Transformation
FORWARD AND INVERSE TRANSFORMATIONS In the forward add round key transfor-
mation, called AddRoundKey, the 128 bits of State are bitwise XORed with the
128 bits of the round key. As shown in Figure 6.5b, the operation is viewed as a
columnwise operation between the 4 bytes of a State column and one word of
the round key; it can also be viewed as a byte-level operation. The following is an
example of AddRoundKey:
47 40 A3 4C AC 19 28 57 EB 59 8B 1B
37 D4 70 9F 77 FA D1 5C 40 2E A1 C3
94 E4 3A 42 ⊕ 66 DC 29 00 = F2 38 13 42
ED A5 A6 BC F3 21 41 6A 1E 84 E7 D6
The first matrix is State, and the second matrix is the round key.
The inverse add round key transformation is identical to the forward add
round key transformation, because the XOR operation is its own inverse.
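A minimal sketch of AddRoundKey, assuming State and the round key are held as lists of four rows (the function name `add_round_key` is hypothetical):

```python
def add_round_key(state, round_key):
    """XOR each byte of State with the corresponding byte of the round key."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


state = [
    [0x47, 0x40, 0xA3, 0x4C],
    [0x37, 0xD4, 0x70, 0x9F],
    [0x94, 0xE4, 0x3A, 0x42],
    [0xED, 0xA5, 0xA6, 0xBC],
]
round_key = [
    [0xAC, 0x19, 0x28, 0x57],
    [0x77, 0xFA, 0xD1, 0x5C],
    [0x66, 0xDC, 0x29, 0x00],
    [0xF3, 0x21, 0x41, 0x6A],
]

result = add_round_key(state, round_key)
for row in result:
    print(" ".join(f"{b:02X}" for b in row))
```

Applying `add_round_key` a second time with the same key restores the original State, which is why the forward and inverse transformations coincide.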
RATIONALE The add round key transformation is as simple as possible and affects
every bit of State. The complexity of the round key expansion, plus the complexity
of the other stages of AES, ensure security.
Figure 6.8 is another view of a single round of AES, emphasizing the mechanisms
and inputs of each transformation.

[Figure 6.8 Inputs for Single AES Round. Labels: State matrix at beginning of
round (variable input); constant inputs, including the MixColumns matrix
02 03 01 01 / 01 02 03 01 / 01 01 02 03 / 03 01 01 02; State matrix at end of round.]

Key Expansion Algorithm
The AES key expansion algorithm takes as input a four-word (16-byte) key and
produces a linear array of 44 words (176 bytes). This is sufficient to provide a
four-word round key for the initial AddRoundKey stage and each of the 10 rounds of the
cipher. The following pseudocode describes the expansion.

KeyExpansion (byte key[16], word w[44])
{
    word temp
    for (i = 0; i < 4; i++)
        w[i] = (key[4*i], key[4*i+1], key[4*i+2], key[4*i+3]);
    for (i = 4; i < 44; i++)
    {
        temp = w[i − 1];
        if (i mod 4 = 0)
            temp = SubWord (RotWord (temp)) ⊕ Rcon[i/4];
        w[i] = w[i − 4] ⊕ temp
    }
}

[Figure 6.9 AES Key Expansion: (a) Overall algorithm; (b) Function g]

The key is copied into the first four words of the expanded key. The remainder
of the expanded key is filled in four words at a time. Each added word w[i]
depends on the immediately preceding word, w[i − 1], and the word four positions
back, w[i − 4]. In three out of four cases, a simple XOR is used. For a word whose
position in the w array is a multiple of 4, a more complex function is used. Figure 6.9
illustrates the generation of the expanded key, using the symbol g to represent that
complex function. The function g consists of the following subfunctions.
1. RotWord performs a one-byte circular left shift on a word. This means that an
input word [B0, B1, B2, B3] is transformed into [B1, B2, B3, B0].
2. SubWord performs a byte substitution on each byte of its input word, using the
S-box (Table 6.2a).
3. The result of steps 1 and 2 is XORed with a round constant, Rcon[j].
The round constant is a word in which the three rightmost bytes are always 0.
Thus, the effect of an XOR of a word with Rcon is to only perform an XOR on the
leftmost byte of the word. The round constant is different for each round and is
defined as Rcon[j] = (RC[j], 0, 0, 0), with RC[1] = 1, RC[j] = 2 • RC[j − 1] and with
multiplication defined over the field GF(2^8). The values of RC[j] in hexadecimal are

j       1   2   3   4   5   6   7   8   9   10
RC[j]   01  02  04  08  10  20  40  80  1B  36

For example, suppose that the round key for round 8 is
EA D2 73 21 B5 8D BA D2 31 2B F5 60 7F 8D 29 2F
Then the first 4 bytes (first column) of the round key for round 9 are calculated
as follows:

i (decimal)               36
temp                      7F8D292F
After RotWord             8D292F7F
After SubWord             5DA515D2
Rcon (9)                  1B000000
After XOR with Rcon       46A515D2
w[i − 4]                  EAD27321
w[i] = temp ⊕ w[i − 4]    AC7766F3

Rationale
The Rijndael developers designed the expansion key algorithm to be resistant to
known cryptanalytic attacks. The inclusion of a round-dependent round constant
eliminates the symmetry, or similarity, between the ways in which round keys are
generated in different rounds.
The specific criteria that were used are [DAEM99]
■ Knowledge of a part of the cipher key or round key does not enable calculation
of many other round-key bits.
■ An invertible transformation [i.e., knowledge of any Nk consecutive words of
the expanded key enables regeneration of the entire expanded key (Nk = key
size in words)].
■ Speed on a wide range of processors.
■ Usage of round constants to eliminate symmetries.
■ Diffusion of cipher key differences into the round keys; that is, each key bit
affects many round key bits.
■ Enough nonlinearity to prohibit the full determination of round key differences
from cipher key differences only.
■ Simplicity of description.
The authors do not quantify the first point on the preceding list, but the idea is
that if you know fewer than Nk consecutive words of either the cipher key or one of
the round keys, then it is difficult to reconstruct the remaining unknown bits. The
fewer bits one knows, the more difficult it is to do the reconstruction or to determine
other bits in the key expansion.

6.5 AN AES EXAMPLE
We now work through an example and consider some of its implications. Although
you are not expected to duplicate the example by hand, you will find it informative
to study the hex patterns that occur from one step to the next.
For this example, the plaintext is a hexadecimal palindrome. The plaintext, key,
and resulting ciphertext are
Plaintext:  0123456789abcdeffedcba9876543210
Key:        0f1571c947d9e8590cb7add6af7f6798
Ciphertext: ff0b844a0853bf7c6934ab4364148fb9

Results
Table 6.3 shows the expansion of the 16-byte key into 10 round keys. As previously
explained, this process is performed word by word, with each four-byte word
occupying one column of the word round-key matrix.
The left-hand column shows the four round-key words generated for each round. The
right-hand column shows the steps used to generate the auxiliary word used in key
expansion. We begin, of course, with the key itself serving as the round key for
round 0.

Key Words                         Auxiliary Function
w0 = 0f 15 71 c9                  RotWord (w3) = 7f 67 98 af = x1
w1 = 47 d9 e8 59                  SubWord (x1) = d2 85 46 79 = y1
w2 = 0c b7 ad d6                  Rcon (1) = 01 00 00 00
w3 = af 7f 67 98                  y1 ⊕ Rcon (1) = d3 85 46 79 = z1

w4 = w0 ⊕ z1 = dc 90 37 b0        RotWord (w7) = 81 15 a7 38 = x2
w5 = w4 ⊕ w1 = 9b 49 df e9        SubWord (x2) = 0c 59 5c 07 = y2
w6 = w5 ⊕ w2 = 97 fe 72 3f        Rcon (2) = 02 00 00 00
w7 = w6 ⊕ w3 = 38 81 15 a7        y2 ⊕ Rcon (2) = 0e 59 5c 07 = z2

w8 = w4 ⊕ z2 = d2 c9 6b b7        RotWord (w11) = ff d3 c6 e6 = x3
w9 = w8 ⊕ w5 = 49 80 b4 5e        SubWord (x3) = 16 66 b4 8e = y3
w10 = w9 ⊕ w6 = de 7e c6 61       Rcon (3) = 04 00 00 00
w11 = w10 ⊕ w7 = e6 ff d3 c6      y3 ⊕ Rcon (3) = 12 66 b4 8e = z3

w12 = w8 ⊕ z3 = c0 af df 39       RotWord (w15) = ae 7e c0 b1 = x4
w13 = w12 ⊕ w9 = 89 2f 6b 67      SubWord (x4) = e4 f3 ba c8 = y4
w14 = w13 ⊕ w10 = 57 51 ad 06     Rcon (4) = 08 00 00 00
w15 = w14 ⊕ w11 = b1 ae 7e c0     y4 ⊕ Rcon (4) = ec f3 ba c8 = z4

w16 = w12 ⊕ z4 = 2c 5c 65 f1      RotWord (w19) = 8c dd 50 43 = x5
w17 = w16 ⊕ w13 = a5 73 0e 96     SubWord (x5) = 64 c1 53 1a = y5
w18 = w17 ⊕ w14 = f2 22 a3 90     Rcon (5) = 10 00 00 00
w19 = w18 ⊕ w15 = 43 8c dd 50     y5 ⊕ Rcon (5) = 74 c1 53 1a = z5

w20 = w16 ⊕ z5 = 58 9d 36 eb      RotWord (w23) = 40 46 bd 4c = x6
w21 = w20 ⊕ w17 = fd ee 38 7d     SubWord (x6) = 09 5a 7a 29 = y6
w22 = w21 ⊕ w18 = 0f cc 9b ed     Rcon (6) = 20 00 00 00
w23 = w22 ⊕ w19 = 4c 40 46 bd     y6 ⊕ Rcon (6) = 29 5a 7a 29 = z6

w24 = w20 ⊕ z6 = 71 c7 4c c2      RotWord (w27) = a5 a9 ef cf = x7
w25 = w24 ⊕ w21 = 8c 29 74 bf     SubWord (x7) = 06 d3 df 8a = y7
w26 = w25 ⊕ w22 = 83 e5 ef 52     Rcon (7) = 40 00 00 00
w27 = w26 ⊕ w23 = cf a5 a9 ef     y7 ⊕ Rcon (7) = 46 d3 df 8a = z7

w28 = w24 ⊕ z7 = 37 14 93 48      RotWord (w31) = 7d a1 4a f7 = x8
w29 = w28 ⊕ w25 = bb 3d e7 f7     SubWord (x8) = ff 32 d6 68 = y8
w30 = w29 ⊕ w26 = 38 d8 08 a5     Rcon (8) = 80 00 00 00
w31 = w30 ⊕ w27 = f7 7d a1 4a     y8 ⊕ Rcon (8) = 7f 32 d6 68 = z8

w32 = w28 ⊕ z8 = 48 26 45 20      RotWord (w35) = be 0b 38 3c = x9
w33 = w32 ⊕ w29 = f3 1b a2 d7     SubWord (x9) = ae 2b 07 eb = y9
w34 = w33 ⊕ w30 = cb c3 aa 72     Rcon (9) = 1B 00 00 00
w35 = w34 ⊕ w31 = 3c be 0b 38     y9 ⊕ Rcon (9) = b5 2b 07 eb = z9

w36 = w32 ⊕ z9 = fd 0d 42 cb      RotWord (w39) = 6b 41 56 f9 = x10
w37 = w36 ⊕ w33 = 0e 16 e0 1c     SubWord (x10) = 7f 83 b1 99 = y10
w38 = w37 ⊕ w34 = c5 d5 4a 6e     Rcon (10) = 36 00 00 00
w39 = w38 ⊕ w35 = f9 6b 41 56     y10 ⊕ Rcon (10) = 49 83 b1 99 = z10

w40 = w36 ⊕ z10 = b4 8e f3 52
w41 = w40 ⊕ w37 = ba 98 13 4e
w42 = w41 ⊕ w38 = 7f 4d 59 20
w43 = w42 ⊕ w39 = 86 26 18 76
Table 6.3 Key Expansion for AES Example

Next, Table 6.4 shows the progression of State through the AES encryption
process. The first column shows the value of State at the start of a round. For the
first row, State is just the matrix arrangement of the plaintext. The second, third,
and fourth columns show the value of State for that round after the SubBytes,
ShiftRows, and MixColumns transformations, respectively. The fifth column shows the
round key. You can verify that these round keys equate with those shown in Table 6.3.
The first column shows the value of State resulting from the bitwise XOR of State
after the preceding MixColumns with the round key for the preceding round.
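The round keys of Table 6.3 can be reproduced with a short sketch of the key expansion for a 128-bit key. The names used here (`gf_mul`, `rotl8`, `sbox_entry`, `rot_word`, `sub_word`, `key_expansion`) are hypothetical, words are held as 32-bit integers with the first key byte in the most significant position, and the S-box is regenerated from its definition so that the example is self-contained.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def rotl8(b: int, n: int) -> int:
    """Rotate an 8-bit value left by n positions."""
    return ((b << n) | (b >> (8 - n))) & 0xFF


def sbox_entry(x: int) -> int:
    """Inverse in GF(2^8) ({00} maps to itself), then the affine transformation."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]


def rot_word(w: int) -> int:
    """[B0, B1, B2, B3] -> [B1, B2, B3, B0]."""
    return ((w << 8) & 0xFFFFFFFF) | (w >> 24)


def sub_word(w: int) -> int:
    """Apply the S-box to each of the four bytes of a word."""
    return int.from_bytes(bytes(SBOX[b] for b in w.to_bytes(4, "big")), "big")


def key_expansion(key: bytes) -> list:
    """Expand a 16-byte key into 44 words, following the KeyExpansion pseudocode."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    w = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(4)]
    rc = 0x01                                  # RC[1]
    for i in range(4, 44):
        temp = w[i - 1]
        if i % 4 == 0:
            temp = sub_word(rot_word(temp)) ^ (rc << 24)   # Rcon = (RC, 0, 0, 0)
            rc = gf_mul(rc, 0x02)              # RC[j] = 2 * RC[j - 1]
        w.append(w[i - 4] ^ temp)
    return w


w = key_expansion(bytes.fromhex("0f1571c947d9e8590cb7add6af7f6798"))
for r in range(11):
    print(f"round {r:2d}:", " ".join(f"{x:08x}" for x in w[4 * r:4 * r + 4]))
```

Each printed line holds the four words of one round key, which can be compared with the left-hand column of Table 6.3 (for example, w4 = dc9037b0 and w43 = 86261876).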
Start of Round    After SubBytes    After ShiftRows   After MixColumns  Round Key
01 89 fe 76                                                             0f 47 0c af
23 ab dc 54                                                             15 d9 b7 7f
45 cd ba 32                                                             71 e8 ad 67
67 ef 98 10                                                             c9 59 d6 98

0e ce f2 d9       ab 8b 89 35       ab 8b 89 35       b9 94 57 75       dc 9b 97 38
36 72 6b 2b       05 40 7f f1       40 7f f1 05       e4 8e 16 51       90 49 fe 81
34 25 17 55       18 3f f0 fc       f0 fc 18 3f       47 20 9a 3f       37 df 72 15
ae b6 4e 88       e4 4e 2f c4       c4 e4 4e 2f       c5 d6 f5 3b       b0 e9 3f a7

65 0f c0 4d       4d 76 ba e3       4d 76 ba e3       8e 22 db 12       d2 49 de e6
74 c7 e8 d0       92 c6 9b 70       c6 9b 70 92       b2 f2 dc 92       c9 80 7e ff
70 ff e8 2a       51 16 9b e5       9b e5 51 16       df 80 f7 c1       6b b4 c6 d3
75 3f ca 9c       9d 75 74 de       de 9d 75 74       2d c5 1e 52       b7 5e 61 c6

5c 6b 05 f4       4a 7f 6b bf       4a 7f 6b bf       b1 c1 0b cc       c0 89 57 b1
7b 72 a2 6d       21 40 3a 3c       40 3a 3c 21       ba f3 8b 07       af 2f 51 ae
b4 34 31 12       8d 18 c7 c9       c7 c9 8d 18       f9 1f 6a c3       df 6b ad 7e
9a 9b 7f 94       b8 14 d2 22       22 b8 14 d2       1d 19 24 5c       39 67 06 c0

71 48 5c 7d       a3 52 4a ff       a3 52 4a ff       d4 11 fe 0f       2c a5 f2 43
15 dc da a9       59 86 57 d3       86 57 d3 59       3b 44 06 73       5c 73 22 8c
26 74 c7 bd       f7 92 c6 7a       c6 7a f7 92       cb ab 62 37       65 0e a3 dd
24 7e 22 9c       36 f3 93 de       de 36 f3 93       19 b7 07 ec       f1 96 90 50

f8 b4 0c 4c       41 8d fe 29       41 8d fe 29       2a 47 c4 48       58 fd 0f 4c
67 37 24 ff       85 9a 36 16       9a 36 16 85       83 e8 18 ba       9d ee cc 40
ae a5 c1 ea       e4 06 78 87       78 87 e4 06       84 18 27 23       36 38 9b 46
e8 21 97 bc       9b fd 88 65       65 9b fd 88       eb 10 0a f3       eb 7d ed bd

72 ba cb 04       40 f4 1f f2       40 f4 1f f2       7b 05 42 4a       71 8c 83 cf
1e 06 d4 fa       72 6f 48 2d       6f 48 2d 72       1e d0 20 40       c7 29 e5 a5
b2 20 bc 65       37 b7 65 4d       65 4d 37 b7       94 83 18 52       4c 74 ef a9
00 6d e7 4e       63 3c 94 2f       2f 63 3c 94       94 c4 43 fb       c2 bf 52 ef

0a 89 c1 85       67 a7 78 97       67 a7 78 97       ec 1a c0 80       37 bb 38 f7
d9 f9 c5 e5       35 99 a6 d9       99 a6 d9 35       0c 50 53 c7       14 3d d8 7d
d8 f7 f7 fb       61 68 68 0f       68 0f 61 68       3b d7 00 ef       93 e7 08 a1
56 7b 11 14       b1 21 82 fa       fa b1 21 82       b7 22 72 e0       48 f7 a5 4a

db a1 f8 77       b9 32 41 f5       b9 32 41 f5       b1 1a 44 17       48 f3 cb 3c
18 6d 8b ba       ad 3c 3d f4       3c 3d f4 ad       3d 2f ec b6       26 1b c3 be
a8 30 08 4e       c2 04 30 2f       30 2f c2 04       0a 6b 2f 42       45 a2 aa 0b
ff d5 d7 aa       16 03 0e ac       ac 16 03 0e       9f 68 f3 b1       20 d7 72 38

f9 e9 8f 2b       99 1e 73 f1       99 1e 73 f1       31 30 3a c2       fd 0e c5 f9
1b 34 2f 08       af 18 15 30       18 15 30 af       ac 71 8c c4       0d 16 d5 6b
4f c9 85 49       84 dd 97 3b       97 3b 84 dd       46 65 48 eb       42 e0 4a 41
bf bf 81 89       08 08 0c a7       a7 08 08 0c       6a 1c 31 62       cb 1c 6e 56

cc 3e ff 3b       4b b2 16 e2       4b b2 16 e2                         b4 ba 7f 86
a1 67 59 af       32 85 cb 79       85 cb 79 32                         8e 98 4d 26
04 85 02 aa       f2 97 77 ac       77 ac f2 97                         f3 13 59 18
a1 00 5f 34       32 63 cf 18       18 32 63 cf                         52 4e 20 76

ff 08 69 64
0b 53 34 14
84 bf ab 8f
4a 7c 43 b9
Table 6.4 AES Example

Avalanche Effect
If a small change in the key or plaintext were to produce a corresponding small
change in the ciphertext, this might be used to effectively reduce the size of the
plaintext (or key) space to be searched. What is desired is the avalanche effect, in
which a small change in plaintext or key produces a large change in the ciphertext.

Round                                         Number of Bits that Differ
         0123456789abcdeffedcba9876543210     1
         0023456789abcdeffedcba9876543210
0        0e3634aece7225b6f26b174ed92b5588     1
         0f3634aece7225b6f26b174ed92b5588
1        657470750fc7ff3fc0e8e8ca4dd02a9c     20
         c4a9ad090fc7ff3fc0e8e8ca4dd02a9c
2        5c7bb49a6b72349b05a2317ff46d1294     58
         fe2ae569f7ee8bb8c1f5a2bb37ef53d5
3        7115262448dc747e5cdac7227da9bd9c     59
         ec093dfb7c45343d689017507d485e62
4        f867aee8b437a5210c24c1974cffeabc     61
         43efdb697244df808e8d9364ee0ae6f5
5        721eb200ba06206dcbd4bce704fa654e     68
         7b28a5d5ed643287e006c099bb375302
6        0ad9d85689f9f77bc1c5f71185e5fb14     64
         3bc2d8b6798d8ac4fe36a1d891ac181a
7        db18a8ffa16d30d5f88b08d777ba4eaa     67
         9fb8b5452023c70280e5c4bb9e555a4b
8        f91b4fbfe934c9bf8f2f85812b084989     65
         20264e1126b219aef7feb3f9b2d6de40
9        cca104a13e678500ff59025f3bafaa34     61
         b56a0341b2290ba7dfdfbddcd8578205
10       ff0b844a0853bf7c6934ab4364148fb9     58
         612b89398d0600cde116227ce72433f0
Table 6.5 Avalanche Effect in AES: Change in Plaintext
Using the example from Table 6.4, Table 6.5 shows the result when the eighth bit
of the plaintext is changed. The second column of the table shows the value of the
State matrix at the end of each round for the two plaintexts. Note that after just
one round, 20 bits of the State vector differ. After two rounds, close to half the
bits differ. This magnitude of difference propagates through the remaining rounds.
A bit difference in approximately half the positions is the most desirable outcome.
Clearly, if almost all the bits are changed, this would be logically equivalent to
almost none of the bits being changed. Put another way, if we select two plaintexts
at random, we would expect the two plaintexts to differ in about half of the bit
positions and the two ciphertexts to also differ in about half the positions.
Table 6.6 shows the change in State matrix values when the same plaintext is
used and the two keys differ in the eighth bit. That is, for the second case, the key
is 0e1571c947d9e8590cb7add6af7f6798. Again, one round produces a significant
change, and the magnitude of change after all subsequent rounds is roughly half the
bits. Thus, based on this example, AES exhibits a very strong avalanche effect.
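The "number of bits that differ" column is simply the Hamming distance between the two State values. A minimal sketch, using the hypothetical helper name `bits_that_differ` and three pairs of values copied from Table 6.5:

```python
def bits_that_differ(hex_a: str, hex_b: str) -> int:
    """Hamming distance between two equal-length hexadecimal strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("inputs must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


pairs = {
    "plaintexts": ("0123456789abcdeffedcba9876543210",
                   "0023456789abcdeffedcba9876543210"),
    "after round 1": ("657470750fc7ff3fc0e8e8ca4dd02a9c",
                      "c4a9ad090fc7ff3fc0e8e8ca4dd02a9c"),
    "after round 10": ("ff0b844a0853bf7c6934ab4364148fb9",
                       "612b89398d0600cde116227ce72433f0"),
}

for label, (a, b) in pairs.items():
    print(f"{label}: {bits_that_differ(a, b)} of {4 * len(a)} bits differ")
```

For these three pairs the counts are 1, 20, and 58 out of 128 bits, in agreement with the table; 64 would be exactly half.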
Round                                         Number of Bits that Differ
         0123456789abcdeffedcba9876543210     0
         0123456789abcdeffedcba9876543210
0        0e3634aece7225b6f26b174ed92b5588     1
         0f3634aece7225b6f26b174ed92b5588
1        657470750fc7ff3fc0e8e8ca4dd02a9c     22
         c5a9ad090ec7ff3fc1e8e8ca4cd02a9c
2        5c7bb49a6b72349b05a2317ff46d1294     58
         90905fa9563356d15f3760f3b8259985
3        7115262448dc747e5cdac7227da9bd9c     67
         18aeb7aa794b3b66629448d575c7cebf
4        f867aee8b437a5210c24c1974cffeabc     63
         f81015f993c978a876ae017cb49e7eec
5        721eb200ba06206dcbd4bce704fa654e     81
         5955c91b4e769f3cb4a94768e98d5267
6        0ad9d85689f9f77bc1c5f71185e5fb14     70
         dc60a24d137662181e45b8d3726b2920
7        db18a8ffa16d30d5f88b08d777ba4eaa     74
         fe8343b8f88bef66cab7e977d005a03c
8        f91b4fbfe934c9bf8f2f85812b084989     67
         da7dad581d1725c5b72fa0f9d9d1366a
9        cca104a13e678500ff59025f3bafaa34     59
         0ccb4c66bbfd912f4b511d72996345e0
10       ff0b844a0853bf7c6934ab4364148fb9     53
         fc8923ee501a7d207ab670686839996b
Table 6.6 Avalanche Effect in AES: Change in Key

Note that this avalanche effect is stronger than that for DES (Table 4.2), which
requires three rounds to reach a point at which approximately half the bits are
changed, both for a bit change in the plaintext and a bit change in the key.

6.6 AES IMPLEMENTATION
Equivalent Inverse Cipher
As was mentioned, the AES decryption cipher is not identical to the encryption
cipher (Figure 6.3). That is, the sequence of transformations for decryption differs
from that for encryption, although the form of the key schedules for encryption
and decryption is the same. This has the disadvantage that two separate software
or firmware modules are needed for applications that require both encryption and
decryption. There is, however, an equivalent version of the decryption algorithm
that has the same structure as the encryption algorithm. The equivalent version has
the same sequence of transformations as the encryption algorithm (with
transformations replaced by their inverses).
To achieve this equivalence, a change in key schedule is needed.
Two separate changes are needed to bring the decryption structure in line with
the encryption structure. As illustrated in Figure 6.3, an encryption round has
the structure SubBytes, ShiftRows, MixColumns, AddRoundKey. The standard
decryption round has the structure InvShiftRows, InvSubBytes, AddRoundKey,
InvMixColumns. Thus, the first two stages of the decryption round need to be
interchanged, and the second two stages of the decryption round need to be interchanged.
INTERCHANGING INVSHIFTROWS AND INVSUBBYTES InvShiftRows affects the sequence
of bytes in State but does not alter byte contents and does not depend on
byte contents to perform its transformation. InvSubBytes affects the contents of
bytes in State but does not alter byte sequence and does not depend on byte sequence
to perform its transformation. Thus, these two operations commute and
can be interchanged. For a given State Si,
InvShiftRows [InvSubBytes (Si)] = InvSubBytes [InvShiftRows (Si)]
INTERCHANGING ADDROUNDKEY AND INVMIXCOLUMNS The transformations
AddRoundKey and InvMixColumns do not alter the sequence of bytes in State.
If we view the key as a sequence of words, then both AddRoundKey and
InvMixColumns operate on State one column at a time. These two operations are
linear with respect to the column input. That is, for a given State Si and a given
round key wj,
InvMixColumns (Si ⊕ wj) = [InvMixColumns (Si)] ⊕ [InvMixColumns (wj)]
To see this, suppose that the first column of State Si is the sequence (y0, y1, y2, y3)
and the first column of the round key wj is (k0, k1, k2, k3). Then we need to show
$$
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix} y_0 \oplus k_0 \\ y_1 \oplus k_1 \\ y_2 \oplus k_2 \\ y_3 \oplus k_3 \end{bmatrix}
=
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix}
\oplus
\begin{bmatrix}
0E & 0B & 0D & 09 \\
09 & 0E & 0B & 0D \\
0D & 09 & 0E & 0B \\
0B & 0D & 09 & 0E
\end{bmatrix}
\begin{bmatrix} k_0 \\ k_1 \\ k_2 \\ k_3 \end{bmatrix}
$$
Let us demonstrate that for the first column entry.
We need to show
[{0E} • (y0 ⊕ k0)] ⊕ [{0B} • (y1 ⊕ k1)] ⊕ [{0D} • (y2 ⊕ k2)] ⊕ [{09} • (y3 ⊕ k3)]
= [{0E} • y0] ⊕ [{0B} • y1] ⊕ [{0D} • y2] ⊕ [{09} • y3]
⊕ [{0E} • k0] ⊕ [{0B} • k1] ⊕ [{0D} • k2] ⊕ [{09} • k3]
This equation is valid by inspection. Thus, we can interchange AddRoundKey
and InvMixColumns, provided that we first apply InvMixColumns to the round key.
Note that we do not need to apply InvMixColumns to the round key for the input
to the first AddRoundKey transformation (preceding the first round) nor to the
last AddRoundKey transformation (in round 10). This is because these two
AddRoundKey transformations are not interchanged with InvMixColumns to produce
the equivalent decryption algorithm.
Figure 6.10 illustrates the equivalent decryption algorithm.

[Figure 6.10 Equivalent Inverse Cipher]

Implementation Aspects
The Rijndael proposal [DAEM99] provides some suggestions for efficient
implementation on 8-bit processors, typical for current smart cards, and on 32-bit
processors, typical for PCs.
8-BIT PROCESSOR AES can be implemented very efficiently on an 8-bit processor.
AddRoundKey is a bytewise XOR operation. ShiftRows is a simple byte-shifting
operation. SubBytes operates at the byte level and only requires a table of
256 bytes.
The transformation MixColumns requires matrix multiplication in the field
GF(2^8), which means that all operations are carried out on bytes. MixColumns only
requires multiplication by {02} and {03}, which, as we have seen, involves simple
shifts, conditional XORs, and XORs.
MixColumns can be implemented in a more efficient way that eliminates the shifts and conditional XORs. Equation set (6.4) shows the equations for the MixColumns transformation on a single column. Using the identity $\{03\} \cdot x = (\{02\} \cdot x) \oplus x$, we can rewrite Equation set (6.4) as follows.

$$\begin{aligned} Tmp &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\ s'_{0,j} &= s_{0,j} \oplus Tmp \oplus [2 \cdot (s_{0,j} \oplus s_{1,j})] \\ s'_{1,j} &= s_{1,j} \oplus Tmp \oplus [2 \cdot (s_{1,j} \oplus s_{2,j})] \\ s'_{2,j} &= s_{2,j} \oplus Tmp \oplus [2 \cdot (s_{2,j} \oplus s_{3,j})] \\ s'_{3,j} &= s_{3,j} \oplus Tmp \oplus [2 \cdot (s_{3,j} \oplus s_{0,j})] \end{aligned} \qquad (6.9)$$

Equation set (6.9) is verified by expanding and eliminating terms.

The multiplication by {02} involves a shift and a conditional XOR. Such an implementation may be vulnerable to a timing attack of the sort described in Section 4.4. To counter this attack and to increase processing efficiency at the cost of some storage, the multiplication can be replaced by a table lookup. Define the 256-byte table X2, such that $X2[i] = \{02\} \cdot i$. Then Equation set (6.9) can be rewritten as

$$\begin{aligned} Tmp &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\ s'_{0,j} &= s_{0,j} \oplus Tmp \oplus X2[s_{0,j} \oplus s_{1,j}] \\ s'_{1,j} &= s_{1,j} \oplus Tmp \oplus X2[s_{1,j} \oplus s_{2,j}] \\ s'_{2,j} &= s_{2,j} \oplus Tmp \oplus X2[s_{2,j} \oplus s_{3,j}] \\ s'_{3,j} &= s_{3,j} \oplus Tmp \oplus X2[s_{3,j} \oplus s_{0,j}] \end{aligned}$$

32-BIT PROCESSOR The implementation described in the preceding subsection uses only 8-bit operations. For a 32-bit processor, a more efficient implementation can be achieved if operations are defined on 32-bit words. To show this, we first define the four transformations of a round in algebraic form. Suppose we begin with a State matrix consisting of elements $a_{i,j}$ and a round-key matrix consisting of elements $k_{i,j}$. Then the transformations can be expressed as follows.
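SubBytes:
$$b_{i,j} = S[a_{i,j}]$$

ShiftRows:
$$\begin{bmatrix} c_{0,j} \\ c_{1,j} \\ c_{2,j} \\ c_{3,j} \end{bmatrix} = \begin{bmatrix} b_{0,j} \\ b_{1,j-1} \\ b_{2,j-2} \\ b_{3,j-3} \end{bmatrix}$$

MixColumns:
$$\begin{bmatrix} d_{0,j} \\ d_{1,j} \\ d_{2,j} \\ d_{3,j} \end{bmatrix} = \begin{bmatrix} \mathtt{02} & \mathtt{03} & \mathtt{01} & \mathtt{01} \\ \mathtt{01} & \mathtt{02} & \mathtt{03} & \mathtt{01} \\ \mathtt{01} & \mathtt{01} & \mathtt{02} & \mathtt{03} \\ \mathtt{03} & \mathtt{01} & \mathtt{01} & \mathtt{02} \end{bmatrix} \begin{bmatrix} c_{0,j} \\ c_{1,j} \\ c_{2,j} \\ c_{3,j} \end{bmatrix}$$

AddRoundKey:
$$\begin{bmatrix} e_{0,j} \\ e_{1,j} \\ e_{2,j} \\ e_{3,j} \end{bmatrix} = \begin{bmatrix} d_{0,j} \\ d_{1,j} \\ d_{2,j} \\ d_{3,j} \end{bmatrix} \oplus \begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}$$

In the ShiftRows equation, the column indices are taken mod 4. We can combine all of these expressions into a single equation:

$$\begin{bmatrix} e_{0,j} \\ e_{1,j} \\ e_{2,j} \\ e_{3,j} \end{bmatrix} = \begin{bmatrix} \mathtt{02} & \mathtt{03} & \mathtt{01} & \mathtt{01} \\ \mathtt{01} & \mathtt{02} & \mathtt{03} & \mathtt{01} \\ \mathtt{01} & \mathtt{01} & \mathtt{02} & \mathtt{03} \\ \mathtt{03} & \mathtt{01} & \mathtt{01} & \mathtt{02} \end{bmatrix} \begin{bmatrix} S[a_{0,j}] \\ S[a_{1,j-1}] \\ S[a_{2,j-2}] \\ S[a_{3,j-3}] \end{bmatrix} \oplus \begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}$$

$$= \left( \begin{bmatrix} \mathtt{02} \\ \mathtt{01} \\ \mathtt{01} \\ \mathtt{03} \end{bmatrix} \cdot S[a_{0,j}] \right) \oplus \left( \begin{bmatrix} \mathtt{03} \\ \mathtt{02} \\ \mathtt{01} \\ \mathtt{01} \end{bmatrix} \cdot S[a_{1,j-1}] \right) \oplus \left( \begin{bmatrix} \mathtt{01} \\ \mathtt{03} \\ \mathtt{02} \\ \mathtt{01} \end{bmatrix} \cdot S[a_{2,j-2}] \right) \oplus \left( \begin{bmatrix} \mathtt{01} \\ \mathtt{01} \\ \mathtt{03} \\ \mathtt{02} \end{bmatrix} \cdot S[a_{3,j-3}] \right) \oplus \begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}$$

In the second equation, we are expressing the matrix multiplication as a linear combination of vectors. We define four 256-word (1024-byte) tables as follows.

$$T_0[x] = \begin{bmatrix} \mathtt{02} \\ \mathtt{01} \\ \mathtt{01} \\ \mathtt{03} \end{bmatrix} \cdot S[x] \qquad T_1[x] = \begin{bmatrix} \mathtt{03} \\ \mathtt{02} \\ \mathtt{01} \\ \mathtt{01} \end{bmatrix} \cdot S[x] \qquad T_2[x] = \begin{bmatrix} \mathtt{01} \\ \mathtt{03} \\ \mathtt{02} \\ \mathtt{01} \end{bmatrix} \cdot S[x] \qquad T_3[x] = \begin{bmatrix} \mathtt{01} \\ \mathtt{01} \\ \mathtt{03} \\ \mathtt{02} \end{bmatrix} \cdot S[x]$$

Thus, each table takes as input a byte value and produces a column vector (a 32-bit word) that is a function of the S-box entry for that byte value. These tables can be calculated in advance. We can define a round function operating on a column in the following fashion.

$$\begin{bmatrix} s'_{0,j} \\ s'_{1,j} \\ s'_{2,j} \\ s'_{3,j} \end{bmatrix} = T_0[s_{0,j}] \oplus T_1[s_{1,j-1}] \oplus T_2[s_{2,j-2}] \oplus T_3[s_{3,j-3}] \oplus \begin{bmatrix} k_{0,j} \\ k_{1,j} \\ k_{2,j} \\ k_{3,j} \end{bmatrix}$$

As a result, an implementation based on the preceding equation requires only four table lookups and four XORs per column per round, plus 4 Kbytes to store the tables. The developers of Rijndael believe that this compact, efficient implementation was probably one of the most important factors in the selection of Rijndael for AES.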
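The table-based round function can be sketched in Python as shown below. This is an illustration under stated assumptions rather than a reference implementation: the helper names (`gmul`, `build_sbox`, `round_column`, `round_column_reference`) are invented for the example, the S-box is computed from its definition (multiplicative inverse followed by the affine transformation) instead of being typed in, and each word stores row 0 of a column in its most significant byte. The sketch also assumes the FIPS 197 indexing of ShiftRows, in which row i is rotated left by i positions, so output column j reads row i from column (j + i) mod 4; flip the sign if your column numbering runs the other way.

```python
def xtime(b):
    """Multiply a byte by {02} in GF(2^8) (assumed AES modulus 0x11B)."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def gmul(a, b):
    """General GF(2^8) multiplication by shift-and-add."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a, b = xtime(a), b >> 1
    return r


def build_sbox():
    """S-box: multiplicative inverse, then the affine transformation."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gmul(x, y) == 1)
        s = inv
        for shift in range(1, 5):            # XOR with four left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def rotr8(w):
    """Rotate a 32-bit word right by one byte."""
    return ((w >> 8) | (w << 24)) & 0xFFFFFFFF


SBOX = build_sbox()
# T0[x] packs the column (02*S[x], S[x], S[x], 03*S[x]); T1..T3 are byte rotations.
T0 = [(gmul(2, s) << 24) | (s << 16) | (s << 8) | gmul(3, s) for s in SBOX]
T1 = [rotr8(w) for w in T0]
T2 = [rotr8(w) for w in T1]
T3 = [rotr8(w) for w in T2]


def round_column(state, j, key_word):
    """Output column j of one full round: four lookups and four XORs."""
    return (T0[state[0][j]] ^ T1[state[1][(j + 1) % 4]]
            ^ T2[state[2][(j + 2) % 4]] ^ T3[state[3][(j + 3) % 4]] ^ key_word)


def round_column_reference(state, j, key_word):
    """The same column computed stage by stage, for comparison."""
    c = [SBOX[state[i][(j + i) % 4]] for i in range(4)]   # SubBytes + ShiftRows
    m = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
    d = [gmul(m[i][0], c[0]) ^ gmul(m[i][1], c[1])
         ^ gmul(m[i][2], c[2]) ^ gmul(m[i][3], c[3]) for i in range(4)]
    return ((d[0] << 24) | (d[1] << 16) | (d[2] << 8) | d[3]) ^ key_word


if __name__ == "__main__":
    # state[row][column]; values chosen only for illustration
    state = [[0x19, 0xA0, 0x9A, 0xE9],
             [0x3D, 0xF4, 0xC6, 0xF8],
             [0xE3, 0xE2, 0x8D, 0x48],
             [0xBE, 0x2B, 0x2A, 0x08]]
    print(hex(round_column(state, 0, 0xA0FAFE17)))
```

Building T1 through T3 as byte rotations of T0 reflects the fact that the four column vectors in the table definitions are rotations of one another; an implementation short on memory could keep only T0 and rotate at lookup time, trading three rotations per column for 3 Kbytes of storage.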
6.7 KEY TERMS, REVIEW QUESTIONS, AND PROBLEMS

Key Terms

Advanced Encryption Standard (AES)
avalanche effect
field
finite field
irreducible polynomial
key expansion
National Institute of Standards and Technology (NIST)
Rijndael
S-box

Review Questions

6.1 What was the original set of criteria used by NIST to evaluate candidate AES ciphers?
6.2 What was the final set of criteria used by NIST to evaluate candidate AES ciphers?
6.3 What is the difference between Rijndael and AES?
6.4 What is the purpose of the State array?
6.5 How is the S-box constructed?
6.6 Briefly describe SubBytes.
6.7 Briefly describe ShiftRows.
6.8 How many bytes in State are affected by ShiftRows?
6.9 Briefly describe MixColumns.
6.10 Briefly describe AddRoundKey.
6.11 Briefly describe the key expansion algorithm.
6.12 What is the difference between SubBytes and SubWord?
6.13 What is the difference between ShiftRows and RotWord?
6.14 What is the difference between the AES decryption algorithm and the equivalent inverse cipher?

Problems

6.1 In the discussion of MixColumns and InvMixColumns, it was stated that
$b(x) = a^{-1}(x) \bmod (x^4 + 1)$
where $a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}$ and $b(x) = \{0B\}x^3 + \{0D\}x^2 + \{09\}x + \{0E\}$. Show that this is true.
6.2 a. What is $\{02\}^{-1}$ in GF($2^8$)?
b. Verify the entry for {02} in the S-box.
6.3 Show the first eight words of the key expansion for a 128-bit key of all ones.
6.4 Given the plaintext {0F0E0D0C0B0A09080706050403020100} and the key {02020202020202020202020202020202}:
a. Show the original contents of State, displayed as a 4 × 4 matrix.
b. Show the value of State after initial AddRoundKey.
c. Show the value of State after SubBytes.
d. Show the value of State after ShiftRows.
e. Show the value of State after MixColumns.
6.5 Verify Equation (6.13) in Appendix 6A. That is, show that $x^i \bmod (x^4 + 1) = x^{i \bmod 4}$.
6.6 Compare AES to DES.
For each of the following elements of DES, indicate the comparable element in AES or explain why it is not needed in AES.
a. XOR of subkey material with the input to the f function
b. XOR of the f function output with the left half of the block
c. f function
d. permutation P
e. swapping of halves of the block
6.7 In the subsection on implementation aspects, it is mentioned that the use of tables helps thwart timing attacks. Suggest an alternative technique.
6.8 In the subsection on implementation aspects, a single algebraic equation is developed that describes the four stages of a typical round of the encryption algorithm. Provide the equivalent equation for the tenth round.
6.9 Compute the output of the MixColumns transformation for the following sequence of input bytes “A1 B2 C3 D4.” Apply the InvMixColumns transformation to the obtained result to verify your calculations. Change the first byte of the input from “A1” to “A3,” perform the MixColumns transformation again for the new input, and determine how many bits have changed in the output.
Note: You can perform all calculations by hand or write a program supporting these computations. If you choose to write a program, it should be written entirely by you; no use of libraries or public domain source code is allowed in this assignment.
6.10 Use the key 1010 1001 1100 0011 to encrypt the plaintext “hi” as expressed in ASCII as 0110 1000 0110 1001. The designers of S-AES got the ciphertext 0011 1110 1111 1011. Do you?
6.11 Show that the matrix given here, with entries in GF($2^4$), is the inverse of the matrix used in the MixColumns step of S-AES.
$$\begin{pmatrix} x^3 + 1 & x \\ x & x^3 + 1 \end{pmatrix}$$
6.12 Carefully write up a complete decryption of the ciphertext 0011 1110 1111 1011 using the key 1010 1001 1100 0011 and the S-AES algorithm. You should get the plaintext we started with in Problem 6.10. Note that the inverse of the S-boxes can be done with a reverse table lookup.
The inverse of the MixColumns step is given by the matrix in the previous problem.
6.13 Demonstrate that Equation (6.9) is equivalent to Equation (6.4).

Programming Problems

6.14 Create software that can encrypt and decrypt using S-AES. Test data: A binary plaintext of 0110 1111 0110 1011 encrypted with a binary key of 1010 0111 0011 1011 should give a binary ciphertext of 0000 0111 0011 1000. Decryption should work correspondingly.
6.15 Implement a differential cryptanalysis attack on 1-round S-AES.

APPENDIX 6A POLYNOMIALS WITH COEFFICIENTS IN GF($2^8$)

In Section 5.5, we discussed polynomial arithmetic in which the coefficients are in $Z_p$ and the polynomials are defined modulo a polynomial $m(x)$ whose highest power is some integer $n$. In this case, addition and multiplication of coefficients occurred within the field $Z_p$; that is, addition and multiplication were performed modulo $p$.

The AES document defines polynomial arithmetic for polynomials of degree 3 or less with coefficients in GF($2^8$). The following rules apply.

1. Addition is performed by adding corresponding coefficients in GF($2^8$). As was pointed out in Section 5.4, if we treat the elements of GF($2^8$) as 8-bit strings, then addition is equivalent to the XOR operation. So, if we have

$$a(x) = a_3x^3 + a_2x^2 + a_1x + a_0 \qquad (6.10)$$

and

$$b(x) = b_3x^3 + b_2x^2 + b_1x + b_0 \qquad (6.11)$$

then

$$a(x) + b(x) = (a_3 \oplus b_3)x^3 + (a_2 \oplus b_2)x^2 + (a_1 \oplus b_1)x + (a_0 \oplus b_0)$$

2. Multiplication is performed as in ordinary polynomial multiplication with two refinements:
a. Coefficients are multiplied in GF($2^8$).
b. The resulting polynomial is reduced mod $(x^4 + 1)$.

We need to keep straight which polynomial we are talking about. Recall from Section 5.6 that each element of GF($2^8$) is a polynomial of degree 7 or less with binary coefficients, and multiplication is carried out modulo a polynomial of degree 8.
Equivalently, each element of GF($2^8$) can be viewed as an 8-bit byte whose bit values correspond to the binary coefficients of the corresponding polynomial. For the sets defined in this section, we are defining a polynomial ring in which each element of this ring is a polynomial of degree 3 or less with coefficients in GF($2^8$), and multiplication is carried out modulo a polynomial of degree 4. Equivalently, each element of this ring can be viewed as a 4-byte word whose byte values are elements of GF($2^8$) that correspond to the 8-bit coefficients of the corresponding polynomial.

We denote the modular product of $a(x)$ and $b(x)$ by $a(x) \otimes b(x)$. To compute $d(x) = a(x) \otimes b(x)$, the first step is to perform a multiplication without the modulo operation and to collect coefficients of like powers. Let us express this as $c(x) = a(x) \times b(x)$. Then

$$c(x) = c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0 \qquad (6.12)$$

where

$$\begin{aligned} c_0 &= a_0 \cdot b_0 \\ c_1 &= (a_1 \cdot b_0) \oplus (a_0 \cdot b_1) \\ c_2 &= (a_2 \cdot b_0) \oplus (a_1 \cdot b_1) \oplus (a_0 \cdot b_2) \\ c_3 &= (a_3 \cdot b_0) \oplus (a_2 \cdot b_1) \oplus (a_1 \cdot b_2) \oplus (a_0 \cdot b_3) \\ c_4 &= (a_3 \cdot b_1) \oplus (a_2 \cdot b_2) \oplus (a_1 \cdot b_3) \\ c_5 &= (a_3 \cdot b_2) \oplus (a_2 \cdot b_3) \\ c_6 &= a_3 \cdot b_3 \end{aligned}$$

The final step is to perform the modulo operation

$$d(x) = c(x) \bmod (x^4 + 1)$$

That is, $d(x)$ must satisfy the equation

$$c(x) = [(x^4 + 1) \times q(x)] \oplus d(x)$$

such that the degree of $d(x)$ is 3 or less.

A practical technique for performing multiplication over this polynomial ring is based on the observation that

$$x^i \bmod (x^4 + 1) = x^{i \bmod 4} \qquad (6.13)$$

If we now combine Equations (6.12) and (6.13), we end up with

$$\begin{aligned} d(x) &= c(x) \bmod (x^4 + 1) \\ &= [c_6x^6 + c_5x^5 + c_4x^4 + c_3x^3 + c_2x^2 + c_1x + c_0] \bmod (x^4 + 1) \\ &= c_3x^3 + (c_2 \oplus c_6)x^2 + (c_1 \oplus c_5)x + (c_0 \oplus c_4) \end{aligned}$$

Expanding the $c_i$ coefficients, we have the following equations for the coefficients of $d(x)$.
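$$\begin{aligned} d_0 &= (a_0 \cdot b_0) \oplus (a_3 \cdot b_1) \oplus (a_2 \cdot b_2) \oplus (a_1 \cdot b_3) \\ d_1 &= (a_1 \cdot b_0) \oplus (a_0 \cdot b_1) \oplus (a_3 \cdot b_2) \oplus (a_2 \cdot b_3) \\ d_2 &= (a_2 \cdot b_0) \oplus (a_1 \cdot b_1) \oplus (a_0 \cdot b_2) \oplus (a_3 \cdot b_3) \\ d_3 &= (a_3 \cdot b_0) \oplus (a_2 \cdot b_1) \oplus (a_1 \cdot b_2) \oplus (a_0 \cdot b_3) \end{aligned}$$

This can be written in matrix form:

$$\begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ d_3 \end{bmatrix} = \begin{bmatrix} a_0 & a_3 & a_2 & a_1 \\ a_1 & a_0 & a_3 & a_2 \\ a_2 & a_1 & a_0 & a_3 \\ a_3 & a_2 & a_1 & a_0 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} \qquad (6.14)$$

MixColumns Transformation

In the discussion of MixColumns, it was stated that there were two equivalent ways of defining the transformation. The first is the matrix multiplication shown in Equation (6.3), which is repeated here:

$$\begin{bmatrix} \mathtt{02} & \mathtt{03} & \mathtt{01} & \mathtt{01} \\ \mathtt{01} & \mathtt{02} & \mathtt{03} & \mathtt{01} \\ \mathtt{01} & \mathtt{01} & \mathtt{02} & \mathtt{03} \\ \mathtt{03} & \mathtt{01} & \mathtt{01} & \mathtt{02} \end{bmatrix} \begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\ s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\ s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3} \end{bmatrix} = \begin{bmatrix} s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\ s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\ s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\ s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3} \end{bmatrix}$$

The second method is to treat each column of State as a four-term polynomial with coefficients in GF($2^8$). Each column is multiplied modulo $(x^4 + 1)$ by the fixed polynomial $a(x)$, given by

$$a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\}$$

From Equation (6.10), we have $a_3 = \{03\}$; $a_2 = \{01\}$; $a_1 = \{01\}$; and $a_0 = \{02\}$. For the $j$th column of State, we have the polynomial $col_j(x) = s_{3,j}x^3 + s_{2,j}x^2 + s_{1,j}x + s_{0,j}$. Substituting into Equation (6.14), we can express $d(x) = a(x) \otimes col_j(x)$ as

$$\begin{bmatrix} d_0 \\ d_1 \\ d_2 \\ d_3 \end{bmatrix} = \begin{bmatrix} a_0 & a_3 & a_2 & a_1 \\ a_1 & a_0 & a_3 & a_2 \\ a_2 & a_1 & a_0 & a_3 \\ a_3 & a_2 & a_1 & a_0 \end{bmatrix} \begin{bmatrix} s_{0,j} \\ s_{1,j} \\ s_{2,j} \\ s_{3,j} \end{bmatrix} = \begin{bmatrix} \mathtt{02} & \mathtt{03} & \mathtt{01} & \mathtt{01} \\ \mathtt{01} & \mathtt{02} & \mathtt{03} & \mathtt{01} \\ \mathtt{01} & \mathtt{01} & \mathtt{02} & \mathtt{03} \\ \mathtt{03} & \mathtt{01} & \mathtt{01} & \mathtt{02} \end{bmatrix} \begin{bmatrix} s_{0,j} \\ s_{1,j} \\ s_{2,j} \\ s_{3,j} \end{bmatrix}$$

which is equivalent to Equation (6.3).

Multiplication by x

Consider the multiplication of a polynomial in the ring by $x$: $c(x) = x \otimes b(x)$. We have

$$\begin{aligned} c(x) = x \otimes b(x) &= [x \times (b_3x^3 + b_2x^2 + b_1x + b_0)] \bmod (x^4 + 1) \\ &= (b_3x^4 + b_2x^3 + b_1x^2 + b_0x) \bmod (x^4 + 1) \\ &= b_2x^3 + b_1x^2 + b_0x + b_3 \end{aligned}$$

Thus, multiplication by $x$ corresponds to a 1-byte circular left shift of the 4 bytes in the word representing the polynomial.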
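The ring multiplication developed in this appendix can be sketched in a few lines of Python. This is an illustrative sketch only: `gmul` and `poly_mul` are names invented here, coefficients are listed low-order first (so `a[0]` holds $a_0$), and the GF($2^8$) modulus is assumed to be the AES polynomial 0x11B. The inner loop uses the observation $x^i \bmod (x^4 + 1) = x^{i \bmod 4}$ directly, by accumulating each product term into position (i + k) mod 4.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) (assumed AES modulus 0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def poly_mul(a, b):
    """Product of a(x) and b(x) modulo x^4 + 1, coefficients low-order first."""
    d = [0, 0, 0, 0]
    for i in range(4):
        for k in range(4):
            # x^(i+k) reduces to x^((i+k) mod 4); addition of coefficients is XOR
            d[(i + k) % 4] ^= gmul(a[i], b[k])
    return d


if __name__ == "__main__":
    a = [0x02, 0x01, 0x01, 0x03]      # a(x) = {03}x^3 + {01}x^2 + {01}x + {02}
    column = [0xD4, 0xBF, 0x5D, 0x30]  # an example column (s0, s1, s2, s3)
    print([hex(v) for v in poly_mul(a, column)])

    x = [0x00, 0x01, 0x00, 0x00]      # the polynomial x
    print([hex(v) for v in poly_mul(x, [0x10, 0x20, 0x30, 0x40])])
```

The second call multiplies by the polynomial $x$ and so only rotates the four coefficients, which is the byte rotation described in the text.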
If we represent the polynomial as a 4-byte column vector, then we have

$$\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} \mathtt{00} & \mathtt{00} & \mathtt{00} & \mathtt{01} \\ \mathtt{01} & \mathtt{00} & \mathtt{00} & \mathtt{00} \\ \mathtt{00} & \mathtt{01} & \mathtt{00} & \mathtt{00} \\ \mathtt{00} & \mathtt{00} & \mathtt{01} & \mathtt{00} \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}$$

CHAPTER 7

Block Cipher Operation

7.1 Multiple Encryption and Triple DES
    Double DES
    Triple DES with Two Keys
    Triple DES with Three Keys
7.2 Electronic Codebook
7.3 Cipher Block Chaining Mode
7.4 Cipher Feedback Mode
7.5 Output Feedback Mode
7.6 Counter Mode
7.7 XTS-AES Mode for Block-Oriented Storage Devices
    Tweakable Block Ciphers
    Storage Encryption Requirements
    Operation on a Single Block
    Operation on a Sector
7.8 Format-Preserving Encryption
    Motivation
    Difficulties in Designing an FPE
    Feistel Structure for Format-Preserving Encryption
    NIST Methods for Format-Preserving Encryption
7.9 Key Terms, Review Questions, and Problems

This chapter continues our discussion of symmetric ciphers. We begin with the topic of multiple encryption, looking in particular at the most widely used multiple-encryption scheme: triple DES.

The chapter next turns to the subject of block cipher modes of operation. We find that there are a number of different ways to apply a block cipher to plaintext, each with its own advantages and particular applications.

7.1 MULTIPLE ENCRYPTION AND TRIPLE DES

Because of its vulnerability to brute-force attack, DES, once the most widely used symmetric cipher, has been largely replaced by stronger encryption schemes. Two approaches have been taken. One approach is to design a completely new algorithm that is resistant to both cryptanalytic and brute-force attacks, of which AES is a prime example. The other approach, which preserves the existing investment in software and equipment, is to use multiple encryption with DES and multiple keys. We begin by examining the simplest example of this second alternative. We then look at the widely accepted triple DES (3DES) algorithm.
LEARNING OBJECTIVES

After studying this chapter, you should be able to:
◆ Analyze the security of multiple encryption schemes.
◆ Explain the meet-in-the-middle attack.
◆ Compare and contrast ECB, CBC, CFB, OFB, and counter modes of operation.
◆ Present an overview of the XTS-AES mode of operation.

Double DES

The simplest form of multiple encryption has two encryption stages and two keys (Figure 7.1a). Given a plaintext $P$ and two encryption keys $K_1$ and $K_2$, ciphertext $C$ is generated as

$$C = E(K_2, E(K_1, P))$$

Decryption requires that the keys be applied in reverse order:

$$P = D(K_1, D(K_2, C))$$

For DES, this scheme apparently involves a key length of 56 × 2 = 112 bits, and should result in a dramatic increase in cryptographic strength. But we need to examine the algorithm more closely.

REDUCTION TO A SINGLE STAGE Suppose it were true for DES, for all 56-bit key values, that given any two keys $K_1$ and $K_2$, it would be possible to find a key $K_3$ such that

$$E(K_2, E(K_1, P)) = E(K_3, P) \qquad (7.1)$$

If this were the case, then double encryption, and indeed any number of stages of multiple encryption with DES, would be useless because the result would be equivalent to a single encryption with a single 56-bit key.

On the face of it, it does not appear that Equation (7.1) is likely to hold. Consider that encryption with DES is a mapping of 64-bit blocks to 64-bit blocks. In fact, the mapping can be viewed as a permutation. That is, if we consider all $2^{64}$ possible input blocks, DES encryption with a specific key will map each block into a unique 64-bit block. Otherwise, if, say, two given input blocks mapped to the same output block, then decryption to recover the original plaintext would be impossible.
[Figure 7.1 Multiple Encryption: (a) Double encryption; (b) Triple encryption]

With $2^{64}$ possible inputs, how many different mappings are there that generate a permutation of the input blocks? The value is easily seen to be

$$(2^{64})! = 10^{347380000000000000000} > 10^{10^{20}}$$

On the other hand, DES defines one mapping for each different key, for a total number of mappings:

$$2^{56} < 10^{17}$$

Therefore, it is reasonable to assume that if DES is used twice with different keys, it will produce one of the many mappings that are not defined by a single application of DES. Although there was much supporting evidence for this assumption, it was not until 1992 that the assumption was proven [CAMP92].

MEET-IN-THE-MIDDLE ATTACK Thus, the use of double DES results in a mapping that is not equivalent to a single DES encryption. But there is a way to attack this scheme, one that does not depend on any particular property of DES but that will work against any block encryption cipher.

The algorithm, known as a meet-in-the-middle attack, was first described in [DIFF77]. It is based on the observation that, if we have

$$C = E(K_2, E(K_1, P))$$

then (see Figure 7.1a)

$$X = E(K_1, P) = D(K_2, C)$$

Given a known pair, $(P, C)$, the attack proceeds as follows. First, encrypt $P$ for all $2^{56}$ possible values of $K_1$. Store these results in a table and then sort the table by the values of $X$. Next, decrypt $C$ using all $2^{56}$ possible values of $K_2$. As each decryption is produced, check the result against the table for a match. If a match occurs, then test the two resulting keys against a new known plaintext–ciphertext pair. If the two keys produce the correct ciphertext, accept them as the correct keys.

For any given plaintext $P$, there are $2^{64}$ possible ciphertext values that could be produced by double DES.
Double DES uses, in effect, a 112-bit key, so that there are $2^{112}$ possible keys. Therefore, for a given plaintext $P$, the maximum number of different 112-bit keys that could produce a given ciphertext $C$ is $2^{112}/2^{64} = 2^{48}$. Thus, the foregoing procedure can produce about $2^{48}$ false alarms on the first $(P, C)$ pair. A similar argument indicates that with an additional 64 bits of known plaintext and ciphertext, the false alarm rate is reduced to $2^{48-64} = 2^{-16}$. Put another way, if the meet-in-the-middle attack is performed on two blocks of known plaintext–ciphertext, the probability that the correct keys are determined is $1 - 2^{-16}$. The result is that a known plaintext attack will succeed against double DES, which has a key size of 112 bits, with an effort on the order of $2^{56}$, which is not much more than the $2^{55}$ required for single DES.

Triple DES with Two Keys

An obvious counter to the meet-in-the-middle attack is to use three stages of encryption with three different keys. Using DES as the underlying algorithm, this approach is commonly referred to as 3DES, or Triple Data Encryption Algorithm (TDEA). As shown in Figure 7.1b, there are two versions of 3DES; one using two keys and one using three keys. NIST SP 800-67 (Recommendation for the Triple Data Encryption Block Cipher, January 2012) defines the two-key and three-key versions. We look first at the strength of the two-key version and then examine the three-key version.

Two-key triple encryption was first proposed by Tuchman [TUCH79]. The function follows an encrypt-decrypt-encrypt (EDE) sequence (Figure 7.1b):

$$C = E(K_1, D(K_2, E(K_1, P)))$$
$$P = D(K_1, E(K_2, D(K_1, C)))$$

There is no cryptographic significance to the use of decryption for the second stage.
Its only advantage is that it allows users of 3DES to decrypt data encrypted by users of the older single DES:

$$C = E(K_1, D(K_1, E(K_1, P))) = E(K_1, P)$$
$$P = D(K_1, E(K_1, D(K_1, C))) = D(K_1, C)$$

3DES with two keys is a relatively popular alternative to DES and has been adopted for use in the key management standards ANSI X9.17 and ISO 8732.¹

Currently, there are no practical cryptanalytic attacks on 3DES. Coppersmith [COPP94] notes that the cost of a brute-force key search on 3DES is on the order of $2^{112} \approx 5 \times 10^{33}$ and estimates that the cost of differential cryptanalysis suffers an exponential growth, compared to single DES, exceeding $10^{52}$.

It is worth looking at several proposed attacks on 3DES that, although not practical, give a flavor for the types of attacks that have been considered and that could form the basis for more successful future attacks.

The first serious proposal came from Merkle and Hellman [MERK81]. Their plan involves finding plaintext values that produce a first intermediate value of $A = 0$ (Figure 7.1b) and then using the meet-in-the-middle attack to determine the two keys. The level of effort is $2^{56}$, but the technique requires $2^{56}$ chosen plaintext–ciphertext pairs, which is a number unlikely to be provided by the holder of the keys.

A known-plaintext attack is outlined in [VANO90]. This method is an improvement over the chosen-plaintext approach but requires more effort. The attack is based on the observation that if we know $A$ and $C$ (Figure 7.1b), then the problem reduces to that of an attack on double DES. Of course, the attacker does not know $A$, even if $P$ and $C$ are known, as long as the two keys are unknown. However, the attacker can choose a potential value of $A$ and then try to find a known $(P, C)$ pair that produces $A$. The attack proceeds as follows.

1. Obtain $n$ $(P, C)$ pairs. This is the known plaintext. Place these in a table (Table 1) sorted on the values of $P$ (Figure 7.2b).
2. Pick an arbitrary value $a$ for $A$, and create a second table (Figure 7.2c) with entries defined in the following fashion. For each of the $2^{56}$ possible keys $K_1 = i$, calculate the plaintext value $P_i$ such that

$$P_i = D(i, a)$$

For each $P_i$ that matches an entry in Table 1, create an entry in Table 2 consisting of the $K_1$ value and the value of $B$ that is produced for the $(P, C)$ pair from Table 1, assuming that value of $K_1$:

$$B = D(i, C)$$

At the end of this step, sort Table 2 on the values of $B$.

3. We now have a number of candidate values of $K_1$ in Table 2 and are in a position to search for a value of $K_2$. For each of the $2^{56}$ possible keys $K_2 = j$, calculate the second intermediate value for our chosen value of $a$:

$$B_j = D(j, a)$$

At each step, look up $B_j$ in Table 2. If there is a match, then the corresponding key $i$ from Table 2 plus this value of $j$ are candidate values for the unknown keys $(K_1, K_2)$. Why? Because we have found a pair of keys $(i, j)$ that produce a known $(P, C)$ pair (Figure 7.2a).

4. Test each candidate pair of keys $(i, j)$ on a few other plaintext–ciphertext pairs. If a pair of keys produces the desired ciphertext, the task is complete. If no pair succeeds, repeat from step 1 with a new value of $a$.

[Figure 7.2 Known-Plaintext Attack on Triple DES: (a) Two-key triple encryption with candidate pair of keys; (b) Table of n known plaintext–ciphertext pairs, sorted on P; (c) Table of intermediate values and candidate keys]

¹ American National Standards Institute (ANSI): Financial Institution Key Management (Wholesale). From its title, X9.17 appears to be a somewhat obscure standard. Yet a number of techniques specified in this standard have been adopted for use in other standards and applications, as we shall see throughout this book.

For a given known $(P, C)$, the probability of selecting the unique value of $a$ that leads to success is $1/2^{64}$.
Thus, given $n$ $(P, C)$ pairs, the probability of success for a single selected value of $a$ is $n/2^{64}$. A basic result from probability theory is that the expected number of draws required to draw one red ball out of a bin containing $n$ red balls and $N - n$ green balls is $(N + 1)/(n + 1)$ if the balls are not replaced. So the expected number of values of $a$ that must be tried is, for large $n$,

$$\frac{2^{64} + 1}{n + 1} \approx \frac{2^{64}}{n}$$

Thus, the expected running time of the attack is on the order of

$$(2^{56}) \frac{2^{64}}{n} = 2^{120 - \log_2 n}$$

Triple DES with Three Keys

Although the attacks just described appear impractical, anyone using two-key 3DES may feel some concern. Thus, many researchers now feel that three-key 3DES is the preferred alternative (e.g., [KALI96a]). In SP 800-57, Part 1 (Recommendation for Key Management – Part 1: General, July 2012) NIST recommends that 2-key 3DES be retired as soon as practical and replaced with 3-key 3DES.

Three-key 3DES is defined as

$$C = E(K_3, D(K_2, E(K_1, P)))$$

Backward compatibility with DES is provided by putting $K_3 = K_2$ or $K_1 = K_2$. One might expect that 3TDEA would provide 56 × 3 = 168 bits of strength. However, there is an attack on 3TDEA that reduces the strength to the work that would be involved in exhausting a 112-bit key [MERK81].

A number of Internet-based applications have adopted three-key 3DES, including PGP and S/MIME, both discussed in Chapter 19.

7.2 ELECTRONIC CODEBOOK

A block cipher takes a fixed-length block of text of length $b$ bits and a key as input and produces a $b$-bit block of ciphertext. If the amount of plaintext to be encrypted is greater than $b$ bits, then the block cipher can still be used by breaking the plaintext up into $b$-bit blocks. When multiple blocks of plaintext are encrypted using the same key, a number of security issues arise. To apply a block cipher in a variety of applications, five modes of operation have been defined by NIST (SP 800-38A).
In essence, a mode of operation is a technique for enhancing the effect of a cryptographic algorithm or adapting the algorithm for an application, such as applying a block cipher to a sequence of data blocks or a data stream. The five modes are intended to cover a wide variety of applications of encryption for which a block cipher could be used. These modes are intended for use with any symmetric block cipher, including triple DES and AES. The modes are summarized in Table 7.1 and described in this and the following sections.

The simplest mode is the electronic codebook (ECB) mode, in which plaintext is handled one block at a time and each block of plaintext is encrypted using the same key (Figure 7.3). The term codebook is used because, for a given key, there is a unique ciphertext for every $b$-bit block of plaintext. Therefore, we can imagine a gigantic codebook in which there is an entry for every possible $b$-bit plaintext pattern showing its corresponding ciphertext.

For a message longer than $b$ bits, the procedure is simply to break the message into $b$-bit blocks, padding the last block if necessary. Decryption is performed one block at a time, always using the same key. In Figure 7.3, the plaintext (padded as necessary) consists of a sequence of $b$-bit blocks, $P_1, P_2, \ldots, P_N$; the corresponding sequence of ciphertext blocks is $C_1, C_2, \ldots, C_N$. We can define ECB mode as follows.

ECB:
$$C_j = E(K, P_j) \qquad j = 1, \ldots, N$$
$$P_j = D(K, C_j) \qquad j = 1, \ldots, N$$

The ECB mode should be used only to secure messages shorter than a single block of the underlying cipher (i.e., 64 bits for 3DES and 128 bits for AES), such as to encrypt a secret key. Because most messages are longer than the cipher's block size, this mode has minimal practical value.

The most significant characteristic of ECB is that if the same $b$-bit block of plaintext appears more than once in the message, it always produces the same ciphertext.
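The repeated-block property is easy to observe in a small experiment. The sketch below is hedged accordingly: Python's standard library has no DES or AES, so `toy_encrypt_block` and `toy_decrypt_block` are hypothetical stand-ins for $E(K, P)$ and $D(K, C)$, built as a 4-round Feistel network around SHA-256 with a 64-bit block. They are not secure and are not any standard cipher; only the mode logic in `ecb_encrypt` and `ecb_decrypt` follows the ECB definition.

```python
import hashlib

BLOCK = 8  # toy block size in bytes (b = 64 bits); an assumption for this sketch


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    """Toy round function: truncated SHA-256 of key, round number, and half block."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for E(K, P): a 4-round Feistel permutation."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for D(K, C): the Feistel rounds in reverse."""
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """C_j = E(K, P_j) for each block; the input is assumed already padded."""
    assert len(plaintext) % BLOCK == 0
    return b"".join(toy_encrypt_block(key, plaintext[i:i + BLOCK])
                    for i in range(0, len(plaintext), BLOCK))


def ecb_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """P_j = D(K, C_j) for each block."""
    return b"".join(toy_decrypt_block(key, ciphertext[i:i + BLOCK])
                    for i in range(0, len(ciphertext), BLOCK))


if __name__ == "__main__":
    key = b"demo key"
    message = b"ATTACK!!" + b"ATTACK!!" + b"RETREAT!"   # blocks 1 and 2 are identical
    ct = ecb_encrypt(key, message)
    for i in range(0, len(ct), BLOCK):
        print(ct[i:i + BLOCK].hex())                    # first two lines match
```

An observer who sees the ciphertext learns that the first two plaintext blocks are equal without knowing the key, which is the leak the text describes.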
Table 7.1 Block Cipher Modes of Operation

| Mode | Description | Typical Application |
|---|---|---|
| Electronic Codebook (ECB) | Each block of plaintext bits is encoded independently using the same key. | Secure transmission of single values (e.g., an encryption key) |
| Cipher Block Chaining (CBC) | The input to the encryption algorithm is the XOR of the next block of plaintext and the preceding block of ciphertext. | General-purpose block-oriented transmission; Authentication |
| Cipher Feedback (CFB) | Input is processed s bits at a time. Preceding ciphertext is used as input to the encryption algorithm to produce pseudorandom output, which is XORed with plaintext to produce next unit of ciphertext. | General-purpose stream-oriented transmission; Authentication |
| Output Feedback (OFB) | Similar to CFB, except that the input to the encryption algorithm is the preceding encryption output, and full blocks are used. | Stream-oriented transmission over noisy channel (e.g., satellite communication) |
| Counter (CTR) | Each block of plaintext is XORed with an encrypted counter. The counter is incremented for each subsequent block. | General-purpose block-oriented transmission; Useful for high-speed requirements |

For lengthy messages, the ECB mode may not be secure. If the message is highly structured, it may be possible for a cryptanalyst to exploit these regularities. For example, if it is known that the message always starts out with certain predefined fields, then the cryptanalyst may have a number of known plaintext–ciphertext pairs to work with. If the message has repetitive elements with a period of repetition a multiple of $b$ bits, then these elements can be identified by the analyst. This may help in the analysis or may provide an opportunity for substituting or rearranging blocks.

We now turn to more complex modes of operation.
[KNUD00] lists the following criteria and properties for evaluating and constructing block cipher modes of operation that are superior to ECB:

■ Overhead: The additional operations for the encryption and decryption operation when compared to encrypting and decrypting in the ECB mode.
■ Error recovery: The property that an error in the $i$th ciphertext block is inherited by only a few plaintext blocks after which the mode resynchronizes.
■ Error propagation: The property that an error in the $i$th ciphertext block is inherited by the $i$th and all subsequent plaintext blocks. What is meant here is a bit error that occurs in the transmission of a ciphertext block, not a computational error in the encryption of a plaintext block.
■ Diffusion: How the plaintext statistics are reflected in the ciphertext. Low entropy plaintext blocks should not be reflected in the ciphertext blocks. Roughly, low entropy equates to predictability or lack of randomness (see Appendix F).
■ Security: Whether or not the ciphertext blocks leak information about the plaintext blocks.

[Figure 7.3 Electronic Codebook (ECB) Mode: (a) Encryption; (b) Decryption]

7.3 CIPHER BLOCK CHAINING MODE

To overcome the security deficiencies of ECB, we would like a technique in which the same plaintext block, if repeated, produces different ciphertext blocks. A simple way to satisfy this requirement is the cipher block chaining (CBC) mode (Figure 7.4). In this scheme, the input to the encryption algorithm is the XOR of the current plaintext block and the preceding ciphertext block; the same key is used for each block. In effect, we have chained together the processing of the sequence of plaintext blocks. The input to the encryption function for each plaintext block bears no fixed relationship to the plaintext block. Therefore, repeating patterns of $b$ bits are not exposed.
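The chaining just described, together with the matching decryption, can be sketched as follows. The same caveats apply as for any illustration built without a real cipher library: `toy_encrypt_block` and `toy_decrypt_block` are hypothetical, insecure stand-ins for the block cipher (a 4-round Feistel network around SHA-256 with a 64-bit block), and the names `cbc_encrypt` and `cbc_decrypt` are invented here. The first block has no preceding ciphertext block, so the sketch XORs it with an initialization vector `iv`, as in Figure 7.4; padding is assumed to have been applied already.

```python
import hashlib
import secrets

BLOCK = 8  # toy block size in bytes; an assumption for this sketch


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:len(half)]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for the block cipher's encryption function."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for the block cipher's decryption function."""
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Encrypt the XOR of each plaintext block with the previous ciphertext block."""
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Decrypt each block, then XOR with the previous ciphertext block."""
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


if __name__ == "__main__":
    key = b"demo key"
    iv = secrets.token_bytes(BLOCK)          # fresh random IV for this message
    message = b"ATTACK!!" * 3                # three identical plaintext blocks
    ct = cbc_encrypt(key, iv, message)
    for i in range(0, len(ct), BLOCK):
        print(ct[i:i + BLOCK].hex())
    print(cbc_decrypt(key, iv, ct) == message)
```

Because each cipher input depends on the preceding ciphertext block, the three identical plaintext blocks are expected to encrypt to three different ciphertext blocks.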
As with the ECB mode, the CBC mode requires that the last block be padded to a full b bits if it is a partial block.

[Figure 7.4 Cipher Block Chaining (CBC) Mode: (a) Encryption, (b) Decryption]

For decryption, each cipher block is passed through the decryption algorithm. The result is XORed with the preceding ciphertext block to produce the plaintext block. To see that this works, we can write

$$C_j = E(K, [C_{j-1} \oplus P_j])$$

Then

$$D(K, C_j) = D(K, E(K, [C_{j-1} \oplus P_j]))$$
$$D(K, C_j) = C_{j-1} \oplus P_j$$
$$C_{j-1} \oplus D(K, C_j) = C_{j-1} \oplus C_{j-1} \oplus P_j = P_j$$

To produce the first block of ciphertext, an initialization vector (IV) is XORed with the first block of plaintext. On decryption, the IV is XORed with the output of the decryption algorithm to recover the first block of plaintext. The IV is a data block that is the same size as the cipher block. We can define CBC mode as

CBC encryption:
$$C_1 = E(K, [P_1 \oplus IV])$$
$$C_j = E(K, [P_j \oplus C_{j-1}]), \quad j = 2, \ldots, N$$

CBC decryption:
$$P_1 = D(K, C_1) \oplus IV$$
$$P_j = D(K, C_j) \oplus C_{j-1}, \quad j = 2, \ldots, N$$

The IV must be known to both the sender and receiver but be unpredictable by a third party. In particular, for any given plaintext, it must not be possible to predict the IV that will be associated with the plaintext in advance of the generation of the IV. For maximum security, the IV should be protected against unauthorized changes. This could be done by sending the IV using ECB encryption. One reason for protecting the IV is as follows: If an opponent is able to fool the receiver into using a different value for IV, then the opponent is able to invert selected bits in the first block of plaintext. To see this, consider

$$C_1 = E(K, [IV \oplus P_1])$$
$$P_1 = IV \oplus D(K, C_1)$$

Now use the notation that X[i] denotes the ith bit of the b-bit quantity X. Then

$$P_1[i] = IV[i] \oplus D(K, C_1)[i]$$

Then, using the properties of XOR, we can state

$$P_1[i]' = IV[i]' \oplus D(K, C_1)[i]$$

where the prime notation denotes bit complementation. This means that if an opponent can predictably change bits in IV, the corresponding bits of the received value of $P_1$ can be changed.

For other possible attacks based on prior knowledge of IV, see [VOYD83].

So long as it is unpredictable, the specific choice of IV is unimportant. SP 800-38A recommends two possible methods: The first method is to apply the encryption function, under the same key that is used for the encryption of the plaintext, to a nonce (see footnote 2). The nonce must be a data block that is unique to each execution of the encryption operation. For example, the nonce may be a counter, a timestamp, or a message number. The second method is to generate a random data block using a random number generator.

Footnote 2: NIST SP 800-90 (Recommendation for Random Number Generation Using Deterministic Random Bit Generators) defines nonce as follows: A time-varying value that has at most a negligible chance of repeating, for example, a random value that is generated anew for each use, a timestamp, a sequence number, or some combination of these.

In conclusion, because of the chaining mechanism of CBC, it is an appropriate mode for encrypting messages of length greater than b bits.

In addition to its use to achieve confidentiality, the CBC mode can be used for authentication. This use is described in Chapter 12.

7.4 CIPHER FEEDBACK MODE

For AES, DES, or any block cipher, encryption is performed on a block of b bits. In the case of DES, b = 64, and in the case of AES, b = 128. However, it is possible to convert a block cipher into a stream cipher, using one of the three modes to be discussed in this and the next two sections: cipher feedback (CFB) mode, output feedback (OFB) mode, and counter (CTR) mode.
A stream cipher eliminates the need to pad a message to be an integral number of blocks. It also can operate in real time. Thus, if a character stream is being transmitted, each character can be encrypted and transmitted immediately using a character-oriented stream cipher.

One desirable property of a stream cipher is that the ciphertext be of the same length as the plaintext. Thus, if 8-bit characters are being transmitted, each character should be encrypted to produce a ciphertext output of 8 bits. If more than 8 bits are produced, transmission capacity is wasted.

Figure 7.5 depicts the CFB scheme. In the figure, it is assumed that the unit of transmission is s bits; a common value is s = 8. As with CBC, the units of plaintext are chained together, so that the ciphertext of any plaintext unit is a function of all the preceding plaintext. In this case, rather than blocks of b bits, the plaintext is divided into segments of s bits.

First, consider encryption. The input to the encryption function is a b-bit shift register that is initially set to some initialization vector (IV). The leftmost (most significant) s bits of the output of the encryption function are XORed with the first segment of plaintext $P_1$ to produce the first unit of ciphertext $C_1$, which is then transmitted. In addition, the contents of the shift register are shifted left by s bits, and $C_1$ is placed in the rightmost (least significant) s bits of the shift register. This process continues until all plaintext units have been encrypted.

For decryption, the same scheme is used, except that the received ciphertext unit is XORed with the output of the encryption function to produce the plaintext unit. Note that it is the encryption function that is used, not the decryption function. This is easily explained. Let $MSB_s(X)$ be defined as the most significant s bits of X. Then

$$C_1 = P_1 \oplus MSB_s[E(K, IV)]$$

Therefore, by rearranging terms:

$$P_1 = C_1 \oplus MSB_s[E(K, IV)]$$

The same reasoning holds for subsequent steps in the process.

We can define CFB mode as follows.

CFB encryption:
$$I_1 = IV$$
$$I_j = LSB_{b-s}(I_{j-1}) \parallel C_{j-1}, \quad j = 2, \ldots, N$$
$$O_j = E(K, I_j), \quad j = 1, \ldots, N$$
$$C_j = P_j \oplus MSB_s(O_j), \quad j = 1, \ldots, N$$

CFB decryption:
$$I_1 = IV$$
$$I_j = LSB_{b-s}(I_{j-1}) \parallel C_{j-1}, \quad j = 2, \ldots, N$$
$$O_j = E(K, I_j), \quad j = 1, \ldots, N$$
$$P_j = C_j \oplus MSB_s(O_j), \quad j = 1, \ldots, N$$

[Figure 7.5 s-bit Cipher Feedback (CFB) Mode: (a) Encryption, (b) Decryption]

Although CFB can be viewed as a stream cipher, it does not conform to the typical construction of a stream cipher. In a typical stream cipher, the cipher takes as input some initial value and a key and generates a stream of bits, which is then XORed with the plaintext bits (see Figure 4.1). In the case of CFB, the stream of bits that is XORed with the plaintext also depends on the plaintext.

In CFB encryption, like CBC encryption, the input block to each forward cipher function (except the first) depends on the result of the previous forward cipher function; therefore, multiple forward cipher operations cannot be performed in parallel. In CFB decryption, the required forward cipher operations can be performed in parallel if the input blocks are first constructed (in series) from the IV and the ciphertext.
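The shift-register description of CFB can be sketched for the common case s = 8. This is a hedged illustration: `forward_cipher` is a hypothetical stand-in for E(K, I) built from HMAC-SHA-256 (CFB uses only the forward direction, so the stand-in does not need to be invertible), and the key, IV, and message are invented values.

```python
import hashlib
import hmac

B = 16  # block size b in bytes (128 bits)
S = 1   # segment size s in bytes (8 bits)


def forward_cipher(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for E(K, I); keyed and deterministic, not a real cipher."""
    return hmac.new(key, block, hashlib.sha256).digest()[:B]


def cfb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    reg, out = iv, bytearray()            # I_1 = IV
    for p in plaintext:                   # one 8-bit segment at a time
        o = forward_cipher(key, reg)      # O_j = E(K, I_j)
        c = p ^ o[0]                      # C_j = P_j xor MSB_s(O_j)
        out.append(c)
        reg = reg[S:] + bytes([c])        # I_{j+1} = LSB_{b-s}(I_j) || C_j
    return bytes(out)


def cfb_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    reg, out = iv, bytearray()
    for c in ciphertext:
        o = forward_cipher(key, reg)      # same forward function as encryption
        out.append(c ^ o[0])              # P_j = C_j xor MSB_s(O_j)
        reg = reg[S:] + bytes([c])        # register is fed with ciphertext
    return bytes(out)


key = b"demo key"
iv = bytes(range(B))                      # fixed only to keep the demo repeatable
message = b"stream of characters"
ciphertext = cfb_encrypt(key, iv, message)
recovered = cfb_decrypt(key, iv, ciphertext)
print(len(ciphertext) == len(message), recovered == message)
```

The ciphertext has exactly the length of the plaintext, and decryption never calls a decryption function, matching the two points made in the text.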
7.5 OUTPUT FEEDBACK MODE

The output feedback (OFB) mode is similar in structure to that of CFB. For OFB, the output of the encryption function is fed back to become the input for encrypting the next block of plaintext (Figure 7.6). In CFB, the output of the XOR unit is fed back to become input for encrypting the next block. The other difference is that the OFB mode operates on full blocks of plaintext and ciphertext, whereas CFB operates on an s-bit subset. OFB encryption can be expressed as

$$C_j = P_j \oplus E(K, O_{j-1})$$

where

$$O_{j-1} = E(K, O_{j-2})$$

Some thought should convince you that we can rewrite the encryption expression as:

$$C_j = P_j \oplus E(K, [C_{j-1} \oplus P_{j-1}])$$

By rearranging terms, we can demonstrate that decryption works.

$$P_j = C_j \oplus E(K, [C_{j-1} \oplus P_{j-1}])$$

We can define OFB mode as follows.

OFB encryption:
$$I_1 = Nonce$$
$$I_j = O_{j-1}, \quad j = 2, \ldots, N$$
$$O_j = E(K, I_j), \quad j = 1, \ldots, N$$
$$C_j = P_j \oplus O_j, \quad j = 1, \ldots, N-1$$
$$C_N^* = P_N^* \oplus MSB_u(O_N)$$

OFB decryption:
$$I_1 = Nonce$$
$$I_j = O_{j-1}, \quad j = 2, \ldots, N$$
$$O_j = E(K, I_j), \quad j = 1, \ldots, N$$
$$P_j = C_j \oplus O_j, \quad j = 1, \ldots, N-1$$
$$P_N^* = C_N^* \oplus MSB_u(O_N)$$

Let the size of a block be b. If the last block of plaintext contains u bits (indicated by *), with u < b, the most significant u bits of the last output block $O_N$ are used for the XOR operation; the remaining b - u bits of the last output block are discarded.

As with CBC and CFB, the OFB mode requires an initialization vector. In the case of OFB, the IV must be a nonce; that is, the IV must be unique to each execution of the encryption operation. The reason for this is that the sequence of encryption output blocks, $O_i$, depends only on the key and the IV and does not depend on the plaintext. Therefore, for a given key and IV, the stream of output bits used to XOR with the stream of plaintext bits is fixed. If two different messages had an identical block of plaintext in the identical position, then an attacker would be able to determine that portion of the $O_i$ stream.
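The consequence of a fixed output stream can be illustrated with a short sketch. The assumptions are explicit: `forward_cipher` is a hypothetical HMAC-SHA-256 stand-in for E(K, I), not a real block cipher, and the messages are invented. The sketch reuses one nonce for two messages, which the text forbids, and shows that knowledge of one plaintext then exposes the other.

```python
import hashlib
import hmac

B = 16  # block size in bytes


def forward_cipher(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for E(K, I); not a real block cipher."""
    return hmac.new(key, block, hashlib.sha256).digest()[:B]


def ofb_keystream(key: bytes, nonce: bytes, nblocks: int) -> bytes:
    o, stream = nonce, []                 # I_1 = Nonce
    for _ in range(nblocks):
        o = forward_cipher(key, o)        # O_j = E(K, I_j), I_{j+1} = O_j
        stream.append(o)
    return b"".join(stream)


def ofb_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts; zip() drops unused keystream, like MSB_u(O_N)."""
    nblocks = -(-len(data) // B)          # ceiling division
    ks = ofb_keystream(key, nonce, nblocks)
    return bytes(d ^ k for d, k in zip(data, ks))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


key = b"demo key"
nonce = b"REUSED-NONCE-000"               # deliberately reused: this is the mistake
m1 = b"attack at dawn, west gate"
m2 = b"hold position".ljust(len(m1), b".")
c1 = ofb_xor(key, nonce, m1)
c2 = ofb_xor(key, nonce, m2)

# An attacker who knows m1 recovers the output stream and then m2, without the key.
stream_seen = xor(c1, m1)
recovered_m2 = xor(c2, stream_seen)
print(recovered_m2 == m2)
```

Since the stream depends only on the key and the nonce, the same function serves for encryption and decryption, and the partial final block needs no padding.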
One advantage of the OFB method is that bit errors in transmission do not propagate. For example, if a bit error occurs in $C_1$, only the recovered value of $P_1$ is affected; subsequent plaintext units are not corrupted. With CFB, $C_1$ also serves as input to the shift register and therefore causes additional corruption downstream.

The disadvantage of OFB is that it is more vulnerable to a message stream modification attack than is CFB. Consider that complementing a bit in the ciphertext complements the corresponding bit in the recovered plaintext. Thus, controlled changes to the recovered plaintext can be made. This may make it possible for an opponent, by making the necessary changes to the checksum portion of the message as well as to the data portion, to alter the ciphertext in such a way that it is not detected by an error-correcting code. For a further discussion, see [VOYD83].

[Figure 7.6 Output Feedback (OFB) Mode: (a) Encryption, (b) Decryption]

OFB has the structure of a typical stream cipher, because the cipher generates a stream of bits as a function of an initial value and a key, and that stream of bits is XORed with the plaintext bits (see Figure 4.1). The generated stream that is XORed with the plaintext is itself independent of the plaintext; this is highlighted by dashed boxes in Figure 7.6. One distinction from the stream ciphers we discuss in Chapter 8 is that OFB encrypts plaintext a full block at a time, where typically a block is 64 or 128 bits. Many stream ciphers encrypt one byte at a time.

7.6 COUNTER MODE

Although interest in the counter (CTR) mode has increased recently with applications to ATM (asynchronous transfer mode) network security and IPsec (IP security), this mode was proposed in 1979 (e.g., [DIFF79]).

Figure 7.7 depicts the CTR mode.
A counter equal in size to the plaintext block is used. The only requirement stated in SP 800-38A is that the counter value must be different for each plaintext block that is encrypted. Typically, the counter is initialized to some value and then incremented by 1 for each subsequent block (modulo $2^b$, where b is the block size). For encryption, the counter is encrypted and then XORed with the plaintext block to produce the ciphertext block; there is no chaining. For decryption, the same sequence of counter values is used, with each encrypted counter XORed with a ciphertext block to recover the corresponding plaintext block. Thus, the initial counter value must be made available for decryption. Given a sequence of counters $T_1, T_2, \ldots, T_N$, we can define CTR mode as follows.

CTR encryption:
$$C_j = P_j \oplus E(K, T_j), \quad j = 1, \ldots, N-1$$
$$C_N^* = P_N^* \oplus MSB_u[E(K, T_N)]$$

CTR decryption:
$$P_j = C_j \oplus E(K, T_j), \quad j = 1, \ldots, N-1$$
$$P_N^* = C_N^* \oplus MSB_u[E(K, T_N)]$$

For the last plaintext block, which may be a partial block of u bits, the most significant u bits of the last output block are used for the XOR operation; the remaining b - u bits are discarded. Unlike the ECB, CBC, and CFB modes, we do not need to use padding because of the structure of the CTR mode.

As with the OFB mode, the initial counter value must be a nonce; that is, $T_1$ must be different for all of the messages encrypted using the same key. Further, all $T_i$ values across all messages must be unique. If, contrary to this requirement, a counter value is used multiple times, then the confidentiality of all of the plaintext blocks corresponding to that counter value may be compromised. In particular, if any plaintext block that is encrypted using a given counter value is known, then the output of the encryption function can be determined easily from the associated ciphertext block. This output allows any other plaintext blocks that are encrypted using the same counter value to be easily recovered from their associated ciphertext blocks.
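The CTR definition above can be sketched as follows. As a labeled assumption, `forward_cipher` is a hypothetical HMAC-SHA-256 stand-in for E(K, T), which is adequate here only because CTR uses the forward direction alone; the initial counter `t1` and the message are invented. The sketch handles a partial final block without padding and decrypts one block in isolation.

```python
import hashlib
import hmac

B = 16  # block size in bytes (b = 128 bits)


def forward_cipher(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for E(K, T); not a real block cipher."""
    return hmac.new(key, block, hashlib.sha256).digest()[:B]


def ctr_block(key: bytes, t1: int, j: int) -> bytes:
    """E(K, T_j) with T_j = (T_1 + j - 1) mod 2^b, for block index j >= 1."""
    counter = (t1 + j - 1) % (1 << (8 * B))
    return forward_cipher(key, counter.to_bytes(B, "big"))


def ctr_xor(key: bytes, t1: int, data: bytes) -> bytes:
    """Encrypts or decrypts; each block is independent of the others."""
    out = bytearray()
    for off in range(0, len(data), B):
        ks = ctr_block(key, t1, off // B + 1)
        chunk = data[off:off + B]
        # zip() stops at the shorter input, so a partial last block
        # uses only the leading bytes of E(K, T_N), as MSB_u does.
        out += bytes(d ^ k for d, k in zip(chunk, ks))
    return bytes(out)


key = b"demo key"
t1 = (1 << 127) + 5      # must never repeat under the same key in real use
message = b"counter mode needs no padding for a partial final block"
ciphertext = ctr_xor(key, t1, message)

# Random access: recover only the second block (j = 2).
block2 = bytes(c ^ k for c, k in zip(ciphertext[B:2 * B], ctr_block(key, t1, 2)))
print(len(ciphertext) == len(message), block2 == message[B:2 * B])
```

Because `ctr_block` depends only on the key and the counter, the calls for different j could run in parallel or be precomputed before the data arrives.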
One way to ensure the uniqueness of counter values is to continue to increment the counter value by 1 across messages. That is, the first counter value of each message is one more than the last counter value of the preceding message.

[Figure 7.7 Counter (CTR) Mode: (a) Encryption, (b) Decryption]

[LIPM00] lists the following advantages of CTR mode.

■ Hardware efficiency: Unlike the three chaining modes, encryption (or decryption) in CTR mode can be done in parallel on multiple blocks of plaintext or ciphertext. For the chaining modes, the algorithm must complete the computation on one block before beginning on the next block. This limits the maximum throughput of the algorithm to the reciprocal of the time for one execution of block encryption or decryption. In CTR mode, the throughput is only limited by the amount of parallelism that is achieved.
■ Software efficiency: Similarly, because of the opportunities for parallel execution in CTR mode, processors that support parallel features, such as aggressive pipelining, multiple instruction dispatch per clock cycle, a large number of registers, and SIMD instructions, can be effectively utilized.
■ Preprocessing: The execution of the underlying encryption algorithm does not depend on input of the plaintext or ciphertext. Therefore, if sufficient memory is available and security is maintained, preprocessing can be used to prepare the output of the encryption boxes that feed into the XOR functions, as in Figure 7.7. When the plaintext or ciphertext input is presented, then the only computation is a series of XORs. Such a strategy greatly enhances throughput.
■ Random access: The ith block of plaintext or ciphertext can be processed in random-access fashion. With the chaining modes, block $C_i$ cannot be computed until the i - 1 prior blocks are computed. There may be applications in which a ciphertext is stored and it is desired to decrypt just one block; for such applications, the random access feature is attractive.
■ Provable security: It can be shown that CTR is at least as secure as the other modes discussed in this chapter.
■ Simplicity: Unlike ECB and CBC modes, CTR mode requires only the implementation of the encryption algorithm and not the decryption algorithm. This matters most when the decryption algorithm differs substantially from the encryption algorithm, as it does for AES. In addition, the decryption key scheduling need not be implemented.

Note that, with the exception of ECB, all of the NIST-approved block cipher modes of operation involve feedback. This is clearly seen in Figure 7.8. To highlight the feedback mechanism, it is useful to think of the encryption function as taking input from an input register whose length equals the encryption block length and with output stored in an output register. The input register is updated one block at a time by the feedback mechanism. After each update, the encryption algorithm is executed, producing a result in the output register. Meanwhile, a block of plaintext is accessed. Note that both OFB and CTR produce output that is independent of both the plaintext and the ciphertext. Thus, they are natural candidates for stream ciphers that encrypt plaintext by XORing one full block at a time.

7.7 XTS-AES MODE FOR BLOCK-ORIENTED STORAGE DEVICES

In 2010, NIST approved an additional block cipher mode of operation, XTS-AES. This mode is also an IEEE standard, IEEE Std 1619-2007, which was developed by the IEEE Security in Storage Working Group (P1619). The standard describes a method of encryption for data stored in sector-based devices where the threat model includes possible access to stored data by the adversary.
The standard has received widespread industry support.

Tweakable Block Ciphers

The XTS-AES mode is based on the concept of a tweakable block cipher, introduced in [LISK02], which functions in much the same manner as a salt used with passwords, as described in Chapter 22. The form of this concept used in XTS-AES was first described in [ROGA04].

Before examining XTS-AES, let us consider the general structure of a tweakable block cipher. A tweakable block cipher is one that has three inputs: a plaintext P, a symmetric key K, and a tweak T; and produces a ciphertext output C. We can write this as C = E(K, T, P). The tweak need not be kept secret. Whereas the purpose of the key is to provide security, the purpose of the tweak is to provide variability. That is, the use of different tweaks with the same plaintext and same key produces different outputs. The basic structure of several tweakable block ciphers that have been implemented is shown in Figure 7.9. Encryption can be expressed as:

$$C = H(T) \oplus E(K, H(T) \oplus P)$$

where H is a hash function. For decryption, the same structure is used with the ciphertext as input and decryption as the function instead of encryption. To see that this works, we can write

$$H(T) \oplus C = E(K, H(T) \oplus P)$$
$$D[K, H(T) \oplus C] = H(T) \oplus P$$
$$H(T) \oplus D(K, H(T) \oplus C) = P$$

[Figure 7.8 Feedback Characteristic of Modes of Operation: (a) Cipher block chaining (CBC) mode, (b) Cipher feedback (CFB) mode, (c) Output feedback (OFB) mode, (d) Counter (CTR) mode]

It is now easy to construct a block cipher mode of operation by using a different tweak value on each block.
In essence, the ECB mode is used, but for each block the tweak is changed. This overcomes the principal security weakness of ECB, which is that two encryptions of the same block yield the same ciphertext.

[Figure 7.9 Tweakable Block Cipher: (a) Encryption, (b) Decryption]

Storage Encryption Requirements

The requirements for encrypting stored data, also referred to as “data at rest,” differ somewhat from those for transmitted data. The P1619 standard was designed to have the following characteristics:

1. The ciphertext is freely available for an attacker. Among the circumstances that lead to this situation:
   a. A group of users has authorized access to a database. Some of the records in the database are encrypted so that only specific users can successfully read/write them. Other users can retrieve an encrypted record but are unable to read it without the key.
   b. An unauthorized user manages to gain access to encrypted records.
   c. A data disk or laptop is stolen, giving the adversary access to the encrypted data.
2. The data layout is not changed on the storage medium and in transit. The encrypted data must be the same size as the plaintext data.
3. Data are accessed in fixed-size blocks, independently from each other. That is, an authorized user may access one or more blocks in any order.
4. Encryption is performed in 16-byte blocks, independently from other blocks (except the last two plaintext blocks of a sector, if its size is not a multiple of 16 bytes).
5. There are no other metadata used, except the location of the data blocks within the whole data set.
6. The same plaintext is encrypted to different ciphertexts at different locations, but always to the same ciphertext when written to the same location again.
7. A standard conformant device can be constructed for decryption of data encrypted by another standard conformant device.

The P1619 group considered some of the existing modes of operation for use with stored data. For CTR mode, an adversary with write access to the encrypted media can flip any bit of the plaintext simply by flipping the corresponding ciphertext bit.

Next, consider requirement 6 and the use of CBC. To enforce the requirement that the same plaintext encrypts to different ciphertext in different locations, the IV could be derived from the sector number. Each sector contains multiple blocks. An adversary with read/write access to the encrypted disk can copy a ciphertext sector from one position to another, and an application reading the sector off the new location will still get the same plaintext sector (except perhaps the first 128 bits). For example, this means that an adversary that is allowed to read a sector from the second position but not the first can find the content of the sector in the first position by manipulating the ciphertext. Another weakness is that an adversary can flip any bit of the plaintext by flipping the corresponding ciphertext bit of the previous block, with the side effect of “randomizing” the previous block.

Operation on a Single Block

Figure 7.10 shows the encryption and decryption of a single block. The operation involves two instances of the AES algorithm with two keys. The following parameters are associated with the algorithm.

Key: The 256- or 512-bit XTS-AES key; this is parsed as a concatenation of two fields of equal size called $Key_1$ and $Key_2$, such that $Key = Key_1 \parallel Key_2$.
$P_j$: The jth block of plaintext. All blocks except possibly the final block have a length of 128 bits. A plaintext data unit, typically a disk sector, consists of a sequence of plaintext blocks $P_1, P_2, \ldots, P_m$.
$C_j$: The jth block of ciphertext. All blocks except possibly the final block have a length of 128 bits.
$j$: The sequential number of the 128-bit block inside the data unit.
$i$: The value of the 128-bit tweak. Each data unit (sector) is assigned a tweak value that is a nonnegative integer. The tweak values are assigned consecutively, starting from an arbitrary nonnegative integer.
$\alpha$: A primitive element of $GF(2^{128})$ that corresponds to polynomial $x$ (i.e., $0000\ldots010_2$).
$\alpha^j$: $\alpha$ multiplied by itself j times, in $GF(2^{128})$.
$\oplus$: Bitwise XOR.
$\otimes$: Modular multiplication of two polynomials with binary coefficients modulo $x^{128} + x^7 + x^2 + x + 1$. Thus, this is multiplication in $GF(2^{128})$.

[Figure 7.10 XTS-AES Operation on Single Block: (a) Encryption, (b) Decryption]

In essence, the parameter j functions much like the counter in CTR mode. It assures that if the same plaintext block appears at two different positions within a data unit, it will encrypt to two different ciphertext blocks. The parameter i functions much like a nonce at the data unit level. It assures that, if the same plaintext block appears at the same position in two different data units, it will encrypt to two different ciphertext blocks. More generally, it assures that the same plaintext data unit will encrypt to two different ciphertext data units for two different data unit positions.

The encryption and decryption of a single block can be described as

XTS-AES block operation, encryption:
$$T = E(K_2, i) \otimes \alpha^j$$
$$PP = P \oplus T$$
$$CC = E(K_1, PP)$$
$$C = CC \oplus T$$

XTS-AES block operation, decryption:
$$T = E(K_2, i) \otimes \alpha^j$$
$$CC = C \oplus T$$
$$PP = D(K_1, CC)$$
$$P = PP \oplus T$$

To see that decryption recovers the plaintext, let us expand the last line of both encryption and decryption.
For encryption, we have

$$C = CC \oplus T = E(K_1, PP) \oplus T = E(K_1, P \oplus T) \oplus T$$

and for decryption, we have

$$P = PP \oplus T = D(K_1, CC) \oplus T = D(K_1, C \oplus T) \oplus T$$

Now, we substitute for C:

$$P = D(K_1, C \oplus T) \oplus T$$
$$= D(K_1, [E(K_1, P \oplus T) \oplus T] \oplus T) \oplus T$$
$$= D(K_1, E(K_1, P \oplus T)) \oplus T$$
$$= (P \oplus T) \oplus T = P$$

Operation on a Sector

The plaintext of a sector or data unit is organized into blocks of 128 bits. Blocks are labeled $P_0, P_1, \ldots, P_m$. The last block may be null or may contain from 1 to 127 bits. In other words, the input to the XTS-AES algorithm consists of m 128-bit blocks and possibly a final partial block.

For encryption and decryption, each block is treated independently and encrypted/decrypted as shown in Figure 7.10. The only exception occurs when the last block has fewer than 128 bits. In that case, the last two blocks are encrypted/decrypted using a ciphertext-stealing technique instead of padding. Figure 7.11 shows the scheme. $P_{m-1}$ is the last full plaintext block, and $P_m$ is the final plaintext block, which contains s bits with $1 \le s \le 127$. $C_{m-1}$ is the last full ciphertext block, and $C_m$ is the final ciphertext block, which contains s bits. This technique is commonly called ciphertext stealing because the processing of the last block “steals” a temporary ciphertext of the penultimate block to complete the cipher block.
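The single-block operation and its inversion, as derived earlier in this section, can be sketched in a few lines. Several assumptions are made and should be read as such: `toy_encrypt` and `toy_decrypt` are a hypothetical SHA-256 Feistel permutation standing in for AES; 128-bit values are held as Python integers in which bit k is the coefficient of $x^k$; and a big-endian byte encoding is used purely for illustration, so the byte-ordering conventions of IEEE Std 1619 are not reproduced. Only the algebra of $T = E(K_2, i) \otimes \alpha^j$ and the XOR-encrypt-XOR structure is shown.

```python
import hashlib

HALF = 8
MASK = (1 << 128) - 1
POLY = 0x87   # x^7 + x^2 + x + 1, the low-order terms of the modulus


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, r: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([r]) + half).digest()[:HALF]


def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical invertible stand-in for AES encryption (illustration only)."""
    left, right = block[:HALF], block[HALF:]
    for r in range(4):
        left, right = right, xor(left, _round(key, r, right))
    return left + right


def toy_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in reversed(range(4)):
        left, right = xor(right, _round(key, r, left)), left
    return left + right


def mul_alpha(t: int) -> int:
    """Multiply by alpha (the polynomial x) modulo x^128 + x^7 + x^2 + x + 1."""
    t <<= 1
    if t >> 128:                       # an x^128 term appeared: reduce it
        t = (t & MASK) ^ POLY
    return t


def tweak(k2: bytes, i: int, j: int) -> bytes:
    t = int.from_bytes(toy_encrypt(k2, i.to_bytes(16, "big")), "big")
    for _ in range(j):                 # T = E(K2, i) (x) alpha^j
        t = mul_alpha(t)
    return t.to_bytes(16, "big")


def xts_block_enc(k1: bytes, k2: bytes, p: bytes, i: int, j: int) -> bytes:
    t = tweak(k2, i, j)
    return xor(toy_encrypt(k1, xor(p, t)), t)    # C = E(K1, P xor T) xor T


def xts_block_dec(k1: bytes, k2: bytes, c: bytes, i: int, j: int) -> bytes:
    t = tweak(k2, i, j)
    return xor(toy_decrypt(k1, xor(c, t)), t)    # P = D(K1, C xor T) xor T


k1, k2 = b"key one", b"key two"
p = b"sixteen byte blk"
i = 42                                            # data unit (sector) tweak
c0 = xts_block_enc(k1, k2, p, i, 0)
c1 = xts_block_enc(k1, k2, p, i, 1)               # same block, next position
c_other = xts_block_enc(k1, k2, p, i + 1, 0)      # same position, next sector
print(xts_block_dec(k1, k2, c0, i, 0) == p, c0 != c1, c0 != c_other)
```

The same plaintext block yields different ciphertext at a different position j or in a different data unit i, while the whole computation for one block remains independent of every other block.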
Let us label the block encryption and decryption algorithms of Figure 7.10 as

Block encryption: XTS-AES-blockEnc(K, $P_j$, i, j)
Block decryption: XTS-AES-blockDec(K, $C_j$, i, j)

Then, XTS-AES mode is defined as follows:

XTS-AES mode with null final block

$$C_j = \text{XTS-AES-blockEnc}(K, P_j, i, j), \quad j = 0, \ldots, m-1$$
$$P_j = \text{XTS-AES-blockDec}(K, C_j, i, j), \quad j = 0, \ldots, m-1$$

XTS-AES mode with final block containing s bits

Encryption:
$$C_j = \text{XTS-AES-blockEnc}(K, P_j, i, j), \quad j = 0, \ldots, m-2$$
$$XX = \text{XTS-AES-blockEnc}(K, P_{m-1}, i, m-1)$$
$$CP = LSB_{128-s}(XX)$$
$$YY = P_m \parallel CP$$
$$C_{m-1} = \text{XTS-AES-blockEnc}(K, YY, i, m)$$
$$C_m = MSB_s(XX)$$

Decryption:
$$P_j = \text{XTS-AES-blockDec}(K, C_j, i, j), \quad j = 0, \ldots, m-2$$
$$YY = \text{XTS-AES-blockDec}(K, C_{m-1}, i, m)$$
$$CP = LSB_{128-s}(YY)$$
$$XX = C_m \parallel CP$$
$$P_{m-1} = \text{XTS-AES-blockDec}(K, XX, i, m-1)$$
$$P_m = MSB_s(YY)$$

[Figure 7.11 XTS-AES Mode: (a) Encryption, (b) Decryption]

As can be seen, XTS-AES mode, like CTR mode, is suitable for parallel operation. Because there is no chaining, multiple blocks can be encrypted or decrypted simultaneously. Unlike CTR mode, XTS-AES mode includes a nonce (the parameter i) as well as a counter (parameter j).

7.8 FORMAT-PRESERVING ENCRYPTION

Format-preserving encryption (FPE) refers to any encryption technique that takes a plaintext in a given format and produces a ciphertext in the same format. For example, credit card numbers consist of 16 decimal digits. An FPE that can accept this type of input would produce a ciphertext output of 16 decimal digits. Note that the ciphertext need not be, and in fact is unlikely to be, a valid credit card number.
But it will have the same format and can be stored in the same way as credit card number plaintext.

A simple encryption algorithm is not format preserving, with the exception that it preserves the format of binary strings. For example, Table 7.2 shows three types of plaintext for which it might be desired to perform FPE. The third row shows examples of what might be generated by an FPE algorithm. The fourth row shows (in hexadecimal) what is produced by AES with a given key.

Motivation

FPE facilitates the retrofitting of encryption technology to legacy applications, where a conventional encryption mode might not be feasible because it would disrupt data fields/pathways. FPE has emerged as a useful cryptographic tool, whose applications include financial-information security, data sanitization, and transparent encryption of fields in legacy databases.

The principal benefit of FPE is that it enables protection of particular data elements in a legacy database that did not provide encryption of those data elements, while still enabling workflows that were in place before FPE was in use. With FPE, as opposed to ordinary AES encryption or TDEA encryption, no database schema changes and minimal application changes are required. Only applications that need to see the plaintext of a data element need to be modified, and generally these modifications will be minimal.

Some examples of legacy applications where FPE is desirable:

■ COBOL data-processing applications: Any change in the structure of a record requires corresponding changes in all code that references that record structure. Typical code sizes involve hundreds of modules, each containing around 5,000–10,000 lines on average.
■ Database applications: Fields that are specified to take only character strings cannot be used to store conventionally encrypted binary ciphertext. Base64 encoding of such binary ciphertext is not always feasible without an increase in data lengths, requiring augmentation of corresponding field lengths.
■ FPE-encrypted characters can be significantly compressed for efficient transmission. This cannot be said about AES-encrypted binary ciphertext.

Table 7.2 Comparison of Format-Preserving Encryption and AES

| | Credit Card | Tax ID | Bank Account Number |
|---|---|---|---|
| Plaintext | 8123 4512 3456 6780 | 219-09-9999 | 800N2982K-22 |
| FPE | 8123 4521 7292 6780 | 078-05-1120 | 709G9242H-35 |
| AES (hex) | af411326466add24 c86abd8aa525db7a | 7b9af4f3f218ab25 07c7376869313afa | 9720ec7f793096ff d37141242e1c51bd |

Difficulties in Designing an FPE

A general-purpose standardized FPE should meet a number of requirements:

1. The ciphertext is of the same length and format as the plaintext.
2. It should be adaptable to work with a variety of character and number types. Examples include decimal digits, lowercase alphabetic characters, and the full character set of a standard keyboard or international keyboard.
3. It should work with variable plaintext lengths.
4. Security strength should be comparable to that achieved with AES.
5. Security should be strong even for very small plaintext lengths.

Meeting the first requirement is not at all straightforward. As illustrated in Table 7.2, a straightforward encryption with AES yields a 128-bit binary block that does not resemble the required format. Also, a standard symmetric block cipher is not easily adaptable to produce an FPE.

Consider a simple example. Assume that we want an algorithm that can encrypt decimal digit strings of maximum length of 32 digits. The input to the algorithm can be stored in 16 bytes (128 bits) by encoding each digit as four bits and using the corresponding binary value for each digit (e.g., 6 is encoded as 0110).
Next, we use AES to encrypt the 128-bit block, in the following fashion:

1. The plaintext input X is represented by the string of 4-bit decimal digits X[1] . . . X[32]. If the plaintext is less than 32 digits long, it is padded out to the left (most significant) with zeros.
2. Treating X as a 128-bit binary string and using key K, form ciphertext $Y = AES_K(X)$.
3. Treat Y as a string of length 32 of 4-bit elements.
4. Some of the entries in Y may have values greater than 9 (e.g., 1100). To generate ciphertext Z in the required format, calculate

$$Z[i] = Y[i] \bmod 10, \quad 1 \le i \le 32$$

This generates a ciphertext of 32 decimal digits, which conforms to the desired format. However, this algorithm does not meet the basic requirement of any encryption algorithm of reversibility. It is impossible to decrypt Z to recover the original plaintext X because the operation is one-way; that is, it is a many-to-one function. For example, 12 mod 10 = 2 mod 10 = 2. Thus, we need to design a reversible function that is both a secure encryption algorithm and format preserving.

A second difficulty in designing an FPE is that some of the input strings are quite short. For example, consider the 16-digit credit card number (CCN). The first six digits provide the issuer identification number (IIN), which identifies the institution that issued the card. The final digit is a check digit to catch typographical errors or other mistakes. The remaining nine digits are the user’s account number. However, a number of applications require that the last four digits be in the clear (the check digit plus three account digits) for applications such as credit card receipts, which leaves only six digits for encryption. Now suppose that an adversary is able to obtain a number of plaintext/ciphertext pairs. Each such pair corresponds to not just one CCN, but multiple CCNs that have the same middle six digits.
In a large database of credit card numbers, there may be multiple card numbers with the same middle six digits. An adversary may be able to assemble a large dictionary mapping known six-digit plaintexts to their corresponding ciphertexts. This could be used to decrypt unknown ciphertexts from the database. As pointed out in [BELL10a], in a database of 100 million entries, on average about 100 CCNs will share any given middle six digits. Thus, if the adversary has learned k CCNs and gains access to such a database, the adversary can decrypt approximately 100k CCNs.
The solution to this second difficulty is to use a tweakable block cipher; this concept is described in Section 7.7. For example, the tweak for CCNs could be the first two and last four digits of the CCN. Prior to encryption, the tweak is added, digit-by-digit mod 10, to the middle six-digit plaintext, and the result is then encrypted. Two different CCNs with identical middle six digits will yield different tweaked inputs and therefore different ciphertexts. Consider the following:

CCN                    Tweak     Plaintext    Plaintext + Tweak
4012 8812 3456 1884    401884    123456       524230
5105 1012 3456 6782    516782    123456       639138

Two CCNs with the same middle six digits have different tweaks and therefore different values for the middle six digits after the tweak is added.

Feistel Structure for Format-Preserving Encryption
As the preceding discussion shows, the challenge with FPE is to design an algorithm for scrambling the plaintext that is secure, preserves format, and is reversible. A number of approaches have been proposed in recent years [ROGA10, BELL09] for FPE algorithms. The majority of these proposals use a Feistel structure. Although IBM introduced this structure with their Lucifer cipher [SMIT71] almost half a century ago, it remains a powerful basis for implementing ciphers.
This section provides a general description of how the Feistel structure can be used to implement an FPE.
In the following section, we look at three specific Feistel-based algorithms that are in the process of receiving NIST approval.
ENCRYPTION AND DECRYPTION Figure 7.12 shows the Feistel structure used in all of the NIST algorithms, with encryption shown on the left-hand side and decryption on the right-hand side. The structure in Figure 7.12 is the same as that shown in Figure 4.3 but, to simplify the presentation, it is untwisted, not illustrating the swap that occurs at the end of each round.
The input to the encryption algorithm is a plaintext character string of n = u + v characters. If n is even, then u = v, otherwise u and v differ by 1. The two parts of the string pass through an even number of rounds of processing to produce a ciphertext block of n characters and the same format as the plaintext.
Figure 7.12 Feistel Structure for Format-Preserving Encryption: (a) Encryption, (b) Decryption
Each round i has inputs $A_i$ and $B_i$, derived from the preceding round (or plaintext for round 0). All rounds have the same structure. On even-numbered rounds, a substitution is performed on the left part (length u) of the data, $A_i$. This is done by applying the round function $F_K$ to the right part (length v) of the data, $B_i$, and then performing a modular addition of the output of $F_K$ with $A_i$. The modular addition function and the selection of modulus are described subsequently.
On odd-numbered rounds, the substitution is done on the right part of the data. $F_K$ is a one-way function that converts the input into a binary string, performs a scrambling transformation on the string, and then converts the result back into a character string of suitable format and length. The function has as parameters the secret key K, the plaintext length n, a tweak T, and the round number i.
Note that on even-numbered rounds, $F_K$ has an input of v characters, and that the modular addition produces a result of u characters, whereas on odd-numbered rounds, $F_K$ has an input of u characters, and that the modular addition produces a result of v characters. The total number of rounds is even, so that the output consists of an A portion of length u concatenated with a B portion of length v, matching the partition of the plaintext.
The process of decryption is essentially the same as the encryption process. The differences are: (1) the addition function is replaced by a subtraction function that is its inverse; and (2) the order of the round indices is reversed.
To demonstrate that the decryption produces the correct result, Figure 7.12b shows the encryption process going down the left-hand side and the decryption process going up the right-hand side. The diagram indicates that, at every round, the intermediate value of the decryption process is equal to the corresponding value of the encryption process. We can walk through the figure to validate this, starting at the bottom. The ciphertext is produced at the end of round r - 1 as a string of the form $A_r \,\|\, B_r$, with $A_r$ and $B_r$ having string lengths u and v, respectively.
Encryption round r - 1 can be described with the following equations:
$A_r = B_{r-1}$
$B_r = A_{r-1} + F_K[B_{r-1}]$
Because we define the subtraction function to be the inverse of the addition function, these equations can be rewritten:
$B_{r-1} = A_r$
$A_{r-1} = B_r - F_K[B_{r-1}]$
It can be seen that the last two equations describe the action of round 0 of the decryption function, so that the output of round 0 of decryption equals the input of round r - 1 of encryption. This correspondence holds all the way through the r iterations, as is easily shown.
Note that the derivation does not require that F be a reversible function. To see this, take a limiting case in which F produces a constant output (e.g., all ones) regardless of the values of its input. The equations still hold.
CHARACTER STRINGS The NIST algorithms, and the other FPE algorithms that have been proposed, are used with plaintext consisting of a string of elements, called characters. Specifically, a finite set of two or more symbols is called an alphabet, and the elements of an alphabet are called characters. A character string is a finite sequence of characters from an alphabet. Individual characters may repeat in the string. The number of different characters in an alphabet is called the base, also referred to as the radix of the alphabet.
For example, the lowercase English alphabet a, b, c, . . . has a radix, or base, of 26. For purposes of encryption and decryption, the plaintext alphabet must be converted to numerals, where a numeral is a nonnegative integer that is less than the base. For example, for the lowercase alphabet, the assignment could be characters a, b, c, . . . , z map into 0, 1, 2, . . . , 25.
A limitation of this approach is that all of the elements in a plaintext format must have the same radix.
So, for example, an identification number that consists of an alphabetic character followed by nine numeric digits cannot be handled in format-preserving fashion by the FPEs that have been implemented so far.
The NIST document defines notation for specifying these conversions (Table 7.3a). To begin, it is assumed that the character string is represented by a numeral string. To convert a numeral string X into a number x, the function $\mathrm{NUM}_{radix}(X)$ is used. Viewing X as the string X[1] . . . X[m] with the most significant numeral first, the function is defined as
$\mathrm{NUM}_{radix}(X) = \sum_{i=1}^{m} X[i]\, radix^{m-i} = \sum_{i=0}^{m-1} X[m-i]\, radix^{i}$
Observe that $0 \le \mathrm{NUM}_{radix}(X) < radix^{m}$ and that $0 \le X[i] < radix$.

Table 7.3 Notation and Parameters Used in FPE Algorithms
(a) Notation
$[x]^s$: Converts an integer into a byte string; it is the string of s bytes that encodes the number x, with $0 \le x < 2^{8s}$. The equivalent notation is $\mathrm{STR}_2^{8s}(x)$.
LEN(X): Length of the character string X.
$\mathrm{NUM}_{radix}(X)$: Converts strings to numbers. The number that the numeral string X represents in base radix, with the most significant character first. In other words, it is the nonnegative integer less than $radix^{\mathrm{LEN}(X)}$ whose most-significant-character-first representation in base radix is X.
$\mathrm{PRF}_K(X)$: A pseudorandom function that produces a 128-bit output with X as the input, using encryption key K.
$\mathrm{STR}_{radix}^{m}(x)$: Given a nonnegative integer x less than $radix^{m}$, this function produces a representation of x as a string of m characters in base radix, with the most significant character first.
[i .. j]: The set of integers between two integers i and j, including i and j.
X[i .. j]: The substring of characters of a string X from X[i] to X[j], including X[i] and X[j].
REV(X): Given a bit string, X, the string that consists of the bits of X in reverse order.
(b) Parameters
radix: The base, or number of characters, in a given plaintext alphabet.
tweak: Input parameter to the encryption and decryption functions whose confidentiality is not protected by the mode.
tweakradix: The base for tweak strings.
minlen: Minimum message length, in characters.
maxlen: Maximum message length, in characters.
maxTlen: Maximum tweak length.

For example, consider the string zaby in radix 26, which converts to the numeral string 25 0 1 24. This converts to the number $x = (25 \times 26^{3}) + (1 \times 26^{1}) + 24 = 439450$.
To go in the opposite direction and convert from a number $x < radix^{m}$ to a numeral string X of length m, the function $\mathrm{STR}_{radix}^{m}(x)$ is used:
$\mathrm{STR}_{radix}^{m}(x) = X[1] \ldots X[m]$, where $X[i] = \lfloor x / radix^{m-i} \rfloor \bmod radix, \quad i = 1, \ldots, m$
With the mapping of characters to numerals and the use of the NUM function, a plaintext character string can be mapped to a number and stored as an unsigned integer. We would like to treat this unsigned integer as a bit string that can be input to a bit-scrambling algorithm in $F_K$. However, different platforms store unsigned integers differently, some in little-endian and some in big-endian fashion. So one more step is needed. By the definition of the STR function, $\mathrm{STR}_2^{8s}(x)$ will generate a bit string of length 8s, equivalently a byte string of length s, which is a binary integer with the most significant bit first, regardless of how x is stored as an unsigned integer. For convenience the following notation is used: $[x]^s = \mathrm{STR}_2^{8s}(x)$. Thus, $[\mathrm{NUM}_{radix}(X)]^s$ will convert the character string X into an unsigned integer and then convert that to a byte string of length s bytes with the most significant bit first.
Continuing the preceding example should help clarify the issues involved.
Character string: “zaby”
Numeral string X representation of character string: 25 0 1 24
Convert X to number $x = \mathrm{NUM}_{26}(X)$:
    decimal: 439450
    hex: 6B49A
    binary: 1101011010010011010
x stored on big-endian byte order machine as a 32-bit unsigned integer:
    hex: 00 06 B4 9A
    binary: 00000000000001101011010010011010
x stored on little-endian byte order machine as a 32-bit unsigned integer:
    hex: 9A B4 06 00
    binary: 10011010101101000000011000000000
Convert x, regardless of endian format, to a bit string of length 32 bits (4 bytes), expressed as $[x]^4$:
    00000000000001101011010010011010

THE FUNCTION $F_K$ We can now define in general terms the function $F_K$. The core of $F_K$ is some type of randomizing function whose input and output are bit strings. For convenience, the strings should be multiples of 8 bits, forming byte strings. Define m to be u for even rounds and v for odd rounds; this specifies the desired output character string length. Define b to be the number of bytes needed to store the number representing a character string of m characters. Then the round, including $F_K$, consists of the following general steps (A and B refer to $A_i$ and $B_i$ for round i):
1. $Q \leftarrow [\mathrm{NUM}_{radix}(B)]^b$: Converts numeral string B into byte string Q of length b bytes.
2. $Y \leftarrow \mathrm{RAN}[Q]$: A pseudorandom function that produces a pseudorandom byte string Y as a function of the bits of Q.
3. $y \leftarrow \mathrm{NUM}_2(Y)$: Converts Y into an unsigned integer.
4. $c \leftarrow (\mathrm{NUM}_{radix}(A) + y) \bmod radix^{m}$: Converts numeral string A into an integer and adds it to y, modulo $radix^{m}$.
5. $C \leftarrow \mathrm{STR}_{radix}^{m}(c)$: Converts c into a numeral string C of length m.
6. $A \leftarrow B$; $B \leftarrow C$: Completes the round by placing the unchanged value of B from the preceding round into A, and placing C into B.
Steps 1 through 3 constitute the round function $F_K$. Step 3 is presented with Y, which is an unstructured bit string.
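The six steps above, together with the NUM and STR conversions defined earlier, can be sketched in Python. This is a toy illustration under stated assumptions, not one of the NIST algorithms and not suitable for protecting real data: HMAC-SHA-256 from the standard library stands in for the unspecified randomizing function RAN, and the function names, the way n, T, and i are packed into the input, and the default of 10 rounds are hypothetical choices made only for this example.

```python
import hashlib
import hmac


def num(numerals, radix):
    """NUM_radix(X): value of a numeral string, most significant numeral first."""
    x = 0
    for d in numerals:
        if not 0 <= d < radix:
            raise ValueError("numeral out of range for this radix")
        x = x * radix + d
    return x


def to_str(x, radix, m):
    """STR^m_radix(x): x as m numerals in base radix, most significant first."""
    if not 0 <= x < radix ** m:
        raise ValueError("x does not fit in m numerals")
    out = [0] * m
    for i in range(m - 1, -1, -1):
        x, out[i] = divmod(x, radix)
    return out


def round_value(key, b_part, radix, n, tweak, i):
    """Steps 1-3: y = F_K(B), parameterized by n, the tweak T and the round i."""
    nbytes = ((radix ** len(b_part) - 1).bit_length() + 7) // 8
    q = num(b_part, radix).to_bytes(nbytes, "big")      # step 1: [NUM_radix(B)]^b
    header = f"{radix}|{n}|{i}|".encode() + tweak + b"|"  # hypothetical encoding
    # Step 2: HMAC-SHA-256 is an assumed stand-in for RAN; it is not reversible.
    y_bytes = hmac.new(key, header + q, hashlib.sha256).digest()
    return int.from_bytes(y_bytes, "big")               # step 3: NUM_2(Y)


def fpe_encrypt(key, numerals, radix, tweak=b"", rounds=10):
    if rounds % 2 or len(numerals) < 2:
        raise ValueError("need an even number of rounds and at least 2 numerals")
    n = len(numerals)
    u = n // 2
    a, b = list(numerals[:u]), list(numerals[u:])
    for i in range(rounds):
        m = u if i % 2 == 0 else n - u                  # length of the part replaced
        y = round_value(key, b, radix, n, tweak, i)
        c = (num(a, radix) + y) % radix ** m            # step 4
        a, b = b, to_str(c, radix, m)                   # steps 5 and 6
    return a + b


def fpe_decrypt(key, numerals, radix, tweak=b"", rounds=10):
    if rounds % 2 or len(numerals) < 2:
        raise ValueError("need an even number of rounds and at least 2 numerals")
    n = len(numerals)
    u = n // 2
    a, b = list(numerals[:u]), list(numerals[u:])
    for i in reversed(range(rounds)):                   # round indices in reverse
        m = u if i % 2 == 0 else n - u
        y = round_value(key, a, radix, n, tweak, i)     # a holds that round's B input
        c = (num(b, radix) - y) % radix ** m            # subtraction undoes step 4
        a, b = to_str(c, radix, m), a
    return a + b


key = b"demo key, not for real use"
middle_six = [1, 2, 3, 4, 5, 6]
ciphertext = fpe_encrypt(key, middle_six, radix=10, tweak=b"401884")
recovered = fpe_decrypt(key, ciphertext, radix=10, tweak=b"401884")
```

In this sketch the ciphertext is again six decimal digits, and decryption recovers the plaintext even though the keyed hash cannot be inverted, which mirrors the earlier observation that the derivation does not require F to be reversible.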
Because different platforms may store unsigned integers using different word lengths and endian conventions, it is necessary to perform $\mathrm{NUM}_2(Y)$ to get an unsigned integer y. The stored bit sequence for y may or may not be identical to the bit sequence for Y.
As mentioned, the pseudorandom function in step 2 need not be reversible. Its purpose is to provide a randomized, scrambled bit string. For DES, this is achieved by using fixed S-boxes, as described in Appendix S. Virtually all FPE schemes that use the Feistel structure use AES as the basis for the scrambling function to achieve stronger security.
RELATIONSHIP BETWEEN RADIX, MESSAGE LENGTH, AND BIT LENGTH Consider a numeral string X of length len and base radix. If we convert this to a number $x = \mathrm{NUM}_{radix}(X)$, then the maximum value of x is $radix^{len} - 1$. The number of bits needed to encode x is bitlen =

from B, or to sign messages sent to B, then B will require A’s public key, which can
be obtained from the following certification path:
B can obtain this set of certificates from the directory, or A can provide them
as part of its initial message to B.
REVOCATION OF CERTIFICATES Recall from Figure 14.15 that each certificate includes
a period of validity, much like a credit card. Typically, a new certificate is issued just
before the expiration of the old one. In addition, it may be desirable on occasion to
revoke a certificate before it expires, for one of the following reasons.
1. The user’s private key is assumed to be compromised.
2. The user is no longer certified by this CA. Reasons for this include that the
subject’s name has changed, the certificate is superseded, or the certificate was
not issued in conformance with the CA’s policies.
3. The CA’s certificate is assumed to be compromised.
Each CA must maintain a list consisting of all revoked but not expired
certificates issued by that CA, including both those issued to users and to other
CAs. These lists should also be posted on the directory.
Figure 14.16 X.509 Hierarchy: A Hypothetical Example
Each certificate revocation list (CRL) posted to the directory is signed by the
issuer and includes (Figure 14.15b) the issuer’s name, the date the list was created,
the date the next CRL is scheduled to be issued, and an entry for each revoked
certificate. Each entry consists of the serial number of a certificate and revocation
date for that certificate. Because serial numbers are unique within a CA, the serial
number is sufficient to identify the certificate.
When a user receives a certificate in a message, the user must determine
whether the certificate has been revoked. The user could check the directory each
time a certificate is received. To avoid the delays (and possible costs) associated
with directory searches, it is likely that the user would maintain a local cache of
certificates and lists of revoked certificates.
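The CRL contents and the local-cache idea described above can be sketched as a small data structure. This is a minimal illustration, assuming hypothetical names (`Crl`, `CrlCache`, `fetch`) and assuming that signature verification on the CRL has already been done elsewhere; it is not the encoding or API of any particular X.509 library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Crl:
    issuer: str                    # issuer's name
    this_update: datetime          # date the list was created
    next_update: datetime          # date the next CRL is scheduled to be issued
    # One entry per revoked certificate: serial number -> revocation date.
    revoked: dict = field(default_factory=dict)


class CrlCache:
    """Local cache of CRLs, refreshed from the directory only when stale."""

    def __init__(self, fetch):
        self._fetch = fetch        # assumed callable: issuer name -> Crl
        self._by_issuer = {}

    def is_revoked(self, issuer, serial, now):
        crl = self._by_issuer.get(issuer)
        if crl is None or now >= crl.next_update:
            crl = self._fetch(issuer)          # directory search
            self._by_issuer[issuer] = crl
        # Serial numbers are unique within a CA, so the serial alone suffices.
        return serial in crl.revoked


directory_lookups = []


def fetch_from_directory(issuer):
    # Stand-in for a real directory query; returns fixed example data.
    directory_lookups.append(issuer)
    day = datetime(2024, 1, 1, tzinfo=timezone.utc)
    return Crl(issuer, day, datetime(2024, 1, 8, tzinfo=timezone.utc), {1234: day})


cache = CrlCache(fetch_from_directory)
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
first = cache.is_revoked("CA-X", 1234, now)    # triggers one directory lookup
second = cache.is_revoked("CA-X", 99, now)     # answered from the cache
```

Refreshing only when the scheduled next-issue date has passed is one possible policy; a deployment that needs fresher revocation data would have to query the directory more often.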
X.509 Version 3
The X.509 version 2 format does not convey all of the information that recent design
and implementation experience has shown to be needed. [FORD95] lists the
following requirements not satisfied by version 2.
1. The subject field is inadequate to convey the identity of a key owner to a
public-key user. X.509 names may be relatively short and lacking in obvious
identification details that may be needed by the user.
2. The subject field is also inadequate for many applications, which typically
recognize entities by an Internet email address, a URL, or some other Internet-
related identification.
3. There is a need to indicate security policy information. This enables a security
application or function, such as IPSec, to relate an X.509 certificate to a given
policy.
4. There is a need to limit the damage that can result from a faulty or malicious
CA by setting constraints on the applicability of a particular certificate.
5. It is important to be able to identify different keys used by the same owner at
different times. This feature supports key lifecycle management: in particular,
the ability to update key pairs for users and CAs on a regular basis or under
exceptional circumstances.
Rather than continue to add fields to a fixed format, standards developers
felt that a more flexible approach was needed. Thus, version 3 includes a number
of optional extensions that may be added to the version 2 format. Each extension
consists of an extension identifier, a criticality indicator, and an extension value.
The criticality indicator indicates whether an extension can be safely ignored. If the
indicator has a value of TRUE and an implementation does not recognize the
extension, it must treat the certificate as invalid.
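The criticality rule can be sketched as follows. This is a minimal illustration with hypothetical names (`Extension`, `check_extensions`); the two recognized identifiers are the standard object identifiers for key usage and basic constraints, while `1.2.3.4` is a made-up identifier used only to trigger the rejection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extension:
    identifier: str    # extension identifier (an object identifier)
    critical: bool     # criticality indicator
    value: bytes       # extension value, treated as opaque here


def check_extensions(extensions, recognized):
    """Return the extensions to process; raise if an unrecognized one is critical."""
    usable = []
    for ext in extensions:
        if ext.identifier in recognized:
            usable.append(ext)
        elif ext.critical:
            # Unrecognized and critical: the certificate must be treated as invalid.
            raise ValueError(f"unrecognized critical extension {ext.identifier}")
        # Unrecognized and non-critical: safe to ignore.
    return usable


RECOGNIZED = {"2.5.29.15", "2.5.29.19"}   # key usage, basic constraints

ok = check_extensions(
    [Extension("2.5.29.15", True, b""), Extension("1.2.3.4", False, b"")],
    RECOGNIZED,
)

try:
    check_extensions([Extension("1.2.3.4", True, b"")], RECOGNIZED)
    rejected = False
except ValueError:
    rejected = True
```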
The certificate extensions fall into three main categories: key and policy
information, subject and issuer attributes, and certification path constraints.
KEY AND POLICY INFORMATION These extensions convey additional information
about the subject and issuer keys, plus indicators of certificate policy. A
certificate policy is a named set of rules that indicates the applicability of a
certificate to a particular community and/or class of application with common
security requirements. For example, a policy might be applicable to the
authentication of electronic data interchange (EDI) transactions for the trading
of goods within a given price range.
This area includes:
■ Authority key identifier: Identifies the public key to be used to verify the
signature on this certificate or CRL. Enables distinct keys of the same CA to
be differentiated. One use of this field is to handle CA key pair updating.
■ Subject key identifier: Identifies the public key being certified. Useful for
subject key pair updating. Also, a subject may have multiple key pairs and,
correspondingly, different certificates for different purposes (e.g., digital signature
and encryption key agreement).
■ Key usage: Indicates a restriction imposed as to the purposes for which, and
the policies under which, the certified public key may be used. May indicate
one or more of the following: digital signature, nonrepudiation, key
encryption, data encryption, key agreement, CA signature verification on certificates,
CA signature verification on CRLs.
■ Private-key usage period: Indicates the period of use of the private key
corresponding to the public key. Typically, the private key is used over a different
period from the validity of the public key. For example, with digital signature
keys, the usage period for the signing private key is typically shorter than that
for the verifying public key.
■ Certificate policies: Certificates may be used in environments where multiple
policies apply. This extension lists policies that the certificate is recognized as
supporting, together with optional qualifier information.
■ Policy mappings: Used only in certificates for CAs issued by other CAs. Policy
mappings allow an issuing CA to indicate that one or more of that issuer’s
policies can be considered equivalent to another policy used in the subject
CA’s domain.
CERTIFICATE SUBJECT AND ISSUER ATTRIBUTES These extensions support alternative
names, in alternative formats, for a certificate subject or certificate issuer and
can convey additional information about the certificate subject to increase a
certificate user’s confidence that the certificate subject is a particular person or entity.
For example, information such as postal address, position within a corporation, or
picture image may be required.
The extension fields in this area include:
■ Subject alternative name: Contains one or more alternative names, using any
of a variety of forms. This field is important for supporting certain applications,
such as electronic mail, EDI, and IPSec, which may employ their own name
forms.
■ Issuer alternative name: Contains one or more alternative names, using any of
a variety of forms.
■ Subject directory attributes: Conveys any desired X.500 directory attribute
values for the subject of this certificate.

CERTIFICATION PATH CONSTRAINTS These extensions allow constraint specifications
to be included in certificates issued for CAs by other CAs. The constraints may
restrict the types of certificates that can be issued by the subject CA or that may
occur subsequently in a certification chain.
The extension fields in this area include:
■ Basic constraints: Indicates if the subject may act as a CA. If so, a certification
path length constraint may be specified.
■ Name constraints: Indicates a name space within which all subject names in
subsequent certificates in a certification path must be located.
■ Policy constraints: Specifies constraints that may require explicit
certificate policy identification or inhibit policy mapping for the remainder of the
certification path.
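As an illustration of the basic constraints extension, the following sketch checks a certification path against the CA flag and the path length constraint. It assumes hypothetical names (`Cert`, `check_basic_constraints`) and reads the length constraint as the maximum number of intermediate CA certificates that may follow a given CA certificate in the path, which is how RFC 5280 defines it; self-issued certificates, name constraints, and policy constraints are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cert:
    subject: str
    is_ca: bool = False                # basic constraints: may the subject act as a CA?
    path_len: Optional[int] = None     # None means no path length constraint


def check_basic_constraints(path):
    """path[0] is issued by the trust anchor; path[-1] is the end-entity certificate."""
    for i, cert in enumerate(path[:-1]):
        if not cert.is_ca:
            return False               # only a CA may issue the next certificate
        following_cas = len(path) - i - 2   # CA certificates between cert and the end entity
        if cert.path_len is not None and following_cas > cert.path_len:
            return False
    return True


leaf = Cert("www.example.com")
short_path = [Cert("CA1", is_ca=True, path_len=0), leaf]
long_path = [Cert("CA1", is_ca=True, path_len=0), Cert("CA2", is_ca=True), leaf]
```

Here `short_path` satisfies the constraint, while `long_path` does not, because CA1 allows no further CA certificate below it.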
RFC 4949 (Internet Security Glossary) defines public-key infrastructure (PKI) as
the set of hardware, software, people, policies, and procedures needed to create,
manage, store, distribute, and revoke digital certificates based on asymmetric
cryptography. The principal objective for developing a PKI is to enable secure,
convenient, and efficient acquisition of public keys. The Internet Engineering Task
Force (IETF) Public Key Infrastructure X.509 (PKIX) working group has been the
driving force behind setting up a formal (and generic) model based on X.509 that is
suitable for deploying a certificate-based architecture on the Internet. This section
describes the PKIX model.
Figure 14.17 shows the interrelationship among the key elements of the PKIX
model. These elements are
■ End entity: A generic term used to denote end users, devices (e.g., servers,
routers), or any other entity that can be identified in the subject field of a
public-key certificate. End entities typically consume and/or support PKI-
related services.
■ Certification authority (CA): The issuer of certificates and (usually)
certificate revocation lists (CRLs). It may also support a variety of administrative
functions, although these are often delegated to one or more Registration
Authorities.
■ Registration authority (RA): An optional component that can assume a
number of administrative functions from the CA. The RA is often associated with
the end entity registration process but can assist in a number of other areas
as well.
■ CRL issuer: An optional component that a CA can delegate to publish CRLs.
■ Repository: A generic term used to denote any method for storing certificates
and CRLs so that they can be retrieved by end entities.

PKIX Management Functions
PKIX identifies a number of management functions that potentially need to be
supported by management protocols. These are indicated in Figure 14.17 and
include the following:
■ Registration: This is the process whereby a user first makes itself known to
a CA (directly or through an RA), prior to that CA issuing a certificate or
certificates for that user. Registration begins the process of enrolling in a PKI.
Registration usually involves some offline or online procedure for mutual
authentication. Typically, the end entity is issued one or more shared secret
keys used for subsequent authentication.
■ Initialization: Before a client system can operate securely, it is necessary to
install key materials that have the appropriate relationship with keys stored
elsewhere in the infrastructure. For example, the client needs to be securely
initialized with the public key and other assured information of the trusted
CA(s), to be used in validating certificate paths.
■ Certification: This is the process in which a CA issues a certificate for a user’s
public key, returns that certificate to the user’s client system, and/or posts that
certificate in a repository.
■ Key pair recovery: Key pairs can be used to support digital signature creation
and verification, encryption and decryption, or both. When a key pair is used for
encryption/decryption, it is important to provide a mechanism to recover the
necessary decryption keys when normal access to the keying material is no longer
possible; otherwise it will not be possible to recover the encrypted data. Loss of
access to the decryption key can result from forgotten passwords/PINs, corrupted
disk drives, damage to hardware tokens, and so on. Key pair recovery allows end
entities to restore their encryption/decryption key pair from an authorized key
backup facility (typically, the CA that issued the end entity’s certificate).
■ Key pair update: All key pairs need to be updated regularly (i.e., replaced
with a new key pair) and new certificates issued. Update is required when the
certificate lifetime expires and as a result of certificate revocation.
■ Revocation request: An authorized person advises a CA of an abnormal
situation requiring certificate revocation. Reasons for revocation include
private-key compromise, change in affiliation, and name change.
■ Cross certification: Two CAs exchange information used in establishing a
cross-certificate. A cross-certificate is a certificate issued by one CA to another
CA that contains a CA signature key used for issuing certificates.
Figure 14.17 PKIX Architectural Model
PKIX Management Protocols
The PKIX working group has defined two alternative management protocols
between PKIX entities that support the management functions listed in the
preceding subsection. RFC 2510 defines the certificate management protocols (CMP).
Within CMP, each of the management functions is explicitly identified by specific
protocol exchanges. CMP is designed to be a flexible protocol able to accommodate
a variety of technical, operational, and business models.
RFC 2797 defines certificate management messages over CMS (CMC), where
CMS refers to RFC 2630, cryptographic message syntax. CMC is built on earlier work
and is intended to leverage existing implementations. Although all of the PKIX
functions are supported, the functions do not all map into specific protocol exchanges.
Key Terms
end-to-end encryption
key distribution
key distribution center (KDC)
key management
man-in-the-middle attack
master key
public-key certificate
public-key directory
X.509 certificate

Review Questions
14.1 Explain why man-in-the-middle attacks are ineffective on the secret key distribution
protocol discussed in Figure 14.3.
14.2 What is the major issue in end-to-end key distribution? How does the key hierarchy
concept address that issue?
14.3 What is a nonce?
14.4 What is a key distribution center?
14.5 What are two different uses of public-key cryptography related to key distribution?
14.6 List four general categories of schemes for the distribution of public keys.
14.7 Discuss the potential security issues that arise due to public key directory based
14.8 What is a public-key certificate?
14.9 What are the requirements for the use of a public-key certificate scheme?
14.10 What is the purpose of the X.509 standard?
14.11 What is a chain of certificates?
14.12 How is an X.509 certificate revoked?

Problems
14.1 One local area network vendor provides a key distribution facility, as illustrated in
Figure 14.18.
a. Describe the scheme.
b. Compare this scheme to that of Figure 14.3. What are the pros and cons?
14.2 “We are under great pressure, Holmes.” Detective Lestrade looked nervous. “We
have learned that copies of sensitive government documents are stored in computers
of one foreign embassy here in London. Normally these documents exist in electronic
form only on a selected few government computers that satisfy the most stringent
security requirements. However, sometimes they must be sent through the network
connecting all government computers. But all messages in this network are encrypted
using a top-secret encryption algorithm certified by our best crypto experts. Even the
NSA and the KGB are unable to break it. And now these documents have appeared
in the hands of diplomats of a small, otherwise insignificant, country. And we have no
idea how it could happen.”
“But you do have some suspicion who did it, don’t you?” asked Holmes.
“Yes, we did some routine investigation. There is a man who has legal access
to one of the government computers and has frequent contacts with diplomats from
the embassy. But the computer he has access to is not one of the trusted ones where
these documents are normally stored. He is the suspect, but we have no idea how he
could obtain copies of the documents. Even if he could obtain a copy of an encrypted
document, he couldn’t decrypt it.”
Figure 14.18 Figure for Problem 14.1
Key Distribution Center (KDC)
(1) IDA, E(Ka, Na)
(2) IDA, E(Ka, Na), IDB, E(Kb, Nb)
(3) E(Kb, [Ks, IDA, Nb]), E(Ka, [Ks, IDB, Na])
(4) E(Ka, [Ks, IDB, Na])
“Hmm, please describe the communication protocol used on the network.”
Holmes opened his eyes, thus proving that he had followed Lestrade’s talk with an
attention that contrasted with his sleepy look.
“Well, the protocol is as follows. Each node N of the network has been assigned
a unique secret key Kn. This key is used to secure communication between the node
and a trusted server. That is, all the keys are stored also on the server. User A, wishing
to send a secret message M to user B, initiates the following protocol:
1. A generates a random number R and sends to the server his name A, destination
B, and E(Ka, R).
2. Server responds by sending E(Kb, R) to A.
3. A sends E(R, M) together with E(Kb, R) to B.
4. B knows Kb, thus decrypts E(Kb, R), to get R and will subsequently use R to
decrypt E(R, M) to get M.
You see that a random key is generated every time a message has to be sent. I admit
the man could intercept messages sent between the top-secret trusted nodes, but I see
no way he could decrypt them.”
“Well, I think you have your man, Lestrade. The protocol isn’t secure because
the server doesn’t authenticate users who send him a request. Apparently designers
of the protocol have believed that sending E(Kx, R) implicitly authenticates user X as
the sender, as only X (and the server) knows Kx. But you know that E(Kx, R) can be
intercepted and later replayed. Once you understand where the hole is, you will be
able to obtain enough evidence by monitoring the man’s use of the computer he has
access to. Most likely he works as follows. After intercepting E(Ka, R) and E(R, M)
(see steps 1 and 3 of the protocol), the man, let’s denote him as Z, will continue by
pretending to be A and . . . 
Finish the sentence for Holmes.
14.3 The 1988 version of X.509 lists properties that RSA keys must satisfy to be secure
given current knowledge about the difficulty of factoring large numbers. The
discussion concludes with a constraint on the public exponent and the modulus n:
It must be ensured that $e > \log_2(n)$ to prevent attack by taking the $e$th
root mod n to disclose the plaintext.
Although the constraint is correct, the reason given for requiring it is incorrect. What
is wrong with the reason given and what is the correct reason?
14.4 Find at least one intermediate certification authority’s certificate and one trusted
root certification authority’s certificate on your computer (e.g., in the browser). Print
screenshots of both the general and details tab for each certificate.
14.5 NIST defines the term cryptoperiod as the time span during which a specific key is
authorized for use or in which the keys for a given system or application may remain
in effect. One document on key management uses the following time diagram for
a shared secret key.
[Time diagram: originator usage period and recipient usage period, overlapping in time]

Explain the overlap by giving an example application in which the originator’s usage
period for the shared secret key begins before the recipient’s usage period and also
ends before the recipient’s usage period.
14.6 Consider the following protocol, designed to let A and B decide on a fresh, shared
session key K′AB. We assume that they already share a long-term key KAB.
1. A → B: A, NA
2. B → A: E(KAB, [NA, K′AB])
3. A → B: E(K′AB, NA)
a. We first try to understand the protocol designer’s reasoning:
- Why would A and B believe after the protocol ran that they share K′AB with the
other party?
- Why would they believe that this shared key is fresh?
In both cases, you should explain the reasoning of both A and B, so your answer
should complete the sentences
A believes that she shares K′AB with B since . . .
B believes that he shares K′AB with A since . . .
A believes that K′AB is fresh since . . .
B believes that K′AB is fresh since . . .
b. Assume now that A starts a run of this protocol with B. However, the connection
is intercepted by the adversary C. Show how C can start a new run of the protocol
using reflection, causing A to believe that she has agreed on a fresh key with B (in
spite of the fact that she has only been communicating with C). Thus, in particular,
the belief in (a) is false.
c. Propose a modification of the protocol that prevents this attack.
14.7 What are the management functions of a PKI? What is a cross certificate?
14.8 State the significance of key pair recovery. When is the key pair updated?
Note: The remaining problems deal with a cryptographic product developed by IBM,
which is briefly described in a document at (IBMCrypto ). Try these
problems after reviewing the document.
14.9 What is the effect of adding the instruction EMKi?
EMKi: X → E(KMHi, X), i = 0, 1
14.10 Suppose N different systems use the IBM Cryptographic Subsystem with host master
keys KMH[i] (i = 1, 2, …, N). Devise a method for communicating between systems
without requiring the systems to either share a common host master key or to
divulge their individual host master keys. Hint: Each system needs three variants of
its host master key.
14.11 The principal objective of the IBM Cryptographic Subsystem is to protect transmis-
sions between a terminal and the processing system. Devise a procedure, perhaps
adding instructions, which will allow the processor to generate a session key KS and
distribute it to Terminal i and Terminal j without having to store a key-equivalent
variable in the host.

User Authentication
15.1 Remote User-Authentication Principles
The NIST Model for Electronic User Authentication
Means of Authentication
Mutual Authentication
One-Way Authentication
15.2 Remote User-Authentication Using Symmetric Encryption
Mutual Authentication
One-Way Authentication
15.3 Kerberos
Kerberos Version 4
Kerberos Version 5
15.4 Remote User-Authentication Using Asymmetric Encryption
Mutual Authentication
One-Way Authentication
15.5 Federated Identity Management
Identity Management
Identity Federation
15.6 Personal Identity Verification
PIV System Model
PIV Documentation
PIV Credentials and Keys
15.7 Key Terms, Review Questions, and Problems

This chapter examines some of the authentication functions that have been developed
to support network-based user authentication. The chapter begins with an introduc-
tion to some of the concepts and key considerations for user authentication over a
network or the Internet. The next section examines user-authentication protocols that
rely on symmetric encryption. This is followed by a section on one of the earliest and
also one of the most widely used authentication services: Kerberos. Next, the chapter
looks at user-authentication protocols that rely on asymmetric encryption. This is fol-
lowed by a discussion of the X.509 user-authentication protocol. Finally, the concept of
federated identity is introduced.
15.1 Remote User-Authentication Principles
In most computer security contexts, user authentication is the fundamental build-
ing block and the primary line of defense. User authentication is the basis for most
types of access control and for user accountability. RFC 4949 (Internet Security
Glossary) defines user authentication as the process of verifying an identity claimed
by or for a system entity. This process consists of two steps:
■ Identification step: Presenting an identifier to the security system. (Identifiers
should be assigned carefully, because authenticated identities are the basis for
other security services, such as access control service.)
■ Verification step: Presenting or generating authentication information that
corroborates the binding between the entity and the identifier.
For example, user Alice Toklas could have the user identifier ABTOKLAS.
This information needs to be stored on any server or computer system that Alice
wishes to use and could be known to system administrators and other users.
After studying this chapter, you should be able to:
◆ Understand the distinction between identification and verification.
◆ Present an overview of techniques for remote user authentication using
symmetric encryption.
◆ Give a presentation on Kerberos.
◆ Explain the differences between versions 4 and 5 of Kerberos.
◆ Describe the use of Kerberos in multiple realms.
◆ Present an overview of techniques for remote user authentication using
asymmetric encryption.
◆ Understand the need for a federated identity management system.
◆ Explain the use of PIV mechanisms as part of a user authentication system.

A typical item of authentication information associated with this user ID is a pass-
word, which is kept secret (known only to Alice and to the system). If no one is
able to obtain or guess Alice’s password, then the combination of Alice’s user ID
and password enables administrators to set up Alice’s access permissions and audit
her activity. Because Alice’s ID is not secret, system users can send her email, but
because her password is secret, no one can pretend to be Alice.
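The two steps can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the in-memory `_registry`, the function names, and the choice to store a salted PBKDF2 hash rather than the password itself are all assumptions added for the example and are not taken from the text.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory registry: user ID -> (salt, derived key).
_registry = {}

def register(user_id: str, password: str) -> None:
    """Store a salted, slow hash of the password, never the password itself."""
    salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    _registry[user_id] = (salt, derived)

def authenticate(user_id: str, password: str) -> bool:
    """Identification: look up the claimed ID. Verification: check the password."""
    record = _registry.get(user_id)
    if record is None:
        return False  # unknown identifier, so identification fails
    salt, expected = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)

register("ABTOKLAS", "correct horse battery staple")
```

The user ID is public and serves only to select the record; the secret password is what corroborates the binding between Alice and that ID.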
In essence, identification is the means by which a user provides a claimed
identity to the system; user authentication is the means of establishing the validity
of the claim. Note that user authentication is distinct from message authentication.
As defined in Chapter 12, message authentication is a procedure that allows com-
municating parties to verify that the contents of a received message have not been
altered and that the source is authentic. This chapter is concerned solely with user
authentication.
The NIST Model for Electronic User Authentication
NIST SP 800-63-2 (Electronic Authentication Guideline, August 2013) defines elec-
tronic user authentication as the process of establishing confidence in user identi-
ties that are presented electronically to an information system. Systems can use the
authenticated identity to determine if the authenticated individual is authorized to
perform particular functions, such as database transactions or access to system re-
sources. In many cases, the authentication and transaction or other authorized function
takes place across an open network such as the Internet. Equally, authentication and
subsequent authorization can take place locally, such as across a local area network.
SP 800-63-2 defines a general model for user authentication that involves a num-
ber of entities and procedures. We discuss this model with reference to Figure 15.1.
Figure 15.1 The NIST SP 800-63-2 E-Authentication Architectural Model
The initial requirement for performing user authentication is that the user
must be registered with the system. The following is a typical sequence for registra-
tion. An applicant applies to a registration authority (RA) to become a subscriber
of a credential service provider (CSP). In this model, the RA is a trusted entity that
establishes and vouches for the identity of an applicant to a CSP. The CSP then
engages in an exchange with the subscriber. Depending on the details of the over-
all authentication system, the CSP issues some sort of electronic credential to the
subscriber. The credential is a data structure that authoritatively binds an identity
and additional attributes to a token possessed by a subscriber, and can be verified
when presented to the verifier in an authentication transaction. The token could
be an encryption key or an encrypted password that identifies the subscriber. The
token may be issued by the CSP, generated directly by the subscriber, or provided
by a third party. The token and credential may be used in subsequent authentica-
tion events.
Once a user is registered as a subscriber, the actual authentication process can
take place between the subscriber and one or more systems that perform authen-
tication and, subsequently, authorization. The party to be authenticated is called a
claimant and the party verifying that identity is called a verifier. When a claimant
successfully demonstrates possession and control of a token to a verifier through an
authentication protocol, the verifier can verify that the claimant is the subscriber
named in the corresponding credential. The verifier passes on an assertion about the
identity of the subscriber to the relying party (RP). That assertion includes identity
information about a subscriber, such as the subscriber name, an identifier assigned
at registration, or other subscriber attributes that were verified in the registration
process. The RP can use the authenticated information provided by the verifier to
make access control or authorization decisions.
An implemented system for authentication will differ from or be more com-
plex than this simplified model, but the model illustrates the key roles and functions
needed for a secure authentication system.
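As a rough illustration of the roles in this model, the following Python sketch wires together a CSP, a claimant, and a verifier that produces an assertion for a relying party. The class names, the dictionary-shaped assertion, and the use of an HMAC over a random challenge to demonstrate possession of the token are assumptions made for the example; SP 800-63-2 does not prescribe them, and the RA's identity proofing is omitted.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Credential:
    """Binds a subscriber identity to a token (here, a shared secret key)."""
    subscriber: str
    token_key: bytes

class CSP:
    """Credential service provider: issues credentials to subscribers."""
    def __init__(self):
        self.credentials = {}

    def issue(self, subscriber: str) -> bytes:
        token_key = os.urandom(32)
        self.credentials[subscriber] = Credential(subscriber, token_key)
        return token_key  # handed to the subscriber as the token

class Verifier:
    def __init__(self, csp: CSP):
        self.csp = csp

    def challenge(self) -> bytes:
        return os.urandom(16)

    def verify(self, claimed: str, challenge: bytes, response: bytes):
        """Return an assertion for the relying party, or None on failure."""
        cred = self.csp.credentials.get(claimed)
        if cred is None:
            return None
        expected = hmac.new(cred.token_key, challenge, hashlib.sha256).digest()
        if hmac.compare_digest(expected, response):
            return {"subscriber": cred.subscriber, "authenticated": True}
        return None

def prove_possession(token_key: bytes, challenge: bytes) -> bytes:
    """Claimant side: demonstrate control of the token without revealing it."""
    return hmac.new(token_key, challenge, hashlib.sha256).digest()

csp = CSP()
alice_token = csp.issue("alice")      # registration (RA step omitted)
verifier = Verifier(csp)
c = verifier.challenge()
assertion = verifier.verify("alice", c, prove_possession(alice_token, c))
```

The relying party would consume `assertion` to make its access control decision; it never sees the token itself.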
Means of Authentication
There are four general means of authenticating a user’s identity, which can be used
alone or in combination:
■ Something the individual knows: Examples include a password, a personal
identification number (PIN), or answers to a prearranged set of questions.
■ Something the individual possesses: Examples include cryptographic keys,
electronic keycards, smart cards, and physical keys. This type of authenticator
is referred to as a token.
■ Something the individual is (static biometrics): Examples include recognition
by fingerprint, retina, and face.
■ Something the individual does (dynamic biometrics): Examples include recog-
nition by voice pattern, handwriting characteristics, and typing rhythm.
All of these methods, properly implemented and used, can provide secure
user authentication. However, each method has problems. An adversary may be
able to guess or steal a password. Similarly, an adversary may be able to forge or
steal a token. A user may forget a password or lose a token. Furthermore, there is a
significant administrative overhead for managing password and token information
on systems and securing such information on systems. With respect to biometric

authenticators, there are a variety of problems, including dealing with false positives
and false negatives, user acceptance, cost, and convenience. For network-based user
authentication, the most important methods involve cryptographic keys and some-
thing the individual knows, such as a password.
Mutual Authentication
An important application area is that of mutual authentication protocols. Such pro-
tocols enable communicating parties to satisfy themselves mutually about each oth-
er’s identity and to exchange session keys. This topic was examined in Chapter 14.
There, the focus was key distribution. We return to this topic here to consider the
wider implications of authentication.
Central to the problem of authenticated key exchange are two issues: confi-
dentiality and timeliness. To prevent masquerade and to prevent compromise of
session keys, essential identification and session-key information must be commu-
nicated in encrypted form. This requires the prior existence of secret or public keys
that can be used for this purpose. The second issue, timeliness, is important because
of the threat of message replays. Such replays, at worst, could allow an opponent to
compromise a session key or successfully impersonate another party. At minimum,
a successful replay can disrupt operations by presenting parties with messages that
appear genuine but are not.
[GONG93] lists the following examples of replay attacks:
1. The simplest replay attack is one in which the opponent simply copies a mes-
sage and replays it later.
2. An opponent can replay a timestamped message within the valid time window.
If both the original and the replay arrive within the time window, this inci-
dent can be logged.
3. As with example (2), an opponent can replay a timestamped message within
the valid time window, but in addition, the opponent suppresses the original
message. Thus, the repetition cannot be detected.
4. Another attack involves a backward replay without modification. This is a re-
play back to the message sender. This attack is possible if symmetric encryp-
tion is used and the sender cannot easily recognize the difference between
messages sent and messages received on the basis of content.
One approach to coping with replay attacks is to attach a sequence number to
each message used in an authentication exchange. A new message is accepted only
if its sequence number is in the proper order. The difficulty with this approach is
that it requires each party to keep track of the last sequence number for each claim-
ant it has dealt with. Because of this overhead, sequence numbers are generally not
used for authentication and key exchange. Instead, one of the following two general
approaches is used:
■ Timestamps: Party A accepts a message as fresh only if the message contains
a timestamp that, in A’s judgment, is close enough to A’s knowledge of cur-
rent time. This approach requires that clocks among the various participants
be synchronized.

■ Challenge/response: Party A, expecting a fresh message from B, first sends B
a nonce (challenge) and requires that the subsequent message (response) re-
ceived from B contain the correct nonce value.
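The two approaches can be contrasted in a short Python sketch. It is illustrative only: the `Challenger` class, the `respond` helper, and the use of an HMAC as B's response are assumptions introduced here, and the window value passed to `is_fresh_timestamp` is arbitrary.

```python
import hashlib
import hmac
import os

def is_fresh_timestamp(msg_time: float, local_time: float, window: float) -> bool:
    """Timestamp approach: accept only if the message time is close to local time."""
    return abs(local_time - msg_time) <= window

class Challenger:
    """Challenge/response approach: A issues a nonce and accepts it back once."""
    def __init__(self, shared_key: bytes):
        self.key = shared_key
        self.outstanding = set()

    def new_challenge(self) -> bytes:
        nonce = os.urandom(16)
        self.outstanding.add(nonce)
        return nonce

    def accept(self, nonce: bytes, tag: bytes) -> bool:
        if nonce not in self.outstanding:
            return False  # unknown or already-used nonce: treat as a replay
        expected = hmac.new(self.key, nonce, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, tag):
            return False
        self.outstanding.discard(nonce)  # each nonce is good for one response
        return True

def respond(shared_key: bytes, nonce: bytes) -> bytes:
    """B's response: a MAC over the nonce under the key shared with A."""
    return hmac.new(shared_key, nonce, hashlib.sha256).digest()

key = os.urandom(32)
a = Challenger(key)
n = a.new_challenge()
tag = respond(key, n)
first = a.accept(n, tag)      # genuine response
replayed = a.accept(n, tag)   # the same message replayed later
```

The timestamp check needs no extra round trip but depends on both clocks agreeing; the nonce check needs no clock at all but costs a handshake and some state at A.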
It can be argued (e.g., [LAM92a]) that the timestamp approach should not be
used for connection-oriented applications because of the inherent difficulties with
this technique. First, some sort of protocol is needed to maintain synchronization
among the various processor clocks. This protocol must be both fault tolerant, to
cope with network errors, and secure, to cope with hostile attacks. Second, the oppor-
tunity for a successful attack will arise if there is a temporary loss of synchronization
resulting from a fault in the clock mechanism of one of the parties. Finally, because
of the variable and unpredictable nature of network delays, distributed clocks cannot
be expected to maintain precise synchronization. Therefore, any timestamp-based
procedure must allow for a window of time sufficiently large to accommodate net-
work delays yet sufficiently small to minimize the opportunity for attack.
On the other hand, the challenge-response approach is unsuitable for a con-
nectionless type of application, because it requires the overhead of a handshake be-
fore any connectionless transmission, effectively negating the chief characteristic of
a connectionless transaction. For such applications, reliance on some sort of secure
time server and a consistent attempt by each party to keep its clocks in synchroniza-
tion may be the best approach (e.g., [LAM92b]).
One-Way Authentication
One application for which encryption is growing in popularity is electronic mail
(email). The very nature of electronic mail, and its chief benefit, is that it is not nec-
essary for the sender and receiver to be online at the same time. Instead, the email
message is forwarded to the receiver’s electronic mailbox, where it is buffered until
the receiver is available to read it.
The “envelope” or header of the email message must be in the clear, so that
the message can be handled by the store-and-forward email protocol, such as the
Simple Mail Transfer Protocol (SMTP) or X.400. However, it is often desirable that
the mail-handling protocol not require access to the plaintext form of the message,
because that would require trusting the mail-handling mechanism. Accordingly, the
email message should be encrypted such that the mail-handling system is not in
possession of the decryption key.
A second requirement is that of authentication. Typically, the recipient wants
some assurance that the message is from the alleged sender.
15.2 Remote User-Authentication Using Symmetric Encryption
Mutual Authentication
As was discussed in Chapter 14, a two-level hierarchy of symmetric encryption keys
can be used to provide confidentiality for communication in a distributed environ-
ment. In general, this strategy involves the use of a trusted key distribution center

(KDC). Each party in the network shares a secret key, known as a master key, with
the KDC. The KDC is responsible for generating keys to be used for a short time
over a connection between two parties, known as session keys, and for distribut-
ing those keys using the master keys to protect the distribution. This approach is
quite common. As an example, we look at the Kerberos system in Section 15.3.
The discussion in this subsection is relevant to an understanding of Kerberos.
Figure 14.3 illustrates a proposal initially put forth by Needham and Schroeder
[NEED78] for secret key distribution using a KDC that, as was mentioned in
Chapter 14, includes authentication features. The protocol can be summarized as
follows.
1. A → KDC: IDA ‖ IDB ‖ N1
2. KDC → A: E(Ka, [Ks ‖ IDB ‖ N1 ‖ E(Kb, [Ks ‖ IDA])])
3. A → B: E(Kb, [Ks ‖ IDA])
4. B → A: E(Ks, N2)
5. A → B: E(Ks, f(N2)), where f() is a generic function that modifies the
value of the nonce.
Secret keys Ka and Kb are shared between A and the KDC and B and the
KDC, respectively. The purpose of the protocol is to distribute securely a session
key Ks to A and B. Entity A securely acquires a new session key in step 2. The mes-
sage in step 3 can be decrypted, and hence understood, only by B. Step 4 reflects B’s
knowledge of Ks, and step 5 assures B of A’s knowledge of Ks and assures B that this
is a fresh message because of the use of the nonce N2. Recall from our discussion in
Chapter 14 that the purpose of steps 4 and 5 is to prevent a certain type of replay at-
tack. In particular, if an opponent is able to capture the message in step 3 and replay
it, this might in some fashion disrupt operations at B.
Despite the handshake of steps 4 and 5, the protocol is still vulnerable to a
form of replay attack. Suppose that an opponent, X, has been able to compromise
an old session key. Admittedly, this is a much more unlikely occurrence than that
an opponent has simply observed and recorded step 3. Nevertheless, it is a potential
security risk. X can impersonate A and trick B into using the old key by simply re-
playing step 3. Unless B remembers indefinitely all previous session keys used with
A, B will be unable to determine that this is a replay. If X can intercept the hand-
shake message in step 4, then it can impersonate A’s response in step 5. From this
point on, X can send bogus messages to B that appear to B to come from A using an
authenticated session key.
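The replay described above can be traced with a small symbolic model in Python. Nothing here is real cryptography: `Sealed` simply records which key a block was "encrypted" under, and `D` refuses to open it with any other key. The class `PartyB`, the key labels, and the choice of f(N) = N + 1 are assumptions made for the illustration.

```python
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Symbolic ciphertext E(key, payload); this is not real encryption."""
    key: str
    payload: tuple

def E(key, *payload):
    return Sealed(key, payload)

def D(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

def f(nonce):
    """Stand-in for the generic nonce-modifying function f()."""
    return nonce + 1

class PartyB:
    def __init__(self, kb):
        self.kb = kb
        self.pending = None   # (Ks, claimed peer, N2) awaiting step 5
        self.session = None   # (peer, Ks) once the handshake succeeds

    def on_step3(self, ticket):
        ks, peer = D(self.kb, ticket)
        n2 = secrets.randbelow(2**32)
        self.pending = (ks, peer, n2)
        return E(ks, n2)                      # step 4: B -> A

    def on_step5(self, reply):
        ks, peer, n2 = self.pending
        (value,) = D(ks, reply)
        ok = value == f(n2)
        if ok:
            self.session = (peer, ks)
        return ok

# X recorded this step 3 message during an old run and later learned Ks_old.
old_ticket = E("Kb", "Ks_old", "A")

b = PartyB("Kb")
step4 = b.on_step3(old_ticket)                # X replays step 3
(n2,) = D("Ks_old", step4)                    # X decrypts B's challenge
accepted = b.on_step5(E("Ks_old", f(n2)))     # X forges A's step 5 reply
```

B has no way to tell that `old_ticket` is stale, because nothing inside the block sealed under Kb indicates when it was issued.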
Denning [DENN81, DENN82] proposes to overcome this weakness by a
modification to the Needham/Schroeder protocol that includes the addition of a
timestamp to steps 2 and 3. Her proposal assumes that the master keys, Ka and Kb,
are secure, and it consists of the following steps.
1 The portion to the left of the colon indicates the sender and the receiver; the portion to the right indi-
cates the contents of the message; the symbol ‖ indicates concatenation.

1. A → KDC: IDA ‖ IDB
2. KDC → A: E(Ka, [Ks ‖ IDB ‖ T ‖ E(Kb, [Ks ‖ IDA ‖ T])])
3. A → B: E(Kb, [Ks ‖ IDA ‖ T])
4. B → A: E(Ks, N1)
5. A → B: E(Ks, f(N1))
T is a timestamp that assures A and B that the session key has only just been
generated. Thus, both A and B know that the key distribution is a fresh exchange.
A and B can verify timeliness by checking that
|Clock − T| < ∆t1 + ∆t2
where ∆t1 is the estimated normal discrepancy between the KDC’s clock and the
local clock (at A or B) and ∆t2 is the expected network delay time. Each node can
set its clock against some standard reference source. Because the timestamp T is
encrypted using the secure master keys, an opponent, even with knowledge of an
old session key, cannot succeed, because B will detect that a replayed step 3
message carries a stale timestamp.
A final point: Steps 4 and 5 were not included in the original presentation
[DENN81] but were added later [DENN82]. These steps confirm the receipt of the
session key at B.
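The timeliness test applied by A and B is a one-line comparison. The sketch below assumes clocks measured in seconds and picks arbitrary values for ∆t1 and ∆t2; the function name `is_timely` is introduced here for illustration.

```python
def is_timely(clock: float, t: float, dt1: float, dt2: float) -> bool:
    """Timeliness test |Clock - T| < dt1 + dt2 applied by A or B."""
    return abs(clock - t) < dt1 + dt2

DT1 = 2.0    # assumed clock discrepancy with the KDC, in seconds
DT2 = 3.0    # assumed network delay, in seconds
T = 1000.0   # timestamp the KDC placed inside E(Kb, [Ks || IDA || T])

fresh = is_timely(1003.0, T, DT1, DT2)     # delivered 3 s after issue
replayed = is_timely(4600.0, T, DT1, DT2)  # step 3 replayed an hour later
```

Because T sits inside a block sealed under Kb, an opponent who knows only an old session key cannot rewrite it to make the replay pass.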
The Denning protocol seems to provide an increased degree of security com-
pared to the Needham/Schroeder protocol. However, a new concern is raised:
namely, that this new scheme requires reliance on clocks that are synchronized
throughout the network. [GONG92] points out a risk involved. The risk is based
on the fact that the distributed clocks can become unsynchronized as a result of
sabotage on or faults in the clocks or the synchronization mechanism.2 The problem
occurs when a sender’s clock is ahead of the intended recipient’s clock. In this case,
an opponent can intercept a message from the sender and replay it later when the
timestamp in the message becomes current at the recipient’s site. This replay could
cause unexpected results. Gong refers to such attacks as suppress-replay attacks.
One way to counter suppress-replay attacks is to enforce the requirement that
parties regularly check their clocks against the KDC’s clock. The other alternative,
which avoids the need for clock synchronization, is to rely on handshaking protocols
using nonces. This latter alternative is not vulnerable to a suppress-replay attack,
because the nonces the recipient will choose in the future are unpredictable to the
sender. The Needham/Schroeder protocol relies on nonces only but, as we have
seen, has other vulnerabilities.
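A suppress-replay attack can be reproduced with the same kind of timestamp check. In this hedged sketch, the window size, the 60-second skew, and the helper names are all invented for the example; the point is only that a message stamped by a fast clock is rejected if delivered at once but accepted if an opponent holds it back until the recipient's clock catches up.

```python
WINDOW = 5.0  # assumed acceptance window (dt1 + dt2), in seconds

def recipient_accepts(recipient_clock: float, timestamp: float) -> bool:
    return abs(recipient_clock - timestamp) < WINDOW

def simulate(sender_skew: float, replay_delay: float):
    """Return (accepted if delivered at once, accepted when replayed later)."""
    recipient_clock = 1000.0
    timestamp = recipient_clock + sender_skew   # sender stamps with its own clock
    at_once = recipient_accepts(recipient_clock, timestamp)
    later = recipient_accepts(recipient_clock + replay_delay, timestamp)
    return at_once, later

# Sender's clock runs 60 s fast; the opponent holds the message for 60 s.
attack = simulate(sender_skew=60.0, replay_delay=60.0)
# With synchronized clocks the same delayed replay falls outside the window.
normal = simulate(sender_skew=0.0, replay_delay=60.0)
```

A nonce chosen by the recipient at the time of the exchange would defeat this, since the held-back message could not contain it.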
In [KEHN92], an attempt is made to respond to the concerns about suppress-
replay attacks and at the same time fix the problems in the Needham/Schroeder
protocol. Subsequently, an inconsistency in this latter protocol was noted and an
improved strategy was presented in [NEUM93a].3 The protocol is
2Such things can and do happen. In recent years, flawed chips were used in a number of computers and other
electronic systems to track the time and date. The chips had a tendency to skip forward one day. [NEUM90]
3It really is hard to get these things right.

1. A → B: IDA ‖ Na
2. B → KDC: IDB ‖ Nb ‖ E(Kb, [IDA ‖ Na ‖ Tb])
3. KDC → A: E(Ka, [IDB ‖ Na ‖ Ks ‖ Tb]) ‖ E(Kb, [IDA ‖ Ks ‖ Tb]) ‖ Nb
4. A → B: E(Kb, [IDA ‖ Ks ‖ Tb]) ‖ E(Ks, Nb)
Let us follow this exchange step by step.
1. A initiates the authentication exchange by generating a nonce, Na, and sending
that plus its identifier to B in plaintext. This nonce will be returned to A in an
encrypted message that includes the session key, assuring A of its timeliness.
2. B alerts the KDC that a session key is needed. Its message to the KDC in-
cludes its identifier and a nonce, Nb. This nonce will be returned to B in an
encrypted message that includes the session key, assuring B of its timeliness.
B’s message to the KDC also includes a block encrypted with the secret key
shared by B and the KDC. This block is used to instruct the KDC to issue
credentials to A; the block specifies the intended recipient of the credentials, a
suggested expiration time for the credentials, and the nonce received from A.
3. The KDC passes on to A B’s nonce and a block encrypted with the secret key
that B shares with the KDC. The block serves as a “ticket” that can be used
by A for subsequent authentications, as will be seen. The KDC also sends to
A a block encrypted with the secret key shared by A and the KDC. This block
verifies that B has received A’s initial message (IDB) and that this is a timely
message and not a replay (Na), and it provides A with a session key (Ks) and
the time limit on its use (Tb).
4. A transmits the ticket to B, together with B’s nonce, the latter encrypted
with the session key. The ticket provides B with the secret key that is used to de-
crypt E(Ks, Nb) to recover the nonce. The fact that B’s nonce is encrypted with
the session key authenticates that the message came from A and is not a replay.
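The four steps can be walked through with a symbolic model in Python. As an assumption for illustration, `Sealed` stands in for encryption by recording the key label, tuples stand in for concatenation, and Tb is an arbitrary number; none of this is real cryptography.

```python
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Symbolic ciphertext E(key, payload); this is not real encryption."""
    key: str
    payload: tuple

def E(key, *payload):
    return Sealed(key, payload)

def D(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

KA, KB = "Ka", "Kb"   # master keys shared with the KDC (symbolic labels)

# Step 1. A -> B: IDA || Na
na = secrets.randbelow(2**32)
msg1 = ("A", na)

# Step 2. B -> KDC: IDB || Nb || E(Kb, [IDA || Na || Tb])
nb = secrets.randbelow(2**32)
tb = 3600.0           # suggested expiration time on B's clock (arbitrary)
msg2 = ("B", nb, E(KB, msg1[0], msg1[1], tb))

# Step 3. KDC -> A: E(Ka, [IDB || Na || Ks || Tb]) || E(Kb, [IDA || Ks || Tb]) || Nb
id_b, nb_seen, request = msg2
id_a, na_seen, tb_seen = D(KB, request)
ks = "Ks-" + secrets.token_hex(8)
ticket = E(KB, id_a, ks, tb_seen)
msg3 = (E(KA, id_b, na_seen, ks, tb_seen), ticket, nb_seen)

# A checks that its own nonce came back and extracts Ks.
for_a, ticket_for_b, nb_for_a = msg3
peer, na_back, ks_a, tb_a = D(KA, for_a)
a_ok = peer == "B" and na_back == na

# Step 4. A -> B: E(Kb, [IDA || Ks || Tb]) || E(Ks, Nb)
msg4 = (ticket_for_b, E(ks_a, nb_for_a))

# B opens the ticket, then uses Ks to recover and check its own nonce.
sender, ks_b, tb_b = D(KB, msg4[0])
(nb_back,) = D(ks_b, msg4[1])
b_ok = sender == "A" and nb_back == nb and tb_b == tb
```

Note that A never opens the ticket: it is sealed under Kb, so A can only forward it.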
This protocol provides an effective, secure means for A and B to establish a
session with a secure session key. Furthermore, the protocol leaves A in posses-
sion of a key that can be used for subsequent authentication to B, avoiding the
need to contact the authentication server repeatedly. Suppose that A and B estab-
lish a session using the aforementioned protocol and then conclude that session.
Subsequently, but within the time limit established by the protocol, A desires a new
session with B. The following protocol ensues:
1. A → B: E(Kb, [IDA ‖ Ks ‖ Tb]) ‖ N′a
2. B → A: N′b ‖ E(Ks, N′a)
3. A → B: E(Ks, N′b)
When B receives the message in step 1, it verifies that the ticket has not expired.
The newly generated nonces N′a and N′b assure each party that there is no replay
attack.
In all the foregoing, the time specified in Tb is a time relative to B’s clock.
Thus, this timestamp does not require synchronized clocks, because B checks only
self-generated timestamps.
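The ticket-reuse exchange can be sketched in the same symbolic style. The helper `b_open_ticket`, the key labels, and the numeric clock values are assumptions for the example; the expiry test compares Tb only with B's own clock, as the text describes.

```python
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Symbolic ciphertext E(key, payload); this is not real encryption."""
    key: str
    payload: tuple

def E(key, *payload):
    return Sealed(key, payload)

def D(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

def b_open_ticket(kb, ticket, b_clock):
    """B opens the ticket and checks Tb against its own clock."""
    id_a, ks, tb = D(kb, ticket)
    if b_clock > tb:
        return None            # ticket expired
    return id_a, ks

ticket = E("Kb", "A", "Ks", 2000.0)   # issued earlier; Tb = 2000 on B's clock

# Step 1. A -> B: ticket || N'a
na2 = secrets.randbelow(2**32)
id_a, ks = b_open_ticket("Kb", ticket, b_clock=1500.0)

# Step 2. B -> A: N'b || E(Ks, N'a)
nb2 = secrets.randbelow(2**32)
msg2 = (nb2, E(ks, na2))
a_ok = D("Ks", msg2[1]) == (na2,)     # A sees its fresh nonce under Ks

# Step 3. A -> B: E(Ks, N'b)
msg3 = E("Ks", msg2[0])
b_ok = D(ks, msg3) == (nb2,)          # B sees its fresh nonce under Ks

expired = b_open_ticket("Kb", ticket, b_clock=2500.0)
```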

One-Way Authentication
Using symmetric encryption, the decentralized key distribution scenario illustrated
in Figure 14.5 is impractical. This scheme requires the sender to issue a request to
the intended recipient, await a response that includes a session key, and only then
send the message.
With some refinement, the KDC strategy illustrated in Figure 14.3 is a can-
didate for encrypted electronic mail. Because we wish to avoid requiring that the
recipient (B) be online at the same time as the sender (A), steps 4 and 5 must be
eliminated. For a message with content M, the sequence is as follows:
1. A → KDC: IDA ‖ IDB ‖ N1
2. KDC → A: E(Ka, [Ks ‖ IDB ‖ N1 ‖ E(Kb, [Ks ‖ IDA])])
3. A → B: E(Kb, [Ks ‖ IDA]) ‖ E(Ks, M)
This approach guarantees that only the intended recipient of a message will be
able to read it. It also provides a level of authentication that the sender is A. As
specified, the protocol does not protect against replays. Some measure of defense
could be provided by including a timestamp with the message. However, because
of the potential delays in the email process, such timestamps may have limited
usefulness.
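A symbolic Python sketch of the three-step email sequence shows both properties: only B can read the message, and an identical copy delivered twice is accepted twice. The `Sealed` type stands in for encryption, and the function names, key labels, and message text are assumptions made for the illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sealed:
    """Symbolic ciphertext E(key, payload); this is not real encryption."""
    key: str
    payload: tuple

def E(key, *payload):
    return Sealed(key, payload)

def D(key, box):
    if box.key != key:
        raise ValueError("wrong key")
    return box.payload

def kdc_reply(ka, kb, id_a, id_b, n1, ks):
    """Step 2: the KDC's reply to A, including the block only B can open."""
    return E(ka, ks, id_b, n1, E(kb, ks, id_a))

def b_read_mail(kb, mail):
    """B, coming online later, opens the ticket and then the message."""
    ticket, body = mail
    ks, sender = D(kb, ticket)
    (message,) = D(ks, body)
    return sender, message

# Steps 1 and 2: A asks the KDC for a session key to use with B.
ks, id_b, n1, ticket = D("Ka", kdc_reply("Ka", "Kb", "A", "B", n1=7, ks="Ks"))

# Step 3: A mails the ticket and the encrypted message in one transmission.
mail = (ticket, E(ks, "meet at noon"))

first = b_read_mail("Kb", mail)
replay = b_read_mail("Kb", mail)   # an identical copy delivered again
```

B cannot distinguish `replay` from `first`, which is the replay exposure noted above.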
15.3 Kerberos
Kerberos4 is an authentication service developed as part of Project Athena at MIT.
The problem that Kerberos addresses is this: Assume an open distributed environ-
ment in which users at workstations wish to access services on servers distributed
throughout the network. We would like for servers to be able to restrict access to
authorized users and to be able to authenticate requests for service. In this envi-
ronment, a workstation cannot be trusted to identify its users correctly to network
services. In particular, the following three threats exist:
1. A user may gain access to a particular workstation and pretend to be another
user operating from that workstation.
2. A user may alter the network address of a workstation so that the requests
sent from the altered workstation appear to come from the impersonated
workstation.
3. A user may eavesdrop on exchanges and use a replay attack to gain entrance
to a server or to disrupt operations.
In any of these cases, an unauthorized user may be able to gain access to services
and data that he or she is not authorized to access. Rather than building in elaborate
4“In Greek mythology, a many headed dog, commonly three, perhaps with a serpent’s tail, the guardian
of the entrance of Hades.” From Dictionary of Subjects and Symbols in Art, by James Hall, Harper &
Row, 1979. Just as the Greek Kerberos has three heads, the modern Kerberos was intended to have three
components to guard a network’s gate: authentication, accounting, and audit. The last two heads were
never implemented.

authentication protocols at each server, Kerberos provides a centralized authenti-
cation server whose function is to authenticate users to servers and servers to users.
Unlike most other authentication schemes described in this book, Kerberos relies
exclusively on symmetric encryption, making no use of public-key encryption.
Two versions of Kerberos are in common use. Version 4 [MILL88, STEI88]
implementations still exist. Version 5 [KOHL94] corrects some of the security defi-
ciencies of version 4 and has been issued as a proposed Internet Standard (RFC
4120 and RFC 4121).5
We begin this section with a brief discussion of the motivation for the Kerberos
approach. Then, because of the complexity of Kerberos, it is best to start with a de-
scription of the authentication protocol used in version 4. This enables us to see the
essence of the Kerberos strategy without considering some of the details required to
handle subtle security threats. Finally, we examine version 5.
If a set of users is provided with dedicated personal computers that have no network
connections, then a user’s resources and files can be protected by physically secur-
ing each personal computer. When these users instead are served by a centralized
time-sharing system, the time-sharing operating system must provide the security.
The operating system can enforce access-control policies based on user identity and
use the logon procedure to identify users.
Today, neither of these scenarios is typical. More common is a distributed
architecture consisting of dedicated user workstations (clients) and distributed
or centralized servers. In this environment, three approaches to security can be
envisioned:
1. Rely on each individual client workstation to assure the identity of its user or
users and rely on each server to enforce a security policy based on user iden-
tification (ID).
2. Require that client systems authenticate themselves to servers, but trust the
client system concerning the identity of its user.
3. Require the user to prove his or her identity for each service invoked. Also
require that servers prove their identity to clients.
In a small, closed environment in which all systems are owned and operated
by a single organization, the first or perhaps the second strategy may suffice.6 But
in a more open environment in which network connections to other machines are
supported, the third approach is needed to protect user information and resources
housed at the server. Kerberos supports this third approach. Kerberos assumes a
distributed client/server architecture and employs one or more Kerberos servers to
provide an authentication service.
5Versions 1 through 3 were internal development versions. Version 4 is the “original” Kerberos.
6However, even a closed environment faces the threat of attack by a disgruntled employee.

The first published report on Kerberos [STEI88] listed the following
requirements:
■ Secure: A network eavesdropper should not be able to obtain the necessary
information to impersonate a user. More generally, Kerberos should be strong
enough that a potential opponent does not find it to be the weak link.
■ Reliable: For all services that rely on Kerberos for access control, lack of
availability of the Kerberos service means lack of availability of the supported
services. Hence, Kerberos should be highly reliable and should employ a
distributed server architecture with one system able to back up another.
■ Transparent: Ideally, the user should not be aware that authentication is taking
place beyond the requirement to enter a password.
■ Scalable: The system should be capable of supporting large numbers of clients
and servers. This suggests a modular, distributed architecture.
To support these requirements, the overall scheme of Kerberos is that of a
trusted third-party authentication service that uses a protocol based on that pro-
posed by Needham and Schroeder [NEED78], which was discussed in Section 15.2.
It is trusted in the sense that clients and servers trust Kerberos to mediate their
mutual authentication. Assuming the Kerberos protocol is well designed, then the
authentication service is secure if the Kerberos server itself is secure.7
Kerberos Version 4
Version 4 of Kerberos makes use of DES, in a rather elaborate protocol, to pro-
vide the authentication service. Viewing the protocol as a whole, it is difficult to see
the need for the many elements contained therein. Therefore, we adopt a strategy
used by Bill Bryant of Project Athena [BRYA88] and build up to the full protocol
by looking first at several hypothetical dialogues. Each successive dialogue adds
additional complexity to counter security vulnerabilities revealed in the preceding
dialogue. After examining the protocol, we look at some other aspects of version 4.
A SIMPLE AUTHENTICATION DIALOGUE In an unprotected network environment, any
client can apply to any server for service. The obvious security risk is that of im-
personation. An opponent can pretend to be another client and obtain unauthor-
ized privileges on server machines. To counter this threat, servers must be able to
confirm the identities of clients who request service. Each server can be required to
undertake this task for each client/server interaction, but in an open environment,
this places a substantial burden on each server.
An alternative is to use an authentication server (AS) that knows the
passwords of all users and stores these in a centralized database. In addition, the AS
shares a unique secret key with each server. These keys have been distributed physi-
cally or in some other secure manner. Consider the following hypothetical dialogue:
(1) C → AS: IDC || PC || IDV
(2) AS → C: Ticket
(3) C → V: IDC || Ticket
Ticket = E(Kv, [IDC || ADC || IDV])
where
C = client
AS = authentication server
V = server
IDC = identifier of user on C
IDV = identifier of V
PC = password of user on C
ADC = network address of C
Kv = secret encryption key shared by AS and V
In this scenario, the user logs on to a workstation and requests access to server V.
The client module C in the user’s workstation requests the user’s password and then
sends a message to the AS that includes the user’s ID, the server’s ID, and the user’s
password. The AS checks its database to see if the user has supplied the proper
password for this user ID and whether this user is permitted access to server V. If
both tests are passed, the AS accepts the user as authentic and must now convince
the server that this user is authentic. To do so, the AS creates a ticket that con-
tains the user’s ID and network address and the server’s ID. This ticket is encrypted
using the secret key shared by the AS and this server. This ticket is then sent back
to C. Because the ticket is encrypted, it cannot be altered by C or by an opponent.
With this ticket, C can now apply to V for service. C sends a message to V con-
taining C’s ID and the ticket. V decrypts the ticket and verifies that the user ID in
the ticket is the same as the unencrypted user ID in the message. If these two match,
the server considers the user authenticated and grants the requested service.
Each of the ingredients of message (3) is significant. The ticket is encrypted to
prevent alteration or forgery. The server’s ID (IDV) is included in the ticket so that
the server can verify that it has decrypted the ticket properly. IDC is included in the
ticket to indicate that this ticket has been issued on behalf of C. Finally, ADC serves
to counter the following threat. An opponent could capture the ticket transmitted
in message (2), then use the name IDC and transmit a message of form (3) from
another workstation. The server would receive a valid ticket that matches the user
ID and grant access to the user on that other workstation. To prevent this attack,
the AS includes in the ticket the network address from which the original request
came. Now the ticket is valid only if it is transmitted from the same workstation that
initially requested the ticket.
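The simple dialogue can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `seal` and `unseal` are hypothetical helpers that stand in for E(Kv, ...) using a hash-based keystream plus an HMAC tag (they are not DES and not the Kerberos encoding), and the user name, password, server name, and addresses are invented for the example.

```python
import hashlib
import hmac
import json
import os

def seal(key: bytes, fields: dict) -> bytes:
    """Toy stand-in for E(K, ...): XOR keystream plus HMAC tag (NOT DES)."""
    plain = json.dumps(fields).encode()
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plain))
    body = bytes(a ^ b for a, b in zip(plain, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def unseal(key: bytes, blob: bytes) -> dict:
    """Inverse of seal; raises ValueError on a wrong key or altered data."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or altered ticket")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return json.loads(bytes(a ^ b for a, b in zip(body, stream)))

K_V = os.urandom(16)                       # Kv: secret key shared by AS and V
PASSWORDS = {"alice": "correct horse"}     # hypothetical AS password database

def as_issue_ticket(id_c, p_c, id_v, ad_c):
    """Messages (1) and (2): AS checks the password, then returns Ticket."""
    if PASSWORDS.get(id_c) != p_c:
        raise PermissionError("bad password")
    return seal(K_V, {"IDC": id_c, "ADC": ad_c, "IDV": id_v})

def v_accepts(id_c, ticket, source_addr, my_id="mail"):
    """Message (3): V decrypts Ticket and checks IDV, IDC, and ADC."""
    try:
        t = unseal(K_V, ticket)
    except ValueError:
        return False
    return t["IDV"] == my_id and t["IDC"] == id_c and t["ADC"] == source_addr

ticket = as_issue_ticket("alice", "correct horse", "mail", "10.0.0.5")
print(v_accepts("alice", ticket, "10.0.0.5"))   # same workstation
print(v_accepts("alice", ticket, "10.0.0.9"))   # captured ticket, other address
```

The second call models the threat that ADC is meant to counter: the ticket is genuine, but it arrives from a workstation other than the one that requested it.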

A MORE SECURE AUTHENTICATION DIALOGUE Although the foregoing scenario solves
some of the problems of authentication in an open network environment, problems
remain. Two in particular stand out. First, we would like to minimize the number
of times that a user has to enter a password. Suppose each ticket can be used only
once. If user C logs on to a workstation in the morning and wishes to check his or her
mail at a mail server, C must supply a password to get a ticket for the mail server. If
C wishes to check the mail several times during the day, each attempt requires re-
entering the password. We can improve matters by saying that tickets are reusable.
For a single logon session, the workstation can store the mail server ticket after it is
received and use it on behalf of the user for multiple accesses to the mail server.
However, under this scheme, it remains the case that a user would need a new
ticket for every different service. If a user wished to access a print server, a mail
server, a file server, and so on, the first instance of each access would require a new
ticket and hence require the user to enter the password.
The second problem is that the earlier scenario involved a plaintext transmis-
sion of the password [message (1)]. An eavesdropper could capture the password
and use any service accessible to the victim.
To solve these additional problems, we introduce a scheme for avoiding plain-
text passwords and a new server, known as the ticket-granting server (TGS). The
new (but still hypothetical) scenario is as follows.
Once per user logon session:
(1) C → AS: IDC || IDtgs
(2) AS → C: E(Kc, Tickettgs)
Once per type of service:
(3) C → TGS: IDC || IDV || Tickettgs
(4) TGS → C: Ticketv
Once per service session:
(5) C → V: IDC || Ticketv
Tickettgs = E(Ktgs, [IDC || ADC || IDtgs || TS1 || Lifetime1])
Ticketv = E(Kv, [IDC || ADC || IDv || TS2 || Lifetime2])
The new service, TGS, issues tickets to users who have been authenticated to
AS. Thus, the user first requests a ticket-granting ticket (Tickettgs) from the AS. The
client module in the user workstation saves this ticket. Each time the user requires
access to a new service, the client applies to the TGS, using the ticket to authenti-
cate itself. The TGS then grants a ticket for the particular service. The client saves
each service-granting ticket and uses it to authenticate its user to a server each time
a particular service is requested. Let us look at the details of this scheme:
1. The client requests a ticket-granting ticket on behalf of the user by sending its
user’s ID to the AS, together with the TGS ID, indicating a request to use the
TGS service.

2. The AS responds with a ticket that is encrypted with a key that is derived from
the user’s password (Kc), which is already stored at the AS. When this response
arrives at the client, the client prompts the user for his or her password, gen-
erates the key, and attempts to decrypt the incoming message. If the correct
password is supplied, the ticket is successfully recovered.
Because only the correct user should know the password, only the correct user
can recover the ticket. Thus, we have used the password to obtain credentials from
Kerberos without having to transmit the password in plaintext. The ticket itself
consists of the ID and network address of the user, and the ID of the TGS. This
corresponds to the first scenario. The idea is that the client can use this ticket to
request multiple service-granting tickets. So the ticket-granting ticket is to be reus-
able. However, we do not wish an opponent to be able to capture the ticket and use
it. Consider the following scenario: An opponent captures the login ticket and waits
until the user has logged off his or her workstation. Then the opponent either gains
access to that workstation or configures his workstation with the same network ad-
dress as that of the victim. The opponent would be able to reuse the ticket to spoof
the TGS. To counter this, the ticket includes a timestamp, indicating the date and
time at which the ticket was issued, and a lifetime, indicating the length of time for
which the ticket is valid (e.g., eight hours). Thus, the client now has a reusable ticket
and need not bother the user for a password for each new service request. Finally,
note that the ticket-granting ticket is encrypted with a secret key known only to the
AS and the TGS. This prevents alteration of the ticket. The ticket is reencrypted
with a key based on the user’s password. This assures that the ticket can be recov-
ered only by the correct user, providing the authentication.
Now that the client has a ticket-granting ticket, access to any server can be
obtained with steps 3 and 4.
3. The client requests a service-granting ticket on behalf of the user. For this pur-
pose, the client transmits a message to the TGS containing the user’s ID, the
ID of the desired service, and the ticket-granting ticket.
4. The TGS decrypts the incoming ticket using a key shared only by the AS and
the TGS (Ktgs) and verifies the success of the decryption by the presence of its
ID. It checks to make sure that the lifetime has not expired. Then it compares
the user ID and network address with the incoming information to authenti-
cate the user. If the user is permitted access to the server V, the TGS issues a
ticket to grant access to the requested service.
The service-granting ticket has the same structure as the ticket-granting ticket.
Indeed, because the TGS is a server, we would expect that the same elements are
needed to authenticate a client to the TGS and to authenticate a client to an appli-
cation server. Again, the ticket contains a timestamp and lifetime. If the user wants
access to the same service at a later time, the client can simply use the previously
acquired service-granting ticket and need not bother the user for a password. Note
that the ticket is encrypted with a secret key (Kv) known only to the TGS and the
server, preventing alteration.
Finally, with a particular service-granting ticket, the client can gain access to
the corresponding service with step 5.

5. The client requests access to a service on behalf of the user. For this purpose, the
client transmits a message to the server containing the user’s ID and the service-
granting ticket. The server authenticates by using the contents of the ticket.
This new scenario satisfies the two requirements of only one password query
per user session and protection of the user password.
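The two new ideas in this scenario, a key derived from the password and a ticket with a timestamp and lifetime, can be sketched as follows. The sketch rests on assumptions: `seal`/`unseal` are hypothetical toy stand-ins for E(K, ...), `key_from_password` uses SHA-256 in place of the actual version 4 password-to-key mapping, and the names and times are invented.

```python
import hashlib
import hmac
import json
import os

def seal(key: bytes, fields: dict) -> bytes:
    """Toy stand-in for E(K, ...): XOR keystream plus HMAC tag (NOT DES)."""
    plain = json.dumps(fields).encode()
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plain))
    body = bytes(a ^ b for a, b in zip(plain, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def unseal(key: bytes, blob: bytes) -> dict:
    """Inverse of seal; raises ValueError on a wrong key or altered data."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or altered message")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return json.loads(bytes(a ^ b for a, b in zip(body, stream)))

def key_from_password(password: str) -> bytes:
    # Assumed stand-in for the real password-to-key mapping.
    return hashlib.sha256(password.encode()).digest()

K_TGS = os.urandom(16)                                      # shared by AS and TGS
USER_KEYS = {"alice": key_from_password("correct horse")}   # Kc, stored at the AS

def as_reply(id_c, ad_c, now, lifetime=8 * 3600):
    """Message (2): E(Kc, Tickettgs). No password crosses the network."""
    tgt = seal(K_TGS, {"IDC": id_c, "ADC": ad_c, "IDtgs": "tgs",
                       "TS1": now, "Lifetime1": lifetime})
    return seal(USER_KEYS[id_c], {"Tickettgs": tgt.hex()})

def client_recover_tgt(reply, typed_password):
    """The client derives Kc from the typed password and tries to decrypt."""
    inner = unseal(key_from_password(typed_password), reply)
    return bytes.fromhex(inner["Tickettgs"])

def tgs_ticket_alive(tgt, now):
    """The TGS accepts the ticket only inside [TS1, TS1 + Lifetime1]."""
    t = unseal(K_TGS, tgt)
    return t["TS1"] <= now <= t["TS1"] + t["Lifetime1"]

reply = as_reply("alice", "10.0.0.5", now=1000)
tgt = client_recover_tgt(reply, "correct horse")
print(tgs_ticket_alive(tgt, now=1000 + 3600))      # within the eight hours
print(tgs_ticket_alive(tgt, now=1000 + 9 * 3600))  # after expiry
```

Note that the client never learns the contents of the ticket-granting ticket; it can only strip the outer layer and store the inner ticket for later presentation to the TGS.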
THE VERSION 4 AUTHENTICATION DIALOGUE Although the foregoing scenario en-
hances security compared to the first attempt, two additional problems remain. The
heart of the first problem is the lifetime associated with the ticket-granting ticket.
If this lifetime is very short (e.g., minutes), then the user will be repeatedly asked
for a password. If the lifetime is long (e.g., hours), then an opponent has a greater
opportunity for replay. An opponent could eavesdrop on the network and capture
a copy of the ticket-granting ticket and then wait for the legitimate user to log out.
Then the opponent could forge the legitimate user’s network address and send the
message of step (3) to the TGS. This would give the opponent unlimited access to
the resources and files available to the legitimate user.
Similarly, if an opponent captures a service-granting ticket and uses it before it
expires, the opponent has access to the corresponding service.
Thus, we arrive at an additional requirement. A network service (the TGS or
an application service) must be able to prove that the person using a ticket is the
same person to whom that ticket was issued.
The second problem is that there may be a requirement for servers to authen-
ticate themselves to users. Without such authentication, an opponent could sabo-
tage the configuration so that messages to a server were directed to another loca-
tion. The false server would then be in a position to act as a real server and capture
any information from the user and deny the true service to the user.
We examine these problems in turn and refer to Table 15.1, which shows the
actual Kerberos protocol. Figure 15.2 provides a simplified overview.
Table 15.1 Summary of Kerberos Version 4 Message Exchanges

(a) Authentication Service Exchange to obtain ticket-granting ticket
(1) C → AS: IDc || IDtgs || TS1
(2) AS → C: E(Kc, [Kc,tgs || IDtgs || TS2 || Lifetime2 || Tickettgs])
Tickettgs = E(Ktgs, [Kc,tgs || IDC || ADC || IDtgs || TS2 || Lifetime2])

(b) Ticket-Granting Service Exchange to obtain service-granting ticket
(3) C → TGS: IDv || Tickettgs || Authenticatorc
(4) TGS → C: E(Kc,tgs, [Kc,v || IDv || TS4 || Ticketv])
Tickettgs = E(Ktgs, [Kc,tgs || IDC || ADC || IDtgs || TS2 || Lifetime2])
Ticketv = E(Kv, [Kc,v || IDC || ADC || IDv || TS4 || Lifetime4])
Authenticatorc = E(Kc,tgs, [IDC || ADC || TS3])

(c) Client/Server Authentication Exchange to obtain service
(5) C → V: Ticketv || Authenticatorc
(6) V → C: E(Kc,v, [TS5 + 1]) (for mutual authentication)
Ticketv = E(Kv, [Kc,v || IDC || ADC || IDv || TS4 || Lifetime4])
Authenticatorc = E(Kc,v, [IDC || ADC || TS5])

First, consider the problem of captured ticket-granting tickets and the need
to determine that the ticket presenter is the same as the client for whom the ticket
was issued. The threat is that an opponent will steal the ticket and use it before it
expires. To get around this problem, let us have the AS provide both the client and
the TGS with a secret piece of information in a secure manner. Then the client can
prove its identity to the TGS by revealing the secret information—again in a secure
manner. An efficient way of accomplishing this is to use an encryption key as the
secure information; this is referred to as a session key in Kerberos.
Table 15.1a shows the technique for distributing the session key. As before,
the client sends a message to the AS requesting access to the TGS. The AS re-
sponds with a message, encrypted with a key derived from the user’s password
(Kc), that contains the ticket. The encrypted message also contains a copy of the
session key, Kc,tgs, where the subscripts indicate that this is a session key for C and
TGS. Because this session key is inside the message encrypted with Kc, only the
user’s client can read it. The same session key is included in the ticket, which can
be read only by the TGS. Thus, the session key has been securely delivered to both
C and the TGS.
Figure 15.2 Overview of Kerberos
Once per user logon session:
1. User logs on to workstation and requests service on host.
2. AS verifies user’s access right in database, and creates ticket-granting ticket
and session key. Results are encrypted using key derived from user’s password.
Once per type of service:
3. Workstation prompts user for password to decrypt incoming message, and then
sends ticket and authenticator that contains user’s name, network address, and
time to TGS.
4. TGS decrypts ticket and authenticator, verifies request, and then creates ticket
for requested application server.
Once per service session:
5. Workstation sends ticket and authenticator to host.
6. Host verifies that ticket and authenticator match, and then grants access to
service. If mutual authentication is required, server returns an authenticator.

Note that several additional pieces of information have been added to this
first phase of the dialogue. Message (1) includes a timestamp, so that the AS knows
that the message is timely. Message (2) includes several elements of the ticket in a
form accessible to C. This enables C to confirm that this ticket is for the TGS and to
learn its expiration time.
Armed with the ticket and the session key, C is ready to approach the TGS.
As before, C sends the TGS a message that includes the ticket plus the ID of the
requested service [message (3) in Table 15.1b]. In addition, C transmits an authentica-
tor, which includes the ID and address of C’s user and a timestamp. Unlike the ticket,
which is reusable, the authenticator is intended for use only once and has a very short
lifetime. The TGS can decrypt the ticket with the key that it shares with the AS. This
ticket indicates that user C has been provided with the session key Kc,tgs. In effect,
the ticket says, “Anyone who uses Kc,tgs must be C.” The TGS uses the session key to
decrypt the authenticator. The TGS can then check the name and address from the
authenticator with that of the ticket and with the network address of the incoming
message. If all match, then the TGS is assured that the sender of the ticket is indeed
the ticket’s real owner. In effect, the authenticator says, “At time TS3, I hereby use
Kc,tgs.” Note that the ticket does not prove anyone’s identity but is a way to distribute
keys securely. It is the authenticator that proves the client’s identity. Because the au-
thenticator can be used only once and has a short lifetime, the threat of an opponent
stealing both the ticket and the authenticator for presentation later is countered.
The reply from the TGS in message (4) follows the form of message (2). The
message is encrypted with the session key shared by the TGS and C and includes
a session key to be shared between C and the server V, the ID of V, and the time-
stamp of the ticket. The ticket itself includes the same session key.
C now has a reusable service-granting ticket for V. When C presents this ticket,
as shown in message (5), it also sends an authenticator. The server can decrypt the
ticket, recover the session key, and decrypt the authenticator.
If mutual authentication is required, the server can reply as shown in message
(6) of Table 15.1. The server returns the value of the timestamp from the authenti-
cator, incremented by 1, and encrypted in the session key. C can decrypt this mes-
sage to recover the incremented timestamp. Because the message was encrypted by
the session key, C is assured that it could have been created only by V. The contents
of the message assure C that this is not a replay of an old reply.
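The mutual authentication step can be illustrated with a short sketch. As before, `seal`/`unseal` are hypothetical toy stand-ins for E(Kc,v, ...), and the field name `TS` and the sample values are assumptions made for the example.

```python
import hashlib
import hmac
import json
import os

def seal(key: bytes, fields: dict) -> bytes:
    """Toy stand-in for E(K, ...): XOR keystream plus HMAC tag (NOT DES)."""
    plain = json.dumps(fields).encode()
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plain))
    body = bytes(a ^ b for a, b in zip(plain, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag

def unseal(key: bytes, blob: bytes) -> dict:
    """Inverse of seal; raises ValueError on a wrong key or altered data."""
    nonce, body, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or altered message")
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return json.loads(bytes(a ^ b for a, b in zip(body, stream)))

def server_reply(k_c_v, authenticator):
    """Message (6): V returns E(Kc,v, [TS5 + 1])."""
    ts5 = unseal(k_c_v, authenticator)["TS5"]
    return seal(k_c_v, {"TS": ts5 + 1})

def client_checks_server(k_c_v, reply, ts5_sent):
    """C accepts V only if the reply decrypts to exactly TS5 + 1."""
    try:
        return unseal(k_c_v, reply)["TS"] == ts5_sent + 1
    except ValueError:
        return False

k_c_v = os.urandom(16)   # session key shared by C and V
auth = seal(k_c_v, {"IDC": "alice", "ADC": "10.0.0.5", "TS5": 5000})
reply = server_reply(k_c_v, auth)

print(client_checks_server(k_c_v, reply, ts5_sent=5000))  # genuine reply
print(client_checks_server(k_c_v, reply, ts5_sent=6000))  # old reply replayed later
```

Only a party that can recover Kc,v from the ticket can read TS5 and produce the incremented value, which is what ties the reply to V and to this particular request.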
Finally, at the conclusion of this process, the client and server share a secret
key. This key can be used to encrypt future messages between the two or to ex-
change a new random session key for that purpose.
Figure 15.3 illustrates the Kerberos exchanges among the parties. Table 15.2
summarizes the justification for each of the elements in the Kerberos protocol.
KERBEROS REALMS AND MULTIPLE KERBERI A full-service Kerberos environment
consisting of a Kerberos server, a number of clients, and a number of application
servers requires the following:
1. The Kerberos server must have the user ID and hashed passwords of all partic-
ipating users in its database. All users are registered with the Kerberos server.
2. The Kerberos server must share a secret key with each server. All servers are
registered with the Kerberos server.

Table 15.2 Rationale for the Elements of the Kerberos Version 4 Protocol

(a) Authentication Service Exchange
Message (1)     Client requests ticket-granting ticket.
  IDC           Tells AS identity of user from this client.
  IDtgs         Tells AS that user requests access to TGS.
  TS1           Allows AS to verify that client’s clock is synchronized with that of AS.
Message (2)     AS returns ticket-granting ticket.
  Kc            Encryption is based on user’s password, enabling AS and client to verify password, and protecting contents of message (2).
  Kc,tgs        Copy of session key accessible to client; created by AS to permit secure exchange between client and TGS without requiring them to share a permanent key.
  IDtgs         Confirms that this ticket is for the TGS.
  TS2           Informs client of time this ticket was issued.
  Lifetime2     Informs client of the lifetime of this ticket.
  Tickettgs     Ticket to be used by client to access TGS.

(b) Ticket-Granting Service Exchange
Message (3)     Client requests service-granting ticket.
  IDV           Tells TGS that user requests access to server V.
  Tickettgs     Assures TGS that this user has been authenticated by AS.
  Authenticatorc  Generated by client to validate ticket.
Message (4)     TGS returns service-granting ticket.
  Kc,tgs        Key shared only by C and TGS protects contents of message (4).
  Kc,v          Copy of session key accessible to client; created by TGS to permit secure exchange between client and server without requiring them to share a permanent key.
  IDV           Confirms that this ticket is for server V.
  TS4           Informs client of time this ticket was issued.
  TicketV       Ticket to be used by client to access server V.
Tickettgs       Reusable so that user does not have to reenter password.
  Ktgs          Ticket is encrypted with key known only to AS and TGS, to prevent tampering.
  Kc,tgs        Copy of session key accessible to TGS; used to decrypt authenticator, thereby authenticating ticket.
  IDC           Indicates the rightful owner of this ticket.
  ADC           Prevents use of ticket from workstation other than one that initially requested the ticket.
  IDtgs         Assures server that it has decrypted ticket properly.
  TS2           Informs TGS of time this ticket was issued.
  Lifetime2     Prevents replay after ticket has expired.
Authenticatorc  Assures TGS that the ticket presenter is the same as the client for whom the ticket was issued; has very short lifetime to prevent replay.
  Kc,tgs        Authenticator is encrypted with key known only to client and TGS, to prevent tampering.
  IDC           Must match ID in ticket to authenticate ticket.
  ADC           Must match address in ticket to authenticate ticket.
  TS3           Informs TGS of time this authenticator was generated.

(c) Client/Server Authentication Exchange
Message (5)     Client requests service.
  TicketV       Assures server that this user has been authenticated by AS.
  Authenticatorc  Generated by client to validate ticket.
Message (6)     Optional authentication of server to client.
  Kc,v          Assures C that this message is from V.
  TS5 + 1       Assures C that this is not a replay of an old reply.
Ticketv         Reusable so that client does not need to request a new ticket from TGS for each access to the same server.
  Kv            Ticket is encrypted with key known only to TGS and server, to prevent tampering.
  Kc,v          Copy of session key accessible to server; used to decrypt authenticator, thereby authenticating ticket.
  IDC           Indicates the rightful owner of this ticket.
  ADC           Prevents use of ticket from workstation other than one that initially requested the ticket.
  IDV           Assures server that it has decrypted ticket properly.
  TS4           Informs server of time this ticket was issued.
  Lifetime4     Prevents replay after ticket has expired.
Authenticatorc  Assures server that the ticket presenter is the same as the client for whom the ticket was issued; has very short lifetime to prevent replay.
  Kc,v          Authenticator is encrypted with key known only to client and server, to prevent tampering.
  IDC           Must match ID in ticket to authenticate ticket.
  ADC           Must match address in ticket to authenticate ticket.
  TS5           Informs server of time this authenticator was generated.

Figure 15.3 Kerberos Exchanges
Client → AS (client authentication): IDc || IDtgs || TS1
AS → Client (shared key and ticket): E(Kc, [Kc,tgs || IDtgs || TS2 || Lifetime2 || Tickettgs])
Client → TGS (Tickettgs, server ID, and client authentication): IDv || Tickettgs || Authenticatorc
TGS → Client (shared key and ticket): E(Kc,tgs, [Kc,v || IDv || TS4 || Ticketv])
Client → Server (Ticketv and client authentication): Ticketv || Authenticatorc
Server → Client (service granted): E(Kc,v, [TS5 + 1])

Such an environment is referred to as a Kerberos realm. The concept of
realm can be explained as follows. A Kerberos realm is a set of managed nodes
that share the same Kerberos database. The Kerberos database resides on the
Kerberos master computer system, which should be kept in a physically secure
room. A read-only copy of the Kerberos database might also reside on other
Kerberos computer systems. However, all changes to the database must be
made on the master computer system. Changing or accessing the contents of a
Kerberos database requires the Kerberos master password. A related concept
is that of a Kerberos principal, which is a service or user that is known to the
Kerberos system. Each Kerberos principal is identified by its principal name.
Principal names consist of three parts: a service or user name, an instance name,
and a realm name.
Networks of clients and servers under different administrative organizations
typically constitute different realms. That is, it generally is not practical or does
not conform to administrative policy to have users and servers in one administra-
tive domain registered with a Kerberos server elsewhere. However, users in one
realm may need access to servers in other realms, and some servers may be will-
ing to provide service to users from other realms, provided that those users are
authenticated. Kerberos provides a mechanism for supporting such interrealm authentication.
For two realms to support interrealm authentication, a third requirement is added:
3. The Kerberos server in each interoperating realm shares a secret key with the
server in the other realm. The two Kerberos servers are registered with each
other.
The scheme requires that the Kerberos server in one realm trust the Kerberos
server in the other realm to authenticate its users. Furthermore, the participating
servers in the second realm must also be willing to trust the Kerberos server in the
first realm.
With these ground rules in place, we can describe the mechanism as follows
(Figure 15.4): A user wishing service on a server in another realm needs a ticket for
that server. The user’s client follows the usual procedures to gain access to the local
TGS and then requests a ticket-granting ticket for a remote TGS (TGS in another
realm). The client can then apply to the remote TGS for a service-granting ticket for
the desired server in the realm of the remote TGS.
The details of the exchanges illustrated in Figure 15.4 are as follows (compare
Table 15.1).
(1) C → AS: IDc || IDtgs || TS1
(2) AS → C: E(Kc, [Kc,tgs || IDtgs || TS2 || Lifetime2 || Tickettgs])
(3) C → TGS: IDtgsrem || Tickettgs || Authenticatorc
(4) TGS → C: E(Kc,tgs, [Kc,tgsrem || IDtgsrem || TS4 || Tickettgsrem])
(5) C → TGSrem: IDvrem || Tickettgsrem || Authenticatorc
(6) TGSrem → C: E(Kc,tgsrem, [Kc,vrem || IDvrem || TS6 || Ticketvrem])
(7) C → Vrem: Ticketvrem || Authenticatorc

The ticket presented to the remote server (Vrem) indicates the realm in which
the user was originally authenticated. The server chooses whether to honor the re-
mote request.
One problem presented by the foregoing approach is that it does not scale well
to many realms. If there are N realms, then there must be N(N – 1)/2 secure key
exchanges so that each Kerberos realm can interoperate with all other Kerberos
realms.
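The N(N – 1)/2 figure is simply the number of unordered pairs of realms. A small sketch makes the growth concrete; the function names and the realm labels are invented for the illustration.

```python
from itertools import combinations

def interrealm_key_pairs(realms):
    """Every unordered pair of realms needs its own shared secret key."""
    return list(combinations(realms, 2))

def pair_count(n: int) -> int:
    """Closed form N(N - 1)/2 for the number of pairwise key exchanges."""
    return n * (n - 1) // 2

pairs = interrealm_key_pairs(["A", "B", "C", "D"])
print(len(pairs), pair_count(4))   # 6 6
print(pair_count(100))             # 4950
```

Four realms need only six shared keys, but one hundred realms already need 4950, which is why the pairwise approach does not scale.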
Kerberos Version 5
Kerberos version 5 is specified in RFC 4120 and provides a number of improve-
ments over version 4 [KOHL94]. To begin, we provide an overview of the changes
from version 4 to version 5 and then look at the version 5 protocol.
Figure 15.4 Request for Service in Another Realm
(Realm A and Realm B each contain an authentication server (AS) and a
ticket-granting server (TGS).)
1. Request ticket for local TGS
2. Ticket for local TGS
3. Request ticket for remote TGS
4. Ticket for remote TGS
5. Request ticket for remote server
6. Ticket for remote server

DIFFERENCES BETWEEN VERSIONS 4 AND 5 Version 5 is intended to address the limita-
tions of version 4 in two areas: environmental shortcomings and technical deficien-
cies. Let us briefly summarize the improvements in each area.8
Kerberos version 4 was developed for use within the Project Athena environ-
ment and, accordingly, did not fully address the need to be of general purpose. This
led to the following environmental shortcomings.
1. Encryption system dependence: Version 4 requires the use of DES. Export
restriction on DES as well as doubts about the strength of DES were thus of
concern. In version 5, ciphertext is tagged with an encryption-type identifier
so that any encryption technique may be used. Encryption keys are tagged
with a type and a length, allowing the same key to be used in different algorithms
and allowing the specification of different variations on a given algorithm.
2. Internet protocol dependence: Version 4 requires the use of Internet Protocol
(IP) addresses. Other address types, such as the ISO network address, are not
accommodated. Version 5 network addresses are tagged with type and length,
allowing any network address type to be used.
3. Message byte ordering: In version 4, the sender of a message employs a byte
ordering of its own choosing and tags the message to indicate least signifi-
cant byte in lowest address or most significant byte in lowest address. This
technique works but does not follow established conventions. In version
5, all message structures are defined using Abstract Syntax Notation One
(ASN.1) and Basic Encoding Rules (BER), which provide an unambiguous
byte ordering.
4. Ticket lifetime: Lifetime values in version 4 are encoded in an 8-bit quantity
in units of five minutes. Thus, the maximum lifetime that can be expressed is
2^8 × 5 = 1280 minutes (a little over 21 hours). This may be inadequate for
some applications (e.g., a long-running simulation that requires valid Kerberos
credentials throughout execution). In version 5, tickets include an explicit start
time and end time, allowing tickets with arbitrary lifetimes.
5. Authentication forwarding: Version 4 does not allow credentials issued to one
client to be forwarded to some other host and used by some other client. This
capability would enable a client to access a server and have that server access
another server on behalf of the client. For example, a client issues a request to
a print server that then accesses the client’s file from a file server, using the cli-
ent’s credentials for access. Version 5 provides this capability.
6. Interrealm authentication: In version 4, interoperability among N realms
requires on the order of N^2 Kerberos-to-Kerberos relationships, as described
earlier. Version 5 supports a method that requires fewer relationships, as de-
scribed shortly.
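The ticket lifetime limit in item 4 is easy to check numerically. The sketch below uses the bound exactly as stated in the text (2^8 units of five minutes) and contrasts it with a version 5 style lifetime computed from an explicit start time and end time; the function names and the sample dates are assumptions made for the illustration.

```python
from datetime import datetime, timedelta

V4_UNIT_MINUTES = 5
V4_MAX_MINUTES = 2 ** 8 * V4_UNIT_MINUTES   # bound as stated in the text

def fits_v4(minutes: int) -> bool:
    """Can this lifetime be expressed within the version 4 limit?"""
    return 0 < minutes <= V4_MAX_MINUTES

def v5_lifetime_minutes(start: datetime, end: datetime) -> int:
    """Version 5 style: lifetime is whatever lies between start and end."""
    return int((end - start).total_seconds() // 60)

start = datetime(2024, 1, 1, 9, 0)
end = start + timedelta(days=3)             # e.g., a long-running simulation

print(V4_MAX_MINUTES, round(V4_MAX_MINUTES / 60, 2))            # 1280 21.33
print(fits_v4(8 * 60), fits_v4(v5_lifetime_minutes(start, end)))  # True False
```

An eight-hour working session fits comfortably, while a three-day job (4320 minutes) cannot be represented in version 4 at all.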
8The following discussion follows the presentation in [KOHL94].

Apart from these environmental limitations, there are technical deficiencies
in the version 4 protocol itself. Most of these deficiencies were documented in
[BELL90], and version 5 attempts to address these. The deficiencies are the
following:
1. Double encryption: Note in Table 15.1 [messages (2) and (4)] that tickets pro-
vided to clients are encrypted twice—once with the secret key of the target
server and then again with a secret key known to the client. The second en-
cryption is not necessary and is computationally wasteful.
2. PCBC encryption: Encryption in version 4 makes use of a nonstandard mode
of DES known as propagating cipher block chaining (PCBC).9 It has been
demonstrated that this mode is vulnerable to an attack involving the inter-
change of ciphertext blocks [KOHL89]. PCBC was intended to provide an in-
tegrity check as part of the encryption operation. Version 5 provides explicit
integrity mechanisms, allowing the standard CBC mode to be used for encryp-
tion. In particular, a checksum or hash code is attached to the message prior to
encryption using CBC.
3. Session keys: Each ticket includes a session key that is used by the client
to encrypt the authenticator sent to the service associated with that ticket.
In addition, the session key may subsequently be used by the client and the
server to protect messages passed during that session. However, because
the same ticket may be used repeatedly to gain service from a particular
server, there is the risk that an opponent will replay messages from an old
session to the client or the server. In version 5, it is possible for a client
and server to negotiate a subsession key, which is to be used only for that
one connection. A new access by the client would result in the use of a new
subsession key.
4. Password attacks: Both versions are vulnerable to a password attack. The mes-
sage from the AS to the client includes material encrypted with a key based
on the client’s password.10 An opponent can capture this message and attempt
to decrypt it by trying various passwords. If the result of a test decryption is of
the proper form, then the opponent has discovered the client’s password and
may subsequently use it to gain authentication credentials from Kerberos. This
is the same type of password attack described in Chapter 21, with the same
kinds of countermeasures being applicable. Version 5 does provide a mecha-
nism known as preauthentication, which should make password attacks more
difficult, but it does not prevent them.
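The block-interchange weakness of PCBC noted in item 2 can be reproduced with a small sketch. It assumes a toy 4-round Feistel cipher built from SHA-256 as a stand-in for DES; the names toy_encrypt_block, pcbc_encrypt, and pcbc_decrypt are illustrative and not part of any Kerberos implementation. Swapping two adjacent ciphertext blocks garbles those two plaintext blocks, but every later block still decrypts correctly, so an integrity check that inspects only the final block would not notice the change.

```python
import hashlib

BLOCK = 8  # bytes per block (64 bits, the same block size as DES)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(half: bytes, key: bytes, i: int) -> bytes:
    # Toy round function: hash of key, round index, and half-block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _round(right, key, i))
    return left + right

def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(left, key, i)), left
    return left + right

def pcbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    chain, out = iv, []
    for p in blocks:
        c = toy_encrypt_block(key, _xor(p, chain))
        out.append(c)
        chain = _xor(p, c)  # next block is chained to plaintext XOR ciphertext
    return out

def pcbc_decrypt(key: bytes, iv: bytes, blocks: list) -> list:
    chain, out = iv, []
    for c in blocks:
        p = _xor(toy_decrypt_block(key, c), chain)
        out.append(p)
        chain = _xor(p, c)
    return out

key, iv = b"toy-key", bytes(BLOCK)
plain = [b"block-00", b"block-01", b"block-02", b"block-03"]
cipher = pcbc_encrypt(key, iv, plain)

# An opponent interchanges the two middle ciphertext blocks.
swapped = [cipher[0], cipher[2], cipher[1], cipher[3]]
recovered = pcbc_decrypt(key, iv, swapped)
```

In this run, recovered[1] and recovered[2] are corrupted while recovered[3] equals the original final block, because the two swapped blocks contribute the same XOR sum to the chaining value in either order.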
THE VERSION 5 AUTHENTICATION DIALOGUE Table 15.3 summarizes the basic ver-
sion 5 dialogue. This is best explained by comparison with version 4 (Table 15.1).
First, consider the authentication service exchange. Message (1) is a client re-
quest for a ticket-granting ticket. As before, it includes the ID of the user and the TGS.
The following new elements are added:
9This is described in Appendix T.
10Appendix T describes the mapping of passwords to encryption keys.

(1) C → AS: Options ‖ IDC ‖ Realmc ‖ IDtgs ‖ Times ‖ Nonce1
(2) AS → C: Realmc ‖ IDC ‖ Tickettgs ‖ E(Kc, [Kc,tgs ‖ Times ‖ Nonce1 ‖ Realmtgs ‖ IDtgs])
        Tickettgs = E(Ktgs, [Flags ‖ Kc,tgs ‖ Realmc ‖ IDC ‖ ADC ‖ Times])
(a) Authentication Service Exchange to obtain ticket-granting ticket

(3) C → TGS: Options ‖ IDv ‖ Times ‖ Nonce2 ‖ Tickettgs ‖ Authenticatorc
(4) TGS → C: Realmc ‖ IDC ‖ Ticketv ‖ E(Kc,tgs, [Kc,v ‖ Times ‖ Nonce2 ‖ Realmv ‖ IDv])
        Tickettgs = E(Ktgs, [Flags ‖ Kc,tgs ‖ Realmc ‖ IDC ‖ ADC ‖ Times])
        Ticketv = E(Kv, [Flags ‖ Kc,v ‖ Realmc ‖ IDC ‖ ADC ‖ Times])
        Authenticatorc = E(Kc,tgs, [IDC ‖ Realmc ‖ TS1])
(b) Ticket-Granting Service Exchange to obtain service-granting ticket

(5) C → V: Options ‖ Ticketv ‖ Authenticatorc
(6) V → C: E(Kc,v, [TS2 ‖ Subkey ‖ Seq #])
        Ticketv = E(Kv, [Flags ‖ Kc,v ‖ Realmc ‖ IDC ‖ ADC ‖ Times])
        Authenticatorc = E(Kc,v, [IDC ‖ Realmc ‖ TS2 ‖ Subkey ‖ Seq #])
(c) Client/Server Authentication Exchange to obtain service

Table 15.3 Summary of Kerberos Version 5 Message Exchanges
■ Realm: Indicates realm of user
■ Options: Used to request that certain flags be set in the returned ticket
■ Times: Used by the client to request the following time settings in the ticket:
—from: the desired start time for the requested ticket
—till: the requested expiration time for the requested ticket
—rtime: requested renew-till time
■ Nonce: A random value to be repeated in message (2) to assure that the re-
sponse is fresh and has not been replayed by an opponent
Message (2) returns a ticket-granting ticket, identifying information for the
client, and a block encrypted using the encryption key based on the user’s password.
This block includes the session key to be used between the client and the TGS,
times specified in message (1), the nonce from message (1), and TGS identifying
information. The ticket itself includes the session key, identifying information for
the client, the requested time values, and flags that reflect the status of this ticket
and the requested options. These flags introduce significant new functionality to
version 5. For now, we defer a discussion of these flags and concentrate on the over-
all structure of the version 5 protocol.
Let us now compare the ticket-granting service exchange for versions
4 and 5. We see that message (3) for both versions includes an authenticator, a
ticket, and the name of the requested service. In addition, version 5 includes re-
quested times and options for the ticket and a nonce—all with functions similar
to those of message (1). The authenticator itself is essentially the same as the one
used in version 4.

Message (4) has the same structure as message (2). It returns a ticket plus
information needed by the client, with the information encrypted using the session
key now shared by the client and the TGS.
Finally, for the client/server authentication exchange, several new features
appear in version 5. In message (5), the client may request as an option that mutual
authentication is required. The authenticator includes several new fields:
■ Subkey: The client’s choice for an encryption key to be used to protect this
specific application session. If this field is omitted, the session key from the
ticket (Kc,v) is used.
■ Sequence number: An optional field that specifies the starting sequence num-
ber to be used by the server for messages sent to the client during this session.
Messages may be sequence numbered to detect replays.
If mutual authentication is required, the server responds with message (6).
This message includes the timestamp from the authenticator. Note that in version 4,
the timestamp was incremented by one. This is not necessary in version 5, because
the nature of the format of messages is such that it is not possible for an oppo-
nent to create message (6) without knowledge of the appropriate encryption keys.
The subkey field, if present, overrides the subkey field, if present, in message (5).
The optional sequence number field specifies the starting sequence number to be
used by the client.
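The subkey and sequence-number rules just described can be sketched as follows. This is an illustration under stated assumptions, not code from any Kerberos implementation: the names select_session_key and SequenceChecker are hypothetical, and the checker assumes strictly in-order delivery, which the text does not require.

```python
from typing import Optional

def select_session_key(ticket_key: bytes,
                       client_subkey: Optional[bytes] = None,
                       server_subkey: Optional[bytes] = None) -> bytes:
    """Pick the key that protects this application session.

    A subkey in message (6) overrides one in message (5); if neither
    is present, the session key from the ticket (Kc,v) is used.
    """
    if server_subkey is not None:
        return server_subkey
    if client_subkey is not None:
        return client_subkey
    return ticket_key

class SequenceChecker:
    """Rejects messages whose sequence number is not the next one expected."""

    def __init__(self, start: int) -> None:
        self.expected = start

    def accept(self, seq: int) -> bool:
        if seq != self.expected:
            return False  # replayed or out-of-order message
        self.expected += 1
        return True
```

A receiver would create one SequenceChecker per direction, seeded with the starting sequence number announced by the peer.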
TICKET FLAGS The flags field included in tickets in version 5 supports expanded
functionality compared to that available in version 4. Table 15.4 summarizes the
flags that may be included in a ticket.
INITIAL        This ticket was issued using the AS protocol and not issued based on a
               ticket-granting ticket.
PRE-AUTHENT    During initial authentication, the client was authenticated by the KDC
               before a ticket was issued.
HW-AUTHENT     The protocol employed for initial authentication required the use of
               hardware expected to be possessed solely by the named client.
RENEWABLE      Tells TGS that this ticket can be used to obtain a replacement ticket that
               expires at a later date.
MAY-POSTDATE   Tells TGS that a postdated ticket may be issued based on this
               ticket-granting ticket.
POSTDATED      Indicates that this ticket has been postdated; the end server can check the
               authtime field to see when the original authentication occurred.
INVALID        This ticket is invalid and must be validated by the KDC before use.
PROXIABLE      Tells TGS that a new service-granting ticket with a different network
               address may be issued based on the presented ticket.
PROXY          Indicates that this ticket is a proxy.
FORWARDABLE    Tells TGS that a new ticket-granting ticket with a different network
               address may be issued based on this ticket-granting ticket.
FORWARDED      Indicates that this ticket has either been forwarded or was issued based on
               authentication involving a forwarded ticket-granting ticket.
Table 15.4 Kerberos Version 5 Flags
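One way to picture the flags field is as a set of independent bits. The sketch below models the flags of Table 15.4 with Python's enum.Flag; the bit positions are assigned automatically and are an assumption of this sketch, not the encoding used on the wire, and the helper names are hypothetical.

```python
from enum import Flag, auto

class TicketFlag(Flag):
    INITIAL = auto()
    PRE_AUTHENT = auto()
    HW_AUTHENT = auto()
    RENEWABLE = auto()
    MAY_POSTDATE = auto()
    POSTDATED = auto()
    INVALID = auto()
    PROXIABLE = auto()
    PROXY = auto()
    FORWARDABLE = auto()
    FORWARDED = auto()

def flag_names(flags: TicketFlag) -> list:
    """List the names of the flags that are set, in table order."""
    return [f.name for f in TicketFlag if f in flags]

def issue_proxy_flags(presented: TicketFlag) -> TicketFlag:
    """Flags for a service-granting ticket issued as a proxy.

    The TGS may do this only if the presented ticket is PROXIABLE.
    """
    if TicketFlag.PROXIABLE not in presented:
        raise PermissionError("presented ticket is not PROXIABLE")
    return TicketFlag.PROXY

tgt = TicketFlag.INITIAL | TicketFlag.RENEWABLE | TicketFlag.PROXIABLE
```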

The INITIAL flag indicates that this ticket was issued by the AS, not by the
TGS. When a client requests a service-granting ticket from the TGS, it presents a
ticket-granting ticket obtained from the AS. In version 4, this was the only way to
obtain a service-granting ticket. Version 5 provides the additional capability that
the client can get a service-granting ticket directly from the AS. The utility of this is
as follows: A server, such as a password-changing server, may wish to know that the
client’s password was recently tested.
The PRE-AUTHENT flag, if set, indicates that when the AS received the ini-
tial request [message (1)], it authenticated the client before issuing a ticket. The
exact form of this preauthentication is left unspecified. As an example, the MIT
implementation of version 5 has encrypted timestamp preauthentication, enabled
by default. When a user wants to get a ticket, it has to send to the AS a preauthen-
tication block containing a random confounder, a version number, and a timestamp
all encrypted in the client’s password-based key. The AS decrypts the block and will
not send a ticket-granting ticket back unless the timestamp in the preauthentica-
tion block is within the allowable time skew (time interval to account for clock drift
and network delays). Another possibility is the use of a smart card that generates
continually changing passwords that are included in the preauthenticated messages.
The passwords generated by the card can be based on a user’s password but be
transformed by the card so that, in effect, arbitrary passwords are used. This pre-
vents an attack based on easily guessed passwords. If a smart card or similar device
was used, this is indicated by the HW-AUTHENT flag.
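The encrypted-timestamp check can be sketched roughly as below. Several details are assumptions made for illustration only: the key derivation (PBKDF2), the XOR keystream used in place of a real cipher, the block layout, the version value 5, and the 300-second skew limit. None of these is taken from the MIT implementation.

```python
import hashlib
import os
import struct

MAX_SKEW = 300  # allowable clock skew in seconds (assumed policy value)

def password_key(password: str, salt: bytes) -> bytes:
    # Stand-in for the mapping of a password to an encryption key.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)

def _keystream(key: bytes, confounder: bytes, n: int) -> bytes:
    return hashlib.sha256(key + confounder).digest()[:n]

def make_preauth(key: bytes, now: int) -> bytes:
    """Client side: confounder followed by masked (version, timestamp)."""
    confounder = os.urandom(8)
    body = struct.pack(">BQ", 5, now)
    stream = _keystream(key, confounder, len(body))
    return confounder + bytes(a ^ b for a, b in zip(body, stream))

def check_preauth(key: bytes, block: bytes, server_now: int) -> bool:
    """AS side: unmask the block and accept only a fresh timestamp."""
    confounder, masked = block[:8], block[8:]
    stream = _keystream(key, confounder, len(masked))
    version, ts = struct.unpack(">BQ", bytes(a ^ b for a, b in zip(masked, stream)))
    return version == 5 and abs(server_now - ts) <= MAX_SKEW
```

A block produced with the wrong password unmasks to an arbitrary version and timestamp, so the AS declines to return a ticket-granting ticket and the opponent obtains no password-encrypted material to attack offline from that exchange.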
When a ticket has a long lifetime, there is the potential for it to be stolen and
used by an opponent for a considerable period. If a short lifetime is used to lessen
the threat, then overhead is involved in acquiring new tickets. In the case of a ticket-
granting ticket, the client would either have to store the user’s secret key, which is
clearly risky, or repeatedly ask the user for a password. A compromise scheme is
the use of renewable tickets. A ticket with the RENEWABLE flag set includes two
expiration times: One for this specific ticket and one that is the latest permissible
value for an expiration time. A client can have the ticket renewed by presenting it
to the TGS with a requested new expiration time. If the new time is within the limit
of the latest permissible value, the TGS can issue a new ticket with a new session
time and a later specific expiration time. The advantage of this mechanism is that
the TGS may refuse to renew a ticket reported as stolen.
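The renewal decision can be sketched as a small function. The Ticket fields and the set of tickets reported as stolen are hypothetical names for this illustration, and the rule that an already expired ticket cannot be renewed is an assumption of the sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional, Set

@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    renewable: bool
    end_time: int    # expiration time of this specific ticket
    renew_till: int  # latest permissible value for an expiration time

def renew(ticket: Ticket, requested_end: int, now: int,
          reported_stolen: Set[str]) -> Optional[Ticket]:
    """Return a renewed ticket, or None if the TGS refuses."""
    if not ticket.renewable or ticket.ticket_id in reported_stolen:
        return None
    if now > ticket.end_time:             # assumed: expired tickets are not renewed
        return None
    if requested_end > ticket.renew_till:
        return None
    return replace(ticket, end_time=requested_end)
```

Each renewal passes through the TGS, which is what gives it the opportunity to refuse a ticket that has been reported as stolen.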
A client may request that the AS provide a ticket-granting ticket with the
MAY-POSTDATE flag set. The client can then use this ticket to request a ticket
that is flagged as POSTDATED and INVALID from the TGS. Subsequently, the
client may submit the postdated ticket for validation. This scheme can be useful
for running a long batch job on a server that requires a ticket periodically. The
client can obtain a number of tickets for this session at once, with spread out time
values. All but the first ticket are initially invalid. When the execution reaches a
point in time when a new ticket is required, the client can get the appropriate ticket
validated. With this approach, the client does not have to repeatedly use its ticket-
granting ticket to obtain a service-granting ticket.
In version 5, it is possible for a server to act as a proxy on behalf of a client, in
effect adopting the credentials and privileges of the client to request a service from
another server. If a client wishes to use this mechanism, it requests a ticket-granting

ticket with the PROXIABLE flag set. When this ticket is presented to the TGS, the
TGS is permitted to issue a service-granting ticket with a different network address;
this latter ticket will have its PROXY flag set. An application receiving such a ticket
may accept it or require additional authentication to provide an audit trail.11
The proxy concept is a limited case of the more powerful forwarding procedure.
If a ticket is set with the FORWARDABLE flag, a TGS can issue to the requestor a
ticket-granting ticket with a different network address and the FORWARDED flag
set. This ticket then can be presented to a remote TGS. This capability allows a cli-
ent to gain access to a server on another realm without requiring that each Kerberos
maintain a secret key with Kerberos servers in every other realm. For example,
realms could be structured hierarchically. Then a client could walk up the tree to a
common node and then back down to reach a target realm. Each step of the walk
would involve forwarding a ticket-granting ticket to the next TGS in the path.
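The walk up the tree and back down can be sketched under one simplifying assumption: realm names are dotted, and a realm's parent is obtained by dropping its leftmost label. The realm names and the function names are hypothetical.

```python
from typing import List

def ancestors(realm: str) -> List[str]:
    """The realm itself followed by each parent, up to the root."""
    parts = realm.split(".")
    return [".".join(parts[i:]) for i in range(len(parts))]

def realm_path(source: str, target: str) -> List[str]:
    """Realms visited when forwarding a ticket-granting ticket."""
    up, down = ancestors(source), ancestors(target)
    common = next((r for r in up if r in down), None)
    if common is None:
        raise ValueError("realms share no common ancestor")
    climb = up[:up.index(common) + 1]                   # source up to common node
    descend = list(reversed(down[:down.index(common)]))  # back down to target
    return climb + descend
```

For example, realm_path("ENG.A.EXAMPLE", "SALES.B.EXAMPLE") visits ENG.A.EXAMPLE, A.EXAMPLE, EXAMPLE, B.EXAMPLE, and SALES.B.EXAMPLE. Each realm needs a key relationship only with its parent and children rather than with every other realm.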
Mutual Authentication
In Chapter 14, we presented one approach to the use of public-key encryption for
the purpose of session-key distribution (Figure 14.9). This protocol assumes that
each of the two parties is in possession of the current public key of the other. It may
not be practical to require this assumption.
A protocol using timestamps is provided in [DENN81]:
1. A → AS: IDA ‖ IDB
2. AS → A: E(PRas, [IDA ‖ PUa ‖ T]) ‖ E(PRas, [IDB ‖ PUb ‖ T])
3. A → B: E(PRas, [IDA ‖ PUa ‖ T]) ‖ E(PRas, [IDB ‖ PUb ‖ T]) ‖
          E(PUb, E(PRa, [Ks ‖ T]))
In this case, the central system is referred to as an authentication server (AS),
because it is not actually responsible for secret-key distribution. Rather, the AS pro-
vides public-key certificates. The session key is chosen and encrypted by A; hence,
there is no risk of exposure by the AS. The timestamps protect against replays of
compromised keys.
This protocol is compact but, as before, requires the synchronization of clocks.
Another approach, proposed by Woo and Lam [WOO92a], makes use of nonces.
The protocol consists of the following steps.
1. A → KDC: IDA ‖ IDB
2. KDC → A: E(PRauth, [IDB ‖ PUb])
3. A → B: E(PUb, [Na ‖ IDA])
4. B → KDC: IDA ‖ IDB ‖ E(PUauth, Na)
5. KDC → B: E(PRauth, [IDA ‖ PUa]) ‖ E(PUb, E(PRauth, [Na ‖ Ks ‖ IDB]))
11For a discussion of some of the possible uses of the proxy capability, see [NEUM93b].

6. B → A: E(PUa, [E(PRauth, [Na ‖ Ks ‖ IDB]) ‖ Nb])
7. A → B: E(Ks, Nb)
In step 1, A informs the KDC of its intention to establish a secure connection
with B. The KDC returns to A a copy of B’s public-key certificate (step 2). Using B’s
public key, A informs B of its desire to communicate and sends a nonce Na (step 3).
In step 4, B asks the KDC for A’s public-key certificate and requests a session key;
B includes A’s nonce so that the KDC can stamp the session key with that nonce.
The nonce is protected using the KDC’s public key. In step 5, the KDC returns to
B a copy of A’s public-key certificate, plus the information {Na, Ks, IDB}. This infor-
mation basically says that Ks is a secret key generated by the KDC on behalf of B
and tied to Na; the binding of Ks and Na will assure A that Ks is fresh. This triple is
encrypted using the KDC’s private key to allow B to verify that the triple is in fact
from the KDC. It is also encrypted using B’s public key so that no other entity may
use the triple in an attempt to establish a fraudulent connection with A. In step 6,
the triple {Na, Ks, IDB}, still encrypted with the KDC’s private key, is relayed to A,
together with a nonce Nb generated by B. All the foregoing are encrypted using A’s
public key. A retrieves the session key Ks, uses it to encrypt Nb, and returns it to B.
This last message assures B of A’s knowledge of the session key.
This seems to be a secure protocol that takes into account the various attacks.
However, the authors themselves spotted a flaw and submitted a revised version of
the algorithm in [WOO92b]:
1. A → KDC: IDA ‖ IDB
2. KDC → A: E(PRauth, [IDB ‖ PUb])
3. A → B: E(PUb, [Na ‖ IDA])
4. B → KDC: IDA ‖ IDB ‖ E(PUauth, Na)
5. KDC → B: E(PRauth, [IDA ‖ PUa]) ‖ E(PUb, E(PRauth, [Na ‖ Ks ‖ IDA ‖ IDB]))
6. B → A: E(PUa, [Nb ‖ E(PRauth, [Na ‖ Ks ‖ IDA ‖ IDB])])
7. A → B: E(Ks, Nb)
The identifier of A, IDA, is added to the set of items encrypted with the KDC’s
private key in steps 5 and 6. This binds the session key Ks to the identities of the two
parties that will be engaged in the session. This inclusion of IDA accounts for the
fact that the nonce value Na is considered unique only among all nonces generated
by A, not among all nonces generated by all parties. Thus, it is the pair {IDA, Na}
that uniquely identifies the connection request of A.
In both this example and the protocols described earlier, protocols that ap-
peared secure were revised after additional analysis. These examples highlight the
difficulty of getting things right in the area of authentication.
One-Way Authentication
We have already presented public-key encryption approaches that are suited to
electronic mail, including the straightforward encryption of the entire message for
confidentiality (Figure 12.1b), authentication (Figure 12.1c), or both (Figure 12.1d).
These approaches require that either the sender know the recipient’s public key

(confidentiality), the recipient know the sender’s public key (authentication), or
both (confidentiality plus authentication). In addition, the public-key algorithm
must be applied once or twice to what may be a long message.
If confidentiality is the primary concern, then the following may be more efficient:
A → B: E(PUb, Ks) ‖ E(Ks, M)
In this case, the message is encrypted with a one-time secret key. A also encrypts this
one-time key with B’s public key. Only B will be able to use the corresponding private
key to recover the one-time key and then use that key to decrypt the message. This
scheme is more efficient than simply encrypting the entire message with B’s public key.
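The one-time-key scheme above can be sketched with toy components. Everything here is an assumption chosen so the example runs on the standard library alone: textbook RSA without padding over two small Mersenne primes stands in for B's key pair, and a SHA-256 counter keystream stands in for the symmetric cipher. Neither is suitable for real use.

```python
import hashlib
import os

# B's toy key pair: PUb = (N, E), PRb = (N, D).
P, Q = 2**61 - 1, 2**89 - 1          # Mersenne primes, far too small in practice
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # modular inverse (Python 3.8+)

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a hash-based keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def send(message: bytes) -> tuple:
    """A's side: returns (E(PUb, Ks), E(Ks, M))."""
    ks = os.urandom(16)                              # one-time secret key
    wrapped = pow(int.from_bytes(ks, "big"), E, N)   # one public-key operation
    return wrapped, stream_xor(ks, message)

def receive(wrapped: int, body: bytes) -> bytes:
    """B's side: recover Ks with the private key, then decrypt M."""
    ks = pow(wrapped, D, N).to_bytes(16, "big")
    return stream_xor(ks, body)
```

The public-key operation is applied only to the 16-byte key, regardless of the length of M, which is where the efficiency gain comes from.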
If authentication is the primary concern, then a digital signature may suffice,
as was illustrated in Figure 13.2:
A → B: M ‖ E(PRa, H(M))
This method guarantees that A cannot later deny having sent the message.
However, this technique is open to another kind of fraud. Bob composes a mes-
sage to his boss Alice that contains an idea that will save the company money. He
appends his digital signature and sends it into the email system. Eventually, the
message will get delivered to Alice’s mailbox. But suppose that Max has heard of
Bob’s idea and gains access to the mail queue before delivery. He finds Bob’s mes-
sage, strips off his signature, appends his, and requeues the message to be delivered
to Alice. Max gets credit for Bob’s idea.
To counter such a scheme, both the message and signature can be encrypted
with the recipient’s public key:
A → B: E(PUb, [M ‖ E(PRa, H(M))])
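The signature substitution that motivates this countermeasure can be reproduced with a toy signature scheme. The sketch assumes textbook RSA over small Mersenne primes with an unpadded SHA-256 digest, which is insecure and used only for illustration; the key pairs and function names are hypothetical.

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 65537) -> tuple:
    """Return ((n, e), (n, d)) for primes p and q."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def digest(message: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(private: tuple, message: bytes) -> int:
    n, d = private
    return pow(digest(message, n), d, n)      # E(PR, H(M))

def verify(public: tuple, message: bytes, signature: int) -> bool:
    n, e = public
    return pow(signature, e, n) == digest(message, n)

bob_pub, bob_priv = make_keypair(2**61 - 1, 2**89 - 1)
max_pub, max_priv = make_keypair(2**107 - 1, 2**127 - 1)

message = b"An idea that will save the company money"
queued = (message, sign(bob_priv, message))         # M || E(PRbob, H(M))

# Max reads the unencrypted message, discards Bob's signature, adds his own.
tampered = (queued[0], sign(max_priv, queued[0]))
```

Alice's check verify(max_pub, *tampered) succeeds, so the signature alone does not tell her who wrote the message first. Encrypting the message and signature under the recipient's public key keeps the message out of Max's reach while it sits in the queue.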
The latter two schemes require that B know A’s public key and be convinced
that it is timely. An effective way to provide this assurance is the digital certificate,
described in Chapter 14. Now we have
A → B: M ‖ E(PRa, H(M)) ‖ E(PRas, [T ‖ IDA ‖ PUa])
In addition to the message, A sends B the signature encrypted with A’s private
key and A’s certificate encrypted with the private key of the authentication server.
The recipient of the message first uses the certificate to obtain the sender’s public
key and verify that it is authentic and then uses the public key to verify the message
itself. If confidentiality is required, then the entire message can be encrypted with
B’s public key. Alternatively, the entire message can be encrypted with a one-time
secret key; the secret key is also transmitted, encrypted with B’s public key. This ap-
proach is explored in Chapter 19.
Federated identity management is a relatively new concept dealing with the use of
a common identity management scheme across multiple enterprises and numerous
applications and supporting many thousands, even millions, of users. We begin our
overview with a discussion of the concept of identity management and then examine
federated identity management.

Identity Management
Identity management is a centralized, automated approach to provide enterprise-
wide access to resources by employees and other authorized individuals. The focus
of identity management is defining an identity for each user (human or process),
associating attributes with the identity, and enforcing a means by which a user can
verify identity. The central concept of an identity management system is the use of
single sign-on (SSO).
SSO enables a user to access all network resources after a single authentication.
Typical services provided by a federated identity management system include
the following:
■ Point of contact: Includes authentication that a user corresponds to the user
name provided, and management of user/server sessions.
■ SSO protocol services: Provides a vendor-neutral security token service for
supporting a single sign-on to federated services.
■ Trust services: Federation relationships require a trust relationship-based
federation between business partners. A trust relationship is represented by
the combination of the security tokens used to exchange information about a
user, the cryptographic information used to protect these security tokens, and
optionally the identity mapping rules applied to the information contained
within this token.
■ Key services: Management of keys and certificates.
■ Identity services: Services that provide the interface to local data stores, including
user registries and databases, for identity-related information management.
■ Authorization: Granting access to specific services and/or resources based on
the authentication.
■ Provisioning: Includes creating an account in each target system for the user,
enrollment or registration of user in accounts, establishment of access rights or
credentials to ensure the privacy and integrity of account data.
■ Management: Services related to runtime configuration and deployment.
Note that Kerberos contains a number of the elements of an identity manage-
ment system.
Figure 15.5 illustrates entities and data flows in a generic identity manage-
ment architecture. A principal is an identity holder. Typically, this is a human user
that seeks access to resources and services on the network. User devices, agent pro-
cesses, and server systems may also function as principals. Principals authenticate
themselves to an identity provider. The identity provider associates authentication
information with a principal, as well as attributes and one or more identifiers.
Increasingly, digital identities incorporate attributes other than simply an iden-
tifier and authentication information (such as passwords and biometric information).
An attribute service manages the creation and maintenance of such attributes. For
example, a user needs to provide a shipping address each time an order is placed at a
new Web merchant, and this information needs to be revised when the user moves.
Identity management enables the user to provide this information once, so that it
is maintained in a single place and released to data consumers in accordance with

authorization and privacy policies. Users may create some of the attributes to be
associated with their digital identity, such as an address. Administrators may also as-
sign attributes to users, such as roles, access permissions, and employee information.
Data consumers are entities that obtain and employ data maintained and
provided by identity and attribute providers, which are often used to support autho-
rization decisions and to collect audit information. For example, a database server
or file server is a data consumer that needs a client’s credentials so as to know what
access to provide to that client.
Identity Federation
Identity federation is, in essence, an extension of identity management to multiple
security domains. Such domains include autonomous internal business units, exter-
nal business partners, and other third-party applications and services. The goal is to
provide the sharing of digital identities so that a user can be authenticated a single
time and then access applications and resources across multiple domains. Because
these domains are relatively autonomous or independent, no centralized control is
possible. Rather, the cooperating organizations must form a federation based on
agreed standards and mutual levels of trust to securely share digital identities.
Federated identity management refers to the agreements, standards, and
technologies that enable the portability of identities, identity attributes, and entitle-
ments across multiple enterprises and numerous applications and supporting many
thousands, even millions, of users. When multiple organizations implement interop-
erable federated identity schemes, an employee in one organization can use a single
sign-on to access services across the federation with trust relationships associated
with the identity. For example, an employee may log onto her corporate intranet
and be authenticated to perform authorized functions and access authorized ser-
vices on that intranet. The employee could then access their health benefits from an
outside health-care provider without having to reauthenticate.
Figure 15.5 Generic Identity Management Architecture

Beyond SSO, federated identity management provides other capabilities. One
is a standardized means of representing attributes. Increasingly, digital identities
incorporate attributes other than simply an identifier and authentication informa-
tion (such as passwords and biometric information). Examples of attributes include
account numbers, organizational roles, physical location, and file ownership. A user
may have multiple identifiers; for example, each identifier may be associated with a
unique role with its own access permissions.
Another key function of federated identity management is identity mapping.
Different security domains may represent identities and attributes differently.
Further, the amount of information associated with an individual in one domain
may be more than is necessary in another domain. The federated identity manage-
ment protocols map identities and attributes of a user in one domain to the require-
ments of another domain.
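Identity mapping can be pictured as a renaming and filtering step applied to a user's attributes. The attribute names and the mapping table below are hypothetical; real federations define these in their own schemas and policies.

```python
def map_identity(source_attrs: dict, mapping: dict) -> dict:
    """Translate attributes from the source domain's names to the
    destination domain's names, releasing only the mapped attributes."""
    return {dst: source_attrs[src]
            for src, dst in mapping.items()
            if src in source_attrs}

# Attributes held in the source domain (illustrative values).
employee = {
    "emp_id": "E1234",
    "dept_role": "engineer",
    "home_address": "12 Elm St",
    "salary_band": "B3",
}

# The destination domain needs only an identifier and a role.
to_partner = {"emp_id": "subject", "dept_role": "role"}

released = map_identity(employee, to_partner)
```

Attributes with no entry in the mapping, such as the salary band here, are not passed to the other domain.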
Figure 15.6 illustrates entities and data flows in a generic federated identity
management architecture.
Figure 15.6 Federated Identity Operation
Entities: Identity provider (source domain); Service provider (destination domain)
1. End user’s browser or other application engages in an authentication dialogue
   with identity provider in the same domain. End user also provides attribute
   values associated with user’s identity.
2. Some attributes associated with an identity, such as allowable roles, may be
   provided by an administrator in the same domain.
3. A service provider in a remote domain, which the user wishes to access, obtains
   identity information, authentication information, and associated attributes
   from the identity provider in the source domain.
4. Service provider opens session with remote user and enforces access control
   restrictions based on user’s identity and attributes.

The identity provider acquires attribute information through dialogue and pro-
tocol exchanges with users and administrators. For example, a user needs to provide
a shipping address each time an order is placed at a new Web merchant, and this
information needs to be revised when the user moves. Identity management enables
the user to provide this information once, so that it is maintained in a single place and
released to data consumers in accordance with authorization and privacy policies.
Service providers are entities that obtain and employ data maintained and pro-
vided by identity providers, often to support authorization decisions and to collect
audit information. For example, a database server or file server is a data consumer
that needs a client’s credentials so as to know what access to provide to that client.
A service provider can be in the same domain as the user and the identity provider.
The power of this approach is for federated identity management, in which the ser-
vice provider is in a different domain (e.g., a vendor or supplier network).
STANDARDS Federated identity management uses a number of standards as the
building blocks for secure identity exchange across different domains or heteroge-
neous systems. In essence, organizations issue some form of security tickets for their
users that can be processed by cooperating partners. Identity federation standards
are thus concerned with defining these tickets, in terms of content and format, pro-
viding protocols for exchanging tickets and performing a number of management
tasks. These tasks include configuring systems to perform attribute transfers and
identity mapping, and performing logging and auditing functions. The key stan-
dards are as follows:
■ The Extensible Markup Language (XML): A markup language that uses sets
of embedded tags or labels to characterize text elements within a document
so as to indicate their appearance, function, meaning, or context. XML docu-
ments appear similar to HTML (Hypertext Markup Language) documents
that are visible as Web pages, but provide greater functionality. XML includes
strict definitions of the data type of each field, thus supporting database for-
mats and semantics. XML provides encoding rules for commands that are used
to transfer and update data objects.
■ The Simple Object Access Protocol (SOAP): A minimal set of conventions
for invoking code using XML over HTTP. It enables applications to request
services from one another with XML-based requests and receive responses
as data formatted with XML. Thus, XML defines data objects and structures,
and SOAP provides a means of exchanging such data objects and performing
remote procedure calls related to these objects. See [ROS06] for an informa-
tive discussion.
■ WS-Security: A set of SOAP extensions for implementing message integrity
and confidentiality in Web services. To provide for secure exchange of SOAP
messages among applications, WS-Security assigns security tokens to each
message for use in authentication.
■ Security Assertion Markup Language (SAML): An XML-based language for
the exchange of security information between online business partners. SAML
conveys authentication information in the form of assertions about subjects.
Assertions are statements about the subject issued by an authoritative entity.

The challenge with federated identity management is to integrate multiple
technologies, standards, and services to provide a secure, user-friendly utility. The
key, as in most areas of security and networking, is the reliance on a few mature
standards widely accepted by industry. Federated identity management seems to
have reached this level of maturity.
EXAMPLES To get some feel for the functionality of identity federation, we look at
three scenarios, taken from [COMP06].
In the first scenario (Figure 15.7a), Workplace.com contracts with an outside
health-care provider to provide employee health benefits. An employee uses a Web
interface to sign on to Workplace.com and goes through an authentication
procedure there. This enables the employee to access authorized services and
resources at Workplace.com. When the employee clicks on a link to access health
benefits, her browser is redirected to the health-care provider’s site. At the
same time, the Workplace.com software passes the user’s identifier to the
provider in a secure manner. The two organizations are part of a federation
that cooperatively exchanges user identifiers. The health-care provider
maintains user identities for every employee at Workplace.com and associates
with each identity health-benefits information and access rights. In this
example, the linkage between the two companies is based on account information
and user participation is browser based.
Figure 15.7 Federated Identity Scenarios: (a) Federation based on account linking;
(b) Federation based on roles; (c) Chained Web services
Figure 15.7b shows a second type of browser-based scheme. PartsSupplier.com
is a regular supplier of parts to Workplace.com. In this case, a role-based
access-control (RBAC) scheme is used for access to information. An engineer of
Workplace.com authenticates at the employee portal at Workplace.com and clicks
on a link to access information at PartsSupplier.com. Because the user is
authenticated in the role of an engineer, he is taken to the technical
documentation and troubleshooting portion of PartsSupplier.com’s Web site
without having to sign on. Similarly, an employee in a purchasing role signs on
at Workplace.com and is authorized, in that role, to place purchases at
PartsSupplier.com without having to authenticate to PartsSupplier.com. For this
scenario, PartsSupplier.com does not have identity information for individual
employees at Workplace.com. Rather, the linkage between the two federated
partners is in terms of roles.
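Role-based linkage of this kind can be sketched as an authorization check that looks only at the asserted role and its issuer. The issuer label, role names, and resource names below are hypothetical and chosen to mirror the scenario.

```python
# Federation partners whose role assertions the supplier accepts (assumed label).
TRUSTED_ISSUERS = {"employer-portal"}

# What each role may reach at the supplier's site.
ROLE_PERMISSIONS = {
    "engineer": {"technical-docs", "troubleshooting"},
    "purchasing": {"place-order"},
}

def authorize(assertion: dict, resource: str) -> bool:
    """Grant access from the role alone; no per-user account is consulted."""
    if assertion.get("issuer") not in TRUSTED_ISSUERS:
        return False
    allowed = ROLE_PERMISSIONS.get(assertion.get("role"), set())
    return resource in allowed
```

The assertion carries no individual identity, which matches the point that the supplier holds no identity information for individual employees of its partner.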
The scenario illustrated in Figure 15.7c can be referred to as document based
rather than browser based. In this third example, has a purchasing
agreement with, and has a business relationship
with An employee of signs on and is authenticated to
make purchases. The employee goes to a procurement application that provides a
list of’s suppliers and the parts that can be ordered. The user clicks
on the PinSupplies button and is presented with a purchase order Web page (HTML
page). The employee fills out the form and clicks the submit button. The procure-
ment application generates an XML/SOAP document that it inserts into the enve-
lope body of an XML-based message. The procurement application then inserts the
user’s credentials in the envelope header of the message, together with Workplace.
com’s organizational identity. The procurement application posts the message to
the’s purchasing Web service. This service authenticates the in-
coming message and processes the request. The purchasing Web service then sends
a SOAP message to its shipping partner to fulfill the order. The message includes
a security token in the envelope header and the list of items to be
shipped as well as the end user’s shipping information in the envelope body. The
shipping Web service authenticates the request and processes the shipment order.
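The document-based exchange can be sketched as follows. This is a minimal illustration, not the format used by any particular product: the element names Credentials, UserId, Organization, PurchaseOrder, and Item are hypothetical, and a real deployment would carry a signed security token in the header rather than plain identifiers.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_purchase_message(user_id, organization, items):
    """Build a SOAP-style envelope: credentials in the header, order in the body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # Hypothetical credential block: the user's identity plus the
    # organizational identity of the requesting company
    creds = ET.SubElement(header, "Credentials")
    ET.SubElement(creds, "UserId").text = user_id
    ET.SubElement(creds, "Organization").text = organization

    # Hypothetical purchase order carried in the envelope body
    order = ET.SubElement(body, "PurchaseOrder")
    for part, quantity in items:
        item = ET.SubElement(order, "Item", quantity=str(quantity))
        item.text = part
    return envelope


if __name__ == "__main__":
    message = build_purchase_message("alice", "Workplace.com", [("pin-10mm", 100)])
    print(ET.tostring(message, encoding="unicode"))
```

Separating the header from the body lets the receiving Web service authenticate the sender before it interprets the order itself.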
User authentication based on the possession of a smart card is becoming more wide-
spread. A smart card has the appearance of a credit card, has an electronic inter-
face, and may use a variety of authentication protocols.
A smart card contains within it an entire microprocessor, including processor,
memory, and I/O ports. Some versions incorporate a special co-processing circuit
for cryptographic operation to speed the task of encoding and decoding messages or
generating digital signatures to validate the information transferred. In some cards,
the I/O ports are directly accessible by a compatible reader by means of exposed
electrical contacts. Other cards rely instead on an embedded antenna for wireless
communication with the reader.

A typical smart card includes three types of memory. Read-only memory
(ROM) stores data that does not change during the card’s life, such as the card
number and the cardholder’s name. Electrically erasable programmable ROM
(EEPROM) holds application data and programs, such as the protocols that the
card can execute. It also holds data that may vary with time. For example, in a tele-
phone card, the EEPROM holds the talk time remaining. Random access memory
(RAM) holds temporary data generated when applications are executed.
For the practical application of smart card authentication, a wide range of
vendors must conform to standards that cover smart card protocols, authentication
and access control formats and protocols, database entries, message formats, and so
on. An important step in this direction is FIPS 201-2 (Personal Identity Verification
[PIV] of Federal Employees and Contractors, June 2012). The standard defines a
reliable, government-wide PIV system for use in applications such as access to fed-
erally controlled facilities and information systems. The standard specifies a PIV
system within which common identification credentials can be created and later
used to verify a claimed identity. The standard also identifies Federal government-
wide requirements for security levels that are dependent on risks to the facility or
information being protected. The standard applies to private-sector contractors as
well, and serves as a useful guideline for any organization.
PIV System Model
Figure 15.8 illustrates the major components of FIPS 201-2 compliant systems. The
PIV front end defines the physical interface to a user who is requesting access to a
facility, which could be either physical access to a protected physical area or logical
access to an information system. The PIV front-end subsystem supports up to three-
factor authentication; the number of factors used depends on the level of security
required. The front end makes use of a smart card, known as a PIV card, which
is a dual-interface contact and contactless card. The card holds a cardholder pho-
tograph, X.509 certificates, cryptographic keys, biometric data, and a cardholder
unique identifier (CHUID). Certain cardholder information may be read-protected
and require a personal identification number (PIN) for read access by the card
reader. The biometric reader, in the current version of the standard, is a fingerprint
reader or an iris scanner.
The standard defines three assurance levels for verification of the card and the
encoded data stored on the card, which in turn leads to verifying the authenticity of
the person holding the credential. A level of some confidence corresponds to use of
the card reader and PIN. A level of high confidence adds a biometric comparison
of a fingerprint captured and encoded on the card during the card-issuing process
and a fingerprint scanned at the physical access point. A very high confidence level
requires that the process just described is completed at a control point attended by
an official observer.
The other major component of the PIV system is the PIV card issuance and
management subsystem. This subsystem includes the components responsible for
identity proofing and registration, card and key issuance and management, and the
various repositories and services (e.g., public key infrastructure [PKI] directory,
certificate status servers) required as part of the verification infrastructure.

The PIV system interacts with a relying subsystem, which includes compo-
nents responsible for determining a particular PIV cardholder’s access to a physical
or logical resource. FIPS 201-2 standardizes data formats and protocols for interac-
tion between the PIV system and the relying system.
Unlike the typical card number/facility code encoded on most access control
cards, the FIPS 201 CHUID takes authentication to a new level, through the use of
an expiration date (a required CHUID data field) and an optional CHUID digital
signature. A digital signature can be checked to ensure that the CHUID recorded
on the card was digitally signed by a trusted source and that the CHUID data have
not been altered since the card was signed. The CHUID expiration date can be
checked to verify that the card has not expired. This is independent from whatever
expiration date is associated with cardholder privileges. Reading and verifying the
CHUID alone provides only some assurance of identity because it authenticates
the card data, not the cardholder. The PIN and biometric factors provide identity
verification of the individual.
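The two CHUID checks described above can be sketched in a few lines. The field names and the byte encoding are assumptions made for illustration (FIPS 201 defines the real data layout), and the signature check is passed in as a function so that no particular cryptographic library is implied.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional


@dataclass
class Chuid:
    # Hypothetical field names; the standard defines the actual encoding
    fasc_n: str
    guid: str
    expiration: date                    # required CHUID data field
    signature: Optional[bytes] = None   # the CHUID signature is optional

    def signed_bytes(self) -> bytes:
        """Bytes covered by the signature (illustrative encoding only)."""
        return f"{self.fasc_n}|{self.guid}|{self.expiration.isoformat()}".encode()


def check_chuid(chuid: Chuid, today: date,
                verify_signature: Callable[[bytes, bytes], bool]) -> List[str]:
    """Return a list of problems; an empty list means the card data checks out."""
    problems = []
    if chuid.expiration < today:
        problems.append("card expired")
    if chuid.signature is not None and not verify_signature(
            chuid.signed_bytes(), chuid.signature):
        problems.append("signature invalid or data altered")
    return problems
```

Even when `check_chuid` reports no problems, only the card data has been authenticated; the PIN and biometric factors are still needed to verify the person presenting the card.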
Figure 15.8 FIPS 201 PIV System Model (PIV front end: card reader, PIN input, PIV card; PIV card issuance and management: identity profiling and registration, card issuance and maintenance, PKI directory and certificate status; physical and logical access control, each with I&A and authorization; I&A = Identification and Authentication)
PIV Documentation
The PIV specification is quite complex, and NIST has issued a number of
documents that cover a broad range of PIV topics. These are as follows:

■ FIPS 201-2—Personal Identity Verification (PIV) of Federal Employees
and Contractors: Specifies the physical card characteristics, storage media,
and data elements that make up the identity credentials resident on the PIV card.
■ SP 800-73-3—Interfaces for Personal Identity Verification: Specifies the in-
terfaces and card architecture for storing and retrieving identity credentials
from a smart card, and provides guidelines for the use of authentication mech-
anisms and protocols.
■ SP 800-76-2—Biometric Data Specification for Personal Identity Verification:
Describes technical acquisition and formatting specifications for the biometric
credentials of the PIV system.
■ SP 800-78-3—Cryptographic Algorithms and Key Sizes for Personal Identity
Verification: Identifies acceptable symmetric and asymmetric encryption
algorithms, digital signature algorithms, and message digest algorithms, and
specifies mechanisms to identify the algorithms associated with PIV keys or digital signatures.
■ SP 800-104—A Scheme for PIV Visual Card Topography: Provides additional
recommendations on the PIV card color-coding for designating employee
■ SP 800-116—A Recommendation for the Use of PIV Credentials in Physical
Access Control Systems (PACS): Describes a risk-based approach for select-
ing appropriate PIV authentication mechanisms to manage physical access to
Federal government facilities and assets.
■ SP 800-79-1—Guidelines for the Accreditation of Personal Identity
Verification Card Issuers: Provides guidelines for accrediting the reliability
of issuers of PIV cards that collect, store, and disseminate personal identity
credentials and issue smart cards.
■ SP 800-96—PIV Card to Reader Interoperability Guidelines: Provides re-
quirements that facilitate interoperability between any card and any reader.
In addition there are other documents that deal with conformance testing and
codes for identifiers.
PIV Credentials and Keys
The PIV card contains a number of mandatory and optional data elements that
serve as identity credentials with varying levels of strength and assurance. These
credentials are used singly or in sets to authenticate the holder of the PIV card to
achieve the level of assurance required for a particular activity or transaction. The
mandatory data elements are the following:
■ Personal Identification Number (PIN): Required to activate the card for privi-
leged operation.
■ Cardholder Unique Identifier (CHUID): Includes the Federal Agency Smart
Credential Number (FASC-N) and the Global Unique Identification Number
(GUID), which uniquely identify the card and the cardholder.

■ PIV Authentication Key: Asymmetric key pair and corresponding certificate
for user authentication.
■ Two fingerprint templates: For biometric authentication.
■ Electronic facial image: For biometric authentication.
■ Asymmetric Card Authentication Key: Asymmetric key pair and correspond-
ing certificate used for card authentication.
Optional elements include the following:
■ Digital Signature Key: Asymmetric key pair and corresponding certificate that
supports document signing and signing of data elements such as the CHUID.
■ Key Management Key: Asymmetric key pair and corresponding certificate
supporting key establishment and transport.
■ Symmetric Card Authentication Key: For supporting physical access applications.
■ PIV Card Application Administration Key: Symmetric key associated with the
card management system.
■ One or two iris images: For biometric authentication.
Table 15.5 lists the algorithm and key size requirements for PIV key types.

Table 15.5 PIV Algorithms and Key Sizes
PIV Key Type | Algorithm | Key Sizes (bits) | Application
PIV Authentication Key | RSA | 2048 | Supports card and cardholder authentication for an interoperable environment
PIV Authentication Key | ECDSA | 256 | Supports card and cardholder authentication for an interoperable environment
Card Authentication Key | 3TDEA | 168 | Supports card authentication for physical access
Card Authentication Key | AES | 128, 192, or 256 | Supports card authentication for physical access
Card Authentication Key | RSA | 2048 | Supports card authentication for an interoperable environment
Card Authentication Key | ECDSA | 256 | Supports card authentication for an interoperable environment
Digital Signature Key | RSA | 2048 or 3072 | Supports document signing and nonce signing
Digital Signature Key | ECDSA | 256 or 384 | Supports document signing and nonce signing
Key Management Key | RSA | 2048 | Supports key establishment and transport
Key Management Key | ECDH | 256 or 384 | Supports key establishment and transport

Using the electronic credentials resident on a PIV card, the card supports the
following authentication mechanisms:
■ CHUID: The cardholder is authenticated using the signed CHUID data
element on the card. The PIN is not required. This mechanism is useful in
environments where a low level of assurance is acceptable and rapid contactless
authentication is necessary.

■ Card Authentication Key: The PIV card is authenticated using the Card
Authentication Key in a challenge response protocol. The PIN is not required.
This mechanism allows contact (via card reader) or contactless (via radio
waves) authentication of the PIV card without the holder’s active participa-
tion, and provides a low level of assurance.
■ BIO: The cardholder is authenticated by matching his or her fingerprint
sample(s) to the signed biometric data element in an environment without a
human attendant in view. The PIN is required to activate the card. This
mechanism achieves a high level of assurance and requires the cardholder’s active
participation in submitting the PIN as well as the biometric sample.
■ BIO-A: The cardholder is authenticated by matching his or her fingerprint
sample(s) to the signed biometric data element in an environment with a
human attendant in view. The PIN is required to activate the card. This
mechanism achieves a very high level of assurance when coupled with full trust
validation of the biometric template retrieved from the card, and requires the
cardholder’s active participation in submitting the PIN as well as the biometric sample.
■ PKI: The cardholder is authenticated by demonstrating control of the PIV au-
thentication private key in a challenge response protocol that can be validated
using the PIV authentication certificate. The PIN is required to activate the
card. This mechanism achieves a very high level of identity assurance and re-
quires the cardholder’s knowledge of the PIN.
In each of the above use cases, except the symmetric Card Authentication Key
use case, the source and the integrity of the corresponding PIV credential are vali-
dated by verifying the digital signature on the credential, with the signature being
provided by a trusted entity.
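The properties of the five mechanisms can be collected into a small lookup table, which makes it easy to see which mechanisms satisfy a given requirement. The sketch below only restates the PIN requirement and assurance level given for each mechanism in the list above; the names `LEVELS`, `MECHANISMS`, and `mechanisms_for` are illustrative.

```python
# Assurance levels in increasing order, using the wording of the list above
LEVELS = ["low", "high", "very high"]

# PIN requirement and assurance level for each PIV authentication mechanism
MECHANISMS = {
    "CHUID": {"pin_required": False, "assurance": "low"},
    "Card Authentication Key": {"pin_required": False, "assurance": "low"},
    "BIO": {"pin_required": True, "assurance": "high"},
    "BIO-A": {"pin_required": True, "assurance": "very high"},
    "PKI": {"pin_required": True, "assurance": "very high"},
}


def mechanisms_for(required_level):
    """Return the mechanisms whose assurance meets or exceeds required_level."""
    threshold = LEVELS.index(required_level)
    return [name for name, props in MECHANISMS.items()
            if LEVELS.index(props["assurance"]) >= threshold]


if __name__ == "__main__":
    print(mechanisms_for("high"))
```

Note that every mechanism offering more than a low level of assurance also requires the PIN, which is what ties the result to the cardholder rather than to the card alone.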
A variety of protocols can be constructed for each of these authentication
types. SP 800-78-3 gives examples for each type. Figure 15.9 illustrates an authenti-
cation scenario that includes the use of the PIV Authentication Key. This scenario
provides a high level of assurance. This scenario would be appropriate for authenti-
cation of a user who possesses a PIV card and seeks access to a computer resource.
The computer, designated local system in the figure, includes PIV application soft-
ware and communicates to the card via an application program interface that en-
ables the use of relatively high-level procedure calls. These high-level commands
are converted into PIV commands that are issued to the card through a physical
interface via a card reader or via a wireless interface. In either case, SP 800-73 refers
to the card command interface as the PIV card edge.
The process begins when the local system detects the card either through an
attached card reader or wirelessly. It then selects an application on the card for au-
thentication. The local system then requests the public-key certificate for the card’s
PIV Authentication Key. If the certificate is valid (i.e., has a valid signature, has not
expired or been revoked), authentication continues. Otherwise the card is rejected.
The next step is for the local system to request that the cardholder enter the PIN
for the card. If the submitted PIN matches the PIN stored on the card, the card
returns a positive acknowledgment; otherwise the card returns a failure message.

The local system either continues or rejects the card accordingly. The next phase is
a challenge-response protocol. The local system sends a nonce to be signed by the
PIV, and the PIV returns the signature. The local system uses the PIV authentica-
tion public key to verify the signature. If the signature is valid, the cardholder is ac-
cepted as being identified. Otherwise the local system rejects the card.
The scenario of Figure 15.9 accomplishes three types of authentication. The
combination of possession of the card and knowledge of the PIN authenticates
the cardholder. The PIV Authentication Key certificate validates the card’s
credentials. The challenge-response protocol authenticates the card.
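The PIN check and the challenge-response phase can be sketched as below. This is a toy under loud assumptions: it uses textbook RSA with small, well-known primes and no padding, which is insecure and is not what a PIV card does (see Table 15.5 for the real algorithms and key sizes). Certificate validation is omitted, and the class and function names are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key from two Mersenne primes. Illustration only: far too small,
# and unpadded, to be secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)


def digest_int(message: bytes) -> int:
    """Hash the message and reduce it into the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


class ToyPivCard:
    def __init__(self, pin: str):
        self._pin = pin
        self._d = D
        self.public_key = (N, E)  # stands in for the PIV Authentication certificate
        self._unlocked = False

    def verify_pin(self, pin: str) -> bool:
        # The PIN is compared on the card; only success or failure is returned
        self._unlocked = secrets.compare_digest(pin.encode(), self._pin.encode())
        return self._unlocked

    def sign_nonce(self, nonce: bytes) -> int:
        if not self._unlocked:
            raise PermissionError("PIN not verified")
        return pow(digest_int(nonce), self._d, N)


def authenticate(card: ToyPivCard, pin: str) -> bool:
    n, e = card.public_key               # certificate validation omitted here
    if not card.verify_pin(pin):         # cardholder check via the PIN
        return False
    nonce = secrets.token_bytes(16)      # fresh challenge defeats replay
    signature = card.sign_nonce(nonce)
    return pow(signature, e, n) == digest_int(nonce)  # card holds the private key
```

Because the nonce is freshly generated for each run, a signature captured from an earlier session is of no use to an attacker.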
Figure 15.9 Authentication Using PIV Authentication Key (message flow among the PIV application on the local system, the API on the local system, and the PIV card edge)
CardV = Card validation; CredV = Credential validation; HolderV = Cardholder validation; FASC-N = Federal Agency Smart Credential Number

Key Terms
authentication server
credential service provider
federated identity
identity management
Kerberos realm
mutual authentication
one-way authentication
personal identity verification
registration authority (RA)
relying party (RP)
replay attack
suppress-replay attack
ticket-granting server (TGS)
Review Questions
15.1 What are the steps involved in an authentication process?
15.2 List three general approaches to dealing with replay attacks.
15.3 What is a suppress-replay attack?
15.4 What problem was Kerberos designed to address?
15.5 What are three threats associated with user authentication over a network or
15.6 List three approaches to secure user authentication in a distributed environment.
15.7 What four requirements were defined for Kerberos?
15.8 What entities constitute a full-service Kerberos environment?
15.9 In the context of Kerberos, what is a realm?
15.10 What are the mandatory elements to authenticate a PIV card holder?
Problems
15.1 In Section 15.4, we outlined the public-key scheme proposed in [WOO92a] for the
distribution of secret keys. The revised version includes IDA in steps 5 and 6. What
attack, specifically, is countered by this revision?
15.2 The protocol referred to in Problem 15.1 can be reduced from seven steps to five,
having the following sequence:
a. A → B:
b. A → KDC:
c. KDC → B:
d. B → A:
e. A → B:
Show the message transmitted at each step. Hint: The final message in this protocol is
the same as the final message in the original protocol.
15.3 Reference the suppress-replay attack described in Section 15.2 to answer the
a. Give an example of an attack when a party’s clock is ahead of that of the KDC.
b. Give an example of an attack when a party’s clock is ahead of that of another

15.4 There are three typical ways to use nonces as challenges. Suppose Na is a nonce gen-
erated by A, A and B share key K, and f() is a function (such as an increment). The
three usages are
Usage 1                 Usage 2                 Usage 3
(1) A → B: Na           (1) A → B: E(K, Na)     (1) A → B: E(K, Na)
(2) B → A: E(K, Na)     (2) B → A: Na           (2) B → A: E(K, f(Na))
Describe situations for which each usage is appropriate.
15.5 Show that a random error in one block of ciphertext is propagated to all subsequent
blocks of plaintext in PCBC mode (See Figure T.2 in Appendix T).
15.6 Suppose that, in PCBC mode, blocks Ci and Ci+1 are interchanged during
transmission. Show that this affects only the decrypted blocks Pi and Pi+1 but not subsequent
15.7 In addition to providing a standard for public-key certificate formats, X.509 specifies
an authentication protocol. The original version of X.509 contains a security flaw.
The essence of the protocol is as follows.
A → B: A {tA, rA, IDB}
B → A: B {tB, rB, IDA, rA}
A → B: A {rB}
where tA and tB are timestamps, rA and rB are nonces and the notation X{Y} indicates
that the message Y is transmitted, encrypted, and signed by X.
The text of X.509 states that checking timestamps tA and tB is optional for
three-way authentication. But consider the following example: Suppose A and B
have used the preceding protocol on some previous occasion, and that opponent C
has intercepted the preceding three messages. In addition, suppose that timestamps
are not used and are all set to 0. Finally, suppose C wishes to impersonate A to B. C
initially sends the first captured message to B:
C → B: A {0, rA, IDB}
B responds, thinking it is talking to A but is actually talking to C:
B → C: B {0, r′B, IDA, rA}
C meanwhile causes A to initiate authentication with C by some means. As a result,
A sends C the following:
A → C: A {0, r′A, IDC}
C responds to A using the same nonce provided to C by B:
C → A: C {0, r′B, IDA, r′A}
A responds with
A → C: A {r′B}
This is exactly what C needs to convince B that it is talking to A, so C now repeats the
incoming message back out to B.
C → B: A {r′B}
So B will believe it is talking to A whereas it is actually talking to C. Suggest a simple
solution to this problem that does not involve the use of timestamps.
15.8 Consider a one-way authentication technique based on asymmetric encryption:
B → A: R1
A → B: E(PRa, R1)
a. Explain the protocol.
b. What type of attack is this protocol susceptible to?
15.9 Consider a one-way authentication technique based on asymmetric encryption:
a. Explain the protocol.
b. What type of attack is this protocol susceptible to?
15.10 In Kerberos, when Bob receives a Ticket from Alice, how does he know it is not
15.11 In Kerberos, how does Bob know that the received token is not corresponding to
15.12 In Kerberos, how does Alice know that a reply to an earlier message is from Bob?
15.13 In Kerberos, what does the Ticket contain that allows Alice and Bob to talk securely?


Network Access Control and Cloud Security

16.1 Network Access Control
Elements of a Network Access Control System
Network Access Enforcement Methods
16.2 Extensible Authentication Protocol
Authentication Methods
EAP Exchanges
16.3 IEEE 802.1X Port-Based Network Access Control
16.4 Cloud Computing
Cloud Computing Elements
Cloud Computing Reference Architecture
16.5 Cloud Security Risks and Countermeasures
16.6 Data Protection in the Cloud
16.7 Cloud Security as a Service
16.8 Addressing Cloud Computing Security Concerns
16.9 Key Terms, Review Questions, and Problems

After studying this chapter, you should be able to:
◆ Discuss the principal elements of a network access control system.
◆ Discuss the principal network access enforcement methods.
◆ Present an overview of the Extensible Authentication Protocol.
◆ Understand the operation and role of the IEEE 802.1X Port-Based
Network Access Control mechanism.
◆ Present an overview of cloud computing concepts.
◆ Understand the unique security issues related to cloud computing.

This chapter begins our discussion of network security, focusing on two key topics:
network access control and cloud security. We begin with an overview of network
access control systems, summarizing the principal elements and techniques involved
in such a system. Next, we discuss the Extensible Authentication Protocol and IEEE
802.1X, two widely implemented standards that are the foundation of many network
access control systems.
The remainder of the chapter deals with cloud security. We begin with an
overview of cloud computing, and follow this with a discussion of cloud security
Network access control (NAC) is an umbrella term for managing access to a
network. NAC authenticates users logging into the network and determines what
data they can access and actions they can perform. NAC also examines the health of
the user’s computer or mobile device (the endpoints).
Elements of a Network Access Control System
NAC systems deal with three categories of components:
■ Access requestor (AR): The AR is the node that is attempting to access the
network and may be any device that is managed by the NAC system, including
workstations, servers, printers, cameras, and other IP-enabled devices. ARs are
also referred to as supplicants, or simply, clients.
■ Policy server: Based on the AR’s posture and an enterprise’s defined policy,
the policy server determines what access should be granted. The policy server
often relies on backend systems, including antivirus, patch management, or a
user directory, to help determine the host’s condition.
■ Network access server (NAS): The NAS functions as an access control point
for users in remote locations connecting to an enterprise’s internal network.
Also called a media gateway, a remote access server (RAS), or a policy server,
an NAS may include its own authentication services or rely on a separate
authentication service from the policy server.
Figure 16.1 is a generic network access diagram. A variety of different ARs
seek access to an enterprise network by applying to some type of NAS. The first
step is generally to authenticate the AR. Authentication typically involves some
sort of secure protocol and the use of cryptographic keys. Authentication may be
performed by the NAS, or the NAS may mediate the authentication process. In the
latter case, authentication takes place between the supplicant and an authentication
server that is part of the policy server or that is accessed by the policy server.
The authentication process serves a number of purposes. It verifies a
supplicant’s claimed identity, which enables the policy server to determine what access
privileges, if any, the AR may have. The authentication exchange may result in the
establishment of session keys to enable future secure communication between the
supplicant and resources on the enterprise network.
Figure 16.1 Network Access Control Context
Typically, the policy server or a supporting server will perform checks on the
AR to determine if it should be permitted interactive remote access connectivity.
These checks—sometimes called health, suitability, screening, or assessment
checks—require software on the user’s system to verify compliance with certain
requirements from the organization’s secure configuration baseline. For example,
the user’s antimalware software must be up-to-date, the operating system must
be fully patched, and the remote computer must be owned and controlled by the
organization. These checks should be performed before granting the AR access to
the enterprise network. Based on the results of these checks, the organization can
determine whether the remote computer should be permitted to use interactive
remote access. If the user has acceptable authorization credentials but the remote
computer does not pass the health check, the user and remote computer should be
denied network access or have limited access to a quarantine network so that autho-
rized personnel can fix the security deficiencies. Figure 16.1 indicates that the quar-
antine portion of the enterprise network consists of the policy server and related
AR suitability servers. There may also be application servers that do not require the
normal security threshold be met.
Once an AR has been authenticated and cleared for a certain level of access
to the enterprise network, the NAS can enable the AR to interact with resources in
the enterprise network. The NAS may mediate every exchange to enforce a security
policy for this AR, or may use other methods to limit the privileges of the AR.
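The admission logic described above can be summarized in a short sketch. The three posture attributes come from the examples in the text; the class name, function name, and return strings are hypothetical, and a real policy server would evaluate a much richer configuration baseline.

```python
from dataclasses import dataclass


@dataclass
class Posture:
    # Health attributes taken from the examples in the text
    antimalware_current: bool
    os_fully_patched: bool
    organization_owned: bool

    def healthy(self) -> bool:
        return (self.antimalware_current
                and self.os_fully_patched
                and self.organization_owned)


def access_decision(authenticated: bool, posture: Posture,
                    quarantine_available: bool = True) -> str:
    """Decide what an access requestor gets: 'allow', 'quarantine', or 'deny'."""
    if not authenticated:
        return "deny"
    if posture.healthy():
        return "allow"
    # Valid credentials but a failed health check: limit access so that
    # authorized personnel can fix the deficiencies, or deny outright
    return "quarantine" if quarantine_available else "deny"
```

The ordering matters: the health check is consulted only after authentication succeeds, and a failed health check never yields full access regardless of the user's credentials.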
Network Access Enforcement Methods
Enforcement methods are the actions that are applied to ARs to regulate access
to the enterprise network. Many vendors support multiple enforcement methods
simultaneously, allowing the customer to tailor the configuration by using one or a
combination of methods. The following are common NAC enforcement methods.
■ IEEE 802.1X: This is a link layer protocol that enforces authorization before
a port is assigned an IP address. IEEE 802.1X makes use of the Extensible
Authentication Protocol for the authentication process. Sections 16.2 and 16.3
cover the Extensible Authentication Protocol and IEEE 802.1X, respectively.
■ Virtual local area networks (VLANs): In this approach, the enterprise net-
work, consisting of an interconnected set of LANs, is segmented logically into
a number of virtual LANs.1 The NAC system decides to which of the network’s
VLANs it will direct an AR, based on whether the device needs security reme-
diation, Internet access only, or some level of network access to enterprise
resources. VLANs can be created dynamically and VLAN membership, of
both enterprise servers and ARs, may overlap. That is, an enterprise server or
an AR may belong to more than one VLAN.
1 A VLAN is a logical subgroup within a LAN that is created via software rather than manually moving
cables in the wiring closet. It combines user stations and network devices into a single unit regardless
of the physical LAN segment they are attached to and allows traffic to flow more efficiently within
populations of mutual interest. VLANs are implemented in port-switching hubs and LAN switches.

■ Firewall: A firewall provides a form of NAC by allowing or denying network
traffic between an enterprise host and an external user. Firewalls are discussed
in Chapter 23.
■ DHCP management: The Dynamic Host Configuration Protocol (DHCP) is
an Internet protocol that enables dynamic allocation of IP addresses to hosts.
A DHCP server intercepts DHCP requests and assigns IP addresses instead.
Thus, NAC enforcement occurs at the IP layer based on subnet and IP
assignment. A DHCP server is easy to install and configure, but is subject to various
forms of IP spoofing, providing limited security.
There are a number of other enforcement methods available from vendors.
The ones in the preceding list are perhaps the most common, and IEEE 802.1X is by
far the most commonly implemented solution.
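As one illustration of an enforcement method, the VLAN approach in the list above amounts to a small mapping from the state of the access requestor to a VLAN. The VLAN IDs and the function below are hypothetical; actual values depend on the site's switch configuration and NAC product.

```python
# Hypothetical VLAN IDs chosen for illustration only
VLAN_REMEDIATION = 10
VLAN_INTERNET_ONLY = 20
VLAN_ENTERPRISE = 30


def assign_vlan(needs_remediation: bool, internet_only: bool) -> int:
    """Pick the VLAN to which the NAC system directs an access requestor."""
    if needs_remediation:
        return VLAN_REMEDIATION      # security deficiencies are fixed first
    if internet_only:
        return VLAN_INTERNET_ONLY    # no access to enterprise resources
    return VLAN_ENTERPRISE           # some level of enterprise access
```

Remediation is tested first so that a device needing security fixes is isolated even if it would otherwise qualify for broader access.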
The Extensible Authentication Protocol (EAP), defined in RFC 3748, acts as a
framework for network access and authentication protocols. EAP provides a set of
protocol messages that can encapsulate various authentication methods to be used
between a client and an authentication server. EAP can operate over a variety of
network and link level facilities, including point-to-point links, LANs, and other
networks, and can accommodate the authentication needs of the various links and
networks. Figure 16.2 illustrates the protocol layers that form the context for EAP.
Authentication Methods
EAP supports multiple authentication methods. This is what is meant by referring
to EAP as extensible. EAP provides a generic transport service for the exchange of
authentication information between a client system and an authentication server.
The basic EAP transport service is extended by using a specific authentication proto-
col, or method, that is installed in both the EAP client and the authentication server.
Figure 16.2 EAP Layered Context (labels: Extensible Authentication Protocol (EAP); IEEE 802.1X; data link: PPP, 802.3 Ethernet, WLAN, other)

Numerous methods have been defined to work over EAP. The following are
commonly supported EAP methods:
■ EAP-TLS (EAP Transport Layer Security): EAP-TLS (RFC 5216) defines
how the TLS protocol (described in Chapter 17) can be encapsulated in EAP
messages. EAP-TLS uses the handshake protocol in TLS, not its encryption
method. Client and server authenticate each other using digital certificates.
The client generates a random pre-master secret, encrypts it with the server’s
public key, and sends it to the server. Both client and server use the
pre-master secret to generate the same secret key.
■ EAP-TTLS (EAP Tunneled TLS): EAP-TTLS is like EAP-TLS, except only
the server has a certificate to authenticate itself to the client first. As in EAP-
TLS, a secure connection (the “tunnel”) is established with secret keys, but
that connection is used to continue the authentication process by authenti-
cating the client and possibly the server again using any EAP method or
legacy method such as PAP (Password Authentication Protocol) and CHAP
(Challenge-Handshake Authentication Protocol). EAP-TTLS is defined in
RFC 5281.
■ EAP-GPSK (EAP Generalized Pre-Shared Key): EAP-GPSK, defined in
RFC 5433, is an EAP method for mutual authentication and session key deri-
vation using a Pre-Shared Key (PSK). EAP-GPSK specifies an EAP method
based on pre-shared keys and employs secret key-based cryptographic algo-
rithms. Hence, this method is efficient in terms of message flows and com-
putational costs, but requires the existence of pre-shared keys between each
peer and EAP server. The set up of these pairwise secret keys is part of the
peer registration, and thus, must satisfy the system preconditions. It provides
a protected communication channel when mutual authentication is success-
ful for both parties to communicate over and is designed for authentication
over insecure networks such as IEEE 802.11. EAP-GPSK does not require
any public-key cryptography. The EAP method protocol exchange is done in a
minimum of four messages.
■ EAP-IKEv2: It is based on the Internet Key Exchange protocol version 2
(IKEv2), which is described in Chapter 20. It supports mutual authentication
and session key establishment using a variety of methods. EAP-IKEv2 is defined
in RFC 5106.
EAP Exchanges
Whatever method is used for authentication, the authentication information and
authentication protocol information are carried in EAP messages.
RFC 3748 defines the goal of the exchange of EAP messages to be successful
authentication. In the context of RFC 3748, successful authentication is an exchange
of EAP messages, as a result of which the authenticator decides to allow access
by the peer, and the peer decides to use this access. The authenticator’s decision
typically involves both authentication and authorization aspects; the peer may
successfully authenticate to the authenticator, but access may be denied by the
authenticator due to policy reasons.

Figure 16.3 indicates a typical arrangement in which EAP is used. The follow-
ing components are involved:
■ EAP peer: Client computer that is attempting to access a network.
■ EAP authenticator: An access point or NAS that requires EAP authentication
prior to granting access to a network.
■ Authentication server: A server computer that negotiates the use of a specific
EAP method with an EAP peer, validates the EAP peer’s credentials, and
authorizes access to the network. Typically, the authentication server is a
Remote Authentication Dial-In User Service (RADIUS) server.
The authentication server functions as a backend server that can authenti-
cate peers as a service to a number of EAP authenticators. The EAP authentica-
tor then makes the decision of whether to grant access. This is referred to as the
EAP  pass-through mode. Less commonly, the authenticator takes over the role of
the EAP server; that is, only two parties are involved in the EAP execution.
As a first step, a lower-level protocol, such as PPP (point-to-point protocol)
or IEEE 802.1X, is used to connect to the EAP authenticator. The software entity
in the EAP peer that operates at this level is referred to as the supplicant. EAP
messages containing the appropriate information for a chosen EAP method are
then exchanged between the EAP peer and the authentication server.
EAP messages may include the following fields:
■ Code: Identifies the Type of EAP message. The codes are Request (1),
Response (2), Success (3), and Failure (4).
■ Identifier: Used to match Responses with Requests.
■ Length: Indicates the length, in octets, of the EAP message, including the
Code, Identifier, Length, and Data fields.
Figure 16.3 EAP Protocol Exchanges (EAP peer, EAP authenticator, and authentication server, each with an EAP layer over a lower layer)

■ Data: Contains information related to authentication. Typically, the Data field
consists of a Type subfield, indicating the type of data carried, and a Type-Data
field.
The Success and Failure messages do not include a Data field.
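The message fields listed above can be sketched in code. The following is a minimal illustration, not a complete implementation: it assumes the RFC 3748 layout of one octet each for Code and Identifier, two octets for Length, and a one-octet Type subfield. The class name EapPacket and its attribute names are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Code values listed in the text.
REQUEST, RESPONSE, SUCCESS, FAILURE = 1, 2, 3, 4
TYPE_IDENTITY = 1  # Type value for Identity (RFC 3748)


@dataclass
class EapPacket:
    """Hypothetical container for one EAP message."""
    code: int
    identifier: int
    eap_type: Optional[int] = None  # absent in Success and Failure
    type_data: bytes = b""

    def encode(self) -> bytes:
        data = b""
        if self.code in (REQUEST, RESPONSE):
            if self.eap_type is None:
                raise ValueError("Request/Response needs a Type")
            # Data field = Type subfield followed by Type-Data.
            data = bytes([self.eap_type]) + self.type_data
        # Length covers Code, Identifier, Length, and Data.
        length = 4 + len(data)
        return struct.pack("!BBH", self.code, self.identifier, length) + data

    @classmethod
    def decode(cls, raw: bytes) -> "EapPacket":
        if len(raw) < 4:
            raise ValueError("EAP header is four octets")
        code, identifier, length = struct.unpack("!BBH", raw[:4])
        if length < 4 or length > len(raw):
            raise ValueError("bad Length field")
        data = raw[4:length]
        if code in (REQUEST, RESPONSE):
            if not data:
                raise ValueError("missing Type subfield")
            return cls(code, identifier, data[0], data[1:])
        return cls(code, identifier)
```

For example, EapPacket(REQUEST, 7, TYPE_IDENTITY).encode() yields the five octets 01 07 00 05 01, and a Success message encodes to the four-octet header alone.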
The EAP authentication exchange proceeds as follows. After a lower-level
exchange that established the need for an EAP exchange, the authenticator sends a
Request to the peer to request an identity, and the peer sends a Response with the
identity information. This is followed by a sequence of Requests by the authentica-
tor and Responses by the peer for the exchange of authentication information. The
information exchanged and the number of Request–Response exchanges needed
depend on the authentication method. The conversation continues until either
(1) the authenticator determines that it cannot authenticate the peer and transmits
an EAP Failure or (2) the authenticator determines that successful authentication
has occurred and transmits an EAP Success.
Figure 16.4 gives an example of an EAP exchange. Not shown in the figure is a
message or signal sent from the EAP peer to the authenticator using some protocol
other than EAP and requesting an EAP exchange to grant network access. One
protocol used for this purpose is IEEE 802.1X, discussed in the next section. The
first pair of EAP Request and Response messages is of Type identity, in which the
authenticator requests the peer’s identity, and the peer returns its claimed identity
in the Response message. This Response is passed through the authenticator to the
authentication server. Subsequent EAP messages are exchanged between the peer
and the authentication server.
Figure 16.4 EAP Message Flow in Pass-Through Mode (EAP peer, EAP authenticator, authentication server)

Upon receiving the identity Response message from the peer, the server
selects an EAP method and sends the first EAP message with a Type field related
to an authentication method. If the peer supports and accepts the selected EAP
method, it replies with the corresponding Response message of the same type.
Otherwise, the peer sends a NAK, and the EAP server either selects another EAP
method or aborts the EAP execution with a failure message. The selected EAP
method determines the number of Request/Response pairs. During the exchange
the appropriate authentication information, including key material, is exchanged.
The exchange ends when the server determines that authentication has succeeded
or that no further attempt can be made and authentication has failed.
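The method selection step just described can be sketched as a small simulation. This is a simplified illustration under stated assumptions: methods are represented by name rather than by Type value, the NAK carries no alternative proposals, and the function name negotiate_method is hypothetical.

```python
from typing import List, Optional, Sequence, Set, Tuple


def negotiate_method(
    server_preferences: Sequence[str], peer_supported: Set[str]
) -> Tuple[Optional[str], List[str]]:
    """Return the agreed method (or None) and a log of the messages sent."""
    log: List[str] = []
    for method in server_preferences:
        # The server proposes a method by sending a Request of that Type.
        log.append(f"server -> peer: Request/{method}")
        if method in peer_supported:
            # The peer accepts by replying with a Response of the same Type.
            log.append(f"peer -> server: Response/{method}")
            return method, log
        # The peer rejects the proposal; the server tries its next choice.
        log.append("peer -> server: Response/NAK")
    # No method left to propose: the server aborts with a failure message.
    log.append("server -> peer: Failure")
    return None, log
```

With server preferences EAP-TLS then EAP-TTLS and a peer that supports only EAP-TTLS, the sketch records one NAK followed by agreement on EAP-TTLS.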
IEEE 802.1X Port-Based Network Access Control was designed to provide access
control functions for LANs. Table 16.1 briefly defines key terms used in the IEEE
802.1X standard. The terms supplicant, network access point, and authentication
server correspond to the EAP terms peer, authenticator, and authentication server,
respectively.

Table 16.1 Terminology Related to IEEE 802.1X
Authenticator
An entity at one end of a point-to-point LAN segment that facilitates authentication of the entity attached to
the other end of the link.
Authentication exchange
The two-party conversation between systems performing an authentication process.
Authentication process
The cryptographic operations and supporting data frames that perform the actual authentication.
Authentication server (AS)
An entity that provides an authentication service to an authenticator. This service determines, from the
credentials provided by the supplicant, whether the supplicant is authorized to access the services provided by
the system in which the authenticator resides.
Authentication transport
The datagram session that actively transfers the authentication exchange between two systems.
Bridge port
A port of an IEEE 802.1D or 802.1Q bridge.
Edge port
A bridge port attached to a LAN that has no other bridges attached to it.
Network access port
A point of attachment of a system to a LAN. It can be a physical port, such as a single LAN MAC attached to
a physical LAN segment, or a logical port, for example, an IEEE 802.11 association between a station and an
access point.
Port access entity (PAE)
The protocol entity associated with a port. It can support the protocol functionality associated with the
authenticator, the supplicant, or both.
Supplicant
An entity at one end of a point-to-point LAN segment that seeks to be authenticated by an authenticator
attached to the other end of that link.

Until the AS authenticates a supplicant (using an authentication protocol),
the authenticator only passes control and authentication messages between the sup-
plicant and the AS; the 802.1X control channel is unblocked, but the 802.11 data
channel is blocked. Once a supplicant is authenticated and keys are provided, the
authenticator can forward data from the supplicant, subject to predefined access
control limitations for the supplicant to the network. Under these circumstances,
the data channel is unblocked.
As indicated in Figure 16.5, 802.1X uses the concepts of controlled and uncon-
trolled ports. Ports are logical entities defined within the authenticator and refer to
physical network connections. Each logical port is mapped to one of these two types
of physical ports. An uncontrolled port allows the exchange of protocol data units
(PDUs) between the supplicant and the AS, regardless of the authentication state
of the supplicant. A controlled port allows the exchange of PDUs between a sup-
plicant and other systems on the network only if the current state of the supplicant
authorizes such an exchange.
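The distinction between the two port types can be sketched as a frame filter. This is an illustrative model only: it assumes frames are classified by EtherType, with 0x888E identifying EAPOL, and the class name AuthenticatorPort is hypothetical.

```python
from dataclasses import dataclass

EAPOL_ETHERTYPE = 0x888E  # EtherType assigned to EAPOL frames
IPV4_ETHERTYPE = 0x0800   # used below as an example of ordinary data traffic


@dataclass
class AuthenticatorPort:
    """Hypothetical model of one network access port on an authenticator."""
    authorized: bool = False  # becomes True once the AS authenticates the supplicant

    def admits(self, ethertype: int) -> bool:
        if ethertype == EAPOL_ETHERTYPE:
            # Uncontrolled port: authentication traffic always passes.
            return True
        # Controlled port: other traffic passes only when authorized.
        return self.authorized
```

Before authentication, port.admits(EAPOL_ETHERTYPE) is true while port.admits(IPV4_ETHERTYPE) is false. Setting port.authorized = True opens the controlled port to data frames.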
The essential element defined in 802.1X is a protocol known as EAPOL (EAP
over LAN). EAPOL operates at the network layers and makes use of an IEEE 802
LAN, such as Ethernet or Wi-Fi, at the link level. EAPOL enables a supplicant to
communicate with an authenticator and supports the exchange of EAP packets for
authentication.
Figure 16.5 802.1X Access Control

The most common EAPOL packets are listed in Table 16.2. When the
supplicant first connects to the LAN, it does not know the MAC address of the
authenticator. Actually it doesn’t know whether there is an authenticator present
at all. By sending an EAPOL-Start packet to a special group-multicast address
reserved for IEEE 802.1X authenticators, a supplicant can determine whether an
authenticator is present and let it know that the supplicant is ready. In many cases,
the authenticator will already be notified that a new device has connected from some
hardware notification. For example, a hub knows that a cable is plugged in before
the device sends any data. In this case the authenticator may preempt the Start mes-
sage with its own message. In either case the authenticator sends an EAP-Request
Identity message encapsulated in an EAPOL-EAP packet. The EAPOL-EAP is
the EAPOL frame type used for transporting EAP packets.
The authenticator uses the EAPOL-Key packet to send cryptographic keys to the
supplicant once it has decided to admit it to the network. The EAPOL-Logoff packet
type indicates that the supplicant wishes to be disconnected from the network.
The EAPOL packet format includes the following fields:
■ Protocol version: version of EAPOL.
■ Packet type: indicates start, EAP, key, logoff, etc.
■ Packet body length: If the packet includes a body, this field indicates the body
length.
■ Packet body: The payload for this EAPOL packet. An example is an EAP
packet.
Figure 16.6 shows an example of exchange using EAPOL. In Chapter 18, we
examine the use of EAP and EAPOL in the context of IEEE 802.11 wireless LAN
security.

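The EAPOL packet fields listed above can be sketched as follows. The sketch assumes the header layout of IEEE 802.1X (one octet for protocol version, one for packet type, two for body length) and the packet type values 0 through 3. The function names and the default version value of 2 are illustrative choices, not taken from the text.

```python
import struct
from enum import IntEnum
from typing import Tuple


class EapolType(IntEnum):
    EAP = 0     # EAPOL-EAP: carries an encapsulated EAP packet
    START = 1   # EAPOL-Start
    LOGOFF = 2  # EAPOL-Logoff
    KEY = 3     # EAPOL-Key


def build_eapol(packet_type: EapolType, body: bytes = b"", version: int = 2) -> bytes:
    """Prefix the body with protocol version, packet type, and body length."""
    return struct.pack("!BBH", version, packet_type, len(body)) + body


def parse_eapol(frame: bytes) -> Tuple[int, EapolType, bytes]:
    """Split an EAPOL packet into (version, packet type, body)."""
    if len(frame) < 4:
        raise ValueError("EAPOL header is four octets")
    version, packet_type, length = struct.unpack("!BBH", frame[:4])
    if len(frame) < 4 + length:
        raise ValueError("packet body is shorter than its length field")
    return version, EapolType(packet_type), frame[4:4 + length]
```

An EAPOL-Start packet has no body, so build_eapol(EapolType.START) produces only the four-octet header. An EAP-Request/Identity message would travel as the body of an EAPOL-EAP packet.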
There is an increasingly prominent trend in many organizations to move a substan-
tial portion of or even all information technology (IT) operations to an Internet-
connected infrastructure known as enterprise cloud computing. This section provides
an overview of cloud computing. For a more detailed treatment, see [STAL16].
Table 16.2 Common EAPOL Frame Types
Frame Type      Definition
EAPOL-EAP       Contains an encapsulated EAP packet.
EAPOL-Start     A supplicant can issue this packet instead of waiting for a challenge from the authenticator.
EAPOL-Logoff    Used to return the state of the port to unauthorized when the supplicant is finished using the network.
EAPOL-Key       Used to exchange cryptographic keying information.

Cloud Computing Elements
NIST defines cloud computing, in NIST SP-800-145 (The NIST Definition of Cloud
Computing), as follows:
Figure 16.6 Example Timing Diagram for IEEE 802.1X (EAPOL-EAP messages carrying EAP-Request/Identity, EAP-Response/Identity, EAP-Request/Auth, and EAP-Response/Auth among the EAP peer, EAP authenticator, and authentication server)
Cloud computing: A model for enabling ubiquitous, convenient, on-demand net-
work access to a shared pool of configurable computing resources (e.g., networks,
servers, storage, applications, and services) that can be rapidly provisioned and
released with minimal management effort or service provider interaction. This
cloud model promotes availability and is composed of five essential characteris-
tics, three service models, and four deployment models.
The definition refers to various models and characteristics, whose relationship is
illustrated in Figure 16.7. The essential characteristics of cloud computing include
the following:
■ Broad network access: Capabilities are available over the network and ac-
cessed through standard mechanisms that promote use by heterogeneous thin

or thick client platforms (e.g., mobile phones, laptops, and PDAs) as well as
other traditional or cloud-based software services.
■ Rapid elasticity: Cloud computing gives you the ability to expand and reduce
resources according to your specific service requirement. For example, you
may need a large number of server resources for the duration of a specific task.
You can then release these resources upon completion of the task.
■ Measured service: Cloud systems automatically control and optimize resource
use by leveraging a metering capability at some level of abstraction appropri-
ate to the type of service (e.g., storage, processing, bandwidth, and active user
accounts). Resource usage can be monitored, controlled, and reported, provid-
ing transparency for both the provider and consumer of the utilized service.
■ On-demand self-service: A consumer can unilaterally provision computing
capabilities, such as server time and network storage, as needed automati-
cally without requiring human interaction with each service provider. Because
the service is on demand, the resources are not permanent parts of your IT
infrastructure.
■ Resource pooling: The provider’s computing resources are pooled to serve
multiple consumers using a multi-tenant model, with different physical and
virtual resources dynamically assigned and reassigned according to consumer
demand. There is a degree of location independence in that the customer
generally has no control or knowledge of the exact location of the provided
resources, but may be able to specify location at a higher level of abstraction
(e.g., country, state, or data center). Examples of resources include storage,
processing, memory, network bandwidth, and virtual machines. Even private
clouds tend to pool resources between different parts of the same organization.

Figure 16.7 Cloud Computing Elements (essential characteristics; service models: SaaS, PaaS, IaaS; deployment models: public, private, hybrid, community)

NIST defines three service models, which can be viewed as nested service
alternatives:
■ Software as a service (SaaS): The capability provided to the consumer is to use
the provider’s applications running on a cloud infrastructure. The applications
are accessible from various client devices through a thin client interface such as
a Web browser. Instead of obtaining desktop and server licenses for software
products it uses, an enterprise obtains the same functions from the cloud service.
SaaS saves the complexity of software installation, maintenance, upgrades, and
patches. Examples of services at this level are Gmail, Google’s email service,
and customer relationship management services, which help firms keep track of
their customers.
■ Platform as a service (PaaS): The capability provided to the consumer is to
deploy onto the cloud infrastructure consumer-created or acquired applica-
tions created using programming languages and tools supported by the pro-
vider. PaaS often provides middleware-style services such as database and
component services for use by applications. In effect, PaaS is an operating
system in the cloud.
■ Infrastructure as a service (IaaS): The capability provided to the consumer is
to provision processing, storage, networks, and other fundamental computing
resources where the consumer is able to deploy and run arbitrary software,
which can include operating systems and applications. IaaS enables custom-
ers to combine basic computing services, such as number crunching and data
storage, to build highly adaptable computer systems.
NIST defines four deployment models:
■ Public cloud: The cloud infrastructure is made available to the general public
or a large industry group and is owned by an organization selling cloud ser-
vices. The cloud provider is responsible both for the cloud infrastructure and
for the control of data and operations within the cloud.
■ Private cloud: The cloud infrastructure is operated solely for an organization.
It may be managed by the organization or a third party and may exist on prem-
ise or off premise. The cloud provider (CP) is responsible only for the infra-
structure and not for the control.
■ Community cloud: The cloud infrastructure is shared by several organizations
and supports a specific community that has shared concerns (e.g., mission, security
requirements, policy, and compliance considerations). It may be managed by the
organizations or a third party and may exist on premise or off premise.
■ Hybrid cloud: The cloud infrastructure is a composition of two or more clouds
(private, community, or public) that remain unique entities but are bound
together by standardized or proprietary technology that enables data and
application portability (e.g., cloud bursting for load balancing between clouds).

Figure 16.8 illustrates the typical cloud service context. An enterprise main-
tains workstations within an enterprise LAN or set of LANs, which are connected
by a router through a network or the Internet to the cloud service provider. The
cloud service provider maintains a massive collection of servers, which it man-
ages with a variety of network management, redundancy, and security tools. In the
figure, the cloud infrastructure is shown as a collection of blade servers, which is a
common architecture.
Cloud Computing Reference Architecture
NIST SP 500-292 (NIST Cloud Computing Reference Architecture) establishes a
reference architecture, described as follows:
Figure 16.8 Cloud Computing Context
The NIST cloud computing reference architecture focuses on the requirements
of “what” cloud services provide, not a “how to” design solution and implemen-
tation. The reference architecture is intended to facilitate the understanding of
the operational intricacies in cloud computing. It does not represent the system
architecture of a specific cloud computing system; instead it is a tool for describ-
ing, discussing, and developing a system-specific architecture using a common
framework of reference.

NIST developed the reference architecture with the following objectives
in mind:
■ to illustrate and understand the various cloud services in the context of an
overall cloud computing conceptual model
■ to provide a technical reference for consumers to understand, discuss, catego-
rize, and compare cloud services
■ to facilitate the analysis of candidate standards for security, interoperability,
and portability and reference implementations
The reference architecture, depicted in Figure 16.9, defines five major actors
in terms of the roles and responsibilities:
■ Cloud consumer: A person or organization that maintains a business relation-
ship with, and uses service from, cloud providers.
■ Cloud provider: A person, organization, or entity responsible for making a
service available to interested parties.
■ Cloud auditor: A party that can conduct independent assessment of cloud
services, information system operations, performance, and security of the
cloud implementation.
■ Cloud broker: An entity that manages the use, performance, and delivery of
cloud services, and negotiates relationships between CPs and cloud consumers.
■ Cloud carrier: An intermediary that provides connectivity and transport of
cloud services from CPs to cloud consumers.
The roles of the cloud consumer and provider have already been discussed. To
summarize, a cloud provider can provide one or more of the cloud services to meet
IT and business requirements of cloud consumers. For each of the three service
models (SaaS, PaaS, IaaS), the CP provides the storage and processing facilities
needed to support that service model, together with a cloud interface for cloud
service consumers. For SaaS, the CP deploys, configures, maintains, and updates
the operation of the software applications on a cloud infrastructure so that the
services are provisioned at the expected service levels to cloud consumers. The
consumers of SaaS can be organizations that provide their members with access to
software applications, end users who directly use software applications, or software
application administrators who configure applications for end users.

Figure 16.9 NIST Cloud Computing Reference Architecture

For PaaS, the CP manages the computing infrastructure for the platform and
runs the cloud software that provides the components of the platform, such as run-
time software execution stack, databases, and other middleware components. Cloud
consumers of PaaS can employ the tools and execution resources provided by CPs to
develop, test, deploy, and manage the applications hosted in a cloud environment.
For IaaS, the CP acquires the physical computing resources underlying the
service, including the servers, networks, storage, and hosting infrastructure. The
IaaS cloud consumer in turn uses these computing resources, such as a virtual
computer, for their fundamental computing needs.
The cloud carrier is a networking facility that provides connectivity and trans-
port of cloud services between cloud consumers and CPs. Typically, a CP will set up
service level agreements (SLAs) with a cloud carrier to provide services consistent
with the level of SLAs offered to cloud consumers, and may require the cloud carrier
to provide dedicated and secure connections between cloud consumers and CPs.
A cloud broker is useful when cloud services are too complex for a cloud con-
sumer to easily manage. Three areas of support can be offered by a cloud broker:
■ Service intermediation: These are value-added services, such as identity man-
agement, performance reporting, and enhanced security.
■ Service aggregation: The broker combines multiple cloud services to meet
consumer needs not specifically addressed by a single CP, or to optimize per-
formance or minimize cost.
■ Service arbitrage: This is similar to service aggregation except that the services
being aggregated are not fixed. Service arbitrage means a broker has the flexibil-
ity to choose services from multiple agencies. The cloud broker, for example, can
use a credit-scoring service to measure and select an agency with the best score.
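The service arbitrage example can be sketched as a simple selection step. This is only an illustration: the provider names, the scores, and the function name select_provider are hypothetical, and a real broker would weigh many more factors than a single score.

```python
from typing import Mapping


def select_provider(scores: Mapping[str, float]) -> str:
    """Return the name of the candidate with the highest score."""
    if not scores:
        raise ValueError("no candidate providers")
    # Sorting the names first makes tie-breaking deterministic (alphabetical).
    return max(sorted(scores), key=lambda name: scores[name])
```

Because the candidate set is passed in on each call, the broker remains free to change the services it draws on, which is what distinguishes arbitrage from fixed aggregation.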
A cloud auditor can evaluate the services provided by a CP in terms of secu-
rity controls, privacy impact, performance, and so on. The auditor is an independent
entity that can assure that the CP conforms to a set of standards.
In general terms, security controls in cloud computing are similar to the security
controls in any IT environment. However, because of the operational models and
technologies used to enable cloud service, cloud computing may present risks that
are specific to the cloud environment. The essential concept in this regard is that
the enterprise loses a substantial amount of control over resources, services, and
applications but must maintain accountability for security and privacy policies.

The Cloud Security Alliance [CSA10] lists the following as the top cloud-
specific security threats, together with suggested countermeasures:
■ Abuse and nefarious use of cloud computing: For many CPs, it is relatively
easy to register and begin using cloud services, some even offering free limited
trial periods. This enables attackers to get inside the cloud to conduct various
attacks, such as spamming, malicious code attacks, and denial of service. PaaS
providers have traditionally suffered most from this kind of attack; however,
recent evidence shows that hackers have begun to target IaaS vendors as well.
The burden is on the CP to protect against such attacks, but cloud service cli-
ents must monitor activity with respect to their data and resources to detect
any malicious behavior.
Countermeasures include (1) stricter initial registration and valida-
tion processes; (2) enhanced credit card fraud monitoring and coordination;
(3) comprehensive introspection of customer network traffic; and (4) monitor-
ing public blacklists for one’s own network blocks.
■ Insecure interfaces and APIs: CPs expose a set of software interfaces or APIs
that customers use to manage and interact with cloud services. The security
and availability of general cloud services are dependent upon the security of
these basic APIs. From authentication and access control to encryption and
activity monitoring, these interfaces must be designed to protect against both
accidental and malicious attempts to circumvent policy.
Countermeasures include (1) analyzing the security model of CP inter-
faces; (2) ensuring that strong authentication and access controls are imple-
mented in concert with encrypted transmission; and (3) understanding the
dependency chain associated with the API.
■ Malicious insiders: Under the cloud computing paradigm, an organization
relinquishes direct control over many aspects of security and, in doing so, con-
fers an unprecedented level of trust onto the CP. One grave concern is the
risk of malicious insider activity. Cloud architectures necessitate certain roles
that are extremely high risk. Examples include CP system administrators and
managed security service providers.
Countermeasures include the following: (1) enforce strict supply chain
management and conduct a comprehensive supplier assessment; (2) specify
human resource requirements as part of legal contract; (3) require transpar-
ency into overall information security and management practices, as well as
compliance reporting; and (4) determine security breach notification processes.
■ Shared technology issues: IaaS vendors deliver their services in a scalable way
by sharing infrastructure. Often, the underlying components that make up this
infrastructure (CPU caches, GPUs, etc.) were not designed to offer strong iso-
lation properties for a multi-tenant architecture. CPs typically approach this
risk by the use of isolated virtual machines for individual clients. This approach
is still vulnerable to attack, by both insiders and outsiders, and so can only be a
part of an overall security strategy.
Countermeasures include the following: (1) implement security best
practices for installation/configuration; (2) monitor environment for unauthor-
ized changes/activity; (3) promote strong authentication and access control

for administrative access and operations; (4) enforce SLAs for patching and
vulnerability remediation; and (5) conduct vulnerability scanning and
configuration audits.
■ Data loss or leakage: For many clients, the most devastating impact from a
security breach is the loss or leakage of data. We address this issue in the next
section.
Countermeasures include the following: (1) implement strong API ac-
cess control; (2) encrypt and protect integrity of data in transit; (3) analyze
data protection at both design and run time; and (4) implement strong key
generation, storage and management, and destruction practices.
■ Account or service hijacking: Account or service hijacking, usually with stolen
credentials, remains a top threat. With stolen credentials, attackers can often
access critical areas of deployed cloud computing services, allowing them to
compromise the confidentiality, integrity, and availability of those services.
Countermeasures include the following: (1) prohibit the sharing of
account credentials between users and services; (2) leverage strong two-factor
authentication techniques where possible; (3) employ proactive monitor-
ing to detect unauthorized activity; and (4) understand CP security policies
and SLAs.
■ Unknown risk profile: In using cloud infrastructures, the client necessarily
cedes control to the CP on a number of issues that may affect security. Thus
the client must pay attention to and clearly define the roles and responsibili-
ties involved for managing risks. For example, employees may deploy applica-
tions and data resources at the CP without observing the normal policies and
procedures for privacy, security, and oversight.
Countermeasures include (1) disclosure of applicable logs and data;
(2)  partial/full disclosure of infrastructure details (e.g., patch levels and
firewalls); and (3) monitoring and alerting on necessary information.
Similar lists have been developed by the European Network and Information
Security Agency [ENIS09] and NIST [JANS11].
As can be seen from the previous section, there are numerous aspects to cloud
security and numerous approaches to providing cloud security measures.
A further example is seen in the NIST guidelines for cloud security, specified
in SP-800-144 and listed in Table 16.3. Thus, the topic of cloud security is well
beyond the scope of this chapter. In this section, we focus on one specific element
of cloud security.
There are many ways to compromise data. Deletion or alteration of records
without a backup of the original content is an obvious example. Unlinking a record
from a larger context may render it unrecoverable, as can storage on unreliable
media. Loss of an encoding key may result in effective destruction. Finally, unau-
thorized parties must be prevented from gaining access to sensitive data.

Extend organizational practices pertaining to the policies, procedures, and standards used for application
development and service provisioning in the cloud, as well as the design, implementation, testing, use, and
monitoring of deployed or engaged services.
Put in place audit mechanisms and tools to ensure organizational practices are followed throughout the
system life cycle.
Understand the various types of laws and regulations that impose security and privacy obligations on the
organization and potentially impact cloud computing initiatives, particularly those involving data location,
privacy and security controls, records management, and electronic discovery requirements.
Review and assess the cloud provider’s offerings with respect to the organizational requirements to be met
and ensure that the contract terms adequately meet the requirements.
Ensure that the cloud provider’s electronic discovery capabilities and processes do not compromise the
privacy or security of data and applications.
Ensure that service arrangements have sufficient means to allow visibility into the security and privacy
controls and processes employed by the cloud provider, and their performance over time.
Establish clear, exclusive ownership rights over data.
Institute a risk management program that is flexible enough to adapt to the constantly evolving and
shifting risk landscape for the life cycle of the system.
Continuously monitor the security state of the information system to support ongoing risk management
Understand the underlying technologies that the cloud provider uses to provision services, including the
implications that the technical controls involved have on the security and privacy of the system, over the full
system life cycle and across all system components.
Identity and access management
Ensure that adequate safeguards are in place to secure authentication, authorization, and other identity and
access management functions, and are suitable for the organization.
Software isolation
Understand virtualization and other logical isolation techniques that the cloud provider employs in its
multi-tenant software architecture, and assess the risks involved for the organization.
Data protection
Evaluate the suitability of the cloud provider’s data management solutions for the organizational data
concerned and the ability to control access to data, to secure data while at rest, in transit, and in use, and to
sanitize data.
Take into consideration the risk of collating organizational data with those of other organizations whose
threat profiles are high or whose data collectively represent significant concentrated value.
Fully understand and weigh the risks involved in cryptographic key management with the facilities
available in the cloud environment and the processes established by the cloud provider.
Availability
Understand the contract provisions and procedures for availability, data backup and recovery, and disaster
recovery, and ensure that they meet the organization’s continuity and contingency planning requirements.
Ensure that during an intermediate or prolonged disruption or a serious disaster, critical operations
can be immediately resumed, and that all operations can be eventually reinstituted in a timely and organized
manner.
Incident response
Understand the contract provisions and procedures for incident response and ensure that they meet the
requirements of the organization.
Table 16.3 NIST Guidelines on Security and Privacy Issues and Recommendations

Ensure that the cloud provider has a transparent response process in place and sufficient mechanisms to
share information during and after an incident.
Ensure that the organization can respond to incidents in a coordinated fashion with the cloud provider in
accordance with their respective roles and responsibilities for the computing environment.
The threat of data compromise increases in the cloud because of the number of
risks and challenges, and the interactions among them, that are either unique to the
cloud or more dangerous because of the architectural or operational characteristics
of the cloud environment.
Database environments used in cloud computing can vary significantly. Some
providers support a multi-instance model, which provides a unique DBMS running
on a virtual machine instance for each cloud subscriber. This gives the subscriber
complete control over role definition, user authorization, and other administrative
tasks related to security. Other providers support a multi-tenant model, which
provides a predefined environment for the cloud subscriber that is shared with other
tenants, typically through tagging data with a subscriber identifier. Tagging gives
the appearance of exclusive use of the instance, but relies on the CP to establish and
maintain a sound secure database environment.
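
The tagging idea behind the multi-tenant model can be sketched in a few lines.
This is a minimal illustration, not any provider's actual design: the table
name, the column name subscriber_id, the subscriber names, and the helper
functions are all hypothetical. It uses an in-memory SQLite database as a
stand-in for the shared DBMS.

```python
import sqlite3


def make_shared_db():
    """Create one shared table; every row carries the owning subscriber's tag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (subscriber_id TEXT NOT NULL, item TEXT NOT NULL)"
    )
    # Hypothetical sample data for two subscribers sharing the same instance.
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [("acme", "invoice-1"), ("acme", "invoice-2"), ("globex", "payroll-7")],
    )
    return conn


def rows_for(conn, subscriber_id):
    """Provider-side filter: return only the rows tagged with this subscriber."""
    cur = conn.execute(
        "SELECT item FROM records WHERE subscriber_id = ? ORDER BY item",
        (subscriber_id,),
    )
    return [row[0] for row in cur]


conn = make_shared_db()
acme_view = rows_for(conn, "acme")      # what subscriber "acme" is shown
globex_view = rows_for(conn, "globex")  # what subscriber "globex" is shown
```

Each subscriber sees what looks like a private table, yet the isolation rests
entirely on the filter applied by the provider's code. A query that omits the
subscriber tag would return every tenant's rows, which is why the subscriber
depends on the CP to maintain the environment correctly.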
Data must be secured while at rest, in transit, and in use, and access to the
data must be controlled. The client can employ encryption to protect data in transit,
though this involves key management responsibilities for the CP. The client can
enforce access control techniques but, again, the CP is involved to some extent
depending on the service model used.
For data at rest, the ideal security measure is for the client to encrypt the
database and store only encrypted data in the cloud, with the CP having no access to the
encryption key. So long as the key remains secure, the CP has no ability to read the
data, although corruption and other denial-of-service attacks remain a risk.
A straightforward solution to the security problem in this context is to encrypt
the entire database and not provide the encryption/decryption keys to the service
provider. This solution by itself is inflexible. The user has little ability to access
individual data items based on searches or indexing on key parameters, but rather
would have to download entire tables from the database, decrypt the tables, and
work with the results. To provide more flexibility, it must be possible to work with
the database in its encrypted form.
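
The inflexibility of whole-database encryption can be seen in a small sketch.
Everything here is an assumption made for illustration: the employee rows, the
dictionary standing in for cloud storage, and the helper names are
hypothetical, and the toy keystream built from SHA-256 merely stands in for a
real cipher such as AES. It should not be used to protect real data.

```python
import hashlib
import json
import secrets


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || nonce || counter) blocks.

    A stand-in for a real cipher such as AES; for illustration only.
    Applying it twice with the same key and nonce restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def encrypt_table(key: bytes, rows: list) -> bytes:
    """Serialize the whole table and encrypt it as one opaque blob."""
    nonce = secrets.token_bytes(16)
    plaintext = json.dumps(rows).encode("utf-8")
    return nonce + keystream_xor(key, nonce, plaintext)


def decrypt_table(key: bytes, blob: bytes) -> list:
    """Split off the nonce, decrypt, and rebuild the table."""
    nonce, ciphertext = blob[:16], blob[16:]
    return json.loads(keystream_xor(key, nonce, ciphertext).decode("utf-8"))


def lookup(key: bytes, store: dict, table: str, eid: int) -> list:
    """Find one row: the client must fetch and decrypt the entire table first."""
    rows = decrypt_table(key, store[table])
    return [r for r in rows if r["eid"] == eid]


key = secrets.token_bytes(32)  # held by the client, never given to the provider
employees = [
    {"eid": 23, "name": "Tom", "salary": 70000},
    {"eid": 860, "name": "Mary", "salary": 60000},
]
cloud_store = {"employees": encrypt_table(key, employees)}  # provider sees only this
match = lookup(key, cloud_store, "employees", 23)
```

Because the provider holds only an opaque blob, it cannot evaluate even a
simple equality test on its side. Every lookup costs a full download and
decryption, which is the limitation that motivates working with the database
in its encrypted form.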
An example of such an approach, depicted in Figure 16.10, is reported in
[DAMI05] and [DAMI03]. A similar approach is described in [HACI02]. Four
entities are involved:
■ Data owner: An organization that produces data to be made available for
controlled release, either within the organization or to external users.
■ User: A human entity that presents requests (queries) to the system. The user
could be an employee of the organization who is granted access to the
database via the server, or a user external to the organization who, after
authentication, is granted access.
■ Client: Frontend that transforms user queries into queries on the encrypted
data stored on the server.

■ Server: An organization that receives the encrypted data from a data owner
and makes them available for distribution to clients. The server could in fact
be owned by the data owner but, more typically, is a facilit

