Goldsmiths’ 2014 Mathematics

Tuesday 22nd July

Dr Behrang Noohi

Cryptography – Lecture

Secure Communication

How can one send a secret message?

Steganography vs. Cryptography; Kerckhoff Principle

Steganography is the art or practice of concealing a message, image, or file within another message, image, or file. The word steganography combines the Ancient Greek words steganos, meaning “covered, concealed, or protected”, and graphein meaning “writing”. The first recorded use of the term was in 1499 by Johannes Trithemius in his Steganographia, a treatise on cryptography and steganography, disguised as a book on magic. Generally, the hidden messages will appear to be (or be part of) something else: images, articles, shopping lists, or some other cover text. For example, the hidden message may be in invisible ink between the visible lines of a private letter. Some implementations of steganography which lack a shared secret are forms of security through obscurity, whereas key-dependent steganographic schemes adhere to Kerckhoffs’s principle.

The first recorded uses of steganography can be traced back to 440 BC when Herodotus mentions two examples in his Histories. Demaratus sent a warning about a forthcoming attack to Greece by writing it directly on the wooden backing of a wax tablet before applying its beeswax surface. Wax tablets were in common use then as reusable writing surfaces, sometimes used for shorthand.

Another example from ancient Greece was hidden messages on messenger’s body. Herodotus tells the story of a message tattooed on the shaved head of a slave of Histiaeus, hidden by the hair that afterwards grew over it, and exposed by shaving the head again. The message allegedly carried a warning to Greece about Persian invasion plans. This method has obvious drawbacks, such as delayed transmission while waiting for the slave’s hair to grow, and the restrictions on the number and size of messages that can be encoded on one person’s scalp.

http://en.wikipedia.org/wiki/Steganography

The Ambassadors (1533) is a painting by Hans Holbein the Younger in the National Gallery, London.

As well as being a double portrait, the painting contains a still life of several meticulously rendered objects, the meaning of which is the cause of much debate.

The most notable and famous of Holbein’s symbols in the work, however, is the distorted skull which is placed in the bottom centre of the composition. The skull, rendered in anamorphic perspective, another invention of the Early Renaissance, is meant to be a visual puzzle as the viewer must approach the painting nearly from the side to see the form morph into an accurate rendering of a human skull.

http://en.wikipedia.org/wiki/The_Ambassadors_(Holbein)

Anamorphosis is a distorted projection or perspective requiring the viewer to use special devices or occupy a specific vantage point to reconstitute the image.

http://en.wikipedia.org/wiki/Anamorphic_projection

Johannes Trithemius (1 February 1462 – 13 December 1516), born Johann Heidenberg, was a German Benedictine abbot and a polymath active in the German Renaissance, as a lexicographer, chronicler, cryptographer and occultist. He took considerable influence on the development of early modern and modern occultism; among his students were Heinrich Cornelius Agrippa and Paracelsus.

http://en.wikipedia.org/wiki/Johannes_Trithemius (above left)

Hans Holbein the Younger (c. 1497[1] – between 7 October and 29 November 1543) was a German artist and printmaker who worked in a Northern Renaissance style. He is best known as one of the greatest portraitists of the 16th century.

http://en.wikipedia.org/wiki/Hans_Holbein_the_Younger (above right)

Modern methods of steganography include invisible ink, hidden words, microdots, DNA, watermarks…

The above left image is of a tree with a steganographically hidden image. The hidden image is revealed by removing all but the two least significant bits of each colour component and a subsequent normalisation. The hidden image of a cat is shown above right.

In cryptography, Kerckhoffs’s principle (also called Kerckhoffs’s desiderata, Kerckhoffs’s assumption, axiom, or law) was stated by Auguste Kerckhoffs in the 19th century: A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

Stated simply, the security of a cryptosystem should depend solely on the secrecy of the key and the private randomizer. Another way of putting it is that a method of secretly coding and transmitting information should be secure even if everyone knows how it works. Of course, despite the attacker’s familiarity with the system in question, the attacker lacks knowledge as to which of all possible instances is being presently observed.

http://en.wikipedia.org/wiki/Kerckhoffs’s_principle

Auguste Kerckhoffs (19 January 1835 – 9 August 1903) was a Dutch linguist and cryptographer who was professor of languages at the École des Hautes Études Commerciales in Paris in the late 19th century.

http://en.wikipedia.org/wiki/Auguste_Kerckhoffs

Cryptography model

Alice and Bob share a secret key, unknown to Eve “Eavesdropper”

Alice encrypts the plaintext message with the key, forming a ciphertext.

Bob decrypts the ciphertext with the key, obtaining the original plaintext.

Eve also receives the ciphertext, but cannot understand it.

Kerckhoffs’ Principle

Eve sees the communication AND knows the system. Only the key is secret.

Encryption/decryption methods include substitution, transposition, codebook, stream ciphers,…

Substitution ciphers

Monoalphabetic substitution is where each letter is consistently replaced by another.

An example is a reversed alphabet: A → Z, B → Y, C → X, . . . HELLO → SVOOL.

The key is a permutation of the alphabet: a bijective map

In mathematics, a bijection (or bijective function or one-to-one correspondence) is a function between the elements of two sets, where every element of one set is paired with exactly one element of the other set, and every element of the other set is paired with exactly one element of the first set. There are no unpaired elements. In formal mathematical terms, a bijective function f: X → Y is a one to one and onto mapping of a set X to a set Y.

http://en.wikipedia.org/wiki/Bijection

The image below shows a bijective function, f: X → Y, where set X is {1, 2, 3, 4} and set Y is {A, B, C, D}. For example, f(1) = D.

σ : {A, . . . , Z} → {A, . . . , Z}.

Encryption: apply σ to each letter. Decryption: apply the inverse permutation σ⁻¹ to each letter

Cryptanalysis: breaking a cipher

The security of a cipher – How easy/hard is it to break? (Understand message / obtain key).

Brute force – Any cipher can be broken by trying all possible keys.

How long will it take? – Number of substitution cipher keys 26! = 26 × 25 × · · · × 1 = 403291461126605635584000000 = age of universe in nanoseconds!

Computational feasibility – Security is relative to our powers of computation.

Statistical analysis

Some letters are more common than others. The most common letters in English writing are E, T, A, O, I/N, H/S/R, . . .

Frequency analysis – Count letter frequencies in the ciphertext; replace the most common ones by E, T, A, etc.; then try and guess the others.

ZH GRQW JHW SHDU WDUWV IURP SHDFK WUHHV

Ze GRQt Jet SeaU taUtV IURP SeaFK tUeeV

Ze GRQt Jet Sear tarts IrRP SeaFK trees

Ze GoQt Jet Sear tarts IroP SeaFK trees

we don’t get pear tarts from peach trees

Modular arithmetic

Caesar cipher – Previous example used shift by 3: A → D, B → E, C → F, . . . , Z → C.

In cryptography, a Caesar cipher, also known as Caesar’s cipher, the shift cipher, Caesar’s code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

http://en.wikipedia.org/wiki/Caesar_cipher

A numerical interpretation – Identify A, . . . , Z with 0, . . . , 25. Encode e(x) = x + 3 mod 26. Decode d(x) = x − 3 mod 26.

General shifts

Suppose we use an m-letter alphabet, identified with 0, . . . , m − 1. Encode e_n(x) = x + n mod m. Decode d_n(x) = x − n mod m.

Caesar’s revenge: polyalphabetic ciphers

Examples: one-time pad, Vigenere, Enigma Machine,…

In cryptography, a one-time pad (OTP) is an encryption technique that cannot be cracked if used correctly. In this technique, a plaintext is paired with random, secret key (or pad). Then, each bit or character of the plaintext is encrypted by combining it with the corresponding bit or character from the pad using modular addition. If the key is truly random, and at least as long as the plaintext, and never reused in whole or in part, and kept completely secret, then the resulting ciphertext will be impossible to decrypt or break. It has also been proven that any cipher with the perfect secrecy property must use keys with effectively the same requirements as OTP keys. However, practical problems have prevented one-time pads from being widely used.

The one-time pad – Keep changing amount we shift by! Let’s use binary alphabet {0, 1}.

The secret key is a random binary string, say k = 01100110.

Encryption, decryption both m 7→ m + k (bitwise addition mod 2): e.g.

e(10101010) = 10101010 + 01100110 = 11001100,

d(11001100) = 11001100 + 01100110 = 10101010.

Pro: Unbreakable! If k is random then so is m + k: it contains no information about m. (Shannon’s Theorem)

Con: Inefficient! k is as long as m: it begs the question of how Alice and Bob managed to agree on k.

More efficient: short k and long m, break m into blocks b₁, b₂, · · · , encode as b₁ + k, b₂ + k, . . . . (But this is breakable.)

http://en.wikipedia.org/wiki/One-time_pad

In information theory, the Shannon–Hartley theorem tells the maximum rate at which information can be transmitted over a communications channel of a specified bandwidth in the presence of noise.

http://en.wikipedia.org/wiki/Shannon%E2%80%93Hartley_theorem

The Vigenere cipher

The Vigenère cipher is a method of encrypting alphabetic text by using a series of different Caesar ciphers based on the letters of a keyword. It is a simple form of polyalphabetic substitution.

We don’t communicate in binary! Cipher easier to remember if we use A..Z.

The secret key is a word; each letter represents the shift from A to that letter; e.g. CAESAR ↔ +2,+0,+4,+18,+0,+17.

Example: ‘The rain in Spain falls mainly on the plain.’

Confusion is created since at different times (i) the same letter may be encoded differently, and (ii) different letters may be encoded identically!

http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher

Breaking the Vigenere cipher

It is much harder than a substitution, but it has weaknesses…

Suppose we know the key length, say it is 6. Just look at the letters in positions 6,12,18,… they are encoded by the same shift: can use frequency analysis! Repeat for other remainders mod 6.

How to get the key length? We could guess. Or use more sophisticated statistics…

Kasiski method: Look for repeated consecutive pairs (digrams) or triples (trigrams). The key length probably divides the distance between them.

In cryptanalysis, Kasiski examination (also referred to as Kasiski’s Test or Kasiski’s Method) is a method of attacking polyalphabetic substitution ciphers, such as the Vigenère cipher. It was first published by Friedrich Kasiski (29 November 1805 – 22 May 1881) in 1863, but seems to have been independently discovered by Charles Babbage as early as 1846

http://en.wikipedia.org/wiki/Kasiski_examination

http://en.wikipedia.org/wiki/Friedrich_Kasiski

Charles Babbage, FRS (26 December 1791 – 18 October 1871) was an English polymath. He was a mathematician, philosopher, inventor and mechanical engineer, who is best, remembered now for originating the concept of a programmable computer.

http://en.wikipedia.org/wiki/Charles_Babbage

http://en.wikipedia.org/wiki/Bigram

http://en.wikipedia.org/wiki/Trigram

Enigma Machine

An Enigma machine was any of a family of related electro-mechanical rotor cipher machines used in the twentieth century for enciphering and deciphering secret messages. Enigma was invented by the German engineer Arthur Scherbius (20 October 1878 – 13 May 1929, German electrical engineer) at the end of World War I. Early models were used commercially from the early 1920s, and adopted by military and government services of several countries—most notably by Nazi Germany before and during World War II. Several different Enigma models were produced, but the German military models are the most commonly discussed.

It uses a polyalphabetic substitution cipher used by Germans in WWII and broken by Polish and British cryptologists.

Permutation of alphabet implemented by a set of rotors (and a plugboard)

The permutation changes with each keystroke (ie, rotors turn).

German procedural flaws, operator mistakes, laziness, failure to systematically introduce changes in encipherment procedures, and Allied capture of key tables and hardware enabled Allied cryptologists to succeed in breaking the code during the second world war,

http://en.wikipedia.org/wiki/Enigma_machine

http://en.wikipedia.org/wiki/Arthur_Scherbius

Key exchange

Diffie-Hellman idea: method where key is public knowledge?! How could this possibly work?

Diffie–Hellman key exchange (D–H) is a specific method of exchanging cryptographic keys. It is one of the earliest practical examples of key exchange implemented within the field of cryptography. The Diffie–Hellman key exchange method allows two parties that have no prior knowledge of each other to jointly establish a shared secret key over an insecure communications channel. This key can then be used to encrypt subsequent communications using a symmetric key cipher.

http://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange

RSA is one of the first practicable public-key cryptosystems and is widely used for secure data transmission. In such a cryptosystem, the encryption key is public and differs from the decryption key which is kept secret. In RSA, this asymmetry is based on the practical difficulty of factoring the product of two large prime numbers, the factoring problem. RSA stands for Ron Rivest, Adi Shamir and Leonard Adleman, who first publicly described the algorithm in 1977. Clifford Cocks, an English mathematician, had developed an equivalent system in 1973, but it wasn’t declassified until 1997.

http://en.wikipedia.org/wiki/RSA_(cryptosystem)

http://en.wikipedia.org/wiki/Ron_Rivest

http://en.wikipedia.org/wiki/Adi_Shamir

http://en.wikipedia.org/wiki/Leonard_Adleman

http://en.wikipedia.org/wiki/Clifford_Cocks

Issues in modern cryptography

Message Integrity: Can Eve crucially change the meaning of a message she cannot entirely read (e.g. the amount in a bank transaction)?

Digital Signatures: Eve sees some signed messages, can she forge a signature?

Communication protocols: Zero-knowledge proof, Multiparty secrets, Elections, Digital cash…

Conclusion

Cryptography concerns secure communication. Unlike steganography (making the message obscure), the assumption (Kerckhoffs’ Principle) is ‘Eve knows the system; only the key is secret’.

Ciphers are various methods of using the secret key to encrypt/decrypt a message, e.g. Substitution, Vigenere, Permutation, . . .

Security is always relative to computational power, and in fear of an ingenious unforseen attack.

Public Key Cryptography provides great flexibility, but its security is only empirical.

Modern cryptography has evolved into a diverse field of theoretical and practical research.

I would thank Peter Keevash for letting me use images from his slides.

Cryptography Workshop

Personally I love puzzles and I enjoyed this workshop but if I was a decoder during the Second World War it would have finished before I had had got one of the problems solved

Have a go at the puzzles but don’t ask me for the answers.

Easy Cipher

In English – possibly with padding (non-standard case for last one):

1) CQRB XWN RB ZDRCN NJBH. RC SDBC OXA FJAVDY

2) XJSIF MZSIW JIXQF AJXFX YWNGZ YJYTW TRJXX

3) ORJNE RGURV QRFBS ZNEPU

4) Sio uly ch jlcmih ch u zilycah wiohnls uhx uff siol gucf cm vycha lyux vs nby jlcmih aoulxm. Qbun gynbix iz wlsjnialujbs il mnyauhialujbs gcabn sio omy ni myhx u mywlyn gymmuay ni siol fuqsyl, uhx qbs?

Intermediate Cipher

1) GWNU JANL BGWN LBGZ HDTA RJUU BAGR QGWN DHWR RIRQ LBGW NLBG ZHBI DHZN OHND BGXL JINL EABH NDAN DNBA HWBO KURD GTAB KJBG NGNB HWZO TZOB ITNE ABHR LEZO BGRA ZHDB OBIV DZDT NRLN GAVG RURI RTVI RTZH BOKU AREB EZIZ GVXX

2) JWDWX NRJRJ TSXWD PJVJW ZBIJI JRXMZ BIQDM JADYD MDTAQ BRDTS JWXTJ MQBAB QTXGS DQRJT SXWDV FVDYX WASDA PDWAX DASTD WAFQZ IBQDW TXGSD QXWNJ WYYDT XGSDQ XWNVD TQDAR DVVJN DVDWX NRJPJ VXWKD WADYO ZASDN DQRJW DWNXW DDQJQ ASFQV TSDQO XFVJA ASDDW YBIPB QMYPJ QXXXX

3) Lti cddopi softiv oq c lmfi kd ukpkczftcxilos qgxqlolglokp softiv, wtiviop icst zilliv op cp czftcxil oq ucffin lk olq pguivos iagobczipl, ipsvmflin gqopy c qoufzi ucltiucloscz dgpslokp, cpn skpbivlin xcse lk c zilliv. Lti dkvugzc gqin uicpq ltcl icst zilliv ipsvmflq lk kpi kltiv zilliv, cpn xcse cycop, uicpopy lti softiv oq iqqiploczzm c qlcpncvn qgxqlolglokp softiv wolt c vgzi ykbivpopy wtost zilliv ykiq lk wtost.

Hard Cipher

1. VGVFN GEHGU HAVIR EFNYY LNPXA BJYRQ TRQGU NGNFV ATYRZ NAVAC BFFRF FVBAB SNTBB QSBEG HARZH FGORV AJNAG BSNJV SRUBJ RIREY VGGYR XABJA GURSR RYVAT FBEIV RJFBS FHPUN ZNAZN LORBA UVFSV EFGRA GREVA TNARV TUOBH EUBBQ GUVFG EHGUV FFBJR YYSVK RQVAG URZVA QFBSG URFHE EBHAQ VATSN ZVYVR FGUNG URVFP BAFVQ RERQG UREVT UGSHY CEBCR EGLBS FBZRB ARBEB GUREB SGURV EQNHT UGREF ZLQRN EZEOR AARGF NVQUV FYNQL GBUVZ BARQN LUNIR LBHUR NEQGU NGARG URESV RYQCN EXVFY RGNGY NFGZE ORAAR GERCY VRQGU NGURU NQABG OHGVG VFERG HEARQ FURSB EZEFY BATUN FWHFG ORRAU RERNA QFURG BYQZR NYYNO BHGVG ZEORA ARGZN QRABN AFJRE QBLBH ABGJN AGGBX ABJJU BUNFG NXRAV GPEVR QUVFJ VSRVZ CNGVR AGYLL BHJNA GGBGR YYZRN AQVUN IRABB OWRPG VBAGB URNEV ATVGG UVFJN FVAIV GNGVB ARABH TU

2. YFYZX QGMMF QQVGO OUZSB CXFXN FFACI LYCQO ZIXOZ XUBIA YFQAZ CWMZQ FYCQL ZWWFA EZQGI XFWAY FWBBV BVZYC LYVCS FQABW CFXYB GQFZI XEZQN BWFOC TFZMG PJBZW XAYZI ZWBBN AYFOZ IXOZX UEYBP WBSCX FXYCN ECAYL ZWWFA XCIIF WQZIX ZAAFI XZIMF OCSFX BIAYF VOBBW JFOBE ZIXFS FWUAC NFYFE FIABG AYFEZ QBJOC LFXAB PZQQY FWTCA MYFIA YFXBB WBVEY CMYCI SZWCZ JOUQA BBXBP FIZIX FZMYA CNFYF PZQQF XAYFU BGILN ZIYZX ZQCMT VWCLY AFIFX VFFOC ILEYC MYNZX FYCNQ MBEOZ IXVFF OZQYZ NFXYF EZQYB PFOFQ QOUCI XFJAA BYCQO ZIXOZ XUZIX EZQZV WZCXB VNFFA CILYF WAYCQ EZQIB AJFMZ GQFYF EZQMB EZWXO UZIXZ JDFMA KGCAF AYFMB IAWZW UJGAV BWQBN FACNF PZQAY FYZXJ FFICI ZIBSF WQAWZ CIFXC WWCAZ JOFMB IXCAC BISFW LCILB IYUPB MYBIX WCZYF YZXJF MBNFQ BMBNP OFAFO UZJQB WJFXC IYCNQ FOVZI XCQBO ZAFXV WBNYC QVFOO BEQAY ZAYFX WFZXF XNFFA CILIB ABIOU YCQOZ IXOZX UJGAZ IUBIF ZAZOO YFEZQ MWGQY FXJUP BSFWA UJGAA YFZIH CFACF QBVYC QPBQC ACBIY ZXBVO ZAFMF ZQFXA BEFCL YGPBI YCNYF YZXLC SFIGP ZAAFI XCILA BNZAA FWQBV PWZMA CMZOC NPBWA ZIMFY FYZXO BQAZO OXFQC WFABX BQB

3. LAROF ITGNO SUIEM GOTDE EBOTO LRAED EMOSY SEMIT INEHW UPDAH MTUOT DNACY EYMEL OWSEY LCDLU OSESO KCIUQ AHTYL DAHIT VETON MITNE ASOTE OGMIY OTGNI PEELS AHDNA HNAFL ALRUO HTRET UOHTE HTTHG WTITA MITSA OGOTE ELSOT UOWPE AWADL EMNEK LUOWI TYRTD ATUPO HTYAW KOOBE HCIHW GAMII WDENI ITSSA MNILL DNAHY TDNAS WOLBO HTTUO HGILE DAHIT TNEEB IKNIH LLAGN ITEHT IHWEM AWIEL ELSAS WFOPE HITAH SUJDA NEEBT IDAER TUBGN OHTYM STHGU URDAH OTNIN NAHCA FOLEN RIEHT NUNWO MILIT FLESY EMEES UTCAD TYLLA EVAHO MOCEB SEHTE CEJBU YMFOT AKOOB CRUHC AUQAH TTETR VIREH BYRLA EEWTE NARFN ISIOC HCDNA SELRA SIHTV ERPMI NOISS DLUOW ISREP ROFTS MEMOS TNEMO ETFAS SAWIR EKAWA DIDTI IDTON BRUTS NIMYM ITUBD LYALT CSEKI USELA YMNOP ASEYE ERPDN ETNEV MEHTD RMORF TSIGE GNIRE AFEHT AHTTC CEHTT ELDNA ONSAW EGNOL NRUBR HTGNI WTINE BDLUO TNIGE MEESO TNINU GILLE AELBI TEHTS HGUOH AFOST EMROF SIXER ECNET BTSUM RAOTE ACNIE ETANR IRIPS SEHTT CEJBU YMFOT WKOOB SDLUO ARAPE STIET RFFLE LEMMO NIVAE RFEMG COTEE ESOOH HTEHW OWIRE OFDLU RAPMR TIFOT ONR

Some statistics related to the hard ciphers above

Hints

1. Cipher statistics can help you guess what type of cipher you are dealing with.

2. In a substitution cipher, the letter frequencies after decoding should be similar to those in everyday English.

3. Rigidly going by the letter frequencies is unlikely to work perfectly, as there will be some natural variation in any particular text. You’ll need some trial-and-error to figure out which of several likely possibilities is actually correct.

4. Think about what other methods you can use to guess parts of the text. Digraph frequencies can give extra clues, and you might spot repeated patterns that are likely to be whole words.

5. It is sometimes possible to break a cipher that uses a completely unfamiliar method. Somehow you need to get inside the mind of the creator of the cipher. Bear in mind that it has to be simple enough to be convenient for everyday use (and to be feasible for you to break in a short interval of time!)

Points for discussion

1. What was easy, what was hard, and why?

2. How could the ciphers have been made more secure?

3. Could you write down some rules that would help someone else break similar ciphers?

4. Do you need inspiration to break a cipher, or can the process be automated? What parts can be automated?

A brief guide on how to break affine ciphers

AFFINE SUBSTITUTIONS

• For fixed integers a and b, with gcd(a, 26) = 1, we can define a substitution by θ_a,b : x 7→ ax + b (mod 26). In this formula, we assume that the English alphabet is enumerated by 0, 1, 2, . . . , 25. (Eg, a = 0 and z = 25.)

• These are slightly harder than Caesar ciphers, but still quite easy to solve.

Example: Suppose we have discovered (say, by frequency analysis) that the letters c = 2 and f = 5 in the plaintext have been encrypted to H = 7 and Q = 16 in the ciphertext, respectively. This would be enough to determine the affine substitution.

Namely, we need to solve this system of congruence equations to find a and b:

2a + b ≡ 7 (mod 26) and 5a + b ≡ 16 (mod 26).

The solutions is a = 3, b = 1. The affine substitutions θ_3,1 can be depicted as:

A table of letter frequencies in English (and other languages)

The substitution cipher tool

You can use the form below to perform substitution on a text: either to encode a text using a substitution cipher or as a helper in trying to decode one. Just type the text into the text area, in place of the example text, and fill in the substitutions you want to apply.