Knapsack Algorithms

The first algorithm for generalized public-key encryption was the knapsack algorithm developed by Ralph Merkle and Martin Hellman in1978. It could only be used for encryption, although Adi Shamir later adapted the system for digital signatures . Knapsack algorithms get their security from the knapsack problem, an NP-complete problem. Although this algorithm was later found to be insecure, it is worth examining because it demonstrates how an NP-complete problem can be used for public-key cryptography.

The knapsack problem is a simple one. Given a pile of items, each with different weights, is it possible to put some of those items into a knapsack so that the knapsack weighs a given amount? More formally: Given a set of values $M_1, M_2,\ldots, M_n$ , and a sum $S$, compute the values of  $b_i$ such that $$S = b_1M_1 + b_2M_2 + \ldots+ b_nM_n$$

The values of $b_i$ can be either zero or one. A one indicates that the item is in the knapsack; a zero indicates that it isn’t.

For example, the items might have weights of 1, 5, 6, 11, 14, and 20. You could pack a knapsack that weighs 22; use weights 5, 6, and 11. You could not pack a knapsack that weighs 24. In general, the time required to solve this problem seems to grow exponentially with the number of items in the pile.

The idea behind the Merkle-Hellman knapsack algorithm is to encode a message as a solution to a series of knapsack problems. A block of plaintext equal in length to the number of items in the pile would select the items in the knapsack (plaintext bits corresponding to the b values), and the ciphertext
would be the resulting sum. 
The trick is that there are actually two different knapsack problems, one solvable in linear time and the other believed not to be. The easy knapsack can be modified to create the hard knapsack. The public key is the hard knapsack, which can easily be used to encrypt but cannot be used to decrypt messages. The private key is the easy knapsack, which gives an easy way to decrypt messages. People who don’t know the private key are forced to try to solve the hard knapsack problem.

Superincreasing Knapsacks
What is the easy knapsack problem? If the list of weights is a superincreasing sequence, then the resulting knapsack problem is easy to solve. A superincreasing sequence is a sequence in which every term is greater than the sum of all the previous terms. For example, {1, 3, 6, 13, 27, 52} is a super increasing sequence, but {1, 3, 4, 9, 15, 25} is not.

The solution to a superincreasing knapsack is easy to find. Take the total weight and compare it with the largest number in the sequence. If the total weight is less than the number, then it is not in the knapsack. If the total weight is greater than or equal to the number, then it is in the knapsack. Reduce the weight of the knapsack by the value and move to the next largest number in the sequence. Repeat until finished. If the total weight has been brought to zero, then there is a solution. If the total weight has not, there isn’t.

For example, consider a total knapsack weight of 70 and a sequence of weights of {2, 3, 6, 13, 27, 52}. The largest weight, 52, is less than 70, so 52 is in the knapsack. Subtracting 52 from 70 leaves 18. The next weight, 27, is greater than 18, so 27 is not in the knapsack. The next weight, 13, is less than 18, so
13 is in the knapsack. Subtracting 13 from 18 leaves 5. The next weight, 6, is greater than 5, so 6 is not in the knapsack. Continuing this process will show that both 2 and 3 are in the knapsack and the total weight is brought to 0, which indicates that a solution has been found. Were this a Merkle-Hellman knapsack encryption block, the plaintext that resulted from a ciphertext value of 70 would be 110101.

Non-superincreasing, or normal, knapsacks are hard problems; they have no known quick algorithm. The only known way to determine which items are in the knapsack is to methodically test possible solutions until you stumble on the correct one. The fastest algorithms, taking into account the various heuristics, grow exponentially with the number of possible weights in the knapsack.

The Merkle-Hellman algorithm is based on this property. The private key is a sequence of weights for a superincreasing knapsack problem. The public key is a sequence of weights for a normal knapsack problem with the same solution. Merkle and Hellman developed a technique for converting a superincreasing knapsack problem into a normal knapsack problem. They did this using modular arithmetic.

Creating the Public Key from the Private Key
Without going into the number theory, this is how the algorithm works: To get a normal knapsack sequence, take a superincreasing knapsack sequence, for example {2, 3, 6, 13, 27, 52}, and multiply all of the values by a number $n \pmod{m}$. The modulus should be a number greater than the sum of all the numbers in the sequence: for example, 105. The multiplier should have no factors in common with the modulus: for example, 31. The normal knapsack sequence would then be

2 * 31 mod 105 = 62
3 * 31 mod 105 = 93
6 * 31 mod 105 = 81
13 * 31 mod 105 = 88
27 * 31 mod 105 = 102
52 * 31 mod 105 = 37
The knapsack would then be {62, 93, 81, 88, 102, 37}.
The superincreasing knapsack sequence is the private key. The normal knapsack sequence is the public key.

Encryption u
To encrypt a binary message, first break it up into blocks equal to the number of items in the knapsack sequence(public key).Then, allowing a one to indicate the item is present and a zero to indicate that the item is absent, compute the total weights of the knapsacks—one for every message block.

For example, if the message were 011000110101101110 in binary, encryption using the previous knapsack would proceed like this:

message = 011000 110101 101110
011000 corresponds to 93 + 81 = 174
110101 corresponds to 62 + 93 + 88 + 37 = 280
101110 corresponds to 62 + 81 + 88 + 102 = 333

The ciphertext would be
174,280,333

Decryption
A legitimate recipient of this message knows the private key: the original superincreasing knapsack, as well as the values of n and m used to transform it into a normal knapsack. To decrypt the message, the recipient must first determine $n^{-1}$ such that $n.n^{-1}= 1 \pmod m$. Multiply each of the ciphertext values by $n^{-1} \pmod m$, and then partition with the private knapsack to get the plaintext values.

In our example, the superincreasing knapsack is {2, 3, 6, 13, 27, 52}, $m$ is equal to 105, and $n$ is equal to 31. The ciphertext message is 174, 280, 333. In this case $n^{-1}$ is equal to 61, so the ciphertext values must be multiplied by $61 \pmod {105}.$

174 * 61 mod 105 = 9 = 3 + 6, which corresponds to 011000
280 * 61 mod 105 = 70 = 2 + 3 + 13 + 52, which corresponds to 110101
333 * 61 mod 105 = 48 = 2 + 6 + 13 + 27, which corresponds to 101110
The recovered plaintext is 011000 110101 101110.

Practical Implementations
With a knapsack sequence of only six items, it’s not hard to solve the problem even if it isn’t superincreasing. Real knapsacks should contain at least 250 items. The value for each term in the superincreasing knapsack should be somewhere between 200 and 400 bits long, and the modulus should be somewhere between 100 to 200 bits long. Real implementations of the algorithm use random-sequence generators to produce these values.

With knapsacks like that, it’s futile to try to solve them by brute force. If a computer could try a million possibilities per second, trying all possible knapsack values would take over $10^{46}$ years. Even a million machines working in parallel wouldn’t solve this problem before the sun went nova.

Security of the Knapsack Cryptosystem

The security of the system is based on the difficulty of solving the general knapsack problem, which is NP-hard. The transformation from a superincreasing sequence to a general sequence hides the structure of the original problem, making it hard for an attacker to reverse the encryption without the private key (the superincreasing sequence and the values m and n).

However, Merkle-Hellman Knapsack Cryptosystem was eventually broken using lattice-based attacks in the 1980s, because researchers found methods to exploit the structure of the transformed knapsack.

Comments

Popular posts from this blog

Cryptographic Algorithms CST 393 KTU CS Honour Notes Semester V -Dr Binu V P

Data Encryption Standard - DES Algorithm

OSI Security Architecture