Programming Project

Due date: Monday, March 31, 2014 (firm).

Problem Description

Main Project

You are given a number n that is known to be composite. Furthermore, n = p * q for two prime factors p and q. Your goal is to find p and q for the given n.

A list of numbers is given on this page to help you test and calibrate your code. Each number in the list must be factored in under a minute.

Sub-Project

As a sub-project, once you have your factorizer in place, you will be able to break certain toy RSA encrypted messages. Python scripts will be provided that automate the process for you.

Approach

The simple approach is to improve upon the following brute force method:

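Brute Force Factoring
  #include <math.h>

  int findOneFactor(int n) {
     int m = (int) sqrt((double) n);
     for (int i = 2; i <= m; ++i)   /* <= so that perfect squares such as 9 are handled */
        if (n % i == 0)
           return i;
     return n;                      /* no divisor found: n is prime */
  }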

There are a few things that you should figure out on your own:

  • The brute force approach works with machine native integers. However, the numbers involved will easily be larger than machine native integer types. Can we work with exact arithmetic packages or bignums in the language of your choice?

  • Skip over certain numbers:

    • The brute force approach above tests even numbers as well as odd.

    • We can do a better job tackling multiples of 3. How about multiples of 5, 7, 11, 13, …?

  • Parallelization: Explore multithreading.
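As a starting point for the first two hints, here is a minimal sketch in Python, whose built-in integers are arbitrary precision. It handles 2 and 3 first and then tests only candidates of the form 6k - 1 and 6k + 1. The function name find_one_factor is made up for this illustration; skipping multiples of 5, 7, 11, … and parallelizing the loop are left to you.

```python
from math import isqrt


def find_one_factor(n: int) -> int:
    """Return the smallest prime factor of n, or n itself if n is prime."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Handle 2 and 3 up front so the main loop can skip all their multiples.
    for small in (2, 3):
        if n % small == 0:
            return small
    # isqrt is an exact integer square root, so it is safe for huge n.
    limit = isqrt(n)
    i = 5
    while i <= limit:
        # Remaining candidates are 6k - 1 (i) and 6k + 1 (i + 2).
        if n % i == 0:
            return i
        if n % (i + 2) == 0:
            return i + 2
        i += 6
    return n  # no divisor up to sqrt(n), so n is prime


if __name__ == "__main__":
    n = 10007 * 10009
    p = find_one_factor(n)
    print(p, n // p)
```

This tests roughly one third of the candidates that the brute force loop tests, but the running time still grows with the square root of n.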

Advanced Ideas

What other factoring algorithms are at our disposal? There are a few that are fairly accessible. One of them is Pollard's rho algorithm. I have posted a tutorial on it. Can you implement it?
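For orientation, here is a minimal sketch of the textbook form of Pollard's rho with Floyd cycle detection. It is not necessarily the variant described in the tutorial. The polynomial x*x + c, the starting value 2, and the names pollard_rho and factor_semiprime are choices made for this sketch, and it assumes n is composite, as stated in the problem description.

```python
from math import gcd


def pollard_rho(n: int, c: int = 1) -> int:
    """Return a factor of n greater than 1; returns n itself if this attempt fails."""
    if n % 2 == 0:
        return 2
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n      # "tortoise" advances one step
        y = (y * y + c) % n      # "hare" advances two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)   # a shared factor shows up as gcd > 1
    return d


def factor_semiprime(n: int) -> tuple:
    """Split a composite n into (smaller, larger); loops forever if n is prime."""
    c = 1
    while True:
        d = pollard_rho(n, c)
        if d != n:               # d == n means the cycle closed without a factor
            return (min(d, n // d), max(d, n // d))
        c += 1                   # retry with a different polynomial


if __name__ == "__main__":
    print(factor_semiprime(8051))
```

When an attempt fails, the sketch simply retries with the next value of c. A primality test before calling it would be needed for inputs that are not guaranteed to be composite.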

Rules

I hope all students adhere to the rules below. Not doing so constitutes cheating and a violation of the honor code:

  • You cannot take code someone else wrote off the internet and pass it off as your own.

  • You cannot take someone else's code, edit it and pass it off as your own.

  • Please do not use scripts that others wrote.

It is very easy to find out if you did so using sophisticated plagiarism detection tools that we will run.

You can solve the problem individually, or team up with a friend (at most two students per team). If you team up with someone, the team will need to do a lot more to qualify for the same credit. In other words, any team effort will necessarily have to explore some advanced ideas.

If you need systems assistance setting up or compiling libraries you are welcome to approach me or anyone else.

Deliverables

Full submission instructions will be posted via email as the deadline nears.

  1. Code tarball with detailed instructions on how to run.

  2. Short writeup detailing what was done, with experimental observations if any.

  3. Interview grading: we will schedule slots during the first two weeks of April.

Grading

Let us say the baseline is a C-. You get bumps up for doing more than the bare minimum (which would be to turn in the function above). Suggested features that will help increase your grade:

Basics:

  1. Code compiles OK?

  2. Code works and has been tested?

  3. Code handles arbitrary precision integers and does not overflow.

  4. Code is cleanly written, documented, and easy to use?

  5. Can handle some 15 digit numbers in under 30 seconds?

  6. Can handle 20 digit numbers in under 30 seconds?

More advanced:

  1. Comparing different approaches with good experimental setup, drawing some conclusions.

  2. Multithreaded implementations

  3. Some innovative attempts to improve performance.

  4. Simply off the charts in terms of capabilities.

  5. Can decrypt some of the messages below?

Resources

List of numbers: TXT file

Each line in the file represents a number to factor. How far can your code go down this list?

Extra Credit

Based on the way the list was generated, there is a sneaky way to factor the entire list rather trivially, as Prof. Black pointed out last year :-)

I will post some Python scripts and secret messages for you to decode by factoring the number n in each public key.

Python Code for RSA

I have posted three Python files: rsaKeyGen.py, rsaencode.py, and rsadecode.py.

Here is a simple demo of how to use them to encode and decode messages.

Step 1: Make up some RSA Keys

To create a key, we run

Running Key Generator

python rsaKeyGen.py 20

The argument 20 to the command above specifies that the primes p and q should each be 20 digits long. The script uses a random number generator internally to pick random primes, so the output will differ each time. You will get output that looks like this:

Expected Output

p = 76109378112451927699

q = 95587773022304816707

n = 7275125959881829066786422208675011267193

k = 7275125959881829066614725057540254522788

e = 2425041986627276355538241685846751507597

d = 4850083973254552711076483371693503015193
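The page does not spell out how these values are related. The sample output appears consistent with the textbook RSA relations: k equals n - p - q + 1, which is (p - 1) * (q - 1) whenever n = p * q, and e * d leaves remainder 1 when divided by k. Under that assumption, knowing p and q is enough to recompute d from the public key, which is why a factorizer breaks these keys. A minimal sketch, with toy primes and a helper name (private_exponent) made up for this illustration:

```python
def private_exponent(p: int, q: int, e: int) -> int:
    """Recover d from the factors of n, assuming textbook RSA (e * d = 1 mod k)."""
    k = (p - 1) * (q - 1)
    # Three-argument pow with exponent -1 computes a modular inverse (Python 3.8+).
    return pow(e, -1, k)


if __name__ == "__main__":
    # Toy key: p = 61, q = 53, so n = 3233 and k = 3120.
    d = private_exponent(61, 53, 17)
    print(d, (17 * d) % 3120)
```

If your factorizer returns p and q for the n in a public key file, the same call with the e from that file should give the matching d, provided the key generator follows these relations.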

We can now create two files, one called publickey.txt and the other called privatekey.txt. You can choose any other names as well.

File publickey.txt should contain two lines, with n on the first line and e on the second.

Public Key File

7275125959881829066786422208675011267193

2425041986627276355538241685846751507597

File privatekey.txt should contain two lines, with n on the first line and d on the second. You can download a sample private key file here.

Private Key File

7275125959881829066786422208675011267193

4850083973254552711076483371693503015193
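If you would rather generate these files from your own code than type them by hand, the two-line layout described above is easy to read and write. The helpers below are a sketch with made-up names (write_key_file, read_key_file); they are not part of the posted scripts, and the reader tolerates blank lines in case your file has any.

```python
def write_key_file(path: str, n: int, exponent: int) -> None:
    """Write a key file: n on the first line, e or d on the second."""
    with open(path, "w") as fh:
        fh.write(f"{n}\n{exponent}\n")


def read_key_file(path: str) -> tuple:
    """Read (n, exponent) back, ignoring any blank lines."""
    with open(path) as fh:
        values = [line.strip() for line in fh if line.strip()]
    return int(values[0]), int(values[1])


if __name__ == "__main__":
    write_key_file("publickey.txt", 3233, 17)   # toy key, not the sample above
    print(read_key_file("publickey.txt"))
```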

Step 2: Encode a Message

To encode a message, run the script rsaencode.py:

python rsaencode.py

It will prompt you for various things.

Running RSA Encoder

python rsaencode.py

Enter the string to encode: haha, I can encode stuff that you cannot decode

Enter key file: publickey.txt

>> I will read keys from: publickey.txt

>> n= 7275125959881829066786422208675011267193 e= 2425041986627276355538241685846751507597

104, 97, 104, 97, 44, 32, 73, 32, 99, 97, 110, 32, 101, 110, 99, 111, 100, 101, 32, 115, 116, 117, 102, 102, 32, 116, 104, 97, 116, 32, 121, 111, 117, 32, 99, 97, 110, 110, 111, 116, 32, 100, 101, 99, 111, 100, 101

>> Encoded Message Stream: 104L, 97L, 104L, 97L, 44L, 32L, 73L, 32L, 99L, 97L, 110L, 32L, 101L, 110L, 99L, 111L, 100L, 101L, 32L, 115L, 116L, 117L, 102L, 102L, 32L, 116L, 104L, 97L, 116L, 32L, 121L, 111L, 117L, 32L, 99L, 97L, 110L, 110L, 111L, 116L, 32L, 100L, 101L, 99L, 111L, 100L, 101L

Where should I write the encoded message to? message

>> done.

Using the public key in file publickey.txt, the message “haha, I can encode stuff that you cannot decode” has been encoded into a stream of numbers and written out to a file named message.
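The internals of rsaencode.py are not shown on this page. As a rough mental model only, a toy encoder of this kind could turn each character into its character code and raise it to the power e modulo n, and a decoder could undo that with d. The sketch below assumes exactly that scheme and uses a toy key; the function names encode and decode are made up, and the posted scripts may work differently.

```python
def encode(message: str, n: int, e: int) -> list:
    """Encode each character separately as (character code) ** e mod n."""
    return [pow(ord(ch), e, n) for ch in message]


def decode(stream: list, n: int, d: int) -> str:
    """Invert encode: raise each number to the power d mod n."""
    return "".join(chr(pow(c, d, n)) for c in stream)


if __name__ == "__main__":
    # Toy key: n = 61 * 53 = 3233, e = 17, d = 2753.
    n, e, d = 3233, 17, 2753
    stream = encode("haha", n, e)
    print(stream)
    print(decode(stream, n, d))
```

Three-argument pow performs modular exponentiation without ever building the huge intermediate power, so this stays fast even for 40-digit n.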

Step 3: Decoding a Message

The file message created in step 2 has a bunch of numbers in it. We will decode it using the script rsadecode.py.

python rsadecode.py

Here is what the run looks like:

python rsadecode.py

Enter filename to decode: message

>> OK! Encoded msg: 104, 97, 104, 97, 44, 32, 73, 32, 99, 97, 110, 32, 101, 110, 99, 111, 100, 101, 32, 115, 116, 117, 102, 102, 32, 116, 104, 97, 116, 32, 121, 111, 117, 32, 99, 97, 110, 110, 111, 116, 32, 100, 101, 99, 111, 100, 101

Enter key file: privatekey.txt

>> I will read keys from: privatekey.txt

>> n= 7275125959881829066786422208675011267193 d= 4850083973254552711076483371693503015193

>> decoded message: haha, I can encode stuff that you cannot decode

Everything works out, and you can see that the original message is back. Now suppose we do not have the correct private key file. We will use a file with some junk numbers called junkkey.txt. Here is what happens:

python rsadecode.py

Enter filename to decode: message

>> OK! Encoded msg: 104, 97, 104, 97, 44, 32, 73, 32, 99, 97, 110, 32, 101, 110, 99, 111, 100, 101, 32, 115, 116, 117, 102, 102, 32, 116, 104, 97, 116, 32, 121, 111, 117, 32, 99, 97, 110, 110, 111, 116, 32, 100, 101, 99, 111, 100, 101

Enter key file: junkkey.txt

>> I will read keys from: junkkey.txt

>> n= 109831120598125401981124 d= 129841928492184128728711

>> decoded message: X�X�h�h’��hq�’7�qh-xxhhX�hhM7-h’���7hh�q’7�q

Message Decrypting Tasks

Now that everything is in place and you have a factoring utility, go ahead and attempt to crack the following messages: