CS50x – Week 2

Written by on August 31, 2017 in Coding, CS50

Crypto

Cryptography is the practice of securing communications via encryption and decryption. Encryption is the method of taking plain text and using a key or secret to change the message into something else. Decryption is the method to reverse this and taking an encrypted message and using the secret key to reassemble the original plan text message.

Cryptography dates back to Julius Caesar who used it to communicate with his generals. The Caesar cipher as its known is changing the letter in a message by a fixed amount long the alphabet, so if it was 3 letters down, hello world becomes khoor zruog. Cryptography is now an important part of the computing world.

Arrays

A programming concept we didn’t cover last week is Arrays. An array is a contiguous block of data in memory. An array can be any data type and you can have one variable name to represent all the data in that array. This is extremely useful when you have to manipulate large amounts of data. In C you declare an array variable with square brackets [].

Strings

Strings are actually an array. An array of chars. Actually strings are more complex than that, but we’ll cover than in later weeks.

Ascii

Back in week 0 we touched briefly upon the ASCII system. The standard for mapping binary numbers to certain alphabetic characters.

A B C D E F G H I J K L M
65 66 67 68 69 70 71 72 73 74 75 76 77
N O P Q R S T U V W X Y Z
78 79 80 81 82 83 84 85 86 87 88 89 90
a b c d e f g h i j k l m
97 98 99 100 101 102 103 104 105 106 107 108 109
n o p q r s t u v w x y z
110 111 112 113 114 115 116 117 118 119 120 121 122

In C we are able to typecast between chars and integers. This means we can use 65 to display A and use c to represent 99.
There is a pattern to the ASCII characters in relation to the uppercase and lowercase characters, that is there is a difference of 32.
The C language offers several libraries to help us. ctype.h is one such library and contains a lot of useful functions for manipulation strings or chars. 2 functions are toupper() and tolower(). These functions will convert a lowercase char to uppercase and vice versa. In the string.h library contains one very useful function strlen() which retrieves the length of a string excluding the null byte. Since strings can be any length and strings are an array of chars and arrays are contiguous bits of data in memory, we need a way of knowing when one string ends and when another begins. In C we use \0 which is a null byte and used to denote the end of a string.

Command Line arguments

Our entry point in to the program is via a function called main(). It usually looks like this

int main(void){

Notice in the brackets it says void. This means the function does not take in any parameters or arguments. We can use arguments to give our program extra functionality. We add two variables to our main code in order to accept command line arguments.

int main(int argc, string argv[]){

By adding this we are now ready to accept command line arguments when running our code. argc will hold the number of arguments we have entered and argv will store in an array the arguments we’ve entered. argv[0] will always be the program name.
Another thing to note that in C functions must return something if they are given a type. The main() function is of type int and once the code executes it returns a value of 0. This will happen automatically if you know explicitly state it in your code. When you return from main() your program will exit. You can also use other values to return often for error check purposes. We can use it to avoid run-time errors and when debugging.

Problem Set 2

This week’s problem set requires us to complete 2 tasks.

  1. Implement either of:
    • Initials, less comfortable
    • Initials, more comfortable
  2. Implement any two (2) of:
    • Caesar, less comfortable
    • Vigenere, less comfortable
    • Crack, more comfortable

Initials

Initials asks us to implement a program that will print a person initials when they provide it. CS50 offered us 2 approaches to this task. The less comfortable approach assumes that the user will input names with upper and lowercase and without extra spaces or punctuation.  I decided to go with the more comfortable version which requires us to return the initials of a name even if the input is irregular, that is if the input had extra spaces or periods.

Implementing this program was quite straight forward. I knew I had to read in the given string and iterate through it and only print out the first character and the characters next to a space.

see source
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string name = get_string();
// Check to ses if string is valid
if (name != NULL)
{
// Find first character and capitalize
int index = 0;
while (name[index] == ' ')
{
index++;
}
printf("%c", toupper(name[index]));
// Iterate through the rest of the string looking for spaces and printing next Letter
for (int i = index + 1, n = strlen(name); i<n; i++)
{
if (name[i] == ' ' && name[i+1] != ' ' && name[i+1] != '\0')
{
printf("%c", toupper(name[i+1]));
}
}
printf("\n");
}
}
view raw initials.c hosted with ❤ by GitHub

 

We had to do 2 of the 3 tasks out of Caesar,Vigenere and Crack. I decided to do all three.

Caesar

Caesar requires us to implement a program that encrypts a message with Caesar’s cipher. As mentioned earlier Caesar’s cipher works by shifting each letter in a message by a number of places. So HELLO shifted along 1 place becomes IFMMP. This should wrap around so Z shifted 1 place becomes A. This program also requires that we use 1 command line argument which is the number of places to shift. The input for running the program is $ ./caesar n, where n is the input for places to shift. The program must also check to see we are entering the correct usage for arguments and return an error if we are not.

Implementing this cipher was interesting and although it didn’t prove much of a challenge I did learn a lot about manipulating strings. The key steps in this program are to get the message, check each letter and then shift its place accordingly. We check to see if its uppercase or lowercase and shift the letter down via its ascii code.

see source
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
//CHeck to see if the correct number of arguments have been entered.
if (argc != 2)
{
printf("Please enter the correct number of arguments\n");
return 1;
}
//Convert argument from string to int
int key = atoi(argv[1]);
//Get user input for plaintext
printf("plaintext: ");
string plaintext = get_string();
if (plaintext != NULL)
{
printf("ciphertext: ");
//Iterate through the string and convert each letter
for (int i = 0, n = strlen(plaintext); i < n; i++)
{
//Check for lowercase
if (islower(plaintext[i]))
{
printf("%c", ((((plaintext[i] - 97) + key) % 26) + 97));
}
//check for uppercase
else if(isupper(plaintext[i]))
{
printf("%c", ((((plaintext[i] - 65) + key) % 26) + 65));
}
//Print non-alpha characters as they are
else
{
printf("%c", plaintext[i]);
}
}
}
//Print new line to end the deciphered text.
printf("\n");
//Exit the program
return 0;
}
view raw caesar.c hosted with ❤ by GitHub

Vigenere

Vigenere requires us to implement a program that encrypts a message using Vigenere’s cipher. Vigenere’s cipher builds upon and improves Caesar’s cipher in that instead of shifting each letter a number of places, it uses another key or phrase where each key in the phrase is the number of places to shift. In Vigenere’s cipher A=0, B=1, C=2….Z=25. The key must be reused cyclically. For example encrypting the message HELLO with the key ABC would result in HFNLP. This program must also use a command line argument which is the key or phrase in which to encrypt a message.

Vigenere’s cipher is an improvement upon Caesar’s cipher so I reused some of the code although the key difference is you don’t shift each letter by some number, you now have to shift each letter by its corresponding key in the secret phrase. This meant we had to add a couple of extra steps to the code, mainly getting the key for the secret phrase, and using the key to shift to letters.

see source
#include <cs50.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
//CHeck to see if the correct number of arguments have been entered.
if (argc != 2)
{
printf("Please enter the correct number of arguments\n");
return 1;
}
else
{
for (int i = 0, n = strlen(argv[1]); i<n; i++)
{
if(!isalpha(argv[1][i]))
{
printf("Please enter the correct argument. Key must be alphabetic characters only.");
return 1;
}
}
}
//Get key from argument as a string and get the length
string key = argv[1];
int key_length = strlen(key);
//Get user input for plaintext
printf("plaintext: ");
string plaintext = get_string();
if (plaintext != NULL)
{
printf("ciphertext: ");
//Iterate through the string and convert each letter
for (int i = 0, j = 0, n = strlen(plaintext); i < n; i++)
{
//Get the key for this letter
int letter_key = tolower(key[j % key_length])-97;
//Check for lowercase
if (islower(plaintext[i]))
{
printf("%c", ((((plaintext[i] - 97) + letter_key) % 26) + 97));
j++;
}
//check for uppercase
else if(isupper(plaintext[i]))
{
printf("%c", ((((plaintext[i] - 65) + letter_key) % 26) + 65));
j++;
}
//Print non-alpha characters as they are
else
{
printf("%c", plaintext[i]);
}
}
}
//Print new line to end the deciphered text.
printf("\n");
//Exit the program
return 0;
}
view raw vigenere.c hosted with ❤ by GitHub

Crack

Crack requires us to implement a program that can crack passwords. We are given a hashed password and we must try to figure out what the password is using a brute force method.

I had a lot of fun with this program. The important thing to have learnt was how the crypt() function works. The crypt function uses a hashing algorithm to convert a plaintext password into nonsense password. The crypt function takes in 2 arguments, the plaintext password and a salt. A salt is like a key that the hashing function uses to create even more possible hashed passwords. This gives adds an extra layer of security as the salt could be almost anything thus increasing the number of tries you have to make to crack a password. Once you know how crypt() works the next step was trying to figure out a way to find the password. The pset does enforce a few rules to make it easier for us. First we can assume that any given password will not exceed 4 characters. Second we know that each password will only be alphabetic characters, upper and lowercase. Third the salt for these password is always 50. This tremendously narrows down the number of possibilities we have to check against. We are given a set of 10 hashed passwords for which we have to find the plaintext password. The best way I could think of to tackle this project was to use a 5 char array and iterate through it adding a letter at a time starting from a and ending at ZZZZ. This would give us a lot of possibilities to try but that is what a brute force method is.

It took a lot of trial and error but I finally managed to get all 10 passwords. The first obstacle I faced was that initially I only found the 4 letter passwords. This meant I was not properly checking against all the 1-letter, 2-letter and 3-letter passwords. The reason for this was because I was not adding a null char as one of chars in my character set. After I added a check for null bytes I was able to retrieve all 10 passwords.

It takes a long time to find the passwords and although the program does it’s job I believe there is a lot of room for improvement. I noticed with the way I’ve implemented the code I’m checking 1-letter, 2-letter and 3-letter passwords more than once thus leading to inflated number of tries. I can add a check to see if I’ve already checked a certain password to see if that decreases the number of tries. Another improvement I would like to implement is adding a timer to time how long it takes to crack a password. This would then allow me to modify my code to run more efficiently.

see source
#define _XOPEN_SOURCE
#include <cs50.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
//Check for correct number of arguments
if (argc != 2)
{
printf("Please enter the correct number of arguments.\n");
return 1;
}
else
{
//Initialise variables
char *hash; //
hash = argv[1];
unsigned long long tries = 0; // need to check a possible 8m combinations so we need a data type large enough
char salt[2] = {hash[0], hash[1]};
char pass[5];
char *result;
bool success;
char set[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\0";
//Knowing that the password is no longer than 4 chacters we iterate through each letter with every character
//to check every possible combination. By including the NULL character in our set above we can check
//for passwords less than 4 characters.
for (int p0 = 0; p0 <53; p0++)
{
pass[0] = set[p0];
for (int p1 = 0; p1 < 53; p1++)
{
pass[1] = set[p1];
for (int p2 = 0; p2 < 53; p2++)
{
pass[2] = set[p2];
for (int p3 = 0; p3 < 53; p3++)
{
pass[3] = set[p3];
pass[4] = '\0';
result = crypt(pass,salt); //Encrypt the combination
success = strcmp(result,hash) == 0; //Check for a match
tries++;
//Print to screen the combination we're trying and how many times we've tried.
printf("Trying: %s\n Tries: %llu\n", pass, tries);
if (success)
{
printf("The password is %s\n", pass);
return 0;
}
}
}
}
}
if(!success)
{
printf("The password is has not been found!");
}
}
return 0;
}
view raw crack.c hosted with ❤ by GitHub

 

That was week 2 and This is CS50!

Leave a Reply

Your email address will not be published. Required fields are marked *

Comments«