Key to the sea – CTF Write Up

By | May 1, 2021

It takes 14 minutes to read this blog.
Updated May 2nd, 2021: adding some clarity on several parts.

It’s been a long time since I played my last CTF. Most of the CTFs I played were on the weekend and it was only 24 hours. But this CTF is special because it runs internally. Kinda fun, kudos to the committee who prepared this environment for a beginner like me. This writing is intended for those who are still in beginner’s level. Please feel free to comment below.

The Challenge: Reverse Engineering 200 pts

This is the most challenging task that made me think harder than other (simple) questions. There’s no clue at all, just simply links to resources and one line prologue: “Why don't you play with us ;-)

Perhaps, just like any other Reversing Task, this is quite straightforward: Reverse Engineering the binary to see what’s inside and use the information to the other files. First, let’s see what are the file types for those files using command file.

My guess at that time was, the message.txt was the message that need to be decoded/decrypted/reverse processed by using the C program provided, after the {enter_key_here} can be solved by understanding what process inside the rev_1 file.

The Executable File

So, I started to see what is the process inside the executable file rev_1. As you can see, the binary file is an executable file in Apple macOS environment only. Mach-O is the file type of executable file only for Apple macOS, it is not ELF for Linux OS or PE32 for Microsoft Windows OS. Don’t forget to pay attention to the processor type. Above sample is using an x86_64 type processor, the regular Intel/AMD processor. Sometimes, when playing CTF, we need to prepare our environment to anticipate binary files like this. You need to have your own hardware to run the binary for specific processor type, or find a good emulator or VM to run the binary.

Before you execute the file, make sure that you already changed the execution bit for the file. Otherwise, the file would not be able to be executed.

Executing the rev_1 file, it gave me lots of questions need to be answered, that might lead to give the key to the C file. It matches with the title of the task and the category of the task. It also matches with the First Hints provided in the platform.

This makes me confident that I am in the correct direction. But executing the binary by providing the correct answers was not giving me an informative clue or key. It only gave me a cryptic message: “The end is the beginning

Perhaps this is the time to do the reversing task. Using ghidra, I can’t simply understand what was the routine inside this executable file. I found a lot of string dyLib inside this executable file. My guess at that time was this binary file is a python script that was packed as an executable file using py_compile, and the python script part was compressed/decoded/hidden on the certain part of the Mach-O.

So, I went another route to find how to decompile or unpack the file to give me the idea on the behavior of the executable files. Perhaps there are one or two arguments needed to be passed when executing the binary file, and the application might behave differently based on supplied arguments. In Linux OS, I usually do the binary dynamic analysis using strace or ptrace command. But in my macOS, there’s no strace or ptrace. Hence, I followed the tutorial from here on how to do debugging Mach-O application via CLI.

Instead, I became confident that by observing the binary too deep made me following the rabbit hole when found several cpython strings. These are the indicators that you went too far from the task. Most of the time, decompiling cpython usually exist on the hard level and with bigger point. This level is unknown, but the score is “only” 200, which I guessed that this is a medium level task.

So, back to the result. After translating 5 binary numbers to ASCII chars, I was assuming that the 5-chars word is the key to the C program rev_2. But not sure with the sentence: “First milestone” and “The end is the beginning“. Perhaps it has to be in reversed order? Not sure. Let’s deal with the C Program part.

The C Program

I did a quick observation of the program, and it performed some custom “encryption” and the players need to “decrypt” the algorithm to see the flag. Because the message.txt is in a non-readable state, and it satisfies the Confidentiality in CIA Triangle (Confidentiality-Integrity-Availability) that only the one who has the key can read the message. Take a look at the source code in rev_2.c file.

#include <stdio.h>
#include <stdlib.h>
typedef unsigned char Byte;
int main(int argc, char** argv)
{
  if (argc < 3) {
    printf("Use: %s <input_file> <output_file>\n\n", argv[0]);
    exit(-1);
  }
  FILE* in_file = fopen(argv[1], "r");
	FILE* out_file = fopen(argv[2], "w+");
	Byte letter;
	Byte temp;
  while(fscanf(in_file, "%c", &letter) != EOF)
  {
    temp = (letter >> 7) & 1;
    letter = letter << 1;
    letter = letter | temp;
    letter = ~letter;
    
    letter = letter ^ {Enter_key_here};
    
    fprintf( out_file, "%c", letter );
  }
	fclose(in_file);
	fclose(out_file);
}

If you are not familiar with C Programming Language, that’s okay. It is easy to understand, as long as you have at least one good understanding in coding or scripting. In C, the main part is between int main() { until closing bracket }. This program, after getting compiled as binary, takes 2 additional arguments. The first argument that is argv[0] is the executable filename. The second argument that is argv[1] is the input file, which will be stored as variable in_file. While the last argument that is argv[2] is the output file, which will be stored as variable out_file.

As in_file is for input, the operation is only read, denoted with “r“. While out_file is the output with write operation denoted with “w+“. The “+” sign means that if the filename already exists, then the file size will be truncated into 0 bytes, to make sure the file is empty.

Now, here comes the important part: encryption loop. This process happens between while() { until closing bracket }. In the while loop, the programming block between { and } will be repeated as long as the condition is still satisfied. In this case, the function fscanf will read one character (denoted with "%c") at the time until EOF (end of file).

Bitwise operations in C Language

Before we observe the while loop, it is important to understand what operators are involved in the while loop. Complete list of C operators can be searched online, one of them is here. And it also useful to observe the ASCII table to see what hexadecimal or binary representation of each character.

  • Shift right, >>, bitwise operation to shift the bit to the right direction. For example, the “G” character that was part of the key mentioned above, is represented in binary as 01000111. If in the program written as “letter >> 7“, means that the bit position will be moving to the right direction: from
    01000111, to
    00100011,
    00010001,
    00001000,
    00000100,
    00000010,
    00000001, and ends in
    00000000.
    Imagine a 1-bit with value 0 is pushing the other bits from left to the right direction.
  • Shift left, <<, bitwise operation to shift the bit to left direction. If the “G” letter was “letter << 1” then the binary moved from 01000111 to 10001110. Just imagine a 1-bit with value 0 is pushing the other bits from right to the left direction.
  • AND, &, bitwise operation AND. This is need to remembered that
    1 & 1 = 1,
    1 & 0 = 0,
    0 & 1 = 0,
    0 & 0 = 0
  • OR, |, bitwise operation OR. This is also need to remembered that
    1 | 1 = 1,
    1 | 0 = 1,
    0 | 1 = 1,
    0 | 0 = 0
  • COMPLEMENT, ~, bitwise operation 1’s or One’s COMPLEMENT, means that invert all 1 to 0, and 0 to 1. So letter “G” in binary in 01000111 is having complement of 10111000
  • XOR, ^, bitwise operation XOR. This is also need to remembered that
    1 ^ 1 = 0,
    1 ^ 0 = 1,
    0 ^ 1 = 1,
    0 ^ 0 = 0

Understanding the encryption loop

By having a good understanding of bitwise operation in the loop, we can understand how the message.txt was created. The file message.txt was created by compiling the rev_2.c, and execute it. How do we know that this rev_2.c is the program to generate the message.txt? Well, it’s a hunch 😆But there’s another hint provided for this task that shows the argv[0] is rev_2 (this is the filename that being used as the output while compiling the C program), argv[1] is redacted, but it is to denote that this is the plaintext that will be encrypted, and argv[2] is message.txt denotes the ciphertext as the output of the rev_2 program.

By understanding the flow on how the message.txt was created, we can create a reverse process of the loop to make the ciphertext become the plaintext again. As the analogy, if you were asked : (X + 4) * 2 = 10, then you know that the reverse process of addition is subtraction, and multiplication is division. Hence, X = 10 / 2 – 4 (remember PEMDAS?). So, if we know that the operation was

plaintext [ops] key = message.txt,

then to get the the plaintext we can do

plaintext = message.txt [rev_ops] key

The [ops] part is actually part of these steps inside the while loop:

  1. temp = (letter >> 7) & 1;
    create a variable temp from the current letter, but right shifted to 7 position, and do AND 1.
  2. letter = letter << 1;
    current letter is left shifted to 1 position…
  3. letter = letter | temp;
    … then OR with temp
  4. letter = ~letter;
    … then COMPLEMENT…
  5. letter = letter ^ {Enter_key_here};
    … then XOR with the key.

If you notice, that on the step 1, for all visible and printable ASCII characters ranging from SPACE (denoted as 0x20 in Hexadecimal, or 00100000) until DEL (denoted as 0x7F in Hexadecimal or 01111111), you will notice that any visible characters that might be on the plaintext (I assume that the flag was written in ASCII and readable characters only), then step 1 is always storing the 0 value to the variable temp. Which then make all the steps above can be replaced with these steps:

  1. letter = letter << 1;
    current letter is left shifted to 1 position…
  2. letter = letter | 0;
    … then OR with 0
  3. letter = ~letter;
    … then COMPLEMENT…
  4. letter = letter ^ {Enter_key_here};
    … then XOR with the key.

The reverse operations of the encryption loop

To get the [rev_ops], then we need to do the process in reversed order, similar with the arithmetics example above.

  1. letter = letter ^ {Enter_key_here};
    Read the current char, XOR with the key (the reverse of XOR is XOR),
  2. letter = ~letter;
    … then COMPLEMENT it… (the reverse of 1’s COMPLEMENT is 1’s COMPLEMENT)
  3. letter = letter | 0;
    … then OR with 0 … (the reverse of OR is OR)
  4. letter = letter >> 1;
    … then right shift to 1 position (the reverse of SHIFT LEFT is SHIFT RIGHT)

That makes the reverse of rev_2.c become this C code below. Oh by the way, I am a 1TBS programmer, so I changed it accordingly to match my style 😁

#include <stdio.h>
#include <stdlib.h>
typedef unsigned char Byte;
int main(int argc, char** argv) {
  if (argc < 2) {
    printf("Use: %s <input_file> <output_file>\n\n", argv[0]);
    exit(-1);
  }
  FILE* in_file = fopen(argv[1], "r");
	Byte letter;
	Byte temp;
  while(fscanf(in_file, "%c", &letter) != EOF) {
    letter = letter ^ {Enter_key_here};
    letter = ~letter;
    letter = letter | 0;
    letter = letter >> 1;
    fprintf( out_file, "%c", letter );
  }
	fclose(in_file);
}

Let’s get the flag!

But wait, what is the key {Enter_key_here} for this encryption?

The key to the sea

Well, that’s the good question. How do we know which key for this process? Remember the 5-letter word we had by executing the binary rev_1? Yes! That’s the confusing part. So, I thought that that word becomes the key for the encryption process above. Since the length of the message in message.txt (84 bytes or 84 chars) is longer than the key (5 bytes or 5 chars), I extended the key to match the message size. So, for example, the plain text is “This is plain” and the key is “KEY”. Then the extended key is as follow:

So, if the [ops] in only XOR, then the message at char-nth will be XOR-ed with Key at char-nth, i.e. T to the K, h to the E, i to the Y, [space] to the K, so on and so forth. Then I implemented it on my code, and compile it using gcc in macOS.

gcc -o rev_2 rev_2.c
where rev_2 will be the binary file after the compilation of rev_2.c is success.

…. of course the output was incorrect. I thought that the other sentence is the hint: “The end is the beginning“, then I use the reversed order key. But the result is still the same: the output was not readable.

Then I was thinking of other things that might be missed. When the executable file printed “First milestone.”, I was expecting the “second milestone”. But how do I reach that milestone? Reversing the cpython is not my things right now, and I was trying to extract the real source code of the rev_1 using some tutorials from the web. But still, none to avail. Then I gave up and went to sleep.

While doing my daily tasks in the morning of following day, I noticed that there’s the final hint broadcasted to the players. I wasn’t interested because I was in the middle of doing the work. When I’m done, and take a look at the hints, I was surprised 😯 .

What’s this? Some kind of jokes? From which hints or clues that the key is only one single char? Doing some modification, I finally get the output, but it is already 9 minutes late. The CTF portal was closed at 3pm, and I didn’t get any score for this task.

#include <stdio.h>
#include <stdlib.h>
typedef unsigned char Byte;
int main(int argc, char** argv) {
  if (argc < 2) {
    printf("Use: %s <input_file>\n\n", argv[0]);
    exit(-1);
  }
  FILE* in_file = fopen(argv[1], "r");
	Byte letter;
	Byte temp;
  char key[1]="G";
  int i = 0; /* for key counter */
  while(fscanf(in_file, "%c", &letter) != EOF)   {
    letter = letter ^ key[0];
    letter = ~letter;
    letter = letter | 0;
    letter = letter >> 1;
    printf("%c",letter);
  }
	fclose(in_file);
}

Conclusion

Understanding programming languages is essential for me to do my job. Sometimes, I have to reverse engineering the binary of malwares to understand the behaviour, or deobfuscating the malware payload in PowerShell, or VBScript. I play CTF just for knowing what other programming techniques that might be used by programmer to hide the source code. Above example made me think that I have to learned on how to do reverse engineering of python script that was packed with cpython.

Even when we live in the world of subscription to doing reverse engineering-as-a-service, I think having this skill is still a bonus point for IT security personnel. Perhaps to you need to justify whether the detection engine of one’s service is not working well if the binary was simply applying a simple software packing. Or perhaps you want to justify whether the dynamic analysis engine is already satisfies your needs, because what you did manually have been provided by the service provider.

Hope you enjoy and learn something new from this blog.

Leave a ReplyCancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.