I tried a couple problems at UIUCTF last weekend, crypto and pwn mostly. It had an interesting mixture of simple and hard problems with not much in between. Overall, it was both fun and educational.
First, the easy side:
This task was about undoing an 8-character ROR+XOR encryption (XOR with some bit rotation mixed in as well). It was known that the plaintext contains a certain 8-character string, and that the password was composed of letters only, so this was completely straightforward (only 100 points).
At its heart, the job was to reconstruct an integer secret given its remainders modulo a bunch of small primes, i.e., elementary number theory (Chinese remainder theorem). You had to succeed 5 times to get the flag. The only twist was that the secret was larger than the product n of all primes you had, so you could only get secret = k n + s0 where k was unknown and had to be guessed. You had up to 250 guesses per secret. It was still pretty simple because you could just request the maximum number of primes (which was 9), and then guess systematically k = 0, 1, 2, 3,… and get the right one quickly enough.
The challenge was amusing in two aspects:
there was quite some dead code in there (checks that were always true, for example), and
the challenge code also contained all relevant routines you needed to break it,
such as this oneliner for modular inverse:
glue = lambda A, n, s=1, t=0, N=0: (n < 2 and t % N or glue(n, A % n, t, s - A//n * t, N or n), -1)[n < 1]
# this computes (1/A) mod n
and few-liner for CRT reconstruction:
def construct_key(shares):
glue = lambda A, n, s=1, t=0, N=0: (n < 2 and t % N or glue(n, A % n, t, s - A//n * t, N or n), -1)[n < 1]
mod = prod([m for s, m in shares])
secret = sum([s * glue(mod//m, m) * mod//m for s, m in shares]) % mod
return secret
# reconstructs integer from (remainder, modulus) pairs
It was also educational because it turned out that number.getPrime(8) in Crypto.Util may return not only any of the 8-bit primes, but the 9-bit prime 257 too.
This was listed as a pwn problem but to me remote rev fits better because the binary had an otherwise dead flag printing routine you simply had to invoke with the correct inputs. The difficulty was easy - there were lots of hints throughout (only 150 points). For the heck of it we posted a writeup on CTF Time.
Now let us turn to the hard side:
This was about passing 500 captchas in 10 minutes. Each captcha had 5 minecraft runes that were randomly distorted somewhat plus there was a random line drawn across the image. You could also find a hint to a URL that gave you about 55,000 samples with known solutions. A very intriguing problem that totally smelled like a machine learning exercise (so I skipped it :p).
I would have thought that one first needs image processing to find and remove the line and break the captcha into characters, and then use a network to recognize each character… But it turns out that one network can do all of that directly. Wow.
In this problem you could upload an image that was then rendered as background on a TI-84 calculator (more precisely, a TI-84 emulator). The image had to be in the .8ca format, and you could not just fool it by just changing file extension (so at least some rudimentary format checks were in place). Eventually I noticed that the challenge description had a link to the source code (in C), and after looking at the code for some time I think this must have been a heap overflow bug in this function:
task_t *task_init() {
unsigned i;
ti_var_t img = 0;
uint8_t *archived_data, *raw_data;
uint16_t size;
task_t *task;
for(i = 0; i < 10 && !img; i++) {
ti_CloseAll();
img = ti_OpenVar(image_tokens[i], "r", TI_IMAGE_TYPE);
}
if(!img)
return NULL;
//The toolchain gives the wrong data offset for some reason.
archived_data = (uint8_t*)ti_GetDataPtr(img) - 0x3C;
size = *(uint16_t*)archived_data;
raw_data = malloc(size);
task = malloc(sizeof(task_t));
task->filters.count = 2;
task->filters.arr = default_filters;
memcpy(raw_data, archived_data, SIZE);
task->img = image_init(raw_data);
return task;
Here, size is determined from the image data, while SIZE is a hardcoded value for a .8ca image defined as
#define SIZE 0x56e5
So a suitably doctored file would get you size < SIZE, and memcopy later would write beyond the region allocated by malloc.
That said, I had no immediate idea how I would get execution to code I put on the heap, so I skipped this one too. Ever since I have been checking CTF Time for a writeup, with no success :/… but one appeared just now at last and it seems to confirm my hunch. Looking forward to learning a trick or two when I get to reading it in detail.
I spent most time in earnest on Nookcrypt. This was a blackbox crypto problem accessible via netcat, where you got a menu and had two options:
There was no code, but on connection it printed “Here we use fancy elliptic curve encryption”, and all ciphertexts where pairs of integers (x,y). So the natural guess was that the problem simply converts the plaintext to an integer k and gives you k*G, where G is the generator of some curve. Results for both options seemed to be the same across connections, so one could assume that the same curve was used each time. By sending in plaintexts \x01, \x02, \x03, … one could verify that this was indeed so (\x01 gives the generator, in fact), and with a couple plaintext-ciphertext pairs one could deduce the Weierstrass parameters a, b, p for the curve. The encryption of “hello world” also checked out, so this pretty much looked like an elliptic curve discrete log challenge.
One way to infer p is to take the equation of the curve for each point known
y1**2 = x1**3 + a * x1 + b mod p
y2**2 = x2**3 + a * x2 + b mod p
...
yn**2 = xn**3 + a * xn + b mod p
which are linear equations for a and b. So from any pair of these equations, you can express a or b or b/a (all mod p), e.g.,
a = (y1**2 - x1**3 + x2**3 - y2**2) / (x1 - x2) mod p # from eqns 1 and 2
Results coming from different pairs of equations must agree, so their difference must be 0 mod p, i.e., without the mod you must get a multiple of p. (If two ratios agree mod p, e.g., C/D = E/F mod p, then C*F = E*D mod p, so C*F - E*D = const*p.) You can then either factor that multiple to get candidates for p (p must be a prime) or just use several equation pairs and then gcd the results to extract the common factor in them (which must ultimately be p).
The parameters were, in fact, that of the standard curve sec128r2, which is a 128-bit curve with generator of only 126-bit order. That is fairly low for security purposes. So this all suggested a constrained ECDLOG problem where k corresponds to ASCII text (the flag), which I hoped may be breakable with enough wits and some resources. 126 bits mean only up to 16 characters, so as a first step I ruled out any such string of the form uiuctf{…}. Given that, I thought that k must correspond to only the string inside the flag… and spent a lot of time sweeping through up to 11 characters with a meet-in-the-middle attack, trading off cycles for as much memory as I could afford. I also made the math more and more optimal, switched from Python to C/C++ (huge speedup), converted to an Edwards curve for further gains (doable because the order of sec128r2 is divisible by 4), etc. It was a lot of fun but in the end all in vain regarding solving the problem :/
As it turned out afterwards, the challenge hung on a crucial hint posted on Discord during the CTF that said “cosmic rays have corrupted the prime”. The most natural interpretation of this is that in the corrupted case the server is doing EC math but modulo a value q that is different from p. Indeed, making lots of requests to the server, one could verify that with nonnegligible probability option 1) returned “err” instead of ciphertexts, and with roughly 1 in 100 chance one got different ciphertexts for the FLAG and “hello world” from the usual results. From three points known - generator, FLAG, hello world - in such abnormal cases, one could repeat the technique discussed above to find the modified prime q (a is known, so the gcd method is already viable with just three points). Once q is known, one can determine b as well (it will not be the original sec128r2 value because the same generator coordinates must satisfy the curve equation mod q now). In general the corrupted value q was not a prime but that was the point!
Perfoming the prime factorization q = q1 * q2 * … * qn, one can take the coordinates of each of the three points modulo any of the qi, which is mathematically equivalent to using curve parameters ai = a mod qi, bi = b mod qi, with prime qi. I.e., one is working over a smaller number field. The order ni of the generator on such curves tends to be much less than the 126-bit order for the original curve, and when ni is smaller than ballpark 1e10, the ECDLOG problem can be done in seconds with SageMath. Solving the reduced DLOG only gives k mod ni, however - but repeating this step for many enough different qi values (which requires interacting with the server long enough to catch at least a handful of “cosmic ray” events) one has a big enough set of different ni to reconstruct the full k using the Chinese remainder theorem.
This was a really nice problem except for having to resort to a Discord hint… plus that in the end the flag was 263 bits long. That is way bigger than the sec128r2 group, which meant that the original ciphertext was undecryptable without the cosmic ray bug (solving the 126-bit DLOG would not have worked at all) :/