Discussion on PRNG Sequence Guessing Challenge

3 years ago+ 2 comments

There is no editorial, the hints below are somewhat cryptic and even the problem description seems to have changed over time => so I take the freedom to publish some notes (please have merci ;-)

state the problem

Given a sequence: with unknown seed (48 bit), and 10 successive samples . Determine the next 10 samples.

I will refer with 'lcg' (affine linear congruence generator, nerd speek for add/multiply/mod) to both the sequence and the generating formula.
(note that the reversed sequence is given by , where INV = pow(0x5DEECE66D, -1, 2**48) . )

timing

One can try to reconstruct the (first) seed. This is time critical in Python.
As a rule of thumb the time restrictions allow for about complex operations or simple operations, I'll call that the cost.

information

each sample gives us about 10bit information.

the 'easy' solution ...

missing in the description

The seeds for the testcases were generated from timestamps in seconds (date range 2000 - 2014), xor 0x5DEECE66D .
That is seed is 32bit (35bit after xor) .

[note that the lcg is iterated 1 time before taking samples]

determine the lower part of the seeds

Our samples only depend on the higher bits of the seeds, we have no information from the 17 bits to the right. However, as 1000 is divisible by 8, the rightmost 3 bits of the sample and those of seed[bit47..17] are identical. That means, we know seed[bit19..17] of every iteration of the lcg.
So we can brute force the lower part of the initial seed by taking any 20bit value, iterate and compare those 3bit. cost: , information: 20bit unknown, 3bit per sample: 7 samples sufficient.

We can even start with (3bit from first sample)+17 arbitrary bits - cost: , information as above.

note that this works independent of whether seed is 48bit or 35bit.

determine the higher part of the seeds

we know the lower 20 bits of all seeds by now.
As the initial seeds in the testcases are 35bit (32bit before xor), we are only missing 15 bit.
this can easily broken by brute force at a cost of :
take any combination of 15 bits + known lower part, iterate lcg , take samples and compare to given.

a little harder ...

It is not very satisfying to depend on some hidden weakening of requirements, ie. our algorithm should be able to handle any 48bit seed.

go for the higher part of the seeds

the lower part of the seeds can be determined as above with cost .

we can brute force the part seed[bit47..17] with values: sample + i*1000
at a cost of: which is just fast enough.
As above we iterate the lcg with this seeds, take samples and compare.
Information perspective: 28 bit of seed unknown, 7 bit per sample: 4 samples should suffice.

other possibilities

I would have liked to restrict the range for brute force even more, by looking at two consecutive seeds.
Did some calculations, but had no breakthrough.
Perhaps it may even be possible to calculate the seed from the samples without brute force - beyond my knowledge, perhaps some number theory wizards drop in here....