
Generating Random String Of Seedable Data

I'm looking for a way to generate a random string of n bytes in Python, similar to the os.urandom() method but with a way to seed the data generation. So far I have: de

Solution 1:

If you have numpy available, its numpy.random module (numpy's counterpart to the standard random module) provides a function you might consider:

numpy.random.bytes(length)

It is very fast:

$ python -mtimeit "import numpy" "numpy.random.bytes(1<<30)"
10 loops, best of 3: 2.19 sec per loop

That's for 1GiB.

And you can seed it with numpy.random.seed.
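As a minimal sketch of the seeding step, assuming numpy is installed: the helper name seeded_numpy_bytes and its parameters are made up for illustration, while numpy.random.seed and numpy.random.bytes are the calls named above. Note that numpy.random.seed resets numpy's global generator state.

```python
import numpy as np

def seeded_numpy_bytes(n, seed):
    """Return n pseudorandom bytes determined by `seed` (hypothetical helper)."""
    # Reseed numpy's global generator, then draw n bytes from it.
    np.random.seed(seed)
    return np.random.bytes(n)

first = seeded_numpy_bytes(16, seed=1234)
second = seeded_numpy_bytes(16, seed=1234)
print(first == second)  # the same seed gives the same bytes
```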

Solution 2:

NEW ANSWER

After re-reading the OP's question, I now understand that it is about raw bytes, not a string of ASCII characters.

So, how about this?

import random
gl = 0
def randBytes(size):
    global gl
    # xrange is Python 2; use range on Python 3
    nr = bytearray(random.getrandbits(8) for _ in xrange(size))
    gl = nr
    return

%timeit randBytes(1000000)
1 loops, best of 3: 262 ms per loop

In [27]: gl.__sizeof__()
Out[27]: 1087223
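The function above stores its result in a global and returns nothing, and it does not take a seed. A possible Python 3 variant that returns the bytes and accepts a seed could look like the sketch below. The name rand_bytes and the seed parameter are illustrative assumptions, not part of the original answer.

```python
import random

def rand_bytes(size, seed=None):
    """Return `size` pseudorandom bytes; a given seed always yields the same bytes."""
    # A private Random instance leaves the module-level generator untouched.
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))

print(rand_bytes(8, seed=42) == rand_bytes(8, seed=42))  # same seed, same bytes
```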

OLD ANSWER HERE

import random
import string
def generateRandomString(size):
    return(''.join(random.choice(string.ascii_letters) for i in range(size)))

Notes:

One ASCII character is 1 byte, so "size" denotes both the length of the string and its size in bytes.

You can use string.ascii_uppercase or string.ascii_lowercase to restrict the output to only uppercase or only lowercase letters.

random.seed can be used to specify the seed.

random.seed([x])

Initialize the basic random number generator. Optional argument x can be any hashable object. If x is omitted or None, current system time is used; current system time is also used to initialize the generator when the module is first imported. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).

So you could have:

import random
import string
def generateRandomString(size, seed=None):
    # Reseed the shared generator only when a seed is given
    if seed is not None:
        random.seed(seed)
    return(''.join(random.choice(string.ascii_letters) for i in range(size)))

Timings:

In [30]: %time generateRandomString(1000000)
Wall time: 554 ms
<and then output>
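Calling random.seed inside the function reseeds the generator shared by the whole random module. One way to avoid that side effect, sketched here under the assumption of Python 3.6 or later (for Random.choices), is to give each call its own random.Random instance. The name generate_random_string is illustrative only.

```python
import random
import string

def generate_random_string(size, seed=None):
    """Return `size` ASCII letters; a given seed always yields the same string."""
    # Each call gets its own generator, so other users of `random` are unaffected.
    rng = random.Random(seed)
    return ''.join(rng.choices(string.ascii_letters, k=size))

print(generate_random_string(12, seed="demo") == generate_random_string(12, seed="demo"))
```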

Solution 3:

As Dan D. says, letting numpy generate your bytes in one hit at C speed is going to be way faster than producing them one at a time at Python speed.

However, if you don't want to use numpy you can make your code a little more efficient.

Building a string by concatenation, e.g. buf = buf + chr(random.randint(0,255)), is very slow, since a new buf has to be allocated on every loop (remember, Python strings are immutable). The usual technique in Python for building a string from substrings is to accumulate the substrings in a list and then use the str.join() method to combine them in one go.

We can also save a little bit of time by pre-generating a list of our 1 byte strings rather than calling chr() for every byte we want.

from random import seed, choice

allbytes = [chr(i) for i in range(256)]

def random_bytes(n):
    bytes = []
    for _ in range(n):
        bytes.append(choice(allbytes))
    return ''.join(bytes)

We can streamline this and make it slightly more efficient by using a list comprehension:

def random_bytes(n):
    return ''.join([choice(allbytes) for _ in range(n)])

Depending on how you intend to use these random bytes, you may find it useful to put them into a bytearray or bytes object.

Here's a variant based on cristianmtr's new answer:

from random import getrandbits

def random_bytes(n):
    return bytes(bytearray(getrandbits(8) for _ in xrange(n)))

You could use str() in place of bytes(), but bytes() is better for Python 3, since Python 3 strings are Unicode.
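On Python 3, chr(i) produces a Unicode string rather than a byte, so the lookup-table approach needs single-byte bytes objects and a bytes join. The following is a rough sketch of that adaptation. The seed parameter and the use of a private Random instance are assumptions added for illustration.

```python
from random import Random

# Pre-built table of the 256 possible single-byte values.
ALL_BYTES = [bytes([i]) for i in range(256)]

def random_bytes(n, seed=None):
    """Return n pseudorandom bytes picked from the lookup table."""
    rng = Random(seed)  # private generator, seeded if a seed is given
    return b''.join([rng.choice(ALL_BYTES) for _ in range(n)])

print(random_bytes(8, seed=5) == random_bytes(8, seed=5))  # same seed, same bytes
```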

Solution 4:

Python 3.9 random.randbytes + random.seed

Docs: https://docs.python.org/3.9/library/random.html#random.randbytes

main.py

#!/usr/bin/env python
import random
import sys
random.seed(0)
sys.stdout.buffer.write(random.randbytes(8))

writes 8 pseudorandom bytes to stdout with fixed seed of 0:

./main.py | hd

outputs:

00000000  cd 07 2c d8 be 6f 9f 62                           |..,..o.b|
00000008

Its definition in CPython is simply:

def randbytes(self, n):
    """Generate n random bytes."""
    return self.getrandbits(n * 8).to_bytes(n, 'little')

It is converted to a Bash one-liner and benchmarked against /dev/urandom in the question "Something similar to /dev/urandom with configurable seed?"
