Generating Random String Of Seedable Data
Solution 1:
If you have numpy available, it has a version of the random module as numpy.random that contains this function that you might consider:
numpy.random.bytes(length)
It is very fast:
$ python -mtimeit "import numpy" "numpy.random.bytes(1<<30)"
10 loops, best of 3: 2.19 sec per loop
That's for 1 GiB.
And you can seed it with numpy.random.seed.
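A minimal sketch of seeding and generating in one step, assuming numpy is installed. The helper name seeded_bytes is hypothetical (it is not part of numpy); it uses a private numpy.random.RandomState instance instead of numpy.random.seed so that the module-level generator is left untouched.

```python
import numpy as np


def seeded_bytes(length, seed):
    """Return `length` pseudorandom bytes that depend only on `seed`."""
    # A private RandomState keeps this independent of numpy's global generator.
    rng = np.random.RandomState(seed)
    return rng.bytes(length)


if __name__ == "__main__":
    a = seeded_bytes(16, seed=1234)
    b = seeded_bytes(16, seed=1234)
    print(a == b)  # the same seed reproduces the same bytes
    print(len(a))
```

Calling numpy.random.seed(1234) followed by numpy.random.bytes(16) should work just as well if sharing global state is not a concern.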
Solution 2:
NEW ANSWER
After re-reading the OP's question, I understand now that it's about raw bytes, not a string of ASCII characters.
So, how about this?
import random
gl = 0
def randBytes(size):
    global gl
    nr = bytearray(random.getrandbits(8) for _ in xrange(size))
    gl = nr
    return
%timeit randBytes(1000000)
1 loops, best of 3: 262 ms per loop
In [27]: gl.__sizeof__()
Out[27]: 1087223
OLD ANSWER HERE
import random
import string
def generateRandomString(size):
    return ''.join(random.choice(string.ascii_letters) for i in range(size))
Notes:
One ASCII character is 1 byte, so "size" denotes both the length of the string and its size in bytes.
You can use string.ascii_uppercase or string.ascii_lowercase to get only uppercase or only lowercase letters.
random.seed can be used to specify the seed.
random.seed([x])
Initialize the basic random number generator. Optional argument x can be any hashable object. If x is omitted or None, current system time is used; current system time is also used to initialize the generator when the module is first imported. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).
So you could have:
import random
import string
def generateRandomString(size, seed=None):
    if seed is not None:
        random.seed(seed)
    return ''.join(random.choice(string.ascii_letters) for i in range(size))
Timings:
In [30]: %time generateRandomString(1000000)
Wall time: 554 ms
<and then output>
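The seeded version can also be sketched without reseeding the module-level generator, which other code may be relying on. This is a variant under stated assumptions, not the answer's code: the name seeded_ascii_string is hypothetical, and it creates a dedicated random.Random instance per call.

```python
import random
import string


def seeded_ascii_string(size, seed=None):
    """Return `size` random ASCII letters; reproducible when `seed` is given."""
    # A local Random instance leaves the global `random` state alone.
    # With seed=None it is seeded from the OS or the current time.
    rng = random.Random(seed)
    return ''.join(rng.choice(string.ascii_letters) for _ in range(size))


if __name__ == "__main__":
    print(seeded_ascii_string(20, seed=7) == seeded_ascii_string(20, seed=7))
```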
Solution 3:
As Dan D. says, letting numpy generate your bytes in one hit at C speed is going to be way faster than producing them one at a time at Python speed.
However, if you don't want to use numpy you can make your code a little more efficient.
Building a string by concatenation, e.g. buf = buf + chr(random.randint(0,255)), is very slow, since a new buf has to be allocated on every loop (remember, Python strings are immutable). The usual technique in Python for building a string from substrings is to accumulate the substrings in a list and then use the str.join() method to combine them in one go.
We can also save a little bit of time by pre-generating a list of our 1-byte strings rather than calling chr() for every byte we want.
from random import seed, choice
allbytes = [chr(i) for i in range(256)]
def random_bytes(n):
    # named "chunks" to avoid shadowing the built-in bytes
    chunks = []
    for _ in range(n):
        chunks.append(choice(allbytes))
    return ''.join(chunks)
We can streamline this and make it slightly more efficient by using a list comprehension:
def random_bytes(n):
    return ''.join([choice(allbytes) for _ in range(n)])
Depending on how you intend to use these random bytes, you may find it useful to put them into a bytearray or bytes object.
Here's a variant based on cristianmtr's new answer:
from random import getrandbits
def random_bytes(n):
    # xrange is Python 2; use range on Python 3
    return bytes(bytearray(getrandbits(8) for _ in xrange(n)))
You could use str() in place of bytes(), but bytes() is better for Python 3, since Python 3 strings are Unicode.
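For Python 3 specifically, the same idea might look like the sketch below. The name random_bytes_py3 and the seed parameter are assumptions added for illustration; bytes() accepts an iterable of integers in range(256) directly, so the intermediate bytearray is not needed.

```python
import random


def random_bytes_py3(n, seed=None):
    """Return n pseudorandom bytes; reproducible when `seed` is given."""
    rng = random.Random(seed)  # local generator, global state untouched
    # Each getrandbits(8) call yields an int in 0..255, i.e. one byte.
    return bytes(rng.getrandbits(8) for _ in range(n))


if __name__ == "__main__":
    data = random_bytes_py3(16, seed=42)
    print(len(data), data == random_bytes_py3(16, seed=42))
```

This still produces one byte per Python-level call, so it should not be expected to compete with numpy for large sizes.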
Solution 4:
Python 3.9 random.randbytes + random.seed
Docs: https://docs.python.org/3.9/library/random.html#random.randbytes
main.py
#!/usr/bin/env python
import random
import sys
random.seed(0)
sys.stdout.buffer.write(random.randbytes(8))
writes 8 pseudorandom bytes to stdout with fixed seed of 0:
./main.py | hd
outputs:
00000000 cd 07 2c d8 be 6f 9f 62 |..,..o.b|
00000008
Its definition in CPython is simply:
def randbytes(self, n):
    """Generate n random bytes."""
    return self.getrandbits(n * 8).to_bytes(n, 'little')
Here it is converted to a Bash one-liner and benchmarked against /dev/urandom: Something similar to /dev/urandom with configurable seed?
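Because the CPython definition quoted above is a single getrandbits call, the same behaviour can be sketched for interpreters older than 3.9. The helper name randbytes_compat is hypothetical, and the n == 0 guard is an assumption added here because older versions reject getrandbits(0).

```python
import random


def randbytes_compat(rng, n):
    """Mimic random.Random.randbytes using getrandbits."""
    if n == 0:
        return b''  # avoid getrandbits(0), which older Pythons reject
    # Draw n*8 random bits at once and lay them out little-endian.
    return rng.getrandbits(n * 8).to_bytes(n, 'little')


if __name__ == "__main__":
    first = randbytes_compat(random.Random(0), 8)
    second = randbytes_compat(random.Random(0), 8)
    print(first == second)  # the same seed reproduces the same bytes
```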