
Reading A Binary File With Memoryview

I read a large file in the code below which has a special structure: among other things, two blocks that need to be processed at the same time. Instead of seeking back and forth in the file

Solution 1:

A memoryview will not give you any advantage for null-terminated strings, because memoryview objects only have facilities for fixed-width data. You may as well use bytes.split() here instead:

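# read the whole block of null-terminated names in one go
file_names_block = bsa_file.read(total_file_name_length)
# split on the null byte; a trailing null leaves an empty last element
file_names = file_names_block.split(b'\x00')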
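
As a minimal sketch of the split approach, the snippet below uses made-up sample bytes in place of the real file-name block. The file names and the ASCII encoding are assumptions for illustration only. It shows that a block ending in a null produces one empty trailing element, which you will probably want to drop.

```python
# Hypothetical sample data standing in for the block read from the file
file_names_block = b"meshes.nif\x00textures.dds\x00sound.wav\x00"

# Split on the null terminator; the final null leaves an empty last element
parts = file_names_block.split(b"\x00")

# Drop empty entries and decode (ASCII is an assumption, not a given)
file_names = [name.decode("ascii") for name in parts if name]
print(file_names)
```

Filtering on `if name` is one simple way to discard the empty tail; slicing off the last element works too if the block is known to end with a null.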

Slicing a memoryview doesn't use extra memory beyond the small view object itself. A cast doesn't copy the buffer either, but new native Python objects are created for the parsed memory region the moment you access elements of the sequence.
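
To illustrate the difference, here is a small sketch with hypothetical data (four native unsigned ints packed with struct). The slice is just another view onto the same bytes object, while reading elements through a cast view creates ordinary Python ints.

```python
import struct

# Hypothetical buffer: four unsigned ints in native byte order and size
data = struct.pack("4I", 1, 2, 3, 4)
view = memoryview(data)

# Slicing creates a new view onto the same buffer; no bytes are copied
half = len(view) // 2
tail = view[half:]
print(tail.obj is data)

# cast() reinterprets the bytes as native unsigned ints without copying
ints = view.cast("I")

# Element access and tolist() are where new Python int objects appear
third = ints[2]
values = ints.tolist()
print(third, values)
```

Note that cast("I") uses the platform's native int size and byte order, so for a file with a fixed on-disk layout the struct module with an explicit byte order is usually the safer choice.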

You can still use the memoryview for parsing file_records_block; those strings are prefixed by a length, which gives you the opportunity to use slicing. Just keep slicing bytes off the front of the memoryview as you process folder_path values; there's no need to keep an index:

for folder_record in folder_records:
    # the slice bounds below assume the length byte counts the terminating null
    name_size = file_records_block[0]  # first byte is the length, indexing gives the integer
    folder_path = file_records_block[1:name_size].tobytes()
    file_records_block = file_records_block[name_size + 1:]  # skip the null

Because the memoryview was sourced from a bytes object, indexing will give you the integer value for a byte, .tobytes() on a given slice gives you a new bytes string for that section, and you can then continue to slice to leave the remainder for the next loop.
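
The loop can be tried in isolation with the sketch below. The helper make_record and the sample folder names are hypothetical, and the layout (a length byte that counts the name plus its terminating null) is an assumption inferred from the slice bounds above. A real block would also hold the file records between the names, which this sketch leaves out.

```python
def make_record(name: bytes) -> bytes:
    # Hypothetical helper: length byte (name plus null), name, null terminator
    return bytes([len(name) + 1]) + name + b"\x00"

# Build a sample block and wrap it in a memoryview
raw = make_record(b"meshes\\armor") + make_record(b"textures")
block = memoryview(raw)

folder_paths = []
while len(block) > 0:
    name_size = block[0]                # indexing yields an int
    folder_paths.append(block[1:name_size].tobytes())  # copy only the name
    block = block[name_size + 1:]       # skip the null, keep the remainder

print(folder_paths)
```

Only the .tobytes() call copies data; rebinding block to a shorter slice just moves the start of the view.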
