Md5 Hashing A Csv With Python

I have a CSV of email addresses that need to be hashed with MD5, and I want to save the hashed emails as a new CSV. I haven't seen my exact use case on SO and haven't been able to s

Solution 1:

The answer in your comment is nearly correct. You only need to open a second file in write mode ('w'). I have changed your code to use with statements so you don't have to explicitly close the file handles:

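import hashlib

# Read the input as bytes ('rb') so each line can be passed straight to md5.
with open("/Users/[username]/Downloads/email_original.csv", 'rb') as file:
    with open("/Users/[username]/Downloads/email_hashed.csv", 'w') as output:
        for line in file:
            line = line.strip()  # drop the trailing newline before hashing
            digest = hashlib.md5(line).hexdigest()
            print(digest)
            output.write(digest + '\n')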
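
The strip() call in the loop above matters more than it might look: the newline at the end of each line would otherwise become part of the hashed input. A minimal sketch of the difference, using a made-up address (user@example.com is an assumption for illustration, not data from the question):

```python
import hashlib

# Hypothetical line, as it would be read from a file opened in 'rb' mode.
raw_line = b"user@example.com\n"

with_newline = hashlib.md5(raw_line).hexdigest()
stripped = hashlib.md5(raw_line.strip()).hexdigest()

# The digests differ because the newline byte is part of the first input.
print(with_newline == stripped)  # False
```

If the hashes are meant to be matched against digests produced elsewhere, both sides need to agree on this kind of normalization.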

Solution 2:

Jaco's answer is good but incomplete, since it neglects the encoding used for the MD5 hash. The code would also be insufficient if the CSV format were later modified to include other columns. Here is an example that tackles both problems, makes it easy to change the hash algorithm in the future, and lets you specify other columns, each with its own hash algorithm:

import csv
import hashlib

IN_PATH = 'email_original.csv'
OUT_PATH = 'email_hashed.csv'
ENCODING = 'ascii'
# Maps each column to hash to the name of the hashlib algorithm to use.
HASH_COLUMNS = dict(email_addr='md5')


def main():
    with open(IN_PATH, 'rt', encoding=ENCODING, newline='') as in_file, \
            open(OUT_PATH, 'wt', encoding=ENCODING, newline='') as out_file:
        reader = csv.DictReader(in_file)
        writer = csv.DictWriter(out_file, reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # Replace each configured column with its hex digest.
            for column, method in HASH_COLUMNS.items():
                data = row[column].encode(ENCODING)
                digest = hashlib.new(method, data).hexdigest()
                row[column] = '0x' + digest.upper()
            writer.writerow(row)


if __name__ == '__main__':
    main()
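
To see the per-column idea in action without touching the filesystem, the same loop can be run against in-memory files. This is only a sketch under assumptions: the hash_csv helper, the user_id and country columns, and the sample row are all invented for illustration, and UTF-8 is used as the default encoding instead of ASCII.

```python
import csv
import hashlib
import io


def hash_csv(in_file, out_file, hash_columns, encoding="utf-8"):
    """Copy CSV rows from in_file to out_file, hashing the selected columns."""
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(out_file, reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for column, method in hash_columns.items():
            data = row[column].encode(encoding)
            row[column] = "0x" + hashlib.new(method, data).hexdigest().upper()
        writer.writerow(row)


# Hypothetical sample data; user_id and country are invented columns.
source = io.StringIO(
    "email_addr,user_id,country\r\nuser@example.com,42,NL\r\n", newline=""
)
target = io.StringIO(newline="")
hash_csv(source, target, {"email_addr": "md5", "user_id": "sha256"})

rows = list(csv.DictReader(io.StringIO(target.getvalue(), newline="")))
print(rows[0]["country"])          # NL (columns not listed pass through unchanged)
print(len(rows[0]["email_addr"]))  # 34 ("0x" prefix plus 32 hex digits)
```

Passing file objects rather than paths keeps the function easy to test, and the mapping argument plays the same role as HASH_COLUMNS in the answer above.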
