Reading CSV file into Pandas from SFTP server via Paramiko fails with "'utf-8' codec can't decode byte ... in position ....: invalid start byte"
I'm trying to read a CSV file into Pandas from an SFTP server using Paramiko:
with sftp.open(path + file.filename) as fp:
    fp_aux = pd.read_csv(fp, sep='|')
But when at
Solution 1:
Pandas seems to be confused by the API of the Paramiko file-like object: when given such an object, it does not apply its encoding
argument.
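To illustrate where the error message comes from, here is a minimal sketch using a made-up byte string (not data from the question). It assumes the file is encoded as ISO-8859-1, as the encoding argument further down suggests, and shows that such bytes can fail under the default UTF-8 decoding while decoding cleanly with the right codec.

```python
# Hypothetical sample: "25°C" encoded as ISO-8859-1.
# The degree sign is the single byte 0xb0, which cannot start a UTF-8 sequence.
raw = b"25\xb0C"

try:
    raw.decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    # Same kind of error that Pandas reports when it falls back to UTF-8
    error_message = str(exc)

print(error_message)

# Decoding with the encoding the file was actually written in succeeds
decoded = raw.decode("iso-8859-1")
print(decoded)
```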
A quick-and-dirty solution is to read the remote file into an in-memory file-like object and pass that to Pandas. The encoding
argument is then used.
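from io import BytesIO
import pandas as pd
# Download the whole remote file into an in-memory buffer
flo = BytesIO()
sftp.getfo(path + file.filename, flo)
flo.seek(0)  # rewind so that Pandas reads from the start
pd.read_csv(flo, sep='|', encoding='iso-8859-1')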
It might be more efficient to build a wrapper class on top of the Paramiko file-like object that exposes an API Pandas can work with.
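One possible shape for such a wrapper is sketched below. The class name DecodingReader and its parameters are hypothetical, not part of Paramiko or Pandas. It assumes only that the wrapped object has a read(size) method returning bytes, and it decodes those bytes itself so that the consumer receives text. It has not been tested against a live SFTP server.

```python
import codecs
import io


class DecodingReader:
    """Hypothetical wrapper exposing a binary file-like object as text."""

    def __init__(self, raw, encoding="iso-8859-1", chunk_size=32768):
        self._raw = raw
        # An incremental decoder copes with characters split across reads
        self._decoder = codecs.getincrementaldecoder(encoding)()
        self._chunk_size = chunk_size

    def read(self, size=-1):
        if size is None or size < 0:
            # Read and decode everything that is left
            return self._decoder.decode(self._raw.read(), final=True)
        if size == 0:
            return ""
        while True:
            data = self._raw.read(size)
            text = self._decoder.decode(data, final=not data)
            # Return once there is text, or once the source is exhausted
            if text or not data:
                return text

    def __iter__(self):
        # Yield decoded lines, keeping the trailing newline like a text file
        pending = ""
        while True:
            chunk = self.read(self._chunk_size)
            if not chunk:
                break
            pending += chunk
            *lines, pending = pending.split("\n")
            for line in lines:
                yield line + "\n"
        if pending:
            yield pending


# Stand-in for a remote file: ISO-8859-1 bytes in memory
sample = io.BytesIO("id|name\n1|Jos\u00e9\n".encode("iso-8859-1"))
text = DecodingReader(sample).read()
print(text)
```

With a real connection, the idea would be to pass DecodingReader(fp) to pd.read_csv inside the with sftp.open(...) block, without an encoding argument, since the wrapper already returns decoded text. Whether this is actually faster than the in-memory copy depends on how Paramiko buffers the reads, so treat it as a starting point rather than a measured improvement.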