Reading CSV File Into Pandas From SFTP Server Via Paramiko Fails With "'utf-8' codec can't decode byte ... in position ....: invalid start byte"

I'm trying to read a CSV file into Pandas from an SFTP server using Paramiko:

with sftp.open(path + file.filename) as fp:
    fp_aux = pd.read_csv(fp, sep='|')

But when at

Solution 1:

Pandas seems to be confused by the API of the Paramiko file-like object. When given such an object, it does not apply its encoding argument, so the data is apparently still decoded as UTF-8, which produces the error above.
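To illustrate what the error message itself means, here is a minimal sketch in plain Python. The sample data is invented for the example; it only assumes that the remote file is encoded as ISO-8859-1, as the workaround in this answer suggests.

```python
# Illustrative data only: a pipe-separated line containing a non-ASCII
# character, encoded the way an ISO-8859-1 file would store it.
raw = "item|price\ntea|£3\n".encode("iso-8859-1")

try:
    # Decoding with the wrong codec fails on the first non-ASCII byte.
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error_reason = exc.reason    # description of why decoding failed
    bad_byte = raw[exc.start]    # the offending byte value

# Decoding with the codec the file was written in succeeds.
decoded = raw.decode("iso-8859-1")
print(error_reason, hex(bad_byte), repr(decoded))
```

The same bytes decode cleanly once the right encoding is actually applied, which is why it matters whether Pandas honors the encoding argument.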

A quick and dirty solution is to read the remote file into an in-memory file-like object and pass that to Pandas. The encoding argument is then honored.

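from io import BytesIO
import pandas as pd

flo = BytesIO()
sftp.getfo(path + file.filename, flo)  # download the whole remote file into memory
flo.seek(0)  # rewind so Pandas reads from the start
fp_aux = pd.read_csv(flo, sep='|', encoding='iso-8859-1')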

A more efficient approach might be to build a wrapper class on top of the Paramiko file-like object that exposes an API Pandas can work with.
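One possible shape for such a wrapper is sketched below. It is an assumption-laden illustration, not a tested recipe for Paramiko: the names `ReadOnlyRawAdapter` and `decoded_text_stream` are hypothetical, and the only thing assumed about the wrapped object is that it has a `read(size)` method returning bytes. The adapter presents it as a standard `io.RawIOBase` stream, so the standard library can do the buffering and decoding.

```python
import io


class ReadOnlyRawAdapter(io.RawIOBase):
    """Hypothetical adapter: a standard raw binary stream over any
    object that offers read(size) and returns bytes."""

    def __init__(self, source):
        super().__init__()
        self._source = source

    def readable(self):
        return True

    def readinto(self, buffer):
        # Ask the wrapped object for at most len(buffer) bytes and copy
        # them into the caller's buffer; returning 0 signals end of file.
        data = self._source.read(len(buffer))
        count = len(data)
        buffer[:count] = data
        return count


def decoded_text_stream(source, encoding):
    """Return a text stream that decodes `source` with `encoding`."""
    buffered = io.BufferedReader(ReadOnlyRawAdapter(source))
    return io.TextIOWrapper(buffered, encoding=encoding)


class FakeRemoteFile:
    """Stand-in for a remote file, exposing only read(size)."""

    def __init__(self, payload):
        self._inner = io.BytesIO(payload)

    def read(self, size=-1):
        return self._inner.read(size)


sample = "item|price\ntea|£3\n".encode("iso-8859-1")  # invented sample data
text = decoded_text_stream(FakeRemoteFile(sample), "iso-8859-1").read()
print(repr(text))
```

Under these assumptions, the stream returned by `decoded_text_stream(fp, 'iso-8859-1')` could be handed to `pd.read_csv(..., sep='|')` in place of `fp`. Because the stream already yields decoded text, Pandas would not need to apply an encoding itself, and the file would not have to be held in memory as a whole.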
