
I'm using Python 3.4 and trying to import CSV files, some of which use commas as delimiters, others semicolons, and others tabs.

Is it possible to let Python detect which delimiter to use? I have read the post "python: import csv file (delimiter “;” or “,”)" but cannot get the appropriate result.

My code thus far:

import csv

class Data(object):
    def __init__(self, csv_file):
        self.raw_data = []
        self.read(csv_file)

    def read(self, csv_file):
        with open(csv_file, newline='') as csvfile:
            # Guess the dialect from the file contents, then rewind.
            dialect = csv.Sniffer().sniff(csvfile.read(), delimiters=',;')
            csvfile.seek(0)
            f = csv.reader(csvfile, dialect)
            for row in f:
                self.raw_data.append(row)
            print(self.raw_data)

mycsv = Data('comma_separate.csv')

comma_separate.csv contains:

[email protected], $161,321, True, 1
[email protected], $95.00, False, 3
[email protected], $952025, False, 3

Right now my output is:

['[email protected], $161,321, True, 1'], ['[email protected], $95.00, False, 3'], ['[email protected], $952025, False, 3']

My desired output is:

['[email protected]', '$161,321', 'True', '1'], ['[email protected]', '$95.00', 'False', '3'], ['[email protected]', '$952025', 'False', '3']
JanV123
  • This may help you: [read-data-from-csv-file-and-transform-to-correct-data-type](http://stackoverflow.com/questions/11665628/read-data-from-csv-file-and-transform-to-correct-data-type) – luoluo Sep 10 '15 at 11:11

2 Answers


The problem seems to be the first line of your CSV file, which is used to determine the delimiter. The program works as expected if you change that line to:

[email protected], $161.321, True, 1

I guess the reason is that the sniffer expects the same number of fields on every line of your CSV file.
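
To make that guess concrete, here is a minimal sketch that simply counts the commas on each line. The e-mail addresses are made-up placeholders and `commas_per_line` is a hypothetical helper, not part of the `csv` module. It only shows the inconsistency a frequency-based guess would run into; it does not reproduce what `csv.Sniffer` does internally.

```python
# Placeholder rows modelled on comma_separate.csv; the addresses are made up.
original = [
    "a@example.com, $161,321, True, 1",
    "b@example.com, $95.00, False, 3",
    "c@example.com, $952025, False, 3",
]

# Same rows, with the first amount written as $161.321 instead of $161,321.
modified = ["a@example.com, $161.321, True, 1"] + original[1:]


def commas_per_line(lines):
    """Return how many commas each line contains."""
    return [line.count(",") for line in lines]


print(commas_per_line(original))  # [4, 3, 3]
print(commas_per_line(modified))  # [3, 3, 3]
```

With the original first line, the comma count differs from line to line, so a comma does not look like a consistent delimiter. After the change, every line has the same count.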

Dencrash

Using sniff without passing the possible delimiters works for me:

import csv

class Data(object):
    def __init__(self, csv_file):
        self.raw_data = []
        self.read(csv_file)

    def read(self, csv_file):
        with open(csv_file, newline='') as csvfile:
            # No delimiters argument: let the sniffer consider any character.
            dialect = csv.Sniffer().sniff(csvfile.read())
            csvfile.seek(0)
            f = csv.reader(csvfile, dialect)
            for row in f:
                self.raw_data.append(row)

            print(csvfile.name)
            print(self.raw_data)


for f in ['tab_separate.tsv', 'comma_separate.csv', 'comma_separate2.csv']:
    mycsv = Data(f)

output

tab_separate.tsv
[['[email protected]', '$161,321', 'True', '1'], ['[email protected]', '$95.00', 'False', '3'], ['[email protected]', '$952025', 'False', '3']]
comma_separate.csv
[['[email protected],', '$161,321,', 'True,', '1'], ['[email protected],', '$95.00,', 'False,', '3'], ['[email protected],', '$952025,', 'False,', '3']]
comma_separate2.csv
[['[email protected]', '$161,321', 'True', '1'], ['[email protected]', '$95.00', 'False', '3'], ['[email protected]', '$952025', 'False', '3']]

comma input

[email protected], $161,321, True, 1
[email protected], $95.00, False, 3
[email protected], $952025, False, 3

tab input

[email protected]  $161,321    True    1
[email protected]    $95.00  False   3
[email protected] $952025 False   3

semi colon input

[email protected];$161,321;True;1
[email protected];$95.00;False;3
[email protected];$952025;False;3
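
In the comma_separate.csv output above, the fields keep their trailing commas, which suggests the file was split on spaces. One possible way to avoid that, sketched here under assumptions, is to restrict the sniffer to the delimiters you actually expect and fall back to a default when it cannot decide. `read_rows`, its parameters, and the fallback choice are hypothetical names introduced for illustration. Only the detected delimiter is reused, not the whole sniffed dialect.

```python
import csv
import io


def read_rows(text, candidates=",;\t", fallback=","):
    """Parse CSV text, guessing the delimiter from a restricted candidate set."""
    # Sniff only the first few lines so large inputs stay cheap.
    sample = "".join(text.splitlines(keepends=True)[:20])
    try:
        delimiter = csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer could not settle on any candidate; use the default.
        delimiter = fallback
    # skipinitialspace drops the blank that follows each delimiter ("a, b").
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        skipinitialspace=True)
    return list(reader)


print(read_rows("a;b;c\n1;2;3\n4;5;6\n"))
print(read_rows("a\tb\tc\n1\t2\t3\n"))
print(read_rows("a, b, c\n1, 2, 3\n"))
```

Note that an unquoted comma inside a field, such as `$161,321` in a comma-delimited file, would still be split in two. The value has to be quoted in the file for any reader to keep it together.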
Brendan Doherty
  • That's odd. I even copied your code as well as the comma_separate.csv file and it still gives me this output: comma_separate.csv [['[email protected], $161,321, True, 1'], ['[email protected], $95.00, False, 3'], ['[email protected], $952025, False, 3']] Are you using Python 3.4 as well? – JanV123 Sep 10 '15 at 12:44
  • Yes, 3.4. Looking at it again, I think mine is splitting on the space in the comma-separated file rather than the comma, i.e. it thinks it's a space-delimited file. Could you upload the CSV input somewhere? – Brendan Doherty Sep 10 '15 at 14:42
  • https://drive.google.com/file/d/0BxeHWbvOxiOTYU15N200S3R4cVk/view?usp=sharing https://drive.google.com/file/d/0BxeHWbvOxiOTb1hSYTVaTl9oZUU/view?usp=sharing – JanV123 Sep 11 '15 at 01:58
  • I have resolved the issue. It was a format error with MS Excel. I used LibreOffice to make the CSV files and they now work. Go figure. – JanV123 Sep 11 '15 at 05:40