Read a CSV file directly from a URL / How to fix a 403 Forbidden error

Question

The CSV file is downloadable. I can download the file and use read_csv, but I want to read it directly from its URL in Jupyter. I used the following code, but I get an HTTP 403 Forbidden error:

from io import StringIO

import pandas as pd
import requests
url="https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv"
s=requests.get(url).text

c=pd.read_csv(StringIO(s))
c

How do I read the CSV file directly from the URL in Python, using ";" as the delimiter?

Tasos · Accepted Answer · 2019-04-23T06:35:46.557

4

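The problem is that the server behind your URL doesn't accept "non-browser" requests. By default, Python requests identifies itself with a User-Agent header like this one (the version number depends on your installation):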

'User-Agent': 'python-requests/2.13.0'
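If you want to check what your own installation sends, here is a minimal sketch. It assumes only that requests is installed; the exact version string printed will differ from the one shown above.

```python
import requests

# Headers that requests attaches when you do not supply your own.
default_headers = requests.utils.default_headers()

# Prints something like "python-requests/<installed version>".
print(default_headers["User-Agent"])
```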

You can pass your own headers as an argument, like this:

from io import StringIO
import pandas as pd
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'}

url = "https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv"
s = requests.get(url, headers=headers).text

c = pd.read_csv(StringIO(s), sep=";")
c
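The same idea can be wrapped so that a failed request raises an error instead of quietly handing an error page to pandas. This is only a sketch: the helper names `parse_semicolon_csv` and `fetch_csv`, the `BROWSER_HEADERS` constant, and the 30-second timeout are my own illustrative choices, not anything the site requires.

```python
from io import StringIO

import pandas as pd
import requests

# Example browser-like User-Agent (same value as above); any realistic one may work.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36"
}


def parse_semicolon_csv(text):
    """Parse semicolon-delimited CSV text into a DataFrame."""
    return pd.read_csv(StringIO(text), sep=";")


def fetch_csv(url, headers=BROWSER_HEADERS, timeout=30):
    """Download a semicolon-delimited CSV and return it as a DataFrame."""
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses such as 403.
    response.raise_for_status()
    return parse_semicolon_csv(response.text)
```

You would then call `fetch_csv("https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv")`. If the server still refuses the request, you get an HTTPError rather than a confusing parse result.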

edited Apr 23 '19 at 06:35

answered Apr 23 '19 at 06:28


SyntaxError: illegal target for annotation – KHAN irfan Apr 23 '19 at 06:31
can you try now? I am running it on codelab right now and I don't get any error. – Tasos Apr 23 '19 at 06:36

score 0 · Answer 2 · answered Apr 23 '19 at 06:27

I read the file using the following code:

from urllib.request import Request, urlopen

# Send a browser-like User-Agent instead of urllib's default one.
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3"}
reg_url = "https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv"
req = Request(url=reg_url, headers=headers)
html = urlopen(req).read()  # raw response body as bytes
print(html)
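The code above prints raw bytes, so to get rows you still need to decode and split on ";". Here is a rough sketch using only the standard library. The helper name `rows_from_bytes` is hypothetical, UTF-8 is an assumed encoding (I have not checked what the server actually sends), and `sample` is made-up data standing in for `html`.

```python
import csv
from io import StringIO


def rows_from_bytes(raw, encoding="utf-8"):
    """Decode a downloaded body and split it into rows on ';'."""
    text = raw.decode(encoding)  # encoding is an assumption; adjust if needed
    return list(csv.reader(StringIO(text), delimiter=";"))


# Made-up stand-in for the bytes returned by urlopen(req).read().
sample = b"name;energy\napple;52\n"
rows = rows_from_bytes(sample)
print(rows)  # [['name', 'energy'], ['apple', '52']]
```

With the real download you would call `rows_from_bytes(html)`. Note that the csv module leaves every value as a string.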
