3

The CSV file is downloadable: I can save it locally and load it with read_csv. However, I want to read it directly from its URL in Jupyter. I used the following code, but I get an HTTP 403 Forbidden error:

from io import StringIO

import pandas as pd
import requests
url="https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv"
s=requests.get(url).text

c=pd.read_csv(StringIO(s))
c

How do I read the CSV file directly from the URL in Python, using ";" as the delimiter?

KHAN irfan

2 Answers

4

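The problem is that the server behind this URL rejects "non-browser" requests. By default, Python requests identifies itself with a User-Agent header like the following (the version number depends on your installation):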

'User-Agent': 'python-requests/2.13.0'
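
If you want to check what your own installation sends, a minimal sketch like the one below should do it. It only inspects the defaults and makes no network request; the variable names are mine.

```python
import requests

# The User-Agent string that requests sends when you do not supply one.
default_ua = requests.utils.default_user_agent()
print(default_ua)  # of the form python-requests/<installed version>

# A fresh Session carries the same value in its default headers.
session_headers = requests.Session().headers
print(session_headers["User-Agent"])
```
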

You can pass your own headers as an argument, like this:

from io import StringIO
import pandas as pd
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'}

url = "https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv"
s = requests.get(url, headers=headers).text

c=pd.read_csv(StringIO(s), sep=";")
c
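
As a slightly more defensive variant, here is a sketch that fails loudly if the server still answers 403 instead of handing an error page to pandas. The helper names `parse_semicolon_csv` and `fetch_csv` and the 30-second timeout are my own choices, not part of any library, and I am assuming the server keeps accepting this User-Agent string.

```python
from io import StringIO

import pandas as pd
import requests

# Assumption: a browser-like User-Agent is enough for this server.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36"
    )
}


def parse_semicolon_csv(text):
    """Parse ';'-delimited CSV text into a DataFrame."""
    return pd.read_csv(StringIO(text), sep=";")


def fetch_csv(url, headers=BROWSER_HEADERS, timeout=30):
    """Download a ';'-delimited CSV and return it as a DataFrame."""
    response = requests.get(url, headers=headers, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx (including 403) rather than
    # passing an error body on to the parser.
    response.raise_for_status()
    return parse_semicolon_csv(response.text)
```

Calling `fetch_csv("https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv")` should then give the same DataFrame as above. As far as I know, newer pandas versions (1.3 and later) can also take the headers directly via `pd.read_csv(url, sep=";", storage_options=headers)`, but I have not verified that against this server.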
Tasos
0

I was able to read the file with the following code, which sends a browser-like User-Agent using only urllib from the standard library:

from urllib.request import urlopen, Request
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.3"}
reg_url = "https://fineli.fi/fineli/en/elintarvikkeet/resultset.csv"
req = Request(url=reg_url, headers=headers)
html = urlopen(req).read()  # despite the name, this holds the CSV content as raw bytes
print(html) 
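
Since `html` above is just raw bytes, it still has to be decoded and split on ";" before it is usable. A minimal standard-library sketch follows; the helper name `rows_from_bytes`, the UTF-8 encoding, and the sample bytes are assumptions of mine, so check the real file's encoding before relying on it.

```python
import csv
from io import StringIO


def rows_from_bytes(raw, encoding="utf-8", delimiter=";"):
    """Decode downloaded bytes and parse them as delimited text.

    Assumption: the content is UTF-8; pass another encoding if it is not.
    """
    text = raw.decode(encoding)
    return list(csv.DictReader(StringIO(text), delimiter=delimiter))


# Made-up sample standing in for the bytes returned by urlopen(req).read().
sample = b"id;name\n1;apple\n2;milk\n"
rows = rows_from_bytes(sample)
print(rows[0]["name"])
```

With the real download you would call `rows_from_bytes(html)`; each row comes back as a dict keyed by the header line.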
KHAN irfan