Here is a collection of publicly available data sources. If you know of any more, let me know.
Update 1.3.2020: updated broken links and added OECD and UCDP.
Name | Description | URL | Format | Copyright | Documentation |
---|---|---|---|---|---|
Wikipedia Page Hits | How often was a Wikipedia article requested in a given time period? | https://dumps.wikimedia.org/ | csv file gzip | "free" like gnu 3 and CC-3 | https://dumps.wikimedia.org/ |
GDELT | Global Database of Events, Language, and Tone. News articles worldwide in a machine-readable format. | http://gdeltproject.org/data.html#rawdatafiles | csv file gzip, RESTful API, Google BigQuery | free to use, cite source | http://gdeltproject.org/#downloading |
Yahoo Finance (deprecated) | Stock prices | http://download.finance.yahoo.com/d/quotes.csv?sGE+PTR+MSFT&f=snd1l1yr | RESTful -> csv file | | search the web for descriptions |
DWD Deutscher Wetterdienst | Weather data at hourly, daily, monthly, and yearly resolution, both measured and derived on a grid. | https://opendata.dwd.de/ | ZIP file | https://opendata.dwd.de/README.txt | https://opendata.dwd.de/weather/tree.html |
Presseportal | news aktuell, a dpa subsidiary, provides high-quality content free of charge via the Presseportal API. | http://api.presseportal.de/doc/ | XML, JSON | "For us it is a matter of fairness that users of our content link their own web offerings to Presseportal and point to the origin of the content." (translated from German) | http://api.presseportal.de/doc/ |
NOAA | Climate Data Online (CDO) provides free access to NCDC's archive of global historical weather and climate data in addition to station history information. These data include quality-controlled daily, monthly, seasonal, and yearly measurements of temperature, precipitation, wind, and degree days as well as radar data and 30-year Climate Normals. | https://www.ncdc.noaa.gov/cdo-web/ | gz files, API | depends on dataset, see https://www.drought.gov/gdm/content/site-disclaimers | https://www.ncdc.noaa.gov/cdo-web/faq |
news.google.com | News as ranked by Google | http://news.google.com/news?output=rss | website, RSS | depends on news source | search the web for usage examples |
The Internet Archive (archive.org) | TV, texts, websites | https://archive.org/ | website | https://archive.org/about/terms.php | |
Bing | News search, web search, images | http://www.bing.com/toolbox/bingdeveloper/ | JSON | Microsoft, paid service | https://bkgdocs.azurewebsites.net/ |
Instagram | Top German Instagram accounts by follower count | http://www.instabrowse.de/instagram-rankings.html | website | | |
Social Blade | Top international Twitter/Facebook accounts | https://socialblade.com/twitter/ | website table | Copyright ©2008-2020. Social Blade LLC. All Rights Reserved | |
GDELT full text | Full-text search in GDELT | http://api.gdeltproject.org/api/v1/search_ftxtsearch/search_ftxtsearch?query=Merkel&output=artimglist&dropdup=true | REST API | | http://blog.gdeltproject.org/announcing-the-gdelt-full-text-search-api/ |
Satellite data | Data from the Sentinel-1 to Sentinel-3 satellites | | REST API, web | special copyright | https://scihub.copernicus.eu/userguide/ |
Wikipedia ontology | | http://mappings.dbpedia.org/server/ontology/classes | | | |
Amazon open data | Scientific datasets (medical research, satellite data, web data, ...) | https://registry.opendata.aws/ | file download, depending on dataset | depends on dataset | |
Google Trends | Which search term was used how often, and when? | https://trends.google.com/trends/ | csv; API (not tested): https://towardsdatascience.com/google-trends-api-for-python-a84bc25db88f | https://pypi.python.org/pypi/pytrends (deprecated) | |
UCDP | Uppsala Conflict Data Program, Department of Peace and Conflict Research | https://ucdp.uu.se/ | API: https://ucdp.uu.se/apidocs/ | Documentation | |
OECD | Organisation for Economic Co-operation and Development (OECD), general statistical data on all countries of the world | https://data.oecd.org/ | API: https://data.oecd.org/api/; bulk raw data for DB dumps; selected tables as CSV or XML files for manual evaluation | https://data.oecd.org/ | |
fivethirtyeight | Data-driven news site from ABC News, mostly covering the USA. | https://github.com/fivethirtyeight/data | mainly csv from GitHub | Unless otherwise noted, our data sets are available under the Creative Commons Attribution 4.0 International License, and the code is available under the MIT License. | https://data.fivethirtyeight.com/ |
Google Correlate (deprecated) | Correlate series with Google search data | https://www.google.com/trends/correlate/ | website | | |
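
Several entries above list gzip-compressed CSV as their download format. As a rough sketch (not taken from any of these providers' documentation), such a file can be streamed row by row in Python without unpacking it first. The helper name `read_gzip_csv` is made up for this example, and the delimiter and encoding are assumptions that need to be checked against each source's documentation.

```python
import csv
import gzip
import os
import tempfile
from typing import Iterator, List


def read_gzip_csv(path: str, delimiter: str = ",",
                  encoding: str = "utf-8") -> Iterator[List[str]]:
    """Yield one list of fields per row from a gzip-compressed delimited file.

    Hypothetical helper: delimiter and encoding are assumptions and
    differ between data sources.
    """
    # Text mode ("rt") decompresses on the fly, so the archive never
    # has to be unpacked to disk.
    with gzip.open(path, mode="rt", encoding=encoding, newline="") as handle:
        yield from csv.reader(handle, delimiter=delimiter)


if __name__ == "__main__":
    # Build a tiny tab-separated sample so the sketch runs without a download.
    sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
    with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as out:
        csv.writer(out, delimiter="\t").writerows(
            [["date", "value"], ["2020-03-01", "42"]]
        )

    for row in read_gzip_csv(sample_path, delimiter="\t"):
        print(row)
```

Because the function is a generator, even multi-gigabyte dumps can be processed without loading the whole file into memory.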