Machine learning and data science projects need datasets to train and test models. Several dataset repositories are available on the Internet; some are free, while others require payment.
Here I list some popular dataset resources on the Internet for machine learning projects.
Kaggle offers almost 17,000 freely available datasets, covering students, marketing, business, cancer, diabetes, plants, social media, sports, and many more topics. You can download the datasets for free after logging in to the website. Kaggle is owned by Google, which is a subsidiary of Alphabet Inc.
A very popular dataset repository from the University of California, Irvine. A wide variety of datasets is available on this website. To download them, check out the URL in the heading.
KDNuggets does not host a data repository of its own. However, it provides links to authentic data sources.
The Government of India provides data on policy implementation, health improvement, employment, and other public and state-related policies. These datasets are freely available, and you can use them in your projects.
This repository contains the following types of datasets.
1-United States government and demographics
2-International government and demographics
5-Technology and APIs
6-Sports and Entertainment
7-General Aggregation Sites
Source URL: https://datascience.berkeley.edu/open-data-sets/
This repository contains datasets mainly about social networks and the Internet.
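Once a dataset has been downloaded from any of these sources, it is usually split into a training part and a testing part. The following is only a minimal sketch of that step using the Python standard library. The column names, the sample values, and the helper names `load_rows` and `train_test_split_rows` are made up for illustration and do not come from any of the repositories listed here; the inline text stands in for a real downloaded CSV file.

```python
import csv
import io
import random

# Hypothetical stand-in for a downloaded CSV file (made-up values).
SAMPLE_CSV = """feature_a,feature_b,label
1,0.5,yes
2,0.1,no
3,0.9,yes
4,0.4,no
5,0.7,yes
6,0.2,no
7,0.8,yes
8,0.3,no
9,0.6,yes
10,0.0,no
"""


def load_rows(text_stream):
    """Read CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(text_stream))


def train_test_split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows reproducibly and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = load_rows(io.StringIO(SAMPLE_CSV))
train_rows, test_rows = train_test_split_rows(rows, test_fraction=0.2)
print(len(train_rows), len(test_rows))  # prints: 8 2
```

For a real file, `io.StringIO(SAMPLE_CSV)` could be replaced with `open("your_file.csv", newline="")`, where the file name is whatever you saved the download as. Fixing the seed keeps the split the same between runs, which makes results easier to compare.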