Bazsites.com Datasets
On the Web
- The StatLib Datasets Archive - A repository of datasets used in statistics and machine learning.
- Web->KB dataset - Web pages partitioned into classes, with hyperlink data. The dataset has been used for text categorization and learning to extract symbolic knowledge from the World Wide Web.
- Finding and Sorting Data in DataSets VB.NET - A tutorial on finding and sorting data in DataSets, covering filtering on row state and version, sorting, and the Data View Manager.
- Face recognition dataset - A dataset of face images for face recognition algorithms.
- Dataset generator - Datgen, formerly SCDS, is a computer program that generates data to systematically test programs that consume data. These synthetic datasets can be used to validate learning algorithms.
- Bilkent University Function Approximation Repository - Datasets used for the experimental analysis of function approximation techniques and for training and demonstration by the machine learning and statistics community.
- HS3D - Homo Sapiens Splice Sites Dataset - A database of Homo sapiens exon, intron and splice regions extracted from GenBank primate sequences (Rel. 123). The dataset aims to provide standardized material for training computational approaches to gene identification and characterization and for assessing their prediction accuracy.
- DataCutter Project - Research project developing a middleware framework for filtering large, scientific datasets in a cluster or Grid environment. Enables highly efficient exploration and analysis of datasets in distributed and heterogeneous environments.
- Temperature Trends: Surface (CRU) - Land and sea surface temperature anomalies, analysed by the Climatic Research Unit, Norwich, UK. Includes scientific papers, dataset terminology, file formats, data for downloading, answers to frequently asked questions, and links to related web sites, grids, and datasets.
- GEO Information Systems - Repository for GIS datasets in Oklahoma. Datasets include school districts, cultural data (TIGER roads, hydrology, railroads) and high-resolution JPEG images.
Wikipedia Articles
- ParaView - ParaView is an open source, freely available program for parallel, interactive, scientific visualization. It has a client-server architecture to facilitate remote visualization of datasets, and generates level of detail (LOD) models to maintain interactive framerates for large datasets.
- Basic partitioned access method - In IBM mainframe operating systems, a basic partitioned access method (BPAM) is an access method for libraries with a specific structure, called partitioned datasets (PDS). BPAM is used in OS/360, OS/VS2, MVS, z/OS, and others.
- HHCode - A Helical Hyperspatial Code, also known as an HHCode, is a data storage format for very large spatio-temporal datasets.
- DBpedia - DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia and to interlink other datasets on the Web with DBpedia data.
- BrownBoost - BrownBoost is a boosting algorithm that is robust to noisy datasets. BrownBoost is an adaptive version of the boost-by-majority algorithm.