A COMPILATION OF DATA MINING APPLICATIONS:
This webpage collects data mining news items that caught my attention. The
items are grouped by topic. I use these links for teaching.
Although each piece mentions the methodology used, they are written for a general
audience, with the emphasis on the problem solved.
- Data mining general issues:
- What is "data mining"? Overview and applications:
- A general-audience article about the data mining discipline and several of its applications (published in the Spanish newspaper El Pais)
- A general-audience article about the data mining discipline in the NYTimes: "Big data's impact in the world". The Spanish version of the article can be found here
- "The social data revolution": a good short article that reviews the historical development of data-gathering and data-modeling activities by companies that want to stay relevant to customers
- Ethics and Data mining:
- Controversy in the US Senate about privacy in the Government's data mining projects [article in the Washington Post]
- New US federal laws to protect users' privacy when their personal data is mined [article in the Los Angeles Times]
- Big data mining, fairness and privacy: when using data mining, how to protect people from privacy intrusion and unfair discrimination
- Data mining for terrorism clues: an article in USA Today
- Fighting crime with data: the use of data mining techniques for crime prediction by the police
- Courthouse news -- McDonald's, CBS, & Microsoft Mine Data from Web Ads, mining consumers' web browser histories for entries of particular relevance to defendants' respective, customized advertising campaigns
- An "Orwellian"-style tale on data mining, the injustice of wealth distribution and the expulsion from Heaven (written in Spanish)
- "In Praise of Inaccuracy" ("Elogio de la inexactitud"): a reflection on the limits of the data analysis discipline (published in the Spanish newspaper El Diario Vasco)
- Data mining as an industrial or academic opportunity:
- Yahoo! Key scientific challenges: among them, disciplines like "machine learning", "statistics", "algorithmic economics", "web mining" and "green computing".
- 5 start-up companies in Silicon Valley that make use of data mining technologies
- The rise of "data scientist" jobs: "...by 2018 the U.S. could face a shortage of up to 190,000 workers with analytical skills. Data engineers are already harder to find than search engineers, and that's a sign of the times...", says Deep Nishar, head of product at LinkedIn
- 10 technologies to bet on in the recession
- An article relating chess, artificial intelligence and the opportunities to overcome the current financial crisis (written in Spanish, in the newspaper El Pais)
- Data mining among the "hot careers" for college graduates
- New trends in data types/quality:
- Data quality: the key to the future success of data mining
- New challenging data collections to be mined: "big data is watching you"
- Kdnuggets.com polls about data mining usage: data types analyzed, data mining methods/algorithms used, data mining tools used, largest database mined, application types, tools for data manipulation, using Google to store data? ...
- Recommender Systems:
- Music recommender systems:
- last.fm in Wikipedia: a pioneering on-line music recommender system
- Sun music recommender system
- Another music recommender system: the "musical brain project"
- Film recommender systems:
- NeoMetrics: analysis of social networks, "film recommender" (in Spanish)
- The Netflix Prize: a prize for improvements to film recommender systems
- KDD-Cup 2011, on data collected from Yahoo! Music: predicting the scores that users gave to various items (first track) and separating loved songs from other songs (second track)
- Goodreads will recommend your next book
- The five most important recommender system projects of 2008: music, video, pharmaceutical drugs...
- Various recommender systems: StumbleUpon, PhotoTree, Jester - Online joke recommender
- Various data mining applications (grouped, intuitively, by application areas):
- Biomedicine - Bioinformatics:
- Medical diagnosis aided by data mining (a general-audience article in the Spanish newspaper El Pais)
- Applying machine learning techniques to genomic data (the output of the "Ion Torrent" device) to understand DNA and diagnose several diseases (a general-audience article in the Spanish newspaper El Pais)
- Bioinformatics topics: Decode - predicting the probability of suffering several diseases, personalized DNA screening (polymorphisms) (both in Spanish)
- 10 years of the Human Genome Project: controversy over gene patents (a general-audience article in the Spanish newspaper El Pais)
- The use of Bayesian networks for the genetic improvement of plants (in Spanish)
- D. Kraft: a video on "Medicine's future: the role of IT in medicine's future"
- Sports:
- Sports injury prediction by means of neural networks. AC Milan-Lab
- Politics:
- Data mining and politics: the Democratic Party is using data mining: "How Obama's data-crunching prowess may get him re-elected"
- Computer Security:
- Security and machine learning: malware detection, spam filtering
- Poker, games, arts:
- Mining poker games (in Spanish)
- The Microgaming network, a well-known poker software provider, ends the use of data mining on its poker-hands database
- Games and machine learning: the role and benefits of data mining in the design of games
- Predicting the success of a song with Polypho (in Spanish)
- Traffic and airports - logistics:
- Visualization of airline and airport traffic delays
- Cheap flight mining - Cheap hotel mining (the Farecast company was recently bought by Microsoft)
- Designing a classifier that detects whether a driver is alert, using data acquired while driving
- Traffic jams: a Microsoft tool to avoid them
- A competition on traffic congestion: congestion, jams and traffic reconstruction from GPS data
- Ecology - Sustainability:
- Forecasting Alaska's long-term Ecosystem [link]
- Institute for Computational Sustainability: current computational challenges in ecosystem issues
- The Climate Corporation: a company that learns and sells weather predictions related to farming insurance, making use of the vast amount of free data published by the National Weather Service on heat and precipitation patterns around the country
- Stock market - Stock exchange:
- Artificial intelligence for decision-making on the stock exchange (in Spanish). A job offer at a financial company
- The role of automatic programs (robots) on the stock exchange (in Spanish)
- Computers that trade on the news -- monitoring news, articles and social networks to predict the behaviour of the markets
- Get in or out of the stock market? A company offers a product that tries to answer this question.
- IBM is buying companies that analyze, by means of data mining techniques, financial and insurance risks
- Direct Advertising - Customer profiling - Analysis of shopping patterns:
- The new Era of marketing: Data mining pushes marketing to a new level
- Workshop: "The use of Machine Learning in Online Advertising"
- "Information to better compete": this article, which appeared in the Spanish newspaper El Pais, explains the experience of the CognoData Consulting company in segmenting its clients (in Spanish)
- Competition on Plagiarism Detection, Author Identification, and Wikipedia Vandalism Detection
- "Personas": an MIT tool to discover "how the web sees you" (an article which roughly describes the tool)
- Predicting student success in US universities: a large real dataset of student records is built so that data mining techniques can be applied to it [link]
- Genetic algorithms - Optimization:
- Genetic algorithms to "eat waves": marine energy (in Spanish)
- Genetic algorithms for car design
- Genetic algorithms for antenna design at NASA
- Genetic algorithms for the on-line optimization of a 2D car on a particular terrain: nice GUI and demo
- Theo Jansen, a kinetic sculptor who uses genetic algorithms to optimize the shape of his machines' parts and their gait
- Data mining frontiers: mining social data / mining weblogs and end users' opinions
- Social climate analysis on the web. Companies (Nielsen, Serendio, Asomo, Jodange, etc.: find here a list of companies devoted to this topic) track the web to measure users' opinions about brands, etc. All of them offer services to well-known brands to monitor how Internet users rate and assess their products (sentiment analysis):
- A general-audience article about "sentiment analysis on the web" (published by the NY Times)
- A general-audience article about "automatic sentiment classification and blog analysis" (published in the Spanish newspaper El Pais)
- A Carnegie Mellon University study on Twitter sentiment (relating it to opinion poll results)
- An urban legend? Can data mining be done on Facebook? (a negative perspective). Here is a positive perspective on this approach: predicting users' mood from Facebook (in Spanish)
- What can be "mined-analyzed" in social networks? Five ideas to make money "mining" social networks [article]
- Google punishes companies that do not take care of their clients [article published in the Spanish newspaper Expansion]
- Insurance companies: detecting fraud by social network analysis [description]
- The social web: challenges for mining social networks
- Mining cellphone data:
- MIT and Nokia project: "Reality mining"
- A description of the "Reality mining" project's website
- "Cellphone records reveal patterns of human activity"
- Prediction of the future:
- Google: a tool for analyzing your historical data and predicting future outcomes
- Google "Prediction API": an API for sentiment analysis, recommender systems, spam detection, etc.
- Microsoft: Predestination, a Microsoft tool for predicting the future
- Companies that use data mining at the core of their products:
- BitYviP: applying data mining techniques (neural networks) to several social analysis tasks: social shopping, recommendation, advertising...
- Last.fm: applying recommender techniques for music recommendations (click the link for a description of their job offers)
- BioBayex: a spin-off company of the University of Almeria, specialized in computational biology, bioinformatics and "bio"-related data mining applications
- ZZAlpha: a company that makes stock market recommendations
- The Climate Corporation: a company that learns and sells weather predictions related to farming insurance, making use of the vast amount of free data published by the National Weather Service on heat and precipitation patterns around the country
- Job offers in the "data mining" field (on kdnuggets.com)
- Highlighted videolectures for teaching purposes (introductory and motivating lectures for students):
- F. Fogelman (ECML 2008): Industrial data mining, challenges and perspectives [link]
- A.N. Srivastava (KDD 2009): Data mining at NASA: from theory to applications [link]
- U. Fayyad (KDD 2007): A data miner's history: getting to know the grand challenges [link]
- D. Mladenic (SLSF 2005): Dimensionality reduction by feature selection in machine learning [link]
- University of Granada, Artificial Intelligence Department: a general-audience video about several applications of computational intelligence (in Spanish) [link]
- M. Golovnya (MIT Sports Analytics Conference 2011): "A step by step introduction to data mining for sports analysis" [link]
Updated on March 26, 2012, by Iñaki Inza
Intelligent Systems Group
University of the Basque Country, San Sebastian, Spain