Data mining is the process of discovering knowledge or hidden patterns in large databases.
The overall goal of data mining is to extract information from databases and transform it into an understandable format for future use.
It is used by business intelligence organizations, financial analysts, marketing organizations, and companies with a strong consumer focus, such as those in retail, finance, and communications.
It can also be seen as one of the core processes of knowledge discovery in databases (KDD).
The KDD process can be viewed as the following sequence of steps.
Data extraction/gathering :- To collect the data from its sources, e.g. data warehousing.
Data cleansing :- To eliminate bogus data and errors.
Feature extraction :- To extract only the task-relevant data, i.e. to obtain the interesting attributes of the data.
Pattern extraction and discovery :- This step is the data mining process itself, and it is where the effort should be concentrated.
Visualization of the data and evaluation of results :- To create the knowledge base.
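The five steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the records, the validity rule for ages, and the choice of "item pairs bought together" as the pattern to mine are all hypothetical and do not come from any particular system.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records, standing in for data gathered from a warehouse.
RAW_RECORDS = [
    {"customer": "a", "age": 34, "basket": ["bread", "milk"]},
    {"customer": "b", "age": -1, "basket": ["bread", "butter"]},  # bogus age
    {"customer": "c", "age": 51, "basket": ["bread", "milk", "eggs"]},
    {"customer": "d", "age": 29, "basket": []},                   # empty basket
    {"customer": "e", "age": 42, "basket": ["milk", "eggs"]},
]

def gather():
    """Data extraction/gathering: collect the records from the source."""
    return list(RAW_RECORDS)

def cleanse(records):
    """Data cleansing: drop records with an impossible age or an empty basket."""
    return [r for r in records if 0 <= r["age"] <= 120 and r["basket"]]

def extract_features(records):
    """Feature extraction: keep only the task-relevant attribute (the basket)."""
    return [sorted(set(r["basket"])) for r in records]

def mine_patterns(baskets, min_support=2):
    """Pattern discovery: item pairs that occur together in at least min_support baskets."""
    counts = Counter(pair for basket in baskets for pair in combinations(basket, 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

def evaluate(patterns):
    """Evaluation: rank the patterns by support so a person can review them."""
    return sorted(patterns.items(), key=lambda item: (-item[1], item[0]))

knowledge_base = evaluate(mine_patterns(extract_features(cleanse(gather()))))
for pair, support in knowledge_base:
    print(pair, support)
```

With this sample data, records "b" and "d" are removed during cleansing, and the pairs (bread, milk) and (eggs, milk) each remain with a support of 2.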
Classification is a data mining technique that assigns each item to one of a predefined set of groups or classes.
The goal of classification is to accurately predict the target class for each item in the data.
For example, a classification model could be used to identify loan applicants as low, medium, or high credit risks.
The simplest type of classification problem is binary classification. In binary classification, the target attribute has only two possible values: for example, high credit rating or low credit rating. Multiclass targets have more than two values: for example, low, medium, high, or unknown credit rating.
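As a rough illustration of the credit-risk example, the sketch below classifies an applicant by copying the class of the most similar known applicant (a 1-nearest-neighbour rule, chosen here only because it is short). The training data, the two features, and the scaling are hypothetical assumptions, not a recommended credit model.

```python
import math

# Hypothetical training data:
# (annual income in thousands, debt-to-income ratio) -> credit risk class.
TRAINING = [
    ((85.0, 0.10), "low"),
    ((90.0, 0.15), "low"),
    ((55.0, 0.35), "medium"),
    ((60.0, 0.30), "medium"),
    ((25.0, 0.70), "high"),
    ((30.0, 0.65), "high"),
]

def scale(features):
    """Bring income onto roughly the same 0..1 range as the ratio."""
    income, ratio = features
    return (income / 100.0, ratio)

def predict(features, training=TRAINING):
    """Multiclass prediction: return the class of the nearest training example."""
    x = scale(features)
    nearest = min(training, key=lambda example: math.dist(x, scale(example[0])))
    return nearest[1]

def to_binary(label):
    """Collapse the three-class target into a binary one: 'high' versus 'not high'."""
    return "high" if label == "high" else "not high"

applicant = (58.0, 0.33)
print(predict(applicant), "/", to_binary(predict(applicant)))
```

The same model serves both cases: `predict` returns one of three classes, while `to_binary` shows how a multiclass target can be reduced to a two-valued one.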
Sentiment analysis is a sub-domain of opinion mining in which the analysis focuses on extracting people's emotions and opinions about a particular topic.
Sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic.
The attitude may be his or her judgment or evaluation, affective state (that is to say, the emotional state of the author when writing), or the intended emotional communication (that is to say, the emotional effect the author wishes to have on the reader).
With opinion mining, we can distinguish poor content from high-quality content.
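One simple way to estimate a writer's attitude is to count positive and negative words. The sketch below assumes a tiny, hypothetical word list and a naive negation rule; practical systems rely on much larger lexicons or on trained classifiers.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicon.
POSITIVE = {"good", "great", "excellent", "love", "helpful"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "useless"}
NEGATIONS = {"not", "never", "no"}

def sentiment(text):
    """Label text as 'positive', 'negative', or 'neutral' from word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        polarity = (token in POSITIVE) - (token in NEGATIVE)  # +1, -1, or 0
        # Naive negation handling: flip polarity if the previous word negates it.
        if polarity and i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("The course was great and the notes were helpful"))
print(sentiment("Not good, the support was terrible"))
```

A word-count approach like this captures only the overall polarity of the text; it does not separate the author's judgment from the emotional effect intended for the reader.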
Random Forest Technique
In this technique, a set of decision trees is grown and each tree votes for a class; the votes of the different trees are then combined, and the most popular class is predicted for each sample.
This approach is designed to improve on the accuracy of a single decision tree by producing many trees that vote on the class prediction. It is an ensemble classifier composed of several decision trees, and the final result is the majority vote of the individual trees (or, for regression, the mean of the individual trees' results).
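The voting idea can be sketched in a few lines of plain Python. To keep the example short, each "tree" here is only a one-split decision stump trained on a bootstrap sample and a randomly chosen feature. That is a simplification of a real random forest, and the data set and parameter names are hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Return the most common label."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, labels, feature):
    """Fit a one-split tree (a stump) on one feature, minimising training errors."""
    values = sorted({row[feature] for row in rows})
    best = None  # (errors, threshold, left_label, right_label)
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # only one distinct value: always predict the majority class
        label = majority(labels)
        return lambda row: label
    _, threshold, left_label, right_label = best
    return lambda row: left_label if row[feature] <= threshold else right_label

def train_forest(rows, labels, n_trees=25, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def predict(forest, row):
    """Each tree votes for a class; the most popular class wins."""
    return majority([tree(row) for tree in forest])

# Hypothetical data: two numeric features per applicant, two credit-risk classes.
rows = [(1.0, 1.5), (2.0, 1.0), (1.5, 2.5), (3.0, 3.0), (2.5, 2.0),
        (7.0, 8.5), (8.0, 7.0), (9.0, 9.0), (7.5, 7.5), (8.5, 8.0)]
labels = ["high"] * 5 + ["low"] * 5

forest = train_forest(rows, labels)
print(predict(forest, (2.0, 2.0)), predict(forest, (8.0, 8.0)))
```

Because the final answer is a vote, an individual stump trained on an unlucky bootstrap sample can be wrong without changing the prediction of the ensemble.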