The individual-level behaviors embedded in raw social media data reveal customer preferences, purchase histories, significant life events, moods, personalities, and other attributes. These attributes can be extracted through text mining and stored in social media data marts.
The pioneers of social networking as we know it today appeared in the late 1970s, when the bulletin board was one of the first interactive message-sharing platforms. Later, in the 1990s, when Craigslist and AOL entered the spotlight, the groundwork was laid for the rapid growth of the social revolution. Social networking took off in the 21st century with Friendster, LinkedIn, MySpace, Flickr, Vimeo, and YouTube, along with Facebook (launched in 2004) and Twitter (launched in 2006), followed more recently by Google+ and Pinterest.
The widespread adoption of social media, accompanied by broader digital trends, has had a direct impact on brands, which have developed digital strategies for mobile environments. Social communication effectively prolongs the relationship between brands and customers. Before the advent of e-commerce and social media, customers typically researched a product and then made the purchase, and the relationship between buyer and seller ended until the next need to purchase arose. Word of mouth was limited to the customer's physical social network. Now customers' opinions are amplified through social networks and may even reach the entire consumer community.
Brand marketers know that today's consumers actively gather information before buying, looking for positive and negative reviews and making quick price comparisons, all of which can be done with a few clicks on a mobile device. They also know that consumers are now more responsive to the opinions of others in their social networks. This has led to a new type of influence-based loyalty program that offers material incentives and rewards to individuals with strong brand influence. Consumers are becoming the guardians of the brand, so tuning brand personality and brand identity has never been more important to a brand's survival.
So how do brands manage the collection of digital interaction data? Technology is catching up with the social consumer. Social networks provide site-specific traffic and statistics tools (such as Facebook Insights and YouTube Insights), and there are social media management suites such as HootSuite as well as influence measurement portals such as Klout, which give brands third-party options for tracking engagement metrics. Many commercial social listening tools, such as Radian6, SM2, Viralheat, and Sysomos, provide reporting, text analysis, administration, sentiment analysis, visitor information, and engagement workflow. These tools are growing in scope and usability, but many aspects of them are still at an early stage of development. Sentiment analysis, for example, lacks accuracy, and social data from services such as the Twitter firehose and from companies such as Gnip and DataSift remains expensive and limited in availability. As a result, there is strong demand for extending these commercial tools with core text mining capabilities and for building a proprietary social media data mart. A social media data mart stores consumer-level information from social media interactions together with all related digital information, such as location, device, mobile behavior, mobile payments, platform, and the velocity of comment data.
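As a rough illustration of what consumer-level storage might look like, the following is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and helper functions are hypothetical and chosen only to mirror the attributes listed above (platform, location, device, comment text); a production data mart would have a far richer schema.

```python
import sqlite3


def create_data_mart(conn):
    """Create a single hypothetical fact table for social interactions."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS social_interaction (
            interaction_id INTEGER PRIMARY KEY,
            consumer_id    TEXT NOT NULL,
            platform       TEXT NOT NULL,
            posted_at      TEXT NOT NULL,  -- ISO 8601 timestamp
            location       TEXT,
            device         TEXT,
            comment_text   TEXT NOT NULL
        )
        """
    )


def add_interaction(conn, consumer_id, platform, posted_at,
                    location, device, comment_text):
    """Store one consumer-level interaction, using bound parameters."""
    conn.execute(
        "INSERT INTO social_interaction "
        "(consumer_id, platform, posted_at, location, device, comment_text) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (consumer_id, platform, posted_at, location, device, comment_text),
    )


def comments_per_platform(conn):
    """Return a {platform: comment count} mapping."""
    rows = conn.execute(
        "SELECT platform, COUNT(*) FROM social_interaction "
        "GROUP BY platform ORDER BY platform"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    create_data_mart(conn)
    add_interaction(conn, "c1", "twitter", "2013-01-05T10:00:00",
                    "Austin", "mobile", "Love the new phone")
    print(comments_per_platform(conn))
```

Keeping the raw comment text next to the contextual attributes means that text mining results can later be joined back to the individual consumer and channel.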
Text Mining and Semantic Methods
Given that social media generates large amounts of consumer data, how can brands turn raw social media comments from Twitter, Facebook, blogs, and forums into actionable intelligence? The answer is to apply text mining and semantic techniques to these new sources of unstructured data.
Text mining refers to the techniques used to extract information from different text sources. Why is it so important? It is generally estimated that 80% of all business-related information is unstructured or semi-structured text data. In other words, the business information and consumer behavior data embedded in that 80% is wasted unless text analysis is applied to it. Text mining, often called text analytics, has many practical applications, such as spam filtering, extracting information from opinions and suggestions on e-commerce sites, social listening and opinion mining on blogs and review sites, enhanced customer service and e-mail support, automated processing of business documents, electronic discovery in the legal field, measuring consumer preferences, claims analysis and fraud detection, and cybercrime and national security applications.
Text mining is similar to data mining in that it also identifies interesting patterns within data. Although manual (and highly labor-intensive) text mining appeared in the 1980s, in recent years the field has become increasingly important for refining search engine result algorithms and for sifting through data sources to discover previously unknown information. Techniques such as machine learning, statistics, computational linguistics, and data mining all play an important role in this process. For example, the goal of knowledge discovery in text is to use natural language processing (NLP) to detect underlying semantic relationships from text, content, and implied context. The process is designed to use NLP to replicate, and then scale, the same kind of linguistic distinctions, pattern recognition, and understanding that a person applies when reading and working with text.
There are various methods in the field of text mining. The following is a list of common, sequential steps involved in text mining.
The first step in text mining is to identify the text-based sources you want to analyze and to collect this material, either through information retrieval or by selecting a corpus that contains the set of text files and the content of interest. Next, broader NLP can be deployed: part-of-speech tagging and text sequencing to parse syntax (that is, to tokenize the text), and named entity recognition (that is, identifying references to brands, people's names, places, common abbreviations, and so on). An iterative stopword-filtering step removes stopwords to refine the desired subject content. Pattern-based entity recognition identifies e-mail addresses and phone numbers, and coreference resolution identifies noun phrases and related objects in the text, followed by relationship, fact, and event extraction. N-grams, which treat a series of consecutive words as a single term, are usually generated as well. Finally, sentiment analysis is performed; social media listening and classification tools now widely use it to extract attitudinal information about an object or topic. Often, mapping and charting capabilities also provide visualization for further validation.
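Several of the steps above can be sketched in a few lines of plain Python. The following is a minimal, illustrative sketch, not a production pipeline: the stopword list, the two sentiment word lists, the regular expressions, and the sample comment (including the brand name "Acme") are all assumptions made for the example. Part-of-speech tagging, named entity recognition, and coreference resolution are omitted because they require a trained NLP library.

```python
import re

# Illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "and", "at", "i", "is", "it", "me", "of", "or", "the", "to"}

# Tiny hypothetical sentiment lexicons, for illustration only.
POSITIVE = {"great", "love"}
NEGATIVE = {"hate", "poor"}

# Simplified patterns; real e-mail and phone formats vary much more widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def extract_patterns(text):
    """Pattern-based recognition of e-mail addresses and phone numbers."""
    return {"emails": EMAIL_RE.findall(text), "phones": PHONE_RE.findall(text)}


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Matched e-mail addresses and phone numbers are blanked out first
    so that they do not pollute the token stream.
    """
    cleaned = PHONE_RE.sub(" ", EMAIL_RE.sub(" ", text))
    return TOKEN_RE.findall(cleaned.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def ngrams(tokens, n):
    """Join every run of n consecutive tokens into a single term."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def naive_sentiment(tokens):
    """Count positive minus negative lexicon hits (a very crude score)."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


if __name__ == "__main__":
    comment = ("I love the new Acme phone, and the battery life is great. "
               "Email me at jane@example.com or call 555-123-4567.")
    terms = remove_stopwords(tokenize(comment))
    print(extract_patterns(comment))
    print(terms)
    print(ngrams(terms, 2))
    print(naive_sentiment(terms))
```

Note that this sketch builds n-grams across sentence boundaries and scores sentiment by simple word counting, so it ignores negation and sarcasm, which is one reason automated sentiment scores can be inaccurate.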
Text Mining Tools
Many commercial and open source options exist for text mining software and applications. IBM offers a wide range of robust text mining solutions. One powerful option is IBM® InfoSphere® BigInsights™, whose big data capabilities include an additional text analytics module that can run text analysis extraction on an InfoSphere BigInsights cluster. The IBM SPSS® product family is broad in scale and scope. One tool that is very effective at searching documents and assigning them to topics is IBM SPSS Modeler, which provides a graphical interface for common text document categorization and analysis. Another product, IBM SPSS Text Analytics for Surveys, uses NLP and is useful for analyzing open-ended survey questions within a document. IBM SPSS Modeler Premium runs on the same engine as SPSS Text Analytics for Surveys but is more scalable, providing a comprehensive workbench that facilitates the integration of structured and unstructured data (PDFs, web pages, blogs, e-mails, Twitter feeds, and so on). A related custom code node for Facebook extends SPSS Modeler Premium so that it can read data directly from a Facebook wall and combine it with Twitter feeds in SPSS Modeler, giving a view across more social media channels.
RapidMiner and R are among the most popular open source text mining tools. R has the larger user base; it is a programming language that requires writing code, and it offers many algorithms to choose from. Scalability has always been a problem for R, however, so it is not an ideal choice for large datasets unless a workaround is used. RapidMiner has a smaller user base, but it does not require writing code and has a powerful user interface (UI). It is also highly scalable and can handle clusters and in-database processing. IBM provides a Jaql R module that integrates R projects within a query, allowing MapReduce jobs to run R computations in parallel.