What are the methods of data analysis?
Nine commonly used methods are listed below for reference:

I. Formula decomposition

Formula decomposition breaks an indicator down into its influencing factors, layer by layer, using a formula.

For example, to analyze why a product's sales are low, express sales as a formula and examine each factor in turn.
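As a minimal sketch of this idea, assume sales can be written as visitors × conversion rate × average order value. This particular formula, the factor names and the figures are illustrative assumptions, not taken from a real dataset.

```python
# Hypothetical decomposition: sales = visitors * conversion_rate * avg_order_value
def sales(visitors, conversion_rate, avg_order_value):
    return visitors * conversion_rate * avg_order_value


def factor_changes(before, after):
    """Relative change of each factor between two periods."""
    return {name: after[name] / before[name] - 1 for name in before}


# Made-up figures for two periods
last_month = {"visitors": 10000, "conversion_rate": 0.05, "avg_order_value": 40.0}
this_month = {"visitors": 10000, "conversion_rate": 0.04, "avg_order_value": 40.0}

changes = factor_changes(last_month, this_month)
# The factor with the largest relative drop is the first place to look
weakest = min(changes, key=changes.get)
print(weakest, round(changes[weakest], 3))
```

Each factor can then be decomposed again (for example, visitors by channel) until the cause is specific enough to act on.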

II. Comparative analysis

Comparison is the most commonly used method; it compares two or more groups of data.

Isolated data is meaningless; differences only appear through comparison. Examples include year-on-year and month-on-month comparison, growth rates, fixed-base ratios, comparison with competitors, comparison between categories, and comparison of characteristics and attributes. Comparison reveals patterns in how data changes; it is used frequently and is often combined with other methods.

Comparing the sales of companies A and B in the figure below: although company A's sales generally increased and were higher than company B's, company B grew much faster than company A, and even though its growth rate slowed in the later period, its sales eventually caught up.
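The year-on-year and month-on-month comparisons mentioned above both reduce to the same growth-rate calculation against a different base period. A small sketch, with made-up monthly sales figures:

```python
def growth_rate(current, base):
    """Relative change of `current` against the comparison base."""
    return (current - base) / base


# Illustrative figures only
monthly_sales = {"2023-05": 100.0, "2024-04": 120.0, "2024-05": 150.0}

# Month-on-month: compare with the previous month
mom = growth_rate(monthly_sales["2024-05"], monthly_sales["2024-04"])
# Year-on-year: compare with the same month one year earlier
yoy = growth_rate(monthly_sales["2024-05"], monthly_sales["2023-05"])

print(f"month-on-month: {mom:.0%}, year-on-year: {yoy:.0%}")
```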

III. A/B testing

An A/B test prepares two or more versions of a web or app interface or process, exposes them to similar groups of visitors over the same time period, collects user experience data and business data for each group, and finally analyzes the results to choose the best version for formal adoption. The A/B testing process is as follows:

(1) Analyze the current situation and form a hypothesis: analyze business data, identify the most critical point to improve, and propose a hypothesis and optimization suggestions. For example, we find that the user conversion rate is not high, and we hypothesize that the landing page used for promotion converts too poorly and needs to be improved.

(2) Set goals and make a plan: set a primary goal to measure how good each optimized version is, and set secondary goals to evaluate the impact of each version on other aspects.

(3) Design and development: produce design prototypes for two or more optimized versions and implement them.

(4) Traffic allocation: determine the share of traffic each test version receives. In the initial stage, the traffic given to the optimized version can be very small and then increased gradually as the situation allows.

(5) Collect and analyze data: collect experimental data and judge validity and effect. If statistical significance reaches 95% or above and holds for a period of time, the experiment can be ended; if it is below 95%, the test may need to run longer; if statistical significance stays below 95% or even 90% for a long time, decide whether to stop the experiment.

(6) Finally, based on the test results, decide whether to release the new version, adjust the traffic split and continue testing, or, if the expected effect was not achieved, keep iterating on the optimization and develop and launch a new test.
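For step (5), one common way to check significance when the goal is a conversion rate is a two-proportion z-test. The sketch below is only one possible approach; it reads "95% significance" as a two-sided p-value below 0.05, which is an interpretation, and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)


# Illustrative counts: version A converts 200 of 5000, version B 260 of 5000
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
significant = p_value < 0.05
print(f"z = {z:.2f}, p = {p_value:.4f}, significant: {significant}")
```

Checking the result repeatedly while the test runs inflates the false-positive rate, which is one reason the text asks for significance to hold for a period of time before stopping.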

The flow chart is as follows:

IV. Quadrant analysis

Quadrant analysis divides data along two or more dimensions and expresses the values of interest as coordinates. It moves directly from values to strategy, so that concrete actions can be taken. The quadrant method is strategy-driven thinking and is often used in product analysis, market analysis, customer management and commodity management. For example, the following figure shows the four-quadrant distribution of advertisement clicks, with the X-axis running from low to high from left to right and the Y-axis running from low to high from bottom to top.

Advertisements with a high click-through rate and a high conversion rate reach a relatively accurate audience efficiently. Advertisements with a high click-through rate and a low conversion rate attract many clicks, but the low conversion rate shows that the audience targeted by the advertising content does not quite match the actual audience of the product. Advertisements with a high conversion rate and a low click-through rate target an audience that closely matches the actual audience of the product, but the advertising content needs to be optimized to attract more clicks. Advertisements with a low click-through rate and a low conversion rate can be discarded. There is also the classic RFM model, which divides customers into eight segments according to three dimensions: recency, frequency and monetary value.
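The four advertising cases above can be sketched as a simple classifier. The cut-off values that separate "high" from "low" and the sample ads are assumptions chosen for illustration; in practice the cut-offs might be medians or business targets.

```python
def ad_quadrant(ctr, cvr, ctr_cut, cvr_cut):
    """Map an ad's click-through rate and conversion rate to a suggested action."""
    high_ctr = ctr >= ctr_cut
    high_cvr = cvr >= cvr_cut
    if high_ctr and high_cvr:
        return "keep: audience is accurate and efficient"
    if high_ctr and not high_cvr:
        return "retarget: ad audience does not match product audience"
    if not high_ctr and high_cvr:
        return "rewrite creative: right audience, too few clicks"
    return "discard"


# Hypothetical ads as (click-through rate, conversion rate)
ads = {"ad_1": (0.08, 0.06), "ad_2": (0.09, 0.01), "ad_3": (0.01, 0.07), "ad_4": (0.01, 0.005)}
actions = {name: ad_quadrant(ctr, cvr, ctr_cut=0.05, cvr_cut=0.03) for name, (ctr, cvr) in ads.items()}
for name, action in actions.items():
    print(name, "->", action)
```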

Advantages of quadrant method:

(1) Find out the cause of the problem.

Quadrant analysis makes it possible to do attribution analysis on events that share the same characteristics and to summarize their common causes. For example, in the advertising case above, effective promotion channels and strategies can be extracted from the first quadrant, while the third and fourth quadrants can be used to rule out some ineffective promotion channels.

(2) Establish a grouping optimization strategy.

Quadrant analysis can be used to establish optimization strategies for different quadrants. For example, in the RFM customer management model, customers are divided by quadrant into types such as key development customers, key retention customers, general development customers and general retention customers. Give more resources to key development customers, such as VIP service, personalized service and up-selling. For potential customers, sell higher-value products or take some preferential measures to attract them back.
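The eight RFM segments come from splitting each of the three dimensions into high and low. A minimal sketch follows; the cut-off values and the `R+`/`R-` style labels are hypothetical, and mapping each label to a named customer type is left to the business.

```python
from itertools import product


def rfm_segment(days_since_last, order_count, total_spend, cuts):
    """Label a customer by high (+) or low (-) recency, frequency and monetary value."""
    # A recent purchase means FEWER days since the last order
    r = "R+" if days_since_last <= cuts["recency_days"] else "R-"
    f = "F+" if order_count >= cuts["frequency"] else "F-"
    m = "M+" if total_spend >= cuts["monetary"] else "M-"
    return r + f + m


# Assumed cut-offs for illustration
cuts = {"recency_days": 30, "frequency": 5, "monetary": 1000.0}

segment = rfm_segment(days_since_last=12, order_count=8, total_spend=2500.0, cuts=cuts)
all_segments = ["".join(parts) for parts in product(["R+", "R-"], ["F+", "F-"], ["M+", "M-"])]
print(segment, len(all_segments))
```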

V. Pareto analysis

The Pareto principle is the classic 80/20 rule. For example, in terms of personal wealth, it can be said that 20% of the people in the world hold 80% of the wealth. In data analysis, it can be understood as 20% of the data producing 80% of the effect, so analysis should focus on that 20%. The 80/20 rule is usually applied together with ranking: the top 20% is treated as the effective data. The 80/20 method is about focusing on key points and applies to any industry. Find the key points, identify their characteristics, and then think about how to turn the remaining 80% into this 20% to improve results.

It is commonly used for product classification, to measure products and build an ABC model. For example, a retail enterprise has 500 SKUs and the sales for each SKU; which SKUs matter most? This is the problem of prioritization in enterprise management.

The common practice is to take the product SKU as the dimension and its sales as the basic measure, sort the SKUs by sales from largest to smallest, and for each SKU calculate the cumulative sales up to and including that SKU as a percentage of total sales.

If the cumulative percentage is within 70% (inclusive), the SKU is Class A. If it is between 70% and 90% (inclusive), it is Class B. If it is between 90% and 100% (inclusive), it is Class C. These thresholds can be adjusted to suit your actual situation.

The ABC model can be used not only to classify products and sales, but also to classify customers and customer transactions. For example, which customers contribute 80% of the enterprise's profit, and what proportion of all customers are they? If they make up 20%, then with limited resources we know to focus on retaining that 20% of customers.
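The ABC procedure described above (sort by sales, accumulate, compare the cumulative share with the 70% and 90% thresholds) can be sketched as follows. The SKU names and sales figures are made up.

```python
def abc_classify(sales_by_sku, a_cut=0.70, b_cut=0.90):
    """Assign each SKU to class A, B or C by cumulative share of total sales."""
    total = sum(sales_by_sku.values())
    # Largest sellers first
    ranked = sorted(sales_by_sku.items(), key=lambda item: item[1], reverse=True)
    classes = {}
    running = 0.0
    for sku, amount in ranked:
        running += amount
        share = running / total  # cumulative share including this SKU
        if share <= a_cut:
            classes[sku] = "A"
        elif share <= b_cut:
            classes[sku] = "B"
        else:
            classes[sku] = "C"
    return classes


# Illustrative sales per SKU
sku_sales = {"sku_1": 450.0, "sku_2": 200.0, "sku_3": 180.0, "sku_4": 100.0, "sku_5": 70.0}
classes = abc_classify(sku_sales)
print(classes)
```

The thresholds are parameters because, as the text notes, the ratios can be adjusted to the actual situation.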

VI. Funnel analysis

Funnel analysis uses a funnel diagram, which looks a bit like an inverted pyramid. It is a way of thinking in terms of a simplified process and is often used to analyze new-user acquisition, changes in shopping conversion rate, and other step-by-step processes.

The picture above is a classic marketing funnel, which shows each stage of the process from acquiring users to finally converting them into purchases. The conversion rate between adjacent stages quantifies the performance of each step. The funnel model therefore splits the whole purchase process into steps, measures each step by its conversion rate, and uses abnormal figures to locate the problematic step, so that the step can be fixed and optimized and the overall purchase conversion rate improved.

The core idea of the funnel model comes down to decomposition and quantification. For example, to analyze e-commerce conversion, we monitor user conversion at each layer and look for what can be optimized there. For users who do not follow the expected process, their conversion path is mapped separately in order to shorten the path and improve the user experience.

There is also the classic growth hacking model, the AARRR model, which stands for acquisition, activation, retention, revenue and referral: acquiring users, activating them, retaining them, earning revenue from them, and having them spread the product. This is a common model in product operations. Depending on the characteristics of the product and where it is in its life cycle, we focus on different metrics and formulate different operating strategies.

The AARRR model diagram below shows clearly that the number of users shrinks at each stage of the user life cycle. By breaking down and quantifying every stage of the life cycle and comparing the data horizontally and vertically, we can find the corresponding problems and carry out continuous, iterative optimization.
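Decomposition and quantification can be sketched in a few lines: list the user count at each stage, compute the conversion rate between adjacent stages, and look for the weakest one. The stage names and counts below are illustrative assumptions.

```python
# Hypothetical purchase funnel: (stage, number of users reaching it)
funnel = [
    ("visit", 10000),
    ("view product", 4000),
    ("add to cart", 1000),
    ("submit order", 600),
    ("pay", 480),
]


def step_conversions(funnel):
    """Conversion rate between each pair of adjacent stages."""
    return [
        (f"{name_a} -> {name_b}", count_b / count_a)
        for (name_a, count_a), (name_b, count_b) in zip(funnel, funnel[1:])
    ]


steps = step_conversions(funnel)
# The step with the lowest conversion rate is the first candidate for optimization
weakest_step, weakest_rate = min(steps, key=lambda step: step[1])
overall = funnel[-1][1] / funnel[0][1]

for name, rate in steps:
    print(f"{name}: {rate:.1%}")
print(f"weakest: {weakest_step}, overall: {overall:.1%}")
```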

VII. Path analysis

User path analysis tracks the user's behavior path from a start event to an end event, that is, it monitors where users flow. It can be used to measure the effect of website optimization or marketing promotion and to understand users' behavior preferences. Its ultimate purpose is to serve business goals: guide users along the optimal path through the product more efficiently and, in the end, lead them to pay. How is user behavior path analysis carried out?

(1) Compute the first step users take when they use the website or app, then compute the flow and conversion of each subsequent step in turn, so that the data faithfully reproduces the whole journey from opening the app to leaving it.

(2) Check the distribution of paths users take through the product. For example, after visiting the home page of an e-commerce product, what percentage of users searched, what percentage visited a category page, and what percentage went directly to a product details page?

(3) Analyze paths for optimization. For example, which path do users take most often, and at which step are users most likely to drop off?

(4) Identify user behavior characteristics from paths. For example, analyze whether users are goal-oriented or browsing aimlessly.

(5) Segment users. Users are usually classified by their purpose in using the app. For example, users of a car app can be segmented into attention-oriented, intention-oriented and purchase-oriented users, and path analysis of different access tasks is carried out for each type. For an intention-oriented user, for instance, what paths does he take and what problems does he meet when comparing different models? Another method is to run cluster analysis on all user access paths, group users by the similarity of their paths, and then analyze each group.

Take e-commerce as an example. From logging in to the website or app to paying successfully, the buyer goes through browsing the home page, searching for goods, adding to the shopping cart, submitting the order, paying for the order and so on. However, a real shopping process is tangled and repetitive. For example, after submitting an order, the user may return to the home page to keep searching for goods, or cancel the order. There is a different motive behind every path. By analyzing in depth together with other analysis models, we can quickly identify user motivation and guide users to the optimal or expected path.
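Steps (2) and (3) can be sketched by counting transitions in recorded sessions. The sessions and page names here are hypothetical; each session is assumed to be an ordered list of the pages one user visited.

```python
from collections import Counter


def next_step_shares(sessions, page):
    """Share of each page visited immediately after `page`."""
    counts = Counter()
    for path in sessions:
        for current, following in zip(path, path[1:]):
            if current == page:
                counts[following] += 1
    total = sum(counts.values())
    return {step: n / total for step, n in counts.items()} if total else {}


# Made-up sessions for an e-commerce product
sessions = [
    ["home", "search", "detail", "cart", "pay"],
    ["home", "category", "detail"],
    ["home", "search", "detail"],
    ["home", "detail", "cart"],
]

after_home = next_step_shares(sessions, "home")
# The last page of each session shows where users leave
exit_counts = Counter(path[-1] for path in sessions)
top_exit = exit_counts.most_common(1)[0][0]

print(after_home)
print("most common exit page:", top_exit)
```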

Example of user behavior path map:

VIII. Retention analysis

User retention means that new members/users still show certain attributes or behaviors, such as visiting, logging in, using the product or converting, after a certain period of time. The ratio of retained users to the new users of the original period is the retention rate. Retention rates fall into three categories according to the period used. Taking login as the behavior that marks retention:

The first category, daily retention, can be subdivided as follows:

(1) Next-day retention rate: (number of new users from the first day who log in on the next day) / (total number of new users on the first day).

(2) Third-day retention rate: (number of new users from the first day who log in on the third day) / (total number of new users on the first day).

(3) 7th-day retention rate: (number of new users from the first day who log in on the 7th day) / (total number of new users on the first day).

(4) 14th-day retention rate: (number of new users from the first day who log in on the 14th day) / (total number of new users on the first day).

(5) 30th-day retention rate: (number of new users from the first day who log in on the 30th day) / (total number of new users on the first day).

The second category, weekly retention, is the number of new users from the first week who are still logging in during each later week.

The third category, monthly retention, is the number of new users from the first month who are still logging in during each later month. Retention is measured for new users, and the result is a matrix report in which only half of the cells contain data. Each row is a date, and each column is the retention rate for a different time period. In general, the retention rate gradually decreases over time. The following monthly user retention curves are generated using monthly retention as an example:
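The daily retention formulas above share one shape, so a single function with a day offset covers them. This sketch assumes day N means N days after the sign-up date (so next-day retention is n=1), and the users and dates are made up.

```python
from datetime import date, timedelta


def day_n_retention(signups, logins, cohort_day, n):
    """Fraction of users who signed up on `cohort_day` and logged in n days later."""
    cohort = {user for user, day in signups.items() if day == cohort_day}
    if not cohort:
        return 0.0
    target = cohort_day + timedelta(days=n)
    retained = {user for user in cohort if target in logins.get(user, set())}
    return len(retained) / len(cohort)


# Illustrative data: sign-up date per user and the set of dates each user logged in
day0 = date(2024, 5, 1)
signups = {"u1": day0, "u2": day0, "u3": day0, "u4": day0, "u5": date(2024, 5, 2)}
logins = {
    "u1": {date(2024, 5, 2), date(2024, 5, 8)},
    "u2": {date(2024, 5, 2)},
    "u3": {date(2024, 5, 8)},
}

next_day = day_n_retention(signups, logins, day0, 1)
seventh_day = day_n_retention(signups, logins, day0, 7)
print(f"next-day: {next_day:.0%}, 7th-day: {seventh_day:.0%}")
```

Running the function for every cohort day and every offset fills in the triangular retention matrix.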

IX. Cluster analysis

Cluster analysis is an exploratory data analysis method. We usually use it to group and classify seemingly disordered objects in order to understand them better. A good clustering result has high similarity between objects within the same group and low similarity between objects in different groups. In user research, cluster analysis can address many problems, such as classifying website information, finding correlations in page-click behavior, and classifying users. User classification is the most common case.

There are many common clustering methods, such as K-means, spectral clustering and hierarchical clustering. Take the most common, K-means, as an example, as shown below:

The data can be divided into three different clusters, red, blue and green, and each cluster should have its own distinctive attributes. Cluster analysis is a form of unsupervised learning: it groups data without labels. After clustering the data, we usually analyze each cluster separately to get more detailed results.
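To make the K-means idea concrete, here is a minimal pure-Python sketch: assign every point to its nearest centroid, move each centroid to the mean of its points, and repeat. The points and the hand-picked starting centroids are assumptions for illustration; real projects would normally use a library implementation with a proper initialization strategy.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Basic K-means: returns (final centroids, clusters from the last assignment)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Three visibly separate groups of 2-D points (made up)
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9), (1, 8), (2, 9), (1, 9)]
start = [(0, 0), (10, 10), (0, 10)]

centroids, clusters = kmeans(points, start)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

Each resulting cluster can then be profiled separately, as the text suggests.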
