Research on Stock Forecasting Based on WeChat Big Data

Big data has been a hot topic in recent years, with great influence both internationally and domestically. Economics, politics, sociology, and many scientific disciplines will undergo great or even fundamental changes, which will in turn affect human value systems, knowledge systems, and lifestyles. The global economy now generates an unprecedented amount of data. Comparing the volume of data generated every day to the great flood of myth is entirely apt. This flood of data is brand-new and powerful; it is frightening, of course, but also extremely exciting.

My topic is how to use big data technology to study stock forecasting in the Internet environment. Today I want to share four points that I think are meaningful.

1. Business forecasting with big data

With big data, we can make effective forecasts about faults, crowd flow, traffic, electricity consumption, the stock market, disease prevention, transportation, food distribution, and industrial supply and demand. This article is concerned with predicting the stock market.

The core of big data is prediction, which depends on data analysis. Are the analysis methods designed around the results of random sampling, and do such methods introduce errors?

Traditionally, limits on resources and technology, such as staff and computing power, made it impossible to process all the data to obtain the results people cared about. Random sampling arose for this reason: the selected individuals stand in for the whole population, and sampling at random makes the inference more scientific. Big data, however, reflects a new understanding that becomes possible once resources and technology have developed to a certain stage. Just as the arrival of electricity moved humanity into a period of rapid development, so does big data, which means using all samples and inferring from the whole population. In this article, big data means the stream of information about all stocks across the whole social network. As for the data source, this article does not use all the data on social networks; it analyzes only WeChat, the most representative social medium, as its information source.

Interactive data can reflect users' emotions, while search data can reflect users' concerns and intentions. Which of these two kinds of data is more valuable in stock market forecasting?

I think both are valuable. Interactive data reflects users' likes and dislikes of a specific stock, which can be described simply as whether to keep holding the stock or sell it. Search data represents the process of users collecting information about the stock, which is a measure of attention. A high search volume for a stock means that news about it has great influence. Interaction represents direction, and search represents amplitude.

We know that the conclusions drawn from these two kinds of data will differ. How do you balance what the two kinds of data show in order to make a prediction?

As mentioned in the answer to the previous question, for a fundamental decision such as whether to recommend, buy, or sell a stock, interactive data should be considered. Once the stock has been bought, however, search data can provide a sense of range, similar to bond ratings of A, AA, AAA, and so on, for investors' reference, because different investors have different tolerance for risk.

Does this mean that the main distribution channel is Weibo? WeChat official accounts are very popular now. Have you considered publishing news through this channel?

In fact, there are many ways to disseminate information, and the influence of WeChat as a new medium cannot be underestimated. At present, however, email, SMS, and similar channels require the least technical investment. Pushing stock and market news through a WeChat official account will be considered in the future.

If messages are pushed through a WeChat official account in the future, will the pushed messages be collected again as a data source? How much impact will this have?

They will be collected, but the daily volume of information about individual stocks on the Internet is very large. Such a push will increase the weight of a recommended stock by 1 point, while the weight of each stock will be in the hundreds of thousands, so the impact will be minimal.

The data source is WeChat official accounts. Apart from accuracy, did you also consider that collecting data this way is less likely to violate personal privacy?

From a legal point of view, searching WeChat or other personal chat records is an infringement of personal privacy. Therefore, if Tencent opened such an interface, every citizen could complain, protest, or even bring legal proceedings against such behavior until the company corrected its mistake and compensated for the losses.

Does this mean that even if there are illegal acts, Tencent bears the consequences, and we, as users of the data, do not need to bear any legal responsibility?

As a system technology provider and a member of society, we should abide by the ethics of big data and by national laws. If collecting certain data would violate personal privacy, the system will not collect it. Google has the motto "Don't be evil", and the same applies to the system described in this article.

2. Stock recommendation experiment based on big data

The timeliness of a stock reflects how recently WeChat articles about it were published. The more recent the articles, the greater the value of the data.

The popularity of a stock reflects how frequently it is currently receiving attention. The higher the attention frequency, the higher the likelihood that the stock will rise.
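
The two measures above, timeliness and popularity, can be combined into a single attention score. The following is a minimal sketch, not the system's actual scoring method: the exponential decay, the `half_life_hours` parameter, and the function name `rank_by_attention` are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def rank_by_attention(articles, now, half_life_hours=24.0):
    """Rank stocks by recency-weighted article counts.

    articles: iterable of (stock_code, published_at) pairs.
    Each article adds a weight that halves every `half_life_hours`
    (an assumed decay rule), so recent articles count for more.
    """
    scores = Counter()
    for stock_code, published_at in articles:
        age_hours = (now - published_at).total_seconds() / 3600
        scores[stock_code] += 0.5 ** (age_hours / half_life_hours)
    # Highest score first: frequent and recent coverage ranks on top.
    return scores.most_common()


# Hypothetical articles, for illustration only.
now = datetime(2015, 4, 8, 6, 0)
articles = [
    ("600000", now - timedelta(hours=1)),
    ("600000", now - timedelta(hours=2)),
    ("000001", now - timedelta(hours=48)),
]
ranked = rank_by_attention(articles, now)
```

A stock with two articles from the last few hours outranks a stock with one article from two days ago, which matches the intent that both frequency and freshness raise priority.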

Data integrity: we cyclically save the search results from the WeChat search website for about 2,236 stocks listed in Shenzhen and Shanghai (excluding the ChiNext board).

Data consistency: the file format is determined by the program responsible for saving data files, and using a single process ensures that the files are consistent.

Accuracy of data: because the analyzed subscription-account articles are published by official accounts on the WeChat platform, the harm that false news does to the prediction system is eliminated to some extent.

Timeliness of data: considering disk reads and writes, the network bandwidth available to the collection program, and blocking of the collection program by the search engine, the program waits 5 seconds between two successive requests, so in theory 11,180 seconds (3.1 hours) are enough to collect the data needed for the day's recommendation. For each trading day, collecting all the data between 9:00 and 9:30 requires at least 7 devices to achieve the best results. This experiment was limited by the test equipment: on a single device, data collection starts at 6:00 a.m. every trading day, which also meets the timeliness requirement.
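
The timing estimate above can be checked with a short calculation. This is a minimal sketch, assuming one request per stock, a 30-minute collection window, and an even split of stocks across devices; the function names are illustrative and not part of the original system.

```python
import math

STOCK_COUNT = 2236   # stocks covered, from the text
DELAY_SECONDS = 5    # pause between two requests, from the text


def collection_seconds(stock_count, delay_seconds):
    """Total time for one device to fetch every stock once."""
    return stock_count * delay_seconds


def devices_needed(total_seconds, window_seconds):
    """Devices required to finish within the window, splitting stocks evenly."""
    return math.ceil(total_seconds / window_seconds)


total = collection_seconds(STOCK_COUNT, DELAY_SECONDS)
window = 30 * 60  # assumed 30-minute window before trading
print(total, round(total / 3600, 1), devices_needed(total, window))
# prints: 11180 3.1 7
```

A single device that starts at 6:00 a.m. would finish at about 9:06 a.m. under the same assumptions.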

Data analysis: comparing the opening and closing prices of the three highest-priority stocks on a given day (April 8, 2015) with the Shanghai Composite Index on the same day shows that the algorithm's price-difference gain exceeds that of the market as a whole, taking the Shanghai Composite Index as the sample.
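
The same-day comparison described above can be expressed as an excess return over the index. This is a sketch only: equal weighting of the picks is an assumption, the function names are illustrative, and the prices in the usage example are hypothetical values rather than data from the experiment.

```python
def open_to_close_return(open_price, close_price):
    """Fractional gain from the open to the close of one trading day."""
    return close_price / open_price - 1.0


def excess_over_index(picks, index_bar):
    """Average open-to-close return of the picks minus that of the index.

    picks: list of (open, close) pairs for the recommended stocks.
    index_bar: (open, close) pair for the benchmark index.
    Equal weighting of the picks is an assumption.
    """
    pick_returns = [open_to_close_return(o, c) for o, c in picks]
    average = sum(pick_returns) / len(pick_returns)
    return average - open_to_close_return(*index_bar)


# Hypothetical prices, for illustration only.
picks = [(10.0, 11.0), (20.0, 21.0), (50.0, 50.0)]
index_bar = (4000.0, 4040.0)
excess = excess_over_index(picks, index_bar)
```

A positive result means the recommended stocks gained more than the index did between the open and the close of that day.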

Experimental conclusion: following the above method, the system recommends stocks each day, buys them at the open, and sells them on the next trading day. Over 21 trading days in one month (March 1, 2015 to March 31, 2015), the system's return was 2% per month. The investment sentiment obtained by searching WeChat official accounts is positively correlated with market trends, so it can be used as a factor in stock selection.
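
The trading rule in the conclusion can be sketched as a small backtest. The text does not say at which price the stock is sold on the next trading day, so selling at the next day's open is an assumption here, as are the function names; transaction costs are ignored, and the prices in the usage example are hypothetical.

```python
def next_day_returns(picks, open_prices):
    """Per-trade returns for buying at the open and selling one day later.

    picks[i] is the stock recommended on trading day i.
    open_prices[stock][i] is that stock's opening price on day i.
    Selling at the next day's open is an assumption.
    """
    returns = []
    for day, stock in enumerate(picks):
        opens = open_prices[stock]
        if day + 1 >= len(opens):
            break  # no next trading day available to sell on
        returns.append(opens[day + 1] / opens[day] - 1.0)
    return returns


def compound(returns):
    """Total return when all capital is reinvested in each trade."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


# Hypothetical prices, for illustration only.
picks = ["A", "B", "A"]
open_prices = {
    "A": [10.0, 11.0, 11.0, 12.1],
    "B": [20.0, 20.0, 19.0, 19.0],
}
monthly = compound(next_day_returns(picks, open_prices))
```

Running the same loop over every trading day in a month gives a figure directly comparable to the monthly return reported above.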

3. The development trend of big data in stock forecasting

There are three kinds of network data:

The first is browsing data, which is mainly used to analyze consumer behavior in e-commerce. Browsing data records each step of a user's visit, traces the user's navigation path, and supports analysis of the probability of jumping between pages.

The second is search data, which mainly refers to the time series of keyword search frequency recorded by search engines and can reflect the interests, concerns, and intentions of hundreds of millions of users.

The third is interactive data, mainly from Weibo, WeChat, and social networking sites, which reflects users' inclinations and emotions.
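
For the first kind, browsing data, the jump probability between pages can be estimated by counting consecutive page pairs in recorded visit paths. This is a minimal sketch using a first-order estimate, which is an assumption; the page names and the function name are illustrative.

```python
from collections import Counter, defaultdict


def transition_probabilities(paths):
    """Estimate P(next page | current page) from recorded visit paths."""
    counts = defaultdict(Counter)
    for path in paths:
        # Count each consecutive (current, next) pair in the path.
        for current_page, next_page in zip(path, path[1:]):
            counts[current_page][next_page] += 1
    return {
        page: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for page, nexts in counts.items()
    }


# Hypothetical visit paths, for illustration only.
paths = [
    ["home", "list", "detail"],
    ["home", "list", "cart"],
    ["home", "detail"],
]
probs = transition_probabilities(paths)
```

In this example two of the three visits go from `home` to `list`, so the estimated probability of that jump is 2/3.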

The views of Robert Shiller, winner of the 2013 Nobel Prize in Economics, have been quoted by countless interviewees. The investment model Shiller designed in the 1980s is still praised by the industry. His model relies mainly on three variables: the planned cash flow of investment projects, the estimated cost of company capital, and the reaction of the stock market to investment (market sentiment). He believes that the market itself contains subjective judgment: investor sentiment affects investment behavior, and investment behavior directly affects asset prices.

Using natural language processing, computers extract useful information by analyzing news, research reports, social information, search behavior, and so on. With intelligent analysis based on machine learning, big data investing can cover thousands of strategies, whereas quantitative investing in the past could cover only dozens.

Research on economic prediction based on Internet search data and social behavior has gradually become a new academic hotspot and has produced results in the economic, social, and health fields. In capital market applications, search data has been found to predict effectively both future stock market activity (measured by trading volume indicators) and changes in stock price trends.

For search data, the key is the correlation mechanism between Internet search behavior and the stock market. This research sits at the intersection of behavioral finance and the Internet. Its principle is that adjustments in stock volume and price are the response of investors' behavior in the stock market, and that the same investor behavior leaves corresponding signs in Internet search activity. What we need to do is find the behavioral indicators in search activity that lead stock trading, and combine the leading search indicators of many investors to predict future stock trading.
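
One simple way to look for a leading search indicator is to correlate search volume with trading volume at several lags. This is a minimal sketch, assuming Pearson correlation is an adequate measure and that both series are daily and equally long; the function names and the sample series are illustrative, not taken from the study.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_lead(search, volume, max_lag=3):
    """Return (lag, correlation) for the lag where search[t] best
    matches volume[t + lag], with lag measured in days."""
    results = []
    for lag in range(max_lag + 1):
        xs = search[: len(search) - lag]  # search values that have a partner
        ys = volume[lag:]                 # trading volume `lag` days later
        results.append((lag, pearson(xs, ys)))
    return max(results, key=lambda item: item[1])


# Hypothetical series in which trading volume follows search by two days.
search = [1, 3, 2, 5, 4, 6, 2, 7]
volume = [0, 0, 10, 30, 20, 50, 40, 60]
lag, corr = best_lead(search, volume)
```

A lag greater than zero with a high correlation suggests that the search series moves ahead of trading, which is the property a leading indicator needs.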

Just as with weather forecasting, we continually optimize the model, feed in large amounts of information, and then produce results. About 80% of the information processed is "unstructured" data, such as policy documents, natural events, the geographical environment, and scientific and technological innovation. This kind of information is usually difficult for computers and models to digest. Using semantic analysis, the financial conversations in interactive data can be quantified as investment advice between "-1 (extremely bearish)" and "1 (extremely bullish)", so that the text of interactive data can be analyzed as a signal for stock market investment.
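
The mapping from text to a score between -1 and 1 can be illustrated with a tiny lexicon-based scorer. This is a sketch under strong assumptions, not the semantic analysis used by the system: the word lists are hypothetical, and real WeChat text would first need Chinese word segmentation and punctuation handling, neither of which is shown.

```python
# Hypothetical word lists, for illustration only.
BULLISH = {"buy", "rally", "breakout", "upgrade"}
BEARISH = {"sell", "crash", "downgrade", "loss"}


def sentiment_score(text):
    """Score text from -1.0 (extremely bearish) to 1.0 (extremely bullish).

    The score is (bullish - bearish) / (bullish + bearish) over matched
    words, and 0.0 when no lexicon word appears.
    """
    words = text.lower().split()
    bullish = sum(word in BULLISH for word in words)
    bearish = sum(word in BEARISH for word in words)
    total = bullish + bearish
    if total == 0:
        return 0.0
    return (bullish - bearish) / total


score = sentiment_score("Buy now before the rally")
```

Averaging such scores over all messages about one stock gives a single daily signal on the same -1 to 1 scale.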

4. The future that is happening

Big data is not a cold world of algorithms and machines, and the role of human beings still cannot be completely replaced. What big data gives us is not the final answer but a reference answer. Its help is provisional, and better methods and answers lie in the near future.

Big data has a wide influence at the practical level and has solved many everyday problems. More importantly, it will reshape how we live, work, and think. In some respects, we face a deadlock whose impact is greater than the sharp expansion in the scope and scale of social information caused by other epoch-making innovations. The ground under our feet is moving. What was certain in the past is being questioned. Big data requires people to reconsider the nature of decision-making, fate, and justice. Having knowledge once meant mastering the past; now it means being able to predict the future.

In the world of big data, human beings still need to play an important role. Our characteristic weaknesses, illusions, and mistakes are all necessary, because the other side of these traits is human creativity, intuition, and talent. This suggests that we should be willing to accept similar inaccuracies, because inaccuracy is one of the traits that make us human, just as we learn to deal with messy data because that data serves a broader goal. Messiness is part of the essence of the world and of how the human brain works; whether it is the messiness of the world or of the brain, we benefit only by learning to accept and apply it.

I believe that by combining basic data, search data, and interactive data in a weighted calculation, big data can screen all stocks and thus give investment advice. I believe that our bodies have just entered the era of big data, while our minds are still stuck in small data and sampling thinking. Those who take the lead in breaking with established thinking through reason will also be the first to gain the benefits that big data brings.
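
The weighted calculation mentioned above can be sketched as a weighted sum of three factor scores per stock. The weights, the assumption that each factor is already normalized to a common scale, the choice of three picks, and the function names are all illustrative; the text does not specify them.

```python
def composite_score(basic, search, interactive, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of three factor scores (the weights are an assumption)."""
    w_basic, w_search, w_interactive = weights
    return w_basic * basic + w_search * search + w_interactive * interactive


def rank_stocks(factors, weights=(0.4, 0.3, 0.3), top_n=3):
    """Return the top_n (stock_code, score) pairs, highest score first.

    factors: dict mapping stock_code to (basic, search, interactive).
    """
    scored = [
        (code, composite_score(*values, weights=weights))
        for code, values in factors.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Hypothetical factor scores, for illustration only.
factors = {
    "A": (0.5, 0.5, 0.5),
    "B": (1.0, 0.0, 0.0),
    "C": (0.0, 1.0, 1.0),
}
top = rank_stocks(factors)
```

With these weights, stock C ranks first because its search and interactive scores together outweigh its missing basic score.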