The "black technology" behind microscope-level marketing insight

"Advertising seems simple, but every advertisement should be based on a lot of data, information and research lasting for months. -claude Hopkins

I. "Black Technology Behind Microscopic Insight

Market insight has always been the marketer's eyes: where you look, and what you see, shapes the direction of the business ahead.

In earlier years, "market insight" was still called "market research." With only limited sample sizes available, the results were often inaccurate, let alone deserving of the word "insight."

The rise of the internet made real "insight" possible for the first time. Yet, limited by technology and product capability, advertisers could only target through coarse-grained, industry-wide labels. Different advertisers in the beauty category, for instance, all had to share a single "beauty industry" label, so business analysis still fell short of true "accuracy."

Today, in the 5G era, the traffic dividend is gradually receding. A nearly saturated mobile internet advertising market signals the arrival of the "stock era," and marketing has entered a stage of refinement. Insight therefore needs to be more precise in order to support clearer business decisions.

Take the beauty industry as an example. Advertisers need to know not only where the people interested in beauty are, but also which of them are interested in their own brand.

Ad placement is no longer limited to fixed slots; it can also be matched to contextual scenes. A product focused on whitening, for example, can be placed into content scenes related to whitening.

Insight is no longer a rough industry survey, but an analysis of the market returns of a specific sub-category, or even a single product.

In fact, all of these capabilities sit inside the cloud map, the Ocean Engine's commercial data product. With such fine-grained insight, the once hazy middle and upper reaches of marketing come into clearer view, and advertisers can act with precision and control more of the decision details.

"Behind the microscope-level insight is the machine's reeling from the trillions of data streams and the efficient linkage between human beings and machines.

Around the demand for "precise insight," the Ocean Engine technical team decided to upgrade its capabilities at three levels, developing solutions that are more detailed, flexible, and fast:

1. Basic layer: improve the machine's understanding of content and produce more diversified labels.

2. Application layer: insight should not only be accurate, but also be what advertisers actually need. Build a standardized label-production platform that flexibly meets advertisers' personalized label needs, achieving "what you need is what you get."

3. Efficiency layer: speed up data queries so that advertisers can see analysis results immediately and follow up quickly with decisions.

II. The Basic Layer: Letting the Machine Understand a Richer World

Within the Ocean Engine, content is the most basic raw material: countless content streams merge into the underlying data pool that powers business analysis. But just as crude oil must go through a series of industrial processes before it becomes commercially valuable, a great deal of work is still needed to extract accurate business insight from massive amounts of content. One of the most critical steps is getting the machine to understand more of the information.

The more information the machine can extract, the finer the granularity of the labels it outputs, and the more tangible the business insights that can ultimately be found.

Overall, this work proceeds along two lines. The first is recognition granularity: raising the machine's text understanding to the level of individual words. The second is recognition breadth: giving the machine the ability to understand video. Both aim to let the machine extract more information from massive content and "see a richer world."

1. Fine-grained text understanding

In text recognition, the machine's ability to understand can be divided into three levels of fineness. Given the same article, a primary-level machine only knows that the text is about cars, so its labels are correspondingly coarse. An intermediate-level machine understands at the sentence level and can identify how much of the article's car-related content is actually about the engine. An advanced machine is smarter still: it can identify the key words within sentences, so the brand, model, styling, performance, configuration, and other aspects of the car discussed in the article can all be accurately recognized.
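
To make the three levels concrete, here is a purely illustrative sketch of how the same car article might be tagged at each level. All tag names and fields are hypothetical, not the platform's actual schema.

```python
# Purely illustrative: the same article tagged at the three levels described
# above. All tag names and fields are hypothetical.

article_tags = {
    # Primary level: one coarse label for the whole document.
    "document_level": ["automotive"],

    # Intermediate level: per-sentence topics, so the share of the article
    # devoted to, say, the engine can be measured.
    "sentence_level": [
        {"sentence": 0, "topics": ["exterior styling"]},
        {"sentence": 1, "topics": ["engine", "performance"]},
        {"sentence": 2, "topics": ["engine", "configuration"]},
    ],

    # Advanced level: word-granularity attributes picked out of the text.
    "word_level": {
        "component": ["engine", "gearbox"],
        "attribute": ["performance", "configuration"],
        "brand": [],  # none mentioned in this hypothetical article
    },
}

sentences = article_tags["sentence_level"]
engine_share = sum("engine" in s["topics"] for s in sentences) / len(sentences)
print(engine_share)  # roughly 0.67: two of the three sentences discuss the engine
```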

Within the Ocean Engine, the machine's text understanding has been refined to word granularity, currently the smallest unit of semantic understanding. Put simply, the technical team formulates a set of keyword strategies with commercial attributes, covering signals such as semantic relevance, term frequency, hot-search trends, and whether the data source itself is commercial. Following this strategy, the machine ranks the words it recognizes by how critical they are; words that satisfy the strategy rank higher and are ultimately defined as commercial keywords. Once these commercial keywords pass through the system's "art design," they become the word clouds we often see in analyses.
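
As a rough illustration of what such a keyword-ranking strategy could look like, here is a minimal sketch. The feature names, weights, and threshold are assumptions made for illustration; the platform's actual strategy is not described in the article.

```python
# A minimal sketch of a commercial-keyword ranking strategy of the kind
# described above. All weights and thresholds are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Candidate:
    word: str
    semantic_relevance: float   # similarity to the article's commercial topic, 0..1
    term_frequency: float       # normalized frequency within the article, 0..1
    search_trend: float         # hot-search trend score, 0..1
    commercial_source: bool     # whether the data source has commercial attributes

def commercial_score(c: Candidate) -> float:
    """Combine the signals into a single ranking score (weights are illustrative)."""
    score = (0.4 * c.semantic_relevance
             + 0.3 * c.term_frequency
             + 0.2 * c.search_trend)
    if c.commercial_source:
        score += 0.1
    return score

def select_commercial_keywords(candidates, threshold=0.5):
    """Rank candidates by score and keep those above the cutoff as commercial keywords."""
    ranked = sorted(candidates, key=commercial_score, reverse=True)
    return [c.word for c in ranked if commercial_score(c) >= threshold]

words = [
    Candidate("whitening essence", 0.9, 0.6, 0.8, True),
    Candidate("weather", 0.1, 0.4, 0.2, False),
]
print(select_commercial_keywords(words))  # ['whitening essence']
```

The retained words are what would then be rendered, after "art design," as the word cloud mentioned above.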

2. Broader content recognition

For a machine, video usually combines images, audio, text, and other forms, so it is harder to recognize than text alone. In the technical field, the ability to recognize and understand information in multiple forms through machine learning is called multimodal learning, where a "modality" is a carrier of information such as text, images, or sound. Video understanding is therefore a typical application scenario for multimodal learning: through it, the machine can recognize more forms of data and understand content more fully.

Generally speaking, getting a machine to understand video involves three main steps: representation, fusion, and classification.

"Representation is similar to translation, that is, it transforms different types of data such as text, images and sounds into a data language that machines can understand, that is, data with the same structure. In the "fusion stage", the machine will adopt different strategies to integrate the information of multiple modes, find the correlation between these information, and form a unified cognition. Finally, after fully understanding, the machine classifies the data according to the rules of first-class and second-class industry attributes, and classifies similar data into one category, and finally outputs "labels".

Explainer video on multimodal technology: how does a machine understand video?

In layman's terms, multimodal technology is like a person who has mastered several languages. On the one hand, when one modality is missing, the content can still be understood through another; on the other hand, by fusing information from different modalities, the machine can understand the content more accurately.

By understanding both text and video, the machine sorts the enormous content stream into a wide range of tags, from relatively coarse-grained category tags to fine-grained keywords. Together, these form a vast commercial tag library, the underlying foundation for meeting advertisers' varied marketing needs.

III. The Application Layer: Efficiently Meeting Personalized Insight Needs

Through content-understanding technology, the machine can output increasingly accurate labels. But these labels are standardized products that cannot be changed or adjusted once produced, so they still struggle to meet the personalized needs of some advertisers.

For example, if an advertiser only wants to target people interested in its own brand, or only wants analysis of the interest points related to its own products, it needs a bespoke set of labels. Producing them involves several steps: defining the labels, mining the underlying database according to rules, then evaluating and testing before the labels go online.

All of these steps are carried out on the label platform. Simply put, the label platform is a label production and management tool built on content-understanding capability. Through a standardized workflow, business colleagues with no technical background can customize labeling rules to their actual needs and flexibly produce labels on the platform.
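
What such a self-service label rule might look like is sketched below. The rule fields, label name, keywords, and workflow are assumptions for illustration, not the platform's actual interface.

```python
# A hedged sketch of a custom label rule and the mining step that applies it.
# Field names, thresholds, and keywords are hypothetical.

custom_label = {
    "name": "interested_in_brand_X_whitening",            # hypothetical label
    "definition": "users engaging with brand X whitening content",
    "mining_rules": {
        "any_keywords": ["brand X", "whitening essence"],  # hypothetical keywords
        "category": "beauty",
        "min_engagements_30d": 3,
    },
}

def mine_users(label, user_events):
    """Apply the mining rule to an engagement log and return matching user ids."""
    rules, hits = label["mining_rules"], {}
    for event in user_events:
        if event["category"] != rules["category"]:
            continue
        if not any(k in event["keywords"] for k in rules["any_keywords"]):
            continue
        hits[event["user_id"]] = hits.get(event["user_id"], 0) + 1
    return {uid for uid, n in hits.items() if n >= rules["min_engagements_30d"]}

events = [{"user_id": 1, "category": "beauty", "keywords": ["whitening essence"]}] * 3
print(mine_users(custom_label, events))  # {1}
```

In an actual workflow, a candidate label like this would still go through the evaluation and testing described above before going online.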

Later, after internal testing, the label platform was opened up externally and became the "label factory" within the cloud map.

In the words of the technical team, the value of the label platform is like a restaurant opening up its kitchen: if nothing on the menu suits a guest's appetite, the guest can walk into the kitchen, pick the right ingredients, and make exactly the dish they want.

In short, the label platform makes precise insight "adaptive": not only accurate, but also genuinely what advertisers need.

Finally, with content understanding and the label platform combined, advertisers can see market trends across categories by analyzing content metrics for the whole platform. By analyzing the UGC and PGC content of a specific category, they can even learn whether a product's selling points match users' perceptions, what the positive and negative comments are, and how the product's benefits are performing.

On the audience side, advertisers can circle a product's interest and opportunity groups on the label platform and find KOLs whose followers overlap heavily with the target audience, greatly reducing the risk of marketing decisions.
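
The "high overlap with the target audience" idea reduces to a simple set comparison; here is a hedged sketch that ranks KOLs by the share of their followers who fall inside the circled target group. All ids and names are invented.

```python
# Rank KOLs by how much of their audience overlaps the circled target group.
# Data and names are invented for illustration.

def overlap_ratio(target: set, followers: set) -> float:
    """Share of a KOL's followers who fall inside the target group."""
    return len(target & followers) / len(followers) if followers else 0.0

target_group = {1, 2, 3, 4, 5}          # users circled on the label platform
kol_followers = {
    "kol_a": {2, 3, 4, 9},
    "kol_b": {7, 8, 9},
}

ranked = sorted(kol_followers,
                key=lambda k: overlap_ratio(target_group, kol_followers[k]),
                reverse=True)
print(ranked)  # ['kol_a', 'kol_b']
```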

IV. The Efficiency Layer: Putting Business Analysis at Your Fingertips

Much as in ancient warfare, where how quickly battle reports arrived shaped strategic decisions and could decide victory or defeat, timing matters in business analysis. For advertisers, if the data cannot be seen immediately, it cannot be reviewed promptly and its value is diminished; even accurate insight then offers only a partial view.

In fact, every time an advertiser sends a query request, the system must perform a series of complex operations in a massive database, querying, computing, and analyzing, before the target data is presented. Yet from the advertiser's point of view, all of this happens in the blink of an eye.

Such processing speed is mainly due to how the data is stored. For machines, the way different types of data are stored largely determines query speed. It is like looking for the Four Great Classical Novels in a large library: if the books are sorted by subject and by first letter, you can quickly locate every one of them.

So, for data storage, the team brought in outside help: ClickHouse, a high-performance open-source database management system well suited to scenarios like the cloud map, with very large data volumes and frequent, flexible query demands. With its columnar storage structure and column-by-column computation, combined with data sharding on the business side, ClickHouse can efficiently read and compute exactly the data advertisers need.

For example, suppose an advertiser wants to analyze content about women in first- and second-tier cities who love chocolate. A traditional row-oriented database would read all the data and filter, in turn, for the three labels: first- or second-tier city, female, and chocolate lover; users matching all three conditions form the advertiser's target group. With ClickHouse, the system does not need to read all the data: it only queries the columns holding those three labels and processes the three sets of data in parallel over the user dimension, which greatly reduces query time.
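
The column-oriented idea behind that example can be sketched as follows. The table and column names are hypothetical; the SQL string shows what an equivalent ClickHouse query might roughly look like, while the pure-Python part simulates scanning only the three label columns rather than whole rows.

```python
# Sketch of the columnar-query idea. Table/column names are hypothetical.

EXAMPLE_SQL = """
SELECT user_id
FROM user_labels
WHERE city_tier IN (1, 2) AND gender = 'female' AND likes_chocolate = 1
"""

# Column-oriented storage: each label lives in its own array, aligned by row index.
user_id         = [101, 102, 103, 104]
city_tier       = [1,   3,   2,   1]
gender          = ["female", "female", "male", "female"]
likes_chocolate = [1,   1,   1,   0]

# Only the three label columns are scanned; no other attributes are touched.
matches = [
    uid
    for uid, tier, g, choc in zip(user_id, city_tier, gender, likes_chocolate)
    if tier in (1, 2) and g == "female" and choc == 1
]
print(matches)  # [101]
```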

On top of this, "BitMap technology" reduces the data storage footprint. A bit is the smallest unit in a computer system, and its value can be 1 or 0; the familiar "byte" is made up of 8 bits. A "BitMap" uses a bit-array data structure to establish a mapping between the original data and positions in the bit array. Because a bit is such a small storage unit, this often saves a great deal of storage space.

To put it very loosely, BitMap works a bit like an English abbreviation: writing IELTS out in full as "International English Language Testing System" takes 45 characters, but the abbreviation "IELTS" takes only 5, so the machine's reading time is greatly shortened.
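
A minimal sketch of the BitMap idea follows: each label is a bit array indexed by user id, membership costs one bit per user, and intersecting labels becomes a single bitwise AND. Plain Python ints serve as the bitmaps, and the ids are invented, independent of the earlier example.

```python
# BitMap sketch: each label is a bit array indexed by user id; set
# intersection is a bitwise AND; counting set bits gives audience size.

def bitmap_from_users(user_ids):
    """Set bit i for every user id i belonging to the label."""
    bm = 0
    for uid in user_ids:
        bm |= 1 << uid
    return bm

tier_1_or_2 = bitmap_from_users([101, 102, 104])
female      = bitmap_from_users([101, 102, 104])
likes_choc  = bitmap_from_users([101, 102, 103])

# Intersection of the three labels is a single bitwise AND.
target = tier_1_or_2 & female & likes_choc
print(bin(target).count("1"))                       # audience size: 2
print([i for i in range(110) if target >> i & 1])   # [101, 102]
```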

According to business test feedback, with ClickHouse and BitMap combined, the cloud map's query speed has increased by a factor of one to five, and query time is kept within 3-5 seconds, truly delivering "what you need is what you get" for business analysis.

Conclusion:

Content understanding solves label accuracy from the bottom up; the label platform lets precise insight deliver more value at the top; and query technology brings all of this information into view within seconds. It is through these technical breakthroughs that the ability to understand more business detail has been achieved.

From creative production to insight and analysis, the Ocean Engine team now has many new ideas: making sentiment analysis more nuanced, the system more intelligent, and production more efficient. Through countless such refinements, the science of marketing is steadily reaching a wider audience, and these small, careful improvements will bring more advanced technology and solve more problems in the future.