Watch, listen and write. AI and text processing
One of the interesting topics at the intersection of AI and market research is AI’s ability to process text. At first, this does not sound too complicated. After all, we have had OCR tools and transcription software for years. Now natural language processing is stepping in.
Market research relies on exact data and, at the same time, generates a lot of text. That makes a few areas worth a closer look.
Text-coding and translations. Almost every survey has open-ended questions. Text-coding turns the answers to these questions into categorized, quantifiable data.
- Alongside the regular sorting, there is another data layer available in the responses: the emotion layer that accompanies every answer. It can provide meaningful data if models are trained on it over time. Categorizing longer open-ended answers is an easy task for AI. For brief sentences, however, the results may get diluted due to a lack of context. There are steps in this direction, but they are still at an experimental level.
- Another great AI application is training specific models to catch gibberish open-ended answers by detecting how close a particular text is to natural language. This type of development requires extensive research and constant model retraining, but as pioneers in this area, we find the results we see quite impressive.
- Machine translations are handy as well: they save time, and their reliability increases every year. Here, again, AI might run into translation obstacles, but these concerns are generally expressed as sentence-level accuracy (SACC), a score tied to expert annotators’ judgments that differs per language model. For more straightforward translations, the SACC is between 0.89 and 0.98 for high-quality language pairs like English-French, English-German, or English-Spanish. The low complexity of most survey texts makes machine translation a good fit for the industry if done with the right prerequisites in mind.
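To make the gibberish check from the list above more concrete, here is a minimal sketch of one common way to estimate how close a text is to natural language: a character-bigram model. This is an illustration under stated assumptions, not the production approach described in this article. The reference corpus, the function names, and the scoring scheme are all hypothetical, and a real system would train on a large set of genuine survey answers per language.

```python
import math
from collections import Counter

# Tiny reference corpus standing in for real survey answers (illustrative only).
REFERENCE_TEXT = (
    "the product is easy to use and the price is fair "
    "i like the design but the delivery was too slow "
    "customer service was helpful and answered my questions quickly "
    "the quality is good and i would recommend it to my friends "
    "it was hard to find the information i needed on the website"
)

ALPHABET = "abcdefghijklmnopqrstuvwxyz "


def normalize(text):
    """Lowercase the text and keep only letters separated by single spaces."""
    kept = "".join(ch if ch in ALPHABET else " " for ch in text.lower())
    return " ".join(kept.split())


def train_bigram_model(text):
    """Return add-one-smoothed log-probabilities for every character pair."""
    text = normalize(text)
    pair_counts = Counter(zip(text, text[1:]))
    first_counts = Counter(text[:-1])
    size = len(ALPHABET)
    return {
        (a, b): math.log((pair_counts[(a, b)] + 1) / (first_counts[a] + size))
        for a in ALPHABET
        for b in ALPHABET
    }


def naturalness(text, model):
    """Average log-probability per character transition; higher means more natural."""
    text = normalize(text)
    if len(text) < 2:
        return float("-inf")  # too short to score
    pairs = list(zip(text, text[1:]))
    return sum(model[pair] for pair in pairs) / len(pairs)


# Hypothetical model built from the toy corpus above.
MODEL = train_bigram_model(REFERENCE_TEXT)

if __name__ == "__main__":
    for answer in ["the service was good", "xqzv jkkwq pzxq"]:
        print(f"{answer!r}: {naturalness(answer, MODEL):.2f}")
```

Answers whose score falls well below that of typical responses can be flagged for review. Where to set that cut-off is an assumption that has to be tuned on real data, which is one reason such models need regular retraining.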
Speech and video to text. Focus groups and audio and video interviews generate a lot of audio content. This content needs to go through a series of processing steps and quality checks to be turned into text, and finally to be categorized via text-coding.
- The audio content itself is incredibly valuable. It gives respondents the flexibility to let their thoughts flow and, in less time, to provide more natural and meaningful research information.
- Video content also comes with visual signals of the user’s emotions, an additional data layer. The younger audience gets distracted more quickly and is comfortable with video formats and chats; thus, we expect that in the upcoming years the way surveys are conducted will change completely, redefined by digital natives.
We look closely at the speech-to-text and video-to-text opportunities and challenges in another article.
In general, the experiments with text categorization and translation are only scratching the surface of what is possible. We expect to see some fantastic market research solutions coming up in the next few years.