“Data voids are gaps in search engine results that bad actors can exploit by filling them with disinformation and manipulative content”
Analysis
- Claim: "Data voids" are gaps in search results that malicious actors can fill with disinformation and manipulative content
- Verdict: TRUE
- Evidence Level: L2 — multiple scientific sources confirm the concept and mechanism
- Key Anomaly: Data voids emerge not only from absence of content, but also because credible media fail to cover topics users are interested in
- 30-Second Check: The data voids concept was developed by Microsoft Research and Data & Society researchers, documented in peer-reviewed publications, and confirmed by empirical studies of search behavior
Steelman — What Proponents Claim
The concept of "data voids" was first formulated by Michael Golebiewski of Microsoft Bing and danah boyd of Data & Society, in a Data & Society report first released in 2018 and updated in 2019 (S001, S015). According to this theory, data voids are specific search queries or thematic areas for which too little high-quality, authoritative information is available.
The key mechanism works as follows: when a user enters an obscure or specific search query and the search engine cannot find relevant results from reliable sources, algorithms are forced to return what is available — often low-quality content, manipulative materials, or outright disinformation (S001, S015). Malicious actors can intentionally create content targeting rare search phrases, knowing that competition for these queries is minimal.
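The mechanism described above can be illustrated with a toy ranking model. This is a minimal sketch, not how any real search engine works: the `Page` class, the `authority` score, the example URLs, and the made-up term "zorbex" are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str
    authority: float  # hypothetical 0..1 trust score, assumed for illustration


def search(query: str, index: list[Page], top_k: int = 3) -> list[Page]:
    """Return pages containing every query term, most authoritative first."""
    terms = query.lower().split()
    matches = [p for p in index if all(t in p.text.lower().split() for t in terms)]
    return sorted(matches, key=lambda p: p.authority, reverse=True)[:top_k]


# Hypothetical toy index: reliable sources cover only the common topic.
INDEX = [
    Page("health-agency.example/flu", "flu vaccine safety data", 0.9),
    Page("news.example/flu", "flu vaccine rollout report", 0.8),
    Page("fringe.example/flu", "flu vaccine hoax exposed", 0.1),
    Page("fringe.example/zorbex", "zorbex protocol cures flu", 0.1),
]

common = search("flu vaccine", INDEX)      # authoritative pages outrank the fringe page
rare = search("zorbex protocol", INDEX)    # only the fringe page matches at all

print([p.url for p in common])
print([p.url for p in rare])
```

For the common query, ranking by authority pushes the low-quality page to the bottom. For the rare query there is nothing to rank against, so the only matching page is returned regardless of its quality. That is the void.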
Researchers identify several types of data voids (S002):
- Natural voids — emerge organically when authoritative media simply do not cover certain topics
- Manipulative voids — created intentionally through use of specific terms and phrases not used in mainstream sources
- Exploited voids — existing gaps that are discovered and filled with disinformation
It is important to note that data voids differ from "data deficits." Deficits occur when much information exists but it is misleading, confusing, false, or even harmful (S011). Voids, by contrast, are characterized precisely by the absence of quality content.
What the Evidence Actually Shows
Empirical research confirms the existence and impact of data voids on disinformation spread. A study published in Nature in 2024 highlights that data voids represent a previously underappreciated aspect of the online misinformation phenomenon (S006).
Critical research on search behavior published in the Journal of Experimental Political Science showed that when users search online to evaluate false news, they risk falling into data voids where they find only corroborating evidence from low-quality sources (S013). Results indicate that online search to evaluate misinformation can actually increase belief in false claims if users fall into data voids.
A study of extremist content in Google published in Big Data & Society in 2023 applied a critical approach to big data analytics to gauge the contours of data voids in search queries reflecting extreme-right narratives (S007). The study demonstrated how data voids contribute to the politics of exclusion and marginalization of certain groups.
Particularly revealing is a 2025 study examining LLM-powered chatbot references to Kremlin disinformation (S009). The research showed that the appearance of disinformation content in LLM responses often reflects information gaps rather than intentional manipulation. Among the factors the authors identify is that data voids often emerge when credible media fail to cover topics users are interested in (S009).
A 2025 study in healthcare showed that data voids pose significant challenges to online health information, especially in areas where reliable information is scarce or absent (S003). This has direct public health consequences, as people seeking medical information may find only unverified or dangerous recommendations.
An analysis of performative links and search engines, published in Information, Communication & Society in 2025, showed that data void manipulation relies on search terms rather than direct links, which makes it particularly effective (S005). The research emphasizes that the tactic only works if there are search queries that retrieve the manipulative content.
Conflicts and Uncertainties
Despite broad recognition of the data voids concept in academia, important nuances and limitations exist in understanding this phenomenon.
First, not all data voids result from malicious actions. As noted in Data & Society research, many voids emerge naturally when authoritative sources simply do not consider certain topics worthy of coverage (S001, S015). This creates an ethical dilemma: should journalists and researchers cover every emerging topic, risking legitimization of marginal narratives?
Second, tension exists between the concepts of data voids and data deficits. As First Draft explains, deficits are not the result of deliberate actions from bad actors but typically occur when much information exists but it is misleading (S011). The boundary between these concepts is not always clear, and in practice they may overlap.
Third, research on vaccine misinformation shows that data voids are just one of many mechanisms for spreading false information (S019). The WHO Vaccine Misinformation Management Field Guide mentions data voids as part of a broader infodemic ecosystem but emphasizes the need for a comprehensive approach.
A critical 2025 study questions the compatibility of various online activities, noting that regular social media users mask their true identities behind fictional avatars, and that amplifying conspiracy narratives has little in common with exploiting data voids (S014). This suggests that epistemologies grounded in neat taxonomies may be insufficient for understanding the complex reality of online disinformation.
Additionally, research on psychological aspects of misinformation emphasizes that data voids can exist both in a psychological sense as "data deficits" — knowledge gaps paired with high information demand — and in the information environment if there is no or limited information available online on a particular emerging topic (S016).
Interpretation Risks
Several critical risks exist in misinterpreting the data voids concept, which could lead to counterproductive strategies for combating disinformation.
Risk 1: Overestimating Intentionality. Not all data voids are created by malicious actors. Many emerge naturally because authoritative sources do not cover certain topics (S009). Assuming malicious intent in all cases can lead to ineffective countermeasures and unjustified censorship.
Risk 2: Ignoring Platform Roles. Data voids are not only a content problem but also a search engine algorithm problem. As research in NASIG notes, search engines are transforming, and their role in creating and maintaining data voids requires critical analysis (S002). Focusing exclusively on content without considering the algorithmic component will be incomplete.
Risk 3: False Dichotomy Between Voids and Deficits. In practice, data voids and deficits often overlap and interact. Attempting to strictly separate these concepts may lead to oversimplified understanding of a complex information ecosystem (S011, S016).
Risk 4: Underestimating the Problem in Specialized Domains. Healthcare research shows that data voids present particularly serious challenges in medical information, where absence of reliable sources can have direct consequences for people's health (S003). General strategies for combating disinformation may be insufficient for specialized domains.
Risk 5: Ignoring Threat Evolution. As EU vs Disinfo notes, the list of tactics, techniques, and procedures (TTPs) in disinformation is far from final, as malign actors keep innovating and the threat landscape keeps evolving (S017). Data voids are just one of many mechanisms, and focusing exclusively on them may lead to blindness to other threats.
Risk 6: The Large Language Model Problem. The emergence of LLM chatbots creates a new dimension to the data voids problem. When insufficient information exists for training models, they may generate false information (S018). This requires rethinking the data voids concept in the context of generative AI.
Practical Recommendations
Based on analysis of scientific literature, the following recommendations can be formulated:
- For journalists and media: Increase support for reliable information sources on high-demand topics, even if they seem marginal (S009). Proactive coverage can prevent data void formation.
- For search platforms: Develop mechanisms for detecting data voids and warning users when quality information on a query is absent (S006).
- For researchers: Continue studying interactions between data voids, deficits, and other disinformation spread mechanisms (S016).
- For users: Develop media literacy, including understanding that absence of results from authoritative sources does not mean alternative sources are trustworthy (S013).
- For AI developers: Consider the data voids problem when training large language models and develop mechanisms to prevent generation of false information in areas with insufficient data (S009, S018).
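The platform recommendation above (detecting voids and warning users) could be approximated with a simple heuristic. The sketch below is an assumption-laden illustration, not a method taken from the cited sources: the `TRUSTED_DOMAINS` allowlist, the thresholds, and the function names are hypothetical, and the domain match is exact, so subdomains other than `www.` are not recognized.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real system would need a maintained, audited list.
TRUSTED_DOMAINS = {"who.int", "nature.com", "reuters.com"}


def domain_of(url: str) -> str:
    """Extract the host name from a URL, dropping a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def looks_like_data_void(
    result_urls: list[str],
    min_results: int = 5,
    min_trusted_share: float = 0.2,
) -> bool:
    """Flag a query when results are scarce or almost none come from trusted domains."""
    if len(result_urls) < min_results:
        return True
    trusted = sum(domain_of(u) in TRUSTED_DOMAINS for u in result_urls)
    return trusted / len(result_urls) < min_trusted_share


# Usage: pass the result URLs collected for one query.
print(looks_like_data_void(["https://fringe.example/a"]))
```

A flag from such a heuristic would only justify a warning to the user. It says nothing about whether the void is natural or deliberately exploited.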
Conclusion
The data voids concept is scientifically grounded and confirmed by multiple empirical studies. The claim that data voids represent gaps in search results that can be filled with disinformation and manipulative content corresponds to reality and is confirmed by L2-level evidence.
However, understanding nuances is important: not all data voids are created maliciously, they interact with other disinformation mechanisms, and their impact varies depending on context. Effective response to this phenomenon requires a comprehensive approach involving actions from media, platforms, researchers, and users themselves.
Examples
Search queries about rare diseases filled with pseudoscientific sites
When people search for information about rare medical conditions, they often encounter a lack of authoritative sources in search results. These data voids are quickly filled by sites promoting unverified treatments or conspiracy theories. To verify information, look for sources from recognized medical organizations, check for scientific publications, and consult with qualified specialists. Pay attention to publication dates and the presence of references to peer-reviewed research.
Manipulation of search queries during elections
Malicious actors create content for specific search queries about candidates or political events that have not yet been covered by mainstream media. Using search engine optimization (SEO), they occupy top positions in search results with disinformation or manipulative materials. To verify, compare information with several independent sources, check facts through specialized fact-checking services, and pay attention to who owns the website. Critically evaluate emotionally charged content and verify the original sources of quotes.
Filling voids in searches about new technologies
When new technologies or terms emerge, there is a time gap before expert content is created. Scammers exploit this gap to promote fake investment schemes or spread panic through pseudo-expert articles. Verify authors' qualifications, seek opinions from recognized experts in the field, and use academic databases and technical forums. Be especially cautious with content that promises quick profits or uses alarmist rhetoric.
Red Flags
- Claims that voids get filled with disinformation but does not distinguish an incidental absence of content from deliberate manipulation
- Attributes activity to malicious actors without evidence of their involvement, confusing possibility with actual action
- Ignores that authoritative sources may leave a topic uncovered for legitimate reasons (insufficient data, ethics) rather than because of a void
- Does not specify which queries create voids, generalizing from rare cases to the entire search ecosystem
- Assumes that the first search result automatically fills the void, ignoring filters and ranking algorithms
- Does not separate voids caused by low demand from voids created by censorship or suppression of information
Countermeasures
- Pull Google Trends data on niche queries for 5+ years: check whether authoritative sources are really ignoring the topic or whether it is simply low-volume
- Analyze competitors' SERPs with SEMrush/Ahrefs: determine whether the voids are filled by legitimate outlets or only by fringe sites
- Run an A/B test of search algorithms: compare the results from Google, Bing, and DuckDuckGo for the same rare query and look for systematic differences
- Examine Wayback Machine archives covering 10 years: check whether the information was available earlier and why it disappeared from the index
- Interview SEO specialists and journalists: ask why they do not cover a specific topic, whether for lack of demand or because of editorial policy
- Compare citations in Google Scholar: if authoritative studies exist but do not rank, that points to an algorithmic rather than an informational problem
- Check for content in specialized databases (PubMed, arXiv, ResearchGate): the void may exist in Google but not in the scholarly ecosystem
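The cross-engine comparison in the list above can be quantified once the result URLs for the same rare query have been collected by hand from each engine. The sketch below assumes that manual collection step and calls no search engine API; the engine labels and URLs are placeholders, and Jaccard similarity of result domains is one possible measure chosen here for illustration.

```python
from itertools import combinations
from urllib.parse import urlparse


def domains(urls: list[str]) -> set[str]:
    """Reduce a list of result URLs to the set of host names they come from."""
    return {urlparse(u).netloc.lower() for u in urls}


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of domains common to both sets; 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(results: dict[str, list[str]]) -> dict[tuple[str, str], float]:
    """Jaccard overlap of result domains for every pair of engines."""
    doms = {engine: domains(urls) for engine, urls in results.items()}
    return {(x, y): jaccard(doms[x], doms[y]) for x, y in combinations(sorted(doms), 2)}


# Placeholder data: URLs copied manually from each engine's first results page.
collected = {
    "engine_a": ["https://a.example/1", "https://b.example/2"],
    "engine_b": ["https://b.example/9", "https://c.example/3"],
}
overlap = pairwise_overlap(collected)
print(overlap)
```

Consistently low overlap on a rare query, with fringe domains dominating every engine, would point to a gap in available content. Low overlap with authoritative domains present on only some engines would point to ranking differences instead.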
Sources
- Data Voids: Where Missing Data Can Easily Be Exploited (scientific)
- Data Voids and Echo Chambers: The Transformative Journey of Search (scientific)
- Into the Void: Understanding Online Health Information in Low-Web Data (scientific)
- How online misinformation exploits 'information voids' — and what to do (scientific)
- Google, data voids, and the dynamics of the politics of exclusion (scientific)
- 'Google this!' How performative links and search engines organise (scientific)
- LLMs grooming or data voids? LLM-powered chatbot references to Kremlin disinformation (scientific)
- Identifying 'data deficits' can pre-empt the spread of disinformation (media)
- Testing the Effect of Information on Discerning the Veracity of News in Real-Time (scientific)
- Why misinformation must not be ignored (scientific)
- The Fragmentation of Truth (media)
- Vaccine Misinformation Management Field Guide (other)