Mi Islita

Temporal Co-Occurrence: How Does a Developing Event Affect Search Results?

Temporal Co-Occurrence - This experiment investigates how an in-progress event might shape search results in the websphere and the blogosphere. FINDALL and EXACT query modes were tested. A developing natural event, Hurricane Rita, was investigated. Search, containment and co-occurrence responses were also tested. Applications to document volume and search volume are discussed.

Dr. E. Garcia
Mi Islita.com
Last Update: 06/14/06

Article 5 of the series Keywords Co-Occurrence and Semantic Connectivity

Topics

Abstract

Introduction

Procedure

Data Collection and Analysis

Experimental Results

Search Results

Containment Results

Co-Occurrence Results

Implications

Search Volume

Hurricane Katrina Revisited

Conclusion

References

Abstract

The IR community and search marketers have previously investigated how external events affect search results from commercial search engines. Such studies are often conducted in the websphere using the default FINDALL query mode of search engines. The events studied are usually not in-progress events but sudden events that have already occurred. Little has been written, however, about how unexpected, in-progress events or query modes shape search results, or about when these results deteriorate significantly.

In this article, we present preliminary experimental data aimed at addressing how an in-progress event might shape search results from commercial search engines. The experiment was conducted in the websphere and the blogosphere. FINDALL and EXACT query modes were used. A developing natural event, Hurricane Rita, was investigated and search results were monitored over time. Several containment and co-occurrence responses were tested. The EF-Ratio response seems to outperform all of them. Our results suggest that FINDALL searches do not provide a reliable analytical response. The experiment questions the validity of keyword research reports that do not discriminate between query modes or database results.

Introduction

According to the National Oceanic & Atmospheric Administration (NOAA), Hurricane Katrina made landfall at approximately 7:10 a.m. EDT on Aug. 29, 2005, in Plaquemines Parish, Louisiana (1). The hurricane impacted not only the environment and lives, but also the state economy, the stock market, and, as expected, search results from commercial search engines. This prompted us to investigate how evolving events like developing news stories, natural disasters, terrorist activities and other world events influence search results.

Prior studies indicate that there is a connection. According to an IBM group, searches for world trade center produced completely different results before and after the terrorist attack on September 11, 2001, indicating a change in the meaning of query terms (3). And according to Kleinberg, "understanding the pattern of a developing news story requires considering not just the content of the relevant articles but also how they evolve over time" (4). These studies suggest that external events can shape the meaning of queries and the content of documents.

Since the summer of 2005 was an active hurricane season, we made plans to study how the next hurricane might impact search results. We didn't have to wait long. A month later, NOAA issued an alert for a new hurricane, Hurricane Rita, which according to their reports eventually made landfall at approximately 3:30 a.m. EDT on Sept. 24, 2005, on the extreme southwest coast of Louisiana between Sabine Pass, Texas, and Johnson's Bayou in Louisiana (2).

Thus, on Sept. 21 we started collecting query results relevant to this developing event.

Procedure

We decided in advance on where, how and what to search. First, we did not want to limit the experiment to the websphere or to the default query mode of search engines. Since we were interested only in searches involving the terms hurricane and rita, search results in EXACT and FINDALL modes were monitored. Documents retrieved in EXACT mode contain the query terms in sequence (order and proximity matter), while documents retrieved in FINDALL (AND) mode contain the query terms in no particular order (order and proximity do not matter). These search modes have been described elsewhere.

Next, a decision was made on what to measure. For this type of study one can measure either query volume or document volume. Query volume (search volume) refers to what users search for, while document volume (relevancy volume) refers to the number of documents found through searches.

We realized that measuring the former as intended would require access to Google's search logs. Since this was not possible, we monitored the latter: document volume. Since Google results might vary slightly according to the user's location and the data center queried, we consider our results experimental approximations or educated estimates. Furthermore, these estimates are valid only for the portion of the web and blog spheres searchable through Google.

Google was used because, at the time of the experiment, we were under the impression that the other major search engines (Yahoo! and MSN) could not search the blogosphere with the required query modes. For instance, Yahoo! launched its blog search service around October 10, 2005, and MSN Spaces was limited to blog searches within the MSN domain.

To automate the process we wrote a simple program that takes a multiterm query and submits a set of derived queries, each one returning the corresponding results. Thus, for the input query hurricane rita the program submitted the following queries at once to Google Search and Google Blog (5, 6):

  1. k12 = k1 + k2 = hurricane rita
  2. "k12" = "k1 + k2" = "hurricane rita"
  3. k1 = hurricane
  4. k2 = rita

k12, k1 and k2 were submitted using Google's default mode (FINDALL). The double quotes in "k12" indicate that this query was submitted using Google's EXACT mode. In addition, the program submitted the transposed case (rita followed by hurricane; i.e., k21 and "k21"). However, since the effect of external events on transposed queries is outside the scope of this report, we omitted those results to simplify the discussion. They are on file and can be supplied upon request.
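The query expansion described above can be sketched as follows. This is a minimal illustration, not the original program: the function name `build_queries` and its `include_transposed` flag are hypothetical, and the sketch only builds the query strings without contacting any search engine.

```python
def build_queries(query: str, include_transposed: bool = False) -> dict[str, str]:
    """Expand a two-term query into FINDALL, EXACT and single-term queries.

    Hypothetical helper: labels follow the article's notation (k12, "k12", k1, k2).
    """
    terms = query.split()
    if len(terms) != 2:
        raise ValueError("this sketch handles two-term queries only")
    k1, k2 = terms
    queries = {
        "k12": f"{k1} {k2}",      # FINDALL (default) mode: both terms, any order
        '"k12"': f'"{k1} {k2}"',  # EXACT mode: quoted phrase, terms in sequence
        "k1": k1,                 # first term alone
        "k2": k2,                 # second term alone
    }
    if include_transposed:
        # Transposed case: second term followed by the first.
        queries["k21"] = f"{k2} {k1}"
        queries['"k21"'] = f'"{k2} {k1}"'
    return queries


if __name__ == "__main__":
    for label, text in build_queries("hurricane rita", include_transposed=True).items():
        print(f"{label} = {text}")
```

Each generated string would then be submitted to the web and blog search interfaces, and the reported result count recorded once per day.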

Data Collection and Analysis

Table 1 and Table 2 show the experimental data, collected daily over the following 31 days.

 Experimental Data

These results were used to monitor the following answer sets:

  1. n12 = number of documents relevant to hurricane and rita
  2. "n12" = number of documents relevant to the hurricane rita sequence
  3. n1 = number of documents relevant to hurricane
  4. n2 = number of documents relevant to rita

We analyzed these sets using five containment measures: P1, P2, "P1", "P2" and the EXACT-to-FINDALL ratio (EF-Ratio).

These measures are represented as Venn Diagrams in Figure 1.

Figure 1. Venn Diagram representation for several answer sets.

Note that P1, P2, "P1" and "P2" are containment measures while the EXACT-to-FINDALL ratio (EF-Ratio) is both a containment and a co-occurrence measure. For instance, P1 is the fraction of n2 contained in n1 and P2 is the fraction of n1 contained in n2. In terms of conditional probabilities, each of these measures is conditioned on the occurrence of a single term.

Unlike these, the EF-Ratio is the probability that k1 and k2 will co-occur in sequence provided that k1 and k2 have occurred. The EF-Ratio can be obtained from the "P1"/P1 and "P2"/P2 ratios.
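As a rough sketch of how these measures relate, the function below computes all five from the four answer-set counts. It assumes P1 = n12/n1 and P2 = n12/n2 (with the quoted versions using "n12"); the wording above admits the opposite assignment too, in which case the two denominators simply swap. The EF-Ratio, "n12"/n12, is the same under either reading. The n1 and n2 values in the demo are placeholders, not figures from Table 1 or Table 2.

```python
import math


def containment_measures(n12: float, n12_exact: float, n1: float, n2: float) -> dict[str, float]:
    """Compute the five containment measures from document counts.

    Assumed reading: P1 = n12/n1 and P2 = n12/n2; swap denominators for the other reading.
    """
    if min(n12, n1, n2) <= 0:
        raise ValueError("n12, n1 and n2 must be positive")
    return {
        "P1": n12 / n1,
        "P2": n12 / n2,
        '"P1"': n12_exact / n1,
        '"P2"': n12_exact / n2,
        "EF-Ratio": n12_exact / n12,  # EXACT-to-FINDALL ratio
    }


if __name__ == "__main__":
    # n12 and "n12" are the Day 4 websphere counts quoted in the article;
    # n1 and n2 are hypothetical placeholders used only for illustration.
    m = containment_measures(n12=21_600_000, n12_exact=9_760_000,
                             n1=100_000_000, n2=50_000_000)
    for name, value in m.items():
        print(f"{name:>8} = {value:.4f}")
    # The single-term denominator cancels, so both ratios give the EF-Ratio.
    print(math.isclose(m['"P1"'] / m["P1"], m["EF-Ratio"]))
    print(math.isclose(m['"P2"'] / m["P2"], m["EF-Ratio"]))
```

Because n1 (or n2) cancels in "P1"/P1 (or "P2"/P2), the EF-Ratio needs only the two co-occurrence counts, which is why it can be monitored without the single-term queries.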

Experimental Results

Figure 2 shows temporal co-occurrence curves, obtained by monitoring the n12 and "n12" answer sets in the websphere and blogosphere. The curves were obtained by querying Google in FINDALL mode for k12 = hurricane rita and in EXACT mode for "k12" = "hurricane rita".

As expected, FINDALL curves were taller than EXACT curves since results in EXACT mode are a subset of the results obtained in FINDALL mode. Subtracting the areas under the curves gives a crude estimate of the amount of noise in the n12 set; i.e., the number of documents retrieved in FINDALL mode that did not target the "hurricane rita" sequence even though they might contain the two terms.

 Search Results

Figure 2. Google search results for hurricane rita using FINDALL and EXACT modes.

In Figure 2 the broken line indicates the approximate landfall time. Since NOAA reported that Hurricane Rita made landfall a few hours after midnight (at approximately 3:30 a.m. EDT on Sept. 24) and we collected data in the afternoon, we placed the broken line between day 3 (Sept. 23) and day 4 (Sept. 24). We believe that even with all these approximations, some interesting observations can be extracted from the experiment.

The first and most obvious observation is the overall shape of the curves. Websphere curves increased before the hurricane made landfall. The day after landfall FINDALL results jumped from n12 = 21,600,000 to n12 = 58,900,000 and EXACT results jumped from "n12" = 9,760,000 to "n12" = 36,800,000.

The overall trend of the curves was to increase, reaching a local peak around day 11 (Oct. 1) and 12 (Oct. 2). This was consistent with an increasing number of documents targeting the terms and Google indexing and ranking these.

With such increments, a lot of "noisy documents" were ranked. A crude estimate gives the amount of noise:

Day 4 (Sept. 24): 21,600,000 - 9,760,000 = 11,840,000
Day 5 (Sept. 25): 58,900,000 - 36,800,000 = 22,100,000

That is, by landfall the noise in the n12 set actually surpassed the number of documents targeting the exact sequence: 11,840,000 vs 9,760,000. This illustrates how contaminated with noise FINDALL results can be.

This was not an isolated incident. By day 11 and 12 the amount of noise was:

Day 11 (Oct. 1): 89,400,000 - 46,200,000 = 43,200,000
Day 12 (Oct. 2): 88,900,000 - 51,100,000 = 37,800,000
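The crude noise estimates above can be reproduced with a short script. This is only a sketch: the names `REPORTED`, `noise` and `noise_share` are illustrative, and the data are limited to the four websphere days whose counts are quoted in the text.

```python
# (n12, "n12") websphere document counts quoted in the text, keyed by day number.
REPORTED = {
    4: (21_600_000, 9_760_000),    # Sept. 24
    5: (58_900_000, 36_800_000),   # Sept. 25
    11: (89_400_000, 46_200_000),  # Oct. 1
    12: (88_900_000, 51_100_000),  # Oct. 2
}


def noise(n12: int, n12_exact: int) -> int:
    """Documents returned in FINDALL mode that do not match the exact sequence."""
    return n12 - n12_exact


def noise_share(n12: int, n12_exact: int) -> float:
    """Noise expressed as a fraction of the FINDALL answer set."""
    return noise(n12, n12_exact) / n12


if __name__ == "__main__":
    for day, (n12, n12_exact) in sorted(REPORTED.items()):
        print(f"day {day:2d}: noise = {noise(n12, n12_exact):>12,} "
              f"({noise_share(n12, n12_exact):.1%} of n12)")
```

Expressing noise as a share of n12 makes days with very different totals comparable; on day 4 the share exceeds one half, which is the situation described above where noise surpasses the exact-sequence count.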

Thus, for the purpose of collecting web analytics or conducting keyword research studies for the hurricane rita sequence, FINDALL results did not provide a reliable response. This is why we monitored both FINDALL and EXACT results. By day 15 and for the remainder of the experiment the number of documents relevant to hurricane rita decreased.

This "rise-and-fall" behavior was not observed in the blogosphere curves, at least not during the full lifespan of our experiment. What we observed was an increasing trend. Several factors might account for this trend:

Let's explain (a) and (b) first.

As pointed out by Chris Sherman, Associate Editor of Search Engine Watch (7),

"Google defines blogs as sites that use RSS and other structured feeds and update content on a regular basis."

"Although Google Blog search focuses primarily on content published to the blogosphere, it's not a true full-text search across all sources... This is because some publishers only syndicate excerpts of content via RSS. Google's blog search indexes all of the content it finds in feeds, but does not attempt to access and index the full content available on a publisher's web server."

"Google blog search results point primarily to individual blog postings, with a title and snippet from each -strongly resembling Google's web search results. In some cases, links to "related blogs" are presented at the top of search results if a query suggests that the user is looking for a particular blog rather than a specific blog posting."

Thus, the incomplete way Google indexes the blogosphere and the discrete way we sampled its results might account for the observed curves. Still, by days 9 and 10 the curves decelerated, indicating a change of trend. We cannot exclude the possibility that potential burst activities as described by Kleinberg and Kumar (4, 8) were masked by (a) and (b).

As pointed out by Kumar (8), blogs differ structurally from traditional web pages because blogs represent concatenations of messages authored by a single individual. Also, the culture of blogs focuses heavily on local communities formed by a small number of bloggers. Members list one another's blogs and respond to other community members' blogs. These interactions occur during a brief burst of heavy activity: topics arise, become prominent and then fade away.

However, little has been written with regard to how an unexpected in-progress event affects blogosphere results. These events are different from the sudden, single or isolated events discussed in online communities. Unlike developing events, abrupt or surprise events like the 9/11 event don't have a prior history affecting search results.

In contrast, a developing event, such as a hurricane landfall, has a prior history and may lead to other equally evolving events with their own lifespan. Thus, one would expect not just a burst but some sort of topic persistency taking place in the blogosphere by means of word associations and relatedness. This should be reflected in analytical responses extracted from the blogosphere. It might explain why, within the lifespan of our experiment, the blogosphere curves did not fade away but approached a plateau, indicating a discourse persistency: online communities kept discussing events associated with Hurricane Rita.

Containment Results

Answer sets obtained using the default FINDALL mode of search engines are often contaminated with noise. This noise can actually mask important trends that might develop over time. Therefore, monitoring search results for a term sequence using the default FINDALL mode of search engines might be a contraindicated procedure, especially if one wants to extract temporal patterns and assess trends. Rather, containment measures should be monitored.

In Figure 1 we have described the five possible containment measures for a two-term query. Figure 3 shows the results of monitoring these measures during our experiment. Several observations can be made.

Figure 3. Containment measures in the websphere and blogosphere.

First, the EF-Ratio curve provides a distinct response, and the largest incremental one, when compared with the other containment measures. Second, the websphere curves increased up to day 5 (Sept. 25) and then decreased from day 5 to day 14. The same overall trends are observed in the other containment curves. These changes in trend are not evident from Figure 2.

Moreover, the websphere curves shown in Figure 2 increased from day 5 to day 15 because at the same time the fraction of documents returned by Google that did not target the "hurricane rita" exact sequence increased. The amount of noise was actually at its peak (compare with Table 1 and Table 2). This does not mean that the terms were not co-occurring in the noisy documents; it only means that they were not co-occurring according to the query sequence.

In other words, by merely inspecting Figure 2 one may think that in the websphere and between day 5 and day 15 there was an improvement in search result counts relevant to the query hurricane rita; or that around day 16 and 17 search quality deteriorated. Figure 3 shows this not to be the case. Search results quality actually deteriorated after day 5, the day after landfall. This trend continued for the lifespan of our experiment and can be rationalized from the overall shape of the EF-Ratio curve.
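The divergence between raw counts and the EF-Ratio can be checked against the four websphere days whose counts are quoted earlier in the text (the full series in Tables 1 and 2 is not reproduced here). The sketch below uses hypothetical names `COUNTS`, `ef_ratio` and `trend`, and only compares pairs of days; it is not a substitute for the complete curves in Figure 3.

```python
# (n12, "n12") websphere document counts quoted in the text, keyed by day number.
COUNTS = {
    4: (21_600_000, 9_760_000),
    5: (58_900_000, 36_800_000),
    11: (89_400_000, 46_200_000),
    12: (88_900_000, 51_100_000),
}


def ef_ratio(n12: int, n12_exact: int) -> float:
    """EXACT-to-FINDALL ratio: fraction of FINDALL results matching the sequence."""
    return n12_exact / n12


def trend(day_a: int, day_b: int) -> tuple[int, float]:
    """Change in the FINDALL count and in the EF-Ratio between two sampled days."""
    n12_a, exact_a = COUNTS[day_a]
    n12_b, exact_b = COUNTS[day_b]
    return n12_b - n12_a, ef_ratio(n12_b, exact_b) - ef_ratio(n12_a, exact_a)


if __name__ == "__main__":
    for a, b in [(4, 5), (5, 11), (5, 12)]:
        d_count, d_ef = trend(a, b)
        print(f"day {a} -> day {b}: n12 change = {d_count:+,}, EF-Ratio change = {d_ef:+.3f}")
```

Between day 5 and day 11 the FINDALL count grows while the EF-Ratio falls, which is the pattern that inspecting Figure 2 alone would miss.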

Co-Occurrence Results

Often terms co-occurring in a sequence are strongly related since they might modify each other or place each other in a sort of local context. In addition, term relatedness tends to decrease as the distance between terms increases. Thus, the lack of ordering and proximity between terms tends to introduce noise in association responses even when the terms co-occur in a document. This is illustrated in Figure 4.

Association Responses

Figure 4. Association responses in the websphere and blogosphere.

Here two association measures were plotted in ppt, the pairwise co-occurrence index (c-index) and the Salton Index. c-indices are defined as given in reference (9), while Salton indices are defined according to Noyons (10, 11). The quoted Salton Index corresponds to the "n12" set.

Note that websphere curves peak within the noisy 5-to-15 day interval while blogosphere curves level off as in Figures 2 and 3. As mentioned before, this was caused by the fraction of documents returned by Google that did not target the "hurricane rita" exact sequence. A quick comparison between Figures 2, 3 and 4 reveals that neither the n12 (FINDALL) curves nor the co-occurrence curves (c-index and Salton Index curves) were able to discriminate between noise and trends. The EF-Ratio outperformed all these curves. It is evident from the EF-Ratio curve that after day 5 the fraction of documents that target the query sequence decreased; i.e., an increasing fraction of the n12 answer set deteriorated after this day.
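The exact definitions used for the two association measures live in references (9) and (10, 11) and are not restated in this article. The sketch below therefore rests on assumptions: it takes the c-index to be the overlap ratio n12 / (n1 + n2 - n12) and the Salton Index to be the cosine form n12 / sqrt(n1 * n2), and it scales both by 1000 on the assumption that ppt stands for parts per thousand. The single-term counts in the demo are hypothetical placeholders.

```python
import math


def c_index(n12: float, n1: float, n2: float, scale: float = 1000.0) -> float:
    """Assumed c-index: co-occurrence count over the union of both answer sets."""
    return scale * n12 / (n1 + n2 - n12)


def salton_index(n12: float, n1: float, n2: float, scale: float = 1000.0) -> float:
    """Assumed Salton Index: co-occurrence count over the geometric mean of n1 and n2."""
    return scale * n12 / math.sqrt(n1 * n2)


if __name__ == "__main__":
    # n12 and "n12" are the Day 5 websphere counts quoted in the article;
    # n1 and n2 are hypothetical placeholders, not values from Table 1 or 2.
    n12, n12_exact = 58_900_000, 36_800_000
    n1, n2 = 150_000_000, 120_000_000
    print(f"c-index             = {c_index(n12, n1, n2):.1f}")
    print(f"Salton Index        = {salton_index(n12, n1, n2):.1f}")
    print(f'quoted Salton Index = {salton_index(n12_exact, n1, n2):.1f}')
```

Both measures depend on n1 and n2, so changes in the single-term counts move the curves even when the share of exact-sequence documents is unchanged. The EF-Ratio has no such dependence.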

Implications

The procedure herein described was a preliminary attempt at understanding how an evolving event might affect search results in the websphere and blogosphere. In principle, this type of study can be applied to quality control of search results and to the study of other types of events.

From the marketing standpoint, co-monitoring hurricanes and searches over time for other key terms could be used to investigate consumers' behaviors during the evolution of such atmospheric phenomena (e.g., searches in connection with the insurance, real estate or construction industry). Other atmospheric events like tornadoes, floods, or storms can be investigated as well. Moreover, detecting search patterns from virtual communities during the progress of emergencies might be relevant to homeland security programs.

While our experiment was quite preliminary and limited to monitoring document volume over time, search volume can also be monitored in a similar way. This might interest companies that provide search volume services.

Currently some search engines and marketing firms provide this type of service. For instance, Google publishes Google Zeitgeist (12). Overture (acquired by Yahoo!), Wordtracker, Trellian and others provide a variety of search volume services. Monitoring results from these services over time should allow one to extract important word patterns and seasonal trends.

However, when using such services one should be aware of what the reports actually account for. One may want to consider the following:

  1. Are these the result of fusion; i.e., of combining results from dissimilar databases?
  2. Are these the result of mixing; i.e., of counting hits for a term regardless of whether the term was queried alone or as part of a phrase?
  3. Do the reports discriminate between query modes? If so, which fraction accounts for users searching in EXACT or FINDALL mode?
  4. How do the several containment and co-occurrence responses evolve over time in the websphere or blogosphere?
  5. When, why or how do changes in overall trends occur?
  6. During which time interval was the measured response contaminated with a significant amount of noise?

If the reporting service combines search volumes from dissimilar databases or mixes search counts, that will most likely be detrimental to the research. Unless stated in the report, a keyword researcher will not know which fraction of results comes from which database or how partial counts are affecting total counts.

On the other hand, if a keyword reporting service cannot discriminate between query modes, then the reports might account for what users searched, not for how users actually searched. Under such contaminated scenarios, trying to extract word patterns or to correlate searches with users' interests or search behaviors might well compound the noise.

Conclusion

We have presented a preliminary experiment aimed at addressing how an in-progress event might affect search results from commercial search engines and when these deteriorate significantly. The experiment was conducted in the websphere and blogosphere using both FINDALL and EXACT query modes. Several containment and co-occurrence responses were tested, with the EF-Ratio outperforming all of them.

Our results suggest that FINDALL, the default search mode in most search engines, does not provide a reliable analytical response for conducting temporal studies shaped by developing events, although this may be possible in the blogosphere to some extent. The experiment questions the validity of keyword research reports that do not discriminate between query modes or database results.

The EF-Ratio seems to be a better choice for monitoring seasonal and temporal trends and for conducting co-occurrence studies over time. This metric seems to outperform FINDALL and other containment responses and can be used to estimate when an increasing fraction of results deteriorates. Applications to quality control of search results and to the study of other types of events have been proposed.

In the blogosphere, we observed that in-progress events seem to introduce persistency in the discourse of virtual communities. However, more testing is required before ruling out that this is an artifact of our experimental set-up.

Appendix A - Google Trends Search Volume Tool

Added on 06/14/06

This article was first published in November 2005. Back then there was no way of comparing document volumes with search volumes, or any two search volumes, in Google. That is possible now with the release a few days ago of Google Trends. Despite its many limitations, the service provides at least a way to look back in time and compare search volumes in the websphere and newsphere.

Figure A.1 shows the temporal evolution of search volume relevant to the queries discussed in this article. I've added Texas as a reference query since the in-progress event was specific to the Texas area. Note that all these query signals show superimposed peaks at about the same time the document volume signals reached their peaks in Figure 2. Note also that the peak marked "A", about a month before the overlapping peaks, corresponds to a different news story: Hurricane Katrina. More on this later. Keep reading.

Since Google suggests the use of parentheses for multiterm searches, we repeated the queries, obtaining similar results in Figure A.2. Overall, the service provides a crude way of comparing search volume signals in FINDALL and EXACT mode. This can be accomplished by using parentheses and double quotes and by visually inspecting the areas under the peaks or their heights. This is illustrated in Figures A.2 to A.5. Here we tried the previous queries as well as the following alternate queries: car insurance, home insurance, and real estate. Queries about home insurance and car insurance exhibit similar trends. Interestingly, around the same time the search volume signal for Hurricane Rita peaked, all these queries decreased.

 Search Volume - Google Trends

Figure A.1. Google search volume signals using FINDALL and EXACT modes.

Figure A.2. Google search volume signals using parentheses, FINDALL and EXACT modes.

Figure A.3. Google search volume signals using parentheses, FINDALL and EXACT modes for car insurance.

Figure A.4. Google search volume signals using parentheses, FINDALL and EXACT modes for home insurance.

Figure A.5. Google search volume signals using parentheses, FINDALL and EXACT modes for real estate.

Hurricane Katrina Revisited

We were also curious about the Katrina search volume peak that occurred before the Hurricane Rita peaks. Thus, we compared this search volume signal with the search volume signal for gas prices. The goal was to see if such signals could provide additional insight into how an in-progress event affects interrelated search volumes. Figure A.6 is enlightening.

Figure A.6. Google search volume signals using parentheses, FINDALL and EXACT modes for hurricane katrina and gas prices.

Note the overlap between search volume peaks. Developing events can indeed influence interrelated search results and this phenomenon is measurable. The next step of this research is to find a way to convince search engines to provide tools that would allow direct comparison of EF-Ratios for both search volume and document volume.

References
  1. NOAA, NOAA Conducts Aerial Survey of Regions Ravaged by Hurricane Katrina, NOAA News Online (Story 2495), 2005.
  2. NOAA, NOAA Performs Aerial Survey of Regions Affected by Hurricane Rita, NOAA News Online (Story 2511), 2005.
  3. Einat Amitay, David Carmel, Michael Herscovici, Ronny Lempel, and Aya Soffer, Trend Detection Through Temporal Link Analysis, Journal of the American Society for Information Science and Technology, 55(XX):1-12, 2004.
  4. J. Kleinberg, Temporal Dynamics of On-Line Information Streams, Department of Computer Science, Cornell University.
  5. Google, Inc., Google Search.
  6. Google, Inc., Google Blog.
  7. Chris Sherman, Google Launches Industrial Strength Blog Search, September 14, 2005.
  8. R. Kumar, J. Novak, P. Raghavan, A. Tomkins, On the Bursty Evolution of Blogspace, WWW2003, May 20-24, 2003, Budapest, Hungary.
  9. E. Garcia, Keywords Co-Occurrence and Semantic Connectivity, Mi Islita.com, 2004.
  10. Ed Noyons, CWTS WWW Projects, Centre for Science and Technology Studies (CWTS), Leiden University.
  11. J. E. Levesley, Mapping the Field of Human Ageing Research, Appendix A, 2004.
  12. Google, Inc., Google Zeitgeist.
