«Wordnews»
Fine Arts Department, Communication Design,
State Academy of Art and Design, Stuttgart, Germany.
e-mail benjamin.fischer-at-typedown.com.
The project «Wordnews» explores systematic approaches to interpreting and
visually displaying textual information. As a first example, the current news
headlines of several leading international news sources are analysed and
displayed. The output attempts to visualise meaning based on calculations of
how often words occur.
For some time now I have been concerned
with systems that can deal with large amounts of textual information.
Such systems analyse the meaning of a given body of information and represent
it visually. Interesting research and implementations exist, particularly in
library science. Semantic maps play a major role: with the help of these
mappings, highly diverse contents can be grouped into clusters on the basis of
their semantic relations. Automated semantic analyses, however, place
very high demands on programming, thesauri and linguistic databases.
For my project «Wordnews» I chose a
form of textual analysis in which words are given a higher rank
the more often they occur.
Yahoo provides a service
at http://news.yahoo.com that aggregates the news of important
newspapers and agencies. The «Top Stories» contain the ten most recent news
stories from both AP and Reuters, as well as news from the Los Angeles Times and
the Washington Post. These top stories are offered in the so-called RSS format.
RSS stands for «Rich Site Summary» or «Really Simple Syndication».
In contrast to regular web pages, RSS files can be read by other programs
and therefore processed further. This makes it possible to include their
contents in your own interfaces.
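To illustrate how a program can read such a feed, here is a minimal Python sketch. It is not the software used for «Wordnews». The feed content is made-up sample data, and the function name extract_headlines is hypothetical. A real feed would first be downloaded from one of the RSS addresses that Yahoo News lists.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed for illustration only; the headlines and links
# are invented and do not come from any real news source.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample Top Stories</title>
    <item>
      <title>Election results announced</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Storm hits coast</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def extract_headlines(rss_xml):
    """Return the title text of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        item.findtext("title", default="").strip()
        for item in root.iter("item")
    ]


headlines = extract_headlines(SAMPLE_RSS)
print(headlines)
```

Only the item titles are extracted here, since the analysis described in this text works on headlines.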
The current news stories
that I receive from Yahoo (or any other suitable source) in this way are
evaluated by specially developed
software. Words that carry little meaning, such as about, above,
always, but, even, the, etc. (about 300 so-called stopwords), are filtered out
so that they do not distort the analysis. The occurrences of the words are then
analysed: words that appear several times are weighted by their percentage
of the total number of words. A word that occurs more often
gets a higher rank than a word that occurs only once. This happens when
several news sources report on the same subject. Accordingly, such subjects
are more relevant than one that is mentioned by only one source.
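The counting step described above can be sketched roughly as follows in Python. This is an illustration under assumptions, not the original software: the stopword set is a tiny stand-in for the roughly 300 words mentioned, the function name word_shares is hypothetical, and the percentage is taken relative to the words that remain after filtering, which the text does not specify.

```python
import re
from collections import Counter

# Tiny stand-in for the roughly 300 stopwords mentioned in the text.
STOPWORDS = {
    "about", "above", "always", "but", "even", "the",
    "a", "and", "for", "in", "of", "on", "to",
}


def word_shares(headlines, stopwords=STOPWORDS):
    """Map each non-stopword to its percentage of all counted words.

    Assumption: the total is the number of words left after stopword
    filtering; the text does not say which total is used.
    """
    words = []
    for line in headlines:
        tokens = re.findall(r"[a-z']+", line.lower())
        words.extend(t for t in tokens if t not in stopwords)
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: 100.0 * n / total for word, n in counts.items()}


sample = ["Storm hits the coast", "Coast braces for storm"]
shares = word_shares(sample)
for word, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {share:.1f}%")
```

In the invented sample, «storm» and «coast» each appear in both headlines and therefore receive a larger share than words that appear only once.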
The result is presented as one connected body of content. Words that occur more frequently are shown larger than the others, in proportion to how often they occur. This generates a kind of «word-carpet» in which some words are more accentuated than others. Frequently mentioned subjects stand out visually, whereas subjects of less general interest appear much smaller. It is always possible to switch immediately from the visual view to the original view of the actual news headlines.
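One possible way to turn word weights into display sizes is a linear scale between a smallest and a largest font size. The Python sketch below assumes exactly that; the linear mapping, the pixel bounds and the function name font_sizes are illustrative assumptions, since the text only states that more frequent words are shown larger.

```python
def font_sizes(weights, min_px=10, max_px=48):
    """Map each word's weight linearly onto a font size in pixels.

    Assumptions: linear scaling and the 10 to 48 pixel range are
    illustrative choices, not taken from the original project.
    """
    if not weights:
        return {}
    lowest = min(weights.values())
    highest = max(weights.values())
    span = highest - lowest
    sizes = {}
    for word, weight in weights.items():
        # If all weights are equal, every word gets the smallest size.
        t = (weight - lowest) / span if span else 0.0
        sizes[word] = round(min_px + t * (max_px - min_px))
    return sizes


# Invented example weights (percentages of the total word count).
example = {"storm": 30.0, "coast": 20.0, "hits": 10.0}
print(font_sizes(example))
```

With these invented weights the most frequent word receives the largest size, the least frequent the smallest, and the word in between lands halfway.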
I have also implemented other
functions, such as searching the entire pool of available news sources
through the search option of Yahoo News. Clicking on a term in
the «top-stories-analysis» starts a search for
this subject, i.e. all news stories that contain this word are shown. Because
Yahoo News covers so many sources (approx. 7,000 in
35 different languages), this provides a comprehensive survey of the searched
term. When the results of a search are presented in the analysis, the searched
subject is shown correspondingly large, since it must be part of every news
story. The graded representation of the other words creates a new
«word-carpet», this time for a more clearly defined topic. For
even more flexibility I have also added a form in which one can search
for any given word or subject.
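Why the searched term dominates the new «word-carpet» can be shown with a small Python sketch. In the project the search is carried out by Yahoo News; here a local filter over invented headlines stands in for it, and the function names search_stories and count_words are hypothetical. Stopword filtering is left out to keep the example short.

```python
import re
from collections import Counter


def search_stories(term, stories):
    """Return the stories that contain the term as a whole word.

    A local stand-in for the Yahoo News search used in the project.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [story for story in stories if pattern.search(story)]


def count_words(stories):
    """Count how often each lowercase word occurs across the stories."""
    return Counter(
        word
        for story in stories
        for word in re.findall(r"[a-z']+", story.lower())
    )


# Invented headlines for illustration only.
stories = [
    "Storm hits coast",
    "Election results announced",
    "Coast braces for second storm",
]

hits = search_stories("storm", stories)
counts = count_words(hits)
print(hits)
print(counts.most_common(3))
```

Because every matching story contains the searched word at least once, its count can never be lower than the number of stories found.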
Analysis of top stories: http://www.typedown.com/external-01/news/yahoo-wordnews.php
Yahoo News: http://news.yahoo.com
Yahoo News RSS-Feeds: http://news.yahoo.com/rss
Clara Yu, John Cuadrado, Maciej Ceglowski, J. Scott Payne. Patterns in
Unstructured Data: Discovery, Aggregation, and Visualization. A Presentation
to the Andrew W. Mellon Foundation, National Institute for Technology and
Liberal Education (NITLE).
http://javelina.cet.middlebury.edu/lsa/out/cover_page.htm
Foltz, P. W. «Using Latent Semantic Indexing for Information Filtering». In R.
B. Allen (Ed.) Proceedings of the Conference on Office Information Systems,
Cambridge, MA, 40-47. http://www-psych.nmsu.edu/~pfoltz/cois/filtering-cois.html
Dominic Widdows, Scott Cederberg and Beate Dorow. Visualisation Techniques
for Analysing Meaning. Fifth International Conference on Text, Speech and
Dialogue, Brno, Czech Republic, September 2002.
http://infomap.stanford.edu/papers/visualising-meaning.pdf
Edward R. Tufte. The Visual Display of Quantitative Information.
http://www.edwardtufte.com/tufte/books_vdqi
Edward R. Tufte. Envisioning Information.
http://www.edwardtufte.com/tufte/books_ei
Peter Cho, Cybercartography: Mapping in media art
http://www.design.ucla.edu/~petercho/fall03/cybercartography.pdf
Thomas A. Wittenschläger, John A. Fiedler. Current Practices in Perceptual
Mapping. 1997 Sawtooth Software Conference.
http://www.populus.com/techpapers/download/current_practices_perceptual_mapping.pdf
Margarete Payer and Alois Payer. Inhaltliche Strukturierung von
Ressourcen - Eine Einführung in XML.
http://www.payer.de/xml/xml01.htm
Wikipedia – Perceptual Mapping. http://en.wikipedia.org/wiki/Perceptual_mapping
Getting the News Out - RSS and the Semantic Web.
http://www.pcmag.com/article2/0,1759,1265724,00.asp
This is my collection of bookmarks
on these topics (last updated 2004-11-09).
1. Research
National Institute for Technology & Liberal Education (NITLE): http://www.nitle.org/semantic_search.php
Information Mapping, seminar at the Design Department of FH Aachen:
http://seminare.design.fh-aachen.de/imap/
W3C Semantic Web: http://www.w3.org/2001/sw/
Living Semantic Web: http://dmag.upf.es/livingsw/index.html
Semantic Weblogs: http://journal.dajobe.org/journal/2003/07/semblogs/
Information Studies 277 - Information Retrieval Systems: User-Centered Designs:
http://polaris.gseis.ucla.edu/pagre/is277.html
Using Latent Semantic Indexing for Information Filtering:
http://www-psych.nmsu.edu/~pfoltz/cois/filtering-cois.html
The ConceptNet Project: http://web.media.mit.edu/~hugo/conceptnet/
2. Companies
Thinkmap visualization software facilitates communication, learning,
and discovery: http://www.thinkmap.com/
Populus Technical Papers:
Perceptual Mapping: http://www.populus.com/techpapers/map.php
TheBrain Technologies Corporation: http://www.thebrain.com/
The Hive Group - Creators of Honeycomb Technology: http://www.hivegroup.com/
Map Bureau: http://www.mapbureau.com/
Google Newsmap: http://www.marumushi.com/apps/newsmap/
NewsIsFree Top News Map: http://www.newsisfree.com/newsmap/top.php
Stamen «In the News»: http://news.stamen.com/
10x10 / 100 Words and Pictures that Define the Time: http://www.tenbyten.org/10x10.html
Textarc, an alternate way to view a text: http://textarc.org/
TouchGraph GoogleBrowser V1.01:
http://www.touchgraph.com/TGGoogleBrowser.html
Concept Based Information Representation and Retrieval – Infomap: http://infomap.stanford.edu/
Ideagraph, an Idea Development Tool for the Semantic Web: http://www.ideagraph.net/
Netzspannung.org – Semantic Map: http://www.netzspannung.org/about/tools/semantic-map/
Textalyser: http://textalyser.net/
WORDCOUNT - Tracking the Way We Use Language: http://www.wordcount.org
The Mute Map: http://docs.metamute.com/view/Home/TheMuteMap
Hotlinks: http://dev.upian.com/hotlinks/
Memeufacture: http://memeufacture.com/