// Find the index of the end tag
int endIndex = s.indexOf(endTag, startIndex);
// If we don't find the end tag, return an empty String
if (endIndex == -1) return "";
// Return the text in between
return s.substring(startIndex, endIndex);
}
With this technique, we are ready to connect to a web site from within Processing and grab data to
use in our sketches. For example, we could read the HTML source from http://www.nytimes.com and
look for today’s headlines, search http://finance.yahoo.com for stock quotes, count how many times the
word “Flickr” appears on your favorite blog, and so on. HTML, however, is an ugly, scary place with
inconsistently formatted pages that are difficult to reverse engineer and parse effectively. Not to mention
the fact that companies change the source code of web pages rather often, so any example that I might
make while I am writing this paragraph might break by the time you read this paragraph.
For grabbing data from the web, an XML (Extensible Markup Language) feed will prove to be more
reliable and easier to parse. Unlike HTML (which is designed to make content viewable by a human’s
eyes), XML is designed to make content viewable by a computer and facilitate the sharing of data across
different systems. We will get into how XML works more in Section 18.17. For now, let’s examine how we
might grab the weather for any given zip code from Yahoo’s XML weather feed. Information about all of
Yahoo’s XML feeds can be found here: http://developer.yahoo.com/rss/. The weather XML feed is here:
http://xml.weather.yahoo.com/forecastrss?p=10025
One way to grab the data from a weather feed is to use the Processing XML library (which facilitates
reading from an XML document). However, in order to demonstrate String parsing on a lower level,
as an exercise, we will use our loadStrings() scraping techniques and search for bits of information
embedded in the XML source manually. Admittedly, this is somewhat of a silly pursuit since XML is
designed to be parsed without having to resort to this methodology. For comparison, we will look at this
same example using two different XML libraries in Sections 18.7 and 18.8.
Looking in the XML source from the above URL, we can see that the temperature today (which happens
to be August 1, 2007 at the time of this writing) is 88°F (temp="88").
<yweather:condition text="Fair" code="34" temp="88" date="Wed, 01 Aug 2007 3:51 pm EDT"/>
The temperature is variable but the XML format is not, and therefore we can deduce that the start tag for
our search should be:
temp="
and the end tag:
"
(i.e., the first quote after the start tag).
Knowing the start and end tags, we can use giveMeTextBetween() to pull out the temperature.
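As a minimal sketch of this step, the plain Java class below applies giveMeTextBetween() to the sample line of XML. Only the tail of the function appears on this page, so its first half here is an assumed reconstruction (find the start tag, give up if it is missing, move past it). The class name WeatherScrape and the hard-coded xml String are hypothetical stand-ins for the text that loadStrings() would return from the feed.

```java
public class WeatherScrape {

  // Returns the text found between startTag and endTag in s,
  // or an empty String if either tag is missing.
  // The start-tag half is an assumed reconstruction; only the
  // end-tag search and the substring() call appear in the text.
  public static String giveMeTextBetween(String s, String startTag, String endTag) {
    // Find the index of the start tag
    int startIndex = s.indexOf(startTag);
    // If we don't find the start tag, return an empty String
    if (startIndex == -1) return "";
    // Move to the end of the start tag
    startIndex += startTag.length();
    // Find the index of the end tag, searching from the end of the start tag
    int endIndex = s.indexOf(endTag, startIndex);
    // If we don't find the end tag, return an empty String
    if (endIndex == -1) return "";
    // Return the text in between
    return s.substring(startIndex, endIndex);
  }

  public static void main(String[] args) {
    // Hypothetical stand-in for the XML source fetched with loadStrings()
    String xml = "<yweather:condition text=\"Fair\" code=\"34\" temp=\"88\" "
               + "date=\"Wed, 01 Aug 2007 3:51 pm EDT\"/>";

    // Start tag is temp=" and end tag is the first quote after it
    String temp = giveMeTextBetween(xml, "temp=\"", "\"");
    System.out.println("Temperature: " + temp);

    // The result is a String; convert it before doing arithmetic with it
    int degrees = Integer.parseInt(temp);
    System.out.println("One degree warmer: " + (degrees + 1));
  }
}
```

Inside a Processing sketch, the function body could be pasted in as is (without the class wrapper), with println() in place of System.out.println().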
indexOf() can also take a second argument, an integer. That second argument means: Find the first
occurrence of the search String after this specified index. We use it here to ensure that end index
follows start index.
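To illustrate the two-argument form of indexOf(), here is a small sketch in plain Java (a Processing String is a Java String). The class name IndexOfDemo, the helper quoteAfter(), and the sample String are made up for illustration. Note that Java's indexOf() begins its search at the given index itself, which is why the helper first skips past the start tag.

```java
public class IndexOfDemo {

  // Index of the first quote that follows startTag in s,
  // or -1 if the start tag or the quote is not found.
  public static int quoteAfter(String s, String startTag) {
    int startIndex = s.indexOf(startTag);
    if (startIndex == -1) return -1;
    // Skip past the start tag itself, then search from there
    return s.indexOf("\"", startIndex + startTag.length());
  }

  public static void main(String[] args) {
    String s = "code=\"34\" temp=\"88\"";

    // One argument: the first quote anywhere in the String
    System.out.println(s.indexOf("\""));          // prints 5

    // Two arguments: the first quote after the temp=" start tag
    System.out.println(quoteAfter(s, "temp=\"")); // prints 18
  }
}
```

Without the second argument, the search would stop at the quote belonging to code="34" and the end index would land before the start index.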