The web can be modeled as a directed graph. Come up with a graph traversal algorithm. Make the algorithm non-recursive and breadth-first.
Pseudocode for a non-recursive, breadth-first traversal of the web graph may look something like this:
URLlist <- queue for storing URLs.
URLlist.push(initial list of URLs)
while (!URLlist.empty())
{
    URL = URLlist.pop()
    if (seen(URL)) continue;
    setseen(URL)
    fetch URL
    for all the URLs in the page, push them onto the URLlist
}
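As a rough illustration, the pseudocode above could be sketched in Python along the following lines. The names `crawl_bfs`, `fetch_links`, and `toy_web` are hypothetical and not from the original post. To keep the sketch runnable without network access, "fetch URL" is stood in for by a caller-supplied function that returns the links found on a page; a real crawler would download and parse the page there.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl_bfs(seeds: Iterable[str],
              fetch_links: Callable[[str], Iterable[str]]) -> List[str]:
    """Visit URLs breadth-first and return them in the order they were fetched.

    fetch_links is a hypothetical stand-in for fetching a page and
    extracting the URLs it links to.
    """
    queue = deque(seeds)   # FIFO queue of URLs waiting to be fetched
    seen = set()           # URLs that have already been fetched
    order = []             # fetch order, kept only for illustration

    while queue:
        url = queue.popleft()          # take from the front: breadth-first
        if url in seen:
            continue                   # already fetched via another link
        seen.add(url)
        order.append(url)
        for link in fetch_links(url):  # "fetch URL" plus link extraction
            queue.append(link)         # push onto the back of the queue
    return order


# A toy in-memory "web": each page maps to the pages it links to.
toy_web = {
    "a": ["b", "c"],
    "b": ["d", "a"],
    "c": ["d"],
    "d": [],
}

order = crawl_bfs(["a"], lambda url: toy_web.get(url, []))
print(order)  # ['a', 'b', 'c', 'd']
```

Taking URLs from the front of the queue and pushing new ones onto the back is what makes the traversal breadth-first; using a stack instead would give a depth-first order. The `seen` set is what keeps cycles in the graph (such as "b" linking back to "a") from causing an infinite loop.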
https://gengwg.blogspot.com/