A Web Community Chart for Navigating Related Communities

Masashi Toyoda (豊田正史), Kentarou Fukuchi (福地健太郎)

A web community is a set of web pages written by individuals or associations that share an interest in a specific topic. This poster introduces the notion of a Web Community Chart, a graph that connects related communities so that the user can navigate among them. We propose a technique for building the chart automatically from a given set of seed pages, and show the resulting chart of companies, created from thousands of company web pages.

A Part of the Community Chart

[Figure: part of the community chart. Community labels: Software Makers, Computer Device Makers, Computer Makers, (Digital) Camera Makers, Cable Makers.]

Data Set for Experiments

- Archive: 17 million Japanese web pages (90GB) crawled in 1999
  - 17M URLs retrieved by the crawler
  - 21M URLs pointed to by retrieved pages
- Seed Set: a manually maintained URL list of companies and organizations (about 5000 unique URLs)

Applications

- Navigation support: integrate a graphical representation of the chart into web browsers.
- "What's related community" service: when the user gives a seed page, the system returns a community that includes the seed, along with other related communities.

The user can see what kinds of communities, and how many, exist around the current interest.

Related Page Algorithm

Calculates pages related to a given seed page by:

- Building a subgraph of the Web around the seed page
- Extracting good authorities and hubs from the subgraph
  - Authorities: pages pointed to by many good hubs
  - Hubs: pages pointing to many good authorities

auth(n) = Σ hub(m), for all m pointing to n
hub(n) = Σ auth(m), for all m pointed to by n

[Figure: a seed page surrounded by hub and authority pages.]

Building the Community Chart

Classify seeds and connect communities using similarities of their related pages.

[Figure: Communities A, B, and C; seeds URL1 to URL4, each with its own list of related pages (URL1.1 … URL1.10, URL2.1 … URL2.10, URL3.1 … URL3.10, URL4.1 … URL4.10); similarity labels: High Similarity, Low Similarity, No Similarity.]

Result

- About 1800 communities
- Many valuable communities are clearly classified and connected to related ones:
  - Companies related to computers (the figure above)
  - The mass media (TV stations, newspapers, etc.)
  - Companies related to music (CDs, instruments, etc.)
  - Linux communities (user groups, package providers, etc.)
  - ...
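
The auth/hub update rules of the Related Page Algorithm, and the idea of comparing seeds by their related pages, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The poster does not specify how the subgraph around the seed is built, how scores are normalized, how many iterations are run, or which similarity measure is used. The function names (`hits`, `related_pages`, `similarity`), the Euclidean normalization, the fixed iteration count, and the Jaccard overlap are all assumptions made for illustration.

```python
from collections import defaultdict


def hits(edges, iterations=50):
    """Iterate the auth/hub update rules on a directed link graph.

    edges: iterable of (source, target) pairs; source links to target.
    Returns (auth, hub) score dicts. The Euclidean normalization and the
    fixed iteration count are assumptions, not taken from the poster.
    """
    edges = list(edges)
    nodes = {n for edge in edges for n in edge}
    incoming = defaultdict(list)
    outgoing = defaultdict(list)
    for src, dst in edges:
        incoming[dst].append(src)
        outgoing[src].append(dst)

    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # auth(n) = sum of hub(m) for all m pointing to n
        auth = {n: sum(hub[m] for m in incoming[n]) for n in nodes}
        # hub(n) = sum of auth(m) for all m pointed to by n
        hub = {n: sum(auth[m] for m in outgoing[n]) for n in nodes}
        # Rescale so the scores stay bounded across iterations.
        for scores in (auth, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for n in scores:
                scores[n] /= norm
    return auth, hub


def related_pages(edges, seed, k=10):
    """Top-k authorities other than the seed (hypothetical helper).

    Assumes `edges` already describes the subgraph around the seed.
    """
    auth, _ = hits(edges)
    ranked = sorted((n for n in auth if n != seed),
                    key=lambda n: (-auth[n], n))
    return ranked[:k]


def similarity(related_a, related_b):
    """Jaccard overlap of two related-page lists (assumed measure)."""
    a, b = set(related_a), set(related_b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Under these assumptions, two seeds whose related-page lists overlap heavily would be candidates for the same community, a partial overlap would suggest a link between neighboring communities, and an empty overlap would leave them unconnected. The thresholds separating these cases are not given in the poster.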