An Architecture for COnsistent Nomadic Content Access
Motivation: Future access to web-based content is likely to be dominated by two trends: (a) increasing amounts of dynamic, personalized content, and (b) significant growth in "on-the-move" access using various resource-constrained mobile devices. These trends point to a situation where a user has ubiquitous access to content, but they require that content be delivered efficiently to the user irrespective of location, and in the form best suited to the user's end device. Unfortunately, classical caching and transcoding solutions do not work well together, particularly in light of these trends. This necessitates a new caching architecture built from the ground up to handle the problems caused by dynamic content, transcoded versions of objects, and the nomadic nature of users.
Our approach: Our approach is based on two important observations. (1) Despite the dynamically generated and personalized nature of web content, a relatively large amount of such content can in fact be shared at the underlying level. Personalized Yahoo pages illustrate this: objects s1, s2, s3, s4, and s5 are sharable, while p1 and p2 are personalized.
(2) Although people are "on the move" when accessing information and services from mobile devices, most users exhibit a relatively static access pattern, often starting from a set of popular documents and following the links contained therein.
These two observations motivate our novel cache architecture, CONCA (COnsistent Nomadic Content Access), which attempts to support, from the ground up, caching of dynamic personalized content for (mobile) users by (1) reusing the shared portions of dynamic content; (2) exploiting knowledge of user content access preferences to efficiently support transcoding and nomadic access (e.g., by prefetching); and (3) acting as an edge server to move some of the server load to the edge.
The logical organization of each CONCA node consists of two parts: shared and personalized. More details can be found in our overview paper.
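The split between shared and personalized parts can be illustrated with a minimal sketch. This is not the CONCA implementation: the class name `ConcaNodeSketch`, its methods, and the sample object contents are all hypothetical, chosen only to show how shared objects (such as s1 and s2) can be reused across users while personalized objects (such as p1) are kept per user.

```python
from dataclasses import dataclass, field


@dataclass
class ConcaNodeSketch:
    """Toy cache node with a shared part and a personalized part (hypothetical)."""
    shared: dict = field(default_factory=dict)        # object id -> content, reusable by all users
    personalized: dict = field(default_factory=dict)  # (user, object id) -> content

    def put_shared(self, obj_id, content):
        self.shared[obj_id] = content

    def put_personalized(self, user, obj_id, content):
        self.personalized[(user, obj_id)] = content

    def assemble(self, user, layout):
        """Return (cached parts, missing object ids) for a list of object ids."""
        parts, misses = [], []
        for obj_id in layout:
            if (user, obj_id) in self.personalized:
                parts.append(self.personalized[(user, obj_id)])
            elif obj_id in self.shared:
                parts.append(self.shared[obj_id])
            else:
                misses.append(obj_id)  # would have to be fetched from the origin server
        return parts, misses


node = ConcaNodeSketch()
node.put_shared("s1", "<div>news</div>")
node.put_shared("s2", "<div>weather</div>")
node.put_personalized("alice", "p1", "<div>Hello, Alice</div>")

# Alice's page is served entirely from the node; Bob reuses s1 and s2
# and only misses his own personalized object p1.
alice_parts, alice_misses = node.assemble("alice", ["p1", "s1", "s2"])
bob_parts, bob_misses = node.assemble("bob", ["p1", "s1", "s2"])
```

In this sketch a second user benefits from the shared part immediately, which is the reuse effect that observation (1) points to.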
Keyword-based Fragment Detection
Fragment-based caching has been proposed as a promising technique for dynamic Web content delivery and caching. Most existing approaches either assume that fragment-based content is served by the Web server automatically, or look at server-side caching only. There is no method for extracting fragments from existing dynamic Web content, which is of great importance to the success of fragment-based caching. Also, current technologies for supporting dynamic fragments cannot take into account changes in fragment spatiality, a popular technique in dynamic and personalized Web site design. In this project, we intend to address these shortcomings with two proposals. The first, DyCA (Dynamic Content Adapter), is a tool for creating fragment-based content from original dynamic content. The second is an augmentation to the ESI standard that allows it to look up fragment locations in a mapping table attached to the template. This allows fragments to move across the document without the template having to be re-served (ICWE 2004 paper). The DyCA software will be available soon.
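The mapping-table idea can be sketched as follows. This is an illustration under stated assumptions, not the proposed ESI extension itself: the `{{slot:N}}` placeholder syntax, the `render` function, and the fragment names are hypothetical and are not ESI syntax. The point is only that the template stays fixed while the mapping table decides which fragment appears in which position.

```python
import re

# Hypothetical template: positions are numbered slots, not fragment names.
TEMPLATE = "<body>{{slot:0}}<hr>{{slot:1}}</body>"

# Hypothetical cached fragments, keyed by fragment id.
FRAGMENTS = {"news": "<div>news</div>", "stocks": "<div>stocks</div>"}


def render(template, mapping, fragments):
    """Fill each slot with the fragment that the mapping table assigns to it."""
    def fill(match):
        slot = int(match.group(1))
        return fragments[mapping[slot]]
    return re.sub(r"\{\{slot:(\d+)\}\}", fill, template)


# Two users with different layouts share the same template and fragments;
# only the small mapping table differs.
page_a = render(TEMPLATE, {0: "news", 1: "stocks"}, FRAGMENTS)
page_b = render(TEMPLATE, {0: "stocks", 1: "news"}, FRAGMENTS)
```

When a fragment moves to a different position, only the mapping table changes, so a cached copy of the template remains valid.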
Performance Evaluation of Peer-to-Peer Web Caching: Hype or Reality?
We systematically examine the design space of peer-to-peer Web caching systems in three orthogonal dimensions: the caching algorithm, the document lookup algorithm, and the peer granularity. Based on the observation that the traditional URL-based caching algorithm suffers considerably from the decrease in cacheability caused by the fast growth of dynamic and personalized Web content, we propose using a content-based caching algorithm. In addition to comparing two existing document lookup algorithms, we propose a simple and effective geographic-based document lookup algorithm. Four peer granularities (host level, organization level, building level, and centralized) are studied and evaluated using a seven-day Web trace collected at a medium-size educational institution. Using trace-driven simulation, we compare and evaluate all design choices in terms of two performance metrics: hit ratio and latency reduction. Finally, we discuss several implications derived from the analysis (ICPADS 2004 paper). Based on this work, we have proposed a new peer-to-peer Web sharing system called Tuxedo.
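The difference between URL-based and content-based caching can be shown with a toy trace-driven simulation. This is a sketch under assumptions, not the simulator used in the study: the trace, the function names, and the choice of SHA-1 as the content digest are all hypothetical, and the sketch ignores how a client would learn an object's digest before fetching its body.

```python
import hashlib


def hit_ratio(trace, key_fn):
    """Replay (url, body) requests against an unbounded cache keyed by key_fn."""
    cache = set()
    hits = 0
    for url, body in trace:
        key = key_fn(url, body)
        if key in cache:
            hits += 1
        else:
            cache.add(key)
    return hits / len(trace)


def url_key(url, body):
    # URL-based caching: identical content under different URLs never matches.
    return url


def content_key(url, body):
    # Content-based caching: objects are identified by a digest of their bytes.
    return hashlib.sha1(body).hexdigest()


# Hypothetical trace: personalized URLs that return the same underlying content.
TRACE = [
    ("/home?user=alice", b"<div>news</div>"),
    ("/home?user=bob", b"<div>news</div>"),
    ("/home?user=carol", b"<div>news</div>"),
    ("/home?user=alice", b"<div>news</div>"),
]

url_ratio = hit_ratio(TRACE, url_key)          # only the repeated URL hits
content_ratio = hit_ratio(TRACE, content_key)  # every request after the first hits
```

On this toy trace the URL-keyed cache hits once in four requests, while the digest-keyed cache hits three times, because the personalized URLs hide content that is in fact identical.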
Modeling Object Characteristics of Dynamic Web Content
By analyzing the content of six web sites that serve dynamic content over a two-week period, we derive a set of models that characterize this content in terms of a small number of independent parameters. Our studies find that the sizes and freshness times of component objects can be captured very well using Exponential and Weibull distributions, respectively, and demonstrate significant content reusability across both the temporal and spatial dimensions.
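As a rough illustration of what such a model looks like in use, the sketch below draws synthetic object sizes from an Exponential distribution and freshness times from a Weibull distribution using Python's standard `random` module. The parameter values (`MEAN_SIZE_BYTES`, `FRESHNESS_SCALE_S`, `FRESHNESS_SHAPE`) are placeholders chosen for the example, not the fitted values from the study.

```python
import random

# Placeholder parameters (assumptions for illustration, not measured values).
MEAN_SIZE_BYTES = 2048.0    # mean of the Exponential size distribution
FRESHNESS_SCALE_S = 600.0   # Weibull scale parameter, in seconds
FRESHNESS_SHAPE = 0.8       # Weibull shape parameter


def sample_object(rng):
    """Draw one synthetic (size in bytes, freshness time in seconds) pair."""
    size = rng.expovariate(1.0 / MEAN_SIZE_BYTES)  # rate = 1 / mean
    freshness = rng.weibullvariate(FRESHNESS_SCALE_S, FRESHNESS_SHAPE)
    return size, freshness


rng = random.Random(42)  # fixed seed so the run is repeatable
objects = [sample_object(rng) for _ in range(1000)]
mean_size = sum(size for size, _ in objects) / len(objects)
```

With enough samples, `mean_size` should land near `MEAN_SIZE_BYTES`, which gives a quick sanity check that the generator matches the intended size model.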
Dynamic Content Emulator (DYCE)
Based on our recent work on modeling the object characteristics of dynamic web content, we have designed and implemented a Java-based dynamic content emulator (DYCE, pronounced "dice"), which can serve requests for document templates and for objects (see the DYCE paper). DYCE is now publicly available. We are currently extending its functionality.
Service Execution Environment for Edge Computing
We are also investigating extending the CONCA node to support value-added services and edge services. ICAP and/or SOAP are used to cooperate with Internet services such as image filtering and language translation. A service execution container, which will be integrated into the CONCA node, is under development.
Workload Characterization of Personalized Web Site
In cooperation with the NYUHome team, we recently analyzed the characteristics of NYUHome, a typical personalized web site, and studied the implications of these characteristics for dynamic web caching. Two weeks of NYUHome traces will be publicly available soon. If you are eager to experiment with the traces, please send email to firstname.lastname@example.org.
CONCA Prototype with ESI Support
A CONCA prototype is being implemented, using Edge Side Includes (ESI) as the language to describe document templates and part of the personal assistants.
Dr. Weisong Shi
Dr. Vijay Karamcheti (NYU)
Eli Collins (NYU)
Vidula Pant (moved to CSFB)
Technical Reports and Publications
Yonggen Mao and Weisong Shi, Performance Evaluation of Peer-to-Peer Web Caching Systems, Technical Report MIST-TR-2004-013, April 2004, submitted.
Daniel Brodie, Amrish Gupta, and Weisong Shi, Accelerating Dynamic Web Content Delivery using Keyword-based Fragment Detection, in Proceedings of the 4th International Conference on Web Engineering (ICWE 2004), July 28-30, 2004, Munich, Germany. Best Paper Award. (Acceptance rate: 12% for regular papers.)
Zhaoming Zhu, Yonggen Mao, and Weisong Shi, Workload Characterization of Uncacheable HTTP Traffic, in Proceedings of the 4th International Conference on Web Engineering (ICWE 2004), July 28-30, 2004, Munich, Germany. (A full version is available as Technical Report MIST-TR-03-003.) (Acceptance rate: 32% for short papers.)
Yonggen Mao, Zhaoming Zhu, and Weisong Shi, Peer-to-Peer Web Caching: Hype or Reality?, in Proceedings of the 10th IEEE International Conference on Parallel and Distributed Systems, July 7-9, 2004, California. (Acceptance rate: 30%.)
Daniel Brodie, Amrish Gupta, and Weisong Shi, Keyword-based Fragment Detection for Dynamic Web Content Delivery, Proceedings of 2004 World Wide Web Conference (poster), May 17-22, 2004, New York.
Jayashree Ravi, Weisong Shi, and Chengzhong Xu, PACE: Prefetching and Filtering of Personalized Emails at the Network Edges, Technical Report CS-MIST-TR-2003-005, May 2003, submitted.
Vikrant Mastoli, Valmik Desai and Weisong Shi, SEE: A Service Execution Environment for Edge Services, in Proceedings of the Third IEEE Workshop on Internet Applications (WIAPP'03), San Jose, CA, June 23-24, 2003. (A full version is available as Technical Report CS-MIST-TR-2003-002.)
Weisong Shi, Kandarp Shah, Yonggen Mao and Vipin Chaudhary, Tuxedo: A Peer-to-Peer Caching System, in Proceedings of 2003 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'03), June 23-26, 2003, Las Vegas, Nevada.
Weisong Shi, Eli Collins and Vijay Karamcheti, Modeling Object Characteristics of Dynamic Web Content, Journal of Parallel and Distributed Computing (JPDC) special issue on scalable Internet services and architecture, Vol. 63, No. 10, pages 963-980, Oct. 2003.
Weisong Shi, Eli Collins and Vijay Karamcheti, Modeling Object Characteristics of Dynamic Web Content, in Proceedings of the IEEE Globecom 2002 conference, Taipei, China, November 17-21, 2002. (The full version is available as NYU computer science department technical report TR2001-822.)
Weisong Shi, Randy Wright, Eli Collins and Vijay Karamcheti, Workload Characterization of a Personalized Web Site - and Its Implications for Dynamic Content Caching, in Proceedings of the 7th International Conference on Web Content Caching and Distribution (WCW'02), Boulder, Colorado, August 14-16, 2002.
Weisong Shi, Eli Collins and Vijay Karamcheti, DYCE: Model-based Emulation of Dynamic Web Content, in Poster Proceedings of the 11th International World Wide Web Conference (WWW 2002), Hawaii, May 7-11, 2002. (The full version of this paper is available here.)
Weisong Shi and Vijay Karamcheti, CONCA: An Architecture for Consistent Nomadic Content Access, Workshop on Caching, Coherence, and Consistency (WC3 '01), in conjunction with ACM ICS'2001, Sorrento, Italy, June 17, 2001.