An Architecture for COnsistent Nomadic Content Access



Motivation: Future access to web-based content is likely to be dominated by two trends: (a) increasing amounts of dynamic, personalized content, and (b) significant growth in "on-the-move" access using various resource-constrained mobile devices. These trends point to a situation where a user has ubiquitous access to content, but they require that content be delivered efficiently to the user irrespective of location, and in the form best suited to the user's end device. Unfortunately, classical caching and transcoding solutions do not work well together, particularly in light of these trends. This necessitates a new caching architecture, built from the ground up, to handle the problems caused by dynamic content, transcoded versions of objects, and the nomadic nature of users.

Our approach: We build on two important observations. (1) Despite the dynamically generated and personalized nature of web content, a relatively large amount of such content can in fact be shared at the underlying level, as shown in the following personalized Yahoo pages, where s1, s2, s3, s4, and s5 are sharable objects and p1 and p2 are personalized objects.

(2) Although people are "on the move" when accessing information and services from mobile devices, most users exhibit a relatively static access pattern, often starting from a set of popular documents and following the links contained therein.

These two observations motivate our novel cache architecture, CONCA (COnsistent Nomadic Content Access), which attempts to support, from the ground up, caching of dynamic personalized content for (mobile) users by (1) reusing the shared portions of dynamic content; (2) exploiting knowledge of user content access preferences to efficiently support transcoding and nomadic access (e.g., by prefetching); and (3) acting as an edge server that absorbs some of the server load at the edge.

The logical organization of each CONCA node is shown as follows; it consists of two parts: shared and personalized. More details can be found in our overview paper.
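The split between the shared and personalized parts can be illustrated with a minimal sketch. This is not the CONCA implementation; the class name `ConcaNode`, its methods, and the representation of a template as a list of object ids are all assumptions made for illustration. It shows how two users requesting the same page can reuse the shared objects (s1, s2, s3) while personalized objects (p1) are kept per user.

```python
from dataclasses import dataclass, field


@dataclass
class ConcaNode:
    """Toy cache node (hypothetical): one shared part, one personalized part."""
    shared: dict = field(default_factory=dict)        # object id -> content
    personalized: dict = field(default_factory=dict)  # (user, object id) -> content

    def put_shared(self, obj_id, content):
        self.shared[obj_id] = content

    def put_personalized(self, user, obj_id, content):
        self.personalized[(user, obj_id)] = content

    def assemble(self, user, template):
        """Return (page, misses) for a template given as a list of object ids."""
        parts, misses = [], []
        for obj_id in template:
            if obj_id in self.shared:
                parts.append(self.shared[obj_id])
            elif (user, obj_id) in self.personalized:
                parts.append(self.personalized[(user, obj_id)])
            else:
                misses.append(obj_id)  # would have to be fetched from the origin
        return "".join(parts), misses


node = ConcaNode()
for sid in ("s1", "s2", "s3"):
    node.put_shared(sid, f"<{sid}>")
node.put_personalized("alice", "p1", "<p1:alice>")

template = ["s1", "p1", "s2", "s3"]
page_a, miss_a = node.assemble("alice", template)
page_b, miss_b = node.assemble("bob", template)
```

In this sketch, the request from "bob" still finds every shared object in the cache and misses only on his own personalized object p1.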


Keyword-based Fragment Detection 

Fragment-based caching has been proposed as a promising technique for dynamic Web content delivery and caching. Most existing approaches either assume that fragment-based content is served by the Web server automatically, or consider server-side caching only. There is no method for extracting fragments from existing dynamic Web content, which is of great importance to the success of fragment-based caching. In addition, current technologies for supporting dynamic fragments cannot take into account changes in fragment spatiality, i.e., fragments changing position within a page, which is a popular technique in dynamic and personalized Web site design. In this project, we intend to address these shortcomings with two proposals. The first, DyCA, a Dynamic Content Adapter, is a tool for creating fragment-based content from original dynamic content. The second is an augmentation to the ESI standard that allows it to look up fragment locations in a mapping table attached to the template. This allows fragments to move across the document without needing to re-serve the template (ICWE 2004 paper). The DyCA software will be available soon.
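The mapping-table idea can be sketched as follows. This is only an illustration under stated assumptions: it does not use real ESI syntax, and the slot names, the `assemble` function, and the fragment ids are hypothetical. The point it shows is that the cached template stays fixed while a small mapping table decides which fragment fills which position.

```python
def assemble(template, mapping, fragments):
    """Fill each named slot in `template` with the fragment the mapping assigns to it."""
    filled = {slot: fragments[frag_id] for slot, frag_id in mapping.items()}
    return template.format(**filled)


# The template is cached once; only its slots are positional.
TEMPLATE = "<body>[{top}][{middle}][{bottom}]</body>"

# Fragment store, keyed by hypothetical fragment ids.
FRAGMENTS = {"news": "News", "weather": "Weather", "stocks": "Stocks"}

# Two mapping tables for the same template: the fragments change position.
mapping_v1 = {"top": "news", "middle": "weather", "bottom": "stocks"}
mapping_v2 = {"top": "stocks", "middle": "news", "bottom": "weather"}

page_v1 = assemble(TEMPLATE, mapping_v1, FRAGMENTS)
page_v2 = assemble(TEMPLATE, mapping_v2, FRAGMENTS)
```

Switching from `mapping_v1` to `mapping_v2` rearranges the page while the template and the cached fragments are reused unchanged.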

Performance Evaluation of Peer-to-Peer Web Caching: Hype or Reality?

We systematically examine the design space of peer-to-peer Web caching systems in three orthogonal dimensions: the caching algorithm, the document lookup algorithm, and the peer granularity. Because the traditional URL-based caching algorithm suffers considerably from the decrease in cacheability caused by the fast growth of dynamic and personalized Web content, we propose using a content-based caching algorithm. In addition to comparing two existing document lookup algorithms, we propose a simple and effective geographic-based document lookup algorithm. Four different peer granularities (host level, organization level, building level, and centralized) are studied and evaluated using a seven-day Web trace collected at a medium-size educational institution. Using trace-driven simulation, we compare and evaluate all design choices in terms of two performance metrics: hit ratio and latency reduction. Finally, we discuss several implications derived from the analysis (ICPADS 2004 paper). Based on this work, we have proposed a new peer-to-peer Web sharing system called Tuxedo.
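The difference between URL-based and content-based caching can be seen in a small trace-driven sketch. It is not the simulator used in the paper. It assumes an unbounded cache, a toy trace of (URL, body) pairs, and a SHA-1 digest of the body as the content identity; the function names are hypothetical.

```python
import hashlib


def hit_ratio(trace, key):
    """Replay `trace` against an unbounded cache indexed by `key(url, body)`."""
    seen = set()
    hits = 0
    for url, body in trace:
        k = key(url, body)
        if k in seen:
            hits += 1
        else:
            seen.add(k)
    return hits / len(trace)


def url_key(url, body):
    # Traditional caching: an object is identified by its URL.
    return url


def content_key(url, body):
    # Content-based caching: an object is identified by a digest of its bytes.
    return hashlib.sha1(body).hexdigest()


# Toy trace: two personalized URLs return the same underlying content.
trace = [
    ("/home?user=a", b"banner"),
    ("/home?user=b", b"banner"),
    ("/home?user=a", b"banner"),
    ("/news", b"story"),
]

url_ratio = hit_ratio(trace, url_key)
content_ratio = hit_ratio(trace, content_key)
```

On this toy trace the URL-based cache hits only on the repeated URL, while the content-based cache also hits when a different URL carries identical bytes.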

Modeling Object Characteristics of Dynamic Web Content 

By analyzing the content of six web sites that serve dynamic content over a two-week period, we derive a set of models that characterize this content in terms of a small number of independent parameters. Our studies find that the sizes and freshness times of component objects can be captured very well using Exponential and Weibull distributions, respectively, and demonstrate significant content reusability across both the temporal and spatial dimensions.
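As a rough illustration of how such a model could drive a synthetic workload, the sketch below draws object sizes from an Exponential distribution and freshness times from a Weibull distribution using Python's standard `random` module. All parameter values (mean size, Weibull scale and shape) are placeholders chosen for the example, not the values fitted in the study.

```python
import random

# Placeholder parameters (assumptions for illustration, not fitted values).
MEAN_SIZE_BYTES = 2048.0      # mean of the Exponential size distribution
FRESHNESS_SCALE_SEC = 600.0   # Weibull scale parameter
FRESHNESS_SHAPE = 0.8         # Weibull shape parameter

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def sample_object():
    """Draw one synthetic object as a (size in bytes, freshness time in seconds) pair."""
    size = rng.expovariate(1.0 / MEAN_SIZE_BYTES)
    freshness = rng.weibullvariate(FRESHNESS_SCALE_SEC, FRESHNESS_SHAPE)
    return size, freshness


objects = [sample_object() for _ in range(5000)]
mean_size = sum(size for size, _ in objects) / len(objects)
```

With enough samples, `mean_size` should come out close to `MEAN_SIZE_BYTES`, which is a quick sanity check on the generator.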

Dynamic Content Emulator (DYCE)

Based on our recent work on modeling object characteristics of dynamic web content, we have designed and implemented a Java-based dynamic content emulator (DYCE, pronounced "dice"), which can serve requests for document templates and for objects (see the DYCE paper). DYCE is publicly available now. We are currently extending its functionality.

Service Execution Environment for Edge Computing 

We are also investigating extending the CONCA node to support value-added services and edge services. ICAP and/or SOAP are used to cooperate with Internet services such as image filtering and language translation. A service execution container, which will be integrated into the CONCA node, is under development.

Workload Characterization of a Personalized Web Site

In cooperation with the NYUHome team, we recently analyzed the characteristics of NYUHome, a typical personalized web site, and studied the implications of these characteristics for dynamic web caching. A two-week NYUHome trace will be publicly available soon. If you are eager to work with the trace, please send email to

CONCA Prototype with ESI Support

A CONCA prototype is being implemented, using Edge Side Includes (ESI) as the language for describing document templates and part of the personal assistants.


People:

    Dr. Weisong Shi

    Vikrant Mastoli

    Jayashree Ravi

    Yonggen Mao

    Amrish Gupta

    Daniel Brodie

    Zhaoming Zhu

Former members:

    Dr. Vijay Karamcheti (NYU)

    Eli Collins (NYU)

    Vidula Pant  (moved to CSFB)

Technical Reports and Publications