HOURS: HarmOnic and Universal Resource Sharing
Foundations for
Resource Sharing in Computing Communities
Motivation
Shared cyberinfrastructure (CI), the federated sharing of geographically distributed computing resources under coordinated control, is widely regarded as a promising platform for solving large-scale problems in science and engineering (e.g., NSF's TeraGrid and DoE's Open Science Grid), for building the next generation of distributed enterprise applications (e.g., the Open Grid Services Architecture, OGSA), and for sharing resources in our daily lives (e.g., KaZaA and BitTorrent). This trend is a natural combination of “technology push” and “application pull,” and, if successfully deployed, could lead to the emergence of a new service-oriented computing industry. On the one hand, the low cost of COTS (commodity off-the-shelf) hardware makes it affordable to build a high-end cluster with reasonable performance. On the other hand, the increasingly multidisciplinary nature of most research fields demands collaboration among multiple computing sites.
We see six challenges for resource sharing in such an open environment:
- Heterogeneity. Resources are heterogeneous, and there is no formal metric for trading one kind of resource for another, which makes sharing multiple resource types difficult;
- Untrustedness. Enforcing a cooperative, adaptive sharing environment that resists malicious behavior on top of an untrusted and private P2P community is a real challenge;
- Selfishness. Selfish peers can undermine cooperative resource sharing through threats such as cheating and boasting. Enforcing a fair resource sharing framework that limits the negative effect of selfish peers is one of the goals of our approach;
- Autonomy and Cooperation. Peers usually belong to different administrative domains, which may have different local policies. Integrating these local policies with community-wide resource sharing effectively and efficiently is a challenge;
- Incentives. Free riders make up a considerable share of the P2P community. Attracting peers to contribute to the community is an old but still open problem;
- Adaptiveness. The inherent dynamics of computing communities, e.g., peers joining or leaving the community or adding or removing computing power, necessitate support for adaptiveness across all layers.
In this project, we intend to address all of these problems.
This project is funded by the U.S. NSF (CCF-0643521).
Our Approach
Choosing good partners and avoiding bad ones is the first, and a very important, step in resource sharing. Our approach to this goal is to use knowledge of the social network to build a trust inference model. With the help of the trust model, every peer can build a neighbor set of high-quality peers, which improves the reliability and stability of resource sharing. However, a trust model alone cannot provide additional incentives, cooperativeness, or efficiency. We therefore borrow an idea from real-world economics and build a currency-based model to supply what the trust model lacks. Currency acts as a bridge between all trading units: people can use it to buy the merchandise they need, or save it for future use, so currency circulation is a lightweight form of merchandise exchange. This matches P2P systems very well. An overview of our approach is shown in the following figure.

In the HOURS project, we propose an approach that consists of two models: M-CUBE, a Multiple CUrrency Based Economic model, as the decentralized trading scheme, and aPET, an adaptive PErsonalized Trust model, which provides the trustworthiness of peers in support of M-CUBE. The M-CUBE model provides a general and flexible substrate for most of the high-level resource management services required by P2P computing, such as resource co-allocation, quality of service (QoS) control, advance reservation, and scheduling algorithms. aPET builds on our previous work on PET, which derives trustworthiness from reputation evaluation and risk evaluation. M-CUBE treats the trustworthiness value provided by PET as its view of a peer. The unique feature of our approach is the seamless integration of the trustworthiness and dependability of peers into resource trading.
Projects
adaptive PET
Building good cooperation in P2P resource sharing is a fundamental and challenging research topic because of peer anonymity, peer independence, the high dynamics of peer behaviors and network connections, and the absence of a perfect security mechanism in P2P systems. We propose PET, a personalized trust model that helps establish good cooperation, especially in the context of economic-based solutions for P2P resource sharing. The trust model consists of two parts: reputation evaluation and risk evaluation. Reputation is the accumulated assessment of long-term behavior, while risk evaluation is the opinion of short-term behavior. Two kinds of knowledge are used to derive the reputation: interaction-derived experience (local knowledge) and recommendations (the knowledge of other peers). The choice of weights for these two parts is environment specific and reflects a trade-off between reliability and efficiency. In P2P systems, we suggest putting more weight on the reputation, given that high dynamics are a defining characteristic of this kind of system. The risk part is employed to deal with sudden deterioration in a peer's behavior, which distinguishes PET from trust models based on reputation only. Risk evaluation makes the trust model more sensitive; how to balance this sensitivity against correctness is part of our ongoing research. This work is the first in this field to model risk as the opinion of short-term trustworthiness and to combine it with traditional reputation evaluation to derive trustworthiness. Figure-I describes the original PET model. We are currently working on an adaptive PET model that captures the dynamics of the system.

Figure-I: Derivation of Trustworthiness Value
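The two-part derivation described above can be illustrated with a minimal Python sketch. It assumes a simple linear combination of the parts. The function names, the weights `w_experience` and `w_reputation`, the windowed failure ratio used for risk, and the `1 - risk` term are all illustrative assumptions, not the published PET formulas.

```python
from typing import Sequence


def reputation(experience: float, recommendation: float,
               w_experience: float = 0.7) -> float:
    """Long-term view: blend local experience with others' recommendations.

    All inputs are scores in [0, 1]; w_experience is an assumed weight.
    """
    return w_experience * experience + (1.0 - w_experience) * recommendation


def risk(recent_outcomes: Sequence[bool]) -> float:
    """Short-term view: fraction of failed interactions in a recent window."""
    if not recent_outcomes:
        return 0.0  # no recent evidence, so no extra risk is assumed
    failures = sum(1 for ok in recent_outcomes if not ok)
    return failures / len(recent_outcomes)


def trustworthiness(rep: float, rsk: float, w_reputation: float = 0.6) -> float:
    """Combine reputation and risk; a higher risk lowers the result."""
    return w_reputation * rep + (1.0 - w_reputation) * (1.0 - rsk)


# A peer with a good long-term record that failed 3 of its last 4 interactions.
rep = reputation(experience=0.9, recommendation=0.8)
rsk = risk([True, False, False, False])
print(round(rep, 3), round(rsk, 3), round(trustworthiness(rep, rsk), 3))
```

In this toy setting the reputation alone stays at about 0.87, while the combined value drops to about 0.62, which illustrates how a short-term risk term can make the model react to a sudden change in behavior.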
M-CUBE
M-CUBE, a multiple-currency based economic model, is a self-policing, distributed approach built on top of the PET model that lays a foundation for heterogeneous resource sharing in an untrusted P2P computing environment. By combining trust management with the merits of economic institutions, M-CUBE provides a novel self-policing and quality-aware framework for sharing heterogeneous resources, and a flexible, universal infrastructure for building high-level resource management services. M-CUBE is built on a currency-based mechanism; its distinguishing feature is that each peer has its own currency. There are four major modules in M-CUBE: the Price Regulator decides the price of resources; the Ratio Regulator determines the currency exchange ratio based on the trustworthiness value provided by the PET model; the Service Discovery module is in charge of discovering the available resources provided by remote peers; and the Currency Exchange module enables peers to bargain until they agree on a currency exchange, and then carries out the exchange.
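To make the interplay between per-peer currencies, prices, and trust-based exchange ratios more concrete, here is a minimal Python sketch. It covers only a toy Price Regulator and Ratio Regulator. The `Peer` class, the rule that the exchange ratio equals the trust value (with a small floor), and all numbers are assumptions made for illustration; they are not the rules M-CUBE actually uses, and bargaining and service discovery are left out.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    # Price set by this peer's (hypothetical) Price Regulator,
    # expressed in the peer's own currency per CPU hour.
    price_per_cpu_hour: float


def exchange_ratio(trust_in_buyer: float, floor: float = 0.05) -> float:
    """Assumed Ratio Regulator rule.

    Returns how many units of the provider's currency one unit of the
    buyer's currency is worth. The more the provider trusts the buyer,
    the more the buyer's currency is worth.
    """
    if not 0.0 <= trust_in_buyer <= 1.0:
        raise ValueError("trust must be in [0, 1]")
    return max(trust_in_buyer, floor)


def cost_in_buyer_currency(provider: Peer, cpu_hours: float,
                           trust_in_buyer: float) -> float:
    """Price of a request, converted into the buyer's own currency."""
    price = provider.price_per_cpu_hour * cpu_hours  # provider's currency
    return price / exchange_ratio(trust_in_buyer)


provider = Peer("A", price_per_cpu_hour=2.0)
# The same 10 CPU hours cost a trusted buyer less of its own currency.
print(cost_in_buyer_currency(provider, 10, trust_in_buyer=0.8))
print(cost_in_buyer_currency(provider, 10, trust_in_buyer=0.2))
```

Under these assumptions a poorly trusted peer has to spend more of its own currency for the same resource, which is one way a trust value could feed into trading.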
- Ratings Analysis: Ratings (also known as recommendations) from family members, friends, and people with a high reputation are treated as reliable information sources; they help people judge the quality of other people or services and save time. Inspired by human social networks, researchers have shown great interest in employing ratings in computing systems. However, whether ratings have the same positive effect in the computing world is still an open problem. In this subproject, we intend to answer the following questions: Are ratings always helpful? How should the factors related to ratings be configured to improve system performance? How are ratings affected by other factors, such as the quality of raters, the scale of the system, and the dissemination mode? Are complicated aggregation algorithms better than simple ones? We found some very interesting results, some of which were published in CollaborateCom 2005. The simulator used to compare the different rating algorithms is available at Ratings-Simulator.
- SWAP: Data Management for Data-Intensive Scientific Applications. The amount of scientific data generated by simulations or collected from large-scale experiments has reached levels (e.g., petabytes) that cannot be stored in a researcher's local computer center. In several applications, such as high energy physics, the data is stored on back-end tapes and fetched on demand by hundreds to thousands of scientists all over the world. Such data are vital to large scientific collaborations dispersed over wide-area networks, connected via either dedicated or commercial networks. Retrieving data from tape for analysis, rather than computing resources, is usually the main bottleneck. In this project, we are investigating an efficient distributed data management scheme that intelligently swaps data between archived tapes and storage disks contributed by the computer centers. The goal is to maximize resource utilization and throughput and to minimize the makespan of scientific applications.
- Workflow scheduling. Scheduling is the key to the performance of grid workflow applications, in which a collection of dependent jobs runs in a heterogeneous and dynamic environment. Previous efforts in this area propose either a static scheduling strategy, which maps jobs to resources before execution in order to optimize the performance of the entire workflow, or a dynamic alternative, which schedules each job only when it is ready to execute, without considering the workflow as a whole. While a sizable body of work supports the claim that static scheduling performs better for workflow applications than dynamic scheduling, it has been debated whether this advantage can really be realized, given that the grid environment changes constantly and that the majority of real-world grid workflow systems implement a dynamic strategy only. In this project, we investigate how to preserve and exploit the intrinsic benefits of static scheduling strategies in a grid environment, and we propose a novel adaptive rescheduling concept, which allows the workflow planner to work collaboratively with the run-time executor and to reschedule proactively when the grid environment changes significantly. Through adaptive rescheduling, a planner can better adapt to grid dynamics, such as changes in resource availability and performance fluctuations. A HEFT-based adaptive rescheduling algorithm is presented, evaluated, and compared with traditional static and dynamic strategies. We are currently investigating multiple-DAG scheduling in a failure-prone environment.
- Peer-to-Peer Web Server Sharing. Peer-to-Peer Web server sharing is a case study of the HOURS project. The objective of this subproject is to improve the reliability and availability of cooperative Web servers. The basic idea is that a group of peers pool their resources to help each other during an individual peer's peak loads and/or system failures. The arrangement works because not all companies that form the peer-to-peer network will have peak loads on their Web sites at the same time. In fact, if the companies are geographically dispersed across many time zones, the system is even more effective.
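The time-zone argument made for Peer-to-Peer Web server sharing above can be checked with a small arithmetic sketch in Python. The load profile (a flat base load with a three-hour peak), the numbers, and the function names are hypothetical and chosen only to illustrate the reasoning; they are not measurements from the project.

```python
from typing import List, Sequence


def daily_load(peak_hour: int, base: int = 20, peak: int = 100,
               half_width: int = 1) -> List[int]:
    """Toy 24-hour load profile: `peak` near peak_hour, `base` otherwise."""
    loads = []
    for hour in range(24):
        distance = abs(hour - peak_hour)
        distance = min(distance, 24 - distance)  # wrap around midnight
        loads.append(peak if distance <= half_width else base)
    return loads


def standalone_capacity(peak_hours: Sequence[int]) -> int:
    """Capacity needed if every site provisions for its own peak."""
    return sum(max(daily_load(h)) for h in peak_hours)


def pooled_capacity(peak_hours: Sequence[int]) -> int:
    """Capacity needed if the sites share one pool of servers."""
    profiles = [daily_load(h) for h in peak_hours]
    return max(sum(hourly) for hourly in zip(*profiles))


# Three sites whose peaks (in UTC hours) are eight hours apart,
# compared with three sites that all peak at the same hour.
print(standalone_capacity([2, 10, 18]))
print(pooled_capacity([2, 10, 18]))
print(pooled_capacity([10, 10, 10]))
```

With staggered peaks the pool only has to cover one peak plus the other sites' base loads at any hour, whereas sites that peak simultaneously gain nothing from pooling in this toy model.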
People
Jacqueline D. Brown
Zhengqiang (Sean) Liang
Tung Nguyen
Dr. Weisong Shi
Brandon Szeliga (former member)
Zhifeng Yu (former member)
Jayashree Ravi (former member)
Publications and Technical Reports
Lei Wang, Jianfeng Zhan, Weisong Shi, Yi Liang, and Lin Yuan, In Cloud, Do MTC or HTC Service Providers Benefit from the Economics of Scale?, in Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, co-located with ACM/IEEE SC09, Portland, Oregon, November 16, 2009.
Brandon Szeliga, Tung Nguyen and Weisong Shi, DiSK: A Distributed Shared Disk Cache for HPC Environments, in Proceedings of the 5th CollaborateCom, Washington D.C., Nov 12-14, 2009.
Ling Liu and Weisong Shi, Trust and Reputation Management in Future Computing Systems and Applications: Guest Editorial, Journal of Computer Science and Technology, September 2009.
Zhifeng Yu, Toward Practical Multi-Workflow Scheduling in Cluster and Grid Environments, Ph.D. Dissertation, Technical Report MIST-TR-2008-013.
Z. Liang and W. Shi, TRECON: A Trust-based Economic Framework for Efficient Internet Routing, accepted by IEEE Transactions on Systems, Man and Cybernetics, Part A, to appear, June 2009.
Ying Song, Yanwei Zhang, Yuzhong Sun and Weisong Shi, Utility Analysis for Internet-Oriented Consolidation in VM-based Data Centers, in Proceedings of the 2009 IEEE International Conference on Cluster Computing (Cluster 2009), New Orleans, Aug. 31 - Sep. 4, 2009.
Chenjia Wang, Kevin Monaghan and Weisong Shi, HACK: A Health-based Access Control Mechanism for Dynamic Enterprise Environments, in Proceedings of the 2009 IEEE/IFIP International Symposium on Trusted Computing and Communications, August 29-31, 2009.
Z. Liang and W. Shi, A Reputation-driven Scheduler for Autonomic and Sustainable Resource Sharing in Grid Computing, Journal of Parallel and Distributed Computing, to appear, May 2009.
Tung Nguyen, Anthony Cutway and Weisong Shi, Toward Differentiated Services for Data Centers, in 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI WiP session), San Diego, CA, December 7-10, 2008.
Zhifeng Yu, Chenjia Wang and Weisong Shi, FLAW: Failure-Aware Workflow Scheduling in High Performance Computing Environments, Technical Report MIST-TR-2007-010, November 2007, submitted.
Zhifeng Yu and Weisong Shi, A Planner-Guided Scheduling Strategy for Multiple Workflow Applications, in Proceedings of the Fourth International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems, in conjunction with ICPP 2008, September 8, 2008.
Brandon Szeliga, John Cavicchio and Weisong Shi, DIMM: A Distributed Metadata Management for Data-Intensive HPC Environments, in Workshop on Data-Aware Distributed Computing, in conjunction with HPDC 2008, Boston, June 24, 2008.
Zhengqiang Liang and Weisong Shi, Analysis of Ratings on Trust Inference in Open Environments, Elsevier Performance Evaluation, Vol. 65, No. 2, pp. 99-128, Feb. 2008.
Zhifeng Yu and Weisong Shi, An Adaptive Rescheduling Strategy for Grid Workflow Applications, in Proceedings of the 21st IPDPS 2007, Long Beach, Mar 26-30, 2007.
Zhengqiang Liang and Weisong Shi, TRECON: A Framework for Enforcing Trusted ISP Peering, in Proceedings of the 15th IEEE International Conference on Computer Communications and Networks (ICCCN '06), Arlington, Oct 9-11, 2006.
Zhengqiang Liang and Weisong Shi, Performance Evaluation of Different Rating Aggregation Schemes in Reputation Systems, in Proceedings of the First IEEE International Conference on Collaborative Computing: Networking, Applications, and Worksharing (CollaborateCom '05), San Jose, December 19-21, 2005.
Zhengqiang Liang and Weisong Shi, Analysis of Recommendations of Trust Inference in Open Environments, Technical Report MIST-TR-2005-002, February 2005 (submitted; under revision).
Zhengqiang Liang and Weisong Shi, Enforcing Cooperative Resource Sharing in Untrusted Peer-to-Peer Environments, ACM Journal of Mobile Networks and Applications (MONET), Vol. 10, No. 6, pp. 771-783, December 2005. (A full version is available as Technical Report MIST-TR-04-014.)
Zhengqiang Liang and Weisong Shi, PET: A PErsonalized Trust Model with Reputation and Risk Evaluation for P2P Resource Sharing, in Proceedings of HICSS-38, January 2005.
Jayashree Ravi, Zhengqiang Liang, and Weisong Shi, A Case for Peer-to-Peer Web Server Sharing, Technical Report MIST-03-011, Nov. 2003.