| Reliability is a central concern in the study and development of distributed systems. In process control systems and computer-integrated manufacturing systems, for example, an application may involve many application entities (processes) located at several different sites. If one entity fails, the others may be affected, so every entity should be able to tolerate failures. We propose a fault-tolerance mechanism, the distributed application entity group (DAEG), that makes an application entity reliable. In this mechanism, an application entity is implemented as a process group whose members may be located at different sites. We describe how the group communicates with other groups and entities in an application, and how the members of a group cooperate with one another and maintain a consistent view of the group. In addition, timing requirements are enforced by means of timers. We introduce a method in which timers are used to detect failures and to select a new group leader so that the application the group supports can continue. We believe the mechanism can support a variety of applications in distributed systems. |
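
The timer-based failure check and leader selection summarized above can be illustrated with a minimal sketch. It is not the DAEG protocol itself: the `GroupView` class, the heartbeat messages, the fixed `timeout`, and the "lowest surviving ID becomes leader" rule are all assumptions introduced here for illustration, since the abstract does not specify how members are monitored or how the new leader is chosen.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GroupView:
    """One member's local view of its group (hypothetical structure)."""
    timeout: float  # seconds of silence before a member is considered failed
    last_heard: Dict[int, float] = field(default_factory=dict)
    leader: Optional[int] = None

    def heartbeat(self, member_id: int, now: float) -> None:
        """Record a message from member_id at time `now`, restarting its timer."""
        self.last_heard[member_id] = now
        if self.leader is None:
            # Assumption: the first member heard from is the initial leader.
            self.leader = member_id

    def check(self, now: float) -> Optional[int]:
        """Remove members whose timer has expired; pick a new leader if needed."""
        expired = [m for m, t in self.last_heard.items() if now - t > self.timeout]
        for m in expired:
            del self.last_heard[m]
        if self.leader not in self.last_heard:
            # Assumed election rule: the lowest surviving ID becomes leader.
            self.leader = min(self.last_heard) if self.last_heard else None
        return self.leader


# Usage: three members join at t=0; member 1 (the leader) then goes silent.
view = GroupView(timeout=3.0)
for member in (1, 2, 3):
    view.heartbeat(member, now=0.0)
view.heartbeat(2, now=2.0)
view.heartbeat(3, now=2.0)
new_leader = view.check(now=4.0)  # member 1 has been silent for 4.0 s > 3.0 s
```

Time is passed in explicitly rather than read from a clock so that the behaviour is deterministic and easy to test. A real group would also need its members to agree on the same view before switching leaders, which this single-node sketch does not attempt.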