| \input texinfo @c -*-texinfo-*- |
| @comment %**start of header (This is for running Texinfo on a region.) |
| @setfilename ipc.info |
| @settitle Inter Process Communication. |
| @setchapternewpage odd |
| @comment %**end of header (This is for running Texinfo on a region.) |
| |
| @ifinfo |
| This file documents the System V style inter process communication |
| primitives available under Linux. |
| |
| Copyright @copyright{} 1992 krishna balasubramanian |
| |
| Permission is granted to use this material and the accompanying |
| programs within the terms of the GNU GPL. |
| @end ifinfo |
| |
| @titlepage |
| @sp 10 |
| @center @titlefont{System V Inter Process Communication} |
| @sp 2 |
| @center krishna balasubramanian, |
| |
| @comment The following two commands start the copyright page. |
| @page |
| @vskip 0pt plus 1filll |
| Copyright @copyright{} 1992 krishna balasubramanian |
| |
| Permission is granted to use this material and the accompanying |
| programs within the terms of the GNU GPL. |
| @end titlepage |
| |
| @dircategory Miscellaneous |
| @direntry |
| * ipc: (ipc). System V style inter process communication |
| @end direntry |
| |
| @node Top, Overview, Notes, (dir) |
| @chapter System V IPC. |
| |
| These facilities are provided to maintain compatibility with |
| programs developed on System V Unix systems and others |
| that rely on these System V mechanisms to accomplish inter |
| process communication (IPC).@refill |
| |
| The specifics described here are applicable to the Linux implementation. |
| Other implementations may do things slightly differently. |
| |
| @menu |
| * Overview:: What is system V ipc? Overall mechanisms. |
| * Messages:: System calls for message passing. |
| * Semaphores:: System calls for semaphores. |
| * Shared Memory:: System calls for shared memory access. |
| * Notes:: Miscellaneous notes. |
| @end menu |
| |
| @node Overview, example, Top, Top |
| @section Overview |
| |
| @noindent System V IPC consists of three mechanisms: |
| |
| @itemize @bullet |
| @item |
| Messages : exchange messages with any process or server. |
| @item |
| Semaphores : allow unrelated processes to synchronize execution. |
| @item |
| Shared memory : allow unrelated processes to share memory. |
| @end itemize |
| |
| @menu |
| * example:: Using shared memory. |
| * perms:: Description of access permissions. |
| * syscalls:: Overview of ipc system calls. |
| @end menu |
| |
| Access to all resources is permitted on the basis of permissions |
| set up when the resource was created.@refill |
| |
| A resource here is a message queue, a semaphore set (array) |
| or a shared memory segment.@refill |
| |
| A resource must first be allocated by a creator before it is used. |
| The creator can assign a different owner. After use the resource |
| must be explicitly destroyed by the creator or owner.@refill |
| |
| A resource is identified by a numeric @var{id}. Typically a creator |
| defines a @var{key} that may be used to access the resource. The user |
| process may then use this @var{key} in the @dfn{get} system call to obtain |
| the @var{id} for the corresponding resource. This @var{id} is then used for |
| all further access. A library call @dfn{ftok} is provided to translate |
| pathnames or strings to numeric keys.@refill |
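The key-to-id sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the helper name key_to_id is hypothetical, a shared memory segment is used as the resource, and the path passed in must name a file that already exists.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper (not part of the IPC API): turn a path and a
   project character into a key, then get the id of the shared memory
   segment for that key, creating the segment if it does not exist.
   Returns the id, or -1 with errno set. */
int key_to_id(const char *path, char proj, size_t size, int perms)
{
    key_t key = ftok(path, proj);   /* path must name an existing file */

    if (key == (key_t) -1)
        return -1;
    return shmget(key, size, IPC_CREAT | perms);
}
```

Two processes that call the helper with the same path and project character obtain the same id, which is the point of agreeing on a key.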
| |
| There are system and implementation defined limits on the number and |
| sizes of resources of any given type. Some of these are imposed by the |
| implementation and others by the system administrator |
| when configuring the kernel (@pxref{msglimits}, @pxref{semlimits}, |
| @pxref{shmlimits}).@refill |
| |
| There is an @code{msqid_ds}, @code{semid_ds} or @code{shmid_ds} struct |
| associated with each message queue, semaphore array or shared segment. |
| Each IPC resource also has an associated @code{ipc_perm} struct which defines |
| the creator, owner and access permissions for the resource. |
| These structures are detailed in the following sections.@refill |
| |
| |
| |
| @node example, perms, Overview, Overview |
| @section example |
| |
| Here is a code fragment with pointers on how to use shared memory. The |
| same methods are applicable to other resources.@refill |
| |
| In a typical access sequence the creator allocates a new instance |
| of the resource with the @code{get} system call using the IPC_CREAT |
| flag.@refill |
| |
| @noindent creator process:@* |
| |
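| @example |
| #include <stdio.h> |
| #include <stdlib.h> |
| #include <sys/ipc.h> |
| #include <sys/shm.h> |
| int id; |
| key_t key; |
| char proc_id = 'C'; |
| int size = 0x5000; /* 20 K */ |
| int flags = 0664 | IPC_CREAT; /* read-only for others */ |
| |
| /* placeholder path: ftok does not expand ~, use a real pathname */ |
| key = ftok ("~creator/ipckey", proc_id); |
| id = shmget (key, size, flags); |
| if (id == -1) |
|     perror ("shmget ..."); |
| exit (0); /* quit leaving resource allocated */ |
| @end example |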
| |
| @noindent |
| Users then gain access to the resource using the same key.@* |
| @noindent |
| Client process: |
| @example |
| #include <stdio.h> |
| #include <sys/ipc.h> |
| #include <sys/shm.h> |
| char *shmaddr; |
| char local_var; |
| int id; |
| key_t key; |
| char proc_id = 'C'; |
| |
| key = ftok ("~creator/ipckey", proc_id); |
| |
| id = shmget (key, 0, 004); /* size 0: use the existing segment */ |
| if (id == -1) |
|     perror ("shmget ..."); |
| |
| shmaddr = (char *) shmat (id, 0, SHM_RDONLY); /* attach segment for reading */ |
| if (shmaddr == (char *) -1) |
|     perror ("shmat ..."); |
| |
| local_var = *(shmaddr + 3); /* read segment etc. */ |
| |
| shmdt (shmaddr); /* detach segment */ |
| @end example |
| |
| @noindent |
| When the resource is no longer needed the creator should remove it.@* |
| @noindent |
| Creator/owner process 2: |
| @example |
| key = ftok ("~creator/ipckey", proc_id); |
| id = shmget (key, 0, 0); |
| shmctl (id, IPC_RMID, NULL); |
| @end example |
| |
| |
| @node perms, syscalls, example, Overview |
| @section Permissions |
| |
| Each resource has an associated @code{ipc_perm} struct which defines the |
| creator, owner and access perms for the resource.@refill |
| |
| @example |
| struct ipc_perm |
| key_t key; /* set by creator */ |
| ushort uid; /* owner euid and egid */ |
| ushort gid; |
| ushort cuid; /* creator euid and egid */ |
| ushort cgid; |
| ushort mode; /* access modes in lower 9 bits */ |
| ushort seq; /* sequence number */ |
| @end example |
| |
| The creating process is the default owner. The owner can be reassigned |
| by the creator and has creator perms. Only the owner, creator or super-user |
| can delete the resource.@refill |
| |
| The lowest nine bits of the flags parameter supplied by the user to the |
| system call are compared with the values stored in @code{ipc_perm.mode} |
| to determine if the requested access is allowed. In the case |
| that the system call creates the resource, these bits are initialized |
| from the user supplied value.@refill |
| |
| As for files, access permissions are specified as read, write and exec |
| for user, group or other (though the exec perms are unused). For example |
| 0624 grants read-write to owner, write-only to group and read-only |
| access to others.@refill |
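As a small worked example of the octal notation above, the following sketch renders the low nine mode bits in the familiar rwx form. The function name mode_string is hypothetical and exists only for illustration.

```c
/* Hypothetical helper: render the low nine mode bits as text in the
   form used for files (owner, group, other).  out must have room
   for 10 bytes including the terminating NUL. */
void mode_string(unsigned mode, char out[10])
{
    const char letters[] = "rwxrwxrwx";
    int i;

    /* 0400 is the owner-read bit; shift right once per position. */
    for (i = 0; i < 9; i++)
        out[i] = (mode & (0400u >> i)) ? letters[i] : '-';
    out[9] = '\0';
}
```

For the mode 0624 used in the text this yields rw--w-r--, that is read-write for the owner, write-only for the group and read-only for others.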
| |
| For shared memory, note that read-write access for segments is determined |
| by a separate flag which is not stored in the @code{mode} field. |
| Shared memory segments attached with write access can be read.@refill |
| |
| The @code{cuid}, @code{cgid}, @code{key} and @code{seq} fields |
| cannot be changed by the user.@refill |
| |
| |
| |
| @node syscalls, Messages, perms, Overview |
| @section IPC system calls |
| |
| This section provides an overview of the IPC system calls. See the |
| specific sections on each type of resource for details.@refill |
| |
| Each type of mechanism provides a @dfn{get}, @dfn{ctl} and one or more |
| @dfn{op} system calls that allow the user to create or procure the |
| resource (get), define its behaviour or destroy it (ctl) and manipulate |
| the resources (op).@refill |
| |
| |
| |
| @subsection The @dfn{get} system calls |
| |
| The @code{get} call typically takes a @var{key} and returns a numeric |
| @var{id} that is used for further access. |
| The @var{id} is an index into the resource table. A sequence |
| number is maintained and incremented when a resource is |
| destroyed so that access using an obsolete @var{id} is likely to fail.@refill |
| |
| The user also specifies the permissions and other behaviour |
| characteristics for the current access. The flags are or-ed with the |
| permissions when invoking system calls as in:@refill |
| @example |
| msgflg = IPC_CREAT | IPC_EXCL | 0666; |
| id = msgget (key, msgflg); |
| @end example |
| @itemize @bullet |
| @item |
| @code{key} : IPC_PRIVATE => new instance of resource is initialized. |
| @item |
| @code{flags} : |
| @itemize @asis |
| @item |
| IPC_CREAT : resource created for @var{key} if it does not exist. |
| @item |
| IPC_CREAT | IPC_EXCL : fail if resource exists for @var{key}. |
| @end itemize |
| @item |
| returns : an identifier used for all further access to the resource. |
| @end itemize |
| |
| Note that IPC_PRIVATE is not a flag but a special @code{key} |
| that ensures (when the call is successful) that a new resource is |
| created.@refill |
| |
| Use of IPC_PRIVATE does not make the resource inaccessible to other |
| users. For this you must set the access permissions appropriately.@refill |
| |
| There is currently no way for a process to ensure exclusive access to a |
| resource. IPC_CREAT | IPC_EXCL only ensures (on success) that a new |
| resource was initialized. It does not imply exclusive access.@refill |
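One common use of IPC_CREAT | IPC_EXCL is to find out whether the current call created the resource, for instance to decide who should initialize it. The sketch below assumes a semaphore array as the resource; the helper name create_or_open and the created output parameter are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: try to create a semaphore array exclusively.
   *created is set to 1 if this call allocated the array and to 0 if
   it already existed and was merely procured.
   Returns the id, or -1 with errno set. */
int create_or_open(key_t key, int nsems, int perms, int *created)
{
    int id = semget(key, nsems, IPC_CREAT | IPC_EXCL | perms);

    if (id != -1) {
        *created = 1;               /* a new resource was initialized */
        return id;
    }
    if (errno != EEXIST)
        return -1;                  /* a real failure */
    *created = 0;
    return semget(key, 0, perms);   /* procure the existing array */
}
```

Note that created == 1 only says that the array is new. As the text explains, other processes with suitable permissions can still procure and use it.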
| |
| @noindent |
| See also: @ref{msgget}, @ref{semget}, @ref{shmget}.@refill |
| |
| |
| |
| @subsection The @dfn{ctl} system calls |
| |
| The @code{ctl} calls provide or alter the information stored in the |
| structure that describes the resource indexed by @var{id}.@refill |
| |
| @example |
| #include <stdio.h> |
| #include <sys/msg.h> |
| struct msqid_ds buf; |
| int err; |
| |
| err = msgctl (id, IPC_STAT, &buf); |
| if (err) |
|     perror ("msgctl"); |
| else |
|     printf ("creator uid = %d\n", buf.msg_perm.cuid); |
| .... |
| @end example |
| |
| @noindent |
| Commands supported by all @code{ctl} calls:@* |
| @itemize @bullet |
| @item |
| IPC_STAT : read info on resource specified by id into user allocated |
| buffer. The user must have read access to the resource.@refill |
| @item |
| IPC_SET : write info from buffer into resource data structure. The |
| user must be the owner, creator or super-user.@refill |
| @item |
| IPC_RMID : remove resource. The user must be the owner, creator or |
| super-user.@refill |
| @end itemize |
| |
| The IPC_RMID command results in immediate removal of a message |
| queue or semaphore array. Shared memory segments, however, are |
| only destroyed upon the last detach after IPC_RMID is executed.@refill |
| |
| The @code{semctl} call provides a number of command options that allow |
| the user to determine or set the values of the semaphores in an array.@refill |
| |
| @noindent |
| See also: @ref{msgctl}, @ref{semctl}, @ref{shmctl}.@refill |
| |
| |
| @subsection The @dfn{op} system calls |
| |
| The @code{op} calls are used to send or receive messages, read or alter |
| semaphore values, and attach or detach shared memory segments. |
| The IPC_NOWAIT flag will cause the operation to fail with error EAGAIN |
| (ENOMSG in the case of @code{msgrcv}) if the process would have to wait |
| on the call.@refill |
| |
| @noindent |
| @code{flags} : IPC_NOWAIT => return with error if a wait is required. |
| |
| @noindent |
| See also: @ref{msgsnd}, @ref{msgrcv}, @ref{semop}, @ref{shmat}, |
| @ref{shmdt}.@refill |
| |
| |
| |
| @node Messages, msgget, syscalls, Top |
| @section Messages |
| |
| A message resource is described by a struct @code{msqid_ds} which is |
| allocated and initialized when the resource is created. Some fields |
| in @code{msqid_ds} can then be altered (if desired) by invoking @code{msgctl}. |
| The memory used by the resource is released when it is destroyed by |
| a @code{msgctl} call.@refill |
| |
| @example |
| struct msqid_ds |
| struct ipc_perm msg_perm; |
| struct msg *msg_first; /* first message on queue (internal) */ |
| struct msg *msg_last; /* last message in queue (internal) */ |
| time_t msg_stime; /* last msgsnd time */ |
| time_t msg_rtime; /* last msgrcv time */ |
| time_t msg_ctime; /* last change time */ |
| struct wait_queue *wwait; /* writers waiting (internal) */ |
| struct wait_queue *rwait; /* readers waiting (internal) */ |
| ushort msg_cbytes; /* number of bytes used on queue */ |
| ushort msg_qnum; /* number of messages in queue */ |
| ushort msg_qbytes; /* max number of bytes on queue */ |
| ushort msg_lspid; /* pid of last msgsnd */ |
| ushort msg_lrpid; /* pid of last msgrcv */ |
| @end example |
| |
| To send or receive a message the user allocates a structure that looks |
| like a @code{msgbuf} but with an array @code{mtext} of the required size. |
| Messages have a type (positive integer) associated with them so that |
| (for example) a listener can choose to receive only messages of a |
| given type.@refill |
| |
| @example |
| struct msgbuf |
| long mtype; /* type of message (@pxref{msgrcv}) */ |
| char mtext[1]; /* message text, stored inline after mtype */ |
| @end example |
| |
| The user must have write permissions to send and read permissions |
| to receive messages on a queue.@refill |
| |
| When @code{msgsnd} is invoked, the user's message is copied into |
| an internal struct @code{msg} and added to the queue. A @code{msgrcv} |
| will then read this message and free the associated struct @code{msg}.@refill |
| |
| |
| @menu |
| * msgget:: |
| * msgsnd:: |
| * msgrcv:: |
| * msgctl:: |
| * msglimits:: Implementation defined limits. |
| @end menu |
| |
| |
| @node msgget, msgsnd, Messages, Messages |
| @subsection msgget |
| |
| @noindent |
| A message queue is allocated by a msgget system call : |
| |
| @example |
| msqid = msgget (key_t key, int msgflg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| @code{key}: an integer usually obtained from @code{ftok()} or IPC_PRIVATE.@refill |
| @item |
| @code{msgflg}: |
| @itemize @asis |
| @item |
| IPC_CREAT : used to create a new resource if it does not already exist. |
| @item |
| IPC_EXCL | IPC_CREAT : used to ensure failure of the call if the |
| resource already exists.@refill |
| @item |
| rwxrwxrwx : access permissions. |
| @end itemize |
| @item |
| returns: msqid (an integer used for all further access) on success. |
| -1 on failure.@refill |
| @end itemize |
| |
| A message queue is allocated if there is no resource corresponding |
| to the given key. The access permissions specified are then copied |
| into the @code{msg_perm} struct and the fields in @code{msqid_ds} |
| initialized. The user must use the IPC_CREAT flag or key = IPC_PRIVATE, |
| if a new instance is to be allocated. If a resource corresponding to |
| @var{key} already exists, the access permissions are verified.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EACCES : (procure) Do not have permission for requested access.@* |
| @noindent |
| EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@* |
| @noindent |
| EIDRM : (procure) The resource was removed.@* |
| @noindent |
| ENOSPC : All id's are taken (max of MSGMNI id's system-wide).@* |
| @noindent |
| ENOENT : Resource does not exist and IPC_CREAT not specified.@* |
| @noindent |
| ENOMEM : A new @code{msqid_ds} was to be created but no memory was available. |
| |
| |
| |
| |
| @node msgsnd, msgrcv, msgget, Messages |
| @subsection msgsnd |
| |
| @example |
| int msgsnd (int msqid, struct msgbuf *msgp, int msgsz, int msgflg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| @code{msqid} : id obtained by a call to msgget. |
| @item |
| @code{msgsz} : size of msg text (@code{mtext}) in bytes. |
| @item |
| @code{msgp} : message to be sent. (msgp->mtype must be positive). |
| @item |
| @code{msgflg} : IPC_NOWAIT. |
| @item |
| returns : msgsz on success. -1 on error. |
| @end itemize |
| |
| The message text and type are stored in the internal @code{msg} |
| structure. @code{msg_cbytes}, @code{msg_qnum}, @code{msg_lspid}, |
| and @code{msg_stime} fields are updated. Readers waiting on the |
| queue are awakened.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EACCES : Do not have write permission on queue.@* |
| @noindent |
| EAGAIN : IPC_NOWAIT specified and queue is full.@* |
| @noindent |
| EFAULT : msgp not accessible.@* |
| @noindent |
| EIDRM : The message queue was removed.@* |
| @noindent |
| EINTR : The queue was full and the process was interrupted while waiting.@* |
| @noindent |
| EINVAL : mtype < 1, msgsz > MSGMAX, msgsz < 0, msqid < 0 or unused.@* |
| @noindent |
| ENOMEM : Could not allocate space for header and text.@* |
| |
| |
| |
| @node msgrcv, msgctl, msgsnd, Messages |
| @subsection msgrcv |
| |
| @example |
| int msgrcv (int msqid, struct msgbuf *msgp, int msgsz, long msgtyp, |
| int msgflg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| msqid : id obtained by a call to msgget. |
| @item |
| msgsz : maximum size of message to receive. |
| @item |
| msgp : allocated by user to store the message in. |
| @item |
| msgtyp : |
| @itemize @asis |
| @item |
| 0 => get first message on queue. |
| @item |
| > 0 => get first message of matching type. |
| @item |
| < 0 => get message with least type which is <= abs(msgtyp). |
| @end itemize |
| @item |
| msgflg : |
| @itemize @asis |
| @item |
| IPC_NOWAIT : Return immediately if message not found. |
| @item |
| MSG_NOERROR : The message is truncated if it is larger than msgsz. |
| @item |
| MSG_EXCEPT : Used with msgtyp > 0 to receive any msg except of specified |
| type.@refill |
| @end itemize |
| @item |
| returns : size of message if found. -1 on error. |
| @end itemize |
| |
| The first message that meets the @code{msgtyp} specification is |
| identified. For msgtyp < 0, the entire queue is searched for the |
| message with the smallest type.@refill |
| |
| If its length is no larger than msgsz or if the user specified the |
| MSG_NOERROR flag, its text and type are copied to msgp->mtext and |
| msgp->mtype, and it is taken off the queue.@refill |
| |
| The @code{msg_cbytes}, @code{msg_qnum}, @code{msg_lrpid}, |
| and @code{msg_rtime} fields are updated. Writers waiting on the |
| queue are awakened.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| E2BIG : msg bigger than msgsz and MSG_NOERROR not specified.@* |
| @noindent |
| EACCES : Do not have permission for reading the queue.@* |
| @noindent |
| EFAULT : msgp not accessible.@* |
| @noindent |
| EIDRM : msg queue was removed.@* |
| @noindent |
| EINTR : No matching message was found and the process was interrupted while waiting.@* |
| @noindent |
| EINVAL : msgsz > MSGMAX or msgsz < 0, msqid < 0 or unused.@* |
| @noindent |
| ENOMSG : msg of requested type not found and IPC_NOWAIT specified. |
| |
| |
| |
| @node msgctl, msglimits, msgrcv, Messages |
| @subsection msgctl |
| |
| @example |
| int msgctl (int msqid, int cmd, struct msqid_ds *buf); |
| @end example |
| |
| @itemize @bullet |
| @item |
| msqid : id obtained by a call to msgget. |
| @item |
| buf : allocated by user for reading/writing info. |
| @item |
| cmd : IPC_STAT, IPC_SET, IPC_RMID (@xref{syscalls}). |
| @end itemize |
| |
| IPC_STAT results in the copy of the queue data structure |
| into the user supplied buffer.@refill |
| |
| In the case of IPC_SET, the queue size (@code{msg_qbytes}) |
| and the @code{uid}, @code{gid}, @code{mode} (low 9 bits) fields |
| of the @code{msg_perm} struct are set from the user supplied values. |
| @code{msg_ctime} is updated.@refill |
| |
| Note that only the super user may increase the limit on the size of a |
| message queue beyond MSGMNB.@refill |
| |
| When the queue is destroyed (IPC_RMID), the sequence number is |
| incremented and all waiting readers and writers are awakened. |
| These processes will then return with @code{errno} set to EIDRM.@refill |
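A typical IPC_SET sequence reads the current structure with IPC_STAT, changes one field and writes it back. The sketch below lowers the byte limit of a queue, which needs no special privilege. The helper name set_queue_limit is hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical helper: change the maximum number of bytes allowed
   on a queue.  Returns 0 on success, -1 with errno set on failure
   (EPERM if an unprivileged caller tries to raise the limit beyond
   MSGMNB). */
int set_queue_limit(int msqid, unsigned long nbytes)
{
    struct msqid_ds ds;

    if (msgctl(msqid, IPC_STAT, &ds) == -1)     /* read current state */
        return -1;
    ds.msg_qbytes = nbytes;                     /* change one field */
    return msgctl(msqid, IPC_SET, &ds);         /* write it back */
}
```

Starting from the values returned by IPC_STAT keeps the uid, gid and mode fields unchanged, since IPC_SET copies all of them from the buffer.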
| |
| @noindent |
| Errors:@* |
| @noindent |
| EPERM : Insufficient privilege to increase the size of the queue (IPC_SET) |
| or remove it (IPC_RMID).@* |
| @noindent |
| EACCES : Do not have permission for reading the queue (IPC_STAT).@* |
| @noindent |
| EFAULT : buf not accessible (IPC_STAT, IPC_SET).@* |
| @noindent |
| EIDRM : msg queue was removed.@* |
| @noindent |
| EINVAL : invalid cmd, msqid < 0 or unused. |
| |
| |
| @node msglimits, Semaphores, msgctl, Messages |
| @subsection Limits on Message Resources |
| |
| @noindent |
| Sizeof various structures: |
| @itemize @asis |
| @item |
| msqid_ds 52 /* 1 per message queue .. dynamic */ |
| @item |
| msg 16 /* 1 for each message in system .. dynamic */ |
| @item |
| msgbuf 8 /* allocated by user */ |
| @end itemize |
| |
| @noindent |
| Limits |
| @itemize @bullet |
| @item |
| MSGMNI : number of message queue identifiers ... policy. |
| @item |
| MSGMAX : max size of message. |
| Header and message space allocated on one page. |
| MSGMAX = (PAGE_SIZE - sizeof(struct msg)). |
| Implementation maximum MSGMAX = 4080.@refill |
| @item |
| MSGMNB : default max size of a message queue ... policy. |
| The super-user can increase the size of a |
| queue beyond MSGMNB by a @code{msgctl} call.@refill |
| @end itemize |
| |
| @noindent |
| Unused or unimplemented:@* |
| MSGTQL max number of message headers system-wide.@* |
| MSGPOOL total size in bytes of msg pool. |
| |
| |
| |
| @node Semaphores, semget, msglimits, Top |
| @section Semaphores |
| |
| Each semaphore has a value >= 0. An id provides access to an array |
| of @code{nsems} semaphores. Operations that read, increment or decrement |
| semaphores in a set are performed by the @code{semop} call which processes |
| @code{nsops} operations at a time. Each operation is specified in a struct |
| @code{sembuf} described below. The operations are applied only if all of |
| them succeed.@refill |
| |
| If you do not have a need for such arrays, you are probably better off using |
| the @code{test_bit}, @code{set_bit} and @code{clear_bit} bit-operations |
| defined in <asm/bitops.h>.@refill |
| |
| Semaphore operations may also be qualified by a SEM_UNDO flag which |
| results in the operation being undone when the process exits.@refill |
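The effect of SEM_UNDO can be observed with a child process that takes a semaphore and exits without releasing it. This is a sketch under stated assumptions: the helper name value_after_child_exit is hypothetical, and the semaphore with index 0 is assumed to have a value of at least 1 on entry.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that decrements semaphore 0 with
   SEM_UNDO and then exits without incrementing it again.
   Returns the semval seen by the parent afterwards, or -1. */
int value_after_child_exit(int semid)
{
    struct sembuf take;
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {                     /* child */
        take.sem_num = 0;
        take.sem_op = -1;
        take.sem_flg = SEM_UNDO;        /* undo the decrement on exit */
        _exit(semop(semid, &take, 1) == -1 ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return semctl(semid, 0, GETVAL);    /* value after the undo */
}
```

If the semaphore held 1 before the call, the parent reads 1 again afterwards. Without SEM_UNDO the value would remain 0.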
| |
| If a decrement cannot go through, a process will be put to sleep |
| on a queue waiting for the @code{semval} to increase unless it specifies |
| IPC_NOWAIT. A read operation can similarly result in a sleep on a |
| queue waiting for @code{semval} to become 0. (Actually there are |
| two queues per semaphore array).@refill |
| |
| @noindent |
| A semaphore array is described by: |
| @example |
| struct semid_ds |
| struct ipc_perm sem_perm; |
| time_t sem_otime; /* last semop time */ |
| time_t sem_ctime; /* last change time */ |
| struct wait_queue *eventn; /* wait for a semval to increase */ |
| struct wait_queue *eventz; /* wait for a semval to become 0 */ |
| struct sem_undo *undo; /* undo entries */ |
| ushort sem_nsems; /* no. of semaphores in array */ |
| @end example |
| |
| @noindent |
| Each semaphore is described internally by : |
| @example |
| struct sem |
| short sempid; /* pid of last semop() */ |
| ushort semval; /* current value */ |
| ushort semncnt; /* num procs awaiting increase in semval */ |
| ushort semzcnt; /* num procs awaiting semval = 0 */ |
| @end example |
| |
| @menu |
| * semget:: |
| * semop:: |
| * semctl:: |
| * semlimits:: Limits imposed by this implementation. |
| @end menu |
| |
| @node semget, semop, Semaphores, Semaphores |
| @subsection semget |
| |
| @noindent |
| A semaphore array is allocated by a semget system call: |
| |
| @example |
| semid = semget (key_t key, int nsems, int semflg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| @code{key} : an integer usually obtained from @code{ftok} or IPC_PRIVATE |
| @item |
| @code{nsems} : |
| @itemize @asis |
| @item |
| # of semaphores in array (0 <= nsems <= SEMMSL <= SEMMNS) |
| @item |
| 0 => don't care; can be used when not creating the resource. |
| If successful you always get access to the entire array anyway.@refill |
| @end itemize |
| @item |
| semflg : |
| @itemize @asis |
| @item |
| IPC_CREAT used to create a new resource |
| @item |
| IPC_EXCL used with IPC_CREAT to ensure failure if the resource exists. |
| @item |
| rwxrwxrwx access permissions. |
| @end itemize |
| @item |
| returns : semid on success. -1 on failure. |
| @end itemize |
| |
| An array of nsems semaphores is allocated if there is no resource |
| corresponding to the given key. The access permissions specified are |
| then copied into the @code{sem_perm} struct for the array along with the |
| user-id etc. The user must use the IPC_CREAT flag or key = IPC_PRIVATE |
| if a new resource is to be created.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EINVAL : nsems not in above range (allocate).@* |
| nsems greater than number in array (procure).@* |
| @noindent |
| EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@* |
| @noindent |
| EIDRM : (procure) The resource was removed.@* |
| @noindent |
| ENOMEM : could not allocate space for semaphore array.@* |
| @noindent |
| ENOSPC : No arrays available (SEMMNI), too few semaphores available (SEMMNS).@* |
| @noindent |
| ENOENT : Resource does not exist and IPC_CREAT not specified.@* |
| @noindent |
| EACCES : (procure) do not have permission for specified access. |
| |
| |
| @node semop, semctl, semget, Semaphores |
| @subsection semop |
| |
| @noindent |
| Operations on semaphore arrays are performed by calling semop : |
| |
| @example |
| int semop (int semid, struct sembuf *sops, unsigned nsops); |
| @end example |
| @itemize @bullet |
| @item |
| semid : id obtained by a call to semget. |
| @item |
| sops : array of semaphore operations. |
| @item |
| nsops : number of operations in array (0 < nsops <= SEMOPM). |
| @item |
| returns : semval for last operation. -1 on failure. |
| @end itemize |
| |
| @noindent |
| Operations are described by a structure sembuf: |
| @example |
| struct sembuf |
| ushort sem_num; /* semaphore index in array */ |
| short sem_op; /* semaphore operation */ |
| short sem_flg; /* operation flags */ |
| @end example |
| |
| The value @code{sem_op} is to be added (signed) to the current value semval |
| of the semaphore with index sem_num (0 .. nsems -1) in the set. |
| Flags recognized in sem_flg are IPC_NOWAIT and SEM_UNDO.@refill |
| |
| @noindent |
| Two kinds of operations can result in wait: |
| @enumerate |
| @item |
| If sem_op is 0 (read operation) and semval is non-zero, the process |
| sleeps on a queue waiting for semval to become zero or returns with |
| error EAGAIN if (sem_flg & IPC_NOWAIT) is true.@refill |
| @item |
| If (sem_op < 0) and (semval + sem_op < 0), the process either sleeps |
| on a queue waiting for semval to increase or returns with error EAGAIN if |
| (sem_flg & IPC_NOWAIT) is true.@refill |
| @end enumerate |
| |
| The array sops is first read in and preliminary checks performed on |
| the arguments. The operations are parsed to determine if any of |
| them needs write permissions or requests an undo operation.@refill |
| |
| The operations are then tried and the process sleeps if any operation |
| that does not specify IPC_NOWAIT cannot go through. If a process sleeps |
| it repeats these checks on waking up. If any operation that requests |
| IPC_NOWAIT cannot go through at any stage, the call returns with errno |
| set to EAGAIN.@refill |
| |
| Finally, operations are committed when all go through without an intervening |
| sleep. Processes waiting on the zero_queue or increment_queue are awakened |
| if any of the semval's becomes zero or is incremented respectively.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| E2BIG : nsops > SEMOPM.@* |
| @noindent |
| EACCES : Do not have permission for requested (read/alter) access.@* |
| @noindent |
| EAGAIN : An operation with IPC_NOWAIT specified could not go through.@* |
| @noindent |
| EFAULT : The array sops is not accessible.@* |
| @noindent |
| EFBIG : An operation had sem_num >= nsems.@* |
| @noindent |
| EIDRM : The resource was removed.@* |
| @noindent |
| EINTR : The process was interrupted on its way to a wait queue.@* |
| @noindent |
| EINVAL : nsops is 0, semid < 0 or unused.@* |
| @noindent |
| ENOMEM : SEM_UNDO requested. Could not allocate space for undo structure.@* |
| @noindent |
| ERANGE : sem_op + semval > SEMVMX for some operation. |
| |
| |
| @node semctl, semlimits, semop, Semaphores |
| @subsection semctl |
| |
| @example |
| int semctl (int semid, int semnum, int cmd, union semun arg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| semid : id obtained by a call to semget. |
| @item |
| cmd : |
| @itemize @asis |
| @item |
| GETPID return pid for the process that executed the last semop. |
| @item |
| GETVAL return semval of semaphore with index semnum. |
| @item |
| GETNCNT return number of processes waiting for semval to increase. |
| @item |
| GETZCNT return number of processes waiting for semval to become 0 |
| @item |
| SETVAL set semval = arg.val. |
| @item |
| GETALL read all semval's into arg.array. |
| @item |
| SETALL set all semval's with values given in arg.array. |
| @end itemize |
| @item |
| returns : 0 on success or as given above. -1 on failure. |
| @end itemize |
| |
| The first five operate on the semaphore with index semnum in the set. |
| The last two operate on all semaphores in the set.@refill |
| |
| @code{arg} is a union : |
| @example |
| union semun |
| int val; /* value for SETVAL */ |
| struct semid_ds *buf; /* buffer for IPC_STAT and IPC_SET */ |
| ushort *array; /* array for GETALL and SETALL */ |
| @end example |
| |
| @itemize @bullet |
| @item |
| IPC_SET, SETVAL, SETALL : sem_ctime is updated. |
| @item |
| SETVAL, SETALL : Undo entries are cleared for altered semaphores in |
| all processes. Processes sleeping on the wait queues are |
| awakened if a semval becomes 0 or increases.@refill |
| @item |
| IPC_SET : sem_perm.uid, sem_perm.gid, sem_perm.mode are updated from |
| user supplied values.@refill |
| @end itemize |
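The following sketch shows how the semun union is passed for SETVAL, SETALL and GETALL. It assumes a current Linux C library, where the program has to declare union semun itself; on systems whose headers already declare it, the declaration below must be dropped. The helper names set_one, set_all and get_all are hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumption: the C library does not declare union semun, so the
   caller declares it with the members described in the text. */
union semun {
    int val;                    /* value for SETVAL */
    struct semid_ds *buf;       /* buffer for IPC_STAT and IPC_SET */
    unsigned short *array;      /* array for GETALL and SETALL */
};

/* Set one semaphore.  Returns 0 or -1. */
int set_one(int semid, int semnum, int value)
{
    union semun arg;

    arg.val = value;
    return semctl(semid, semnum, SETVAL, arg);
}

/* Set every semaphore in the array from values[].  Returns 0 or -1. */
int set_all(int semid, unsigned short *values)
{
    union semun arg;

    arg.array = values;
    return semctl(semid, 0, SETALL, arg);   /* semnum is ignored */
}

/* Read every semval into values[].  Returns 0 or -1. */
int get_all(int semid, unsigned short *values)
{
    union semun arg;

    arg.array = values;
    return semctl(semid, 0, GETALL, arg);   /* semnum is ignored */
}
```

The array passed to set_all and get_all must have at least as many elements as there are semaphores in the set.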
| |
| @noindent |
| Errors:@* |
| @noindent |
| EACCES : do not have permission for specified access.@* |
| @noindent |
| EFAULT : arg is not accessible.@* |
| @noindent |
| EIDRM : The resource was removed.@* |
| @noindent |
| EINVAL : semid < 0 or semnum < 0 or semnum >= nsems.@* |
| @noindent |
| EPERM : IPC_RMID, IPC_SET ... not creator, owner or super-user.@* |
| @noindent |
| ERANGE : arg.val (SETVAL) or arg.array[i] (SETALL) is < 0 or > SEMVMX. |
| |
| |
| |
| |
| @node semlimits, Shared Memory, semctl, Semaphores |
| @subsection Limits on Semaphore Resources |
| |
| @noindent |
| Sizeof various structures: |
| @example |
| semid_ds 44 /* 1 per semaphore array .. dynamic */ |
| sem 8 /* 1 for each semaphore in system .. dynamic */ |
| sembuf 6 /* allocated by user */ |
| sem_undo 20 /* 1 for each undo request .. dynamic */ |
| @end example |
| |
| @noindent |
| Limits :@* |
| @itemize @bullet |
| @item |
| SEMVMX 32767 semaphore maximum value (short). |
| @item |
| SEMMNI number of semaphore identifiers (or arrays) system wide...policy. |
| @item |
| SEMMSL maximum number of semaphores per id. |
| 1 semid_ds per array, 1 struct sem per semaphore |
| => SEMMSL = (PAGE_SIZE - sizeof(semid_ds)) / sizeof(sem). |
| Implementation maximum SEMMSL = 500.@refill |
| @item |
| SEMMNS maximum number of semaphores system wide ... policy. |
| Setting SEMMNS >= SEMMSL*SEMMNI makes it irrelevant.@refill |
| @item |
| SEMOPM Maximum number of operations in one semop call...policy. |
| @end itemize |
| |
| @noindent |
| Unused or unimplemented:@* |
| @noindent |
| SEMAEM adjust on exit max value.@* |
| @noindent |
| SEMMNU number of undo structures system-wide.@* |
| @noindent |
| SEMUME maximum number of undo entries per process. |
| |
| |
| |
| @node Shared Memory, shmget, semlimits, Top |
| @section Shared Memory |
| |
| Shared memory is distinct from the sharing of read-only code pages or |
| the sharing of unaltered data pages that is available due to the |
| copy-on-write mechanism. The essential difference is that the |
| shared pages are dirty (in the case of shared memory) and can be |
| made to appear at a convenient location in the process' address space.@refill |
| |
| @noindent |
| A shared segment is described by : |
| @example |
| struct shmid_ds @{ |
|     struct ipc_perm shm_perm; |
|     int shm_segsz; /* size of segment (bytes) */ |
|     time_t shm_atime; /* last attach time */ |
|     time_t shm_dtime; /* last detach time */ |
|     time_t shm_ctime; /* last change time */ |
|     ulong *shm_pages; /* internal page table */ |
|     ushort shm_cpid; /* pid, creator */ |
|     ushort shm_lpid; /* pid, last operation */ |
|     short shm_nattch; /* no. of current attaches */ |
| @}; |
| @end example |
| |
| A shmget allocates a shmid_ds and an internal page table. A shmat |
| maps the segment into the process' address space with pointers |
| into the internal page table and the actual pages are faulted in |
| as needed. The memory associated with the segment must be explicitly |
| destroyed by calling shmctl with IPC_RMID.@refill |
| |
| @menu |
| * shmget:: |
| * shmat:: |
| * shmdt:: |
| * shmctl:: |
| * shmlimits:: Limits imposed by this implementation. |
| @end menu |
| |
| |
| @node shmget, shmat, Shared Memory, Shared Memory |
| @subsection shmget |
| |
| @noindent |
| A shared memory segment is allocated by a shmget system call: |
| |
| @example |
| int shmget(key_t key, int size, int shmflg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| key : an integer usually got from @code{ftok} or IPC_PRIVATE |
| @item |
| size : size of the segment in bytes (SHMMIN <= size <= SHMMAX). |
| @item |
| shmflg : |
| @itemize @asis |
| @item |
| IPC_CREAT used to create a new resource |
| @item |
| IPC_EXCL used with IPC_CREAT to ensure failure if the resource exists. |
| @item |
| rwxrwxrwx access permissions. |
| @end itemize |
| @item |
| returns : shmid on success. -1 on failure. |
| @end itemize |
| |
| A descriptor for a shared memory segment is allocated if there isn't one |
| corresponding to the given key. The access permissions specified are |
| then copied into the @code{shm_perm} struct for the segment along with the |
| user-id etc. The user must use the IPC_CREAT flag or key = IPC_PRIVATE |
| to allocate a new segment.@refill |
| |
| If the segment already exists, the access permissions are verified, |
| and a check is made to see that it is not marked for destruction.@refill |
| |
| @code{size} is effectively rounded up to a multiple of PAGE_SIZE as shared |
| memory is allocated in pages.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EINVAL : (allocate) Size not in range specified above.@* |
| (procure) Size greater than size of segment.@* |
| @noindent |
| EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@* |
| @noindent |
| EIDRM : (procure) The resource is marked destroyed or was removed.@* |
| @noindent |
| ENOSPC : (allocate) All id's are taken (max of SHMMNI id's system-wide), |
| or allocating a segment of the requested size would exceed the |
| system wide limit on total shared memory (SHMALL).@refill |
| @* |
| @noindent |
| ENOENT : (procure) Resource does not exist and IPC_CREAT not specified.@* |
| @noindent |
| EACCES : (procure) Do not have permission for specified access.@* |
| @noindent |
| ENOMEM : (allocate) Could not allocate memory for shmid_ds or pg_table. |
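| |
| @noindent |
| The allocate and procure cases above might be used as in the following |
| sketch. It assumes a POSIX C environment providing @code{<sys/shm.h>}. |
| The helper names are hypothetical, and the mode 0600 (owner read and |
| write) is only an example choice.@refill |
| |
```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Allocate a new private segment of at least `size` bytes.
 * Returns the shmid, or -1 with errno set (EINVAL if size is out of range). */
int create_private_segment(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
}

/* Allocate a segment for `key`; fails with EEXIST if one already exists. */
int create_exclusive_segment(key_t key, size_t size)
{
    return shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
}

/* Procure an existing segment for `key`.  Without IPC_CREAT this fails
 * with ENOENT when no segment corresponds to the key. */
int find_segment(key_t key)
{
    return shmget(key, 0, 0);
}
```
| |
| The returned shmid is what later calls to @code{shmat} and @code{shmctl} |
| expect. A segment created this way stays allocated until it is removed |
| with IPC_RMID.@refill |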
| |
| |
| |
| @node shmat, shmdt, shmget, Shared Memory |
| @subsection shmat |
| |
| @noindent |
| Maps a shared segment into the process' address space. |
| |
| @example |
| char *virt_addr; |
| virt_addr = shmat (int shmid, char *shmaddr, int shmflg); |
| @end example |
| |
| @itemize @bullet |
| @item |
| shmid : id got from call to shmget. |
| @item |
| shmaddr : requested attach address.@* |
| If shmaddr is 0 the system finds an unmapped region.@* |
| If a non-zero value is indicated the value must be page |
| aligned or the user must specify the SHM_RND flag.@refill |
| @item |
| shmflg :@* |
| SHM_RDONLY : request read-only attach.@* |
| SHM_RND : attach address is rounded DOWN to a multiple of SHMLBA. |
| @item |
| returns: virtual address of attached segment. -1 on failure. |
| @end itemize |
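| |
| @noindent |
| The rounding performed for SHM_RND can be illustrated with a little |
| arithmetic. This is only a sketch of the calculation, not the kernel |
| code, and the function name @code{shm_rnd_address} is hypothetical.@refill |
| |
```c
#include <stdint.h>
#include <sys/shm.h>

/* Address at which a SHM_RND attach would land: the nearest multiple
 * of SHMLBA at or below the requested address. */
uintptr_t shm_rnd_address(uintptr_t requested)
{
    uintptr_t lba = (uintptr_t) SHMLBA;

    return requested - (requested % lba);
}
```
| |
| An address that is already a multiple of SHMLBA is returned unchanged.@refill |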
| |
| When shmaddr is 0, the attach address is determined by finding an |
| unmapped region in the address range 1G to 1.5G, starting at 1.5G |
| and coming down from there. The algorithm is very simple so you |
| are encouraged to avoid non-specific attaches. |
| |
| @noindent |
| Algorithm: |
| @display |
| Determine attach address as described above. |
| Check region (shmaddr, shmaddr + size) is not mapped and allocate |
| page tables (undocumented SHM_REMAP flag!). |
| Map the region by setting up pointers into the internal page table. |
| Add a descriptor for the attach to the task struct for the process. |
| @code{shm_nattch}, @code{shm_lpid}, @code{shm_atime} are updated. |
| @end display |
| |
| @noindent |
| Notes: |
| @itemize @bullet |
| @item |
| The @code{brk} value is not altered. |
| @item |
| The segment is automatically detached when the process exits. |
| @item |
| The same segment may be attached as read-only or read-write and |
| more than once in the process' address space. |
| @item |
| A shmat can succeed on a segment marked for destruction. |
| @item |
| The request for a particular type of attach is made using the SHM_RDONLY flag. |
| There is no notion of a write-only attach. The requested attach |
| permissions must fall within those allowed by @code{shm_perm.mode}. |
| @end itemize |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EACCES : Do not have permission for requested access.@* |
| @noindent |
| EINVAL : shmid < 0 or unused, shmaddr not aligned, attach at brk failed.@* |
| @noindent |
| EIDRM : resource was removed.@* |
| @noindent |
| ENOMEM : Could not allocate memory for descriptor or page tables. |
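| |
| @noindent |
| As an illustration of the notes above, this sketch attaches one segment |
| twice in the same process, once read-write and once with SHM_RDONLY, |
| and lets the system pick both addresses (shmaddr of 0). The function |
| name @code{attach_twice} and its parameters are hypothetical. The caller |
| is assumed to pass a message shorter than the segment.@refill |
| |
```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Write msg through a read-write attach and read it back through a
 * second, read-only attach of the same segment.
 * Returns 0 on success, -1 on failure. */
int attach_twice(int shmid, const char *msg, char *out, size_t outlen)
{
    char *rw;
    const char *ro;

    if (outlen == 0)
        return -1;

    rw = shmat(shmid, NULL, 0);              /* read-write attach */
    if (rw == (char *) -1)
        return -1;

    ro = shmat(shmid, NULL, SHM_RDONLY);     /* read-only attach */
    if (ro == (const char *) -1) {
        shmdt(rw);
        return -1;
    }

    strcpy(rw, msg);                         /* visible through both mappings */
    strncpy(out, ro, outlen - 1);
    out[outlen - 1] = '\0';

    shmdt(ro);
    shmdt(rw);
    return 0;
}
```
| |
| A write through the read-only address would fault, since there is no |
| way to upgrade an attach after the fact.@refill |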
| |
| |
| @node shmdt, shmctl, shmat, Shared Memory |
| @subsection shmdt |
| |
| @example |
| int shmdt (char *shmaddr); |
| @end example |
| |
| @itemize @bullet |
| @item |
| shmaddr : attach address of segment (returned by shmat). |
| @item |
| returns : 0 on success. -1 on failure. |
| @end itemize |
| |
| An attached segment is detached and @code{shm_nattch} decremented. The |
| occupied region in user space is unmapped. The segment is destroyed |
| if it is marked for destruction and @code{shm_nattch} is 0. |
| @code{shm_lpid} and @code{shm_dtime} are updated.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EINVAL : No shared memory segment attached at shmaddr. |
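| |
| @noindent |
| A small wrapper makes the EINVAL case easy to observe. This is a sketch |
| only, and the name @code{detach_segment} is hypothetical.@refill |
| |
```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Detach whatever is attached at addr.  Returns 0 on success, or the
 * errno value reported by shmdt (EINVAL if nothing is attached there). */
int detach_segment(const void *addr)
{
    if (shmdt(addr) == -1)
        return errno;
    return 0;
}
```
| |
| Detaching the same address a second time is expected to return EINVAL, |
| because the first call has already unmapped the region.@refill |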
| |
| |
| @node shmctl, shmlimits, shmdt, Shared Memory |
| @subsection shmctl |
| |
| @noindent |
| Destroys allocated segments and reads or writes their control structures. |
| |
| @example |
| int shmctl (int shmid, int cmd, struct shmid_ds *buf); |
| @end example |
| |
| @itemize @bullet |
| @item |
| shmid : id got from call to shmget. |
| @item |
| cmd : IPC_STAT, IPC_SET, IPC_RMID (@pxref{syscalls}). |
| @itemize @asis |
| @item |
| IPC_SET : Used to set the owner uid, gid, and @code{shm_perm.mode} field. |
| @item |
| IPC_RMID : The segment is marked destroyed. It is only destroyed |
| on the last detach.@refill |
| @item |
| IPC_STAT : The shmid_ds structure is copied into the user allocated buffer. |
| @end itemize |
| @item |
| buf : used to read (IPC_STAT) or write (IPC_SET) information. |
| @item |
| returns : 0 on success, -1 on failure. |
| @end itemize |
| |
| The user must execute an IPC_RMID shmctl call to free the memory |
| allocated by the shared segment. Otherwise all the pages faulted in |
| will continue to live in memory or swap.@refill |
| |
| @noindent |
| Errors:@* |
| @noindent |
| EACCES : Do not have permission for requested access.@* |
| @noindent |
| EFAULT : buf is not accessible.@* |
| @noindent |
| EINVAL : shmid < 0 or unused.@* |
| @noindent |
| EIDRM : identifier destroyed.@* |
| @noindent |
| EPERM : not creator, owner or super-user (IPC_SET, IPC_RMID). |
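| |
| @noindent |
| The IPC_STAT and IPC_RMID commands might be wrapped as in the sketch |
| below. The helper names are hypothetical. The attach count is copied |
| out as an unsigned long, which is an assumption about the width of |
| @code{shm_nattch} on the host system.@refill |
| |
```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Read the current number of attaches into *nattch using IPC_STAT.
 * Returns 0 on success, -1 on failure. */
int segment_attach_count(int shmid, unsigned long *nattch)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) == -1)
        return -1;
    *nattch = (unsigned long) ds.shm_nattch;
    return 0;
}

/* Mark the segment destroyed; the memory is released on the last detach.
 * buf is not used by IPC_RMID, so a null pointer is passed. */
int remove_segment(int shmid)
{
    return shmctl(shmid, IPC_RMID, (struct shmid_ds *) 0);
}
```
| |
| A process that is still attached when @code{remove_segment} is called can |
| keep using the memory until it detaches.@refill |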
| |
| |
| @node shmlimits, Notes, shmctl, Shared Memory |
| @subsection Limits on Shared Memory Resources |
| |
| @noindent |
| Limits: |
| @itemize @bullet |
| @item |
| SHMMNI max num of shared segments system wide ... 4096. |
| @item |
| SHMMAX max shared memory segment size (bytes) ... 4M |
| @item |
| SHMMIN min shared memory segment size (bytes). |
| 1 byte (though PAGE_SIZE is the effective minimum size).@refill |
| @item |
| SHMALL max shared mem system wide (in pages) ... policy. |
| @item |
| SHMLBA segment low boundary address multiple. |
| Must be page aligned. SHMLBA = PAGE_SIZE.@refill |
| @end itemize |
| @noindent |
| Unused or unimplemented:@* |
| SHMSEG : maximum number of shared segments per process. |
| |
| |
| |
| @node Notes, Top, shmlimits, Top |
| @section Miscellaneous Notes |
| |
| The system calls are mapped into one -- @code{sys_ipc}. This should be |
| transparent to the user.@refill |
| |
| @subsection Semaphore @code{undo} requests |
| |
| There is one sem_undo structure associated with a process for |
| each semaphore which was altered (with an undo request) by the process. |
| @code{sem_undo} structures are freed only when the process exits. |
| |
| One major cause for unhappiness with the undo mechanism is that |
| it does not fit in with the notion of having an atomic set of |
| operations on an array. The undo requests for an array and each |
| semaphore therein may have been accumulated over many @code{semop} |
| calls. Thus use the undo mechanism with private semaphores only.@refill |
| |
| Should the process sleep in @code{exit} or should all undo |
| operations be applied with the IPC_NOWAIT flag in effect? |
| Currently those undo operations which go through immediately are |
| applied and those that require a wait are ignored silently.@refill |
| |
| @subsection Shared memory, @code{malloc} and the @code{brk}. |
| Note that since this section was written the implementation was |
| changed so that non-specific attaches are done in the region |
| 1G - 1.5G. However much of the following is still worth thinking |
| about so I left it in. |
| |
| On many systems, the shared memory is allocated in a special region |
| of the address space ... way up somewhere. As mentioned earlier, |
| this implementation attaches shared segments at the lowest possible |
| address. Thus if you plan to use @code{malloc}, it is wise to malloc a |
| large space and then proceed to attach the shared segments. This way |
| malloc sets the brk sufficiently above the region it will use.@refill |
| |
| Alternatively you can use @code{sbrk} to adjust the @code{brk} value |
| as you make shared memory attaches. The implementation is not very |
| smart about selecting attach addresses. Using the system default |
| addresses will result in fragmentation if detaches do not occur |
| in the reverse order of the attaches.@refill |
| |
| Taking control of the matter is probably best. The rule applied |
| is that attaches are allowed in unmapped regions other than |
| in the text space (see <a.out.h>). Also remember that attach addresses |
| and segment sizes are multiples of PAGE_SIZE.@refill |
| |
| One more trap (I quote Bruno on this). If you use malloc() to get space |
| for your shared memory (i.e. to fix the @code{brk}), you must ensure you |
| get an unmapped address range. This means you must malloc more memory |
| than you have ever allocated before. Memory returned by malloc(), used, |
| then freed by free() and then returned again by malloc() is no good. |
| Neither is memory obtained from calloc().@refill |
| |
| Note that a shared memory region remains a shared memory region until |
| you unmap it. Attaching a segment at the @code{brk} and calling malloc |
| after that will result in an overlap of what malloc thinks is its |
| space with what is really a shared memory region. For example in the case |
| of a read-only attach, you will not be able to write to the overlapped |
| portion.@refill |
| |
| |
| @subsection Fork, exec and exit |
| |
| On a fork, the child inherits attached shared memory segments but |
| not the semaphore undo information.@refill |
| |
| In the case of an exec, the attached shared segments are detached. |
| The sem undo information however remains intact.@refill |
| |
| Upon exit, all attached shared memory segments are detached. |
| The adjust values in the undo structures are added to the relevant semvals |
| if the operations are permitted. Disallowed operations are ignored.@refill |
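| |
| @noindent |
| The inheritance of attached segments across a fork can be sketched as |
| follows. The function name @code{child_writes_parent_reads} is |
| hypothetical. The segment is marked for destruction right after the |
| attach, relying on the rule that it is only destroyed on the last detach.@refill |
| |
```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child stores `value` in a shared int inherited across fork;
 * the parent waits and reads it back.  Returns the value read by the
 * parent, or -1 on failure. */
int child_writes_parent_reads(int value)
{
    int shmid, status, result;
    int *shared;
    pid_t pid;

    shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid == -1)
        return -1;

    shared = shmat(shmid, NULL, 0);
    /* Mark destroyed now; the memory lives until the last detach. */
    shmctl(shmid, IPC_RMID, NULL);
    if (shared == (int *) -1)
        return -1;

    *shared = 0;
    pid = fork();
    if (pid == -1) {
        shmdt(shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;        /* child writes through the inherited attach */
        _exit(0);               /* exit detaches the child's attach */
    }

    if (waitpid(pid, &status, 0) == -1) {
        shmdt(shared);
        return -1;
    }
    result = *shared;           /* parent sees the child's write */
    shmdt(shared);              /* last detach: segment is destroyed */
    return result;
}
```
| |
| No explicit @code{shmdt} is needed in the child, because all attached |
| segments are detached when it exits.@refill |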
| |
| |
| @subsection Other Features |
| |
| These features of the current implementation are |
| likely to be modified in the future. |
| |
| The SHM_LOCK and SHM_UNLOCK commands are available (super-user only) for |
| use with the @code{shmctl} call to prevent swapping of a shared segment. |
| The user must fault in any pages that are required to be present after |
| locking is enabled. |
| |
| The IPC_INFO, MSG_STAT, MSG_INFO, SHM_STAT, SHM_INFO, SEM_STAT and SEM_INFO |
| @code{ctl} calls are used by the @code{ipcs} program to provide information |
| on allocated resources. These can be modified as needed or moved to a proc |
| file system interface. |
| |
| |
| @sp 3 |
| Thanks to Ove Ewerlid, Bruno Haible, Ulrich Pegelow and Linus Torvalds |
| for ideas, tutorials, bug reports and fixes, and merriment. And more |
| thanks to Bruno. |
| |
| |
| @contents |
| @bye |
| |