diff --git a/hw5-doc/README.md b/hw5-doc/README.md new file mode 100644 index 0000000..3d3516c --- /dev/null +++ b/hw5-doc/README.md @@ -0,0 +1,593 @@ +# Homework 5 - CSE 320 - Spring 2022 +#### Professor Eugene Stark + +### **Due Date: Friday 5/6/2022 @ 11:59pm** + +## Introduction + +The goal of this assignment is to become familiar with low-level POSIX +threads, multi-threading safety, concurrency guarantees, and networking. +The overall objective is to implement a server that simulates the behavior +of a Private Branch Exchange (PBX) telephone system. +As you will probably find this somewhat difficult, to grease the way +I have provided you with a design for the server, as well as binary object +files for almost all the modules. This means that you can build a +functioning server without initially facing too much complexity. +In each step of the assignment, you will replace one of my +binary modules with one built from your own source code. If you succeed +in replacing all of my modules, you will have completed your own +version of the server. + +For this assignment, there are four modules to work on: + + * Server initialization (`main.c`) + * Server module (`server.c`) + * PBX module (`pbx.c`) + * TU (telephone unit) module (`tu.c`) + +It is probably best if you work on the modules in the order listed. +You should turn in whatever modules you have worked on. +Though the exact details are not set at this time, I expect that your +code will be compiled and tested in the following configurations: + + 1. Blackbox tests using your `main.c` and your `server.c` (if implemented, + otherwise my `server.o`), and my `pbx.o` and `tu.o`. + 2. Blackbox tests using the modules you implemented, with mine for the rest. + 3. Unit tests on your `pbx.c` in isolation. + 4. Unit tests on your `tu.c` in isolation. + +For each configuration, you will receive points for whatever test cases in +that configuration are passed. If you are able to achieve some reasonable +functionality in a module, it will probably be beneficial to submit it. +It is probably not a good strategy to submit modules that are completely +broken. +It is definitely not a good strategy to submit code that does not compile, +as this might prevent you from getting any points at all! + +### Takeaways + +After completing this homework, you should: + +* Have a basic understanding of socket programming +* Understand thread execution, mutexes, and semaphores +* Have an understanding of POSIX threads +* Have some insight into the design of concurrent data structures +* Have enhanced your C programming abilities + +## Hints and Tips + +* We strongly recommend you check the return codes of all system + calls. This will help you catch errors. + +* **BEAT UP YOUR OWN CODE!** Exercise your modules with rapid and + concurrent calls to ensure that they do not crash or deadlock. + Ideally, besides basic sequential tests, you would write multi-threaded + test drivers to achieve this. We will use tests of this nature in grading. + +* Your code should **NEVER** crash. We will be deducting points for + each time your program crashes during grading. Make sure your code + handles invalid usage gracefully. + +* You should make use of the macros in `debug.h`. In non-debugging, + "production" use, your code should basically be silent, except perhaps + to emit a one-line error message just before terminating in case a fatal + error situation requires an abort. + **FOLLOW THIS GUIDELINE!** `make debug` is your friend. + +> :scream: **DO NOT** modify any of the header files provided to you in the base code. +> These have to remain unmodified so that the modules can interoperate correctly. +> We will replace these header files with the original versions during grading. +> You are of course welcome to create your own header files that contain anything +> you wish. + +## Helpful Resources + +### Textbook Readings + +You should make sure that you understand the material covered in +chapters **11.4** and **12** of **Computer Systems: A Programmer's +Perspective 3rd Edition** before starting this assignment. These +chapters cover networking and concurrency in great detail and will be +an invaluable resource for this assignment. + +### pthread Man Pages + +The pthread man pages can be easily accessed through your terminal. +However, [this opengroup.org site](http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread.h.html) +provides a list of all the available functions. The same list is also +available for [semaphores](http://pubs.opengroup.org/onlinepubs/7908799/xsh/semaphore.h.html). + +## Getting Started + +Fetch and merge the base code for `hw5` as described in `hw0`. You can +find it at this link: https://gitlab02.cs.stonybrook.edu/cse320/hw5. +Remember to use the `--stategy-option=theirs` flag for the `git merge` +command to avoid merge conflicts in the Gitlab CI file. + +### The Base Code + +Here is the structure of the base code: + +``` +. +├── demo +│ └── pbx +├── hw5.sublime-project +├── include +│ ├── debug.h +│ ├── pbx.h +│ ├── server.h +│ └── tu.h +├── lib +│ └── pbx.a +├── Makefile +├── src +│ ├── globals.c +│ ├── main.c +│ ├── pbx.c +│ ├── server.c +│ └── tu.c +└── tests + ├── basecode_tests.c + ├── script_tester.c + └── __test_includes.h + +``` + +The base code consists of header files that define module interfaces, +a library `pbx.a` containing binary object code for my +implementations of the modules, a source code file `globals.c` that +contains definitions of some global variables, and a source code file `main.c` +that contains a stub for function `main()`. The `Makefile` is +designed to compile any existing source code files and then link them +against the provided library. The result is that any modules for +which you provide source code will be included in the final +executable, but modules for which no source code is provided will be +pulled in from the library. The `pbx.a` library was compiled +without `-DDEBUG`, so it does not produce any debugging printout. +Also provided is a demonstration binary `demo/pbx`. This executable +is a complete implementation of the PBX server, which is intended to help +you understand the specifications from a behavioral point of view. +It also does not produce any debugging printout, however. + +Most of the detailed specifications for the various modules and functions +that you are to implement are provided in comments that precede stubs for +these functions in the various source files. In the interests of brevity and +avoiding redundancy, those specifications are not reproduced in this document. +Nevertheless, the information they contain is very important, and constitutes +the authoritative specification of what you are to implement. + +> :scream: The various functions and variables defined in the header files +> constitute the **entirety** of the interfaces between the modules in this program. +> Use these functions and variables as described and **do not** introduce any +> additional functions or global variables as "back door" communication paths +> between the modules. If you do, the modules you implement will not interoperate +> properly with my implementations, and it will also likely negatively impact +> our ability to test your code. + +The function stubs that appear in the various source files have been +commented out in the basecode. This is important, because when the linker +does not see any definitions for these functions, it will link in binaries +from the `pbx.a` library. If you choose to implement a function in one of the +modules, you must uncomment the stubs for all the other functions in that module, +otherwise the linker will still link the library version of that module, +which will result in "multiply defined" errors at link time. + +## The PBX Server: Overview + +"PBX" is a simple implementation of a server that simulates a PBX +telephone system. A PBX is a private telephone exchange that is used +within a business or other organization to allow calls to be placed +between telephone units (TUs) attached to the system, without having to +route those calls over the public telephone network. We will use the +familiar term "extension" to refer to one of the TUs attached to a PBX. + +The PBX system provides the following basic capabilities: + + * *Register* a TU as an extension in the system. + * *Unregister* a previously registered extension. + +Once a TU has been registered, the following operations are available +to perform on it: + + * *Pick up* the handset of a registered TU. If the TU was ringing, +then a connection is established with a calling TU. If the TU was not +ringing, then the user hears a dial tone over the receiver. + * *Hang up* the handset of a registered TU. Any call in progress is +disconnected. + * *Dial* an extension on a registered TU. If the dialed extension is +currently "on hook" (*i.e.* the telephone handset is on the switchhook), +then the dialed extension starts to ring, indicating the presence of an +incoming call, and a "ring back" notification is played over the receiver +of the calling extension. Otherwise, if the dialed extension is "off hook", +then a "busy signal" notification is played over the receiver of the +calling extension. + * *Chat* over the connection made when one TU has dialed an extension +and the called extension has picked up. + +The basic idea of these operations should be simple and familiar, +since I am sure that everyone has used a telephone system :wink:. +However, we will need a rather more detailed and complete specification +than just this simple overview. + +## The PBX Server: Details + +Our PBX system will be implemented as a multi-threaded network server. +When the server is started, a **master server** thread sets up a socket on +which to listen for connections from clients (*i.e.* the TUs). When a network +connection is accepted, a **client service thread** is started to handle requests +sent by the client over that connection. The client service thread registers +the client with the PBX system and is assigned an extension number. +The client service thread then executes a service loop in which it repeatedly +receives a **message** sent by the client, performs some operation determined +by the message, and sends one or more messages in response. +The server will also send messages to a client asynchronously as a result of actions +performed on behalf of other clients. +For example, if one client sends a "dial" message to dial another extension, +then if that extension is currently on-hook it will receive an asynchronous +"ring" message, indicating that the ringer is to be activated. + +Messages from a client to a server represent commands to be performed. +Except for messages containing "chat" sent from a connected TU, every message +from the server to a client will consist of a notification of the current state +of that client, as it is currently understood by the server. +Usually these messages will inform the client of a state change that has occurred +as a result of a command the client sent, or of an asynchronous state change +that has occurred as a result of a command sent by some other client. +If a command sent by a client does not result in any state change, then the +response sent by the server will simply report the unchanged state. + +> :nerd: One of the basic tenets of network programming is that a +> network connection can be broken at any time and the parties using +> such a connection must be able to handle this situation. In the +> present context, the client's connection to the PBX server may +> be broken at any time, either as a result of explicit action by the +> client or for other reasons. When disconnection of the client is +> noticed by the client service thread, the server acts as though the client +> had sent an explicit **hangup** command, the client is then unregistered from +> the PBX, and the client service thread terminates. + +The PBX system maintains the set of registered clients in the form of a mapping +from assigned extension numbers to clients. +It also maintains, for each registered client, information about the current +state of the TU for that client. The following are the possible states of a TU: + + * **On hook**: The TU handset is on the switchhook and the TU is idle. + * **Ringing**: The TU handset is on the switchhook and the TU ringer is +active, indicating the presence of an incoming call. + * **Dial tone**: The TU handset is off the switchhook and a dial tone is +being played over the TU receiver. + * **Ring back**: The TU handset is off the switchhook and a "ring back" +signal is being played over the TU receiver. + * **Busy signal**: The TU handset is off the switchhook and a "busy" +signal is being played over the TU receiver. + * **Connected**: The TU handset is off the switchhook and a connection has been +established between this TU and the TU at another extension. In this state +it is possible for users at the two TUs to "chat" with each other over the +connection. + * **Error**: The TU handset is off the switchhook and an "error" signal +is being played over the TU receiver. + +Transitions between TU states occur in response to messages received from the +associated client, and sometimes also asynchronously in conjunction with +state transitions of other TUs. +The list below specifies all the possible transitions between states that a TU +can perform. The arrival of any message other than those explicitly listed for +each particular state does not cause any transition between states to take place. + + * When in the **on hook** state: + + * A **pickup** message from the client will cause a transition to the **dial tone** state. + * An asynchronous transition to the **ringing** state is also possible from this state. + + * When in the **ringing** state: + + * A **pickup** message from the client will cause a transition to the **connected** state. + Simultaneously, the calling TU will also make an asynchronous transition from the + **ring back** state to the **connected** state. The PBX will establish a connection + between these two TUs, which we will refer to as *peers* as long as the connection + remains established. + * An asynchronous transition to the **on hook** state is also possible from this state. + This would occur if the calling TU hangs up before the call is answered. + + * When in the **dial tone** state: + + * A **hangup** message from the client will cause a transition to the **on hook** state. + * A **dial** message from the client will cause a transition to either the **error**, + **busy signal**, or **ring back** states, depending firstly on whether or not a TU + is currently registered at the dialed extension, and secondly, whether the TU at + the dialed extension is currently in the **on hook** state or in some other state. + If there is no TU registered at the dialed extension, the transition will be to the + **error** state. + If there is a TU registered at the dialed extension, then if that extension is + currently not in the **on hook** state, then the transition will be to the + **busy signal** state, otherwise the transition will be to the **ring back** state. + In the latter case, the TU at the dialed extension makes an asynchronous transition + to the **ringing** state, simultaneously with the transition of the calling TU to the + **ring back** state. + + * When in the **ring back** state: + + * A **hangup** message from the client will cause a transition to the **on hook** state. + Simultaneously, the called TU will make an asynchronous transition from the **ringing** + state to the **on hook** state. + * An asynchronous transition to the **connected** state (if the called TU picks up) + or to the **dial tone** state (if the called TU unregisters) is also possible from this state. + + * When in the **busy signal** state: + + * A **hangup** message from the client will cause a transition to the **on hook** state. + + * When in the **connected** state: + + * A **hangup** message from the client will cause a transition to the **on hook** state. + Simultaneously, the peer TU will make a transition from the **connected** state to + the **dial tone** state. + * An asynchronous transition to the **dial tone** state is also possible from this state. + This would occur if the peer TU were to hang up. + + * When in the **error** state: + + * A **hangup** message from the client will cause a transition to the **on hook** state. + +Messages are sent between the client and server in a text-based format, +in which each message consists of a single line of text, the end of which is +indicated by the two-byte line termination sequence `"\r\n"`. +There is no *a priori* limitation on the length of the line of text that is sent +in a message. +The possible messages that can be sent are listed below. +The initial keywords in each message are case-sensitive. + + * Commands from Client to Server + * **pickup** + * **hangup** + * **dial** #, where # is the number of the extension to be dialed. + * **chat** ...arbitrary text... + + * Responses from Server to Client + * **ON HOOK** #, where # reports the extension number of the client. + * **RINGING** + * **DIAL TONE** + * **RING BACK** + * **CONNECTED** #, where # is the number of the extension to which the + connection exists. + * **ERROR** + * **CHAT** ...arbitrary text... + +### Demonstration Server + +To help you understand what the PBX server is supposed to do, I have provided +a complete implementation for demonstration purposes. +Run it by typing the following command: + +``` +$ demo/pbx -p 3333 +``` + +You may replace `3333` by any port number 1024 or above (port numbers below 1024 +are generally reserved for use as "well-known" ports for particular services, +and require "root" privilege to be used). +The server should report that it has been initialized and is listening +on the specified port. +From another terminal window, use `telnet` to connect to the server as follows: + +``` +$ telnet localhost 3333 +Trying 127.0.0.1... +Connected to localhost. +Escape character is '^]'. +ON HOOK 4 +``` + +You can now issue commands to the server: + +``` +pickup +DIAL TONE +dial 5 +ERROR +hangup +ON HOOK 4 +``` + +If you make a second connection to the server from yet another terminal window, +you can make calls. + +Note that the server will silently ignore syntactically incorrect commands, +such as "dial" without an extension number. If commands are issued when the +TU at the server side is in an inappropriate state (for example, if "dial 5" +is sent when the TU is "on hook"), then a response will be sent by the server +that just repeats the state of the TU, which does not change. + +## Task I: Server Initialization + +When the base code is compiled and run, it will print out a message +saying that the server will not function until `main()` is +implemented. This is your first task. The `main()` function will +need to do the following things: + +- Obtain the port number to be used by the server from the command-line + arguments. The port number is to be supplied by the required option + `-p `. + +- Install a `SIGHUP` handler so that clean termination of the server can + be achieved by sending it a `SIGHUP`. Note that you need to use + `sigaction()` rather than `signal()`, as the behavior of the latter is + not well-defined in a multithreaded context. + +- Set up the server socket and enter a loop to accept connections + on this socket. For each connection, a thread should be started to + run function `pbx_client_service()`. + +These things should be relatively straightforward to accomplish, given the +information presented in class and in the textbook. If you do them properly, +the server should function and accept connections on the specified port, +and you should be able to connect to the server using the test client. + +## Task II: Server Module + +In this part of the assignment, you are to implement the server module, +which provides the function `pbx_client_service()` that is invoked +when a client connects to the server. You should implement this function +in the `src/server.c` file. + +The `pbx_client_service` function is invoked as the **thread function** +for a thread that is created (using ``pthread_create()``) to service a +client connection. +The argument is a pointer to the integer file descriptor to be used +to communicate with the client. Once this file descriptor has been +retrieved, the storage it occupied needs to be freed. +The thread must then become detached, so that it does not have to be +explicitly reaped, it must initialize a new TU with that file descriptor, +and it must register the TU with the PBX module under a particular +extension number. The demo program uses the file descriptor as the +extension number, but you may choose a different scheme if you wish. +Finally, the thread should enter a service loop in which it repeatedly +receives a message sent by the client, parses the message, and carries +out the specified command. +The actual work involved in carrying out the command is performed by calling +the functions provided by the PBX module. +These functions will also send the required response back to the client +(each syntactically correct command will elicit a single response that +contains the resulting state of the TU) +so the server module need not be directly concerned with that. + +## Task III: PBX Module + +The PBX module is the central module in the implementation of the server. +It provides the functions listed below, for which more detailed specifications +are given in the source file `pbx.c`. + + * `PBX *pbx_init()`: Initialize a new PBX. + * `void pbx_shutdown(PBX *pbx)`: Shut down a PBX. + * `int pbx_register(PBX *pbx, TU *tu, int ext)`: Register a TU with a PBX. + * `int pbx_unregister(PBX *pbx, TU *tu)`: Unregister a TU from a PBX. + * `int pbx_dial(PBX *pbx, TU *tu, int ext)`: Dial an extension. + +The PBX module will need to maintain a registry of connected clients and +manage the TU objects associated with these clients. +The PBX will need to be able to map each extension number to the associated TU object. +As the PBX object will be accessed concurrently by multiple threads, +it will need to provide appropriate synchronization (for example, using mutexes and/or +semaphores) to ensure correct and reliable operation. +Finally, the `pbx_shutdown()` function is required to shut down the network connections +to all registered clients (the `shutdown(2)` function can be used to shut down a socket +for reading, writing, or both, without closing the associated file descriptor) +and it is then required to wait for all the client service threads to unregister +the associated TUs before returning. Consider using a semaphore, possibly in conjunction +with additional bookkeeping variables, for this purpose. + +## Task IV: TU Module + +The TU module implements objects that simulate a telephone unit. +The functions provided by this module are shown below. More detailed specifications +can be found in the source file `tu.c`. + + * `TU *tu_init(int fd)`: Initialize a new TU with the file descriptor for a client. + * `void tu_ref(TU *tu, char *reason)`: Increase the reference count (see below) of a TU by one. + * `void tu_unref(TU *tu, char *reason)`: Decrease the reference count of a TU by one, + freeing the TU and associated resources if the reference count reaches zero. + * `int tu_fileno(TU *tu)`: Get the file descriptor for the network connection underlying a TU. + * `int tu_extension(TU *tu)`: Get the extension number for a TU. + * `int tu_set_extension(TU *tu, int ext)`: Set the extension number for a TU. + * `int tu_pickup(TU *tu)`: Take a TU receiver off-hook (i.e. pick up the handset). + * `int tu_hangup(TU *tu)`: Hang up a TU (i.e. replace the handset on the switchhook). + * `int tu_dial(TU *tu, int ext)`: Use a TU to originate a call to a specified extension. + * `int tu_chat(TU *tu, char *msg)`: "Chat" during a call to another TU. + +Each TU object will contain the file descriptor of an underlying network connection +to a client, as well as a representation of the current state of the TU. +It will need to use the file descriptor to issue responses to the client, +as well as any required asynchronous notifications, whenever any of the `tu_xxx` +functions is called. +Since the TU objects will be accessed concurrently by multiple threads, +the TU module will need to provide appropriate synchronization +Changes to the state of a TU will require exclusive access to the TU. +Some operations require simultaneous changes of state of two TUs; these will require +exclusive access to both TUs at the same time in order to ensure the simultaneity of +the state changes. Care must be taken to avoid deadlock when obtaining exclusive +access to two TUs at the same time. +Sending responses and notifications to the network client managed by a TU will also +require exclusive access to the TU, in order to ensure that messages sent by separate +threads are serialized over the network connection, rather than possibly intermingled. + +In order for TU objects to exist independently of a PBX object, yet still avoid the possibility +of having "dangling pointers" to TU objects that have unexpectedly been freed, the TU module +uses a *reference counting* scheme. The basic idea of reference counts is that they are +an integer field stored in an object that keeps track of the number of references +(*i.e.* pointers) that exist to that object. When a pointer to a TU object is copied, +the reference count on that object must be increased in order to account for the additional +pointer that now exists. When a pointer to a TU object is discarded, the reference count on +that object must be decreased in order to account for the pointer that no longer exists. +When the reference count on a TU object has been decremented to zero, that means there are +no longer any pointers by which that object can be accessed and therefore the object can +safely be freed. Incrementing and decrementing the reference count of a TU is performed +by calling the `tu_ref()` and `tu_unref()` functions. These functions take as an argument +the TU to be manipulated, but in addition they take a second argument `msg`. +This is for debugging purposes: when you increment or decrement a TU you should supply +as the `msg` argument a string giving the reason why the reference count is being changed. +If, when you implement the TU module, you have the `tu_ref()` and `tu_unref()` functions +announce when they are called, along with the reason why, it will make it easier to +debug reference count issues. + +## Test Exerciser + +The `tests` directory in the basecode contains a test driver with a few selected tests +to get you started. These tests are coded using Criterion as usual, but note that it +is important that when you run the tests you supply the `-j1` argument to `pbx_tests`. +This is because each test starts its own server instance and if you run them all +concurrently only one server instance will be able to bind the port number that is +being used and the other server instances will fail. + +The file `basecode_tests.c` contains the Criterion portion of the tests and the file +`script_tester.c` contains the test driver that they use. The file `__test_includes.h` +contains some macros and declarations that are shared between the two files. +The overall idea of the tests are that they are table-driven. Each test has a "script", +which defines an array of `TEST_STEP` structures. The `TEST_STEP` structure is +defined in `__test_includes.h`. Each step contains an `id`, which identifies a particular +TU that should execute the step, a `command` which is the command to be sent to the TU, +an `id_to_dial` field which specifies the extension to be dialed in case the `command` +is TU_DIAL, a `response` field, which specifies the TU state that it is expected will +be sent in the response from the server, and a `timeout` field, which can be used to +place a limit on how long the tester will wait for a response from the server. +The `timeout` values themselves have type `struct timeval` (see `man 2 setitimer` for +a definition). There are several predefined timeout values of various lengths defined +in `__test_includes.h`. The idea is that if the expected response from the server does +not arrive before the timeout expires, then the test script fails. A timeout value of +`ZERO_SEC` means the test driver will wait indefinitely for the response from the server. + +To use the full capabilities of the test driver is probably somewhat complicated, +since if you get multiple TUs sending commands in a concurrent fashion you have to +start worrying about asynchronous notifications from the server "crossing in front of" +the expected response to a command. But you can probably follow the basic pattern in +the scripts that I have provided to create other similar scripts that test the ability of +your server process other series of commands issued in rapid succession. + +## Debugging Multi-threaded Code with `gdb` + +The `gdb` debugger has features for debugging multi-threaded code. +In particular, it is aware of the presence of multiple threads and it provides +commands for you to switch the focus of debugging from one thread to another. +Use the `info threads` command to get a list of the existing threads. +Use the command `thread nnn` (replace `nnn` by the thread number) to switch +the focus of debugging to that thread. Once you have done, this, the `gdb` +commands such as `bt` that examine the stack will be executed with respect +to the selected thread. Threads can be stopped and started independently +using `gdb`, as well. + +## Submission Instructions + +Make sure your hw5 directory looks similarly to the way it did +initially and that your homework compiles (be sure to try compiling +both with and without "debug"). +Note that you should omit any source files for modules that you did not +complete, and that you might have some source and header files in addition +to those shown. You are also, of course, encouraged to create Criterion +tests for your code. + +It would definitely be a good idea to use `valgrind` to check your program +for memory and file descriptor leaks. Keeping track of allocated objects +and making sure to free them is potentially one of the more challenging aspects +of this assignment. + +To submit, run `git submit hw5`.