Functional Principles and Design Decisions for PRNGD.
=====================================================

PRNGD has been designed to act as a /dev/urandom replacement. It features
an EGD-compatible socket interface, so that it can be used instead of EGD,
which is a /dev/random replacement.
In the following I want to explain the design properties of PRNGD and the
strong and weak points that result from them.

- PRNGD shall always return random bytes:
  * EGD collects entropy into a pool by calling programs and reading their
    output.
  * Other processes read random bytes from EGD, emptying the pool, and EGD
    refills it by calling more programs. If the random bytes are read faster
    than EGD can refill, EGD will not return random bytes until the pool is
    refilled.
    This makes EGD unusable if you have a large number of processes requiring
    entropy (e.g. inetd-started processes like imap/pop daemons).
  * PRNGD uses a _pseudo_ random number generator to generate the random bytes.
    Thus it can never run out of output and will always return random bytes.
    On the other hand, the random bytes generated are not truly random
    (actually, those generated by EGD are not truly random either) and there
    is a risk that, by drawing lots of random bytes from the daemon, an
    attacker might guess the contents of the random pool and break your keys.
  * This potential risk cannot be avoided; it is present by design, as only
    a _pseudo_ random number generator can avoid the problem of running out
    of entropy. /dev/urandom faces the same problem.
    Only a hardware RNG using thermal noise or radioactive decay can generate
    truly random bytes.
    See below on how PRNGD tries to minimize this risk.
- PRNGD should be low on resource usage:
  * EGD is written in Perl and hence allows easy porting, but it forces a
    Perl interpreter to be running. In my experience this eats up
    resources.
  * PRNGD is written in C. Most activities are performed using system or
    library calls, avoiding the spawning of external processes where
    possible. It will never spawn more than one process at a time.
    On a 1996 HP-UX box, PRNGD tends to eat around 10-20 minutes of CPU
    time per month, depending on the amount of entropy requested (and hence
    the number of operations to be performed inside the PRNG). The memory
    footprint is around 100K.
- PRNGD should be robust against system malfunctions:
  * EGD sometimes tends to run out of entropy and does not refill.
    I don't speak Perl, so I am not completely sure, but it seems that EGD
    records "failure" of started gathering processes and does not use them
    any longer. If the system runs out of memory or out of processes, no
    gatherers can be started and all gatherers are disabled.
    (This is just my theory; as stated above, I don't speak Perl and don't
    have a clue how to debug EGD.)
  * PRNGD will always use the same gatherer processes, regardless of whether
    they fail at any time or not. This way a transient resource shortage is
    simply ignored and PRNGD will continue to work.
  * In case any gatherer fails, PRNGD will "kill -9" it after some time so as
    not to leave any processes hanging around.
- PRNGD should provide good random bytes:
  * There is an excellent paper by Peter Gutmann:
    http://www.cryptoengines.com/~peter/06_random.pdf
    Read it!
  * On startup, PRNGD tries to seed its internal pool as well as possible
    by reading back its saved entropy state and calling all gatherers.
    (The entropy state is saved at shutdown time by retrieving random bytes
    from the PRNG, so that it does not reveal information about the internal
    state bits. It is fed back as coming from an untrusted source, like any
    other input.) You can also run without a seed file; PRNGD will then call
    all available gatherers until it has enough entropy available. On my 1996
    HP-UX box this only takes one or two seconds. Reading back the seed would
    be safe even if its contents were known, because it is only used to
    initialize the PRNG and entropy gathering is started immediately
    afterwards.
    (So: why throw away the old seed? It doesn't hurt to read it back.)
  * Whenever entropy is requested, PRNGD will completely mix the pool, retrieve
    the random bits, then remix, thus yielding 2 properties:
    + All bits retrieved depend on _all_ bits in the pool!
    + If the pool is later accessed by any means (poking in memory etc.), it
      has already been remixed, so that one cannot get information about the
      state of the pool at the time the entropy was retrieved.
  * An entropy count is maintained that is increased when entropy is added
    and decreased when random bytes are retrieved. As soon as the entropy
    count drops below a given threshold (default: 8192 bits), external
    gatherer processes are called continuously to add new entropy.
    This is what EGD does, too. I am not sure how good the entropy obtained
    this way really is (ever run "tail /var/adm/syslog/syslog.log" 10
    times in a row?), but as it is always mixed into the large pool it is
    better than nothing...
  * PRNGD uses the following seeding:
    + Quite often (by default around every 17 seconds), a seed_stat()
      is performed by stat()ing a file or directory like /etc/passwd,
      /tmp, ... which is changed or accessed very frequently. This will
      only give a very small number of bits every time, but every bit
      helps :-)
    + Less frequently (by default around every 49 seconds), an external
      gathering process is spawned (similar to what EGD does, but in the case
      of PRNGD the frequency is not related to the retrieval of random bytes).
      The output of the process is mixed into the pool.
    + The exact schedule is not fixed, but depends on the intervals given
      above (default 17 and 49 seconds). When PRNGD is idle, a seed_stat()
      is performed after the shorter interval (here 17 seconds). The external
      gatherer is started if more than 49 seconds have passed since the
      last gatherer was started. Since 49 is not a multiple of 17, the
      external gatherer is not spawned exactly every 49 seconds, but
      with some uncertainty.
      To further increase this uncertainty, the decision is made after
      the select() call, which is also triggered by external processes
      communicating with PRNGD...
    + Whenever the call to a gatherer process is finished, additional bits are
      mixed into the pool by "internal seeding". Internal seeding uses
      cheap system calls: times(), gettimeofday(), getpid(), and getrusage()
      where available. None of these calls provides much entropy (on an
      NTP-synchronized host only some microsecond values are uncertain,
      depending on granularity etc., and the resource usage will be quite
      static), but every bit helps...
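
The seeding schedule, seed_stat() and internal seeding described above can
be sketched in C roughly as follows. This is only a minimal illustration
under stated assumptions, not PRNGD's actual source. The names seed_decide(),
seed_internal(), struct seed_sched, struct internal_seed, mix_fn and
count_mix() are hypothetical; only the name seed_stat() and the defaults
(17 seconds, 49 seconds, 8192 bits) come from the text, and the body of
seed_stat() is assumed. The mix callback stands in for the pool mixing,
whose details are not given here. The sketch also assumes that all four
system calls exist on the target platform and that a seed_stat() and a
gatherer start may be decided in the same pass.

```c
#define _XOPEN_SOURCE 600   /* expose POSIX/XSI declarations in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/times.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Defaults quoted in the text above. */
#define SEED_STAT_INTERVAL 17    /* seconds between seed_stat() calls */
#define SEED_EXT_INTERVAL  49    /* seconds between external gatherers */
#define ENTROPY_THRESHOLD  8192  /* bits; below this, gather continuously */

/* Hypothetical action flags returned by seed_decide(). */
enum { SEED_DO_STAT = 1, SEED_DO_EXTERNAL = 2 };

/* Hypothetical bookkeeping: when each kind of seeding last ran. */
struct seed_sched {
    time_t last_stat;
    time_t last_external;
};

/* Hypothetical stand-in for "mix these bytes into the pool". */
typedef void (*mix_fn)(const void *data, size_t len, void *ctx);

/* Toy mixer for demonstration only: counts bytes instead of mixing them. */
void count_mix(const void *data, size_t len, void *ctx)
{
    assert(data != NULL && ctx != NULL);
    *(size_t *)ctx += len;
}

/*
 * Intended to be called after select() returns, whether it timed out or was
 * woken by a client. Returns a bit mask of SEED_DO_* actions and records the
 * time of each action it schedules.
 */
int seed_decide(struct seed_sched *s, time_t now, long entropy_bits,
                int gatherer_running)
{
    int actions = 0;

    if (now - s->last_stat >= SEED_STAT_INTERVAL) {
        s->last_stat = now;
        actions |= SEED_DO_STAT;
    }
    /* Never more than one gatherer at a time. */
    if (!gatherer_running &&
        (entropy_bits < ENTROPY_THRESHOLD ||
         now - s->last_external > SEED_EXT_INTERVAL)) {
        s->last_external = now;
        actions |= SEED_DO_EXTERNAL;
    }
    return actions;
}

/* stat() a frequently changing path and hand the result to the mixer. */
int seed_stat(const char *path, mix_fn mix, void *ctx)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;              /* e.g. path missing: nothing is mixed in */
    mix(&st, sizeof st, ctx);
    return 0;
}

/* Values gathered by the cheap system calls named in the text. */
struct internal_seed {
    struct tms tms_buf;
    clock_t ticks;
    struct timeval tv;
    pid_t pid;
    struct rusage ru;
};

/* "Internal seeding": collect the cheap values and mix them in one go. */
void seed_internal(mix_fn mix, void *ctx)
{
    struct internal_seed s;

    memset(&s, 0, sizeof s);    /* failed calls simply leave zeros behind */
    s.ticks = times(&s.tms_buf);
    (void)gettimeofday(&s.tv, NULL);
    s.pid = getpid();
    (void)getrusage(RUSAGE_SELF, &s.ru);
    mix(&s, sizeof s, ctx);
}
```

In this sketch, an idle daemon that starts at time 0 with a full pool and
wakes up exactly every 17 seconds would start the external gatherer at 51
and 102 seconds rather than at 49 and 98, which is one way to picture the
"uncertainty" mentioned above. Wake-ups caused by clients would shift these
times further.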
