0000859: Add posix_random family of interfaces

ID	Project	Category	View Status	Date Submitted	Last Update

0000859	1003.1(2013)/Issue7+TC1	System Interfaces	public	2014-07-18 13:59	2016-07-14 15:13

Reporter	tedu	Assigned To
Priority	normal	Severity	Comment	Type	Enhancement Request
Status	Closed	Resolution	Withdrawn

Name	Ted Unangst
Organization	OpenBSD
User Reference
Section	posix_random
Page Number	0
Line Number	0
Interp Status	---
Final Accepted Text


Summary	0000859: Add posix_random family of interfaces
Description	Cryptographic software requires a source of unpredictable (pseudo) random numbers. Various nonstandard system interfaces exist for this purpose, but they have subtle differences in behavior between platforms. Attempts to build reliable portable software often fail due to this variation. The interfaces below are adapted from the arc4random family of interfaces first introduced in OpenBSD in 1996 and from experience gained since then. They have been renamed to be more standard-like, although the OpenBSD project would not object to standardizing the existing names.
Desired Action	SYNOPSIS
#include <stdlib.h>
uint32_t posix_random(void);
void posix_random_buffer(void *buf, size_t nbytes);
uint32_t posix_random_uniform(uint32_t upper_bound);
DESCRIPTION
This family of functions provides higher quality data than those described in rand(3), random(3), and drand48(3). The generated numbers must be unpredictable. In particular, the sequence must not be shared between processes after fork().
The arc4random() function returns a single 32-bit value. arc4random_buf() fills the region buf of length nbytes with random data. arc4random_uniform() will return a single 32-bit value, uniformly distributed but less than upper_bound. This is recommended over constructions like "arc4random() % upper_bound" as it avoids "modulo bias" when the upper bound is not a power of two.
All of these functions are thread safe.
RETURN VALUES
These functions are always successful, and no return value is reserved to indicate an error.
RATIONALE
The standard does not specify a required algorithm, leaving implementations some flexibility so long as they meet the interface requirements. No mechanism is provided to seed or reseed these functions, since requiring applications to do so would place an unnecessary burden on application developers. The implementation is responsible for ensuring correct operation at all times.
Tags	No tags attached.

tedu 2014-07-18 14:04 reporter bugnote:0002314	oops, I copied the man page too literally. The description should more properly read: The posix_random() function returns a single 32-bit value. posix_random_buffer() fills the region buf of length nbytes with random data. posix_random_uniform() will return a single 32-bit value, uniformly distributed but less than upper_bound. This is recommended over constructions like "posix_random() % upper_bound" as it avoids "modulo bias" when the upper bound is not a power of two.
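
To illustrate the "modulo bias" point above, here is a minimal sketch of a bounded draw by rejection sampling. It is not the text of any implementation: rand32_fn, counter_source() and uniform_below() are hypothetical names, counter_source() is a deterministic stand-in rather than a random source, and returning 0 for an upper bound of 0 or 1 is an assumption made for the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit source such as posix_random(). */
typedef uint32_t (*rand32_fn)(void);

/* Deterministic demo source (NOT random): returns 0, 1, 2, ... */
uint32_t counter_source(void)
{
    static uint32_t n = 0;
    return n++;
}

/* Return a value in [0, upper_bound) with no modulo bias. */
uint32_t uniform_below(rand32_fn next, uint32_t upper_bound)
{
    uint32_t min, r;

    if (upper_bound < 2)
        return 0;   /* assumption of this sketch: bounds 0 and 1 yield 0 */

    /* min = 2^32 mod upper_bound, computed with unsigned wraparound. */
    min = (uint32_t)(0u - upper_bound) % upper_bound;

    /* Reject the lowest `min` values so that the number of accepted
     * values is an exact multiple of upper_bound. */
    do {
        r = next();
    } while (r < min);

    return r % upper_bound;
}
```

With a plain "r % upper_bound", the 2^32 mod upper_bound smallest residues would each be produced one extra time; discarding that many raw values removes the surplus at the cost of an occasional retry.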

mdempsky 2014-07-18 17:35 reporter bugnote:0002317	I'm very supportive of this if it can be accepted. A couple notes though: 1. I think for most other posix_FOO() functions, there was often an existing FOO() function very similar in design/purpose. I'm a little concerned that naming these just posix_random(), etc. might lead people to think they're just a new API for the (non-cryptographically secure) rand() and/or random() functions. So it might be worth considering some alternative names like "posix_saferandom()", "posix_cryptrandom()", "posix_securerandom()", etc. (For reference, Windows provides a function "CryptGenRandom", and Java provides a "SecureRandom" class. In the cryptographic community the adjective "safe" is getting some use such as in safecurves.cr.yp.to.) 2. POSIX doesn't seem to use the term "quality" in describing the other RNG APIs, so it would seem a bit out of place to refer to it here. It might be better to directly address the requirement that the API provide random bytes suitable for cryptographic use. 2b. It's theoretically of use to quantify the security level somehow (e.g., Salsa20 makes some specific claims about conjectured attack costs in http://cr.yp.to/snuffle/security.pdf), but in the interest of simplicity, I'm inclined to leave that as a quality-of-implementation issue. Perhaps just require implementations to document their choice of RNG and the conjectured security levels? 3. This API doesn't explain that it uses a "sequence" of random values like the rand(), srand() or drand48() functions do, so saying "In particular, the sequence must not be shared between processes after fork()." is a bit out of place too I think. Either we should clarify that these APIs also use a sequence, or we should try to revise the wording to simply say that each invocation needs to generate new independent values, even after a fork(). 
Not sure how to word that off hand though; I keep wanting to say "unique", but that might be construed as implying that values aren't reused. I'll look around for some alternative wording options. 4. If this is added to <stdlib.h>, it needs to be shaded CX, since it's an extension to an ISO C header.

mdempsky 2014-07-18 17:51 reporter bugnote:0002318	Oh a few more thoughts: 5. I agree that the functions should never return if they fail, but I fear asserting they're "always successful" might be dubious. We might want to instead specify something like if they fail, they should terminate abnormally as if by abort()? 6. Can/should the functions block to ensure sufficient seed entropy is available? If so, do we need to worry about thread cancellation and/or signal handling? I'm inclined to say they should block (i.e., the output should always be "safe"), signals received while blocking should not cause it to fail (i.e., not abort), and could go either way on thread cancellation.

tedu 2014-07-18 18:22 reporter bugnote:0002319	For additional reference, there is also a rand_s function in Windows. It returns an error, but it appears that the only defined error is passing a null pointer. http://msdn.microsoft.com/en-us/library/sxtz2fa8.aspx (For the random_buffer() function described above, passing NULL or any other invalid pointer would result in a segfault; it does not perform such parameter checking.)

dalias 2014-07-21 02:59 reporter bugnote:0002320	I strongly object to the addition of any interface that terminates the calling process as a consequence of failure. The only safe way to use such an interface is to fork each time you want to use it, and this is prohibitively slow and error-prone (e.g. you also need to exec if the original process may be multi-threaded, since calling AS-unsafe functions in the child after fork is UB for a multi-threaded parent) to the point of making the whole interface useless. Instead, the interface just needs a proper way of reporting failure to the caller, and should be designed to discourage misuse, i.e. it should not require you to do awkward things like set errno to 0 before calling then check errno after return. The functions as proposed clearly have no proper way to report failure, so I think they should be rejected in their current form. posix_random_buffer could easily be fixed by changing the return type from void to int, and having it return -1 on failure and 0 on success. For the others, either the return value would need to be returned via a pointer to the storage (but this makes posix_random essentially a duplicate of posix_random_buffer with an implicit second argument of 4), or they could take a pointer to a location to store a failure flag (and not modify the pointed-to value on success). I like this latter design because it allows you to use them in complex expressions and loops without checking for failure on each call, but gives you an easy way to check for error at the end of the whole computation (and then discard the result) much like what can be done with floating point exception flags. However I also fear that naive users may just throw away the flag.
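
The failure-flag design described above could look roughly like the following sketch. The name random32_flagged() and the use of a stdio stream as the byte source are assumptions made only so the example is self-contained; they are not part of the proposal.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: return 32 bits read from `src`.
 * On failure, set *failed to 1 and return 0.
 * On success, *failed is deliberately left untouched, so one flag can
 * accumulate failures across many calls. */
uint32_t random32_flagged(FILE *src, int *failed)
{
    uint32_t v = 0;

    if (src == NULL || fread(&v, sizeof v, 1, src) != 1) {
        *failed = 1;   /* sticky: only ever set here, never cleared */
        return 0;
    }
    return v;
}
```

A caller would set the flag to 0 once, use the function freely inside expressions or loops, and check the flag a single time at the end, discarding the whole result if it is set.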

mdempsky 2014-07-21 04:07 reporter bugnote:0002321	In practice, I don't expect these APIs to normally call abort(). I was just suggesting that in extreme situations, they should fail loudly/catastrophically rather than silently. E.g., on most implementations, passing a null pointer to memcpy() will generally generate a SIGSEGV, which defaults to killing the process. Also, on systems that use memory overcommit, calling a function that needs dynamic binding might hit an out-of-memory condition trying to update the PLT entry and cause the process to die. Those are the sorts of conditions I had in mind for mentioning abort(). However, looking around more, I see functions like getpid() and getuid() do simply specify they always succeed, even if existing implementations provide mechanisms (e.g., ptrace, systrace, seccomp-bpf) that could cause them to fail. I think there's benefit to keeping the API simple and failure-free, if possible. So my preference would be to specify "These functions are always successful" if the abort() wording is contentious (and in retrospect it seems superfluous anyway). My second preference would be to mark the interfaces as optional (i.e., under an option group) so they don't need to be provided at all by implementations that can't meet the "always successful" constraint. Put differently: what sort of error conditions might we want to specify for these functions?

dalias 2014-07-21 04:41 reporter bugnote:0002322	Passing a null pointer to memcpy is a completely different situation: by invoking undefined behavior, you give the implementation license to abort (or do whatever else it pleases). As for the situations cited with overcommit and OOM during lazy binding, I consider these at least extremely low-quality, and in my opinion non-conforming, implementations. For the proposed posix_random interfaces, the obvious error condition is "insufficient entropy available to produce non-predictable output". This would normally have an underlying cause for the inability to obtain entropy (e.g. EMFILE or ENFILE attempting to open /dev/urandom or similar). I'm not sure whether it would be desirable to expose such underlying causes or just a generic error code indicating that sufficiently non-predictable output could not be produced. Of course I would really just prefer that these functions not be permitted to fail, and that they not be optional. This does not give an implementation license to crash/abort when it can't satisfy the requirements; rather, it renders the implementation non-conforming if it can't satisfy the requirement. However, I think it should be possible for all implementations to satisfy the requirements of an always-working posix_random (independent sequences in every process including after fork), albeit possibly not with ideal strength properties. For example, at fork time an implementation could consume enough output from posix_random in the parent to fill the internal state of the prng in the child before actually forking. This does not require obtaining any resource that could fail to be available. Another approach, if reseeding from /dev/urandom or similar is deemed important by the implementer, is for fork to fail when the resource (e.g. fd) needed to obtain entropy cannot be allocated.

eblake 2014-08-21 15:51 manager bugnote:0002352 Last edited: 2014-08-21 15:52	If we're going to add a new interface, it should be HARD to abuse it. Returning random bits leaves you nowhere to report errors other than the multi-statement dance of 'errno = 0; bits = call(); if (errno) report failure' - but too many programmers will omit error checking. I'd much rather require users to do: 'uint32_t bits; if (call(&bits) != 0) report failure', because that is harder to get wrong. Besides, gcc can even warn users on functions that are marked __attribute__((__warn_unused_result__)).
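
A minimal sketch of the calling convention suggested above, with the result delivered through a pointer and the status as the return value. read_random32() and the MUST_CHECK macro are hypothetical names, and taking the source as a file path is an assumption made purely so the failure path can be exercised.

```c
#include <stdint.h>
#include <stdio.h>

/* Ask GCC-compatible compilers to warn when the status is ignored. */
#if defined(__GNUC__)
#define MUST_CHECK __attribute__((__warn_unused_result__))
#else
#define MUST_CHECK
#endif

MUST_CHECK int read_random32(const char *path, uint32_t *out);

/* Hypothetical: store 32 bits read from `path` in *out.
 * Returns 0 on success, -1 on failure; *out is unchanged on failure. */
int read_random32(const char *path, uint32_t *out)
{
    uint32_t v;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    if (fread(&v, sizeof v, 1, f) != 1) {
        fclose(f);
        return -1;
    }
    fclose(f);
    *out = v;
    return 0;
}
```

A call then reads 'uint32_t bits; if (read_random32("/dev/urandom", &bits) != 0) report failure', and a caller who drops the return value gets a compiler warning where the attribute is supported.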

eblake 2014-09-12 18:37 manager bugnote:0002385	An apt comment I received from Julian Coleman: If I remember correctly, there has already been discussion in at least some of the BSD camps specifically about the name "arc4random()". It was considered a poor choice to have the algorithm in the function name(s). So, that would be a vote for changing the name, at least.

eblake 2014-09-12 18:43 manager bugnote:0002386	Upstream Linux kernel has just proposed adding a syscall getrandom() for getting random numbers without a file descriptor: http://lwn.net/Articles/606141/ as a superset of the existing BSD syscall getentropy(); maybe standardizing something along this line of thought will be better. #include <linux/random.h> int getrandom(void *buf, size_t buflen, unsigned int flags); A call will fill buf with up to buflen bytes of random data that can be used for cryptographic purposes, returning the number of bytes stored. As might be guessed, the flags parameter will alter the behavior of the call. In the case where flags == 0, getrandom() will block until the /dev/urandom pool has been initialized. If flags is set to GRND_NONBLOCK, then getrandom() will return -1 with an error number of EAGAIN if the pool is not initialized.
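
For illustration, a sketch of filling a buffer with getrandom(). The note above quotes the early proposal; the sketch assumes the interface as later exposed by glibc (2.25 and newer), which declares it in <sys/random.h> with an ssize_t return type. fill_random() is a hypothetical wrapper name, and the code is Linux-specific.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* glibc >= 2.25, Linux-specific */
#include <sys/types.h>

/* Hypothetical wrapper: fill buf with len random bytes.
 * Returns 0 on success, -1 on error (errno set by getrandom). */
int fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        /* flags == 0: block until the pool has been initialized */
        ssize_t n = getrandom(p, len, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;
        }
        /* getrandom may return fewer bytes than requested */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```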

dalias 2014-09-12 18:57 reporter bugnote:0002387	The new Linux getrandom, and the BSD getentropy, are the type of primitive you would want to use in implementing posix_random. So from my perspective, the question is a matter of which one is more useful to standardize: an underlying primitive to get secure entropy, or an interface that wraps this operation conveniently with a way to get multiple "random" numbers based on the entropy source, but without having to go back to the kernel each time. My feeling is that, as proposed so far, posix_random does not offer sufficient guarantees on how it's implemented to be very appealing to programs that need secure random numbers, e.g. for key generation. If it's standardized, I fear most software will just ignore it and use system-specific mechanisms like getrandom or getentropy instead. Maybe there are improvements that could be made to assure users that posix_random can be used safely?

ajosey 2014-11-21 09:29 manager bugnote:0002438	The Open Group Base Working Group has balloted in favor of sponsoring the interfaces for inclusion in Issue 8. The submitter is requested to submit a revised proposal taking into account the comments raised in the bug report so far.

tedu 2014-12-01 20:36 reporter bugnote:0002457	Notes about the revised proposal here, then a more complete revised proposal to follow. I think most comments are regarding the always successful requirement and the circumstances under which that may not be possible to achieve, generally due to lack of entropy early on. This adds a function `int posix_random_init(void)`. If successful, it returns 0. On error, it returns an error code. Notable error values may be EAGAIN to indicate "low entropy" or EACCES to indicate that /dev/random could not be opened, etc. Once init() returns 0, it should not be possible for any other errors to occur. Notably, it can't "degrade" from operational to not. Systems are still expected to make a best effort attempt to provide decent random numbers. For example, Linux may seed with getrandom(RND) and if that fails, seed with getrandom(URND) and then return EAGAIN (assuming I understand getrandom() semantics properly). I've revised the text about the "sequences" generated to basically be the definition of a cryptographic RNG. Hopefully this addresses the concern that users may be uncertain about the safety of these interfaces.

dalias 2014-12-01 20:41 reporter bugnote:0002458	What is the proposed behavior if posix_random_init is called more than once? Is it required to be safe in cases where multiple threads race to call it first? Etc. Since the purpose of this function, from a "black box" perspective, does not seem to be performing any "initialization" that's meaningful to the caller, but rather querying whether posix_random is working, I think it would make more sense to name it as such, and this would also get rid of any confusion about whether "multiple-init" is safe.

tedu 2014-12-01 20:47 reporter bugnote:0002459	SYNOPSIS
#include <stdlib.h>
uint32_t posix_random(void);
void posix_random_buffer(void *buf, size_t nbytes);
uint32_t posix_random_uniform(uint32_t upper_bound);
DESCRIPTION
This family of functions provides a random number generator suitable for use in cryptographic applications. Future values cannot be predicted by observation of previous values and previous values cannot be determined by observation of future values.
The posix_random_init() function may optionally be called to initialize and seed the random number generator. If posix_random_init() returns successfully, subsequent calls to the other functions are guaranteed to meet the properties described above. If posix_random_init() indicates failure, the remaining functions will continue to function, albeit with possibly weakened security.
The posix_random() function returns a single 32-bit value. posix_random_buffer() fills the region buf of length nbytes with random data. posix_random_uniform() will return a single 32-bit value, uniformly distributed in the range [0, upper_bound). This is recommended over constructions like "posix_random() % upper_bound" as it avoids "modulo bias" when the upper bound is not a power of two.
All of these functions are thread safe.
RETURN VALUES
The posix_random_init() function returns 0 for success and a non-zero error code to indicate failure. No error values are defined or reserved for the other functions.

eblake 2014-12-02 00:13 manager bugnote:0002460	0000859:0002459 is missing the declaration of posix_random_init, and the description of specific error values it must use if it fails. Should there be an interface for easily getting a uint64_t and/or double value? If an interface for producing a double is provided, would it be better as a double in the range [0.0,1.0) with a fixed exponent (modulo the boundary case of 0), so that all returned values are equally distributed, vs. a completely random double (where equal distribution of bit patterns produces more doubles close to 0 than it does doubles close to the maximum due to floating point scaling and subnormal values)? Naively constructing 64-bit numbers from a PRNG that only tracks 32 bits of state risks bias in those 64-bit numbers (as an extreme example, consider a 1-bit PRNG that outputs the repeating sequence 0, 1, 1, 0: constructing 2-bit numbers from that PRNG will only create two out of four possible values, which is extremely biased). Of course, if the underlying source is true randomness and not a PRNG, or if the PRNG has enough state, then concatenating 64-bit numbers from consecutive 32-bit calls is still going to be evenly distributed; but having a dedicated interface for this makes life easier (back to our design ideal of making it hard to get wrong via naive use). Likewise, a dedicated interface for getting a double, instead of requiring bit-twiddling or playing with ldexp and friends, would make this proposal much more appealing.
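
As a sketch of what such helpers might do, the following builds a 64-bit value from two 32-bit draws and maps 64 bits to a double in [0.0, 1.0). concat32() and unit_double() are hypothetical names. The conversion keeps the top 53 bits and scales by 2^-53, which is a common technique but not the fixed-exponent construction mentioned above, and it assumes double is IEEE 754 binary64.

```c
#include <stdint.h>

/* Hypothetical helper: combine two 32-bit draws into one 64-bit value. */
uint64_t concat32(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: map 64 bits to a double in [0.0, 1.0).
 * Keeps the top 53 bits (the precision of a binary64 double) and
 * scales by 2^-53, so every result is a multiple of 2^-53 and all
 * such multiples are equally likely for uniformly distributed input. */
double unit_double(uint64_t bits)
{
    return (double)(bits >> 11) * (1.0 / 9007199254740992.0);
}
```

Whether the concatenation is unbiased depends entirely on the source supplying the two 32-bit halves, which is the concern raised in the note above.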

eblake 2014-12-02 00:22 manager bugnote:0002461	Should the proposal explicitly mention that this family of interfaces cannot be re-seeded to replay an earlier random stream? Even if only in an application usage, mentioning the differences between the other random interfaces (where a specific seed can give deterministic behavior, which can be nice for debugging other issues) and this one (where it looks to be intentional that we are not exposing an interface for setting a seed, although the wording still appears to intentionally allow an internally-seeded PRNG with sufficient state to be used as an implementation, and use of actual random hardware is merely a quality-of-implementation detail).

mdempsky 2014-12-02 00:35 reporter bugnote:0002462	[re 0002460] I think the RNG is failing at being "suitable for cryptographic applications" if it outputs 32-bit values that can't safely be concatenated to form an unbiased 64-bit value. Additional interfaces are fine by me as long as we believe they'll be useful in practice, but I'm a bit worried about expanding the scope too much. It's also perhaps worth noting that C++11 separates the idea of "uniform random number generator" (i.e., an object that returns random integers) from the idea of "random number distributions" (i.e., objects that take random integers and massage them into various useful distributions such as uniform, normal, bernoulli, etc.). So I'd be inclined to limit ourselves to defining what's necessary for the system to provide a cryptographically secure RNG, and for now defer to applications to decide how to adapt that to their needs. At least within OpenBSD's base system + third-party software ports, the proposed functions (fill a buffer, random 32-bit value, and random 32-bit value with an upper bound) have served the most common needs. I think simple random 64-bit values have come up a few times (usually satisfied by concatenating two 32-bit values). I don't think I've ever seen a need for a random floating point value though.

tedu 2014-12-02 02:28 reporter bugnote:0002463	We have very rarely found a need for 64 bit random numbers. Some notes on practical applications: Used to generate 16 or 32 bit IDs (pids, DNS, RPC, etc.) 64 bit IDs are less frequently encountered in most protocols. Used to select a random index from an array. This is what the uniform() function is for. 64 bit array indices are possible, but not practical for most object sizes. (Why would one need to select a random index into an array containing more than 4 billion char, for instance?) Used to generate keys and nonces, etc. These are generally larger than 64 bits, and the buffer() function is used. The only time I have needed 64 bit numbers I actually needed a size_t/intptr_t and used the buffer() interface. 64 bit variants don't seem necessary, but if "completeness" has strong appeal, I do not think adding them would be harmful or pose any great burden for implementations. If a double interface is added, I think it should be along the lines of uniform(), and return a value in the range [0, bound). Or [lower, upper). The implementation should do the necessary scaling. As noted, the floating point range is itself not uniform. I have some reservations that users will interpret a random double to represent 64 random bits, but an implementation that actually did that would be strongly biased.

dalias 2014-12-02 03:04 reporter bugnote:0002464	Concatenating N-bit pseudo-random values to produce a 2N-bit pseudo-random value is not such a simple matter. In order for the latter to be uniformly distributed, uniform distribution of the former does not suffice; rather, every possible pair of two N-bit numbers must occur in sequence with equal probability. Obviously short-period PRNGs fail to satisfy this property by a simple counting argument, but I don't think it's obvious that well-known CSPRNGs satisfy it, if they even do. Assuming sufficiently large period and other properties necessary for a CSPRNG, it seems plausible to me that even if the 2N-bit concatenation is not uniform, it may be indistinguishable from uniform for most practical purposes. However, upon repeating the process to get larger and larger random numbers you'll eventually reach a point where the output is not remotely uniform. I'm not an expert in this area, but I feel like if these interfaces are to be usable for cryptographic purposes (e.g. producing keys), some further guarantees with regard to periodicity, quality of distribution of posix_random_buffer output for large values of nbytes, etc. are needed.
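
The pair-uniformity requirement can be checked mechanically on the repeating 1-bit sequence 0, 1, 1, 0 mentioned earlier in this thread. The sketch below is only an illustration of the counting argument; pair_values_seen() is a hypothetical helper, not part of any proposal.

```c
#include <stddef.h>

/* Hypothetical helper: draw `draws` 2-bit values from a periodic bit
 * sequence by concatenating consecutive, non-overlapping bits.
 * Returns a bitmask in which bit v is set if the 2-bit value v
 * (0..3) was produced at least once. */
unsigned pair_values_seen(const unsigned char *bits, size_t period,
                          size_t draws)
{
    unsigned seen = 0;
    size_t pos = 0, i;

    for (i = 0; i < draws; i++) {
        unsigned hi = bits[pos % period] & 1u;
        unsigned lo = bits[(pos + 1) % period] & 1u;

        seen |= 1u << ((hi << 1) | lo);
        pos += 2;
    }
    return seen;
}
```

For the sequence 0, 1, 1, 0 each single bit is evenly distributed, yet only the values 01 and 10 ever appear, so the returned mask has just two of its four low bits set.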

tedu 2014-12-02 03:22 reporter bugnote:0002465	Regarding the limits of uniformity, using something like AES-CTR is for example not uniform after concatenation because two blocks will never repeat. The state space of a true stream cipher like ChaCha20 is a little more forgiving here. In practice, the fact that AES-CTR never repeats does not appear to have diminished its acceptability. The wording "Future values cannot be predicted by observation of previous values" does not directly address periodicity, but I think it should be inferred to mean that even after "many" observations, prediction remains impossible. i.e., the period is "long". Would adding "indistinguishable from uniform noise" be sufficient to guarantee distribution?

nsz 2014-12-02 13:04 reporter bugnote:0002467	I think the description of posix_random_uniform should say something about an upper_bound of 0. I'd explicitly mark this case as undefined behaviour (an implementation would typically do a % upper_bound operation in a loop, which is UB for 0). (The other sensible definition I can think of is to interpret 0 as upper_bound == 2^32, but that's probably too clever.)

tedu 2014-12-04 00:10 reporter bugnote:0002473	A few notes about the choice of name. This interface exists on several platforms already as arc4random() and third party software also exists which uses it (sometimes providing their own copy for platforms that don't have it). The existing platforms already basically conform to the proposed addition here, though some lack the uniform() function added later. Standardizing on the existing name would make for an easier transition for much of this software, instead of requiring all parties to rewrite everything. Also, people searching the internet for "posix_random" are unfortunately likely to find lots of information about the POSIX "random" function. It would be bad for them to then start using that interface. The letters "rc4" in the name need not imply or guarantee any particular implementation.

nmav 2014-12-10 08:45 reporter bugnote:0002488	One comment on posix_random_init() is that it is difficult to use. That is, how is a library expected to use that interface? A lot of code contains something like "srand(time(0));x=rand();" every time a random number is produced; most probably the story will be repeated for this interface. I'd suggest having implicit initialization to simplify usage. I also think that the description should make clear that this interface will produce unpredictable numbers in both processes after a fork. The initial description made this explicit, but the second only implies it; I believe it should be explicit.

eblake 2014-12-10 15:28 manager bugnote:0002489	Are you aware that OpenBSD is considering breaking the contract of rand() and friends?
> List: openbsd-cvs
> Subject: CVS: cvs.openbsd.org: src
> From: Theo de Raadt <deraadt () cvs ! openbsd ! org>
> Date: 2014-12-08 21:45:20
> Message-ID: 17119865312585819206.enqueue () cvs ! openbsd ! org
>
> CVSROOT: /cvs
> Module name: src
> Changes by: deraadt@cvs.openbsd.org 2014/12/08 14:45:20
>
> Modified files:
> include : stdlib.h
> lib/libc/stdlib: Makefile.inc drand48.c lcong48.c lrand48.c
> mrand48.c rand.3 rand.c rand48.3 rand48.h
> random.3 random.c seed48.c srand48.c
>
> Log message:
> Change rand(), random(), drand48(), lrand48(), mrand48(), and srand48()
> to returning strong random by default, source from arc4random(3).
> Parameters to the seeding functions are ignored, and the subsystems remain
> in strong random mode. If you wish the standardized deterministic mode,
> call srand_deterministic(), srandom_deterministic(), srand48_deterministic(),
> seed48_deterministic() or lcong48_deterministic() instead.
> The re-entrant functions rand_r(), erand48(), nrand48(), jrand48() are
> unaffected by this change and remain in deterministic mode (for now).
>
> Verified as a good roadmap forward by auditing 8800 pieces of software.
> Roughly 60 pieces of software will need adaptation to request the
> deterministic mode.
>
> Violates POSIX and C89, which violate best practice in this century.
> ok guenther tedu millert

safinaskar 2014-12-10 18:23 reporter bugnote:0002490	Interfaces for getting high quality random numbers have a number of issues in existing systems, so it would be very nice if we fixed all of them in this proposal. We should create a really high quality API. Issues with existing implementations: 1. On Linux the standard way of getting high quality random numbers is /dev/random and /dev/urandom. But this method requires opening files, so it doesn't work when there are no free file descriptors. This caused a LibreSSL security hole, which in turn led to the getrandom system call proposal for Linux (the syscall was subsequently added). Links: http://threatpost.com/overblown-libressl-prng-vulnerability-patched/107245 http://port70.net/~nsz/47_arc4random.html http://lists.openwall.net/linux-kernel/2014/07/17/235 (also, consider the getrandom API in this message) So I think we should state in POSIX that posix_random must work even if there are no free file descriptors. That would explicitly prohibit implementing posix_random as a wrapper around /dev/[u]random. (And if you don't like this proposal, then explicitly state the opposite, i.e. "posix_random may not work if there are no free file descriptors available".) 2. On Linux /dev/random always generates truly random numbers (i.e. numbers obtained from some hardware, such as touchpads or special hardware random number generators). If there is no entropy available, /dev/random blocks. /dev/urandom generates truly random numbers too, but if there is no entropy it returns pseudo-random numbers. "man 4 random" on Linux recommends using /dev/random for "long-lived GPG/SSL/SSH keys". But the well-known cryptography researcher D. J. Bernstein objects to that man page here ( http://lists.randombit.net/pipermail/cryptography/2013-August/004983.html ) and says that /dev/urandom should be used nearly everywhere (if it was properly initialized).
But in the same mail Bernstein makes another point: it is very important for the random source to be properly initialized (using some truly random source or a random seed preserved across reboots). /dev/urandom on Linux still works even if it is not initialized, and this is bad. So please specify in POSIX what kind of numbers posix_random generates by default. Always truly random ones? Or can it fall back to pseudo-random ones? And does it fail/block if it was never initialized? I think (using Bernstein's arguments) that posix_random should not block by default once initialized, i.e. if it is initialized, it should always return numbers immediately even if they are pseudo-random. But it may provide some option to alter this behavior. I don't know whether it should block/fail if it was not initialized. Probably yes. At the least, posix_random should provide an option to force an error if the generator was not initialized. P.S. I didn't read this whole discussion, sorry. I may be repeating somebody.

nmav 2014-12-10 19:28 reporter bugnote:0002491	@safinaskar. I'd suggest not to burden this function with entropy estimators. The best is to mark the functions explicitly as "cryptographic pseudo-random number generator". Then it is clear what it can be used for.

tedu 2014-12-10 20:16 reporter bugnote:0002492	Having had a week to ponder it, I'm less happy with the idea of an init() function. I proposed it as a compromise to returning error codes, but I think it's the worst kind of compromise: worse than the sum of its parts. Using init() is awkward and really only adds an opportunity for application code to do something wrong. I think this interface is much better without it, as originally proposed. OpenBSD has done a pretty good job in my opinion of building an interface that can't fail. Libressl portable and libottery-lite are also converging on portable solutions that come very close, even without the benefit of complete system integration.

safinaskar 2014-12-10 22:33 reporter bugnote:0002493	nmav, okay, but we still need a way for an application to determine that the generator was initialized. We need it because there are virtual machines with R/O boot that have no entropy sources and don't preserve a random seed across reboots, so they start in a predictable state, which is not satisfactory for cryptography.

philip-guenther 2014-12-10 22:58 reporter bugnote:0002494	To reply to eblake: http://austingroupbugs.net/view.php?id=859#c2489 > Are you aware that OpenBSD is considering breaking the contract of rand() and friends? Yes. I'm the 'guenther' mentioned in Theo's commit message. :-) To provide context, I'll quote from Theo's public note about why the project did this. The full note can be seen at http://marc.info/?l=openbsd-tech&m=141807224826859&w=2 > [rand/random/rand48 are] used in two patterns: > 1. Under the assumption it provides good random numbers. > This is the primary usage case by most developers. > This is their expectation. > 2. A 'seed' can be re-provided at a later time, allowing > replay of a previous "random sequence", oh wait, I mean > a deterministic sequence... ... > I started my audit of the ports tree for rand() and random() use, as > indicated by the linker. I found 1221 ports out of 8800 used them. > > Differentiating pattern 1 from pattern 2 involved looking at the > seed being given to the subsystem. If the software tried to supply > a "good seed", and had no framework for re-submitting a seed for > reuse, then it was clear it wanted good random numbers. Those ports > could be eliminated from consideration, since they indicated they > wanted good random numbers. > > This left only 41 ports for consideration. Generally, these are doing > reseeding for reproduceable effects during benchmarking. Further > analysis may show some of these ports do not need determinism, but if > there is any doubt they can be mindlessly modified as described below. ... So we have a software ecosystem where programs that don't need determinism are getting it because the portable APIs are deterministic. For many programs, this may be of little consequence, but I doubt anyone would take the bet that none of the programs using the rand/random/rand48 APIs has a security or correctness bug from this. 
In some cases, the OpenBSD ports team have locally patched programs to use our arc4random APIs, but pushing those changes upstream when the larger community, particularly Linux, does not have those APIs raises the dev and maintenance costs to those 3rd party developers. Solving this for the larger ecosystem is one of the goals of the posix_random API proposal, by providing a portable, non-deterministic random API and letting programs express their intent. We believe we've demonstrated that this should be something that the larger community should be able to implement and spread widely. The change in OpenBSD to rand/random/*rand48 is an attempt to protect our community immediately in a way that we believe we can afford and accept. Perhaps we'll find that this change actually does cause too many problems to be sustained even in OpenBSD. We certainly are not proposing that the standards for those functions be changed at this time. (OpenBSD does have standards compliance as a goal, but security has higher priority for us; if a standards requirement has turned out to reduce security in practice, we'll look for pragmatic ways to reduce the security cost. Note that even in this change we continue to provide a way for programs that require deterministic behavior to get it. The project's stance should surprise no one; other projects refuse to comply with POSIX requirements because they similarly are unwilling to accept the costs of complying in performance or other metrics.)

nmav 2014-12-12 08:53 reporter bugnote:0002497	@safinaskar, this is for a POSIX API to access a CPRNG. It should simply assume that the OS will somehow initialize it. How or whether the OS will provide sufficient initialization is outside the scope of the API. If we want to cope with OSes that may not provide sufficient initialization under certain circumstances, then the best would be to modify the API to return error codes, instead of adding an initialization function. An initialization function complicates the API usage by libraries, and that API should be as simple as possible and target universal adoption.

steffen 2014-12-16 13:20 reporter bugnote:0002499	Since this is a new posix_ interface and the future is even more parallelism rather than less -- shouldn't it be considered to make this an object with a clear init (new, create) / use / destroy cycle rather than another opaque single-instance function interface? Maybe with a one-shot function call interface that uses a builtin object, for convenience's sake. Btw., after being hit and classified as a "lesser-secure application" by a very large free email+++++ provider (requiring users and myself to enter a blinking stylish web interface and agree to being "lesser secure") even though using TLSv1.2 transport i have to say that i'm currently a bit fretted. There are programs which use the old obsolete and insecure random interfaces for purposes for which they are absolutely sufficient. /* In that case it's only used for boundaries and Message-Id:s so that * srand(3) should suffice */ Thank you.

steffen 2015-03-13 14:59 reporter bugnote:0002581	I've looked at the current extended (arc4)random implementations of FreeBSD and NetBSD. Whereas the former uses a single object protected by a global lock, the latter uses a multiple-level approach with thread-specific data and minherit(2) to ensure that the TSD PRNG state is zeroed on fork(2). Repeating my argument that the future is more parallelism rather than less i think POSIX should ensure that users of the new POSIX interface can adapt that to their needs, because: . The former approach may be so undesirable that implementation of a project-internal PRNG is necessary. . The latter, on the other hand, may cause resource wastage in a classical manager / worker thread model (and how easy it was if any other approach would result in undefined behaviour) with pools of many workers. I don't think this is a weak argument even in times where a lot of mobile phones have more power and resources than the notebook i'm using to write this message, because there are also many resource-preserving machines available, with more massive-parallel superlow-resource-consumption in sight. Finally, wasting resources for plain nothing is inherently bad design. So i'll propose a slightly extended interface instead that uses objects. Changing the mentioned implementations (and many systems have identical ones for much longer than a decade) to adapt this is practically zero effort, it's just that they have to expose the internal interface. Using objects users are free to decide in which threads they want random number generators, they can reuse objects at will (e.g., if a thread is taken from a pool to handle action A that requires random numbers to be generated it can be given access to PRNG B); the single global builtin object (accessible via NULL) can be used by all those which don't need anything special. I'll post this next. 
Shall this object-based approach not find any friends please have a look at the post thereafter, which is a slightly redefined version of the current state of this issue.

steffen 2015-03-13 15:00 reporter bugnote:0002582 Last edited: 2015-03-13 15:02	SYNOPSIS #include <stdlib.h> int prand_create(prand_t self); void prand_delete(prand_t self); uint32_t prand_random(prand_t self_or_null); uint32_t prand_random_uniform(prand_t self_or_null, uint32_t bound); void prand_random_buf(prand_t self_or_null, void *buf, size_t len); int prand_stir(prand_t self_or_null); int prand_addrandom(prand_t self_or_null, void const *buf, size_t len); DESCRIPTION The prand family of functions provides a cryptographic pseudorandom number generator seeded from the system entropy pool. prand is designed to prevent an adversary from guessing outputs, unlike rand(3) and random(3). prand_create() and prand_delete() can be used to generate PRNG objects with completely isolated internal states. prand_delete() shall ensure that the memory used for the PRNG is zeroed. In order to use such objects from within multiple threads, proper synchronization methods must be used. The prand_random(), prand_random_uniform(), prand_random_buf(), prand_stir() and prand_addrandom() functions can either be used in conjunction with an isolated PRNG object that has been created by prand_create(), or be given a NULL prand_t argument, in which case a library global internal, multithread-safe PRNG object is used instead. prand_random() returns an integer in [0, 2^32) chosen independently with uniform distribution. prand_random_uniform() returns an integer in [0, bound) chosen independently with uniform distribution. prand_random_buf() stores len bytes into the memory pointed to by buf, each byte chosen independently from [0, 256) with uniform distribution. prand_stir() draws entropy from the operating system and incorporates it into the given object's PRNG state to influence future outputs. prand_addrandom() incorporates len bytes from the buffer buf into the given object's PRNG state to influence future outputs. 
It is not necessary for an application to call prand_stir() or prand_addrandom() before calling other prand functions with the library internal global PRNG. The first such call to any prand function will initialize the PRNG state unpredictably from the system entropy pool. Shall the system entropy pool fail to provide enough entropy, an undefined pseudorandom algorithm shall be used to initialize the PRNG state in order to serve the request. These initialization steps shall be performed on each access of the PRNG unless the system can satisfy the request and provide entropy. In order to ensure that the PRNG is seeded with entropy, prand_stir() or prand_addrandom() can be used before random numbers are requested from the library internal global PRNG. RETURN VALUE Upon successful completion, the prand_create(), prand_stir() and prand_addrandom() functions shall return zero. Otherwise an error value shall be returned to indicate the error. ERRORS The prand_create(), prand_stir() and prand_addrandom() functions shall fail if: [EAGAIN] The system is currently incapable of providing entropy. In the case of prand_create() no resources shall be bound to the self object on error. Calling prand_delete() on such an object is undefined. RATIONALE The prand functions provide the following security properties against three different classes of attackers, assuming enough entropy is provided by the operating system: · An attacker who has seen some outputs of any of the prand_random functions cannot predict past or future unseen outputs. · An attacker who has seen the library's PRNG state in memory cannot predict past outputs. · An attacker who has seen one process's PRNG state cannot predict past or future outputs in other processes, particularly its parent or siblings. One `output' means the result of any single request to a prand_random function, no matter how short it is. 
Implementations shall ensure that the entropy state of the library internal global PRNG is zeroed on fork.

steffen 2015-03-13 15:01 reporter bugnote:0002583	SYNOPSIS #include <stdlib.h> uint32_t posix_random(void); uint32_t posix_random_uniform(uint32_t bound); void posix_random_buf(void *buf, size_t len); int posix_random_stir(void); int posix_random_addrandom(void const *buf, size_t len); DESCRIPTION The posix_random family of functions provides a cryptographic pseudorandom number generator automatically seeded from the system entropy pool and safe to use from multiple threads. posix_random is designed to prevent an adversary from guessing outputs, unlike rand(3) and random(3). posix_random() returns an integer in [0, 2^32) chosen independently with uniform distribution. posix_random_uniform() returns an integer in [0, bound) chosen independently with uniform distribution. posix_random_buf() stores len bytes into the memory pointed to by buf, each byte chosen independently from [0, 256) with uniform distribution. posix_random_stir() draws entropy from the operating system and incorporates it into the library's PRNG state to influence future outputs. posix_random_addrandom() incorporates len bytes from the buffer buf into the library's PRNG state to influence future outputs. It is not necessary for an application to call posix_random_stir() or posix_random_addrandom() before calling other posix_random functions. The first call to any posix_random function will initialize the PRNG state unpredictably from the system entropy pool. Shall the system entropy pool fail to provide enough entropy, an undefined pseudorandom algorithm shall be used to initialize the PRNG state in order to serve the request. These initialization steps shall be performed on each access of the PRNG unless the system can satisfy the request and provide entropy. In order to ensure that the PRNG is seeded with entropy, posix_random_stir() or posix_random_addrandom() can be used before random numbers are requested from the PRNG. 
RETURN VALUE Upon successful completion, the posix_random_stir() and posix_random_addrandom() functions shall return zero. Otherwise an error value shall be returned to indicate the error. ERRORS The posix_random_stir() and posix_random_addrandom() functions shall fail if: [EAGAIN] The system is currently incapable of providing entropy. RATIONALE The posix_random functions provide the following security properties against three different classes of attackers, assuming enough entropy is provided by the operating system: · An attacker who has seen some outputs of any of the posix_random functions cannot predict past or future unseen outputs. · An attacker who has seen the library's PRNG state in memory cannot predict past outputs. · An attacker who has seen one process's PRNG state cannot predict past or future outputs in other processes, particularly its parent or siblings. One `output' means the result of any single request to a posix_random function, no matter how short it is. Implementations shall ensure that entropy states are zeroed on fork.

steffen 2015-03-13 15:03 reporter bugnote:0002584	P.S.: fixed some newline mangling in the object-based one. The posted texts are edited versions of the NetBSD manual [1]. [1] http://netbsd.gw.com/cgi-bin/man-cgi?arc4random++NetBSD-current

tedu 2015-03-14 01:00 reporter bugnote:0002585	Strong objection to complicating the interface. Threading performance should, at best, be considered a quality of implementation issue. The addition of a stir interface is not necessary. If it's optional, then it serves no purpose and should not be included. "Shall the system entropy pool fail to provide enough entropy, an undefined pseudorandom algorithm shall be used to initialize the PRNG state in order to serve the request." This is extremely dangerous, and renders the entire proposal useless. The standard should require that these functions work, not permit them to fail.

nmav 2015-03-14 10:59 reporter bugnote:0002586	I'm also opposed to a stir interface. I don't see how that can be used at all. The basic question is when should this function be called? If it is after X amount of data depending on the internal PRNG, why isn't it called internally by posix_random_buf? The same for prand_addrandom... The purpose of the interface is to be simple; if the user is responsible for seeding it... we duplicate the current mess in rng interfaces. Other than that, I think we should agree on what is the purpose of the interface. Is it to provide a process with high quality randomness that can be used as a PRNG? Then the plain posix_* interface in #0002459 is sufficient.

nmav 2015-03-14 11:03 reporter bugnote:0002587	That said, I'd prefer an interface which provides error codes, such as: int posix_random(uint32_t *val); int posix_random_uniform(uint32_t upper_bound, uint32_t *val); int posix_random_buffer(void *buf, size_t nbytes); Which will return 0 on success and -1 on failure with errno set appropriately. That way this interface will be able to propagate errors from getrandom() in Linux and getentropy() in BSDs. Without error codes what should that interface do in case getentropy fails?

steffen 2015-03-14 12:18 reporter bugnote:0002588	> Shall the system entropy pool > fail to provide enough entropy, an undefined pseudorandom algorithm shall > be used to initialize the PRNG state in order to serve the request. > >This is extremely dangerous, and renders the entire proposal useless. The >standard should require that these functions work, not permit them to fail. For one that surely depends on what the implementation chooses to do. Then the possibility that this situation happens seems so low that some implementations call abort() when the system cannot serve enough entropy for seeding (including OpenBSD i think) -- calling abort() cannot -- imho -- be the right answer for a standardized interface, imagine what the Wall Street would say when their system aborts because some PRNG cannot become generated. And then, most important to me, users of the interface can detect this situation and actively prevent usage of randoms created like that. Note that in the object-based approach the constructor would simply not return an object at all in this case, so that users would have to retry in order to get a usable object, i.e., cannot produce any pseudo random numbers due to lack of the PRNG object. >Said that, I'd prefer an interface which provides error codes, such as: This i wanted to avoid by moving the possible error case to some constructor (alike) function so that there is a single point of (possible) failure only. The latter can also be achieved by a posix_random_init() function. >I'm also opposed to a stir interface. I don't see how that can be used at all. The basic question is when should this function be called? It should be up to the implementation to internally decide if and when such things happen. 
But giving users the opportunity to "reset" seems like a good thing to me; recalling my worker thread pool example it would seem natural to _stir() the PRNG state when a single job is done, so that the next usage of the PRNG in the next job is stirred, whatever this means, and if it means anything at all; like this implementations could be enabled to use faster update paths, at least in the one or other case, or simply do nothing. It should also be pointed out that the existing (to the extent i have looked) interfaces have these interfaces and that therefore applications may use them. And in the end this is an interface for programmers who design libraries and applications -- software is used in medicine, in atomic plants, airplanes and spaceships, and pretty often it works out quite well. >Threading performance should, at best, be considered a quality of implementation issue. I haven't stated anything about threading performance, though of course user-controllable and/or detached PRNGs will impose lesser locking or even don't require synchronization at all, while at the same time allowing users to control resource usage. And of course i'd assume that lesser random bytes per PRNG mean lesser exposed internal state, though this is an extremely unacademical statement.

nmav 2015-05-05 08:42 reporter bugnote:0002647	>> Said that, I'd prefer an interface which provides error codes, such as: > This i wanted to avoid by moving the possible error case to some constructor > (alike) function so that there is a single point of (possible) failure only. > The latter can also be achieved by a posix_random_init() function. It is not always possible to move all checks to a constructor. Most PRNGs and DRBGs which will be used to implement the interface require periodic reseeding and during that an error may occur. So for me, an ideal interface would be the one in #0002587.

steffen 2015-05-05 11:22 reporter bugnote:0002651	A pseudo-filesystem with open(2)/read(2)/close(2) requires special kernel support, which seems suboptimal. Using user-space objects comes near.

mirabilos 2015-05-27 22:25 reporter bugnote:0002683	If there is an implementation which allows for failure, applications will continue to ignore it; there already exists arc4random, which is guaranteed to never fail if it exists, and if posix_getrandom (or however it's named) lacks this guarantee, applications WILL continue to ship their own copy of arc4random and use it. Make it optional but do not permit failure. As for blocking: duh. No. There are ways. Bake in some entropy into the kernel image (yes this breaks reproducible builds, no, I do not care), and additionally pass some via the bootloader (with OpenBSD boot(8) and GNU GRUB getting an extra bonus over things like LILO here because they can read files that can actually be updated once the system has booted). In MirBSD, we actually bake the complete initialised arc4random structure into the kernel (I've got a pure-mksh implementation of it, to generate it), so calling arc4random(9) as well as arc4random(3) is safe at any time (though extremely early calls will get the same result, which is why we try to get extra entropy into the kernel as early as possible, will adopt OpenBSD's passing-from-the-bootloader idea (I had a similar one but could not get it working, I believe I can get working by myself the way OBSD does it now), and have config(8) update it in the binary kernel image on each call). So there's absolutely no excuse for a random number request to fail, period. (As for "embedded router flash" blabla: surely each flashed image need not be the same; people could change the program that actually does the flashing to replace a couple of bytes, after verifying the integrity of the image, but before burning it into EEPROM. If that's truly machine-specific and the device has an RTC, just add the time as early as possible on each boot as well.)

nmav 2015-05-28 07:09 reporter bugnote:0002684	There is no _if_ there. Implementations which rely on the existing kernel interfaces can fail and they have to indicate it somehow. Applications can always wrap around an interface that fails with a void function that calls abort if they fail. But there are no posix functions which call abort automatically when they fail.
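A minimal C sketch of the wrapping idea mentioned in this note: a fallible fill function (0 on success, -1 with errno set, the shape proposed in #0002587) is hidden behind a void function that aborts on failure. The names fallible_fill_fn, demo_random_buffer(), demo_entropy_available and fill_or_abort() are hypothetical, and the stand-in writes placeholder bytes rather than random data; it only illustrates the control flow.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Shape of a fallible fill function: 0 on success, -1 with errno set. */
typedef int (*fallible_fill_fn)(void *buf, size_t nbytes);

/* Hypothetical switch used only to simulate an entropy failure. */
static int demo_entropy_available = 1;

/* Hypothetical stand-in for an error-returning random interface. */
static int demo_random_buffer(void *buf, size_t nbytes)
{
    if (!demo_entropy_available) {
        errno = EAGAIN;
        return -1;
    }
    memset(buf, 0xA5, nbytes);  /* placeholder bytes, NOT random */
    return 0;
}

/* The wrapper: callers get a void interface that cannot report failure,
 * because any failure of the underlying call terminates the process. */
static void fill_or_abort(fallible_fill_fn fill, void *buf, size_t nbytes)
{
    assert(fill != NULL);
    if (fill(buf, nbytes) != 0)
        abort();
}
```

The choice between the two shapes is exactly what the thread is debating; the sketch only shows that the void form can be layered on the error-returning form by the application, while the reverse is not possible.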

mirabilos 2015-05-29 22:18 reporter bugnote:0002686	@nmav: “Implementations which rely on the existing kernel interfaces” is two-faced. We need not consider any existing implementations because we're designing a new interface. New implementations for this must simply ensure they do not fail. If they have to rely on kernel interfaces that, unlike OpenBSD's, can fail, then they have to either do the necessary work in a constructor, call abort() plus kill(getpid(), 9) no matter how discouraged that otherwise is, or just not offer this interface. That being said: why do you insist on talking about a kernel? I smell Linuxism here.

steffen 2015-05-30 12:34 reporter bugnote:0002687	> We need not consider any existing implementations because we're designing a new interface. The arc4random is older than the century, widely spread and in use. > New implementations for this merely must simply ensure they do not fail. If they have to rely on kernel interfaces that, unlike OpenBSD's, can fail, then they have to either do the necessary work in a constructor But since it is new in POSIX it seems wise to adjust this interface a little bit and allow for a testable point of failure. (E.g., the OS i'm writing this on documents for _stir() that it "reads data from /dev/urandom", but still the function returns void.) Most applications will not care about errors (they have been educated to do so for a long time), but for the others who want to ensure (re)seeding was indeed successful and "strong", new code could be written as, e.g., while (posix_random_stir()) sched_yield(); And i reiterate, also in respect to this example, that the interface is really asking to give users the option to define the lifetime of objects just as they desire, like pasted in id=2582. For implementors which already provide arc4random it would most likely mean nothing but exposing the interface they use internally anyway. It is cheap, it scales just as desired, and it is easy to use.

nmav 2015-06-19 08:16 reporter bugnote:0002725	mirabilos, this is a new interface which will be implemented in existing systems, not in a vacuum. In all existing systems the process of obtaining quality random numbers may fail. Asking applications to be killed because the system for some reason cannot provide random numbers at a specific point in time is absurd. The points of possible failure of this function are at initialization, or during reseeding. At these points a high quality random generator may be queried to seed the PRNG. Even if we can hide the error at initialization by not allowing a process to start if the kernel doesn't provide random numbers, failure on reseeding (which is implicit in the proposed function) cannot be hidden. Thus the function would have to indicate failure.

nmav 2015-08-14 08:26 reporter bugnote:0002789	After deliberation I align with the simplistic approach. It is much better for the API not to require any checks, so I prefer the API described in #0002459. It is possible for OSes to be modified to make random number gathering a process that cannot fail, and thus it makes sense to follow the simplest approach.

EdSchouten 2015-09-17 15:05 reporter bugnote:0002830	Quick question: is there an actual need for the posix_random() function itself? Isn't it as trivial to just call into posix_random_buffer() and pass in the location of the uint32_t where you want to store the value? We could consider axing posix_random() and renaming posix_random_buffer() to the former. That would look a bit cleaner. Furthermore, maybe it would make more sense to let the _uniform() function operate on an uintmax_t, instead of limiting it to 32-bit arithmetic. Proposed API: void posix_random(void *buf, size_t nbytes); uintmax_t posix_random_uniform(uintmax_t upper_bound);

nmav 2015-09-17 15:36 reporter bugnote:0002831	For the use cases I'm considering this API for, I have no use for the original posix_random(), so the proposed API of EdSchouten seems simpler to me.

EdSchouten 2015-09-17 21:42 reporter bugnote:0002832	An additional remark: posix_random_uniform(x) returns a random number in [0, x). I can imagine where this comes from: it is a drop-in replacement for doing 'val % x'. But what bothers me, is that it does introduce some corner cases. What if posix_random_uniform() is invoked with an upper bound of zero? I suspect that the existing implementations crash, as they do a modulo/division by zero. On the other end of the spectrum, there is no way you can make it return UINT32_MAX (or UINTMAX_MAX per my proposal). I agree that in that case you shouldn't be using this function in the first place, but that would only hold if the upper bound is a constant -- not a run-time variable. In my opinion the function should be changed to return [0, x] instead of [0, x).

eblake 2015-09-17 22:06 manager bugnote:0002833	posix_random_uniform(3) is NOT the same as posix_random()%3. As a thought experiment, consider a random interface that returns 2 bits of information (0, 1, 2, 3) with equal probability. If you take that result % 3 to get a random number in the range [0,3) (that is, 0, 1, or 2), you will be heavily biased: 0 will occur 50% of the time, while 1 and 2 occur only 25% of the time. Of course, the bias is less observable when taking [0,UINT32_MAX] % 3, but it is still there. A uniform range is designed for cases of non-power-of-two upper bounds, with an emphasis of NOT providing an unfair bias to the lower numbers of the range. Pragmatically, you can achieve uniform results by discarding any result beyond the tail end of the range (or exact multiple of the range) - but at an expense of an unknown number of tries (no guarantee how many discards you will need to process before getting an in-range result). Also, if you are looking for a uniform result with fewer bits than the mantissa of a floating point number [such as 53 for IEEE double], then you can create a double between [1.0,2.0) with uniform exponent and random mantissa, and then scale that to your integer range for an unbiased result (and it is this half-open range between 1.0 and 2.0 that conveniently matches the specification of posix_random_uniform() being a half-open range). But when it gets larger than a mantissa, you have to be careful that composing a larger number by concatenating two smaller random numbers does not itself introduce bias. There is no need for posix_random_uniform(UINT32_MAX) - just call posix_random() instead (since that is already uniform over uint32_t). [or uintmax_t, if we go that way]. 
But if you are worried about the interface, then simply specify posix_random_uniform(0) be a synonym for posix_random(), covering the range [0,UINT32_MAX], by merely documenting that every input x to posix_random_uniform() results in the range [0,x-1], and 0-1 is UINT32_MAX.
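A minimal C sketch of the two points in this note, assuming a stand-in 32-bit source named demo_random32() (hypothetical, a plain xorshift32 and not cryptographic) in place of the proposed posix_random(). two_bit_mod3_counts() reproduces the 2-bit thought experiment, and demo_uniform() shows the discard-and-retry approach; this variant discards from the low end of the range, which is equivalent to discarding the tail. Treating a bound of 0 as the full range follows the suggestion in this note and is not settled behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in 32-bit source (xorshift32). Hypothetical and NOT cryptographic;
 * it only takes the place of the proposed posix_random() in this sketch. */
static uint32_t demo_state = 2463534242u;

static uint32_t demo_random32(void)
{
    demo_state ^= demo_state << 13;
    demo_state ^= demo_state >> 17;
    demo_state ^= demo_state << 5;
    return demo_state;
}

/* Reduce every value of a 2-bit source (0..3) modulo 3 and count the
 * residues: 0 is hit twice, 1 and 2 once each. */
static void two_bit_mod3_counts(unsigned counts[3])
{
    counts[0] = counts[1] = counts[2] = 0;
    for (unsigned v = 0; v < 4; v++)
        counts[v % 3]++;
}

/* Uniform value in [0, bound) by discarding part of the 32-bit range.
 * bound == 0 is treated as the full range, as suggested in the note. */
static uint32_t demo_uniform(uint32_t bound)
{
    if (bound == 0)
        return demo_random32();
    if (bound == 1)
        return 0;

    /* min = 2^32 mod bound; [min, 2^32) holds a whole multiple of bound. */
    uint32_t min = (UINT32_MAX - bound + 1u) % bound;
    uint32_t r;
    do {
        r = demo_random32();    /* number of retries is not bounded */
    } while (r < min);

    assert(r >= min);
    return r % bound;
}
```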

EdSchouten 2015-09-18 06:54 reporter bugnote:0002834	Yes, I am well aware that on the implementation side posix_random_uniform() is more complex than just computing the remainder, as it also eliminates modulo bias. I only wanted to convey that it can be used in cases where a programmer had to write 'arc4random() % n', but wanted to achieve that without any bias. The trick you mentioned with floating point numbers is interesting. I seem to remember certain C libraries use this to implement drand48() and erand48(). They first construct a floating point number between [1.0, 2.0) and subtract 1.0 to place the result in the right domain. My remark is completely unrelated to that matter. What bothers me is that the upper_bound may not always be a fixed value. It can be a runtime variable -- maybe even a value coming from unchecked sources. Letting the function crash the process in case it happens to be zero is a bad thing in my opinion. Using [0, x] instead of [0, x-1] still feels a bit more natural and cleaner in my opinion, but making posix_random_uniform(0) return values across the entire range is a step in the right direction. I'd love to see that happen. Thanks!
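A minimal C sketch of the floating-point construction mentioned here: fix the exponent so the value lies in [1.0, 2.0), fill the mantissa with random bits, then subtract 1.0. The function name bits_to_unit_double() is hypothetical, and the code assumes IEEE 754 binary64 doubles with the same byte order as uint64_t; it is not taken from any particular C library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map the low 52 bits of random_bits to a double in [0.0, 1.0).
 * Assumes IEEE 754 binary64 doubles with the same byte order as uint64_t. */
static double bits_to_unit_double(uint64_t random_bits)
{
    assert(sizeof(double) == sizeof(uint64_t));

    const uint64_t mantissa_mask = (UINT64_C(1) << 52) - 1;
    /* Sign 0 and biased exponent 0x3FF: the value lies in [1.0, 2.0). */
    uint64_t pattern = (UINT64_C(0x3FF) << 52) | (random_bits & mantissa_mask);

    double d;
    memcpy(&d, &pattern, sizeof d);   /* reinterpret the bits portably */
    return d - 1.0;                   /* move [1.0, 2.0) down to [0.0, 1.0) */
}
```

The result takes one of 2^52 equally spaced values, so scaling it to an integer range whose size does not divide 2^52 still needs care.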

steffen 2015-09-18 10:14 reporter bugnote:0002836	> I seem to remember certain C libraries use this RANDOM.TEX v.1 from Donald Arseneau supports min/max clipping, too. It explicitly uses 32-bit signed, so you had better check that max does not cause integer wrapping when using the same algorithm in C.

tedu 2015-09-18 15:16 reporter bugnote:0002837	Should we petition the C standard to change x % 0 to evaluate to x too? What about div()? etc, etc.

EdSchouten 2015-09-18 16:10 reporter bugnote:0002838	We should petition to come up with a set of interfaces that are functional, easy to use and are not prone to common mistakes. What makes you assume that integer division/modulo is in any way related to generating a uniform random number within a range? Dividing an integer by zero makes no sense, of course, but why should we forbid computing a random number that spans the entire range of the integer type? What I was proposing previously (letting the function generate a number between [0, x]) is actually available in Python under the name random.randint() and in C++ under the name std::uniform_int_distribution.

shware_systems 2015-09-21 17:23 reporter bugnote:0002840	Re: 2837 If anything, from the theoretical standpoint, the evaluation of X % 0 should be 0, not X, as X / 0 is defined as exactly infinity with 0 left over, and matching sign (or swapped sign if negative 0 is supported). As few, if any, virtual or hardware architectures provide support for distinct integer infinity representations, I'm more inclined that leaving the behavior as unspecified is appropriate, or a change to implementation-defined at most so the behavior used by each platform gets documented. I believe most hosted implementations are expected to execute raise(SIGFPE) for this circumstance, or a signal defined as representing integer overflow specifically.

dalias 2015-09-21 17:33 reporter bugnote:0002841	There is no expectation for a signal on division by zero for hosted implementations. Many hardware architectures do not do this natively, and emulating it is an unjustified cost. C simply leaves the behavior of division by zero undefined; it does not even generate an unspecified or implementation-defined value or signal. Aside from that, I think we would all benefit from refraining from "bikeshedding" the topic of posix_random. If anyone has important input on how the interfaces should behave to be usable or practical to use or to implement (things like reportable failure vs guarantee of non-failure) I don't want to discourage their discussion, but introducing new areas of disagreement like corner-case behavior of uniform random on [0,0) seems to me like it's just derailing focus on the real issues that will affect users and implementors. The whole topic of posix_random is certainly one that lends itself well to bikeshedding already (everyone thinks they know enough to have something to say) and I would hate to see it drag out forever with no conclusion because of this.

Nach 2015-12-06 06:58 reporter bugnote:0002971 Last edited: 2015-12-06 07:14	Some comments on the proposals: If we want to help average developers do the right thing as much as possible, the following interface can be improved: uint32_t posix_random_uniform(uint32_t upper_bound); To: uint32_t posix_random_uniform(uint32_t lower_bound, uint32_t upper_bound); Where the number returned is in the range [lower_bound, upper_bound], inclusive. Further, the parameters should be allowed to be passed in either order (meaning if lower_bound > upper_bound, reverse them internally). This will give developers an interface to get any range they want, and it cannot ever fail due to the parameters passed, as every passed combination is valid and has an intuitive meaning (the caller wants a range between these two numbers). I'd even suggest offering: uint64_t posix_random_uniform64(uint64_t lower_bound, uint64_t upper_bound); And possibly several other related functions (especially for signed integers). The functions being discussed till now are good for some low level situations and for typical cryptographic needs (keys, salts, nonces), however, they don't really cover higher level needs average developers have. For example, I often see custom initial password generation systems which provide users a temporary password or second factor password and similar. Such generated information must consist solely of printable ASCII characters or an even more limited character set. The average developer will not be able to figure out how to make use of the currently proposed set of functions to generate what they need without introducing biases. Therefore, they need the following: int posix_random_whitelist(void *buffer, size_t buffer_length, const uint8_t *whitelist, size_t whitelist_length); int posix_random_text(char *buffer, size_t buffer_length, const char *whitelist); posix_random_whitelist() would fill buffer with buffer_length bytes from those specified in the whitelist in a uniform manner. 
(If the whitelist happens to have repeated bytes, then repeated values would be weighted) posix_random_text() would fill buffer with buffer_length bytes like the above function, and then NUL-terminate it (ensure before calling that buffer has at least buffer_length+1 chars free). These functions should return -1 with errno set to EINVAL if the whitelist is empty. I'm not sure both of these proposed functions are needed, but at least one of them should be offered. Some systems not only require a whitelist for passwords or similar things, but they even need a minimum amount of particular ranges present in the generated password. Without getting into whether such systems should exist or not, if we want to allow users to generate secure passwords for such systems, an interface along the lines of the following is required: struct pr_tr { const char *whitelist; size_t required; }; int posix_random_text_require(char *buffer, size_t buffer_length, struct pr_tr *ranges, size_t ranges_amount); This interface is like the above function, but allows for multiple whitelist ranges, each with a specified minimum number of characters. This should return -1 with errno set to EINVAL if ranges_amount is 0, all supplied ranges have an empty whitelist, there's an empty whitelist with a corresponding required greater than 0, or the sum of all required exceeds buffer_length. Sometimes developers need a way to shuffle the contents of an array, or shuffle a series of objects in a uniform manner; the following function would fit the bill: int posix_random_shuffle(void *data, size_t unit_size, size_t unit_amount); Return -1 with errno set to EINVAL if unit_size is 0. This function is simple to create as a wrapper around something carrying the posix_random_uniform() interface iff the integer size posix_random_uniform() works with is >= sizeof(size_t). As above, I once again stress the need for a 64-bit uniform function so such things can be written correctly. 
If the idea to supply a 64-bit version of posix_random_uniform() is rejected, then developers have no simple way to create posix_random_shuffle() without biases, and the library must offer it if it wants to support common usage scenarios in higher level applications. It should be added to the proposal that all of its functions must guarantee that, for child processes (or anything child-like) created via fork() or a similar system-specific interface (such as clone() on Linux), the random state is either: A) Not shared between parent and child or among multiple children. Or B) Shared between parent and all children in such a way that a request for random data in one process advances the random "stream" shared by all these related processes so that random values are not repeated between the different processes. Above, steffen mentioned adding a posix_random_addrandom() function. I want to highlight the importance of this. There is a real need today for some way to mix user supplied data into the C(P)RNG in order to influence the output of future random values. Unlike when OpenBSD first created its arc4random interface, virtual machine use has exploded, and the interfaces as they exist today in OpenBSD and as proposed here may be obsolete. Most hosted servers today are typically running on some kind of VM. VMs have the ability to roll back the state of the machine to a previously defined state, and some setups will automatically do so in response to various conditions that arise. The only way to defend against repeated values during VM state rollback is to ensure that every new series of random requests begins with a call to posix_random_addrandom() and is therefore specific to the random request in question. For example, a server implementing TLS or HTTP Digest Authentication or a similar technology needs to send along server side random values for various algorithms. 
Anything that uses client side values as part of its generation of random values will be safe. But for any random need that is solely server side, if the server sent Harry a random value, then for whatever reason was rolled back (perhaps due to the influence of Harry), and then Bob connects to the server, the server will send Bob the same random value that Harry is aware of. To ensure that Harry cannot gain access to random data for Bob's session, the RNG needs a way to be made aware of when it is serving Harry and when it is serving Bob and differentiate between them. If posix_random_addrandom() were to be called by the server passing it the IP and MAC addresses, and the user name of the logged in user, or anything of that nature, it would be able to ensure that no two different connections or users share the same "stream" of random values across VM state rollback. However, while standardizing posix_random_addrandom() and giving developers such access would be one way to allow developers to ensure they "hedge" their random stream against VM related issues, it doesn't "force" developers to do the right thing. Perhaps every random function should have void *clientdata, size_t clientdata_length added to it. Doing so without also offering counterpart functions that lack these parameters, however, would probably annoy developers to whom VM attacks are not relevant. Further, if developers fail to understand why mixing this data in is important, and what should be mixed in, they may just end up using the same constants everywhere, which defeats the purpose of forcing users to make security-related decisions. All in all, failing to give developers in userspace some way to differentiate the random "stream" between multiple users in their application or multiple external connections ensures that the random interface proposed here is useless for properly secured software meant to be run on VMs. 
Unless of course developers build a whole new set of interfaces on top of it for their application, and never use this interface directly. In which case anything more than a function to provide a random buffer of a given size is overkill, and this entire proposal should be shortened to just posix_random_buffer().
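Two of the helpers suggested in this note can be sketched on top of a 32-bit uniform primitive, as below. This is only an illustration under stated assumptions: rand32_fn, demo_source, uniform_below, random_text and random_shuffle are hypothetical names (not part of the proposal), demo_source is a non-cryptographic stand-in for the real generator, and because only a 32-bit source is assumed the sketch rejects whitelists or arrays with more than UINT32_MAX entries, which is exactly the limitation the note cites when asking for a 64-bit variant.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type for a 32-bit random source. */
typedef uint32_t (*rand32_fn)(void);

/* NOT cryptographic: xorshift32 stand-in so the sketch runs anywhere. */
static uint32_t demo_source(void)
{
    static uint32_t s = 2463534242u;
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    return s;
}

/* Uniform value in [0, bound), bound > 0, using rejection sampling. */
static uint32_t uniform_below(rand32_fn src, uint32_t bound)
{
    assert(bound > 0);
    uint32_t min = (uint32_t)(0u - bound) % bound;  /* 2^32 mod bound */
    uint32_t r;
    do {
        r = src();
    } while (r < min);
    return r % bound;
}

/* Sketch of the suggested posix_random_text(): buffer needs buffer_length + 1 chars. */
static int random_text(rand32_fn src, char *buffer, size_t buffer_length,
                       const char *whitelist)
{
    size_t n = whitelist ? strlen(whitelist) : 0;
    if (n == 0 || n > UINT32_MAX) {
        errno = EINVAL;
        return -1;
    }
    for (size_t i = 0; i < buffer_length; i++)
        buffer[i] = whitelist[uniform_below(src, (uint32_t)n)];
    buffer[buffer_length] = '\0';
    return 0;
}

/* Sketch of the suggested posix_random_shuffle(): Fisher-Yates over raw bytes. */
static int random_shuffle(rand32_fn src, void *data, size_t unit_size,
                          size_t unit_amount)
{
    if (unit_size == 0 || unit_amount > UINT32_MAX) {
        errno = EINVAL;
        return -1;
    }
    unsigned char *base = data;
    for (size_t i = unit_amount; i > 1; i--) {
        size_t j = uniform_below(src, (uint32_t)i);  /* j in [0, i) */
        unsigned char *a = base + (i - 1) * unit_size;
        unsigned char *b = base + j * unit_size;
        for (size_t k = 0; k < unit_size; k++) {     /* swap units i-1 and j */
            unsigned char t = a[k];
            a[k] = b[k];
            b[k] = t;
        }
    }
    return 0;
}
```

A caller might generate an eight-character temporary password with random_text(src, buf, 8, "abcdefghjkmnpqrstuvwxyz23456789"), where buf holds at least nine chars and src is a real cryptographic source rather than demo_source.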

steffen 2015-12-07 12:10 reporter bugnote:0002972	|Above, steffen mentioned adding a posix_random_addrandom() function. I want No. 1. I just put the original interface up for discussion, the one that is in use and that i also mirrored when i implemented my own (C++) interface.. 2. ..which however encapsulated the interface into a self-contained (lockless C++ class) object. Though: there is no "however", since in fact the original implementation is designed like that, but on the inside. The root of the problem being that whatever is not possible on the lowest level can only be derived with enormous effort and costs on higher ones. In the worst case you have to go and reinvent the complete wheel because of that. E.g., you have STDIO and if you transform it into the multithreaded world as-is then you have to ensure each and every operation is locked, even if i personally only ever use a manager/worker thread model and thus know for sure that each STDIO object will never be used concurrently in multiple threads. There are more examples from other areas like that, take format encoders which have to produce a result in a single run and thus have to be hooked on the back via callouts instead of being simple contexts that can be iterated over via restart on the front interface. Devilish. So here there is a random generator, the original design of which being from David Mazieres in 1996, 20 years, more ideas from Ilya Mironov (for the original algorithm) in about 2004 (12 years). The first version already introduced an object based and thus scalable internal implementation (struct arc4_stream). And now that parallelism is so much more common than twenty years ago it is time to offer this option to end-user programmers. That was my suggestion to this issue, that is my intent in general: instead of offering some corked version of an interface, one that may not be sufficient for some particular use cases -- even though the internal interface is very well able to satisfy! 
--, providing access to the full functionality at the same cost. And the interface is still easy to use. The reason for including _stir() and _addrandom() in this respect is just natural, since that is what is used internally: _stir() is implemented by means of calling _addrandom() with data that is fetched from some undefined implementation-dependent entropy source. I.e., many operating systems use very fancy and costly ways to generate random entropy _and_ to protect their pool with expensive message digests etc. This is a science and i'd assume that only a few dozen people in the world can dare to comment on that for real, and even some of those are sometimes proven wrong, at times. Reseeding user-level pseudo-random generators which we talk about in this topic with such entropy at times via _stir() may be useful to e.g. further improve the outcome of the next random generated, or simply to add "more random" into the pool of the object's entropy. The end-user programmer should have the option to define points in time where such an update may be applied, e.g., because blocking is acceptable at the very point in program execution. Maybe there should even be a guarantee that blocking doesn't occur. Maybe it should even be configurable at object generation time whether auto-reseeding is applied or not. On a per-object level. If the standard instead adds yet another corked closed-box function, what sense does that make? How many of those did yet exist? This interface, on the other hand, has the potential to persist on the one hand, and to be usable for all thinkable generic pseudo-random number tasks on the other. And i simply don't accept consciously brought in irresponsible conscious misunderstandings like the one of yours. Just like the one that has been shamelessly published in RFC 6532 and reads "[.] encoding schemes [.] introduce [.] significant opportunities for processing errors"; twenty years after the referenced standard that is used a billion times a day. 
Thank you.

tedu 2016-07-14 14:29 reporter bugnote:0003294	I'd like to formally withdraw this proposal. Introducing a new name for existing functions causes unnecessary confusion and complication. Please close.

dalias 2016-07-14 15:12 reporter bugnote:0003295	Is withdrawing the proposal really necessary? Renaming the proposed interface back to arc4random or whatnot seems like an option if people prefer that, but I would just prefer for this whole bikeshed to end with us adopting _something_, whatever the name, that meets the requirements of having a library-safe (including never-fails property) way to obtain random numbers for cryptographic use.

Date Modified	Username	Field	Change
2014-07-18 13:59	tedu	New Issue
2014-07-18 13:59	tedu	Name	=> Ted Unangst
2014-07-18 13:59	tedu	Organization	=> OpenBSD
2014-07-18 13:59	tedu	Section	=> posix_random
2014-07-18 13:59	tedu	Page Number	=> 0
2014-07-18 13:59	tedu	Line Number	=> 0
2014-07-18 14:04	tedu	Note Added: 0002314
2014-07-18 17:35	mdempsky	Note Added: 0002317
2014-07-18 17:51	mdempsky	Note Added: 0002318
2014-07-18 18:22	tedu	Note Added: 0002319
2014-07-21 02:59	dalias	Note Added: 0002320
2014-07-21 04:07	mdempsky	Note Added: 0002321
2014-07-21 04:41	dalias	Note Added: 0002322
2014-08-21 15:51	eblake	Note Added: 0002352
2014-08-21 15:52	eblake	Note Edited: 0002352
2014-09-12 18:37	eblake	Note Added: 0002385
2014-09-12 18:43	eblake	Note Added: 0002386
2014-09-12 18:57	dalias	Note Added: 0002387
2014-11-21 09:29	ajosey	Note Added: 0002438
2014-12-01 20:36	tedu	Note Added: 0002457
2014-12-01 20:41	dalias	Note Added: 0002458
2014-12-01 20:47	tedu	Note Added: 0002459
2014-12-02 00:13	eblake	Note Added: 0002460
2014-12-02 00:22	eblake	Note Added: 0002461
2014-12-02 00:27	eblake	Relationship added	related to 0000743
2014-12-02 00:35	mdempsky	Note Added: 0002462
2014-12-02 02:28	tedu	Note Added: 0002463
2014-12-02 03:04	dalias	Note Added: 0002464
2014-12-02 03:22	tedu	Note Added: 0002465
2014-12-02 13:04	nsz	Note Added: 0002467
2014-12-04 00:10	tedu	Note Added: 0002473
2014-12-10 08:45	nmav	Note Added: 0002488
2014-12-10 15:28	eblake	Note Added: 0002489
2014-12-10 18:23	safinaskar	Note Added: 0002490
2014-12-10 19:28	nmav	Note Added: 0002491
2014-12-10 20:16	tedu	Note Added: 0002492
2014-12-10 22:33	safinaskar	Note Added: 0002493
2014-12-10 22:58	philip-guenther	Note Added: 0002494
2014-12-12 08:53	nmav	Note Added: 0002497
2014-12-16 13:20	steffen	Note Added: 0002499
2015-03-13 14:59	steffen	Note Added: 0002581
2015-03-13 15:00	steffen	Note Added: 0002582
2015-03-13 15:01	steffen	Note Added: 0002583
2015-03-13 15:02	steffen	Note Edited: 0002582
2015-03-13 15:03	steffen	Note Added: 0002584
2015-03-14 01:00	tedu	Note Added: 0002585
2015-03-14 10:59	nmav	Note Added: 0002586
2015-03-14 11:03	nmav	Note Added: 0002587
2015-03-14 12:18	steffen	Note Added: 0002588
2015-05-05 08:42	nmav	Note Added: 0002647
2015-05-05 11:22	steffen	Note Added: 0002651
2015-05-27 22:25	mirabilos	Note Added: 0002683
2015-05-28 07:09	nmav	Note Added: 0002684
2015-05-29 22:18	mirabilos	Note Added: 0002686
2015-05-30 12:34	steffen	Note Added: 0002687
2015-06-19 08:16	nmav	Note Added: 0002725
2015-08-14 08:26	nmav	Note Added: 0002789
2015-09-17 15:05	EdSchouten	Note Added: 0002830
2015-09-17 15:36	nmav	Note Added: 0002831
2015-09-17 21:42	EdSchouten	Note Added: 0002832
2015-09-17 22:06	eblake	Note Added: 0002833
2015-09-18 06:54	EdSchouten	Note Added: 0002834
2015-09-18 10:14	steffen	Note Added: 0002836
2015-09-18 15:16	tedu	Note Added: 0002837
2015-09-18 16:10	EdSchouten	Note Added: 0002838
2015-09-21 17:23	shware_systems	Note Added: 0002840
2015-09-21 17:33	dalias	Note Added: 0002841
2015-12-06 06:58	Nach	Note Added: 0002971
2015-12-06 07:06	Nach	Note Edited: 0002971
2015-12-06 07:14	Nach	Note Edited: 0002971
2015-12-07 12:10	steffen	Note Added: 0002972
2016-07-14 14:29	tedu	Note Added: 0003294
2016-07-14 15:12	dalias	Note Added: 0003295
2016-07-14 15:13	Don Cragun	Interp Status	=> ---
2016-07-14 15:13	Don Cragun	Status	New => Closed
2016-07-14 15:13	Don Cragun	Resolution	Open => Withdrawn
2017-03-31 07:17	Don Cragun	Relationship added	related to 0001134

Relationships

related to	0000743	Closed		1003.1(2013)/Issue7+TC1	RAND_MAX should guarantee even distribution over a power of 2
related to	0001134	Closed		1003.1(2016/18)/Issue7+TC2	Add getentropy interface