[Note: In the interest of brevity and helpfulness, I sacrificed precision and detail. Also I just got bored trying to correct anything, and I noticed something nearly as soon as I posted it. So...whatever at this point; probably close enough]
The Scarlet Book standard (SACD) effectively forces 'better' behavior on the part of production engineers, while building in more margin for error. Anecdotally, I've heard that the peak meter on a lot of PCM recording and processing tools simply reports the largest absolute value among the incoming samples. Unfortunately, samples rarely land exactly on the maxima or minima of the underlying waveform, which means the analog peak often exceeds the recorded peak value. Note that this does not mean the analog peak is unrecoverable; the familiar strictures of sampling apply, so a properly band-limited signal can still be reconstructed in full. What it does mean is that the downstream playback device needs a certain amount of headroom to avoid clipping, and/or some means of rescaling the input. Historically, most consumer digital playback devices (e.g. CD players, 'DACs,' and so forth) made no such accommodation, so you get clipping. One notable exception is the PMD100, which does rescale by roughly 1dB, so devices using this filter potentially have some headroom.*
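To make the sample-peak versus analog-peak gap concrete, here is a minimal sketch. It assumes a worst-case style test tone (a sine at one quarter of the sample rate, offset by 45 degrees) and a Hann-windowed sinc interpolator that I picked purely for illustration; the function names and parameters are hypothetical and this is not how any particular meter or DAC does it.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sample_peak(samples):
    """What a naive peak meter reports: the largest absolute sample value."""
    return max(abs(s) for s in samples)

def estimated_true_peak(samples, oversample=4, half_width=32):
    """Estimate the reconstructed peak by windowed-sinc interpolation.

    Evaluates the signal at `oversample` points per sample period, using
    2*half_width neighbouring samples and a Hann window. The edges of the
    input are skipped so every evaluation has a full window of neighbours.
    """
    peak = 0.0
    for i in range(half_width, len(samples) - half_width):
        for j in range(oversample):
            t = i + j / oversample
            acc = 0.0
            for k in range(i - half_width + 1, i + half_width + 1):
                d = t - k
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                acc += samples[k] * sinc(d) * window
            peak = max(peak, abs(acc))
    return peak

# Sine at fs/4 with a 45-degree phase offset: every sample lands at about
# 0.707 of the true amplitude, never on the crest itself.
n = 256
tone = [math.sin(2 * math.pi * 0.25 * k + math.pi / 4) for k in range(n)]

sp = sample_peak(tone)
tp = estimated_true_peak(tone)
print(f"sample peak:         {sp:.4f}")
print(f"estimated true peak: {tp:.4f}")
print(f"difference:          {20 * math.log10(tp / sp):.2f} dB")
```

For this tone the meter sees samples at about 0.707 of a waveform whose crest is at 1.0, so a track normalized to 0dBFS by sample peak would ask the playback chain for roughly 3dB it may not have.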
The development of the SACD standard took a more systems-oriented approach and ultimately imposes a measure of conservatism on attempts to maximize peak and average amplitude. It restricts what can pass as "legal" data for the purposes of producing an SACD; that is, it imposes a concrete penalty (the SACD won't go to press) if one passes illegal data. Chiefly, it restricts peak modulation depth, as measured by an unweighted moving average over, I think, 28 samples (I don't have the specification in front of me, so I'm going by memory). An industry rule of thumb for delta-sigma (DS) modulators is that they are adequately linear up to about 50% modulation depth (MD), and 50% is what was ultimately chosen for 0dBFS. In practice, acceptable linearity tends to persist somewhat beyond 50% MD for most modulators in commercial devices, while the SB restriction works out to (IIRC) something like 71% MD, or about +3dBFS. So some amount of headroom above 0dBFS can be built in on the playback side when following the specification, though this comes at some cost to maximizing e.g. the SNR figure specified at 0dBFS, so it isn't guaranteed that a manufacturer will opt to do so.**
Secondly, a simple unweighted average in this case is going to integrate out-of-band noise and as a result will overestimate the analog peak, with this effect tending to increase with the use of dynamic compression. Third, 0dBFS is still the canonical target peak, on the presumption that downstream headroom will not be built in.
If you combine these factors, what you get is some reduction in the likelihood of clipping on the playback side (though the degree will vary between devices), due to (1) the potential for violating the restrictions when following popular conventions in PCM recording, and (2) the likelihood of having some measure of headroom built into the playback system. So, given these considerations, why did I express the prior opinion in the other post?
Because sensible restrictions can be built into a PCM recording system too (along with safeguards in the playback system), and the result is superior performance.
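For anyone who wants to play with the modulation-depth numbers above, here is a rough sketch. The 28-sample window is the figure I recalled from memory, the first-order modulator is a toy stand-in (real SACD encoders use much higher-order designs), and every function name here is made up for illustration; none of this is the actual Scarlet Book verification procedure.

```python
import math

def first_order_dsm(signal):
    """Toy first-order delta-sigma modulator; returns a list of +1/-1 bits."""
    state = 0.0
    bits = []
    for x in signal:
        bit = 1 if state >= 0 else -1
        bits.append(bit)
        state += x - bit  # integrate the error between input and output
    return bits

def peak_modulation_depth(bits, window=28):
    """Peak absolute value of an unweighted moving average over `window` bits.

    `window=28` is an assumption taken from memory, not a verified spec value.
    """
    if len(bits) < window:
        raise ValueError("need at least one full window of bits")
    running = sum(bits[:window])
    peak = abs(running)
    for i in range(window, len(bits)):
        running += bits[i] - bits[i - window]
        peak = max(peak, abs(running))
    return peak / window

def md_to_dbfs(md, reference_md=0.5):
    """Express a modulation depth in dB relative to 50% MD (taken as 0dBFS)."""
    return 20 * math.log10(md / reference_md)

# A 1 kHz sine at 50% MD, sampled at 64 x 44.1 kHz (the DSD64 rate).
fs = 64 * 44100
tone = [0.5 * math.sin(2 * math.pi * 1000 * k / fs) for k in range(5000)]
bits = first_order_dsm(tone)
measured_md = peak_modulation_depth(bits)

print(f"measured peak MD: {measured_md:.3f} ({md_to_dbfs(measured_md):+.2f} dBFS)")
print(f"71% MD in dBFS:   {md_to_dbfs(0.71):+.2f}")
```

The second printed line is just the arithmetic behind the "+3dBFS" figure: 20*log10(0.71/0.5). The first line depends on the shaped noise that leaks through the short average, so the measured depth for a nominal 50% tone need not come out at exactly 50%.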
*Another example would be a 'NOS' R-2R DAC in cases where the reconstruction filter has adequate headroom. I don't recommend this solution, though: such DACs do not adequately filter the images above Nyquist (often loosely called aliases), which in turn intermodulate with passband content to produce in-band distortion and noise.
**Nor are manufacturers necessarily forthcoming about whether they opted to do so. Note that the ESS Sabre's modulator is linear beyond 71% MD and, as I recall, this particular converter does build in headroom, although, so far as I understand, it does so for both DSD and PCM data.
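On the NOS footnote, a quick back-of-the-envelope sketch of why the unfiltered spectral replicas are a concern. It assumes an ideal zero-order hold as the only "filter" (no analog stage at all), which is a simplification and not a model of any particular DAC; the function name is hypothetical.

```python
import math

def zoh_gain_db(freq_hz, fs_hz):
    """Magnitude response of an ideal zero-order hold, in dB.

    |H(f)| = |sin(pi*f/fs) / (pi*f/fs)|, with unity gain at DC.
    """
    x = freq_hz / fs_hz
    if x == 0:
        return 0.0
    return 20 * math.log10(abs(math.sin(math.pi * x) / (math.pi * x)))

fs = 44100.0
tone = 20000.0
image = fs - tone  # first replica of the tone, mirrored around fs/2

tone_db = zoh_gain_db(tone, fs)
image_db = zoh_gain_db(image, fs)
print(f"{tone:.0f} Hz tone:   {tone_db:.2f} dB")
print(f"{image:.0f} Hz replica: {image_db:.2f} dB")
print(f"replica sits {tone_db - image_db:.2f} dB below the tone")
```

With only the hold's sinc rolloff in play, the replica of a 20 kHz tone comes out less than 2dB below the tone itself, so whatever follows the DAC gets handed nearly full-level ultrasonic content.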