Working ITB at higher sampling rates

Recently, I moved from a 44.1kHz to a 96kHz sampling rate for my current production. I would have loved to make this step earlier, but it wasn’t possible with the older DAW generation in my case. With the newer setup I could easily run a 44.1kHz based production with tons of headroom left (resource-wise: CPU, memory and disk space), so I switched to 96kHz and there is still some room to spare.

I know there is a lot of confusion and misinformation floating around about this topic, so this small article aims to give some theoretical insight from a developer’s perspective as well as some hands-on tips for everyone who is wondering which SR to actually work at. As the title suggests, this is about working ITB (In The Box); I’ll exclude SR topics related to recording, AD/DA converters or other external digital devices.

Why favor a higher SR at all?

Working in the digital audio domain, there are quite a number of reasons to increase the underlying sampling rate of the system. First of all, most feedback-based algorithms take advantage of a higher SR, resulting in better perceived audio quality. Most prominently, all types of IIR filters are affected, especially (but not limited to) the case where very steep slopes are computed. As an added bonus, the frequency warping of such filters is pushed outside the hearing spectrum at a higher SR.
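To illustrate the warping point, here is a minimal sketch (assuming Python with numpy/scipy, which are not part of this article) that designs the same nominal 16kHz resonant filter at 44.1kHz and at 96kHz and compares the responses in the audible band; the deviation comes from the bilinear-transform warping near Nyquist discussed above.

```python
import numpy as np
from scipy.signal import iirpeak, freqz

def peak_response_db(center_hz, q, sr, freqs_hz):
    # iirpeak designs a digital resonator via the bilinear transform;
    # w0 is the center frequency normalized so that 1.0 equals Nyquist.
    b, a = iirpeak(center_hz / (sr / 2), q)
    _, h = freqz(b, a, worN=freqs_hz, fs=sr)
    return 20 * np.log10(np.abs(h) + 1e-12)

freqs_hz = np.linspace(1000, 20000, 200)          # the audible band we care about
resp_44 = peak_response_db(16000, 2.0, 44100, freqs_hz)
resp_96 = peak_response_db(16000, 2.0, 96000, freqs_hz)

# The same nominal 16 kHz resonance gets squeezed against Nyquist at 44.1 kHz
# but keeps a much more analog-like shape at 96 kHz.
print("max deviation in dB:", np.max(np.abs(resp_44 - resp_96)))
```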

Whenever modulation at audio rate occurs, there will be serious distortion. This can be minimized to some extent at a higher SR, so a wide range of digital audio applications capitalizes on it: FM oscillators, ring modulation, compression and limiting – just to name a few. The so-called IMD (intermodulation distortion) is even worse in the digital domain since aliasing gets introduced as well: whenever the newly created distortion content exceeds the Nyquist frequency, it gets folded back into the spectrum below. Increasing the SR immediately relieves this effect. In general, this applies to all kinds of non-linear processing, and therefore all types of saturation and distortion algorithms benefit from a higher SR.
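As a back-of-the-envelope illustration of the fold-back, the following sketch (plain Python, with an arbitrarily chosen 7kHz tone as the example) lists where the harmonics created by a saturator would land at 44.1kHz versus 96kHz.

```python
def folded(freq_hz: float, sr: float) -> float:
    """Reflect a frequency back into the 0..Nyquist band of a sampled system."""
    nyq = sr / 2
    freq_hz = freq_hz % sr                 # wrap around the sampling rate first
    return sr - freq_hz if freq_hz > nyq else freq_hz

f0 = 7000.0                                # fundamental of the saturated tone
for sr in (44100.0, 96000.0):
    harmonics = [k * f0 for k in range(1, 8)]
    print(int(sr), [round(folded(h, sr)) for h in harmonics])

# At 44.1 kHz the 4th harmonic (28 kHz) already folds down to 16.1 kHz, right
# inside the audible band. At 96 kHz the first six harmonics stay below Nyquist
# and the 7th folds to 47 kHz, far above the audible range and easy to remove
# when down-sampling later.
```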

Choosing an SR to work at

In practice, this is basically a matter of how many (computational) resources are available. If your DAW allows you to work at a very high SR, I would simply say: go for it. There are several constraints, though. First and foremost, you are limited to what your host (a sequencer, a sample editor) actually supports, which in most cases is exactly what the underlying audio hardware supports. Secondly, the same goes for all the instrument and effect plug-ins you are using, which leads to the question:

Why are some plug-ins limited to certain SRs?

Some plug-in developers pre-compute certain algorithm coefficients (e.g. for a filter network) to save DSP resources, or they pre-calculate look-up tables for complex functions, just as an example. Where such parameters are SR dependent, they have to choose a specific set of SRs they are going to support. VoS plug-ins don’t use such techniques and run at any arbitrary SR from 44.1kHz upwards.
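For illustration only, here is a tiny sketch of why pre-computed, SR-dependent tables tie a plug-in to a fixed set of rates; the coefficient function and the rate list are hypothetical and not taken from any actual plug-in.

```python
import math

SUPPORTED_RATES = (44100, 48000, 88200, 96000)    # hypothetical supported set

def smoothing_coeff(cutoff_hz: float, sr: float) -> float:
    # Classic one-pole smoothing coefficient; note the explicit SR dependence.
    return math.exp(-2.0 * math.pi * cutoff_hz / sr)

# Pre-computed once at build/init time to save per-sample work:
COEFF_TABLE = {sr: smoothing_coeff(5000.0, sr) for sr in SUPPORTED_RATES}

def coeff_for(sr: int) -> float:
    if sr not in COEFF_TABLE:
        # Any rate outside the table means refusing to run (or recomputing),
        # which is exactly the limitation described above.
        raise ValueError(f"{sr} Hz is not a supported sampling rate")
    return COEFF_TABLE[sr]
```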

Oversampling a plug-in vs. changing the project’s host SR

Some plug-ins offer internal oversampling (re-sampling the audio signal to a higher SR) as an alternative to a globally higher SR in the host. As always in life, there is a serious tradeoff, though. Re-sampling a signal requires steep filtering at Nyquist, which in general is a compromise between computational cost, steepness of the filter, ripple in the passband and the resulting impulse response of the filter. For instance, if a plug-in uses a linear-phase FIR filter to obtain very steep filtering, pre-ringing occurs. If IIR filtering is used instead, the filter might not be steep enough to eliminate the aliased content. However, the additional computational cost is a “local” phenomenon, as opposed to an overall higher SR which raises the CPU demand of all software running in the host.
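The following sketch (again assuming numpy/scipy) shows roughly what such “local” oversampling around a non-linear stage looks like: upsample with a polyphase, linear-phase FIR, process, then decimate back. It illustrates the principle only and is not any specific plug-in’s implementation.

```python
import numpy as np
from scipy.signal import resample_poly

def saturate_oversampled(block: np.ndarray, factor: int = 4) -> np.ndarray:
    up = resample_poly(block, factor, 1)        # upsample; image-rejecting linear-phase FIR
    shaped = np.tanh(3.0 * up)                  # the actual non-linear processing
    return resample_poly(shaped, 1, factor)     # decimate; anti-aliasing FIR

sr = 44100
t = np.arange(sr) / sr
dry = 0.8 * np.sin(2 * np.pi * 7000 * t)        # a tone whose harmonics would alias
wet = saturate_oversampled(dry)                 # far less fold-back than plain np.tanh(dry)
```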

Running an overall higher SR, on the other hand, avoids tons of up- and down-sampling and all the artifacts typically introduced by doing so; it just requires a single down-sampling step at the end, which can be done with a high-quality offline sample rate converter. If computing power is not an issue, my vote clearly goes to using a higher SR instead of local re-sampling.

Realtime vs. offline SR conversion

The usual plug-in-internal realtime up- and down-sampling is typically based on a “double/half sampling” approach which simply doubles and afterwards halves the sampling rate, leading to 2x, 4x, 8x, 16x (and so on) effective internal sampling rates. Computationally this is a little cheaper than re-sampling between arbitrary SRs. For an offline re-sampler this is not an issue, and it can also invest way more CPU cycles into high-quality interpolation/filtering. Some quality comparisons are available online at http://src.infinitewave.ca/. Note that with r8brain and SoX you already get top-quality SR converters in the freeware domain, so quality-wise there is no need for commercial solutions.
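For completeness, a conversion between “unrelated” rates is still just a rational-ratio polyphase job: 96kHz to 44.1kHz reduces to 147/320. The sketch below uses scipy’s resample_poly purely for illustration; dedicated tools like r8brain or SoX remain the recommendation for final masters.

```python
import numpy as np
from math import gcd
from scipy.signal import resample_poly

def offline_src(x: np.ndarray, sr_in: int, sr_out: int) -> np.ndarray:
    g = gcd(sr_in, sr_out)                              # gcd(96000, 44100) = 300
    return resample_poly(x, sr_out // g, sr_in // g)    # up = 147, down = 320

mix_96k = np.random.randn(10 * 96000)                   # stand-in for a rendered 96 kHz mix
mix_441 = offline_src(mix_96k, 96000, 44100)            # roughly 10 * 44100 samples out
```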

88.2kHz vs. 96kHz when coming from 44.1kHz

The quality of the aforementioned offline re-samplers is actually so good that there is no reason to stick to the simpler double-sampling ratio (i.e. 88.2kHz) when coming from 44.1kHz. If the CPU allows it, go to 96kHz right away.

Migrating a project

In theory, one might expect to migrate an existing project from a lower to a higher SR without any further issues or manual intervention. In reality it’s not that easy. I’ve converted entire projects from lower to higher SR; the difference is not all that dramatic, but the mix needed to be revised and fixed after the conversion.

This is mostly related to filter coefficient computation and filter warping near the Nyquist frequency, which moves after an SR change. As an example, the main filter in NastyDLA changes slightly near Nyquist after an SR change, causing slightly more self-oscillation. There are countless other examples in all kinds of available plug-ins, free or commercial, so better check your project after conversion.

Higher SR just during mastering?

If your system can’t handle an overall higher-SR production, is it still worthwhile to use a higher SR for the mastering process only? Absolutely yes! Just up-sample the mix and perform all the mastering tasks such as equalizing, compression and maximizing; as already discussed, they all benefit from oversampling. At the end of the chain, render the file with your host and then down-sample it to 44.1kHz with a high-quality offline SR converter.
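Sketched as a minimal workflow (the 96kHz processing stage is only a placeholder here, not a real mastering chain):

```python
import numpy as np
from scipy.signal import resample_poly

def master_via_96k(mix_441: np.ndarray) -> np.ndarray:
    hi = resample_poly(mix_441, 320, 147)   # 1. up-sample 44.1 kHz -> 96 kHz
    hi = np.tanh(1.2 * hi)                  # 2. placeholder for the EQ/comp/limit chain at 96 kHz
    lo = resample_poly(hi, 147, 320)        # 3. down-sample back to 44.1 kHz at the very end
    return lo                               # 4. dither to 16 bit only after this step
```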

Comments

  1. Léo Saramago says:

    The idea of higher resolution in audio achieved by increasing samplerate is a misconception. The basic question is: how do we get one single sample? It’s via windows of measurement – statistics. After some point (around 50kHz/60kHz), signal representation becomes more and more influenced by spectral leakage, and, of course, the time for each window of analysis becomes smaller. So, is there any difference towards final results? Sure, but there’s no more accuracy, especially regarding low-end information. The extra high-end people claim to perceive is actually distortion.

    • You did not get it. The higher quality does not come from the audio file itself, it comes from the digital processing at higher SR.

      • Léo Saramago says:

        Yes, a higher SR for processing could potentially lead to better results. But that one sentence you’ve just written is an important piece that was missing, and no matter how hard we try there’s always room for confusion – hardware manufacturers will prey on that old misconception.

        • Nothing was missing. We did understand it perfectly. It is that simple. Working at a higher audio rate will likely yield better results most of the time. It’s a great article. Nothing new to me, but nicely explained.

      • kohugaly says:

        Low-frequency resolution loss does relate to samplerate. As you know, IIR filters perform filtering by multiplying previous outputs (and inputs) and adding them to the current one. The coefficients get very small when dealing with low frequencies, and summing small values with big ones results in precision loss. If you double the samplerate you double the number of calculations, plus you have to halve the normalized cutoff frequency (as in the digital domain 0 = DC and 1 = Nyquist) => even smaller precision. That’s why filters dealing with low frequencies almost always work internally at 64-bit precision. That becomes necessary at higher samplerates. On the other hand, at high samplerates the response of the filters becomes more “analog”.

  2. Agreed, Varosound. And everybody working ITB should know all of that. However, if your computer cannot cope with a higher SR, I say not to worry, since the difference in results is on the “audio purist” side. Having said that, I have worked at 96kHz only for years now, and I remember noticing various kinds of improvement in sound and mixes when I changed to it. I’d say it’s especially noticeable with compressors, limiters, saturation, filters… hmmm well, you said that already.

  3. Sorry to ask here. Most of your plugins are 4x (right?) oversampled automatically. Does that mean the oversampling factor is constant, or that the upsampled SR is fixed? Because in the first case, working at upsampled host rates would stress the CPU enormously, since it would upsample to, say, 384kHz, which is mostly unnecessary (even if some plugin developers allow for much more)!
    thanks

    • I also want to know the answer to this question! An answer for both the currently supported plugins and for your Nasty series (sorely missed, especially NastyCS and tabletop). Sure, the algorithms may not be as complex as your newer plugs, but they deliver fast, solid results.

      • +1, I want to know this as well. IMO, plugs should never have oversampling by default; it should always be an option, like Voxengo does.

        • Indeed, if a VoS plug-in does have OS then it is fixed, and it does not switch off at higher host SRs.

          • Would you add support for switching?

          • Then, even with a better DAW, the overall CPU hit won’t be lowered by using your plugin… this is one of the reasons why I never switched to higher SRs, even with an i7 Sandy Bridge. Basically I would have to double the buffer, or triple it, to compensate for the CPU, AND I would still get exceptional CPU consumption.
            Well, since you are on the way to TWO releases…
            🙂
            fixed maximum SR
            or
            switch
            could be a great plus…
            (especially if you consider that a channel strip plugin could have many, many instances in a single project..)
            I don’t know anything about it, but I noticed some plugin developers use highly optimised algorithms (minimum phase, of course) that are hardly noticeable on the CPU. Probably the ripple or the frequency range isn’t the best out there, but for non-mastering purposes it’d be the best!
            thanks for your response, your attention & your great job!!
            enrico

  4. Great article!

    One question I have now: at which stage do you apply dithering if you work on a project at an SR above 44.1k and want to render the final mix to CD quality (44.1kHz, 16bit)?

    Thanks

    • Dithering is a matter of bit reduction and not SR reduction so you will do that exactly at the stage where you render to 16bit.

      • Thanks for your fast answer and again for this great abstract about SR.

        (of course I’m aware that bit reduction and sample rate reduction are separate processes, but we often tend to combine these factors when talking about the “quality” of an audio recording, e.g. 24bit/96kHz ; 16bit/44.1kHz – that’s why the question is closely related to your abstract)

        Looking forward to your DC 2012 entry!!

  5. Great article, very good read!

    I’m happy mixing @ 88.2 because somehow the CPU hit when running 96 is way too much for my i5-2500 to handle. When it comes to mastering I go for 96 and it’s totally worth it; it feels a lot more precise, especially when EQ’ing.

    I should also note that some of you guys might get better results by using SRC software such as Voxengo’s r8brain instead of converting the SRs on your DAWs. In my experience Cubase is notably bad at it while Samplitude does it a little better. Don’t know about other DAWs though, so you guys might wanna test that.

  6. You don’t want to change sample rates after limiting and dithering to 16 bit. If you render a file at 16bit/96kHz and SRC it to 44.1kHz, it will cause overshoots in the file that weren’t there in the first place.

    • will check that back …

    • Chris, you are right.

      I don’t know if that will always happen, but I noticed it in both Red Rooster and WaveLab.

      I guess the conversion attempts to rebuild the cut-off peaks or maybe it just runs into problems rounding the numbers at too low a bitdepth.

      In the end it is always better to sample down and then dither to 16 bit, because your sample rate conversion will be more accurate at the higher bit depth.

      Leave at least 0.2 – 0.3 dB of headroom (by setting the ceiling accordingly) when using a brickwall limiter anyway, as radio stations and also consumer electronics frequently sample up/down, causing the same unintentional distortions for your listeners.

  7. Seriously, Boosty, it does. If you are mastering at high sampling rates: EQ and compress first, then render the file (32-bit float, 96kHz), downsample to 44.1, then load the 32/44.1 file back into the DAW, then limit and dither to 16/44.1. That’s how I do it.

  8. @Jeroen: The first time I noticed it was a couple of years back. I was bouncing down to 16/44 from 32/96, and when I played the file back there were all sorts of compression artifacts (pumping). I think what you say about the reconstruction phase is true; it adds maybe 0.3–0.4 dB to the peaks.

    • Hi Chris,

      TC electronic did some research a few years ago and it turned out that consumer electronics output distorted and clipped audio because of the oversampling used.

      When using brickwall limiting you chip off peaks. As a result, interpolating between samples might return values higher than 0dBFS. In most cases this would cause overs fed into the D-A stage. Some developers did actually fight this with higher-bit-depth converters. Sometimes the analog circuit of the CD players did not have enough headroom to take the D-A’s output.

      I guess the effect you experienced is basically the same, as the same kind of interpolation is involved.

      TC (and also Sony and ToneBoosters) make limiters that prevent these overshoots by oversampling the sidechain of the limiter.

      If you want your audio to sound clean on all devices, use such a limiter or keep a healthy 0.3 dB of headroom before dithering.

      Something I didn’t mention before is that dither is also added to your signal. And although it is a tiny signal, it might just cause some overshoot on a master already brickwalled to 0 dBFS!

  9. After reading the article, it pretty much answered all the questions from my mails, including the ones from the comments about the “fixed OS” and whether it still applies at higher SRs or not (though this is only mentioned in the comments here). The post should maybe be updated with that info.

    Another thing that might be good to know (also for this article) is that when doing offline upsampling from 44kHz to 96kHz, no content is added; the resolution just gets finer, which in turn is then used for mastering. Plugins with 4x internal OS can then benefit from that. So a 96kHz production can now be processed at an “internal” 384kHz – which is more than suitable. In theory (and to my understanding), if I oversample that plugin even further with an external OS plugin, I should get twice as much OS (so instead of 4x, I get 8x) – but this is just theory.

    And if we talk about 16x OS, we already get close to the editing SR of SACD.
    http://en.wikipedia.org/wiki/Sampling_rate

    In that regard, there is an article, or rather a video, about this topic:

    In this example, the presenter shows what’s actually happening with both upsampling and downsampling of pure content only. We’re not talking about plugins, just audio files. The differences are marginal (around +/-2dB), in non-audible ranges or drowned in the noise floor. So in the end it doesn’t matter WHAT sampling rate you use – the sound is only slightly different. The ADC/DAC is more important (is it linear enough? did you record at a suitable bit depth?) – but this is not the topic of this article.

    However – this is also the reason why properly recorded and edited samples at 44kHz or 48kHz work in every production, be it clean (without edits) or altered through plugins. So we have to(!) keep that in mind.

    Yet another thing I’d like to address is the fact that a jump from 44.1kHz to 96kHz is definitely more drastic than going from 48kHz to 96kHz. I wouldn’t go by an odd multiplier, but rather an even one.

    Personally, I work at 48kHz with OS plugins, even though I have an AD/DA that is capable of going way higher (with the side effect of losing ADAT slots) and I have an i7 920. But other than for mastering and reconstructing tape material, I don’t see a reason to go higher (yet). Especially if the final product is still a 44kHz MP3/AAC, or 48kHz at most – since that’s the most widespread and consumed audio format.

    In this context I also have to repeat what I wrote on KVRaudio:
    Does it matter if we still talk about masterings with RMS values of -6dB(FS)?

    We should rather focus on something that is more important than worrying about SRC. Loudness being one.

    Thanks for reading, and thanks for writing/updating that article. Definitely appreciated.

    • About the loudness, that was my very same thought, and since the topic here was strictly theoretical, about SR conversion, I didn’t say a thing. But I had the same reaction as you: why worry about data loss if, working at K-12 or K-14, there’s usually still a lot of room for further compression? In the real world, far from technical compromises and idealistic (loudness) wars, here’s my solution: if the situation deserves it, just run a high SR and totally uncompressed audio (live performances, installations, presentations, accurate listening); it can be considered a reference version as well. If not, just use compressed audio AND a compressed format: the song for everyone’s MP3 player with everyone’s cans needs to be specifically tailored, especially for electronic music. Look at Kraftwerk, or Robert Lippok: their (older) beats can sound either amazing or shit depending on what system you have. That’s because the peak-to-RMS ratio requires a better amp to give more gain and achieve more punch; otherwise it’s perceived as thin.
      That said, if you find yourself jogging with Kraftwerk or Rococo Rot in your ears, you have probably found Eternal Peace and won’t need to struggle with such things..

      • Well, if we look purely at the topic “sampling rate within the box (ITB)”, it does make sense to either utilize tools that have excellently coded OS (I count VoS and Slate Digital among these developers), or to use 96kHz right from the beginning (if you can use it). The benefits were clearly stated. Though I kind of missed in the article that at higher sampling rates OS can theoretically be turned off, or wouldn’t make much sense anymore at sampling rates higher than 192kHz. Hard-driven distortion and compression being the exception, of course.

        But if we look outside of the small picture, outside of this frame, and take a look at the whole thing, there is much more to take into consideration.

        For example:
        – mid- to high-class ADCs/DACs offer OS on both input and output (RME and Behringer’s UltraMatch, for example, with at least 64x OS), but is your ADC/DAC linear enough? Where do things start to go downhill?
        – in this case, -18dBFS as a reference level (ideal working level) exists for a reason, and we are also talking about working at a reasonable level, not constantly at 0dBFS.
        – as the YouTube video confirmed (the one I posted earlier), at such levels the SR only defines the “resolution” of the recording. The difference between 48kHz and 96kHz is marginal (+/-2dB), in non-audible ranges or drowned in the noise floor. So the difference is hard to spot while A/Bing (at least if we go by even multiplication). Some can hear it, most people won’t.

        There still exist excellent digital consoles that work at 48kHz only, or even 44kHz. Yet these productions, may they be fully dynamic or compressed to their limits (beyond K-12!), still sound great compared to HD recordings. Or what about the sample packs provided these days? A very large portion (about 98%) of them are still available at either 44kHz or 48kHz at most. Yet do you hear any difference if they are used in a 96kHz project? The same goes for VST plugins from yesteryear. Granted, those without OS can benefit from higher sampling rates. But even without OS, EQs can sound just as good (only the internal bit depth might cause problems – a different topic altogether!).

        Then again, can Average Joe even hear or appreciate all that? Does the consumer have the “ears” (talking about the iPod earbud generation) or the equipment and acoustically corrected room to really benefit from HD recordings? We’re way beyond the times of 8kHz, 16kHz and 22kHz as resolution, so we barely hear any difference anymore.

        Do you think large-scale studios constantly work at >96kHz and ITB? Especially if we talk about surround mixes for cinema at, like… 11.2? This would definitely break the mold, especially financially. In this particular case, the audio stream goes OTB, is then edited and simply recorded at a higher sampling rate, which is in turn converted to suitable Blu-ray/DVD audio streams, or lower-resolution files like 44.1 WAV or MP3/AAC. The same happens with music post-production or standard 2.0 mastering. Simply to save resources – even with modern rigs.

        There is so much more to the equation of “HD audio”. Bit depth (at least 20 bit! the lower the bits, the more degraded the sound, on top of having a lower dynamic range), sampling rate (at least 44kHz; 22kHz sounds a bit dull, going lower removes upper frequencies – going higher than 44kHz only offers a finer resolution but not necessarily a noticeably better sound), ADC/DAC (both in terms of recording/editing and playback on the consumer side), loudness (even at K-12 things start to degrade).

        In this case, it doesn’t matter (IMO) what you use – if you don’t stick to certain well-established rules, or constantly overdo it, even the highest sampling rate and bit depth can’t help you and your production.

        What could make a definite difference is a recording and playback system that is capable of reproducing a frequency range of, for example, 10Hz-32kHz. Since OTB tape machines work on a purely analog basis, they could in “theory” reproduce that, but ultimately fail due to the limitations of the medium (the tape). There are already headphones and speakers out there that can play back from 15Hz to 35kHz, and they do indeed sound crisper than standard 20Hz-20kHz ones (or even those that go from 22Hz to 19kHz) – sometimes too crisp for my liking. But our ADCs/DACs (recording module, MP3 player) are still mostly locked to 20Hz-20kHz, and not everyone can hear in these ranges (we rather feel them).

        THIS, however, is still far from being realized, lies in a whole different ballpark and goes way off topic from the initial article.

        SUMMARY:
        OS has setbacks, but if coded well it can work just as well. A higher SR has its benefits compared to plain OS, but the CPU tax can be uneconomical. Solution: go for what works best for you. The rest is written on a different piece of paper. But we shouldn’t overlook these parts of the equation.

  10. Man THANK YOU for this article! It cleared up a lot of questions that I had regarding sample rate. I’m still rockin an old dual-core so my CPU will definitely struggle with higher SR but at least now I understand the benefits more and know what to listen for. Thanks again!

  11. Is it possible to tell if a specific VST supports higher SRs – and what would happen if, for example, you have a mastering chain with one VST that is fixed at 44.1kHz while the rest are capable of 96kHz?

    • With DirectX plug-ins I experienced system crashes when loading plug-ins at the wrong SR, so you don’t want to try this during a live session. With VST I never got that kind of trouble.

      In WaveLab, plug-ins that do not support the samplerate in use are either not loaded (with a message about the samplerate) or they are loaded but bypassed (as is the case with PowerCore plug-ins).

      I think other DAWs do something similar.

      So I guess with VST plug-ins it is safe to try (after you save your project!).

      Obviously you cannot mix different SRs in your chain without SRC plug-ins in between!

      • Thanks Jeroen – it makes sense that the DAW would report an issue. I managed to load my typical chain with the DAW (FL Studio) and soundcard set to 96kHz, and all seems to be working well.

        I tried looking for a VST properties inspector, but none of them seem to report the supported rates (including FL Studio, Audiomulch, Bidule) – just a case of “suck it and see” I suppose 🙂

  12. Interesting. I wonder if the real-world audio difference between running at 44.1 or 48 compared to 96k justifies the large CPU/hard disk hit that it takes to run at that SR. It would have been nice to have had some thoughts on that outside of the theory.

    • Jeroen Schilder says:

      Hi Dan,

      It can really make a huge difference even if your end product is just 44.1 kc, but it all depends on what you are working with and how you use it. Especially if you are working with prosumer-type hardware, like most of us do, you can run into problems with AD at 44.1 kc.

      For an AD converter, sampling at 44.1 is very hard to do as the Nyquist frequency is very close to the highest audible frequencies. A very steep brickwall filter is necessary to avoid aliasing, which produces very nasty distortions. A lot of mid-priced A-D converters are mathematically compromised when it comes down to this anti-aliasing filter.

      Sampling at 88.2 or 96 kc does not need the filter to be that close to audible frequencies, and in most cases using a cheap AD at 96 or 88.2 and then downsampling to 44.1 with a program like r8brain will give better results: clearer highs, less colouring by phase shifts, massively improved stereo imaging and less aliasing. The offline filtering is just better than the filter in the chips. Of course this all does not apply if you use a mastering-grade AD.

      Standard digital equalizers cannot filter properly close to the Nyquist frequency (22.05 kc at 44.1 kc SR), so if you need EQ in the upper octave a typical plug-in EQ won’t be any good. This does not apply to EQs that upsample/downsample your audio during processing, but they will actually use more CPU as they’ll also have to do the samplerate conversion on the fly… Definitely no reason to stay at 44.1 kc here.

      Also a lot of compressors actually sound smoother on higher samplerates, especially the feedback type compressors, and they are loved for their sound. Nowadays most of them are upsampled too.

      So 96 kc brings you better defined AD conversion and probably better sounding audio processing, even if you eventually downsample to 44.1 kc.

      Please note that I have a mastering point of view; when you are tracking and mixing you might have other considerations. Running 24+ tracks at 96 kc will not be possible on all systems. Maybe if you record at 44.1 or 48 kc/24 bit and process with upsampled plug-ins, you will get really close to your ideal as well, as long as you are prepared to invest in some very nice (and expensive) AD.

      In today’s market with cheap HD and DDR you could ask yourself what in the end would be more economical… Do you have to be able to record 24 channels at once? In that case, upgrading your computer to do more number crunching might be a lot cheaper than replacing hardware that works fine at 96 kc!

      • Interesting. Thanks Jeroen. However, I recently noticed that even with good hardware I’m better off sampling at 96kHz to begin with.

        I record through my Apogee Mini-Me into an RME AiO. I was doing some tests recently to see if I could start working at 96kHz all the way. I can’t yet, unfortunately, because of some cpu hungry synths that I love too much… But in the process I found out that I get audibly better results if I track at 96kHz and then downsample to 48kHz (with r8brain pro) for mixing, instead of recording straight to 48kHz. I was quite surprised by this.

        The difference is subtle but audible. I record flutes mostly, and this makes the highs a bit smoother. I can especially notice that the breath noise sounds more natural this way. Maybe it’s the Mini-Me that is too old by modern standards?

        • Hi Sylvan,

          Well, I guess this is your own personal test of your Mini-Me’s anti-aliasing filter. Theoretically, sampling at 48 kc straight away should sound better as you don’t have to do a downsampling step. If downsampling to 48 kc sounds better than recording at 48 kc straight away, then this filter isn’t the best.

          This is exactly what I meant. Mid-priced ADs (yeah, I know the Mini-Me had a pretty high price tag, but it isn’t a Prism either) are compromised. I’ve been sampling at 96 kc and downsampling for years as the results are better, even with the latest stuff.

          Do you know the SACD? SACD sounds better than CD, but oddly, when you play back an SACD on a normal 16-bit player it also sounds better than the 16-bit/44.1 kc CD. I wonder why…

          I also have a question for you: why would someone use 48 kc?

          Cheers

          Jeroen

          • Haha, good question 🙂
            I started this with my previous recording setup, which was of lower quality than my current one, for reasons similar to what I described. I could hear a small audible improvement when working at 48kHz all the way and then downsampling to 44.1 at the end (still with r8brain pro). Since it’s not much of a CPU hit from 44.1 to 48kHz, I could afford it.
            I just stuck with that way of working until now without questioning it again… But yeah, I guess now I can just downsample my live recordings from 96 to 44.1kHz and stick to that. But I’ll do a few tests just to make sure. Old habits die hard…

  13. I still don’t really get it. What keeps me asking is the fact that I don’t only record audio.
    I use a lot of samples that are of course mostly at 44.1. So if I am going to load those samples into Logic, the project needs to be at 44.1; if it’s not, the 44.1 samples play back too fast at higher sample rate settings.

    Is there any good way to work with 44.1 samples at higher sample rates in Logic?
    Or is it possible to load them into the sampler, which probably translates them internally to the higher sample rate used in Logic?

    thnx!

  14. First of all, I’d love to see 88.2kHz or similar becoming a standard sooner or later. But considering the industry’s (iTunes, Spotify, etc.) main focus on bandwidth and transfer rates, I seriously doubt such a standard will ever see the light of day. Super Audio CD and DVD-Audio clearly failed to establish higher rates with the public.

    To prevent any misunderstandings, such a high rate is not required for simple playback or recording purposes. Especially with the latest generation of AD/DA converters.

    Such a higher rate would have substantial benefits regarding the cost and quality of AD/DA converters. As Jeroen mentioned, the technical demands for anything going to or coming from 44.1kHz are particularly steep (it leaves less than a tenth of an octave for the Nyquist filter to remove about 100dB of aliasing). It would also make the design of most audio processors MUCH easier.

    However, I don’t think that oversampling should be put into the end user’s hands. This is a far too complicated topic for most people, and most generalizations turn out to be plain mistakes or a source of additional problems.

    Most users have no idea about basic signal processing principles such as linear vs. non-linear processing. It makes absolutely no sense to oversample linear processes (EQs, linear filters, delays, reverb, static gain). Oversampling can even reduce their performance! In the case of a simple IIR-based filter, such as a peaking EQ, oversampling N times will actually increase precision problems N times (at least). That is, the filter will be half as “good” at double the rate. This quickly becomes audible at low frequencies. Generally, keep in mind that doubling the sample rate also means halving the precision! Sadly, precision is the most important thing for all recursive filters (the basic component of almost anything in audio).

    Also, the few algorithms which truly benefit from a higher rate are all non-linear in nature. This implies that the non-linearity will most probably be multi-dimensional and generate far higher harmonics than any external “brute-force” oversampling workflow could handle. IMO, the only person who can truly prevent aliasing problems is the original designer, and he should solve these problems independently of the project rate.

    Another problem is related to the later target rate of 44.1kHz and its limiting. A limited 96kHz file will overshoot when down-sampled. This overshoot (due to the Gibbs phenomenon) is substantial and will require a final limiter to recover these 2-3dB, practically undoing most of the sonic advantages of the higher-rate workflow. A specialized oversampled limiter, however, can prevent these overshoots with clever trickery, completely without an additional target-rate limiter.

    Finally, let me add that many non-linear algorithms actually have no or only weak reasons to oversample the original signal. IMHO, in a perfect world, processors should oversample (or undersample) their PROCESS as required to provide the bandwidth and precision needed by the algorithm, not the MUSIC. 🙂

    So, to conclude: I would love to see 88.2 becoming a standard, but I’m not sure about the naive user-based oversampling control, as one usually has zero idea about the things going on under the hood. It’s a difficult subject.

    • kohugaly says:

      I agree with most of the things you’ve written, except the thing about oversampling the filters. The frequency response of the filter gets “warped” into the given bandwidth, so for an LPF the gain is zero at Nyquist, but in the analog world the gain is zero at infinite frequency. This becomes a problem when Nyquist is near our hearing range. Also, having the Nyquist frequency higher lets you put the cutoff frequency higher. But I agree about the precision, although with 64-bit math that is really not an issue.

      • kohugaly, you have a valid point. But what you are describing is not a general property of digital filters. It only happens if:

        A. The application/user defines the (theoretical) behaviour of analogue filters as the ultimate truth (in reality, no analogue implementation has infinite bandwidth, and poor circuits can face very similar problems). This is fine, I am not trying to judge this decision.

        B. The designer makes the mistake of assuming that analogue recursive models can easily be converted into a digital representation.

        😉

        In fact, this frequency warping effect has been well known for ages, and practically perfect alternatives are widely established. It isn’t particularly difficult to design digital recursive filters that behave like the theoretical analogue model (actually, they can do this better than any real-world analogue implementation).

        Of course, oversampling can also reduce the problems of the naive assumption described in B). But this isn’t particularly clever or responsible engineering, IMHO. Oversampling will most of all reduce the bandwidth, add aliasing and substantially increase the CPU load.

        As for the precision issue, of course 64-bit math gives enough precision for most use cases up to 88.2/96kHz. But this is also the point where problems begin. Oversample 4 times (to, say, 192kHz) and try to use a 4th-order IIR high-pass at 40Hz. It will sound strangely ineffective, and a closer analysis will reveal severe precision problems (truncation noise, limit cycles and probably plenty of denormals too). Similar problems happen if one tries to run a very high-order IIR filter at a higher rate. But unlike the purely linear high-frequency warping effects you mentioned, these low-frequency effects will be of the non-linear type (noise, non-harmonic distortion).

        This is just one example that higher rates do not equal “better processing”. They can easily provoke the opposite effect. It’s a compromise!

  15. If I’m already running my project at 96 kHz (/ 24 Bit):

    1. Is there any benefit from using the oversampling option in the multitude of plugins I use?
    2. Is there any possibility that 96 kHz is sufficient and oversampling via plugins will in fact introduce unwanted artefacts? (e.g. poorly implemented realtime up/down SRC within a plugin)
    3. If there is a legitimate argument for plugin oversampling, what is the optimal rate for maximum benefit before the returns diminish so far that it is no longer worth the CPU/time in an OFFLINE render? 2x? 4x? … (0x?)

