Using csdr for auxiliary receivers on a websdr system

Northern Utah WebSDR
Using "csdr" for auxiliary receiver streams on a WebSDR system

Preliminary Version

What is this?

On another page (see: "Duplicating receiver streams using ALSA and asoundrc" - Link) we discussed using tools in ALSA to take an already-existing raw I/Q stream from a "sound card" type receiver (which could be an actual sound card - or it could be an RSP1a running at up to 768 kHz as described on this page: Operating a WebSDR receiver at 384 or 768 kHz using the 16 bit signal path - Link) and "duplicating" this stream so that we could use our already-existing receiver data elsewhere.

One of the "use cases" mentioned with regard to doing this was to send this audio to some other service - perhaps a fixed-frequency receiver that might be used for decoding FT-8, WSPR or other types of digital signals - or it could be a much more wideband use, such as "hearing" a large chunk of a band for use with a CW or RTTY skimmer. If a fairly wideband receiver is being used on a WebSDR, then it only makes sense that we could "re-use" its raw I/Q data in other applications.

Example:

Let's consider an example of an RSP1a tuned to 7125 kHz and operating with a 768 kHz sample rate. As such, it "hears" all of the U.S. 40 meter amateur band (7000-7300 kHz) - and more - so it's a good candidate for receiving a specific segment of that band. As it is, the "raw" 768 kHz I/Q data isn't terribly useful, but we can use the "csdr" tools written by HA7ILM and available on GitHub (Link is HERE) as well as from the more-current GIT maintained by DD5JFK (Link is HERE).

Comments:

I won't go through the installation of Git on your Linux machine, or how to clone the repository and build the binaries, as that is covered in many other places. From this point on the assumption will be made that you have already done so.

I will also assume that you have looked at the page "Duplicating receiver streams using ALSA and asoundrc" - (Link) and implemented that as well with an available source of raw I/Q audio. As described in that document, we'll presume that our source of I/Q audio is a device called "f_loop0_out3" - one of four "duplicated" outputs from this receive hardware.

If you are planning to use a raw I/Q rate higher than 192 kHz, you should look at the page "Operating a WebSDR receiver at 384 or 768 kHz using the 16 bit signal path" - Link mentioned above.

As you will see, "csdr" uses parameters for controlling frequency and bandwidth that are ratios of the sampling rate rather than absolute frequencies. This math is simple arithmetic, and since we are configuring "receivers" that are fixed in frequency and bandwidth, making such calculations on the back of an envelope is not particularly onerous.

We must first get our audio from the stream that we have created and one of the ways that this may be done is to use "arecord". This utility is convenient in that it does everything that we need it to - including convert the sample type output by it to the floating-point format required by csdr - and we can invoke it thusly:

arecord -D f_loop0_out3 -c 2 -f float_le -r 768000

This invocation uses "f_loop0_out3" as its input device, specifies that there are two audio channels (the "-c 2" part), that the output format is little-endian floating point (the "-f float_le" portion) and that the rate that we want is 768 kHz (i.e. "-r 768000").

A problem with "arecord" and why you may not be able to use it

While you might think "arecord" would work for this, it has a problem: It will stop after outputting 2 GB of data. Apparently, this is due to the 2 GB limit of .WAV files, and it applies even if you are in "stream" mode using STDIO. In our case, with 768 kHz and two channels with four bytes per sample (because we are converting to floating point using the "-f float_le" argument), it will stop after 349.5 seconds.
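The 349.5 second figure can be checked with a quick calculation. This is only a sketch: it assumes the limit is exactly 2^31 bytes, which is consistent with the number quoted above, and the function name is made up for illustration.

```shell
# Seconds of audio that fit below a byte limit.
# usage: seconds_until_limit <rate_hz> <channels> <bytes_per_sample> [limit_bytes]
# Assumption: the limit is 2^31 bytes unless a fourth argument is given.
seconds_until_limit() {
    awk -v r="$1" -v c="$2" -v b="$3" -v lim="${4:-2147483648}" \
        'BEGIN { printf "%.1f\n", lim / (r * c * b) }'
}

# 768 kHz, two channels, four bytes per float_le sample
seconds_until_limit 768000 2 4    # prints 349.5
```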

If all you need are short recordings of just a few minutes each - for WSPR decoding or to do analysis, such as verifying that a receiver is working properly and you need just a short audio file - then this might work out for you. If you need a continuous source of receiver data, having it stop after a while is a "show stopper".

Work-around using "fplay":

The work-around is to use the "fplay" utility discussed on this page: "Creating "fplay", a version of the Linux "aplay" utility, but without the speed limit" - (Link) instead. As it turns out, "arecord" and "aplay" are pretty much the same program so we can make "aplay" (and "fplay") act like "arecord" with the addition of the -C argument.

Instead of the above, we simply do:

fplay -C -D f_loop0_out3 -c 2 -f float_le -r 768000

If you used the implementation in the "Duplicating receiver streams" article mentioned earlier (with the "plug-in") we can specify our rate in this line and ALSA will convert it for us. For example, if we wished, instead, to use a 384 kHz bandwidth for our 40 meter receiver, we could have specified "-r 384000" instead and ALSA would have resampled and filtered it - doing a better job at anti-alias filtering than the SDRPlay driver can! If you have enough processing power to do this, running the receiver at higher-than-needed bandwidth and allowing ALSA to resample is recommended.

At this point "fplay" is spitting our I/Q receiver data in floating point format out to STDOUT for "csdr" to catch - but we could have configured it to spit out 16 bit signed, little-endian data by using "-f s16_le" instead - see the comment below about using csdr itself to convert formats.

Let's configure this to capture the 40 meter FT-8 segment found at 7074 kHz, using USB for reception:

Let's add to the above this massively-long command line:
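Assembled from the individual steps described below, that command line would look roughly like the following sketch. It is wrapped in a shell function only to keep it readable; the name "ft8_rx" is hypothetical. The last stage is written as "convert_f_s16", the spelling used in HA7ILM's csdr, and "fplay" is assumed to be the utility built according to the page mentioned above.

```shell
# ft8_rx (hypothetical name): 40 meter FT-8 receiver on 7074 kHz USB.
# Reads 768 kHz I/Q from the ALSA device "f_loop0_out3" and writes
# 48 kHz mono, 16 bit signed little-endian audio to STDOUT.
ft8_rx() {
    fplay -C -D f_loop0_out3 -c 2 -f float_le -r 768000 \
        | csdr shift_addition_cc 0.06640625 \
        | csdr fir_decimate_cc 16 0.015625 HAMMING \
        | csdr bandpass_fir_fft_cc 0.004 0.05 0.005 \
        | csdr realpart_cf \
        | csdr agc_ff \
        | csdr limit_ff \
        | csdr convert_f_s16
}
```

Defining the function does not start anything; the receiver runs only when "ft8_rx" is called, with its output piped or redirected to whatever will consume the audio.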

As mentioned above, we are using the "-f float_le" argument in "fplay" to convert it to a format that csdr wants, but if we already have a 16 bit signed little-endian stream we could have preceded the above with "csdr convert_s16_f" to do the same job.

Taking these statements one-at-a-time:

csdr shift_addition_cc 0.06640625 - This does a "frequency shift" in software and the value is calculated thusly:

([receiver center freq in kHz] - [desired receive center frequency in kHz]) / sample rate in kHz (the units of frequency may be in Hz, kHz or MHz - just as long as all are the same!)

For our example we find that: (7125 - 7074) / 768 = 0.06640625

Once this conversion is done, our new "zero beat" frequency (7074.0 kHz) is now at "zero Hertz" in the I/Q stream being spat out by the above command.

There are several different "shift" algorithms available in csdr, but this one seemed to use the least amount of CPU power.
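For those who would rather not do this by hand, the formula above can be wrapped in a small helper. This is just a sketch using awk for the floating-point division; the function name is illustrative.

```shell
# usage: shift_ratio <receiver_center> <desired_center> <sample_rate>
# All three values must be in the same unit (Hz, kHz or MHz).
shift_ratio() {
    awk -v c="$1" -v d="$2" -v r="$3" \
        'BEGIN { printf "%.8f\n", (c - d) / r }'
}

shift_ratio 7125 7074 768    # prints 0.06640625
```

A desired frequency above the receiver's center simply yields a negative value: with the same hardware, 7200 kHz works out to (7125 - 7200) / 768 = -0.09765625.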

csdr fir_decimate_cc 16 0.015625 HAMMING - This "decimates" our now frequency-converted signal to a lower sample rate to reduce processor loading for the later steps. Examining this statement a bit closer:

The "16" means "decimate by 16" - which is to say that we divide our 768 kHz sample rate by 16 to get 48 kHz.

The "0.015625" tells us where we want to low-pass filter our signals before we decimate. In our example, we must not allow signals at a frequency higher than half of the output sample rate of 48 kHz, meaning that we must remove signals above 24 kHz before we decimate. The low-pass frequency is calculated as being 0.5 * input_sample_rate * low_pass, so for our example it's: 0.5 * 768000 * 0.015625 = 6000 - so our output signal is filtered to remove signals more than 6 kHz away from the new center frequency of 7074 kHz.

The "HAMMING" portion refers to how the FIR (Finite Impulse Response) window is defined. The details are beyond the scope of this document, but suffice it to say that with a bit of extra processing, one can improve the "sharpness" of the filtering - see the Wikipedia article here.

There are several different decimation algorithms available, but this one seemed to use the least amount of CPU power.
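The numbers in this step can be tabulated with a short helper. It is only a sketch: the function name is invented here, and the last line simply applies the low-pass formula exactly as it is given in the text above.

```shell
# usage: decimation_summary <input_rate_hz> <decimation_factor> <low_pass_value>
decimation_summary() {
    awk -v fs="$1" -v n="$2" -v lp="$3" 'BEGIN {
        out = fs / n
        printf "output rate: %d Hz\n", out
        printf "must be clear above: %d Hz\n", out / 2
        # 0.5 * input_sample_rate * low_pass, as stated in the text
        printf "low-pass per the formula above: %d Hz\n", 0.5 * fs * lp
    }'
}

decimation_summary 768000 16 0.015625
```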

There's another function - "decimating_shift_addition_cc" - that does the same job as the "shift" and "decimate" steps, but without filtering. For simple down-sampling, this function isn't useful as it would result in aliasing and the appearance of "out of band" signals - but it could be useful if we have already done filtering before applying it.

csdr bandpass_fir_fft_cc 0.004 0.05 0.005 - This does the actual band-pass filtering and sets the sideband. Taking this statement apart:

The parameters are <lower edge of passband> <higher edge of passband> <transition bandwidth> as a ratio of the sample rate - which, after the decimate step above, is 48 kHz. This translates as follows:

0.004 * 48000 = 192 Hz - The "lower" edge of our filter begins at 192 Hz
0.05 * 48000 = 2400 Hz - The "upper" edge of our filter is at 2400 Hz.
0.005 * 48000 = 240 Hz - This specifies the "sharpness" of our filtering. The lower this number, the "sharper" it will be, but it will require more FFT bins and more processing horsepower.

The above configuration defines an UPPER sideband filter as we allow audio from 192 to 2400 Hz above our "zero Hertz" frequency of 7074 kHz, but if we wished to have a lower-sideband filter, we would redefine our lower and upper edge as follows:

-0.05 as the new "lower" frequency of "-2400 Hz"
-0.004 as the new "upper" frequency of "-192 Hz"
The new command would be: csdr bandpass_fir_fft_cc -0.05 -0.004 0.005
Since our signals are represented mathematically on our I/Q stream, we can have a "negative" frequency, which is below our "zero Hertz" frequency of 7074 kHz!
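Going the other way, from filter edges in Hz to the ratios that "bandpass_fir_fft_cc" expects, is just a division by the sample rate. A minimal sketch (the helper name is made up for this example):

```shell
# usage: bandpass_args <low_hz> <high_hz> <transition_hz> <sample_rate_hz>
# Prints the three ratios in the order bandpass_fir_fft_cc takes them.
bandpass_args() {
    awk -v lo="$1" -v hi="$2" -v tr="$3" -v fs="$4" \
        'BEGIN { printf "%g %g %g\n", lo / fs, hi / fs, tr / fs }'
}

bandpass_args 192 2400 240 48000      # USB: prints 0.004 0.05 0.005
bandpass_args -2400 -192 240 48000    # LSB: prints -0.05 -0.004 0.005
```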

csdr realpart_cf - This simply throws away the "imaginary" portion of the signal represented in our I/Q stream. At this point we have done all of the filtering that we need, so we have no need for "negative" frequencies anymore and this means that we can throw away about half of our data to save processor power and, if we were to convey our "received" audio across a network, to save bandwidth.

csdr agc_ff - In most cases, you will want to implement an AGC (Automatic Gain Control), a "device" that continuously analyzes the signals and adjusts the gain up or down so that they are always at the same amplitude/volume. Because we are representing all of our audio data up to this point as floating point data, our signal level can span orders of magnitude (e.g. 10s or 100s of million-fold) from the weakest to the strongest signal - but one typically doesn't want to constantly adjust the volume control to keep the levels constant. Typically, this is done ONLY after all of the band-pass filtering has been completed so that it operates only on the portion of spectrum to which you are listening.

One may configure the parameters for "csdr agc_ff" as follows:

csdr agc_ff <--profile (slow|fast)> <--hangtime t> <--reference r> <--attack a> <--decay d> <--max m> <--initial i> <--attackwait w> <--alpha l>

IMPORTANT NOTE: The parameters for "agc_ff" are based on the number of samples and are configured assuming a 48 kHz sample rate. If you choose a lower or higher sample rate, scale these numbers linearly as a starting point.

--profile (slow|fast) - This configures a "canned" profile for either slow or fast AGC.

--hangtime - hang time, in number of samples
--reference - a number between 0 and 1 (default 0.8) - the "target" value for the peak audio samples (e.g. 80% of full-scale)
--attack - A fraction, typically between 0.1 and 0.001 used to determine how quickly the AGC responds to the appearance of a strong signal: 0.1 is used for "fast" and 0.01 is used for "slow"
--decay - A fraction, like "--attack", but this is the "release" time - typically 10-100 times smaller in value
--max - The "maximum" gain, default of 65535, that will be attained in the complete absence of signal. This functions like an "rf gain" control on an analog radio.
--initial - The "initial" gain upon start-up. Typically, this is 1: A high value (say 10000) would likely cause a brief "blast" of audio if a strong-ish signal were present.
--attackwait - This is the number of samples that the AGC should wait before AGC action occurs. This is used to prevent very brief noise spikes from "desensing" the receiver, but it can cause "clicking" on the leading edge of strong signals that suddenly appear.
--alpha - This is the parameter that determines the response of the low-pass filter used in the AGC's detection loop - a default value of 1.5 is typically sufficient.
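The linear scaling mentioned in the note above can be sketched as follows for a parameter that is a sample count, such as "--hangtime" or "--attackwait". The function name and the 600-sample starting value are purely illustrative and are not defaults taken from csdr.

```shell
# usage: scale_samples <samples_at_48k> <new_rate_hz>
# Scales a sample count chosen for 48 kHz to another sample rate.
scale_samples() {
    echo $(( $1 * $2 / 48000 ))
}

scale_samples 600 96000    # prints 1200
scale_samples 600 12000    # prints 150
```

The result could then be passed along as, for example, csdr agc_ff --hangtime "$(scale_samples 600 96000)" and adjusted by ear from there.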

Unlike some of the other csdr modules, one may pick and choose which parameter to use if we want to modify just one aspect of the AGC. Also, the current settings are displayed upon start-up, giving one a clue as to the ranges for these values. For example, with no parameters specified, the default AGC response is "fast" - which appears to be a bit too fast for my taste for even CW.

One issue with the "fast" profile (i.e. "--profile fast") is that it's a bit too fast: If you tune in a strong CW note with an audio frequency you will hear a slight "buzz" on it as the AGC oscillates the amplitude - a property absent when "slow" is selected. Most likely this is due to the over-fast "--attack" parameter, which is 0.1 in "fast" but 0.01 in "slow" (decay is 0.01 and 0.001, respectively), so that the AGC is able to "ride" the sine wave to a degree.

A related function is csdr agc_s16 which uses the same parameters, but operates not on floating point data, but on signed 16-bit integer data: This might be used if you need an AGC on an audio stream that is already in this format, but its dynamic range is necessarily reduced as a 16 bit integer simply cannot represent the same range from weak to strong as a floating-point value.

Another AGC device is "simple_agc_cc" - but this doesn't work as well for most things as it has a slow attack and decay.

csdr simple_agc_cc <rate> <reference> <max gain>

The "rate" parameter is used for the adjustment of the gain calculation and is typically from about 0.01 to 0.00001, with 0.01 being quite fast (probably too fast as it can cause distortion by acting at voice frequencies) and 0.00001 being very slow (seconds!) to respond to changes in signal level.
The "reference" parameter is the "threshold" about which the AGC is adjusting. Typically this will be 0.25-0.5 for 1/4-1/2 full-scale.
The "max gain" parameter is akin to the "rf gain" setting on an analog radio and it will prevent the AGC from setting the gain above this level. The useful range of this parameter is likely between 100 and 1000000 for most receive hardware.

Yet another AGC device is "fastagc_ff" which isn't terribly great for SSB/CW either, but it's better than "simple_agc_cc"

csdr fastagc_ff <block size> <reference>

The "block size" parameter refers to the number of samples to look at when adjusting the gain. With a 48 kHz sample rate as in the case of this example, a block size of 4800 would mean that the gain is evaluated 10 times per second. It's probably best to avoid too-small block sizes (e.g. less than 0.1% of the number of samples per second) to avoid annoying artifacts.
The "reference" parameter would typically be "0.25" or "0.5" representing 1/4 or 1/2 of full scale.
In testing this AGC function, it seems as though its maximum gain is not sufficient to hear the noise floor of a quiet receiver.

csdr limit_ff - This function prevents any clipping from extraneous signals - such as noise pulses - that might get past the AGC.

csdr convert_f_s16 - This is the final step in our reception of the signal as this converts the floating point audio data to 16 bit little-endian signed integers such as those used in ALSA.

At this point it's worth noting that our audio is mono: Many applications will be happy with that - but let's take this one more step and cause our receiver audio to be piped to the default audio device using "aplay":

| csdr mono2stereo_s16 | aplay -r 48000 -c 2 -f s16_le &

In the above we recognize the individual steps we described above - but have added two more so that we can hear it on a speaker connected to our default audio device:

csdr mono2stereo_s16 - This function takes the audio that has already been converted to signed integer and creates a stereo signal by duplicating it for both channels.

aplay -r 48000 -c 2 -f s16_le & - This invokes "aplay" using the default sound card. We are specifying a 48 kHz audio rate - the same as we have defined when we decimated the original I/Q data - as well as 2-channel (stereo) audio using little-endian 16 bit signed audio, the same as we have defined in the "csdr convert_f_s16" step.

Since we are using a sample rate of 192k or less, either the standard "aplay" utility or the faster "fplay" (described above) will work fine for us, but if you were producing a 384 or 768 kHz file, you would need to use "fplay".

The final version of the command line turns out to be:
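Putting together every step described above, it would look like the following sketch. The pipeline is wrapped in a shell function (the name "ft8_to_speaker" is hypothetical) purely for readability, and the conversion stage is written as "convert_f_s16", the spelling used in HA7ILM's csdr.

```shell
# ft8_to_speaker (hypothetical name): 7074 kHz USB receiver played on the
# default sound card as 48 kHz stereo, 16 bit signed little-endian audio.
ft8_to_speaker() {
    fplay -C -D f_loop0_out3 -c 2 -f float_le -r 768000 \
        | csdr shift_addition_cc 0.06640625 \
        | csdr fir_decimate_cc 16 0.015625 HAMMING \
        | csdr bandpass_fir_fft_cc 0.004 0.05 0.005 \
        | csdr realpart_cf \
        | csdr agc_ff \
        | csdr limit_ff \
        | csdr convert_f_s16 \
        | csdr mono2stereo_s16 \
        | aplay -r 48000 -c 2 -f s16_le
}
```

To start the receiver in the background, call the function with a trailing ampersand, as in: ft8_to_speaker &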

With the "&" at the end of the command line, this function will run "forever" but it may be stopped by doing "killall csdr", which will also stop any related "fplay", "aplay" and "arecord" instances.

Configuration for a CW skimmer:

A "CW Skimmer" (see this web page: http://www.dxatlas.com/CwSkimmer/ ) is software that will inhale a large chunk of an amateur band and decode all CW signals within that passband and optionally log and report them. With the hardware above, it's possible to analyze the majority of the 40 meter CW band, so let's construct a hypothetical configuration, starting with our requirements:

We wish to cover from 7000 kHz upwards.
In the U.S., the CW portion of the band ends at 7125, so we'll compromise and choose a 96 kHz sample rate of which, 92 kHz is likely usable.

Because half of the 92 kHz is 46 kHz, we'll put the center of our "new" virtual receiver at 7046, allowing it to cover from 7000 to at least 7092 kHz, so we'll first calculate the "shift" value:

Our receive center frequency is 7125 kHz and our sample rate is 768 kHz, so we use the formula from above: (7125-7046)/768 = 0.102864583
The command would be: csdr shift_addition_cc 0.102864583

With our frequency shifted, we could, in theory, convert directly down to 96 kHz, but since we want 92 kHz of usable bandwidth, a filter that will remove signals beyond +/- 46 kHz from the center implemented at a sample rate of 768 kHz will be quite "expensive" in terms of CPU resources, so let's get to 192 kHz first - a decimation of 4 - and do the filtering there, where it's likely less costly. Since the available bandwidth at 192 kHz is +/- 96 kHz - and we want +/- 46 kHz (for a total of 92 kHz bandwidth) we want 48% of that - so we'll go with a transition bandwidth value of 0.24, according to the formula above - and this should sufficiently remove any aliasing in the 768 to 192 kHz conversion. As mentioned earlier, we can gain a bit of performance with little CPU penalty by specifying a window filter, so we'll use "HAMMING".

The command would therefore be: csdr fir_decimate_cc 4 0.24 HAMMING

At 192 kHz, we can now do better filtering, so we can apply band-pass filtering as follows using "bandpass_fir_fft_cc":

We wish to pass from -46 kHz to 46 kHz, relative to the center frequency of 7046 kHz. Because our sample rate is 192 kHz, we see that the ratio of 46 kHz and 192 kHz is (46/192) = 0.23958333 - or 0.24.
We would like to remove signals significantly by the time we get to +/- 48 kHz, so our transition bandwidth is 2 kHz - which is (2/192) = 0.0104 of the bandwidth, so using the formula above, we would multiply that by 0.5 and yield 0.0052.
Since we are using an FFT for filtering, we can improve the filtering performance (e.g. minimize leakage between "bins") by throwing a windowing filter at it with little CPU penalty - and the HAMMING window is pretty good for this.
The command would therefore be: csdr bandpass_fir_fft_cc -0.24 0.24 0.025 HAMMING

We are still at 192 kHz, but since we have done filtering that should prevent aliasing, we can now convert from 192 to 96 kHz while applying minimal additional anti-alias filtering, so we'll do that now:

We need to decimate by 2, and we need minimal filtering, so our transition bandwidth will be at half the sample rate, or a value of "1". Again, we'll use a HAMMING window.
This statement is: csdr fir_decimate_cc 2 1 HAMMING

We are finally at 96 kHz and since the CW Skimmer can accept I/Q audio, we won't do anything else other than apply an AGC and a clipper as follows:

csdr agc_ff --attack 0.025 --decay 0.001 | csdr limit_ff
In the above, we choose an attack and decay slightly slower than the default to minimize the noise artifact mentioned above.

The final piece is the conversion from floating-point to signed-16 bit (little-endian) which is done with the statement: csdr convert_f_s16

Again, since we need the I/Q we will not get rid of the "imaginary" data which means we have two audio channels - which also means that we don't need to do the mono-to-stereo conversion, either.
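Strung together, the skimmer steps above would form a pipeline along the lines of the sketch below. Several things here are assumptions: the function name "skimmer_rx" is invented, the input is taken to be the same "f_loop0_out3" device and "fplay" invocation as in the first example, the final decimation is written as "fir_decimate_cc", and the band-pass transition value is copied as printed in the command above (0.025) even though the worked arithmetic gives a smaller number. How the resulting stream is delivered to the skimmer is left open.

```shell
# skimmer_rx (hypothetical name): 96 kHz I/Q centered on 7046 kHz, written
# to STDOUT as two-channel, 16 bit signed little-endian samples.
skimmer_rx() {
    fplay -C -D f_loop0_out3 -c 2 -f float_le -r 768000 \
        | csdr shift_addition_cc 0.102864583 \
        | csdr fir_decimate_cc 4 0.24 HAMMING \
        | csdr bandpass_fir_fft_cc -0.24 0.24 0.025 HAMMING \
        | csdr fir_decimate_cc 2 1 HAMMING \
        | csdr agc_ff --attack 0.025 --decay 0.001 \
        | csdr limit_ff \
        | csdr convert_f_s16
}
```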

Example: Piping converted audio from a Linux computer to VLC

If we wish, we can then take the 48 kHz audio stream in our first example (the one prior to the CW Skimmer) and make it available to another computer - Linux or Windows - on the network as follows:
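A sketch of the tail end of such a command, based on the description that follows, is shown below. The helper name "stream_audio" is hypothetical, and the netcat options are an assumption: traditional netcat listens with "nc -l -p <port>", while the OpenBSD variant expects "nc -l <port>", so adjust this for whichever one your system has.

```shell
# stream_audio (hypothetical name): takes 48 kHz mono s16_le audio on STDIN,
# duplicates it to two channels and offers it on a TCP port.
# usage: <receiver pipeline> | stream_audio <port>
stream_audio() {
    port="$1"
    # Assumption: traditional netcat syntax; OpenBSD netcat uses "nc -l $port"
    csdr mono2stereo_s16 | nc -l -p "$port"
}
```

The receiver pipeline from the first example, ending with the conversion to 16 bit samples, would be piped into this helper and put in the background with a trailing "&".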

Where <port> is the port number on the computer running the above to which you connect. The above uses "netcat" (nc) to stream the raw audio data (from STDIO) via a TCP connection, if a remote device makes that connection.

You can test this by pointing your browser to the address and port of your audio stream: If the IP address of the computer streaming the audio is 192.168.0.32 and you picked "12345" as the port number, if you navigate to "192.168.0.32:12345" you should get a screen full of garbage correlating to the raw audio data - and you'll likely see the data sent by the browser to the audio server on that screen.

As a practical example, if you wish to connect to the above stream using VLC - perhaps on your Windows PC - you would do the following:

From the VLC GUI, go to: Media -> Open Network Stream
Under "Please enter a network URL:" enter: tcp/rawaud://192.168.0.32:12345
Click "Play"
You should now hear the audio being streamed from the server.

From the command line the arguments would be:

<path>vlc --intf dummy --volume=100 --demux=rawaud --rawaud-channels=2 --rawaud-samplerate=48000 --rawaud-fourcc=s16l tcp://<ip_addr>:<port> vlc://quit

Where:

<path>vlc - This is the invocation of VLC for your system, including any needed paths - see below.

On Linux, you may need to precede it with "./" as in: ./vlc <arguments>

On Windows you'll need to include the path to the VLC binary ("vlc.exe"), as in: C:\Program Files (x86)\VideoLan\vlc <arguments>

--intf dummy - This invokes VLC without bringing up the GUI interface: Without this, the GUI will be presented.
--volume=100 - This specifies a volume level of 100%
--demux=rawaud - Specifies that we are using "raw" audio
--rawaud-channels=2 - Specifies that we are using two channels. If we didn't use "csdr mono2stereo_s16" we would specify a "1" here - and it would also use half the bandwidth!
--rawaud-samplerate=48000 - This matches the sample rate that we have produced in our example above
--rawaud-fourcc=s16l - (i.e. "Signed 16 Bit Little-endian") This specifies the audio format that we are expecting from the receiver in the example above.
tcp://<ip_addr>:<port> - This is the IP address and port of the server running the receiver (e.g. "tcp://192.168.0.32:12345")
vlc://quit - When the stream stops - or if there isn't a corresponding stream to which it can immediately connect when VLC starts - VLC will quit. If this is NOT included, you may end up with a bunch of zombie VLC instances hanging around in memory, doing nothing. Having said that, it might be a good idea to occasionally look for and kill such zombie processes, should they occur.

Additional comments:

Selection of intermediate sample rates

If you have been following along closely, you'll note that the 768 kHz - or even 384, 192 or 96 kHz - sample rates integer-divide nicely into 48 kHz. If, for some reason, you were to need a rate like 44.1 kHz, you would need to readjust the source sample rate - but there's a problem here: When using the ALSA input, the only valid sample rates for the PA3FWM WebSDR server appear to be 24, 48, 96, 192, 384 and 768 kHz, so a sample rate that doesn't divide neatly into 24 or 48 kHz can't be used with "fir_decimate_cc".
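Whether a given pair of rates suits an integer decimator is easy to check ahead of time. A minimal sketch, with an illustrative function name:

```shell
# usage: decimation_factor <input_rate_hz> <output_rate_hz>
# Prints the integer factor, or returns non-zero if the input rate is not
# an integer multiple of the output rate.
decimation_factor() {
    if [ $(( $1 % $2 )) -eq 0 ]; then
        echo $(( $1 / $2 ))
    else
        echo "$1 Hz is not an integer multiple of $2 Hz" >&2
        return 1
    fi
}

decimation_factor 768000 48000            # prints 16
decimation_factor 768000 44100 || true    # complains on STDERR
```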

There is another function - "fractional_decimator_ff" that, as the name implies, will take fractional decimation rates, but this involves a bit more CPU power - and a bit of care must be taken when doing this as unlike integer rates, fractional rates are not necessarily exact meaning that you may end up with minor sample rate errors that could cause buffer under/overrun issues.

This also brings up another point: In the example above, we decimated from 768 kHz down to 48 kHz in one step - but if we need only 2.4 kHz of bandwidth why didn't we go all of the way down to, say, 6 kHz? To do this would require a very sharp filter prior to the decimation step, as signals more than, say, 2.5 kHz away from "zero" would need to be significantly attenuated - perhaps by more than 60 dB. While theoretically possible, this would likely require more processing power than decimating to a higher frequency like 48 kHz first, where filtering is less critical.

Practically speaking, there's no reason why multiple decimation steps couldn't be done - say from 768 to 192 kHz, and then again from 192 to 48 kHz: A bit of testing would need to be done to verify the efficacy of the filtering (e.g. suitably-reduced aliasing artifacts at each step) and to determine which method (one or two steps of decimation) minimizes CPU requirements.

Another factor affected by the "final" sample rate (e.g. after all decimation is done) is that of the filtering: A higher sample rate requires sharper filters (and more CPU power) to achieve a given "shape factor" (e.g. steepness of a filter at its edges) so a lower sample rate may be beneficial in that regard - but this is a trade-off as getting to a lower sample rate with more decimation steps may incur a CPU usage penalty higher than such filtering.

How much CPU does it use:

Since we are moving the audio data unmodified, it uses negligible CPU power (each instance shows up as "0.0%" utilization) and minuscule RAM: Certainly, anything that you plan to do with this data - even shoveling it out the LAN port - is likely to take more processing power than replicating this data!

Conclusion:

In the examples above we show how a higher-bandwidth stream can be converted to a lower-bandwidth stream for use with other applications using CSDR, leveraging an already-existing receiver system for other uses. We also demonstrate how this audio data can be exported across a network (probably a LAN) for use by other servers for additional processing, such as CW Skimming, WSPR and FT-4/8 reception, etc.

It's likely that there are better ways to do the above tasks and if you know of methods that are "lighter weight" in terms of resource utilization, please let me know using the contact information below.

Additional information:

For general information about this WebSDR system - including contact info - go to the about page (link).

For the latest news about this system and current issues, visit the latest news page (link).

For more information about this server you may contact Clint, KA7OEI using his callsign at ka7oei dot com.

For more information about the WebSDR project in general - including information about other WebSDR servers worldwide and additional technical information - go to http://www.websdr.org
