I recently needed to generate Morse code audio files from input text. After a few searches, I decided to write a generator myself.
Because I wanted to access my Morse code audio files through the Web, I decided to use PHP as the main programming language. The screenshot above shows the Web page used to start generating Morse code. The downloadable zip file contains the Web page for submitting text and the PHP source files used to generate and present the audio files. If you want to test the PHP code, you need to copy the Web page and the related PHP files to a PHP-enabled server.
For many people, Morse code is something from old movies: a sequence of "dots" and "dashes", or a series of beeps. Obviously, if you want to generate Morse code with a computer program, that level of understanding is far from enough. This article describes the elements that make up Morse code, how to generate audio files in WAVE format, and how to convert Morse code into audio files in PHP.
Morse code
Morse code is a form of text encoding. Its advantages are that it is easy to encode and easy to decode by ear. In essence, it works by switching an audio (or radio-frequency) signal on and off to form short and long pulses, generally referred to as dots and dashes or, in radio terminology, "dits" and "dahs". In modern digital communications terminology, Morse code is a form of amplitude-shift keying (ASK).
In Morse code, characters (letters, numbers, punctuation marks, and special symbols) are encoded as sequences of dits and dahs. So in order to convert text into Morse code, we first have to decide how to represent a dit and a dah. One obvious option is to use 0 for a dit and 1 for a dah, or vice versa. Unfortunately, Morse code uses a variable-length coding scheme, so we must either use variable-length sequences or find a way to pack the data into the fixed-width format common in computer memory. Note also that Morse code is not case-sensitive and has no encoding for some special symbols. In our implementation, undefined characters and symbols are ignored.
In this project, memory footprint is not a particular concern. So we adopt a simple coding scheme that uses "0" for each dit and "1" for each dah and stores the resulting strings in an associative array. The PHP code that defines the Morse code table is as follows:
$CWCODE = array('A' => '01', 'B' => '1000', 'C' => '1010', 'D' => '100', 'E' => '0',
    'F' => '0010', 'G' => '110', 'H' => '0000', 'I' => '00', 'J' => '0111',
    'K' => '101', 'L' => '0100', 'M' => '11', 'N' => '10', 'O' => '111',
    'P' => '0110', 'Q' => '1101', 'R' => '010', 'S' => '000', 'T' => '1',
    'U' => '001', 'V' => '0001', 'W' => '011', 'X' => '1001', 'Y' => '1011',
    'Z' => '1100', '0' => '11111', '1' => '01111', '2' => '00111',
    '3' => '00011', '4' => '00001', '5' => '00000', '6' => '10000',
    '7' => '11000', '8' => '11100', '9' => '11110', '.' => '010101',
    ',' => '110011', '/' => '10010', '-' => '10001', '~' => '01010',
    '?' => '001100', '@' => '00101');
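As a quick illustration of how such a table might be used, the sketch below looks up each character of a message, folds it to upper case, and skips anything the table does not define. The function name textToCodes() and the abbreviated table are assumptions made for this example; they are not taken from the downloadable source.

```php
<?php
// Abbreviated copy of the code table: '0' is a dit, '1' is a dah.
$CWCODE = array('E' => '0', 'M' => '11', 'O' => '111', 'R' => '010', 'S' => '000');

// Hypothetical helper: returns one code string per recognized character,
// with an empty string marking a word break.
function textToCodes(string $text, array $table): array
{
    $codes = array();
    foreach (str_split(strtoupper($text)) as $ch) {
        if ($ch === ' ') {
            $codes[] = '';              // word break
        } elseif (isset($table[$ch])) {
            $codes[] = $table[$ch];     // known character
        }
        // anything else is undefined in the table and is ignored
    }
    return $codes;
}

// The '!' is not in the table, so it is silently dropped.
echo implode(' ', textToCodes('Sos more!', $CWCODE)), "\n";
```

Returning an empty string for a space keeps word breaks visible to whatever code later turns the sequence into sound.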
Note that if you are particularly concerned about memory footprint, the codes in the table could be stored as bits instead of strings. Prefixing each code with a start bit gives a bit pattern that fits in a single byte per character. When decoding, discard the start bit and any bits to its left to recover the actual variable-length code.
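The start-bit idea can be sketched in a few lines. The helper names packCode() and unpackCode() are hypothetical and only illustrate the technique; the project itself keeps the codes as strings.

```php
<?php
// Hypothetical helpers illustrating the start-bit packing idea.

// Pack a code such as '010' into one byte value: a leading 1 (the start bit)
// followed by the code bits, e.g. '010' -> binary 1010 -> 10.
function packCode(string $code): int
{
    return (int) bindec('1' . $code);
}

// Recover the variable-length code: decbin() drops the zero padding on the
// left, and substr() then removes the start bit itself.
function unpackCode(int $byte): string
{
    return substr(decbin($byte), 1);
}

echo packCode('010'), ' ', unpackCode(packCode('010')), "\n";
```

The longest code in the table has six elements, so the packed value needs at most seven bits and always fits in one byte.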
Although many people do not realize it, timing is the main factor that defines Morse code, and understanding this is the key to generating it. So the first thing to do is define the timing of the code elements (the dits and dahs). For convenience, we define the length of a dit as one time unit, dt; the gap between the dits and dahs within a character is also one dt. A dah is 3 dt long, the gap between characters (letters) is also 3 dt, and the gap between words is 7 dt. To sum up, our timing table looks like this:
Element                               Length
dit                                   1 dt
dah                                   3 dt
gap between dits and dahs             1 dt
gap between characters                3 dt
gap between words                     7 dt
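These timing rules can be checked with a short sketch that counts the time units in a word. The helper wordUnits() and the abbreviated table are assumptions for illustration. The word PARIS, which is widely used as the reference word for Morse speed, is used as the input here.

```php
<?php
// Abbreviated code table: '0' is a dit, '1' is a dah.
$CWCODE = array('A' => '01', 'I' => '00', 'P' => '0110', 'R' => '010', 'S' => '000');

// Hypothetical helper: length of one word in time units dt, including the
// 7 dt word gap that follows it. Assumes every letter is in the table.
function wordUnits(string $word, array $table): int
{
    $units = 0;
    $letters = str_split(strtoupper($word));
    foreach ($letters as $i => $ch) {
        $code = $table[$ch];
        foreach (str_split($code) as $j => $element) {
            $units += ($element === '0') ? 1 : 3;   // dit = 1 dt, dah = 3 dt
            if ($j < strlen($code) - 1) {
                $units += 1;                        // gap inside a character
            }
        }
        if ($i < count($letters) - 1) {
            $units += 3;                            // gap between characters
        }
    }
    return $units + 7;                              // gap between words
}

echo wordUnits('PARIS', $CWCODE), "\n";
```

For PARIS the count comes to 50 units: 31 for the characters themselves, 12 for the four gaps between characters, and 7 for the word gap.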
In Morse code, the playback speed is usually expressed in words per minute (WPM). Since English words have different lengths and characters have different numbers of dits and dahs, converting from WPM to digital audio samples is not as simple as it seems. In the convention adopted by an international organization, the average word is taken to be 5 characters long, with a number or punctuation mark counting as 2 characters. On this basis, the average word is 50 time units (dt) long. So for a given WPM, playback runs at 50 × WPM time units per minute, and each dit (one time unit dt) lasts 60/(50 × WPM) = 1.2/WPM seconds. Given the length of a dit, the lengths of the other elements are easily computed.
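The conversion from WPM to element lengths is a one-line formula. The sketch below, with the hypothetical helper ditSeconds(), shows it under the 50-units-per-word convention described above.

```php
<?php
// Hypothetical helper: length of one dit (one time unit dt) in seconds,
// assuming 50 time units per word.
function ditSeconds(float $wpm): float
{
    return 60.0 / (50.0 * $wpm);   // equals 1.2 / WPM
}

$dt = ditSeconds(15);
printf("dit %.3f s, dah %.3f s, character gap %.3f s, word gap %.3f s\n",
       $dt, 3 * $dt, 3 * $dt, 7 * $dt);
```

At 15 WPM this gives a dit of 0.08 seconds, and every other length is a whole multiple of that value.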
As you may have noticed, the page shown above uses "Farnsworth spacing" for speeds below 15 WPM. So what is Farnsworth spacing?
When an operator learns to decode Morse code by ear, the perceived rhythm of the characters changes as the playback speed changes. Below about 10 WPM, he can easily pick out the individual dits and dahs and work out which character was sent. Above about 10 WPM, that approach breaks down, and the operator has to recognize whole characters by their sound rather than by their individual dits and dahs. Someone who has become accustomed to low-speed Morse code while learning therefore runs into trouble at higher speeds: because the rhythm has changed, his subconscious recognition goes wrong.
Farnsworth spacing was invented to solve this problem. In essence, letters and symbols are sent at 15 WPM or faster, and extra space is inserted between characters to reduce the overall speed. In this way, the operator hears each character at a reasonable speed and rhythm. Once all the characters have been learned, the speed can be increased, and the receiver only needs to recognize the characters more quickly. Essentially, Farnsworth spacing removes the problem of the changing rhythm, which lets the receiver learn faster.
Therefore, in this system, characters are sent at 15 WPM whenever a lower playback speed is selected. Correspondingly, a dit is 0.08 seconds long, but the gaps between characters and between words are no longer 3 dt and 7 dt; they are lengthened to match the selected overall speed.
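The text does not spell out how the lengthened gaps are calculated, so the following is only a sketch of one common approach, and the project's PHP files may do it differently. It assumes a 50-unit reference word in which 31 units belong to the characters and 19 units (4 × 3 + 7) are character and word gaps, and it stretches only those 19 units. The helper name farnsworthGaps() is hypothetical.

```php
<?php
// Hypothetical helper: Farnsworth gap lengths in seconds.
// $overallWpm is the slower overall speed requested by the user,
// $charWpm is the speed at which individual characters are sent.
function farnsworthGaps(float $overallWpm, float $charWpm = 15.0): array
{
    $dt = 1.2 / $charWpm;                    // dit length at character speed
    if ($overallWpm >= $charWpm) {
        // No stretching needed: normal 3 dt and 7 dt gaps.
        return array('char' => 3 * $dt, 'word' => 7 * $dt);
    }
    $wordTime  = 60.0 / $overallWpm;         // seconds available per word
    $soundTime = 31 * $dt;                   // seconds used by the characters
    $gapUnit   = ($wordTime - $soundTime) / 19;   // stretched gap unit
    return array('char' => 3 * $gapUnit, 'word' => 7 * $gapUnit);
}

$gaps = farnsworthGaps(5);
printf("character gap %.3f s, word gap %.3f s\n", $gaps['char'], $gaps['word']);
```

With this choice the gaps shrink smoothly back to 3 dt and 7 dt as the overall speed approaches the character speed.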
Generate sound
In the PHP code, each character (an index into the array above) represents a group of Morse sounds composed of dits, dahs, and silent gaps. We build the audio as a sequence of digital samples and write it to a file with the appropriate header information to define it as WAVE format.
The code that generates the sound is actually quite simple, and you can find it in the PHP files in the project. I find it convenient to define a "digital oscillator": each time OSC() is called, it returns the next timed sample of a sine wave. Together with the sampling rate and the audio format parameters, that is enough to generate WAVE audio. The generated sine wave ranges between -1 and 1; it is shifted and scaled so that each sample can be stored as a byte in the range 0 to 255, with 128 representing zero amplitude.
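A digital oscillator of this kind can be sketched as follows. This is not the project's OSC() function; the closure-based makeOscillator(), the toByte() helper, and the 700 Hz tone are assumptions chosen for the example.

```php
<?php
// Hypothetical factory: returns a function that yields successive sine-wave
// samples in the range -1..1 for the given tone and sampling rate.
function makeOscillator(float $frequency, int $sampleRate): callable
{
    $phase  = 0.0;
    $dphase = 2.0 * M_PI * $frequency / $sampleRate;   // phase step per sample
    return function () use (&$phase, $dphase): float {
        $phase += $dphase;
        return sin($phase);
    };
}

// Shift and scale a sample into a byte, with 128 as zero amplitude.
function toByte(float $x): string
{
    return chr((int) floor(120 * $x + 128));
}

$osc = makeOscillator(700.0, 11050);   // assumed tone frequency of 700 Hz
$bytes = '';
for ($i = 0; $i < 5; $i++) {
    $bytes .= toByte($osc());
}
echo strlen($bytes), " sample bytes generated\n";
```

Scaling by 120 rather than 127 leaves a little headroom, so a full-amplitude sample maps to 8 or 248 instead of the extremes of the byte range.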
We also need to consider another issue when generating sound. Generally speaking, Morse code is generated by switching a sine wave on and off. But if you do this directly, you'll find that the resulting signal takes up a lot of bandwidth. For this reason, radio equipment usually shapes the envelope of each pulse to reduce the bandwidth.
In our project, we apply the same kind of shaping, but digitally. Since we already know the length of the shortest sound element, the dit, it can be shown that the minimum bandwidth is obtained when the amplitude envelope is half a period of a sine wave whose length equals one dit. We could achieve the same effect by passing the audio signal through a low-pass filter. However, since we already know all the signal elements, we can simply shape each one directly.
The PHP code that generates the dit, dah, and silent-gap signals is as follows:
while ($dt < $DitTime) {
  $x = OSC();
  if ($dt < (0.5*$DitTime)) {
    // Generate the rising part of a dit and dah up to half the dit-time
    $x = $x*sin((M_PI/2.0)*$dt/(0.5*$DitTime));
    $ditstr .= chr(floor(120*$x+128));
    $dahstr .= chr(floor(120*$x+128));
  } else if ($dt > (0.5*$DitTime)) {
    // For a dah, the second part of the dit-time is constant amplitude
    $dahstr .= chr(floor(120*$x+128));
    // For a dit, the second half decays with a sine shape
    $x = $x*sin((M_PI/2.0)*($DitTime-$dt)/(0.5*$DitTime));
    $ditstr .= chr(floor(120*$x+128));
  } else {
    $ditstr .= chr(floor(120*$x+128));
    $dahstr .= chr(floor(120*$x+128));
  }
  // A space has an amplitude of 0 shifted to 128
  $spcstr .= chr(128);
  $dt += $sampleDT;
}
// At this point the dit sound has been generated
// For another dit-time unit the dah sound has a constant amplitude
$dt = 0;
while ($dt < $DitTime) {
  $x = OSC();
  $dahstr .= chr(floor(120*$x+128));
  $dt += $sampleDT;
}
// Finally during the 3rd dit-time, the dah sound must be completed
// and decay during the final half dit-time
$dt = 0;
while ($dt < $DitTime) {
  $x = OSC();
  if ($dt > (0.5*$DitTime)) {
    $x = $x*sin((M_PI/2.0)*($DitTime-$dt)/(0.5*$DitTime));
    $dahstr .= chr(floor(120*$x+128));
  } else {
    $dahstr .= chr(floor(120*$x+128));
  }
  $dt += $sampleDT;
}
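Once the dit, dah, and gap sample strings exist, the sound for a whole character can be built by simple concatenation. The sketch below shows one way this might be done; characterSound() is a hypothetical helper, and the three strings are readable stand-ins with one character per time unit rather than real audio samples.

```php
<?php
// Stand-ins for the sample strings: one character per time unit here,
// so the result is readable. Real strings hold many bytes per unit.
$ditstr = '.';      // one dit (1 dt)
$dahstr = '---';    // one dah (3 dt)
$spcstr = ' ';      // one dt of silence

// Hypothetical helper: build the sound for one code such as '010',
// followed by the 3 dt gap that separates characters.
function characterSound(string $code, string $dit, string $dah, string $spc): string
{
    $elements = array();
    foreach (str_split($code) as $element) {
        $elements[] = ($element === '0') ? $dit : $dah;
    }
    // 1 dt of silence between elements, 3 dt after the character
    return implode($spc, $elements) . str_repeat($spc, 3);
}

echo '[', characterSound('010', $ditstr, $dahstr, $spcstr), "]\n";
```

Because each element string is generated only once, building a long message costs little more than string concatenation.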
WAVE file format
WAVE is a general-purpose audio format. In its simplest form, a WAVE file consists of a header followed by a sequence of integers that represent the audio amplitude at a specified sample rate. For more information about WAVE files, see the Audio File Format Specifications website. For Morse code, we don't need all the options the WAVE format offers; simple 8-bit mono is enough. Note that multibyte values must be stored in little-endian byte order. WAVE files use the RIFF format, which is made up of records called "chunks".
The WAVE file starts with the ASCII identifier RIFF, followed by a 4-byte chunk size, followed by a header containing the ASCII characters WAVE, and finally the chunks that define the format and hold the sound data.
In our program, the first chunk is the format specifier, which consists of the ASCII characters "fmt " (with a trailing space) and a 4-byte chunk size. Because I am using the plain vanilla pulse code modulation (PCM) format, the chunk size is 16 bytes. The chunk then holds the following data: the format code (1 for PCM), the number of channels, samples per second, average bytes per second, a block alignment indicator, and bits per sample. Because we do not need high-quality stereo, we use a single channel, a sampling rate of 11050 samples/sec (standard CD-quality audio is sampled at 44100 samples/sec), and 8 bits per sample.
Finally, the actual audio data is stored in the next chunk. It contains the ASCII characters "data", a 4-byte chunk size and, finally, the audio data itself as a sequence of bytes (because we use 8 bits per sample).
In the program, the sound, a sequence of 8-bit audio amplitudes, is stored in the variable $soundstr. Once the audio data has been generated, all the chunk sizes can be computed, and the pieces can be concatenated and written to a disk file. The following code shows how to generate the header information and the audio chunk. Note that $riffstr holds the RIFF header, $fmtstr holds the format chunk, and $soundstr holds the audio data chunk.
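$riffstr = 'RIFF'.$NSizeStr.'WAVE';
// SAMPLERATE is a constant holding the samples per second, defined elsewhere
$x = SAMPLERATE;
$SampRateStr = '';
// Encode the sample rate as 4 bytes in little-endian order
for ($i=0; $i<4; $i++) {
  $SampRateStr .= chr($x % 256);
  $x = floor($x/256);
}
// Chunk size 16, PCM format, 1 channel, sample rate, bytes/sec,
// block alignment 1, 8 bits per sample
$fmtstr = 'fmt '.chr(16).chr(0).chr(0).chr(0).chr(1).chr(0).chr(1).chr(0)
          .$SampRateStr.$SampRateStr.chr(1).chr(0).chr(8).chr(0);
// Encode the number of samples $n as 4 bytes in little-endian order
$x = $n;
$NSampStr = '';
for ($i=0; $i<4; $i++) {
  $NSampStr .= chr($x % 256);
  $x = floor($x/256);
}
$soundstr = 'data'.$NSampStr.$soundstr;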
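As an alternative to the manual byte loops, PHP's built-in pack() function can write little-endian integers directly. The sketch below is not the project's code; buildWave() is a hypothetical helper that assembles a minimal 8-bit mono WAVE file in memory under the layout described above.

```php
<?php
// Hypothetical helper: wrap 8-bit mono PCM samples in a minimal WAVE container.
function buildWave(string $samples, int $sampleRate): string
{
    $n = strlen($samples);
    // 'fmt ' chunk: size 16, PCM (1), 1 channel, sample rate,
    // bytes per second, block alignment 1, 8 bits per sample.
    // 'V' is a 32-bit and 'v' a 16-bit little-endian unsigned integer.
    $fmt  = 'fmt ' . pack('VvvVVvv', 16, 1, 1, $sampleRate, $sampleRate, 1, 8);
    $data = 'data' . pack('V', $n) . $samples;
    if ($n % 2 === 1) {
        $data .= chr(0);    // RIFF chunks are padded to an even length
    }
    // The RIFF size counts everything after the size field itself.
    $riffSize = 4 + strlen($fmt) + strlen($data);
    return 'RIFF' . pack('V', $riffSize) . 'WAVE' . $fmt . $data;
}

// One second of silence at the article's sampling rate.
$wave = buildWave(str_repeat(chr(128), 11050), 11050);
echo strlen($wave), " bytes\n";
```

For 8-bit mono audio the bytes-per-second value equals the sample rate, which is why the same number appears twice in the format chunk.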
Summary and comments
Our text-to-Morse-code generator works well for now. Of course, many changes and improvements are possible, such as supporting other character sets, reading text directly from a file, or generating compressed audio. But since the goal of the project was to make Morse code audio easy to use on the Web, this simple solution has achieved it.
As always, I welcome any suggestions on this simple and crude code.