Soundmap

We perceive every sound as four basic properties:

Loudness, Duration, Timbre, Pitch

These four properties intertwine, and together they make up what we call a “note”.

LOUDNESS:

Loudness is information about the quantity of a sound, relative to the decoder.
The decoder is in most cases the human mind. The decoder can also be a computer or an animal. Lack of any information about a sound means silence.
Loudness is strongly connected to the perceived intensity of the physical force that causes that information. In most cases, loudness is caused by changes of pressure in the air.
The human middle and inner ear do not respond linearly to greater force, but roughly logarithmically. This means that doubling the force does not double the perceived sound; the perceived increase is much smaller. That's why we express sound pressure on a logarithmic scale, in a unit called the decibel (dB). But our mind still perceives this dB level non-linearly. That's why we measure loudness in units called the “sone” and the “phon”.
Loudness is additionally explained under pitch.
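To make the logarithmic scale concrete, here is a minimal sketch. It assumes the usual reference pressure of 20 micropascals for dB in air and the common rule of thumb that loudness in sones doubles for every 10 phons above 40 phons; neither number comes from the text above, and the function names are made up for illustration.

```python
import math

# Assumed reference pressure for sound in air: 20 micropascals.
REFERENCE_PRESSURE_PA = 20e-6


def sound_pressure_level_db(pressure_pa: float) -> float:
    """Sound pressure level in dB relative to the assumed reference."""
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


def phon_to_sone(phon: float) -> float:
    """Rule-of-thumb conversion, only meant for levels of about 40 phons and up."""
    return 2.0 ** ((phon - 40.0) / 10.0)


# Doubling the pressure adds about 6 dB; it does not double the level.
print(round(sound_pressure_level_db(0.02), 1))  # 60.0
print(round(sound_pressure_level_db(0.04), 1))  # 66.0

# 10 phons more is perceived as roughly twice as loud.
print(phon_to_sone(40), phon_to_sone(50))  # 1.0 2.0
```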
Example of two sounds where the first is louder than the other:
The dynamic range of an instrument is the ratio between the loudest and the quietest sound that the instrument is capable of producing. The bigger the difference, the wider the dynamic range. A wide dynamic range is desired.
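Because the dynamic range is a ratio, it is usually quoted in dB as well. A small sketch, with made-up pressure values for a hypothetical instrument:

```python
import math


def dynamic_range_db(loudest_pa: float, quietest_pa: float) -> float:
    """Ratio between the loudest and quietest sound pressure, expressed in dB."""
    return 20.0 * math.log10(loudest_pa / quietest_pa)


# Hypothetical instrument: the loudest sound has 1000 times the pressure
# of the quietest one.
print(round(dynamic_range_db(2.0, 0.002), 1))  # 60.0
```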

DURATION

Duration is the length of sound, measured in time.
Example of two sounds where the first has a longer duration than the other:
It becomes more interesting when we have more than one sound...
 
Tempo(t)
Sound1(t) , sound2(t), sound3(t), sound4(t), sound5(t)...
Tempo is a variable that controls the duration of every sound in a sequence of sounds. We measure tempo with beats.
The beat is a mark that indicates time intervals. When a beat happens, a time interval starts, and when the next beat happens, the current time interval ends and the next one starts, and so on.
We can measure the timing between beats by counting how many of them happen per minute (bpm). If 60 beats happen in a minute, we get 60 time intervals, so every time interval lasts one second. With 120 bpm we get 120 time intervals, and every time interval is half a second long.
If the tempo gets faster (the time between beats is shorter), every sound in a sequence gets shorter and the sequence altogether is shorter.
And vice versa, if the tempo slows down (the time between beats is longer), every sound in a sequence lasts longer (expands through time) and the sequence altogether lasts longer.
When the tempo changes, the pitch of a sound should stay the same (pitch is explained later).
In music, a sound (duration) that lasts as long as one time interval is most often called a “quarter note” and is written with the symbol ♩. A sound that lasts as long as four quarter notes is called a “whole note”. So, relative to the whole note, the quarter note has a duration of 1/4, and it is sometimes written this way. Names and signs for shorter and longer “note values” (durations) relative to the whole note can be found here.
However, a sound that lasts as long as one time interval can be connected to any note value. We define which note value one time interval is connected to with something called a “time signature”. It is well explained in this video.
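The bpm arithmetic above can be sketched in a few lines. The sequence of sound lengths is a made-up example, measured in beats:

```python
def beat_interval_seconds(bpm: float) -> float:
    """Length of one time interval (the time between two beats) in seconds."""
    return 60.0 / bpm


def sequence_durations(lengths_in_beats, bpm):
    """Duration in seconds of every sound in a sequence at the given tempo."""
    interval = beat_interval_seconds(bpm)
    return [beats * interval for beats in lengths_in_beats]


# Hypothetical sequence of five sounds, each length given in beats.
sequence = [1, 1, 2, 0.5, 0.5]

slow = sequence_durations(sequence, 60)
fast = sequence_durations(sequence, 120)
print(slow, sum(slow))  # [1.0, 1.0, 2.0, 0.5, 0.5] 5.0
print(fast, sum(fast))  # [0.5, 0.5, 1.0, 0.25, 0.25] 2.5
```

Doubling the tempo halves every duration, and with it the length of the whole sequence.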
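Here is a rough sketch of how note values turn into seconds. The parameter `beat_value` is a name I made up for "the note value that one time interval is connected to"; it defaults to the quarter note, as described above.

```python
from fractions import Fraction


def note_duration_seconds(note_value, bpm, beat_value=Fraction(1, 4)):
    """Duration of a note value (as a fraction of a whole note) in seconds.

    beat_value is the note value connected to one time interval
    (hypothetical parameter name, quarter note by default).
    """
    beats = Fraction(note_value) / Fraction(beat_value)
    return float(beats * Fraction(60) / Fraction(bpm))


# One beat is a quarter note, tempo 120 bpm.
print(note_duration_seconds(Fraction(1, 4), 120))  # 0.5
print(note_duration_seconds(Fraction(1, 1), 120))  # 2.0

# Same tempo, but now one beat is connected to an eighth note.
print(note_duration_seconds(Fraction(1, 4), 120, beat_value=Fraction(1, 8)))  # 1.0
```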
Rhythm
Sound1 (t1) , sound2 (t2), sound3 (t3), sound4 (t1), Sound5 (t2) , sound6 (t3), sound7 (t1), sound8 (t2), sound9 (t3), sound10 (t1)...
Rhythm is a repeating sequence of sound durations. In other words, we get a pattern of sound durations.
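The repeating pattern from the line above (t1, t2, t3, t1, ...) can be sketched like this; the sound names are just placeholders:

```python
from itertools import cycle


def apply_rhythm(sounds, pattern):
    """Pair every sound with a duration, repeating the pattern as needed."""
    return list(zip(sounds, cycle(pattern)))


sounds = [f"sound{i}" for i in range(1, 11)]
pattern = ["t1", "t2", "t3"]

for name, duration in apply_rhythm(sounds, pattern):
    print(name, duration)
```

sound4, sound7 and sound10 get the duration t1 again, just like in the sequence written out above.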

TIMBRE

The difference in sound we notice when we hear two different musical instruments playing is called the timbre. Because of timbre, humans immediately recognise when two sounds are played on two different instruments.
Example of two timbres where the first was made by a guitar and the second by a piano:

The property that differentiates spoken vowels ("a e i o u") is the timbre. Five different vowels mean five different timbres.
Besides dynamic range, the main difference in sound between a classical guitar (nylon strings), an acoustic guitar (steel strings) and an electric guitar (steel strings) is that each of these three instruments has its own timbre. Different types of wood give different timbres. Different types of strings also give different timbres. So does the thickness of the strings. And when you replace old strings with new ones, you change the timbre of your instrument as well. On certain instruments, playing (picking) closer to or farther from the neck also changes the timbre.
The electric guitar converts the vibrations of its strings into electrical current with a device called a pickup. This changing current is then amplified and played through a loudspeaker. Before it is amplified, we can insert electronics that alter the current (called an effects unit). With that device, we can change the timbre of a sound at the press of a button. Different loudspeakers also add different timbres to the sound we hear (microphones add theirs too).
Here is a summary of the basic things that give an electric guitar its unique timbre: wood + strings + pickup(s) + position of pickup(s) + effects unit(s) + amplifier(s) + loudspeaker(s). In my opinion, getting all these things right can be called art. Making good music is art, and this is part of it.
We know which direction a sound came from because we have two ears, and the sound arrives at one ear just a little sooner than at the other. This causes something called a phase shift, and our mind can automatically decode the source from it. However, timbre also takes part in determining the source of the sound. The shape of our ears and head changes the timbre of a sound, depending on the direction it came from. That helps us determine the source of the sound, and is also one reason why it’s harder to recognize our own voice when we hear a recording of it.

PITCH

Let's start with an example: take your arm and, using a marker, draw a 14 cm line on it so that it is slightly wider on the elbow side than on the wrist side. Then let someone touch you with one finger somewhere on that line.
The pitch would be the location of that touch. If the person touches you on a wider part of the line, we say it's a lower pitch, and if the touch is on a narrower part, we say it's a higher pitch.
Now imagine that we put the same line, but four times smaller, inside your ear and let changes in air pressure trigger the 'touch'. If the changes in the air happen slowly, we get touched at a wider location on that line, and if they happen fast, we get touched at a narrower location. The device (bodily organ) that translates changes in the air into different locations (pitches) is called the cochlea. Your mind interprets the touch this way because the line is located inside your ear and the touch is triggered by the physical force of changes in the air.
Example of two sounds with same loudness and same timbre, but with a different pitch:

We can use the previous 'line' example to explain LOUDNESS: the stronger the touch at some location on that line, the greater the loudness. In other words: if the change in air pressure is big (not fast), we register the sound as loud, and if the change is small, we register it as quiet.
You may also notice that some parts of the line on your arm are more sensitive, and that other parts need greater pressure for the same sense of intensity. So, if the location is the pitch, then this sense of loudness also depends on the pitch. With the same pressure, you'll perceive some pitches as louder and some as quieter. If we want to perceive loudness linearly, we need to reduce the pressure at some places (pitches).
What about timbre? Imagine that someone touches us somewhere on the line on our arm through a small carpet. We will sense the touch in a different way. That difference depends on the smoothness of the carpet and how it is knitted. With different carpets we'll sense the touch differently. Different carpets represent different timbres.
If we touch that line with more than one finger simultaneously, we get something called a chord. In other words, several pitches sounding simultaneously give us harmony (or disharmony), and such combinations are called chords.
Pitch is the most fundamental thing that must change through time in order to get melodies.
In music, we mark different locations on that line by precise rules. I'll try to explain these in an easy way:
We divide (mark) “the line” into 12 different regions called “octaves”. We label these octaves with numbers from -1 to 10. The octave closest to the wider side of the line has the label “-1”, followed by 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, all the way up to the narrower side, which has the label “10”.
Then we divide every region, called an octave, into 12 even smaller pieces. We can give them labels like “C, C♯, D, D♯, E, F, F♯, G, G♯, A, A♯, B”.
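One way to picture these markings is to number the pieces from the wide side of the line to the narrow side. A small sketch (the function names are mine, not part of Soundmap):

```python
LABELS = ["C", "C♯", "D", "D♯", "E", "F", "F♯", "G", "G♯", "A", "A♯", "B"]
LOWEST_OCTAVE = -1
HIGHEST_OCTAVE = 10


def pitch_index(label: str, octave: int) -> int:
    """Position of a piece on the line, counting from 0 at the wide side."""
    if not LOWEST_OCTAVE <= octave <= HIGHEST_OCTAVE:
        raise ValueError(f"octave must be between {LOWEST_OCTAVE} and {HIGHEST_OCTAVE}")
    return (octave - LOWEST_OCTAVE) * len(LABELS) + LABELS.index(label)


def pitch_name(index: int):
    """Inverse of pitch_index: returns (label, octave)."""
    octave_offset, position = divmod(index, len(LABELS))
    return LABELS[position], octave_offset + LOWEST_OCTAVE


print(pitch_index("C", -1))  # 0
print(pitch_index("A", 4))   # 69
print(pitch_index("B", 10))  # 143
print(pitch_name(69))        # ('A', 4)
```

12 octaves with 12 labels each give 144 marked pieces in total.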

For exact definition of using Soundmap colors click here.

Scientists use something called the Greenwood function to define where one region begins and where it ends.
The 'line' in an average human ear is around 34 mm long. It is 0.5 mm wide on one side and 0.04 mm wide on the other. The technical term for the line is the “basilar membrane”. The basilar membrane is located in the previously mentioned device, the cochlea. Together they transform changes in air pressure into pitch (with the help of the eardrum and the middle ear). If the air changes 20 times per second, the basilar membrane is touched on its very wide side, and if it changes 20 000 times per second, it's touched on its very narrow side. We express changes in the air as frequency (f), in units called hertz (Hz).
To connect the changes in the air to the labels we used before, we must take a frequency and an octave to start with. Most (but not all) scientists agreed to take a frequency of 440 Hz (which means the air changes 440 times in 1 second) and bind it to the label “A” in the 4th octave. Then they made a rule for pitch markings that goes as follows:
Every time the frequency doubles, mark the pitch with the same label, but one octave higher. For example, if the frequency is 440 Hz, which corresponds to the label “A” in the 4th octave, then 880 Hz must have the same label “A”, but in the 5th octave.
And 1760 Hz must also have the label “A”, but in the 6th octave, and so on. The reason for the rule above is called “resonance frequency”.
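The Greenwood function is often written as f = A(10^(a·x) − k), where x is the position along the line as a fraction of its length. Below is a rough sketch that assumes the constants commonly quoted for the human ear (A = 165.4, a = 2.1, k = 0.88). These constants are not from the text above, and I measure x from the wide side of the line.

```python
import math

# Constants commonly quoted for the human cochlea (an assumption here).
A = 165.4
ALPHA = 2.1
K = 0.88
MEMBRANE_LENGTH_MM = 34.0  # length of the 'line' given in the text


def greenwood_frequency(x: float) -> float:
    """Frequency in Hz at relative position x (0 = wide side, 1 = narrow side)."""
    return A * (10 ** (ALPHA * x) - K)


def greenwood_position(frequency_hz: float) -> float:
    """Relative position (0..1) on the line that responds to a frequency."""
    return math.log10(frequency_hz / A + K) / ALPHA


for frequency in (20, 440, 20000):
    position_mm = greenwood_position(frequency) * MEMBRANE_LENGTH_MM
    print(f"{frequency} Hz is about {position_mm:.1f} mm from the wide side")
```

With these constants the two ends of the line land close to 20 Hz and 20 000 Hz, which agrees with the range given above.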
List of frequencies connected to the “A” label in octaves:

-1 A	0 A	1 A	2 A	3 A	4 A	5 A	6 A	7 A	8 A	9 A	10 A
13.8 Hz	27.5 Hz	55 Hz	110 Hz	220 Hz	440 Hz	880 Hz	1760 Hz	3520 Hz	7040 Hz	14080 Hz	28160 Hz
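The doubling rule behind this list fits in one line of code. A minimal sketch, starting from 440 Hz for “A” in the 4th octave as described above:

```python
def a_frequency(octave: int, reference_hz: float = 440.0, reference_octave: int = 4) -> float:
    """Frequency of the label A: doubles with every octave up, halves going down."""
    return reference_hz * 2.0 ** (octave - reference_octave)


for octave in range(-1, 11):
    print(f"{octave} A: {a_frequency(octave):g} Hz")
```

The exact value for “A” in octave -1 is 13.75 Hz; the list shows it rounded to 13.8 Hz.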

To get the frequencies for the other 11 labels in a given octave, we must first know the proportion between two neighbouring labels. That proportion has to be the same for every step, so that twelve steps together exactly double the frequency. The proportion is 2^(1/12) (≈ 1.0595).
So, to get the frequency for the next label, “A♯” in the 4th octave, we multiply 440 Hz by 1.0595: 440 * 1.0595 ≈ 466.16 Hz.
And to get the frequency for B in the 4th octave: 466.16 * 1.0595 ≈ 493.88 Hz,
and for C in the 5th octave: 493.88 * 1.0595 ≈ 523.25 Hz, and so on.
Full table in 4th octave:

4 C	4 C♯	4 D	4 D♯	4 E	4 F	4 F♯	4 G	4 G♯	4 A	4 A♯	4 B
261.6 Hz	277.2 Hz	293.7 Hz	311.1 Hz	329.6 Hz	349.2 Hz	370.0 Hz	392.0 Hz	415.3 Hz	440.0 Hz	466.2 Hz	493.9 Hz
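The whole table can be recomputed from 440 Hz and the proportion 2^(1/12). A minimal sketch (the function name is mine):

```python
LABELS = ["C", "C♯", "D", "D♯", "E", "F", "F♯", "G", "G♯", "A", "A♯", "B"]
SEMITONE_RATIO = 2 ** (1 / 12)  # proportion between two neighbouring labels


def pitch_frequency(label: str, octave: int) -> float:
    """Frequency in Hz, counting steps up or down from A in the 4th octave."""
    steps_from_a4 = (octave - 4) * 12 + LABELS.index(label) - LABELS.index("A")
    return 440.0 * SEMITONE_RATIO ** steps_from_a4


for label in LABELS:
    print(f"4 {label}: {pitch_frequency(label, 4):.1f} Hz")
```

Labels below A get a negative number of steps, so their frequencies are divided by the proportion instead of multiplied.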

For every different frequency, you get touched at a different location.

This is how changes in the air (expressed as frequencies) are connected to pitch labels (locations), but to get the exact location we have to use the Greenwood function. However, the exact location is not important to us, because our mind processes pitch information differently from an ordinary touch... The lower-pitched sounds are called "bass" and the higher-pitched sounds are called "treble".
The pitch used in the examples above for loudness, duration and timbre, and in the first sound of the pitch example, is 3 D. The second sound in the pitch example is 3 G.
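Going the other way, from a frequency to the nearest label, can be sketched as follows. The frequencies 146.83 Hz and 196 Hz are my own computed values for D and G in the 3rd octave; they are not given in the text.

```python
import math

LABELS = ["C", "C♯", "D", "D♯", "E", "F", "F♯", "G", "G♯", "A", "A♯", "B"]


def nearest_pitch(frequency_hz: float):
    """Return (label, octave) of the marked pitch closest to a frequency."""
    steps_from_a4 = round(12 * math.log2(frequency_hz / 440.0))
    # 69 is the position of A in the 4th octave when C in octave -1 is position 0.
    index = 69 + steps_from_a4
    octave_offset, position = divmod(index, 12)
    return LABELS[position], octave_offset - 1


print(nearest_pitch(440.0))   # ('A', 4)
print(nearest_pitch(146.83))  # ('D', 3)
print(nearest_pitch(196.0))   # ('G', 3)
```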
